
Web Mining


Session: TC03
Date/Time: Tuesday 14:15-15:45
Type: Invited
Cluster: Data Mining
Chair: Filippo Menczer
Chair Address: University of Iowa, MS Dept., S320 PBB, Iowa City, IA 52242
Chair E-mail: filippo-menczer@uiowa.edu

TC03.1 Enhanced Search: CLEVER & the Future
  • David Gibson; University of California, Dept. of Comp. Sci., 581 Soda Hall, Berkeley, CA 94720; dag@cs.berkeley.edu

The next generation of search engines will integrate the Web, presenting it as a single well-authored document with sophisticated indexes, by combining lexical retrieval with hyperlink analysis. I shall present a theoretical model for combining the eigenvector-based techniques of latent semantic analysis with the CLEVER link analysis algorithm.
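
As an illustration of the kind of eigenvector-based link analysis the abstract above refers to, here is a minimal sketch of hub and authority scoring by power iteration (the HITS scheme on which CLEVER builds). It is not the speaker's model: the function name `hits`, the dictionary-based graph, and the fixed iteration count are assumptions made for this example.

```python
import math


def hits(links, iterations=50):
    """Compute hub and authority scores by power iteration.

    links: dict mapping each page to the list of pages it links to
    (a hypothetical in-memory link graph, assumed for this sketch).
    """
    pages = set(links) | {q for targets in links.values() for q in targets}
    hub = {p: 1.0 for p in pages}
    auth = {p: 1.0 for p in pages}
    for _ in range(iterations):
        # Authority of a page: sum of hub scores of the pages linking to it.
        auth = {p: 0.0 for p in pages}
        for p, targets in links.items():
            for q in targets:
                auth[q] += hub[p]
        # Hub score of a page: sum of authority scores of the pages it links to.
        hub = {p: sum(auth[q] for q in links.get(p, [])) for p in pages}
        # Normalize both vectors to unit length so the scores stay bounded.
        for scores in (auth, hub):
            norm = math.sqrt(sum(v * v for v in scores.values())) or 1.0
            for p in scores:
                scores[p] /= norm
    return hub, auth


# Two pages pointing at one target: the target becomes the sole authority.
hub, auth = hits({"a": ["c"], "b": ["c"], "c": []})
```

The two score vectors converge toward the principal eigenvectors of the link matrix products, which is the sense in which this family of methods is eigenvector-based.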

TC03.2 Mining Entities via Information Extraction
  • David Eichmann; University of Iowa, Sch. of Library & IS, 3067 Library Bldg., Iowa City, IA 52242; david-eichmann@uiowa.edu

Typical approaches to indexing Web pages fail to capitalize on the rich, if ill-structured, nature of the actual text. Information extraction techniques can provide significant aids in increasing the precision of search results. We describe our approach to search result clustering based upon recognition of named entities.

TC03.3 Online Information Finding & WebFind: A Database-Inspired Solution
  • Alvaro Monge; California State University, Dept. of CE/CS, 532 ECS Bldg., Long Beach, CA 90840; monge@cecs.csulb.edu

In databases, known information about one entity gives access to other entities through a join. This join can be extended to online information, with portals as the entities. By integrating information from portals, the search for information yields higher-quality results. We describe how this idea is used in WebFind and other tools.

TC03.4 Mining the Web with Query-Driven Adaptive Crawlers

Effective crawling strategies are essential for topic-specific search engines, where crawlers must maximize the number of relevant pages visited given finite bandwidth resources. We describe several methods for evaluating crawlers and apply these metrics to compare three crawling algorithms based on similarity ranking, link analysis, and adaptive agents.
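
To make the idea of evaluating a crawler under a fixed page budget concrete, here is a minimal sketch of a best-first crawl over an in-memory graph, scored by the fraction of visited pages that are relevant. It is not one of the three algorithms compared in the talk: `best_first_crawl`, `harvest_rate`, the precomputed relevance `scores`, and the 0.5 threshold are all assumptions made for this example.

```python
import heapq


def best_first_crawl(graph, scores, seeds, budget):
    """Visit at most `budget` pages, always expanding the most promising link.

    graph:  dict mapping a page to the pages it links to (hypothetical data).
    scores: dict mapping a page to a relevance score in [0, 1] (assumed known).
    """
    # heapq is a min-heap, so priorities are stored negated.
    frontier = [(-scores.get(s, 0.0), s) for s in seeds]
    heapq.heapify(frontier)
    seen = set(seeds)
    visited = []
    while frontier and len(visited) < budget:
        _, page = heapq.heappop(frontier)
        visited.append(page)
        for nxt in graph.get(page, []):
            if nxt not in seen:
                seen.add(nxt)
                # An unfetched link inherits the score of the page citing it.
                heapq.heappush(frontier, (-scores.get(page, 0.0), nxt))
    return visited


def harvest_rate(visited, scores, threshold=0.5):
    """Fraction of visited pages whose relevance meets the threshold."""
    if not visited:
        return 0.0
    relevant = sum(1 for p in visited if scores.get(p, 0.0) >= threshold)
    return relevant / len(visited)


graph = {"a": ["a1", "a2"], "b": ["b1"]}
scores = {"a": 0.9, "a1": 0.8, "a2": 0.7, "b": 0.2, "b1": 0.1}
visited = best_first_crawl(graph, scores, seeds=["a", "b"], budget=4)
rate = harvest_rate(visited, scores)
```

With this toy graph the crawler spends its budget on the links of the relevant seed before touching the weaker one, so three of the four visited pages meet the threshold.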


For information on individual presentations, please contact the authors directly.

Copyright © Institute for Operations Research and the Management Sciences