Query Processing in Web Search Engines: Techniques and Challenges
By
Prof. Torsten Suel
Principal Research Scientist, Yahoo! Research
Associate Professor, Department of Computer and Information Science, Polytechnic University, Brooklyn, New York
Date: April 30, 2008 (Wednesday)
Time: 2:00 p.m. - 3:00 p.m.
Venue: Rm. 1009 William MW Mong Engineering Building, CUHK
Abstract:
Large web search engines have to answer thousands of queries
per second with interactive response times. Due to
the size of the data sets involved, usually in the
range of many terabytes, a single query may require
the processing of hundreds of megabytes or gigabytes
of index data. To keep up with this immense workload,
large search engines employ clusters of thousands
of machines, and various techniques such as caching,
index compression, and index and query pruning are
used to improve scalability. In this talk, we provide
a brief introduction to query processing in large
web search engines. We outline the basic architecture
and query execution framework, describe various techniques
for query optimization, and discuss recent developments
and open research challenges in this area.
The talk will be largely self-contained and should
be accessible to anybody with a general background
in Computer Science.
Biography:
Torsten Suel is a Principal Research Scientist at Yahoo! Research,
and an Associate Professor in the Department of Computer
and Information Science at Polytechnic University
in Brooklyn, NY. He received a Diplom degree from
the Technical University of Braunschweig (Germany),
and a Ph.D. from the University of Texas at Austin.
After postdoctoral research at the NEC Research Institute,
UC Berkeley, and Bell Labs, he joined Polytechnic
University in the Fall of 1998. His main research
interests are in the areas of web search engines and
web data mining, algorithms, databases, and distributed
systems.