Information Retrieval and Search Engines (信息检索与搜索引擎)
Introduction to Information Retrieval
GESC1007
Philippe Fournier-Viger
Full professor
School of Natural Sciences and Humanities
Spring 2021
Last week
We have discussed:
Evaluation in an information retrieval system
Today:
Web search engines
Second assignment
About the final exam
Course schedule (日程安排)
Week 1: Introduction (Chapter 1); Boolean retrieval
Week 2: Term vocabulary and posting lists (Chapter 2)
Week 3: Dictionaries and tolerant retrieval (Chapter 3)
Week 4: Index construction (Chapter 4)
Week 5: Scoring, term weighting, the vector space model (Chapter 6)
Week 6: A complete search system (Chapter 7)
Week 7: Evaluation in information retrieval
Week 8: Web search engines, advanced topics, conclusion
Final exam (to be announced)
Web search engines
The Web
What is special about the Web?
The number of documents (very large)
Lack of coordination in the creation of documents.
Diversity of backgrounds and motives of content creators.
The Web
The Web is a set of webpages (网页)
Webpages are created using a language called HTML
[Figure: a webpage shown next to its HTML code. Image source fragment: -a-Simple-Web-Page-with-HTML]
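To make the idea of an HTML page concrete, here is a minimal sketch in Python. The page content is a made-up example, not one from the course. The script stores a tiny HTML page in a string and uses the standard `html.parser` module to separate the markup (tags) from the title and the visible text, which is roughly the part a search engine would want to index.

```python
from html.parser import HTMLParser

# A hypothetical minimal webpage written in HTML (made up for illustration).
PAGE = """<!DOCTYPE html>
<html>
  <head><title>My first webpage</title></head>
  <body>
    <h1>Hello</h1>
    <p>This is a <b>simple</b> webpage.</p>
  </body>
</html>"""


class TextExtractor(HTMLParser):
    """Collects the title and the visible text of an HTML page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.text_parts = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        # Remember when we are inside <title> ... </title>.
        if tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # Text between tags: either part of the title or of the body.
        if self._in_title:
            self.title += data
        elif data.strip():
            self.text_parts.append(data.strip())


def extract(page):
    """Return (title, visible text) for an HTML string."""
    parser = TextExtractor()
    parser.feed(page)
    parser.close()
    return parser.title, " ".join(parser.text_parts)


title, text = extract(PAGE)
print("Title:", title)
print("Text: ", text)
```

The tags such as `<h1>` and `<p>` tell the browser how to display the page; they are not part of the text that the user reads.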
The Web
Webpages are stored on servers (服务器)
To access a webpage, one must use software called a Web browser (浏览器)
[Figure: a Web browser at home connects through the Internet to the server of HITSZ]
Webpages are sent over the Internet using the HTTP protocol (HTTP协议)
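As an illustration of what travels between the browser and the server, the sketch below builds the text of a minimal HTTP/1.1 GET request and parses a sample response. It does not use the network. The host name `www.example.com`, the path, and the sample response are assumptions made up for the example, and the helper names `build_get_request` and `parse_response` are hypothetical.

```python
def build_get_request(host, path="/"):
    """Return the text of a minimal HTTP/1.1 GET request."""
    lines = [
        f"GET {path} HTTP/1.1",   # request line: method, path, version
        f"Host: {host}",          # which website on the server we want
        "Connection: close",      # ask the server to close after replying
        "",                       # an empty line ends the headers
        "",
    ]
    return "\r\n".join(lines)


def parse_response(raw):
    """Split a raw HTTP response into (status code, headers, body)."""
    head, _, body = raw.partition("\r\n\r\n")
    status_line, *header_lines = head.split("\r\n")
    _version, code, _reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return int(code), headers, body


# Hypothetical exchange: the host name and the response are made up.
request = build_get_request("www.example.com", "/index.html")

sample_response = (
    "HTTP/1.1 200 OK\r\n"
    "Content-Type: text/html; charset=utf-8\r\n"
    "Content-Length: 34\r\n"
    "\r\n"
    "<html><body>Welcome!</body></html>"
)

status, headers, body = parse_response(sample_response)
print(request)
print(status, headers["content-type"])
print(body)
```

In practice the browser handles these messages automatically; a Python program would normally use a library such as `urllib.request` instead of writing them by hand.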
The Web
The idea of the Web: each webpage contains links to other webpages (hyperlinks - 超链接).
Each webpage has an address (URL).
Creating a simple webpage is not very difficult.
Webpages have become one of the best ways to supply and consume information.
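The relation between hyperlinks and URLs can be sketched in a few lines of Python. The example below is only an illustration: the page address and the HTML fragment are made up. It finds the `<a href="...">` links in a page and turns each one into a full URL with `urllib.parse.urljoin`, which is the basic step a program needs in order to follow links from one webpage to another.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkExtractor(HTMLParser):
    """Collects the hyperlinks of a page as absolute URLs."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        # Hyperlinks are written as <a href="..."> in HTML.
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the address of the page.
                self.links.append(urljoin(self.base_url, value))


def extract_links(base_url, html):
    """Return the absolute URLs of all hyperlinks found in html."""
    parser = LinkExtractor(base_url)
    parser.feed(html)
    parser.close()
    return parser.links


# Hypothetical page address and content (made up for illustration).
PAGE_URL = "http://www.example.com/courses/index.html"
PAGE_HTML = """
<p>See the <a href="schedule.html">schedule</a>,
the <a href="/about.html">about page</a> and
<a href="http://www.example.org/">another site</a>.</p>
"""

links = extract_links(PAGE_URL, PAGE_HTML)
for link in links:
    # urlparse splits a URL into its parts; netloc is the server name.
    print(urlparse(link).netloc, link)
```

The first two links point to other pages on the same server, while the third leads to a different server.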
The Web
Billions of webpages containing information.
But if we cannot search this informati