Syllabus
Course Code: Elective-IV PE-CS-D405 | Course Name: Information Retrieval |
||
MODULE NO / UNIT | COURSE SYLLABUS CONTENTS OF MODULE | NOTES |
---|---|---|
1 | Introduction: Goals and history of IR. The impact of the web on IR. The role of artificial intelligence (AI) in IR. Basic IR Models: Boolean and vector-space retrieval models; ranked retrieval; text-similarity metrics; TF-IDF (term frequency/inverse document frequency) weighting; cosine similarity. Basic Tokenizing, Indexing, and Implementation of Vector-Space Retrieval: Simple tokenizing, stop-word removal, and stemming; inverted indices; efficient processing with sparse vectors; Python implementation. |
|
2 | Experimental Evaluation of IR: Performance metrics: recall, precision, and F-measure; evaluations on benchmark text collections. Query Operations and Languages: Relevance feedback; query expansion; query languages. |
|
3 | Text Representation: Word statistics; Zipf's law; Porter stemmer; morphology; index term selection; using thesauri. Metadata and markup languages (SGML, HTML, XML). Web Search: Search engines; spidering; metacrawlers; directed spidering; link analysis (e.g., hubs and authorities, Google PageRank); shopping agents. |
|
4 | Text Categorization and Clustering: Categorization algorithms: naive Bayes, decision trees, and nearest neighbor. Clustering algorithms: agglomerative clustering, k-means, and expectation maximization (EM). Applications to information filtering, organization, and relevance feedback. Recommender Systems: Collaborative filtering and content-based recommendation of documents and products. |