
A mini Wikipedia search engine that uses block sort-based indexing (BSBI) to build an inverted index from a given Wikipedia dump, runs queries against that index, and retrieves the top 10 results by ranking documents for relevance with tf-idf scoring.


RArora28/Wikipedia-Search-Engine


To set up PyStemmer, follow the instructions in /PyStemmer-1.0.1/README.

To create the inverted index of a dump, run: python wiki_indexer.py <input_wiki_file_name_dump> <output_file>
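For readers unfamiliar with block sort-based indexing, here is a minimal sketch of the general technique. It is not the code in wiki_indexer.py: the function names (`bsbi_index`, `write_block`, `read_block`), the tab-separated block file format, and whitespace tokenization are all assumptions made for illustration. Documents are indexed one block at a time, each block is written to disk sorted by term, and the sorted blocks are then merged in a single pass.

```python
import heapq
import itertools
import os
import tempfile
from collections import defaultdict


def write_block(postings, path):
    """Write one in-memory block to disk, sorted by term (hypothetical format)."""
    with open(path, "w", encoding="utf-8") as f:
        for term in sorted(postings):
            ids = ",".join(str(d) for d in sorted(postings[term]))
            f.write(f"{term}\t{ids}\n")


def read_block(path):
    """Stream (term, doc_ids) pairs back from a sorted block file."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            term, ids = line.rstrip("\n").split("\t")
            yield term, [int(d) for d in ids.split(",")]


def bsbi_index(docs, block_size, workdir):
    """Yield (term, sorted doc_ids) in term order for an iterable of (doc_id, text)."""
    block_paths = []
    postings = defaultdict(set)
    docs_in_block = 0

    def flush():
        path = os.path.join(workdir, f"block{len(block_paths)}.txt")
        write_block(postings, path)
        block_paths.append(path)

    # Phase 1: build a small inverted index per block and spill it to disk.
    for doc_id, text in docs:
        for token in text.lower().split():  # assumption: whitespace tokens, no stemming
            postings[token].add(doc_id)
        docs_in_block += 1
        if docs_in_block == block_size:
            flush()
            postings = defaultdict(set)
            docs_in_block = 0
    if postings:
        flush()

    # Phase 2: k-way merge of the sorted blocks, combining equal terms.
    streams = [read_block(p) for p in block_paths]
    merged = heapq.merge(*streams, key=lambda pair: pair[0])
    for term, group in itertools.groupby(merged, key=lambda pair: pair[0]):
        doc_ids = set()
        for _, ids in group:
            doc_ids.update(ids)
        yield term, sorted(doc_ids)
```

A small usage example, with two documents per block:

```python
docs = [(1, "apple banana"), (2, "banana cherry"), (3, "apple cherry")]
with tempfile.TemporaryDirectory() as workdir:
    index = dict(bsbi_index(docs, block_size=2, workdir=workdir))
```

Only one block is held in memory at a time, which is the reason to use this approach on a dump that is too large to index in a single pass.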

To search the indexed dump, run: python query.py
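The relevance ranking step can be illustrated with a short tf-idf sketch. This is an assumption-laden illustration, not the logic of query.py: the names `build_index` and `search`, the log-scaled term frequency, and the `log(N / df)` inverse document frequency are one common tf-idf variant, and the repository may weight terms differently.

```python
import math
from collections import Counter, defaultdict


def build_index(docs):
    """Map each term to {doc_id: term frequency} for a dict of doc_id -> text."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for term, tf in Counter(text.lower().split()).items():
            index[term][doc_id] = tf
    return index


def search(query, index, n_docs, k=10):
    """Return up to k (doc_id, score) pairs, highest tf-idf score first."""
    scores = defaultdict(float)
    for term in query.lower().split():
        postings = index.get(term)
        if not postings:
            continue  # term does not occur in any document
        idf = math.log(n_docs / len(postings))
        for doc_id, tf in postings.items():
            scores[doc_id] += (1 + math.log(tf)) * idf
    # Sort by descending score, breaking ties by doc_id for stable output.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:k]
```

For example:

```python
docs = {1: "wiki search engine", 2: "search search index", 3: "cooking recipes"}
index = build_index(docs)
results = search("search index", index, n_docs=len(docs))
```

Here document 2 ranks above document 1 because it contains both query terms and repeats "search", while document 3 matches neither term and is not returned.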
