A mini Wikipedia search engine. It builds an inverted index of a given Wikipedia dump using blocked sort-based indexing (BSBI), runs queries against that index, and returns the top 10 results, ranked by document relevance using tf-idf scoring.
To set up PyStemmer, follow /PyStemmer-1.0.1/README
To create the inverted index of a dump, run: python wiki_indexer.py <input_wiki_file_name_dump> <output_file>
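For readers unfamiliar with blocked sort-based indexing, here is a minimal sketch of the general technique. It is not the code in wiki_indexer.py: the function names (write_block, read_block, bsbi_index), the whitespace tokenizer, the tab-separated block format, and the tiny block_size are all assumptions made for illustration. The idea is to collect (term, doc_id) pairs in fixed-size blocks, sort each block and write it to disk, then merge the sorted blocks into one index.

```python
import heapq
import os
import tempfile
from collections import Counter
from itertools import groupby


def write_block(pairs, directory, block_id):
    """Sort one in-memory block of (term, doc_id) pairs and write it to disk."""
    path = os.path.join(directory, f"block_{block_id}.txt")
    with open(path, "w", encoding="utf-8") as f:
        for term, doc_id in sorted(pairs):
            f.write(f"{term}\t{doc_id}\n")
    return path


def read_block(path):
    """Stream (term, doc_id) pairs back from a sorted block file."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            term, doc_id = line.rstrip("\n").split("\t")
            yield term, int(doc_id)


def bsbi_index(docs, block_size=4):
    """Build {term: {doc_id: term_frequency}} from an iterable of (doc_id, text).

    Hypothetical helper: tokenization is a plain lowercase whitespace split,
    with no stemming or stop-word removal.
    """
    index = {}
    with tempfile.TemporaryDirectory() as tmp:
        paths, block = [], []
        for doc_id, text in docs:
            for term in text.lower().split():
                block.append((term, doc_id))
                if len(block) >= block_size:
                    # Block is full: sort it, flush it to disk, start a new one.
                    paths.append(write_block(block, tmp, len(paths)))
                    block = []
        if block:
            paths.append(write_block(block, tmp, len(paths)))

        # k-way merge of the sorted blocks, streaming one pair at a time.
        merged = heapq.merge(*(read_block(p) for p in paths))
        for term, group in groupby(merged, key=lambda pair: pair[0]):
            # Count repeated (term, doc_id) pairs to get term frequencies.
            index[term] = dict(Counter(doc_id for _, doc_id in group))
    return index
```

In this sketch the merged index is kept in memory for simplicity. For a real dump, the merge step would typically write the postings straight to the output file instead.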
To search the indexed dump, run: python query.py
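As a rough illustration of how tf-idf ranking can pick the top 10 documents, here is a minimal sketch. It does not reproduce query.py: the function name top_k, the in-memory index layout {term: {doc_id: term_frequency}}, and the particular weighting (log-scaled term frequency times log inverse document frequency) are assumptions, and the project may use a different tf-idf variant.

```python
import heapq
import math
from collections import defaultdict


def top_k(query, index, num_docs, k=10):
    """Return up to k (doc_id, score) pairs, highest tf-idf score first.

    Hypothetical helper. `index` maps term -> {doc_id: term_frequency},
    and `num_docs` is the total number of indexed documents.
    """
    scores = defaultdict(float)
    for term in query.lower().split():
        postings = index.get(term, {})
        if not postings:
            continue  # term does not occur in any document
        # Rare terms get a higher weight than common ones.
        idf = math.log(num_docs / len(postings))
        for doc_id, tf in postings.items():
            # Log-scaled term frequency (an assumed weighting choice).
            scores[doc_id] += (1 + math.log(tf)) * idf
    return heapq.nlargest(k, scores.items(), key=lambda item: item[1])
```

Using heapq.nlargest avoids sorting every scored document when only the top 10 are needed.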