This project is a web-based application that recommends books based on a user's query. The app utilizes NLP techniques to tokenize and analyze book descriptions, and applies TF-IDF and Latent Semantic Indexing (LSI) models to find and recommend similar books. Flask is used for the backend server and web interface.
Note: If you're a contributor, please read CONTRIBUTING.md.
- Book Similarity Search: Given a search query, the system finds the top 5 books that are most relevant based on their descriptions.
- TF-IDF and LSI Models: Uses trained TF-IDF and LSI models to analyze book descriptions and generate similarity scores.
- Book Descriptions: Provides a truncated description of each recommended book (first three sentences).
- Relevance Score: Displays the relevance of each recommended book to the user's query as a percentage.
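
The description truncation and percentage display listed above can be sketched as follows. This is a minimal illustration, not the app's actual code: the helper names `truncate_description` and `format_relevance` are hypothetical, the regex sentence splitter is a naive stand-in (the app may well use spaCy for this), and the score is assumed to be a similarity value between 0 and 1.

```python
import re


def truncate_description(text, max_sentences=3):
    """Return the first `max_sentences` sentences of `text`."""
    # Naive assumption: a sentence ends at '.', '!' or '?' followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return " ".join(sentences[:max_sentences])


def format_relevance(score):
    """Render a similarity score (assumed to lie in [0, 1]) as a percentage."""
    clamped = max(0.0, min(1.0, score))  # guard against out-of-range scores
    return f"{round(clamped * 100)}%"


print(truncate_description("One. Two! Three? Four."))  # One. Two! Three?
print(format_relevance(0.873))  # 87%
```

The regex splitter will mis-split abbreviations such as "Dr. Smith", so a proper sentence segmenter is likely a better fit for real descriptions.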
- Python 3.7+
- Flask
- Pandas
- spaCy
- Gensim
- Clone this repository:

      git clone https://github.com/your-username/book-recommendation-system.git
      cd book-recommendation-system

- Create a virtual environment (optional but recommended):

      python3 -m venv venv
      source venv/bin/activate  # On Windows use venv\Scripts\activate

- Install the required Python packages:

      pip install -r requirements.txt

- Download the spaCy English language model:

      python -m spacy download en_core_web_sm
- Place the dataset `Book_Dataset_1.csv` in the project root directory.
- If the models (`models.pickle`) are not available, they will be trained automatically when the app is first run.
- Start the Flask application:

      python app.py

- Open your browser and navigate to http://127.0.0.1:5000/ to access the application.
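
The "train automatically if `models.pickle` is missing" behavior described above is a common load-or-train caching pattern. The sketch below shows one way it might look; the function name `load_or_train` and the `train_fn` callback are hypothetical, and the real app may structure this differently.

```python
import os
import pickle


def load_or_train(path, train_fn):
    """Load pickled models from `path`; if the file is missing, train and cache them."""
    if os.path.exists(path):
        with open(path, "rb") as fh:
            return pickle.load(fh)
    models = train_fn()  # hypothetical callback that builds the TF-IDF/LSI models
    with open(path, "wb") as fh:
        pickle.dump(models, fh)
    return models
```

With this pattern, deleting `models.pickle` forces a retrain on the next start, which is useful after the dataset changes. Only unpickle files you created yourself, since loading an untrusted pickle can execute arbitrary code.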
- On the home page, enter a search query related to a book (e.g., "mystery novel").
- The app will return a list of books that are most relevant to your query, showing the book titles, truncated descriptions, and relevance scores.
- Click on the images or links to learn more about each book.
The dataset used is `Book_Dataset_1.csv`, which contains the following columns:
- Title: The title of the book.
- Book_Description: The description or plot summary of the book.
- Image_Link: A URL to the book cover image.
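
Before starting the app, it can help to confirm that the CSV has the three columns listed above. The check below is a small sketch using the standard-library `csv` module rather than pandas, so it runs without extra dependencies; the helper name `missing_columns` is hypothetical.

```python
import csv
import io

REQUIRED_COLUMNS = {"Title", "Book_Description", "Image_Link"}


def missing_columns(csv_file):
    """Return the required columns absent from the CSV header, sorted by name."""
    reader = csv.DictReader(csv_file)
    header = set(reader.fieldnames or [])
    return sorted(REQUIRED_COLUMNS - header)


# In-memory sample standing in for the real file; it lacks Image_Link.
sample = io.StringIO("Title,Book_Description\nDune,A desert planet saga.\n")
print(missing_columns(sample))  # ['Image_Link']
```

For the real dataset, pass an open file handle instead, for example `open("Book_Dataset_1.csv", newline="", encoding="utf-8")`. The UTF-8 encoding is an assumption about how the file was saved.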
- Search Model: The system uses TF-IDF and LSI models for search queries. You can modify the `num_topics` parameter in the LSI model to adjust the granularity of topics.
- "Science fiction"
- "Romantic novels"
- "Historical mystery"
- Added templates: base.html, index.html, login.html, register.html, and results.html.
- Implemented a MySQL-based authentication system.
- A database must be created in MySQL, and its name must be added to frontend.py.
- All MySQL connection parameters must be updated so the app can connect to the database.
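
One way to keep the MySQL connection parameters in a single place is to collect them in a small helper. This is only a sketch: the function name `load_db_config`, the environment variable names, and the placeholder database name `books` are all hypothetical, and frontend.py may expect these values to be written directly into the file instead.

```python
import os


def load_db_config(env=os.environ):
    """Collect MySQL connection parameters, falling back to placeholder defaults."""
    return {
        "host": env.get("MYSQL_HOST", "localhost"),
        "port": int(env.get("MYSQL_PORT", "3306")),  # 3306 is MySQL's default port
        "user": env.get("MYSQL_USER", "root"),
        "password": env.get("MYSQL_PASSWORD", ""),
        "database": env.get("MYSQL_DATABASE", "books"),  # placeholder name
    }


print(load_db_config({"MYSQL_DATABASE": "library"})["database"])  # library
```

Reading the values from the environment keeps credentials out of version control. The resulting dictionary would still need to be passed to whichever MySQL driver the project uses.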