- Introduction
- Installations
- Project Components
- File Descriptions
- Licensing, Authors, Acknowledgements, etc.
In this project, we analyze disaster data to build a model for an API that classifies disaster messages. The data contains real messages that were sent during disaster events. We create a machine learning pipeline to categorize these messages so that each one can be routed to an appropriate disaster relief agency. The project includes a web app where an emergency worker can input a new message and get classification results in several categories. The web app also displays visualizations of the data. Below are a few screenshots of the web app.
Distribution of Message Genres:
Distribution of Message Categories:
You need to install Python 3 and the following packages:
- pandas
- tqdm
- numpy
- scikit-learn (imported as sklearn)
- nltk
- sqlalchemy
- pickle (part of the Python standard library, no separate install needed)
- flask
- plotly
There are three components of this project:
1. ETL Pipeline: the Python script process_data.py implements a data cleaning pipeline that:
- Loads the messages and categories datasets
- Merges the two datasets
- Cleans the data
- Stores it in a SQLite database
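The ETL steps above can be sketched roughly as follows. This is a minimal illustration, not the actual contents of process_data.py. It assumes that both datasets share an id column, that the categories column holds strings of the form name-0;name-1, and that the cleaned table is named messages; none of these details are stated here, so adjust them to the real data. It also uses the standard-library sqlite3 module instead of sqlalchemy to keep the example short, and tiny in-memory frames instead of the CSV files.

```python
import sqlite3

import pandas as pd


def merge_datasets(messages: pd.DataFrame, categories: pd.DataFrame) -> pd.DataFrame:
    """Merge the two datasets on their shared key (assumed to be 'id')."""
    return messages.merge(categories, on="id")


def clean(df: pd.DataFrame) -> pd.DataFrame:
    """Expand 'name-0;name-1' strings into 0/1 columns and drop duplicate rows."""
    split = df["categories"].str.split(";", expand=True)
    # Category names are the text before the dash in the first row (assumed format).
    split.columns = [cell.split("-")[0] for cell in split.iloc[0]]
    for col in split.columns:
        # Keep only the trailing digit and store it as an integer flag.
        split[col] = split[col].str.split("-").str[-1].astype(int)
    df = pd.concat([df.drop(columns="categories"), split], axis=1)
    return df.drop_duplicates()


def save(df: pd.DataFrame, conn: sqlite3.Connection, table: str = "messages") -> None:
    """Write the cleaned frame to a SQLite table (table name is hypothetical)."""
    df.to_sql(table, conn, index=False, if_exists="replace")


# Toy stand-ins for disaster_messages.csv and disaster_categories.csv.
messages = pd.DataFrame({
    "id": [1, 2, 2],
    "message": ["We need water", "Roads are flooded", "Roads are flooded"],
    "genre": ["direct", "news", "news"],
})
categories = pd.DataFrame({
    "id": [1, 2],
    "categories": ["water-1;floods-0", "water-0;floods-1"],
})

cleaned = clean(merge_datasets(messages, categories))

conn = sqlite3.connect(":memory:")  # the real script would write to a .db file
save(cleaned, conn)
stored = pd.read_sql("SELECT * FROM messages", conn)
conn.close()
print(stored)
```

With real data, the two toy frames would be replaced by pd.read_csv calls on the files in the data folder.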
2. ML Pipeline: the Python script train_classifier.py implements a machine learning pipeline that:
- Loads data from the SQLite database
- Splits the dataset into training and test sets
- Builds a text processing and machine learning pipeline
- Trains and tunes a model using GridSearchCV
- Outputs results on the test set
- Exports the final model as a pickle file
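As a rough sketch of the ML steps above, the following example trains and tunes a small multi-label text classifier. It is not the actual train_classifier.py: the choice of TfidfVectorizer with default tokenization (rather than an nltk-based tokenizer), a RandomForestClassifier wrapped in MultiOutputClassifier, the single-parameter grid, and the toy data are all assumptions made for illustration.

```python
import pickle

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.multioutput import MultiOutputClassifier
from sklearn.pipeline import Pipeline


def build_model() -> GridSearchCV:
    """Text processing + classifier pipeline wrapped in a small grid search."""
    pipeline = Pipeline([
        ("tfidf", TfidfVectorizer()),  # bag-of-words with TF-IDF weighting
        ("clf", MultiOutputClassifier(RandomForestClassifier(random_state=0))),
    ])
    # Hypothetical grid: only the number of trees is tuned here.
    param_grid = {"clf__estimator__n_estimators": [10, 20]}
    return GridSearchCV(pipeline, param_grid, cv=2)


# Toy data standing in for the table loaded from the SQLite database.
texts = [
    "we need clean water",
    "please send drinking water",
    "the river flooded our street",
    "flooding destroyed the bridge",
] * 3
labels = np.array([[1, 0], [1, 0], [0, 1], [0, 1]] * 3)  # columns: water, floods

X_train, X_test, Y_train, Y_test = train_test_split(
    texts, labels, test_size=0.25, random_state=0
)

model = build_model()
model.fit(X_train, Y_train)

predictions = model.predict(X_test)
# Fraction of test messages for which every category was predicted correctly.
print("exact-match accuracy:", np.mean((predictions == Y_test).all(axis=1)))

# Export the tuned pipeline; the real script would write this to classifier.pkl.
blob = pickle.dumps(model.best_estimator_)
restored = pickle.loads(blob)
restored_predictions = restored.predict(X_test)
```

For per-category precision and recall, sklearn.metrics.classification_report can be called on each label column in turn.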
3. Flask Web App: this component serves a Flask web app that displays the classification results and the data visualizations.
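To illustrate how a Flask route can return classification results for a new message, here is a minimal sketch. It is not the project's run.py: the /go route name, the query parameter, the JSON response (the real app renders HTML templates), and the keyword-based classify function standing in for the pickled model are all hypothetical.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

# Hypothetical stand-in for the trained model loaded from the pickle file.
CATEGORY_KEYWORDS = {
    "water": ["water", "thirsty"],
    "floods": ["flood"],
}


def classify(message: str) -> dict:
    """Return a 0/1 flag per category based on simple keyword matching."""
    text = message.lower()
    return {
        category: int(any(keyword in text for keyword in keywords))
        for category, keywords in CATEGORY_KEYWORDS.items()
    }


@app.route("/go")
def go():
    """Classify the message passed in the 'query' URL parameter."""
    query = request.args.get("query", "")
    return jsonify(query=query, classification=classify(query))


# Exercise the route without starting a server.
client = app.test_client()
response = client.get("/go", query_string={"query": "We need water"})
result = response.get_json()
print(result)
```

In the real app, classify would call the loaded model's predict method, and the route would pass the results to a template instead of returning JSON.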
- app
| - template
| |- master.html # main page of web app
| |- go.html # classification result page of web app
|- run.py # Flask file that runs app
- data
|- disaster_categories.csv # data to process
|- disaster_messages.csv # data to process
|- process_data.py
|- InsertDatabaseName.db # database to save clean data to
- models
|- train_classifier.py
|- classifier.pkl # saved model
- screenshots
|- screenshot1.png
|- screenshot2.png
|- screenshot3.png
- README.md
To run the app, go to the app folder and run: python run.py
Thanks to Udacity for providing this fun project :)!