
Data-Science-Nanodegree_Project#2

Table of contents:

  1. Introduction
  2. Installations
  3. Project Components
  4. File Descriptions
  5. Licensing, Authors, Acknowledgements, etc.

Introduction

In this project, we will analyze disaster data to build a model for an API that classifies disaster messages. The data contains real messages that were sent during disaster events. We will create a machine learning pipeline to categorize these events so that the messages can be routed to the appropriate disaster relief agency. The project includes a web app where an emergency worker can input a new message and get classification results across several categories. The web app also displays visualizations of the data. Below are a few screenshots of the web app.

Distribution of Message Genres (screenshot)

Distribution of Message Categories (screenshot)

Test example (screenshot)

Installations

You need to install Python 3 and the following packages (pickle ships with Python's standard library; the rest can be installed with pip):

  • pandas
  • tqdm
  • numpy
  • sklearn
  • nltk
  • sqlalchemy
  • pickle
  • flask
  • plotly

Project Components

This project has three components:

1. ETL Pipeline. The Python script process_data.py implements a data cleaning pipeline that:

  1. Loads the messages and categories datasets
  2. Merges the two datasets
  3. Cleans the data
  4. Stores it in a SQLite database
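The four ETL steps above can be sketched as follows. This is a minimal illustration, not the actual process_data.py: the function name, column names (`id`, `categories`), and table name are assumptions based on the typical layout of this dataset.

```python
# Sketch of the ETL steps: load, merge, clean, store (illustrative only).
import pandas as pd
from sqlalchemy import create_engine

def run_etl(messages_csv, categories_csv, db_path):
    # 1. Load the messages and categories datasets
    messages = pd.read_csv(messages_csv)
    categories = pd.read_csv(categories_csv)
    # 2. Merge the two datasets on their shared id column
    df = messages.merge(categories, on="id")
    # 3. Clean: split the semicolon-delimited category string into one
    #    binary column per category, then drop duplicate rows
    cats = df["categories"].str.split(";", expand=True)
    cats.columns = [c.split("-")[0] for c in cats.iloc[0]]
    for col in cats.columns:
        cats[col] = cats[col].str[-1].astype(int)
    df = pd.concat([df.drop(columns="categories"), cats], axis=1)
    df = df.drop_duplicates()
    # 4. Store the result in a SQLite database
    engine = create_engine(f"sqlite:///{db_path}")
    df.to_sql("messages", engine, index=False, if_exists="replace")
    return df
```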

2. ML Pipeline. The Python script train_classifier.py implements a machine learning pipeline that:

  1. Loads data from the SQLite database
  2. Splits the dataset into training and test sets
  3. Builds a text processing and machine learning pipeline
  4. Trains and tunes a model using GridSearchCV
  5. Outputs results on the test set
  6. Exports the final model as a pickle file
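The ML pipeline steps can be sketched like this. It is a simplified stand-in for train_classifier.py: the TF-IDF vectorizer, random forest estimator, and tiny parameter grid are assumptions (the real script may use NLTK tokenization and a different grid).

```python
# Sketch of the ML pipeline: split, build, tune, evaluate, export.
import pickle
from sklearn.pipeline import Pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.multioutput import MultiOutputClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.metrics import classification_report

def train_and_export(X, Y, model_path):
    # 2. Split the dataset into training and test sets
    X_train, X_test, Y_train, Y_test = train_test_split(X, Y, random_state=42)
    # 3. Build a text processing + machine learning pipeline
    pipeline = Pipeline([
        ("tfidf", TfidfVectorizer()),
        ("clf", MultiOutputClassifier(RandomForestClassifier(random_state=42))),
    ])
    # 4. Train and tune with GridSearchCV (tiny illustrative grid)
    grid = {"clf__estimator__n_estimators": [10, 20]}
    cv = GridSearchCV(pipeline, grid, cv=2)
    cv.fit(X_train, Y_train)
    # 5. Output results on the test set
    print(classification_report(Y_test, cv.predict(X_test), zero_division=0))
    # 6. Export the tuned model as a pickle file
    with open(model_path, "wb") as f:
        pickle.dump(cv.best_estimator_, f)
    return cv
```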

3. Flask Web App. This component displays the classification results and data visualizations in a Flask web app.
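A minimal sketch of how run.py might wire this together: accept a message from the web form and render the per-category results. The route name, query parameter, and the stand-in `classify` function are illustrative (the real app loads models/classifier.pkl and renders go.html).

```python
# Minimal Flask sketch: classify a submitted message and render results.
from flask import Flask, request, render_template_string

app = Flask(__name__)

def classify(message):
    # Stand-in for the pickled model so the sketch is self-contained;
    # the real app would call model.predict([message]).
    return {"water": int("water" in message.lower())}

@app.route("/go")
def go():
    # Read the message typed by the emergency worker and classify it
    query = request.args.get("query", "")
    results = classify(query)
    return render_template_string(
        "<p>{{ query }}</p><ul>{% for k, v in results.items() %}"
        "<li>{{ k }}: {{ v }}</li>{% endfor %}</ul>",
        query=query, results=results)
```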

File Descriptions

- app
| - template
| |- master.html  # main page of web app
| |- go.html  # classification result page of web app
|- run.py  # Flask file that runs app

- data
|- disaster_categories.csv  # data to process 
|- disaster_messages.csv  # data to process
|- process_data.py
|- InsertDatabaseName.db   # database to save clean data to

- models
|- train_classifier.py
|- classifier.pkl  # saved model 

- screenshots
|- screenshot1.png
|- screenshot2.png 
|- screenshot3.png

- README.md

To run the app, change into the app folder and run `python run.py`.

Licensing, Authors, Acknowledgements, etc.

Thanks to Udacity for providing this fun project :)!
