Forest fires are a major environmental issue, causing economic and ecological damage while endangering human lives. The objective of this project is to identify which states and counties are the most fire-prone and to predict the cause of a wildfire. Such a model enables the appropriate organizations to take preventative action, such as cutting firebreaks, and informs planning and preparedness activities, such as deciding where to store fire retardant. I evaluated Decision Tree, Random Forest Classifier, and Gradient Boosting Decision Tree models. The Random Forest Classifier trained in this project predicts the cause of a wildfire with an accuracy of at least 58%. Reducing the number of labels (fire cause classes) raises the Random Forest Classifier's prediction score to 80%.
Data source: https://www.kaggle.com/rtatman/188-million-us-wildfires
Pre-processed and post-processed data used in this project: https://github.com/lasyabheemendra/Sprigboard-DatascienceProjects/tree/master/Capstone1_US-Wildfire-Prediction/Data
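
To illustrate why reducing the number of fire cause classes can raise the prediction score, here is a minimal sketch in plain Python. The cause names, the `CAUSE_GROUPS` mapping, and the toy labels are all assumptions made up for illustration; they are not the project's actual grouping or results. The sketch also merges labels after prediction for simplicity, whereas the project presumably retrained the model on the reduced label set.

```python
# Hypothetical grouping of fine-grained fire causes into coarser classes.
# Both the label names and the grouping are illustrative assumptions.
CAUSE_GROUPS = {
    "Lightning": "natural",
    "Debris Burning": "accidental",
    "Campfire": "accidental",
    "Equipment Use": "accidental",
    "Smoking": "accidental",
    "Arson": "malicious",
    "Miscellaneous": "other",
}


def accuracy(y_true, y_pred):
    """Return the fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy of empty input")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def merge_labels(labels, groups=CAUSE_GROUPS):
    """Map each fine-grained label to its coarse group."""
    return [groups[label] for label in labels]


# Toy labels invented for this example (not data from the project).
y_true = ["Lightning", "Campfire", "Smoking", "Arson", "Debris Burning", "Miscellaneous"]
y_pred = ["Lightning", "Smoking", "Campfire", "Arson", "Equipment Use", "Arson"]

fine_acc = accuracy(y_true, y_pred)
coarse_acc = accuracy(merge_labels(y_true), merge_labels(y_pred))

print(f"fine-grained accuracy: {fine_acc:.2f}")
print(f"coarse accuracy:       {coarse_acc:.2f}")
```

Confusions between causes that fall in the same group (for example, Campfire predicted as Smoking) stop counting as errors once the labels are merged, so the coarse score is never lower than the fine-grained one on the same predictions.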