This project uses Pandas and NumPy for the data exploration phase, and Matplotlib and Seaborn for visualizations. We then use Scikit-Learn to fit a multiple linear regression model. Categorical data is handled by creating dummy variables, and the continuous features of the dataset are log-transformed.
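As a rough illustration of the preprocessing and modeling steps described above, here is a minimal sketch. The data is invented toy data, and the column names (`price`, `sqft_living`, `zipcode`) are assumptions rather than confirmed names from the notebook; the notebook's actual pipeline may differ.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Toy data invented for illustration only; column names are assumptions.
df = pd.DataFrame({
    "price": [300000, 450000, 520000, 610000, 750000, 980000],
    "sqft_living": [1000, 1500, 1800, 2100, 2600, 3400],
    "zipcode": ["98001", "98001", "98052", "98052", "98004", "98004"],
})

# Log-transform the continuous columns to reduce right skew.
df["log_price"] = np.log(df["price"])
df["log_sqft_living"] = np.log(df["sqft_living"])

# One-hot encode the categorical column; drop_first avoids a redundant
# (perfectly collinear) dummy column.
zip_dummies = pd.get_dummies(
    df["zipcode"], prefix="zip", drop_first=True, dtype=float
)

# Assemble the feature matrix and target, then fit the regression.
X = pd.concat([df[["log_sqft_living"]], zip_dummies], axis=1)
y = df["log_price"]

model = LinearRegression().fit(X, y)
r_squared = model.score(X, y)  # R^2 on the training data
```

Because the target is log-transformed, predictions from a model like this are in log dollars; `np.exp` converts them back to prices.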
The purpose of this project is to find ways to maximize profitability for sellers attempting to sell a home in King County, WA. We will search for actionable insights that can serve as guidance to these sellers, but we first need a thorough understanding of the dynamics of the housing market in order to make well-founded decisions.
Recommendation #1:
My first recommendation to sellers would be to make living space square footage their focal point. The correlation between square footage of living space and the price of the home is fairly high compared to the other features. Larger homes tend to command higher asking prices, so homes on the larger end of the spectrum are likely to generate the most revenue.
Recommendation #2:
My second recommendation would be to pay particular attention to the location of the home. House prices are clustered by zipcode. Many factors tied to the zipcode may influence the price either positively or negatively, and sellers should be mindful of that.
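One simple way to see this kind of clustering is to compare typical prices per zipcode. The sketch below uses invented toy data and assumed column names (`zipcode`, `price`); it is not the notebook's actual analysis.

```python
import pandas as pd

# Toy data invented for illustration only; column names are assumptions.
df = pd.DataFrame({
    "zipcode": ["98001", "98001", "98052", "98052", "98004", "98004"],
    "price": [300000, 450000, 520000, 610000, 750000, 980000],
})

# Median sale price per zipcode, highest first. The median is used here
# because it is less sensitive to a few very expensive homes than the mean.
median_by_zip = (
    df.groupby("zipcode")["price"]
    .median()
    .sort_values(ascending=False)
)
```

Large gaps between the medians of different zipcodes would be consistent with the clustering described above.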
Recommendation #3:
My third recommendation would be to pay attention to the grade King County assigns to the home, which is very influential in its price. In general, as the grade increases, the price increases as well, reflecting a positive correlation between the two.
Sidenote: The grade distribution follows an approximately normal curve, which suggests that grades are being issued in a forthright and diligent manner. It would be interesting to see what goes into the grading of the homes, but that's a project for another time.
This correlation heatmap was used throughout the project to guide the feature selection process, and it may be very helpful in finding other interesting correlations to experiment with.
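For readers who want to reproduce something similar, the sketch below computes the correlation matrix that a heatmap like this is built from and ranks features by their correlation with price. The data is invented toy data and the column names (`price`, `sqft_living`, `grade`, `bedrooms`) are assumptions.

```python
import pandas as pd

# Toy data invented for illustration only; column names are assumptions.
df = pd.DataFrame({
    "price": [300000, 450000, 520000, 610000, 750000, 980000],
    "sqft_living": [1000, 1500, 1800, 2100, 2600, 3400],
    "grade": [6, 7, 7, 8, 8, 10],
    "bedrooms": [2, 3, 3, 4, 3, 5],
})

# Pairwise Pearson correlations between all numeric columns.
corr = df.corr()

# Rank the candidate features by their correlation with price.
corr_with_price = (
    corr["price"].drop("price").sort_values(ascending=False)
)
```

With Seaborn available, `sns.heatmap(corr, annot=True, cmap="coolwarm")` would render this matrix as a heatmap.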
Please take a look at the Jupyter notebook included with this repository. It contains bonus recommendations and ideas for future work and research to keep in mind if you hope to expand on this project.
1. Make living space square footage your number one feature to look out for.
2. Location is an extremely important feature when evaluating the price of a home.
3. The grade given to a home by the King County Housing Department is very influential in the price.