This is the code repository for Essential Statistics for Non-STEM Data Analysts, published by Packt.
Get to grips with the statistics and math knowledge needed to enter the world of data science with Python
Errata: Page 13, section "Data Imputation"
It currently reads: df2[columnName] = df2[columnName].apply(replace-question_mark)
It should read: df2[columnName] = df2[columnName].apply(replace_question_mark)
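The erratum above matters because Python reads the hyphenated spelling as the subtraction `replace - question_mark` rather than as one function name. The sketch below illustrates this in plain Python. The body of `replace_question_mark` is an assumption (the README does not show it); here it is taken to map the placeholder "?" to NaN, and the sample values are made up for illustration.

```python
def replace_question_mark(value):
    """Hypothetical helper: turn the placeholder "?" into NaN, keep other values."""
    if value == "?":
        return float("nan")
    return value


# Made-up column values; with pandas the helper would be passed to apply():
# df2[columnName] = df2[columnName].apply(replace_question_mark)
raw_column = ["12", "?", "7", "?"]
cleaned_column = [replace_question_mark(v) for v in raw_column]
print(cleaned_column)

# The hyphenated spelling is an expression that subtracts two names,
# so it fails with a NameError before apply() is ever called.
try:
    eval("replace-question_mark")
except NameError as err:
    hyphen_error = str(err)
    print("NameError:", hyphen_error)
```

An underscore keeps the identifier in one piece, which is why the corrected line passes the function object itself to `apply`.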
Statistics remain the backbone of modern analysis tasks, helping you to interpret the results produced by data science pipelines. This book is a detailed guide covering the math and various statistical methods required for undertaking data science tasks.
This book covers the following exciting features:
- Find out how to grab and load data into an analysis environment
- Perform descriptive analysis to extract meaningful summaries from data
- Discover probability, parameter estimation, hypothesis tests, and experiment design best practices
- Get to grips with resampling and bootstrapping in Python
- Delve into statistical tests with variance analysis, time series analysis, and A/B test examples
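As a taste of the resampling and bootstrapping topic listed above, here is a minimal sketch of a percentile bootstrap confidence interval for a mean, using only the standard library. It is not taken from the book: the function name `bootstrap_mean_ci`, its parameters, and the sample data are all assumptions made for illustration.

```python
import random
import statistics


def bootstrap_mean_ci(sample, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean (illustrative sketch)."""
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    n = len(sample)
    # Resample with replacement, record each resample's mean, then sort the means.
    means = sorted(
        statistics.fmean(rng.choices(sample, k=n)) for _ in range(n_resamples)
    )
    # Read the interval bounds off the sorted bootstrap means.
    lower = means[int((alpha / 2) * n_resamples)]
    upper = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


# Made-up measurements, purely for demonstration.
data = [2.1, 3.4, 1.9, 5.6, 4.2, 3.8, 2.7, 4.9, 3.1, 3.6]
low, high = bootstrap_mean_ci(data)
print(f"sample mean = {statistics.fmean(data):.2f}")
print(f"95% percentile bootstrap CI for the mean: ({low:.2f}, {high:.2f})")
```

The percentile method is only one of several bootstrap interval constructions; it is used here because it is the shortest to write down.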
If you feel this book is for you, get your copy today!
All of the code is organized into folders. For example, Chapter02.
The code will look like the following:
import pandas as pd
df = pd.read_excel("PopulationEstimates.xls", skiprows=2)
df.head(8)
Following is what you need for this book: This book is an entry-level guide for data science enthusiasts, data analysts, and anyone starting out in the field of data science and looking to learn the essential statistical concepts with the help of simple explanations and examples. If you’re a developer or student with a non-mathematical background, you’ll find this book useful. Working knowledge of the Python programming language is required.
With the following software and hardware, you can run all of the code files in the book (Chapters 1-13).
Chapter | Software required | OS required |
---|---|---|
1 | Google Colab or Jupyter Notebook | Windows, Mac OS X, and Linux (Any) |
We also provide a PDF file that has color images of the screenshots/diagrams used in this book. Click here to download it.
Rongpeng Li is a data science instructor and a senior data scientist at Galvanize, Inc. He has previously been a research programmer at Information Sciences Institute, working on knowledge graphs and artificial intelligence. He has also been the host and organizer of the Data Analysis Workshop Designed for Non-STEM Busy Professionals at LA.
Click here if you have any feedback or suggestions.
If you have already purchased a print or Kindle version of this book, you can get a DRM-free PDF version at no cost.
Simply click on the link to claim your free PDF.