josephlozano/cef_data_modeling_analytics

 
 


Overview of Data Analytics Project for Chamberlin Education Foundation

This repository contains a series of Jupyter notebooks that download, transform, and analyze school-level data relevant to the Chamberlin Education Foundation, which supports schools in California. The data comes from the California Department of Education, and the project aims to provide insight into school performance and demographics.

Structure of the Repository

The project is organized into three main phases, each handled by specific Jupyter notebooks:

Phase 1: Data Preparation

The download_format_files_to_csv.ipynb notebook automates downloading the data and converting it from text files to the more accessible CSV format. The converted files are stored in designated folders for subsequent analysis. The source files come from the California Department of Education.
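As a rough illustration of the conversion step, the sketch below rewrites one delimited text file as CSV using only the Python standard library. It is not taken from the notebook: the function name text_to_csv, the tab delimiter, and the UTF-8 encoding are all assumptions, and the download step is left out because the source URLs are not listed here.

```python
import csv
from pathlib import Path


def text_to_csv(src: Path, dest: Path, delimiter: str = "\t") -> int:
    """Rewrite a delimited text file as CSV; return the number of data rows.

    The tab delimiter and UTF-8 encoding are assumptions, not confirmed
    properties of the source files.
    """
    dest.parent.mkdir(parents=True, exist_ok=True)
    with src.open(newline="", encoding="utf-8") as fin, \
            dest.open("w", newline="", encoding="utf-8") as fout:
        reader = csv.reader(fin, delimiter=delimiter)
        writer = csv.writer(fout)
        header = next(reader, None)
        if header is None:
            return 0  # empty input: nothing to convert
        writer.writerow(header)
        count = 0
        for row in reader:
            writer.writerow(row)  # csv.writer quotes fields containing commas
            count += 1
    return count
```

A call such as text_to_csv(Path("raw/sample.txt"), Path("csv/sample.csv")) would write the CSV next to the other converted files; both paths are hypothetical.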

Phase 2: Data Transformation

Several notebooks are dedicated to transforming historical data files into a unified format, similar to stacking records in a database:

  • create_caaspp_dataset.ipynb
  • create_chronic_absenteeism_dataset.ipynb
  • create_expulsion_dataset.ipynb
  • create_suspension_dataset.ipynb
  • create_cum_enrollment_dataset.ipynb
  • create_enrollment_el_dataset.ipynb

These notebooks produce six comprehensive datasets, which are stored in the final_long_datasets_domain folder.
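The stacking idea can be sketched as follows: read one CSV per year, tag each row with its year, and write all rows to a single long file. This is a minimal sketch under stated assumptions, not the notebooks' actual code. The function name stack_yearly_files and the academic_year column are hypothetical, and columns missing from a given year are simply left blank.

```python
import csv
from pathlib import Path


def stack_yearly_files(files_by_year: dict[str, Path], dest: Path) -> int:
    """Stack per-year CSVs into one long CSV; return the number of rows written.

    'academic_year' is a hypothetical column name added to record which
    file each row came from.
    """
    rows = []
    columns = ["academic_year"]
    for year, path in sorted(files_by_year.items()):
        with path.open(newline="", encoding="utf-8") as f:
            reader = csv.DictReader(f)
            # Collect the union of columns, since layouts may differ by year.
            for name in reader.fieldnames or []:
                if name not in columns:
                    columns.append(name)
            for row in reader:
                rows.append({"academic_year": year, **row})
    dest.parent.mkdir(parents=True, exist_ok=True)
    with dest.open("w", newline="", encoding="utf-8") as f:
        # restval fills columns that a given year's file did not have.
        writer = csv.DictWriter(f, fieldnames=columns, restval="")
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)
```

Taking the union of column names rather than requiring identical headers is a design choice for this sketch; the real files may need explicit column renaming when layouts change between years.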

Phase 3: Data Integration

This phase consists of two key notebooks:

  • create_union_of_datasets.ipynb: Combines the datasets from Phase 2 into a single long-format dataset, metric_values_fact.csv.
  • create_merge_enrollment.ipynb: Merges the enrollment-related datasets into a single wide-format dataset for analysis.
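The difference between the two outputs can be illustrated with a small in-memory sketch. It assumes each Phase 2 dataset is a list of rows with the hypothetical columns school_code, academic_year, metric, and value; the real column names and the notebooks' actual logic may differ.

```python
def union_long(datasets: dict[str, list[dict]]) -> list[dict]:
    """Concatenate long-format datasets, tagging each row with its domain."""
    return [
        {"domain": domain, **row}
        for domain, rows in datasets.items()
        for row in rows
    ]


def merge_wide(datasets: dict[str, list[dict]]) -> list[dict]:
    """Combine datasets into one row per school and year, one column per metric."""
    merged: dict[tuple, dict] = {}
    for rows in datasets.values():
        for row in rows:
            key = (row["school_code"], row["academic_year"])
            wide_row = merged.setdefault(
                key, {"school_code": key[0], "academic_year": key[1]}
            )
            wide_row[row["metric"]] = row["value"]
    return [merged[key] for key in sorted(merged)]


# Made-up sample rows, for illustration only.
sample = {
    "cum_enrollment": [
        {"school_code": "1", "academic_year": "2022-23",
         "metric": "cum_enrollment", "value": "400"},
    ],
    "enrollment_el": [
        {"school_code": "1", "academic_year": "2022-23",
         "metric": "el_enrollment", "value": "90"},
        {"school_code": "2", "academic_year": "2022-23",
         "metric": "el_enrollment", "value": "50"},
    ],
}
fact = union_long(sample)   # three rows, one per metric value
wide = merge_wide(sample)   # two rows, one per school and year
```

The long form suits a fact table queried by metric in a BI tool, while the wide form puts related measures side by side for each school and year.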

Accessing the Data

The data analyzed here primarily pertains to schools served by the Chamberlin Education Foundation. A complete list of these schools can be found in the cef_school_list.csv file, included in this repository.
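Restricting a dataset to the foundation's schools might look like the sketch below. The column name school_code is an assumption, since the layout of cef_school_list.csv is not described here, and both function names are hypothetical.

```python
import csv
from pathlib import Path


def load_school_codes(school_list: Path, code_column: str = "school_code") -> set[str]:
    """Read the set of school codes from the school list CSV.

    'school_code' is an assumed column name; adjust it to the real header.
    """
    with school_list.open(newline="", encoding="utf-8") as f:
        return {row[code_column].strip() for row in csv.DictReader(f)}


def filter_to_schools(rows: list[dict], codes: set[str],
                      code_column: str = "school_code") -> list[dict]:
    """Keep only the rows whose school code appears in the given set."""
    return [row for row in rows if row[code_column].strip() in codes]
```

Codes are compared as strings so that any leading zeros in school identifiers are preserved.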

Purpose of the Project

The goal of this project is to provide data models that can be accessed through a BI tool such as Tableau, giving stakeholders actionable insights into school performance and demographic trends over multiple years and helping them make informed decisions in support of educational initiatives.

How to Use This Repository

Non-technical users interested in exploring the processed data can refer to the final CSV files generated by the notebooks. Technical users can run the notebooks in phase order to reproduce the datasets and follow each step of the data processing.
