IBM Attrition Predictor

Dataset

https://www.kaggle.com/datasets/pavansubhasht/ibm-hr-analytics-attrition-dataset

Features

  • Data Preprocessing and EDA
  • Model Training and Evaluation (Logistic Regression, MLP, XGBoost)
  • Training Pipeline
  • Inference Pipeline
  • Data Ingestion and Transformation
  • Model Trainer
  • Hyperparameter Tuning
  • Automatic Data Augmentation
  • Docker Image Creation Script
  • CI/CD Workflow (GitHub Actions to Amazon ECR to Amazon EC2)
  • Reverse Proxy Setup for HTTPS Requests
  • SSL/TLS Certificates

Installation

  1. Clone the repository:
    git clone https://github.com/yourusername/IBM_Attrition_Predictor.git
  2. Navigate to the project directory:
    cd IBM_Attrition_Predictor
  3. Create a virtual environment:
    python -m venv venv
  4. Activate the virtual environment:
    • On Windows:
      venv\Scripts\activate
    • On macOS/Linux:
      source venv/bin/activate
  5. Install the required packages:
    pip install -r requirements.txt

Usage

Data Preprocessing and EDA

  1. Open the Jupyter notebook for EDA:
    jupyter notebook src/notebooks/EDA.ipynb
  2. Run the cells to preprocess the data and perform exploratory data analysis.
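
To give a feel for this kind of preprocessing, here is a minimal sketch. It assumes pandas and uses a few made-up rows in place of the Kaggle CSV. The column names (`Attrition`, `OverTime`, `Department`, `MonthlyIncome`) follow the dataset, but the `preprocess` helper and the steps it takes are illustrative assumptions, not the notebook's actual code.

```python
import pandas as pd

# Made-up rows standing in for the Kaggle CSV; only the column names follow the dataset.
df = pd.DataFrame({
    "Age": [41, 49, 37, 33],
    "Attrition": ["Yes", "No", "Yes", "No"],
    "Department": ["Sales", "Research & Development", "Research & Development", "Sales"],
    "OverTime": ["Yes", "No", "Yes", "Yes"],
    "MonthlyIncome": [5993, 5130, 2090, 2909],
})


def preprocess(frame: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: map Yes/No columns to 1/0 and one-hot encode Department."""
    out = frame.copy()
    for col in ("Attrition", "OverTime"):
        out[col] = out[col].map({"Yes": 1, "No": 0})
    # One indicator column per department value.
    return pd.get_dummies(out, columns=["Department"], dtype=int)


processed = preprocess(df)
print(processed.head())
# Share of each target class, a quick check for class imbalance.
print(processed["Attrition"].value_counts(normalize=True))
```

With the real data, the inline frame would be replaced by a `pd.read_csv(...)` call pointing at the downloaded file.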

Model Training

  1. Open the Jupyter notebook for model training:
    jupyter notebook src/notebooks/models.ipynb
  2. Run the cells to train the models and evaluate their performance.
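
As a rough sketch of the train-and-evaluate loop for one of the listed models (Logistic Regression), the example below uses scikit-learn on synthetic data. The synthetic features, the class ratio, the `class_weight="balanced"` setting, and the choice of ROC AUC as the metric are all assumptions for illustration; the notebook may do this differently.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic, imbalanced data standing in for the preprocessed attrition features.
# The class ratio is illustrative only.
X, y = make_classification(
    n_samples=1000,
    n_features=10,
    n_informative=5,
    weights=[0.84, 0.16],
    random_state=42,
)

# Stratify so the minority class is represented in both splits.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Scale features, then fit a class-weighted logistic regression.
model = make_pipeline(
    StandardScaler(),
    LogisticRegression(class_weight="balanced", max_iter=1000),
)
model.fit(X_train, y_train)

# Score the held-out split using the predicted probability of the positive class.
auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
print(f"ROC AUC on held-out split: {auc:.3f}")
```

ROC AUC is used here because plain accuracy can look deceptively high when one class dominates.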

Model Deployment

  1. Navigate to the backend directory:
    cd backend
  2. Start the backend server in development mode:
    fastapi dev main.py

CI/CD Workflow

This project includes a CI/CD workflow built on GitHub Actions. The workflow builds a Docker image, pushes it to Amazon ECR, and deploys it to an Amazon EC2 instance. It also sets up a reverse proxy on the server to handle HTTPS requests and manage SSL/TLS certificates.

Configuration

Configuration settings are stored in config/config.yaml. Edit this file to change the project's settings.
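
Reading such a file from Python could look like the sketch below, which assumes PyYAML is installed. The `load_config` helper and the keys in the example document (`model`, `name`, `test_size`) are hypothetical, since the actual schema of config/config.yaml is not described here.

```python
from pathlib import Path

import yaml  # provided by the PyYAML package


def load_config(path: str = "config/config.yaml") -> dict:
    """Hypothetical helper: read a YAML file and return its contents as a dict."""
    with Path(path).open("r", encoding="utf-8") as fh:
        # safe_load avoids constructing arbitrary Python objects from the file.
        return yaml.safe_load(fh) or {}


# Hypothetical file contents, parsed from a string so the example is self-contained.
example = """
model:
  name: logistic_regression
  test_size: 0.2
"""
config = yaml.safe_load(example)
print(config["model"]["name"], config["model"]["test_size"])
```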

Logging

Logs are written to the logs directory. Each log file is named after the timestamp at which it was created.
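
One way to get timestamp-named log files with the standard library is sketched below. The `setup_logging` helper, the filename pattern, and the message format are assumptions for illustration; the project's own logger may use different conventions.

```python
import logging
from datetime import datetime
from pathlib import Path


def setup_logging(log_dir: str = "logs") -> Path:
    """Hypothetical helper: create a timestamp-named log file and route root logging to it."""
    directory = Path(log_dir)
    directory.mkdir(parents=True, exist_ok=True)
    # Example filename: 2024_01_31_14_05_09.log
    log_path = directory / f"{datetime.now():%Y_%m_%d_%H_%M_%S}.log"
    logging.basicConfig(
        filename=str(log_path),
        level=logging.INFO,
        format="[%(asctime)s] %(levelname)s %(name)s: %(message)s",
        force=True,  # replace any handlers configured earlier (Python 3.8+)
    )
    return log_path
```

Calling `setup_logging()` once at program start and then using `logging.getLogger(__name__)` in each module sends every record from that run to the same file.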

License

This project is licensed under the MIT License.