Dataset: https://www.kaggle.com/datasets/pavansubhasht/ibm-hr-analytics-attrition-dataset
- Data Preprocessing and EDA
- Model Training and Evaluation (Logistic Regression, MLP, XGBoost)
- Training Pipeline
- Inference Pipeline
- Data Ingestion and Transformation
- Model Trainer
- Hyperparameter Tuning
- Automatic Data Augmentation
- Docker Image Creation Script
- CI/CD Workflow (GitHub Actions to Amazon ECR to Amazon EC2)
- Reverse Proxy Setup for HTTPS Requests
- SSL & TLS Certificates
- Clone the repository:
git clone https://github.com/yourusername/IBM_Attrition_Predictor.git
- Navigate to the project directory:
cd IBM_Attrition_Predictor
- Create a virtual environment:
python -m venv venv
- Activate the virtual environment:
- On Windows:
venv\Scripts\activate
- On macOS/Linux:
source venv/bin/activate
- Install the required packages:
pip install -r requirements.txt
- Open the Jupyter notebook for EDA:
jupyter notebook src/notebooks/EDA.ipynb
- Run the cells to preprocess the data and perform exploratory data analysis.
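The preprocessing performed in the notebook can be sketched roughly as follows. This is a minimal illustration using column names from the IBM dataset; the exact steps in the notebook may differ:

```python
import pandas as pd

def preprocess(df: pd.DataFrame) -> pd.DataFrame:
    """Basic cleaning for the IBM attrition dataset (illustrative)."""
    df = df.copy()
    # Drop identifier/constant columns that carry no predictive signal.
    df = df.drop(
        columns=["EmployeeNumber", "EmployeeCount", "Over18", "StandardHours"],
        errors="ignore",
    )
    # Encode the binary target as 0/1.
    df["Attrition"] = df["Attrition"].map({"Yes": 1, "No": 0})
    # One-hot encode the remaining categorical features.
    return pd.get_dummies(df, drop_first=True)
```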
- Open the Jupyter notebook for model training:
jupyter notebook src/notebooks/models.ipynb
- Run the cells to train the models and evaluate their performance.
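A condensed version of what the training cells do, shown here for the logistic-regression baseline only (the notebook also trains an MLP and XGBoost; the split ratio and hyperparameters below are illustrative, not the project's actual settings):

```python
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

def train_and_evaluate(X, y):
    """Train a baseline classifier and report F1 on a held-out split."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, stratify=y, random_state=42
    )
    # class_weight="balanced" helps with the dataset's class imbalance
    # (far more "No" than "Yes" attrition labels).
    model = LogisticRegression(max_iter=1000, class_weight="balanced")
    model.fit(X_train, y_train)
    return model, f1_score(y_test, model.predict(X_test))
```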
- Navigate to the backend directory:
cd backend
- Run the backend server:
fastapi dev main.py
This project includes a CI/CD workflow that uses GitHub Actions to build Docker images, push them to Amazon ECR, and deploy them to an Amazon EC2 instance. The workflow also configures a reverse proxy on the server to handle HTTPS requests and manage SSL/TLS certificates.
The configuration settings are stored in the config/config.yaml file. You can modify this file to change the settings for the project.
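Reading the config from code typically looks like this (the key names used in the test are made up; see config/config.yaml for the actual schema):

```python
import yaml  # provided by the PyYAML package

def load_config(path: str = "config/config.yaml") -> dict:
    """Read a YAML configuration file into a plain dict."""
    with open(path) as f:
        # safe_load avoids executing arbitrary YAML tags.
        return yaml.safe_load(f)
```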
Logs are stored in the logs directory. Each log file is named with the timestamp of when it was created.
This project is licensed under the MIT License.