In this project, I use AWS SageMaker to fine-tune a pretrained model that performs image classification, using SageMaker profiling, the debugger, hyperparameter tuning, and other good ML engineering practices. This was done on a dog breed dataset.
The provided dataset is the dog breed classification dataset, which can be found in the classroom. The project is designed to be dataset independent, so if there is a dataset that is more interesting or relevant to your work, you are welcome to use it to complete the project.
- The pre-trained model I chose for this project is Inception v3, an image recognition model that has been shown to attain greater than 78.1% accuracy on the ImageNet dataset. The model is the culmination of many ideas developed by multiple researchers over the years and is based on the original paper "Rethinking the Inception Architecture for Computer Vision" by Szegedy et al.
- The model itself is made up of symmetric and asymmetric building blocks, including convolutions, average pooling, max pooling, concatenations, dropout, and fully connected layers. Batch normalization is used extensively throughout the model and applied to activation inputs. Loss is computed using softmax. A sketch of how the pretrained model can be adapted is shown after this list.
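As a rough illustration of how the pretrained network is adapted for this task, the sketch below loads torchvision's Inception v3, freezes the backbone weights, and replaces the classifier head. The `num_classes=133` default and the choice to freeze all backbone parameters are assumptions for illustration, not taken verbatim from the training script.

```python
import torch.nn as nn
from torchvision import models


def net(num_classes=133):
    # Load Inception v3 pretrained on ImageNet and freeze its weights.
    model = models.inception_v3(pretrained=True, aux_logits=True)
    for param in model.parameters():
        param.requires_grad = False

    # Replace the final fully connected layer with a new head for the dog breeds.
    model.fc = nn.Linear(model.fc.in_features, num_classes)
    # The auxiliary classifier also needs a matching head if it is used during training.
    model.AuxLogits.fc = nn.Linear(model.AuxLogits.fc.in_features, num_classes)
    return model
```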
A high-level diagram of the model is shown in the following screenshot:
The following hyperparameters were selected for tuning (a tuner sketch follows the list):

- Learning rate - the learning rate defines how fast the model trains. A large learning rate lets the model learn faster; with a small learning rate the model takes longer to learn but can be more accurate. The range is from 0.001 to 0.1.
- Batch size - the batch size is the number of examples from the training dataset used in the estimate of the error gradient, and it controls the accuracy of that estimate when training neural networks. The batch size is chosen between two values: 64 and 128.
- Epochs - the number of epochs is the number of times that the learning algorithm works through the entire training dataset. The number of epochs is chosen between two values: 2 and 5.
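A minimal sketch of how these ranges might be passed to a SageMaker `HyperparameterTuner` is shown below. The `hpo.py` entry point, instance type, metric regex, S3 path, and job counts are assumptions for illustration only.

```python
import sagemaker
from sagemaker.pytorch import PyTorch
from sagemaker.tuner import (
    HyperparameterTuner,
    ContinuousParameter,
    CategoricalParameter,
)

role = sagemaker.get_execution_role()

# Training script name, instance type, and framework version are assumptions.
estimator = PyTorch(
    entry_point="hpo.py",
    role=role,
    instance_count=1,
    instance_type="ml.m5.xlarge",
    framework_version="1.8",
    py_version="py36",
)

# Ranges matching the list above: lr in [0.001, 0.1], batch size in {64, 128}, epochs in {2, 5}.
hyperparameter_ranges = {
    "lr": ContinuousParameter(0.001, 0.1),
    "batch-size": CategoricalParameter([64, 128]),
    "epochs": CategoricalParameter([2, 5]),
}

tuner = HyperparameterTuner(
    estimator,
    objective_metric_name="average test loss",
    objective_type="Minimize",
    metric_definitions=[
        {"Name": "average test loss", "Regex": "Test set: Average loss: ([0-9\\.]+)"}
    ],
    hyperparameter_ranges=hyperparameter_ranges,
    max_jobs=4,
    max_parallel_jobs=2,
)

# "s3://<bucket>/dogImages" is a placeholder for the training data location.
tuner.fit({"training": "s3://<bucket>/dogImages"})
```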
The best hyperparameters selected were: {'batch-size': '128', 'lr': '0.003484069065132129', 'epochs': '2'}
Two plots show how the loss depends on the step: the first one shows `train_loss`/steps and the second one shows `test_loss`/steps.
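These loss curves can be pulled from the Debugger artifacts roughly as follows; this is a sketch, and the tensor name `CrossEntropyLoss_output_0` is an assumption that should be checked against `trial.tensor_names()`.

```python
import matplotlib.pyplot as plt
from smdebug import modes
from smdebug.trials import create_trial

# Load the tensors saved by the Debugger hook for the finished training job.
trial = create_trial(estimator.latest_job_debugger_artifacts_path())

# Assumed tensor name; list what was actually saved with trial.tensor_names(collection="losses").
loss_name = "CrossEntropyLoss_output_0"


def plot_loss(mode, label):
    tensor = trial.tensor(loss_name)
    steps = tensor.steps(mode=mode)
    values = [tensor.value(step, mode=mode) for step in steps]
    plt.plot(steps, values, label=label)


plot_loss(modes.TRAIN, "train_loss")
plot_loss(modes.EVAL, "test_loss")
plt.xlabel("step")
plt.ylabel("loss")
plt.legend()
plt.show()
```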
As we can see, there is some anomalous behaviour in the debugging output:
- In the `train_loss`/steps plot, the loss decreases as the steps increase, and the graph is smooth.
- In the `test_loss`/steps plot, the loss does not clearly decrease as the steps increase, and the graph is not smooth.
I also noticed that:
- No rules were triggered during the process
- The average step duration was 13.1s
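For context, the sketch below shows how Debugger rules, the profiler, and the hook configuration might be attached to the estimator. The specific rules and intervals here are illustrative assumptions, not the exact configuration used in this project.

```python
from sagemaker.debugger import (
    Rule,
    ProfilerRule,
    rule_configs,
    DebuggerHookConfig,
    ProfilerConfig,
    FrameworkProfile,
)

# Assumed rule selection; none of these were triggered in the run described above.
rules = [
    Rule.sagemaker(rule_configs.vanishing_gradient()),
    Rule.sagemaker(rule_configs.overfit()),
    Rule.sagemaker(rule_configs.overtraining()),
    Rule.sagemaker(rule_configs.poor_weight_initialization()),
    ProfilerRule.sagemaker(rule_configs.ProfilerReport()),
]

# Profiler settings (intervals are assumptions).
profiler_config = ProfilerConfig(
    system_monitor_interval_millis=500,
    framework_profile_params=FrameworkProfile(num_steps=10),
)

# How often the hook saves tensors during training and evaluation (assumed values).
hook_config = DebuggerHookConfig(
    hook_parameters={"train.save_interval": "100", "eval.save_interval": "10"}
)

# These objects are passed to the PyTorch estimator via the
# rules=, profiler_config=, and debugger_hook_config= arguments.
```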
Here are some ways that may help fix the anomalous behaviour:
- Adding more hyperparameters to tune.
- Increasing the hyperparameter ranges for HPO tuning.
- Increasing `max_jobs` for HPO tuning.
- Adding more fully connected layers to the pretrained model.
The model is deployed using the `inference.py` script.
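A minimal deployment sketch, assuming the trained estimator's model artifact and the execution role defined earlier, might look like this (the framework version and instance type are assumptions):

```python
from sagemaker.pytorch import PyTorchModel

# Wrap the trained model artifact with the inference.py entry point.
pytorch_model = PyTorchModel(
    model_data=estimator.model_data,  # S3 path of the trained model artifact
    role=role,                        # SageMaker execution role
    entry_point="inference.py",
    framework_version="1.8",
    py_version="py36",
)

# Create a real-time endpoint and get a predictor handle for it.
predictor = pytorch_model.deploy(
    initial_instance_count=1,
    instance_type="ml.m5.large",
)
```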
- The dog images I use must be downloaded from here.
- The test images I use are stored in the `dogImages/test/` folder.
- Steps to predict on the endpoint:
    - Store the image path in `image_path`.
    - Prepare the image and store it as `payload`:

      ```python
      # `object` is assumed to be a boto3 S3 Object pointing at image_path
      response = object.get()
      payload = response['Body'].read()  # raw JPEG bytes to send to the endpoint
      ```

    - Run the prediction:

      ```python
      import json
      import numpy as np

      response = predictor.predict(payload, initial_args={"ContentType": "image/jpeg"})
      response = json.loads(response.decode())             # model output as a list of logits
      predicted_dog_breed_idx = np.argmax(response, 1)[0]  # index of the most likely breed
      ```
In `train_and_deploy.ipynb` I run 4 test predictions, and the predictions are pretty accurate.