This repo does content moderation and content tagging.
This list will be continually updated:
- Moderating videos using CLIP and NudeNet models to identify violence and sexual content.
- Content tagging videos using CLIP and NudeNet models to identify their primary and secondary categories.
- Wrapping the code base with the ML deployment framework Ray Serve.
A .env file with the following parameters is required:
DownloadNumCPUPerReplica=0.2
DownloadNumReplicas=1
DownloadMaxCon=100
PreprocessNumCPUPerReplica=1
PreprocessNumReplicas=1
PreprocessMaxCon=100
NudenetNumCPUPerReplica=0.8
NudenetNumReplicas=1
NudenetMaxCon=100
MFDNumCPUPerReplica=0.8
MFDNumReplicas=1
MFDMaxCon=100
ClipNumCPUPerReplica=0.1
ClipNumGPUPerReplica=0.14
ClipNumReplicas=1
ClipMaxCon=100
ComposedNumCPUPerReplica=0.1
ComposedNumReplicas=1
ComposedMaxCon=100
SnowflakeResultsQueue=content_moderation_tagging-results_dev
RawResultsQueue=content_moderation_tagging-raw-results_dev
AiModelBucket=datalake-dev
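As a rough illustration of how these parameters could be consumed, the sketch below groups the per-stage values into keyword arguments that Ray Serve's `@serve.deployment` decorator accepts in Ray 1.x (`num_replicas`, `max_concurrent_queries`, `ray_actor_options`). The helper name `deployment_options` is hypothetical, and treating `*MaxCon` as `max_concurrent_queries` is an assumption; the repo's own code may read the values differently.

```python
import os
from typing import Any, Dict, Mapping, Optional


def deployment_options(stage: str, env: Optional[Mapping[str, str]] = None) -> Dict[str, Any]:
    """Build deployment kwargs for one stage, e.g. "Clip" (hypothetical helper)."""
    env = os.environ if env is None else env
    actor_options = {"num_cpus": float(env[f"{stage}NumCPUPerReplica"])}
    # Only the Clip stage defines a GPU share in the list above, so it is optional.
    gpu_key = f"{stage}NumGPUPerReplica"
    if gpu_key in env:
        actor_options["num_gpus"] = float(env[gpu_key])
    return {
        "num_replicas": int(env[f"{stage}NumReplicas"]),
        # Assumption: <Stage>MaxCon is the per-replica concurrency limit.
        "max_concurrent_queries": int(env[f"{stage}MaxCon"]),
        "ray_actor_options": actor_options,
    }


# Example with the Clip values from the list above.
clip_env = {
    "ClipNumCPUPerReplica": "0.1",
    "ClipNumGPUPerReplica": "0.14",
    "ClipNumReplicas": "1",
    "ClipMaxCon": "100",
}
print(deployment_options("Clip", clip_env))
```

Under these assumptions, the returned dictionary could be unpacked into the decorator, for example `@serve.deployment(**deployment_options("Clip"))`.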
For DS Team internal testing, also add the following environment variables to the .env file:
AWS_ACCESS_KEY_ID=
AWS_SECRET_ACCESS_KEY=
AWS_DEFAULT_REGION=us-east-2
Then uncomment these lines in tasks.py:
# from dotenv import load_dotenv
# load_dotenv("./.env")
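Before deploying, it can help to confirm that the AWS variables are actually visible to the process. The following is a minimal sketch using only the standard library; the function name `missing_aws_vars` is hypothetical and not part of the repo.

```python
import os
from typing import List, Mapping, Optional

REQUIRED_AWS_VARS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "AWS_DEFAULT_REGION")


def missing_aws_vars(env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return the names of required AWS variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_AWS_VARS if not env.get(name)]


# Report which variables, if any, still need to be filled in.
print("Missing AWS variables:", missing_aws_vars())
```

An empty list means all three variables are set; any names returned still need a value in the .env file or the shell environment.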
To prepare the conda environment for testing the script, run:
pip install -r requirements.txt
pip install git+https://github.com/openai/CLIP.git --no-deps
pip install -U "ray[default]==1.11.1"
pip install "ray[serve]"
pip install pytest
DownloadNumCPUPerReplica=0.2
DownloadNumReplicas=1
DownloadMaxCon=100
PreprocessNumCPUPerReplica=1
PreprocessNumReplicas=1
PreprocessMaxCon=100
NudenetNumCPUPerReplica=0.8
NudenetNumReplicas=2
NudenetMaxCon=100
MFDNumCPUPerReplica=0.8
MFDNumReplicas=2
MFDMaxCon=100
ClipNumCPUPerReplica=0.1
ClipNumGPUPerReplica=0.14
ClipNumReplicas=4
ClipMaxCon=100
ComposedNumCPUPerReplica=0.1
ComposedNumReplicas=1
ComposedMaxCon=100
SnowflakeResultsQueue=content_moderation_tagging-results_dev
RawResultsQueue=content_moderation_tagging-raw-results_dev
AiModelBucket=datalake-dev
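Because each replica reserves a fraction of a CPU or GPU, it is worth checking that the totals fit on the target machine. The sketch below sums the requests for the values listed above; `total_resources` is a hypothetical helper, and it assumes every stage follows the `<Stage>NumCPUPerReplica` / `<Stage>NumReplicas` naming pattern.

```python
from typing import Mapping, Tuple

STAGES = ("Download", "Preprocess", "Nudenet", "MFD", "Clip", "Composed")


def total_resources(env: Mapping[str, str]) -> Tuple[float, float]:
    """Return (total CPUs, total GPUs) requested across all stages."""
    cpus = 0.0
    gpus = 0.0
    for stage in STAGES:
        replicas = int(env[f"{stage}NumReplicas"])
        cpus += replicas * float(env[f"{stage}NumCPUPerReplica"])
        # Stages without a GPU entry request no GPU.
        gpus += replicas * float(env.get(f"{stage}NumGPUPerReplica", "0"))
    return cpus, gpus


# Resource-related values copied from the list above.
config = {
    "DownloadNumCPUPerReplica": "0.2", "DownloadNumReplicas": "1",
    "PreprocessNumCPUPerReplica": "1", "PreprocessNumReplicas": "1",
    "NudenetNumCPUPerReplica": "0.8", "NudenetNumReplicas": "2",
    "MFDNumCPUPerReplica": "0.8", "MFDNumReplicas": "2",
    "ClipNumCPUPerReplica": "0.1", "ClipNumGPUPerReplica": "0.14",
    "ClipNumReplicas": "4",
    "ComposedNumCPUPerReplica": "0.1", "ComposedNumReplicas": "1",
}
cpus, gpus = total_resources(config)
print(f"CPUs requested: {cpus:.2f}, GPUs requested: {gpus:.2f}")
```

With these values the requests add up to about 4.9 CPUs and 0.56 of a GPU, so the four CLIP replicas together fit within a single GPU.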
- Ensure the environment variables are set or a .env file is present; see the section above for the variables.
- Ensure GPU support for Docker is enabled; see the section below.
- Once the container can detect the GPU, follow the normal build and run process:
docker-compose build
docker-compose up
To enable the GPU for Docker, make sure the NVIDIA drivers are installed on the system. Refer to the link for details.
The following commands can help install the NVIDIA drivers:
ubuntu-drivers devices
sudo ubuntu-drivers autoinstall
Next, install the nvidia-docker2 tools by following the instructions below. Refer to the link for details.
curl https://get.docker.com | sh && sudo systemctl --now enable docker
distribution=$(. /etc/os-release;echo $ID$VERSION_ID) && curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add - && curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
sudo apt-get update
sudo apt-get install -y nvidia-docker2
sudo systemctl restart docker
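To confirm that the driver is visible before building the image, a small check such as the following can be used. It is a sketch that only looks for the `nvidia-smi` binary and runs `nvidia-smi -L`, which lists the detected GPUs; `gpu_visible` is a hypothetical helper, not part of the repo.

```python
import shutil
import subprocess


def gpu_visible() -> bool:
    """Return True if nvidia-smi is on PATH and reports at least one GPU."""
    if shutil.which("nvidia-smi") is None:
        return False
    try:
        result = subprocess.run(
            ["nvidia-smi", "-L"], capture_output=True, text=True, timeout=30
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    # "nvidia-smi -L" prints one line per GPU, each starting with "GPU <index>:".
    return result.returncode == 0 and "GPU" in result.stdout


print("GPU visible:", gpu_visible())
```

Running the same check inside the container indicates whether Docker is passing the GPU through.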
- Test whether the code works as expected. First, start a Ray head node in a terminal:
ray start --head --port=6300
- Then deploy the Ray services:
python serve_tasks/tasks.py
- Finally, run the tests with pytest and shut down the Ray cluster:
cd ./test
python -m pytest test_ray_deployments.py
ray stop --force
To be updated.