This project collects new job listings from this website and sends notifications about them via Slack.
This repository is split into two parts:
- The scraper code: lives inside the job_scraper folder.
- The CI/CD pipeline: responsible for deploying the scraper and pushing changes to the code, which lives inside cicd_infrastructure.
You need the following to deploy the project:
- Terraform
- AWS account
- AWS CLI
- AWS Profile
  This creates a profile with keys to access the AWS account
- Slack Workspace
- Slack App
1 - Slack
- Create a Slack channel to receive job notifications
- Create a Slack webhook from your Slack App
Save the generated webhook URL somewhere, as you will need it later.
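As a rough illustration of what the webhook is for, the sketch below posts a plain text message to a Slack incoming webhook using only the Python standard library. The function names and the example message are hypothetical and are not taken from the scraper code; the webhook URL is the placeholder you saved above.

```python
import json
import urllib.request


def build_slack_request(webhook_url: str, text: str) -> urllib.request.Request:
    """Build a POST request carrying a simple text message for a Slack incoming webhook."""
    payload = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        webhook_url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_slack_message(webhook_url: str, text: str) -> int:
    """Send the message and return the HTTP status code."""
    with urllib.request.urlopen(build_slack_request(webhook_url, text)) as response:
        return response.status


# Hypothetical usage (requires a real webhook URL and network access):
# send_slack_message("https://hooks.slack.com/services/....", "New job listing found")
```

Building the request separately from sending it makes it possible to check the payload without touching the network.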
2 - AWS
- Create a bucket to store deployment artifacts
- Store the Slack webhook in Secrets Manager
I created an "Other type of secret" and used the same value for the secret name and the secret key.
E.g:
Secret Name: mpenz-ws-slack-webhook
Secret Key: mpenz-ws-slack-webhook
Secret Value: https://hooks.slack.com/services/....
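Because the secret name and the secret key are identical, reading the webhook back could look roughly like the sketch below. It assumes the secret is stored as a key/value pair, which Secrets Manager returns as a JSON object in `SecretString`. The function name is hypothetical, and `client` is assumed to behave like `boto3.client("secretsmanager")`; this is not necessarily how the scraper itself reads the secret.

```python
import json


def get_slack_webhook_url(client, secret_name: str) -> str:
    """Read the webhook URL from a key/value secret whose key equals the secret name.

    `client` is expected to behave like boto3.client("secretsmanager").
    """
    response = client.get_secret_value(SecretId=secret_name)
    # Key/value secrets come back as a JSON object serialized in SecretString.
    secret = json.loads(response["SecretString"])
    return secret[secret_name]


# Hypothetical usage with boto3 (not part of the standard library):
# import boto3
# client = boto3.Session(profile_name="my-profile").client("secretsmanager")
# url = get_slack_webhook_url(client, "mpenz-ws-slack-webhook")
```

Passing the client in as an argument keeps the function easy to exercise with a stub in tests.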
3 - GitHub
- Clone the repository
- Create a personal access token
This is needed to allow your AWS account to connect to your cloned GitHub repo. Follow steps 1 to 6. Make sure you save it somewhere safe but quick to access.
4 - Configure cloned repository
| File | Variable | Description |
|---|---|---|
| job_filters.json | classification | Classification from the website. 6281 = Technology |
| | keyword | What to filter for. E.g.: data, cloud, test |
| | slack_webhook_secret | Name of the secret created as part of (2 - AWS) |
| cicd_infrastructure\terraform-backend.tfvars | region | Target AWS region where the infrastructure is deployed |
| | profile | AWS profile name configured on your computer |
| | bucket | Deployment artifacts bucket. Created as part of (2 - AWS) |
| cicd_infrastructure\terraform-deployment.tfvars | aws_region | Target AWS region where the infrastructure is deployed |
| | aws_profile | AWS profile name configured on your computer |
| | artifacts_bucket | Deployment artifacts bucket. Created as part of (2 - AWS) |
| | github_repository_owner | Your GitHub account (as you cloned the repository) |
| | github_repository_name | job-scraper |
| serverless.yml | deploymentBucket | Deployment artifacts bucket. Created as part of (2 - AWS) |
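A quick check like the one below can catch a missing variable in job_filters.json before deploying. It is only a sketch: it assumes the file is a flat JSON object with the three variables from the table as top-level keys, which may not match the actual file layout. The function name and the example values are illustrative.

```python
import json

# Variables listed for job_filters.json in the table above.
REQUIRED_KEYS = ("classification", "keyword", "slack_webhook_secret")


def load_job_filters(text: str) -> dict:
    """Parse job_filters.json content and fail early if a documented variable is missing."""
    filters = json.loads(text)
    missing = [key for key in REQUIRED_KEYS if key not in filters]
    if missing:
        raise ValueError(f"job_filters.json is missing: {', '.join(missing)}")
    return filters


# Example content; the values are illustrative only.
example = (
    '{"classification": "6281", "keyword": "data", '
    '"slack_webhook_secret": "mpenz-ws-slack-webhook"}'
)
print(load_job_filters(example)["keyword"])  # prints: data
```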
Required:
Init
terraform init -backend-config="terraform-backend.tfvars"
Apply
terraform apply -var-file="terraform-deployment.tfvars"
Optional:
Plan
terraform plan -var-file="terraform-deployment.tfvars"
Destroy
terraform destroy -var-file="terraform-deployment.tfvars"
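If you prefer to script the required steps, the sketch below runs them in order and stops at the first failure. It assumes the commands are run from the cicd_infrastructure folder, where the two .tfvars files live; the function names are hypothetical and this wrapper is not part of the repository.

```python
import subprocess

BACKEND_CONFIG = "terraform-backend.tfvars"
VAR_FILE = "terraform-deployment.tfvars"


def terraform_commands(include_plan: bool = False) -> list:
    """Return the Terraform commands in the order they should run."""
    commands = [["terraform", "init", f"-backend-config={BACKEND_CONFIG}"]]
    if include_plan:
        # Optional: preview the changes before applying them.
        commands.append(["terraform", "plan", f"-var-file={VAR_FILE}"])
    commands.append(["terraform", "apply", f"-var-file={VAR_FILE}"])
    return commands


def deploy(workdir: str = "cicd_infrastructure", include_plan: bool = False) -> None:
    """Run each command inside `workdir`, raising on the first non-zero exit code."""
    for command in terraform_commands(include_plan):
        subprocess.run(command, cwd=workdir, check=True)
```

Calling `deploy()` still prompts for confirmation on apply, exactly as the manual command does.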
Once Terraform finishes applying the infrastructure, you can find the CI/CD pipeline in your AWS account: log in and open the CodePipeline service.
If you want to run tests and linting locally, use poetry.
- Install poetry on your machine. For example, using brew on Mac:
  brew install poetry
- Install the project dependencies:
  poetry install
- Enter shell mode:
  poetry shell
- Run linting:
  flake8 --statistics
- Run tests:
  pytest -v --disable-pytest-warnings --cov=job_scraper