Create anonymization profiles for vehicles.
Read from Pulsar the vehicle catalogue that originates from the vehicle registry. For the vehicles with APC devices onboard, compute an anonymization profile based on the seating and standing capacity of the vehicle model. Send the profiles to Pulsar.
This repository has been created as part of the Waltti APC project.
- Install poetry.
- Run `poetry install`.
- Create the environment variables you need. See below for the reference.
- Run `poetry run python src/waltti_apc_vehicle_anonymization_profiler/main.py`.

Environment variable | Required? | Default value | Description |
---|---|---|---|
`HEALTH_CHECK_PORT` | ❌ No | `8080` | Which port to use to respond to health checks. |
`IS_FRESH_START` | ❌ No | `false` | Whether to start calculating all profiles from scratch. If `false`, we read already generated profiles from `PULSAR_PRODUCER_TOPIC` before figuring out which vehicle models found by `PULSAR_CATALOGUE_READERS` need profiles computed. If `true`, we do not look at `PULSAR_PRODUCER_TOPIC` and compute every profile needed by the vehicle models relevant to us found by `PULSAR_CATALOGUE_READERS`. If set to `true` when there are many different kinds of vehicles producing APC data, expect a very long wait. |
`PINO_LOG_LEVEL` | ❌ No | `info` | The level of logging to use. One of "fatal", "error", "warn", "info", "debug", "trace" or "silent". Each level is mapped to a corresponding Python logging level. Even though we do not use pino in a Python project, we use the same environment variable name and levels as the other Waltti-APC services so the deployment configuration looks consistent. |
`PULSAR_BLOCK_IF_QUEUE_FULL` | ❌ No | `true` | Whether the send operations of the producer should block when the outgoing message queue is full. If `false`, send operations will immediately fail when the queue is full. |
`PULSAR_CACHE_READER_NAME` | ✅ Yes | | The name of the reader for reading already computed profiles from `PULSAR_PRODUCER_TOPIC`. |
`PULSAR_CATALOGUE_READERS` | ✅ Yes | | An array of objects to generate Pulsar vehicle catalogue readers from. The list is given in the form of a stringified JSON array of objects in the shape `[{"feedPublisherId": feedPublisherId, "name": pulsarReaderName, "topic": pulsarTopic}, ...]`. An example could be `[{\"feedPublisherId\":\"fi:kuopio\",\"name\":\"vehicle-anonymization-profiler-catalogue-reader-fi-kuopio\",\"topic\":\"persistent://apc/source/vehicle-catalogue-fi-kuopio\"}, ...]`. The topics contain the vehicle registry snapshots. As we are using a Reader, the topic must have some retention configured, e.g. a week. Otherwise the messages might be deleted before reading. The name will be the name of the Pulsar reader. |
`PULSAR_COMPRESSION_TYPE` | ❌ No | `ZSTD` | The compression type to use in the topic where messages are sent. Must be one of `Zlib`, `LZ4`, `ZSTD` or `SNAPPY`. |
`PULSAR_OAUTH2_AUDIENCE` | ✅ Yes | | The OAuth 2.0 audience. |
`PULSAR_OAUTH2_ISSUER_URL` | ✅ Yes | | The OAuth 2.0 issuer URL. |
`PULSAR_OAUTH2_KEY_PATH` | ✅ Yes | | The path to the OAuth 2.0 private key JSON file. |
`PULSAR_PRODUCER_TOPIC` | ✅ Yes | | The topic to send vehicle anonymization profile messages to. |
`PULSAR_SERVICE_URL` | ✅ Yes | | The service URL. |
`PULSAR_TLS_VALIDATE_HOSTNAME` | ✅ Yes | | Whether to validate the hostname on its TLS certificate. This option exists because some Apache Pulsar hosting providers cannot handle Apache Pulsar clients setting this to `true`. |
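
To illustrate the two less obvious settings in the table, here is a minimal sketch of how `PULSAR_CATALOGUE_READERS` and `PINO_LOG_LEVEL` could be interpreted. It is not taken from this repository: the names `CatalogueReaderConfig`, `parse_catalogue_readers`, `PINO_TO_PYTHON_LEVEL` and `python_log_level` are hypothetical, and the exact level mapping (in particular for "trace" and "silent", which have no direct Python equivalent) is an assumption.

```python
import json
import logging
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogueReaderConfig:
    """One entry of PULSAR_CATALOGUE_READERS (hypothetical helper type)."""

    feed_publisher_id: str
    name: str
    topic: str


def parse_catalogue_readers(raw: str) -> list[CatalogueReaderConfig]:
    """Parse the stringified JSON array described in the table above."""
    parsed = json.loads(raw)
    if not isinstance(parsed, list) or not parsed:
        raise ValueError("PULSAR_CATALOGUE_READERS must be a non-empty JSON array")
    readers = []
    for item in parsed:
        if not isinstance(item, dict):
            raise ValueError("Each catalogue reader must be a JSON object")
        try:
            readers.append(
                CatalogueReaderConfig(
                    feed_publisher_id=item["feedPublisherId"],
                    name=item["name"],
                    topic=item["topic"],
                )
            )
        except KeyError as err:
            raise ValueError(f"Catalogue reader is missing the key {err}") from err
    return readers


# Assumed mapping from pino level names to Python logging levels.
# Python has no TRACE level, so "trace" falls back to DEBUG here, and
# "silent" uses a threshold above CRITICAL so that nothing is emitted.
PINO_TO_PYTHON_LEVEL = {
    "fatal": logging.CRITICAL,
    "error": logging.ERROR,
    "warn": logging.WARNING,
    "info": logging.INFO,
    "debug": logging.DEBUG,
    "trace": logging.DEBUG,
    "silent": logging.CRITICAL + 1,
}


def python_log_level(pino_level: str = "info") -> int:
    """Translate a PINO_LOG_LEVEL value into a Python logging level."""
    try:
        return PINO_TO_PYTHON_LEVEL[pino_level]
    except KeyError:
        allowed = ", ".join(PINO_TO_PYTHON_LEVEL)
        raise ValueError(f"PINO_LOG_LEVEL must be one of: {allowed}") from None
```

In a service, the raw strings would typically come from `os.environ`, for example `parse_catalogue_readers(os.environ["PULSAR_CATALOGUE_READERS"])` and `python_log_level(os.environ.get("PINO_LOG_LEVEL", "info"))`. Failing early with a `ValueError` on malformed input keeps configuration mistakes visible at startup rather than at the first message.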