This repository has been archived by the owner on Aug 13, 2019. It is now read-only.

Buildhub metadata field #439

Open
peterbe opened this issue Apr 30, 2018 · 2 comments

@peterbe
Contributor

peterbe commented Apr 30, 2018

One thing we discovered in https://bugzilla.mozilla.org/show_bug.cgi?id=1456244 is the obvious problem that, since Buildhub isn't a web server, it's really hard to answer the question "What version of Buildhub do we use?", to which the counter-question is "Do you mean the scraper or the Lambda?"

The information is useful when reconciling the state of deployments. At the moment, you either trust that...

  1. Making a release automatically upgraded the scraper.
  2. Ops has upgraded the Stage/Prod Lambda/cron (check the status of the latest "Upgrade This That for Buildhub v1.2.3" bug).

Another option is to instead inject an extra piece of information with each HTTP POST/PUT to Kinto that we don't pass along to kinto-elasticsearch.
For example:

Instead of:

kinto_client.create_record(record)

We do:

import json
import time

with open('./version.json') as f:
  record['_metadata'] = {
    'environment': 'cron',  # or 'lambda' or 'sqs'
    'timestamp': time.time(),
    'version': json.load(f),  # json.load() takes the file object itself
  }
kinto_client.create_record(record)

Then you can look up what environment/tool/version wrote the latest record from https://buildhub.prod.mozaws.net/v1/buckets/build-hub/collections/releases/records?_limit=3
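A minimal sketch of that lookup, assuming every record carries the proposed `_metadata` field plus Kinto's usual `last_modified` field. `latest_writer` is a hypothetical helper, and the sample records are made up for illustration; in practice the list would come from the `data` key of the records response.

```python
def latest_writer(records):
    """Return the _metadata of the most recently modified record that has one.

    `records` is the list found under "data" in a records response.
    Returns None when no record carries the (proposed) _metadata field.
    """
    tagged = [r for r in records if "_metadata" in r]
    if not tagged:
        return None
    newest = max(tagged, key=lambda r: r["last_modified"])
    return newest["_metadata"]


# Made-up sample shaped like a records response body.
sample = {"data": [
    {"id": "a", "last_modified": 1525080000000,
     "_metadata": {"environment": "cron", "version": {"version": "1.2.2"}}},
    {"id": "b", "last_modified": 1525090000000,
     "_metadata": {"environment": "lambda", "version": {"version": "1.2.3"}}},
    {"id": "c", "last_modified": 1525070000000},  # written before the field existed
]}

writer = latest_writer(sample["data"])
print(writer["environment"], writer["version"]["version"])
```

Records written before the field was introduced are simply skipped, so the helper keeps working during a gradual rollout.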

  • It's more data to store. More bytes.
  • Perhaps it should be called _buildhub_metadata so it's never confused with any of the data coming in from that buildhub.json once it's ready.
  • How do you even get the version.json stuff in the Lambda code? One way would be to generate the file into the buildhub python package as package data. E.g. from buildhub import version; print(version.get_data())
  • It still doesn't reduce down to 1 URL you can read (e.g. Whatsdeployed.io) but it's important to appreciate that Buildhub is two distinct things.
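The "package data" idea from the list above could look roughly like this. It is only a sketch: `write_version_file` stands in for a hypothetical release-time step, `get_data` for the `buildhub.version.get_data()` helper floated above, and the directory argument stands in for the installed package directory (in a real package it would more likely be derived from the module's own location).

```python
import json
import tempfile
from pathlib import Path

VERSION_FILENAME = "version.json"  # assumed file name, as in the snippets above


def write_version_file(package_dir, data):
    """Release-time step: drop version.json next to the package's modules."""
    path = Path(package_dir) / VERSION_FILENAME
    path.write_text(json.dumps(data))
    return path


def get_data(package_dir):
    """Run-time step: read the file back; {} if it was never generated."""
    path = Path(package_dir) / VERSION_FILENAME
    try:
        return json.loads(path.read_text())
    except FileNotFoundError:
        return {}


# Demo against a throwaway directory standing in for the package directory.
with tempfile.TemporaryDirectory() as tmp:
    write_version_file(tmp, {"version": "1.2.3", "commit": "abc123"})
    loaded = get_data(tmp)
    missing = get_data(Path(tmp) / "nowhere")

print(loaded, missing)
```

Falling back to an empty dict means a local checkout without a generated file can still write records; whether that is preferable to failing loudly is a judgment call.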

Another option is to write this information to another collection in the same Kinto instance. E.g., instead of

kinto_client.create_record(record)

we do:

import json
import time

kinto_client.create_record(record)
with open('./version.json') as f:
  metadata = {
    'environment': 'cron',  # or 'lambda' or 'sqs'
    'timestamp': time.time(),
    'version': json.load(f),  # json.load() takes the file object itself
  }
  kinto_client_metadata.create_record(metadata)

Ideas? Thoughts?

@peterbe
Contributor Author

peterbe commented Apr 30, 2018

@leplatrem Any opinions that spring to mind?

@leplatrem
Collaborator

That would be neat indeed to diagnose data holes or issues on updates...

I like the idea of using an alternate collection. The timestamp should come from the server though (instead of time.time()), in order to be able to use it in _since queries etc.

An alternative would be to track the versions that were used at the collection metadata level, to avoid growing the DB size.

Something like (please forgive the bad naming...):

{
    "data": {
        "schema": {...},
        "timestamp-by-versions": {
            "1.17": 789101,
            "1.16": 023455,
            "1.15": 123456
        }
    }
}

And then you can obtain the records written by version N from /v1/buckets/build-hub/collections/releases/records?min_last_modified={a}&max_last_modified={b}, with a = timestamp-by-versions[N] and b = timestamp-by-versions[N+1].
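As a rough sketch of how a and b could be derived, assuming version strings are dotted integers and that the query goes to the collection's records endpoint (as in the URL earlier in the thread). `version_key`, `window_for` and `records_query` are hypothetical helpers, and the timestamps are illustrative only.

```python
from urllib.parse import urlencode


def version_key(version):
    """Sort key so that "1.9" < "1.10" (numeric, not lexicographic)."""
    return tuple(int(part) for part in version.split("."))


def window_for(version, timestamp_by_versions):
    """Return (a, b): first-seen timestamp of `version` and of its successor.

    b is None for the newest known version, which has no successor yet.
    """
    ordered = sorted(timestamp_by_versions, key=version_key)
    index = ordered.index(version)
    a = timestamp_by_versions[version]
    has_next = index + 1 < len(ordered)
    b = timestamp_by_versions[ordered[index + 1]] if has_next else None
    return a, b


def records_query(version, timestamp_by_versions):
    """Build the records URL path for everything written by `version`."""
    a, b = window_for(version, timestamp_by_versions)
    params = {"min_last_modified": a}
    if b is not None:
        params["max_last_modified"] = b
    return "/v1/buckets/build-hub/collections/releases/records?" + urlencode(params)


# Illustrative timestamps only.
known = {"1.15": 100, "1.16": 200, "1.17": 300}
print(records_query("1.16", known))
print(records_query("1.17", known))
```

Both bounds are inclusive here, so a record written exactly at b would show up in two adjacent windows; an exclusive upper bound may be the better fit if that matters.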

In order to maintain that list, you could compare the current version with the latest known one and do something like:

# client = KintoClient(bucket='build-hub', collection='releases')
# ...

metadata, permissions = client.get_collection()

if VERSION not in metadata.get("timestamp-by-versions", {}):
    timestamp = client.get_records_timestamp()
    # Record when this version was first seen, keeping the earlier entries.
    metadata.setdefault("timestamp-by-versions", {})[VERSION] = timestamp
    client.patch_collection(data=metadata)

You'd lose the ability to track the environment (cron, lambda, sqs) though... Unless you add some complexity into that metadata object...
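One possible shape for that extra complexity, purely as a sketch: nest the mapping one level deeper, keyed by environment first and version second. `record_first_seen` is a hypothetical helper and the timestamps are made up; writing the result back to the collection is left out.

```python
def record_first_seen(metadata, environment, version, timestamp):
    """Note when `version` was first seen writing from `environment`.

    Mutates and returns `metadata`; an existing entry is left untouched so the
    stored timestamp stays the first one observed.
    """
    by_env = metadata.setdefault("timestamp-by-versions", {})
    by_version = by_env.setdefault(environment, {})
    by_version.setdefault(version, timestamp)
    return metadata


metadata = {}
record_first_seen(metadata, "cron", "1.17", 300)
record_first_seen(metadata, "lambda", "1.16", 200)
record_first_seen(metadata, "cron", "1.17", 999)  # no-op: already recorded
print(metadata)
```

The collection metadata still grows only with the number of (environment, version) pairs rather than with the number of records.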
