
Ingest Level 2A WV02 MSI_L2A into cumulus #389

Open
5 of 18 tasks
jsrikish opened this issue Oct 3, 2024 · 4 comments
jsrikish commented Oct 3, 2024

Ingest granules in collection WV02_MSI_L2A to CBA Prod by discovering/ingesting from MCP account.

  • Checkout and pull main: git checkout main && git pull
  • Create new branch: git checkout -b issue389/ingest-wv02-msi_l2a
  • Create new rule app/stacks/cumulus/resources/rules/WV02_MSI_L2A/v1/WV02_MSI_L2A___1.json:
    • name: "WV02_MSI_L2A___1"
    • provider: "maxar"
    • meta.providerPathFormat: "'css/nga/WV02/2A/'yyyy/DDD"
    • meta.startDate: "2009-11-15T00:00:00Z"
    • meta.endDate: "2022-01-01T00:00:00Z"
  • Enter Docker with your environment (ex: DOTENV=.env.cba-prod make bash)
  • Add the collection: cumulus collections add --data app/stacks/cumulus/resources/collections/WV02_MSI_L2A___1.json
  • Add the rule: cumulus rules add --data app/stacks/cumulus/resources/rules/WV02_MSI_L2A/v1/WV02_MSI_L2A___1.json
  • Enable the rule: cumulus rules enable --name WV02_MSI_L2A___1
  • Run the rule: cumulus rules run --name WV02_MSI_L2A___1
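The rule file referenced above might look like the following sketch. Only the name, provider, and meta fields are taken from this issue; the state, collection, workflow, and rule.type fields are illustrative assumptions about how a onetime discovery rule is typically shaped, and should be checked against an existing rule in the repo.

```json
{
  "name": "WV02_MSI_L2A___1",
  "state": "DISABLED",
  "provider": "maxar",
  "collection": { "name": "WV02_MSI_L2A", "version": "1" },
  "workflow": "DiscoverAndQueueGranules",
  "rule": { "type": "onetime" },
  "meta": {
    "providerPathFormat": "'css/nga/WV02/2A/'yyyy/DDD",
    "startDate": "2009-11-15T00:00:00Z",
    "endDate": "2022-01-01T00:00:00Z"
  }
}
```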

Acceptance criteria

  • The MapRun of the DiscoverAndQueueGranules execution triggered by running the rule should show xxx iterations (3 of these years are leap years; November 2009 has only xxx, and coverage then begins in December)
  • After some successful executions of IngestAndPublishGranules, thumbnails are visible in the Earthdata Search results (sort results with oldest first, as those will be the first ingested, and confirm that the URL for the thumbnail shows the hostname as data.csdap.earthdata.nasa.gov [note: csdap, not csda])
  • It is possible to download files in the file list for a granule shown in Earthdata Search (again, hostname should include csdap, not csda) -- Cognito auth should be triggered
  • After a few minutes (not more than 15 minutes?), granules and granule files can be found in Kibana Prod or this link for the correct time of the rule execution
  • All granules in WV02_MSI_L2A have been ingested into CBA Prod, with the exception of perhaps a small percentage of errors.
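The expected iteration count can be estimated from the rule's meta dates, assuming one MapRun iteration per daily provider path (yyyy/DDD). This is an upper bound, since per the note above 2009 coverage may effectively begin in December.

```python
# Sketch: days between meta.startDate and meta.endDate from the rule,
# as an upper bound on daily DiscoverAndQueueGranules iterations.
from datetime import date

start = date(2009, 11, 15)  # meta.startDate
end = date(2022, 1, 1)      # meta.endDate
print((end - start).days)
```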

To determine how many granules have been processed, first enter the Docker container:

DOTENV=.env.cba-prod make bash

In the container, run the following:

DEBUG=1 cumulus granules list -? collectionId=WV02_MSI_L2A___1 --limit=0 -? status=completed

(Note: due to a Cumulus bug, the status is sometimes not updated properly. Run the following queries and check that the counts reconcile.)

DEBUG=1 cumulus granules list -? collectionId=WV02_MSI_L2A___1 --limit=0
DEBUG=1 cumulus granules list -? collectionId=WV02_MSI_L2A___1 --limit=0 -? status=queued
DEBUG=1 cumulus granules list -? collectionId=WV02_MSI_L2A___1 --limit=0 -? status=running
DEBUG=1 cumulus granules list -? collectionId=WV02_MSI_L2A___1 --limit=0 -? status=completed
DEBUG=1 cumulus granules list -? collectionId=WV02_MSI_L2A___1 --limit=0 -? status=failed

You should see output similar to the following:

...
RESPONSE: {
  statusCode: 200,
  body: '{"meta":{"name":"cumulus-api","stack":"cumulus-prod","table":"granule","limit":0,"page":1,"count":8592},"results":[]}',
  headers: {
    'x-powered-by': 'Express',
    'access-control-allow-origin': '*',
    'strict-transport-security': 'max-age=31536000; includeSubDomains',
    'content-type': 'application/json; charset=utf-8',
    'content-length': '114',
    etag: 'W/"72-O2wUXhu+Q9J1hqdDrb0fcsZeFHo"',
    date: 'Fri, 01 Dec 2023 21:29:19 GMT',
    connection: 'close'
  },
  isBase64Encoded: false
}
[]

In particular, look at the value for body and within it, locate the value of "count". In the output above, the count should match the Earthdata Search granule count obtained in the very first step.
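The count can also be pulled out of the response programmatically rather than read by eye. A minimal sketch, using the body string from the sample output above:

```python
# Extract "count" from the RESPONSE body shown in the sample output.
import json

response_body = (
    '{"meta":{"name":"cumulus-api","stack":"cumulus-prod",'
    '"table":"granule","limit":0,"page":1,"count":8592},"results":[]}'
)
count = json.loads(response_body)["meta"]["count"]
print(count)  # 8592
```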

@jsrikish jsrikish self-assigned this Oct 3, 2024
@jsrikish jsrikish changed the title Ingest Level 2A WV02 MSI into cumulus Ingest Level 2A WV02 MSI_L2A into cumulus Oct 8, 2024
hbparache commented:
Here for debugging: #415

jsrikish commented Nov 8, 2024

2010 was ingested on the 6th but does not show up in Earthdata Search; Helen contacted ESDIS, and this is the result of a bug that is under investigation.
2011-2015 were restored from GLACIER; missing checksums were calculated and inserted into the DB.
JSON files were generated; number of JSON files written, from the summary of the MAXAR_CONVERSION DAG = 500*19+124 = 9624

9624 granules will be ingested for 2011-2015 when the Earthdata bug is fixed by ESDIS
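The JSON-file total from the DAG summary above can be checked directly: 19 full batches of 500 plus a final batch of 124.

```python
# Arithmetic check of the MAXAR_CONVERSION DAG file total.
total = 500 * 19 + 124
print(total)  # 9624
```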

jsrikish commented:

2009-2015 # of ingested granules (WV02 Level 2A):
2009 - 39
2010 - 2302
2011 - 1220
2012 - 1254
2013 - 1579
2014 - 3822
2015 - 1604
Restoration is in progress for 2016-2022

jsrikish commented:

2016-2021 # of granules ingested:
2016 - 1,108
2017 - 1,022
2018 - 414
2019 - 152
2020 - 70
2021 - 1
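Summing the per-year counts reported in the two comments above gives running totals for the ingest:

```python
# Totals of the per-year ingested-granule counts reported in this issue.
counts_2009_2015 = [39, 2302, 1220, 1254, 1579, 3822, 1604]
counts_2016_2021 = [1108, 1022, 414, 152, 70, 1]
print(sum(counts_2009_2015))  # 11820
print(sum(counts_2016_2021))  # 2767
```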

There were errors in the IngestAndPublishGranules state machine at the PostToCMR step:

Error: Post to cmr
{
  "cause": {
    "errorType": "UnexpectedFileSize",
    "errorMessage": "verifyFile WV02_20180201104041_103001007742D500_18FEB01104041-M2AS-501922291100_01_P001.tar failed: Actual file size 419840 did not match expected file size 337920",
    "trace": [
      "UnexpectedFileSize: verifyFile WV02_20180201104041_103001007742D500_18FEB01104041-M2AS-501922291100_01_P001.tar failed: Actual file size 419840 did not match expected file size 337920",
      " at GranuleFetcher.verifyFile (/var/task/web
      ......
}

These failures need to be investigated; the size in the DB records and the size in S3 match for the granule.
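One way to investigate an UnexpectedFileSize failure is to compare the actual S3 object size against the expected size from the DB record, which is effectively the check verifyFile performs. A hedged sketch; the bucket name and key are placeholders, not taken from the issue, and the boto3 call requires AWS credentials with read access:

```python
# Sketch: compare actual S3 object size to the expected size from the DB.

def sizes_match(actual: int, expected: int) -> bool:
    # Same size comparison that verifyFile effectively performs.
    return actual == expected

def check_s3_size(bucket: str, key: str, expected: int) -> bool:
    import boto3  # assumes AWS credentials are configured
    s3 = boto3.client("s3")
    actual = s3.head_object(Bucket=bucket, Key=key)["ContentLength"]
    return sizes_match(actual, expected)
```

For the failing granule above, the expected size from the error was 337920 and the actual was 419840, so sizes_match(419840, 337920) is False.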
