look into options for harvesting metadata from GLOS #162

Open
fostermh opened this issue Jun 3, 2022 · 6 comments
Labels
enhancement New feature or request

Comments

@fostermh
Member

fostermh commented Jun 3, 2022

No description provided.

@fostermh
Member Author

fostermh commented Jun 3, 2022

GLOS's new Seagull application is built on Esri products and thus uses Geoportal as its backend metadata server. Metadata records can be requested using the standard Geoportal REST API. For example:

https://seagull-geoportal.glos.org/geoportal/rest/metadata/search?start=10&num=1&searchText=sys.schema.key:iso19115-2

Note that in this example I am requesting the metadata in ISO 19115-2 format; the XML itself is under the hits/hits[]/_source/sys_xml_clob JSON path. The rest of the returned JSON is the same as if we had added 'f=pjson' to the query string. Requesting metadata in XML format ('f=xml') results in RSS feed XML, which is not what we are looking for.
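As a rough sketch of how a harvester might page through this endpoint and pull out the embedded XML: the endpoint, query parameters, and JSON path are taken from the example above, while the function names, the default page handling, and the timeout are hypothetical choices made for illustration. The fetch function has not been run against the live server.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Search endpoint taken from the example URL above.
SEARCH_URL = "https://seagull-geoportal.glos.org/geoportal/rest/metadata/search"


def build_search_url(start, num, schema_key="iso19115-2"):
    """Build a paged search URL restricted to one metadata schema."""
    params = {
        "start": start,
        "num": num,
        "searchText": f"sys.schema.key:{schema_key}",
    }
    # Keep ':' unescaped so the URL matches the hand-written example.
    return f"{SEARCH_URL}?{urlencode(params, safe=':')}"


def extract_xml_records(payload):
    """Return the sys_xml_clob strings found under hits/hits[]/_source."""
    hits = payload.get("hits", {}).get("hits", [])
    return [
        hit["_source"]["sys_xml_clob"]
        for hit in hits
        if "sys_xml_clob" in hit.get("_source", {})
    ]


def fetch_xml_records(start, num):
    """Fetch one page of results (network call; hypothetical helper)."""
    with urlopen(build_search_url(start, num), timeout=30) as response:
        return extract_xml_records(json.load(response))
```

Hits without a sys_xml_clob field are skipped rather than raising, on the assumption that some records may not carry the ISO XML.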

I have tried every incarnation of date range search I can find, but none of them work on this instance of Geoportal. It seems that square brackets are not allowed, and none of the DATE casts seem to work.

@fostermh
Member Author

This appears to give the XML: https://seagull-geoportal.glos.org/geoportal/rest/metadata/item/e05de7ae36ee459395ba2d9b9e39e3dd/xml
So if we can get a list of dataset IDs, we could pull the XML in a standard way using a WAF harvester.
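A minimal sketch of turning a list of dataset IDs into per-record XML URLs, following the URL pattern above. How the ID list is obtained is still open; the function names here are hypothetical, and the single ID in the usage lines is the one from the example URL.

```python
from urllib.parse import quote

# Item endpoint taken from the example URL above.
ITEM_URL = "https://seagull-geoportal.glos.org/geoportal/rest/metadata/item"


def item_xml_url(item_id):
    """Return the XML URL for a single dataset ID."""
    # Escape the ID defensively in case it contains reserved characters.
    return f"{ITEM_URL}/{quote(item_id, safe='')}/xml"


def item_xml_urls(item_ids):
    """Return one XML URL per dataset ID, preserving input order."""
    return [item_xml_url(item_id) for item_id in item_ids]


# Usage with the ID from the example above.
for url in item_xml_urls(["e05de7ae36ee459395ba2d9b9e39e3dd"]):
    print(url)
```

A flat list of URLs like this is roughly what a WAF-style harvester would expect to crawl, assuming each URL returns a single metadata document.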

@fostermh
Member Author

Trying to harvest using CSW seems to be broken; I get 'Error gathering the identifiers from the CSW server [Document is XML.

@fostermh fostermh added this to the v1.5.0 milestone Apr 28, 2023
@fostermh
Member Author

It looks like the following query will give us the records modified since a given date (2023-01-20 in this example): https://seagull-geoportal.glos.org/geoportal/opensearch?q=&modified=2023-01-20/*&f=json
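For incremental harvesting, the same query could be generated from the date of the last successful run. This is only a sketch: the endpoint and the `modified=<date>/*` form come from the URL above, while the function name is hypothetical and the shape of the JSON response is not assumed here.

```python
from datetime import date

# OpenSearch endpoint taken from the example URL above.
OPENSEARCH_URL = "https://seagull-geoportal.glos.org/geoportal/opensearch"


def modified_since_url(since):
    """Build a query for records modified on or after the given date."""
    # The open-ended range '<date>/*' is kept literal, as in the example URL.
    return f"{OPENSEARCH_URL}?q=&modified={since.isoformat()}/*&f=json"


# Usage with the date from the example above.
print(modified_since_url(date(2023, 1, 20)))
```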

@fostermh fostermh modified the milestones: v1.5.0, v1.6.0 Oct 10, 2023
@fostermh fostermh added the enhancement New feature or request label Nov 3, 2023
@fostermh fostermh removed this from the v1.6.0 milestone May 23, 2024