Merge branch '239_dev_headless' into dev_ec
Showing 28 changed files with 1,583 additions and 174 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,66 @@ | ||
context: | ||
cache: true | ||
contextmaps: | ||
- file: ./configs/schemaorg-current-https.jsonld | ||
prefix: https://schema.org/ | ||
- file: ./configs/schemaorg-current-https.jsonld | ||
prefix: http://schema.org/ | ||
gleaner: | ||
mill: true | ||
runid: runX | ||
summon: true | ||
millers: | ||
graph: true | ||
minio: | ||
address: oss.geocodes-aws-dev.earthcube.org | ||
port: 443 | ||
ssl: true | ||
bucket: dvtest | ||
region: "" | ||
accesskey: worldsbestaccesskey | ||
secretkey: worldsbestsecretkey | ||
sources: | ||
- sourcetype: sitemap | ||
name: headless | ||
logo: https://opentopography.org/sites/opentopography.org/files/ot_transp_logo_2.png | ||
url: https://earthcube.github.io/GeoCODES-Metadata/site/sitemaps/headless.xml | ||
headless: true | ||
pid: https://www.re3data.org/repository/r3d100010655 | ||
propername: TEST HEADLESS SOURCES | ||
domain: http://wwwearthcube.org/headless/ | ||
active: true | ||
credentialsfile: "" | ||
other: {} | ||
headlesswait: 0 | ||
delay: 0 | ||
identifierpath: ' "$.distribution.contentUrl"' | ||
apipagelimit: 0 | ||
identifiertype: identifiersha | ||
fixcontextoption: 0 | ||
acceptcontenttype: application/ld+json, text/html | ||
jsonprofile: application/ld+json | ||
- sourcetype: sitemap | ||
name: mixed | ||
logo: http://ds.iris.edu/static/img/layout/logos/iris_logo_shadow.png | ||
url: http://ds.iris.edu/files/sitemap.xml | ||
headless: false | ||
pid: https://www.re3data.org/repository/r3d100010268 | ||
propername: TEST MIXED SOURCES | ||
domain: http://wwwearthcube.org/headless/ | ||
active: true | ||
credentialsfile: "" | ||
other: {} | ||
headlesswait: 0 | ||
delay: 0 | ||
identifierpath: "" | ||
apipagelimit: 0 | ||
identifiertype: identifiersha | ||
fixcontextoption: 0 | ||
acceptcontenttype: application/ld+json, text/html | ||
jsonprofile: application/ld+json | ||
summoner: | ||
after: "" | ||
delay: null | ||
headless: http://127.0.0.1:9222 | ||
mode: full | ||
threads: 5 |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,42 @@ | ||
--- | ||
minio: | ||
address: 0.0.0.0 | ||
port: 9000 | ||
accessKey: worldsbestaccesskey | ||
secretKey: worldsbestsecretkey | ||
ssl: false | ||
bucket: gleaner | ||
gleaner: | ||
runid: runX # this will be the bucket the output is placed in... | ||
summon: true # do we want to visit the web sites and pull down the files | ||
mill: true | ||
context: | ||
cache: true | ||
contextmaps: | ||
- prefix: "https://schema.org/" | ||
file: "./configs/schemaorg-current-https.jsonld" | ||
- prefix: "http://schema.org/" | ||
file: "./configs/schemaorg-current-https.jsonld" | ||
summoner: | ||
after: "" # "21 May 20 10:00 UTC" | ||
mode: full # full || diff: If diff compare what we have currently in gleaner to sitemap, get only new, delete missing | ||
threads: 5 | ||
delay: # milliseconds (1000 = 1 second) to delay between calls (will FORCE threads to 1) | ||
headless: http://127.0.0.1:9222 # URL for headless see docs/headless | ||
millers: | ||
graph: true | ||
# will be built from sources.csv | ||
#sitegraphs: | ||
#- name: aquadocs | ||
# url: https://oih.aquadocs.org/aquadocs.json | ||
# headless: false | ||
# pid: http://hdl.handle.net/1834/41372 | ||
# properName: AquaDocs | ||
# domain: https://aquadocs.org | ||
#sitemaps: | ||
#- name: samplesearth | ||
# url: https://samples.earth/sitemap.xml | ||
# headless: false | ||
# pid: https://www.re3data.org/repository/samplesearth | ||
# properName: Samples Earth (DEMO Site) | ||
# domain: https://samples.earth |
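The `delay` comment in the summoner block above states that setting a delay forces the thread count to 1. A minimal Python sketch of that rule follows, assuming the config has already been parsed into a dictionary. The function name `effective_threads` is hypothetical, and this is not taken from the tool's actual implementation.

```python
def effective_threads(summoner):
    """Return the thread count to use for a parsed summoner block (hypothetical helper)."""
    delay_ms = summoner.get("delay") or 0   # an empty or null delay is treated as no delay
    threads = summoner.get("threads") or 1  # assumption: fall back to one thread if unset
    # Per the config comment, any delay between calls forces a single thread.
    return 1 if delay_ms > 0 else threads


# Values copied from the summoner block; delay is left empty there.
summoner = {
    "after": "",
    "mode": "full",
    "threads": 5,
    "delay": None,
    "headless": "http://127.0.0.1:9222",
}

print(effective_threads(summoner))                     # no delay: configured threads
print(effective_threads({**summoner, "delay": 1000}))  # 1000 ms delay: single thread
```

The delay is expressed in milliseconds, so `1000` corresponds to one second between calls.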
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,26 @@ | ||
--- | ||
minio: | ||
address: oss.geocodes-aws-dev.earthcube.org | ||
port: 443 | ||
accessKey: worldsbestaccesskey | ||
secretKey: worldsbestsecretkey | ||
ssl: true | ||
bucket: dvtest # can be overridden with MINIO_BUCKET | ||
sparql: | ||
endpoint: https://graph.geocodes-dev.earthcube.org/blazegraph/namespace/earthcube/sparql | ||
s3: | ||
bucket: dvtest # sync with above... can be overridden with MINIO_BUCKET... gets zapped if it's not here. | ||
domain: us-east-1 | ||
|
||
#headless field in gleaner.summoner | ||
headless: http://127.0.0.1:9222 | ||
sourcesSource: | ||
type: csv | ||
location: sources.csv | ||
#location: https://docs.google.com/spreadsheets/d/e/2PACX-1vTt_45dYd5LMFK9Qm_lCg6P7YxG-ae0GZEtrHMZmNbI-y5tVDd8ZLqnEeIAa-SVTSztejfZeN6xmRZF/pub?gid=1340502269&single=true&output=csv | ||
# this can be a remote csv | ||
# type: csv | ||
# location: https://docs.google.com/spreadsheets/d/{key}/gviz/tq?tqx=out:csv&sheet={sheet_name} | ||
# TBD -- Just use the sources in the gleaner file. | ||
# type: yaml | ||
# location: gleaner.yaml |
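The comments in the file above note that the bucket can be overridden with `MINIO_BUCKET` and that `s3.bucket` should stay in sync with `minio.bucket`. The sketch below shows one way that precedence could be resolved in Python. It assumes the environment variable wins over the file value; the helper names `effective_bucket` and `buckets_in_sync` are hypothetical and not part of the tool.

```python
import os


def effective_bucket(config, environ=None):
    """Pick the bucket name, letting MINIO_BUCKET override the file value (assumed precedence)."""
    environ = os.environ if environ is None else environ
    override = environ.get("MINIO_BUCKET", "").strip()
    return override or config["minio"]["bucket"]


def buckets_in_sync(config):
    """Check that minio.bucket and s3.bucket agree, as the config comment asks."""
    return config["minio"]["bucket"] == config["s3"]["bucket"]


# Bucket values copied from the config above.
config = {"minio": {"bucket": "dvtest"}, "s3": {"bucket": "dvtest"}}

print(effective_bucket(config, environ={}))                         # file value
print(effective_bucket(config, environ={"MINIO_BUCKET": "other"}))  # overridden value
print(buckets_in_sync(config))
```

Passing `environ` explicitly keeps the helper easy to test without touching the real process environment.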
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,23 @@ | ||
minio: | ||
address: oss.geocodes-aws-dev.earthcube.org | ||
port: 443 | ||
ssl: true | ||
bucket: dvtest | ||
region: "" | ||
accesskey: worldsbestaccesskey | ||
secretkey: worldsbestsecretkey | ||
objects: | ||
bucket: dvtest | ||
domain: us-east-1 | ||
prefix: | ||
- summoned/headless | ||
- summoned/mixed | ||
- org | ||
prefixoff: [] | ||
sparql: | ||
endpoint: https://graph.geocodes-dev.earthcube.org/blazegraph/namespace/earthcube/sparql | ||
authenticate: false | ||
username: "" | ||
password: "" | ||
txtaipkg: | ||
endpoint: http://0.0.0.0:8000 |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,27 @@ | ||
minio: | ||
accesskey: worldsbestaccesskey | ||
address: 0.0.0.0 | ||
bucket: gleaner | ||
port: 9000 | ||
secretkey: worldsbestsecretkey | ||
ssl: false | ||
objects: | ||
bucket: gleaner | ||
domain: us-east-1 | ||
# prefix will be built using the sources.csv, and additional values | ||
prefix: | ||
- orgs | ||
- summoned/obps | ||
- prov/obps | ||
- summoned/aquadocs | ||
- prov/aquadocs | ||
- milled/marinetraining | ||
- prov/marinetraining | ||
- milled/obis | ||
- prov/obis | ||
- milled/oceanexperts | ||
- prov/oceanexperts | ||
sparql: | ||
endpoint: http://192.168.86.45:32775/blazegraph/namespace/lipd/sparql | ||
txtaipkg: | ||
endpoint: http://0.0.0.0:8000 |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,3 @@ | ||
hack,SourceType,Active,Name,ProperName,URL,Headless,IdentifierType,IdentifierPath,Domain,PID,Logo | ||
3,sitemap,TRUE,headless,TEST HEADLESS SOURCES,https://earthcube.github.io/GeoCODES-Metadata/site/sitemaps/headless.xml,TRUE,identifiersha," ""$.distribution.contentUrl""",http://wwwearthcube.org/headless/,https://www.re3data.org/repository/r3d100010655,https://opentopography.org/sites/opentopography.org/files/ot_transp_logo_2.png | ||
4,sitemap,TRUE,mixed,TEST MIXED SOURCES,http://ds.iris.edu/files/sitemap.xml,FALSE,identifiersha,,http://wwwearthcube.org/headless/,https://www.re3data.org/repository/r3d100010268,http://ds.iris.edu/static/img/layout/logos/iris_logo_shadow.png |
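The config comments in this commit say the `sources` entries are built from `sources.csv`. The sketch below illustrates how the two CSV rows above could map onto the `sources` keys seen in the generated config, using only Python's standard `csv` module. The function `row_to_source` and the exact column-to-key mapping are assumptions inferred from comparing the CSV with the YAML, not the tool's actual code. Keys that have no CSV column (such as `credentialsfile` or `headlesswait`) are left out.

```python
import csv
import io

# CSV content copied from the diff above.
CSV_TEXT = '''hack,SourceType,Active,Name,ProperName,URL,Headless,IdentifierType,IdentifierPath,Domain,PID,Logo
3,sitemap,TRUE,headless,TEST HEADLESS SOURCES,https://earthcube.github.io/GeoCODES-Metadata/site/sitemaps/headless.xml,TRUE,identifiersha," ""$.distribution.contentUrl""",http://wwwearthcube.org/headless/,https://www.re3data.org/repository/r3d100010655,https://opentopography.org/sites/opentopography.org/files/ot_transp_logo_2.png
4,sitemap,TRUE,mixed,TEST MIXED SOURCES,http://ds.iris.edu/files/sitemap.xml,FALSE,identifiersha,,http://wwwearthcube.org/headless/,https://www.re3data.org/repository/r3d100010268,http://ds.iris.edu/static/img/layout/logos/iris_logo_shadow.png
'''


def as_bool(text):
    """Interpret spreadsheet-style TRUE/FALSE strings."""
    return text.strip().upper() == "TRUE"


def row_to_source(row):
    """Map one CSV row to a sources entry (hypothetical mapping)."""
    return {
        "sourcetype": row["SourceType"],
        "name": row["Name"],
        "logo": row["Logo"],
        "url": row["URL"],
        "headless": as_bool(row["Headless"]),
        "pid": row["PID"],
        "propername": row["ProperName"],
        "domain": row["Domain"],
        "active": as_bool(row["Active"]),
        "identifierpath": row["IdentifierPath"],
        "identifiertype": row["IdentifierType"],
    }


sources = [row_to_source(r) for r in csv.DictReader(io.StringIO(CSV_TEXT))]

for source in sources:
    print(source["name"], source["headless"], repr(source["identifierpath"]))
```

The doubled quotes in the `IdentifierPath` column are CSV escaping, so the parsed value keeps its leading space and literal double quotes, matching the `identifierpath` value in the generated YAML.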