Command line interface
Marten edited this page Dec 11, 2020 · 3 revisions
The command line interface (CLI) is a Java application designed to run harvesting tasks outside the harvester web application. It is useful whenever a harvesting job needs to be integrated with the operating system, for example to schedule a job through the Unix cron mechanism.
java -jar harvest.jar [options] [--file <file>] | [--task <task definition>]
where:
- options are:
-h,--help: to display help for the harvest command
-v,--version: to display harvester version number
-V,--verbose: to receive more detailed information during harvesting
- commands are:
-f,--file <file>: to perform harvesting using the task defined in a file. You can either use a file exported from the harvester application or create the file manually.
-t,--task <task definition>: to perform harvesting using the task defined on the command line. This involves entering all parameters of the task.
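The options and commands above can be assembled into an argument list from a wrapper script, for instance one started by cron. The following is a minimal sketch, not part of the harvester itself: the function name `build_harvest_command` and its parameters are hypothetical, and it assumes that `--file` and `--task` are mutually exclusive, as the synopsis suggests.

```python
def build_harvest_command(jar="harvest.jar", file=None, task=None, verbose=False):
    """Return the argument list for one harvest run (nothing is executed here)."""
    # Assumption: exactly one of --file / --task is given, per the synopsis.
    if (file is None) == (task is None):
        raise ValueError("provide exactly one of 'file' or 'task'")
    cmd = ["java", "-jar", jar]
    if verbose:
        cmd.append("--verbose")
    if file is not None:
        cmd += ["--file", str(file)]
    else:
        cmd += ["--task", task]
    return cmd


print(build_harvest_command(file="task.json", verbose=True))
```

Passing the returned list to `subprocess.run` would start the harvest. Because the arguments are passed as a list, no shell quoting is needed.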
The task definition is a JSON document of the following form:
{
  "source" : {
    "type" : <Type of the adaptor: WAF | CSW | UNC | GPTSRC | AGS | AGP-IN | CKAN>,
    "label" : <Short name of the source>,
    "properties" : {
      <properties custom for each adaptor type>
    }
  },
  "destinations" : [{
      "action" : {
        "type" : <Type of the adaptor: FOLDER | GPT | AGP-OUT>,
        "label" : <Short name of the destination>,
        "properties" : {
          <properties custom for each adaptor type>
        }
      }
    }
  ],
  "incremental" : <Perform incremental harvesting if possible: true | false; default is false>,
  "ignoreRobots" : <Ignore directives from robots.txt if present: true | false; default is false>
}
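A hand-written task definition can be checked against this structure before it is passed to the harvester. The sketch below is only a pre-flight check based on the template on this page. The harvester's own validation rules are not documented here, and the function name `check_task_definition` is hypothetical.

```python
SOURCE_TYPES = {"WAF", "CSW", "UNC", "GPTSRC", "AGS", "AGP-IN", "CKAN"}
DESTINATION_TYPES = {"FOLDER", "GPT", "AGP-OUT"}


def check_task_definition(task):
    """Return a list of problems found in a parsed task definition (empty if none)."""
    problems = []

    source = task.get("source")
    if not isinstance(source, dict):
        problems.append("missing 'source' object")
    elif source.get("type") not in SOURCE_TYPES:
        problems.append(f"unknown source type: {source.get('type')!r}")

    destinations = task.get("destinations")
    if not isinstance(destinations, list):
        problems.append("'destinations' must be a list")
    else:
        for i, dest in enumerate(destinations):
            action = dest.get("action") if isinstance(dest, dict) else None
            if not isinstance(action, dict):
                problems.append(f"destination {i}: missing 'action' object")
            elif action.get("type") not in DESTINATION_TYPES:
                problems.append(f"destination {i}: unknown type {action.get('type')!r}")

    # Both flags are optional; when present they should be JSON booleans.
    for flag in ("incremental", "ignoreRobots"):
        if flag in task and not isinstance(task[flag], bool):
            problems.append(f"'{flag}' must be true or false")
    return problems
```

The function takes the result of `json.load` or `json.loads`. An empty list means no structural problem was detected.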
For example, the following task harvests a CSW catalog into a Geoportal Server:
{
  "source" : {
    "type" : "CSW",
    "label" : "GEOSUR",
    "properties" : {
      "csw-host-url" : "http://www.geosur.info/geoportal/csw",
      "csw-profile-id" : "urn:ogc:CSW:2.0.2:HTTP:OGCCORE:ESRI:GPT"
    }
  },
  "destinations" : [{
      "action" : {
        "type" : "GPT",
        "label" : "GPT",
        "properties" : {
          "gpt-host-url" : "http://localhost:8080/geoportal",
          "cred-username" : "user name",
          "cred-password" : "user password",
          "gpt-cleanup" : "true"
        }
      }
    }
  ],
  "incremental" : false,
  "ignoreRobots" : true
}
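A task file like this one can also be generated by a script instead of being written by hand. The sketch below builds the same CSW-to-GPT task and writes it to a file for use with the `-f` option. The helper names `csw_to_gpt_task` and `write_task_file` are hypothetical, the key names follow the task definition template, and the temporary file location is for illustration only.

```python
import json
import tempfile
from pathlib import Path


def csw_to_gpt_task(csw_url, profile_id, gpt_url, username, password, label="GEOSUR"):
    """Build a task definition that harvests a CSW catalog into Geoportal Server."""
    return {
        "source": {
            "type": "CSW",
            "label": label,
            "properties": {
                "csw-host-url": csw_url,
                "csw-profile-id": profile_id,
            },
        },
        "destinations": [{
            "action": {
                "type": "GPT",
                "label": "GPT",
                "properties": {
                    "gpt-host-url": gpt_url,
                    "cred-username": username,
                    "cred-password": password,
                    "gpt-cleanup": "true",  # kept as a string, as in the example
                },
            },
        }],
        "incremental": False,
        "ignoreRobots": True,
    }


def write_task_file(task, path):
    """Serialize the task definition to a JSON file and return its path."""
    path = Path(path)
    path.write_text(json.dumps(task, indent=2), encoding="utf-8")
    return path


task = csw_to_gpt_task(
    "http://www.geosur.info/geoportal/csw",
    "urn:ogc:CSW:2.0.2:HTTP:OGCCORE:ESRI:GPT",
    "http://localhost:8080/geoportal",
    "user name",
    "user password",
)
with tempfile.TemporaryDirectory() as tmp:
    task_file = write_task_file(task, Path(tmp) / "task.json")
    print(f"java -jar harvest.jar -f {task_file}")
```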
- With the -t option, supply the task definition itself on the command line rather than the path to a task definition file:
java -jar harvest.jar -t "<task definition>"
- Note that the task definition is enclosed in quotes, and that the quoting differs between Linux/Unix and Windows. For simplicity, suppose the task looks like this (JSON):
{"source": {}, "destinations": []}
On Linux/Unix, the task definition can be enclosed in single quotes, so the command looks like:
java -jar harvest.jar -t '{"source": {}, "destinations": []}'
On Windows, the double quotes inside the definition have to be escaped by doubling them, so the command looks like:
java -jar harvest.jar -t "{""source"": {}, ""destinations"": []}"
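The two quoting styles can be produced mechanically from the same task definition. This sketch uses the standard `shlex.quote` for the Linux/Unix form. For Windows it simply doubles the double quotes as described on this page, which is not a general-purpose escaping routine. The function names are hypothetical.

```python
import json
import shlex


def posix_command(task):
    """Command line for Linux/Unix shells: the JSON is wrapped in single quotes."""
    return "java -jar harvest.jar -t " + shlex.quote(json.dumps(task))


def windows_command(task):
    """Command line for Windows: inner double quotes are doubled, per this page."""
    task_json = json.dumps(task)
    return 'java -jar harvest.jar -t "' + task_json.replace('"', '""') + '"'


task = {"source": {}, "destinations": []}
print(posix_command(task))    # java -jar harvest.jar -t '{"source": {}, "destinations": []}'
print(windows_command(task))  # java -jar harvest.jar -t "{""source"": {}, ""destinations"": []}"
```

When the harvester is started from another program rather than typed into a shell, passing the arguments as a list (for example to `subprocess.run`) avoids the quoting problem altogether.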
- source adaptor properties are:
WAF - web accessible folder
  "waf-host-url" - harvesting root URL
  "waf-pattern" - pattern to filter harvested files (GLOB pattern syntax); default: **.xml
  "cred-username" - user name (optional)
  "cred-password" - user password (optional)
CSW - Catalog Service for the Web
  "csw-host-url" - CSW service URL
  "csw-profile-id" - CSW profile id
  "cred-username" - user name (optional)
  "cred-password" - user password (optional)
UNC - Uniform Naming Convention folder
  "unc-root-folder" - root folder (must be accessible by the harvester)
  "unc-pattern" - pattern to filter harvested files (GLOB pattern syntax); default: **.xml
GPTSRC - Geoportal Server 2.0
  "gpt-host-url" - Geoportal Server URL
  "gpt-index" - name of the index to harvest (optional)
  "cred-username" - user name
  "cred-password" - user password
AGS - ArcGIS Server services
  "ags-host-url" - ArcGIS Server host URL (before rest/services)
  "ags-enable-layers" - true to harvest layers from map service (default: false)
AGP-IN - ArcGIS Portal (or ArcGIS Online)
  "agp-host-url" - ArcGIS Portal host URL (before sharing)
  "agp-folder-id" - folder id or folder name (optional)
  "cred-username" - user name
  "cred-password" - user password
CKAN - CKAN
  "ckan-host-url" - CKAN host URL
  "ckan-apikey" - CKAN API key (optional)
- destination adaptor properties are:
FOLDER - local folder
  "folder-root-folder" - root folder
  "folder-cleanup" - cleanup data
GPT - Geoportal Server 2.0
  "gpt-host-url" - Geoportal Server URL
  "gpt-index" - name of the index to harvest (optional)
  "gpt-force-add" - force adding records instead of checking if they exist
  "gpt-cleanup" - cleanup data
  "cred-username" - user name
  "cred-password" - user password
AGP-OUT - ArcGIS Portal (or ArcGIS Online)
  "agp-host-url" - ArcGIS Portal host URL (before sharing)
  "agp-folder-id" - folder id or folder name (optional)
  "agp-folder-cleanup" - cleanup data
  "cred-username" - user name
  "cred-password" - user password