Fix visual bugs, improve documentation and add Postgres Harvester #111

Merged
merged 7 commits into from
Nov 11, 2024
153 changes: 151 additions & 2 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -16,7 +16,9 @@
This CKAN extension provides functions and templates specifically designed to extend `ckanext-scheming` and `ckanext-dcat` and includes RDF profiles and Harvest enhancements to adapt CKAN Schema to multiple metadata profiles such as [GeoDCAT-AP](./ckanext/schemingdcat/schemas/geodcat_ap/es_geodcat_ap_full.yaml) or [DCAT-AP](./ckanext/schemingdcat/schemas/dcat_ap/eu_dcat_ap_full.yaml).

> [!WARNING]
> This project requires [mjanez/ckanext-dcat](https://github.com/mjanez/ckanext-dcat) (for newer releases) or [ckan/ckanext-dcat](https://github.com/ckan/ckanext-dcat) (older), along with [ckan/ckanext-scheming](https://github.com/ckan/ckanext-scheming) and [ckan/ckanext-spatial](https://github.com/ckan/ckanext-spatial) to work properly. If you want to use custom schemas with multilingual support, you need to use `ckanext-fluent`. A fixed version is available at [mjanez/ckanext-fluent](https://github.com/mjanez/ckanext-fluent).
> This project requires [mjanez/ckanext-dcat](https://github.com/mjanez/ckanext-dcat) (for newer releases) or [ckan/ckanext-dcat](https://github.com/ckan/ckanext-dcat) (older), along with [ckan/ckanext-scheming](https://github.com/ckan/ckanext-scheming) and [ckan/ckanext-spatial](https://github.com/ckan/ckanext-spatial) to work properly.
> * If you want to use custom schemas with multilingual support, you need to use `ckanext-fluent`. A fixed version is available at [mjanez/ckanext-fluent](https://github.com/mjanez/ckanext-fluent).
> * If you want to use custom harvesters, you need to use `ckanext-harvest`. An improved, more private version is available at [mjanez/ckanext-harvest](https://github.com/mjanez/ckanext-harvest).

> [!TIP]
> It is **recommended to use with:** [`ckan-docker`](https://github.com/mjanez/ckan-docker) deployment or only use [`ckan-pycsw`](https://github.com/mjanez/ckan-pycsw) to deploy a CSW Catalog.
Expand Down Expand Up @@ -124,7 +126,10 @@ To use custom schemas in `ckanext-scheming`:
```

### Harvest
Add the [custom Harvesters](#harvesters) to the list of plugins as you need:
**Requirement**:
- If you want to use custom harvesters, you need to use `ckanext-harvest`. An improved, more private version is available at [mjanez/ckanext-harvest](https://github.com/mjanez/ckanext-harvest).

Next add the [custom Harvesters](#harvesters) to the list of plugins as you need:

```ini
ckan.plugins = ... spatial_metadata ... dcat ... schemingdcat ... harvest ... schemingdcat_ckan_harvester schemingdcat_csw_harvester ...
Expand Down Expand Up @@ -800,6 +805,147 @@ The `ckan schemingdcat` command offers utilities:

ckan schemingdcat download-rdf-eu-vocabs

### SQL Harvester
The plugin includes a harvester for local databases using the custom schemas provided by `schemingdcat` and `ckanext-scheming`.

To use it, you need to add the `schemingdcat_postgres_harvester` plugin to your options file:

```ini
ckan.plugins = harvest schemingdcat schemingdcat_datasets ... schemingdcat_ckan_harvester schemingdcat_postgres_harvester
```

The SQL Harvester is configured through a JSON object. Its options are described in the Schema Generation Guide below.

### Schema Generation Guide
This guide will help you generate a schema that is compatible with our system. The schema is a JSON object that defines the mapping of fields in your database to the fields in our system.

#### Field Mapping Structure
The `dataset_field_mapping`/`distribution_field_mapping` is structured as follows (multilingual version):

```json
{
  ...
  "field_mapping_schema_version": 1,
  "<dataset_field_mapping>/<distribution_field_mapping>": {
    "<schema_field_name>": {
      "languages": {
        "<language>": {
          <"field_value": "<fixed_value>/<fixed_value_list>">,/<"field_name": "<db_field_name>/<db_field_name_list>">
        },
        ...
      },
      ...
    },
    ...
  }
}
```

* `<schema_field_name>`: The name of the field in the CKAN schema.
* `<language>`: (Optional) The language code for multilingual fields. Must be a valid [ISO 639-1](https://localizely.com/iso-639-1-list/) language code. Now nested under the `languages` key.
* `<fixed_value>/<fixed_value_list>`: (Optional) A fixed value or a list of fixed values to be assigned to the field for all records.
* **Field Labels**: Field name in the database:
    * `<db_field_name>/<db_field_name_list>`: (Optional) The name of the field, or a list of field names, in your database. Each name must be in the format `{schema}.{table}.{field}`.

For fields that are not multilingual, you can use `field_name` directly without the `languages` key. For example:

```json
{
  ...
  "field_mapping_schema_version": 2,
  "<dataset_field_mapping>/<distribution_field_mapping>": {
    "<schema_field_name>": {
      <"field_value": "<fixed_value>/<fixed_value_list>">,/<"field_name": "<db_field_name>/<db_field_name_list>">
    },
    ...
  }
}
```
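
The mapping rules above can be illustrated with a minimal Python sketch. It is not the harvester's own code: `ColumnRef`, `parse_column_ref` and `collect_column_refs` are hypothetical helper names, and the only assumptions taken from this guide are the `{schema}.{table}.{field}` format and the optional `languages` nesting.

```python
from typing import Any, Dict, List, NamedTuple, Union


class ColumnRef(NamedTuple):
    """One database column referenced by a field mapping entry."""
    schema: str
    table: str
    field: str


def parse_column_ref(name: str) -> ColumnRef:
    """Split a '{schema}.{table}.{field}' string into its three parts."""
    parts = name.split(".")
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected '{{schema}}.{{table}}.{{field}}', got {name!r}")
    return ColumnRef(*parts)


def _as_list(value: Union[str, List[str], None]) -> List[str]:
    # `field_name` may be absent, a single name or a list of names.
    if value is None:
        return []
    return [value] if isinstance(value, str) else list(value)


def collect_column_refs(entry: Dict[str, Any]) -> Dict[str, List[ColumnRef]]:
    """Return the referenced columns per language.

    Non-multilingual entries are reported under the empty-string key.
    """
    if "languages" in entry:
        items = entry["languages"].items()
    else:
        items = [("", entry)]
    return {
        lang: [parse_column_ref(n) for n in _as_list(spec.get("field_name"))]
        for lang, spec in items
    }
```

For an entry such as `{"languages": {"es": {"field_name": "fototeca.vuelos.nom_vuelo"}}}`, the sketch returns one `ColumnRef` under the `es` key. An entry that only carries `field_value` yields an empty list, since it references no column.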

A complete harvest source configuration example:

```json
{
  "database_type": "postgres",
  "credentials": {
    "user": "u_fototeca",
    "password": "u_fototeca",
    "host": "localhost",
    "port": 5432,
    "db": "fototeca"
  },
  "field_mapping_schema_version": 1,
  "dataset_field_mapping": {
    "alternate_identifier": {
      "field_name": "fototeca.vista_ckan.cod_vuelo",
      "is_p_key": true,
      "index": true,
      "f_key_references": [
        "fototeca.vuelos.cod_vuelo"
      ]
    },
    "flight_color": {
      "field_name": "fototeca.vista_ckan.color",
      "f_key_references": [
        "fototeca.l_color.color"
      ]
    },
    "encoding": {
      "field_value": "UTF-8"
    },
    "title_translated": {
      "languages": {
        "es": {
          "field_name": "fototeca.vuelos.nom_vuelo"
        }
      }
    }
  },
  "defaults_groups": [],
  "defaults_tags": [],
  "default_group_dicts": []
}
```

* `database_type`: The type of your database. Currently, only `postgres` is supported.
* `credentials`: The credentials to connect to your database. Must include the username (`user`), `password`, `host`, `port`, and database name (`db`).
* `field_mapping_schema_version`: The version of the field mapping schema. Currently, only version `1` is supported.
* `dataset_field_mapping`: The mapping of fields in your database to the fields in our system. Each field must be in the format `{schema}.{table}.{field}`.
* Other properties of `ckanext-harvest`/`ckanext-schemingdcat`.
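
As a rough illustration of the options listed above, the following Python sketch checks a configuration and derives a PostgreSQL connection URI from it. It is a sketch under stated assumptions, not the harvester's validation logic: `validate_config` and `build_postgres_uri` are hypothetical names, and the credential keys (`user`, `password`, `host`, `port`, `db`) are taken from the example configuration.

```python
from typing import Any, Dict
from urllib.parse import quote

# Key names as they appear in the example configuration (assumption).
REQUIRED_CREDENTIALS = ("user", "password", "host", "port", "db")


def validate_config(config: Dict[str, Any]) -> None:
    """Raise ValueError if the configuration misses a documented option."""
    if config.get("database_type") != "postgres":
        raise ValueError("only 'postgres' is supported as database_type")
    credentials = config.get("credentials") or {}
    missing = [key for key in REQUIRED_CREDENTIALS if key not in credentials]
    if missing:
        raise ValueError(f"missing credentials: {', '.join(missing)}")
    if not isinstance(config.get("dataset_field_mapping"), dict):
        raise ValueError("dataset_field_mapping must be a JSON object")


def build_postgres_uri(credentials: Dict[str, Any]) -> str:
    """Build a libpq-style connection URI, percent-encoding user and password."""
    user = quote(str(credentials["user"]), safe="")
    password = quote(str(credentials["password"]), safe="")
    host, port, db = credentials["host"], credentials["port"], credentials["db"]
    return f"postgresql://{user}:{password}@{host}:{port}/{db}"
```

Percent-encoding the user and password keeps characters such as `@` or `/` from being misread as URI delimiters.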

#### Field Types
There are two types of fields that can be defined in the configuration:

1. **Regular Fields**: These fields either map to a database field through a field label or take a fixed value for all their records.
    - **Properties**: A field can have one of these properties:
        - **Fixed Value Fields (`field_value`)**: These fields have a fixed value that is assigned to all records, defined with the `field_value` property. If `field_value` is a list, `field_name` can be set at the same time, and `field_value` then extends the list obtained from the remote field.
        - **Field Labels**: Field name in the database:
            - **Name-Based Fields (`field_name`)**: These fields are defined by their name in the DB table, using the `field_name` property. To facilitate data retrieval from the database, especially the identification of primary keys (`p_key`) and foreign keys (`f_key`), the following properties can be added to the `field_mapping` schema:
                1. **Primary Key Field (`is_p_key`)** [*Optional*]: Indicates whether the field is a primary key (`p_key`). If not indicated, the field is not treated as a primary key. This facilitates join operations and references between tables.
                2. **Table References (`f_key_references`)** [*Optional* (`list`)]: For fields that are foreign keys, this property specifies which schemas, tables, and fields the foreign key refers to. For example, `["public.vuelo.id", "public.camara.id"]`. This is useful for automating joins between tables.
                3. **Index (`index`)** [*Optional*]: A boolean property that indicates whether the field should be indexed to improve query efficiency. Although not specific to primary or foreign keys, it is relevant for query optimization. Defaults to `false`.

The modified schema would allow for more efficient data retrieval and simplify the construction of the DataFrame, especially in complex scenarios with multiple tables and relationships. Here is an example of how the modified schema would look for a field that is a foreign key:

```json
"dataset_field_mapping": {
  "alternate_identifier": {
    "field_name": "fototeca.vista_ckan.cod_vuelo",
    "is_p_key": true,
    "index": true,
    "f_key_references": [
      "fototeca.vuelos.cod_vuelo"
    ]
  }
}
```

2. **Multilingual Fields (`languages`)**: These fields have different values for different languages. Each language is represented as a separate object within the field object (`es`, `en`, ...). The language object can have `field_value` and `field_name` properties, just like a regular field.
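
To make the role of `f_key_references` more concrete, here is a minimal Python sketch that turns the references of a field mapping into SQL `LEFT JOIN` clauses. This is only one way such joins could be automated, not the query the harvester actually builds: `build_join_clauses` is a hypothetical helper, and the choice of `LEFT JOIN` is an assumption.

```python
from typing import Any, Dict, List, Tuple


def _split(ref: str) -> Tuple[str, str]:
    """Split '{schema}.{table}.{field}' into ('schema.table', 'field')."""
    schema, table, field = ref.split(".")
    return f"{schema}.{table}", field


def build_join_clauses(field_mapping: Dict[str, Dict[str, Any]]) -> List[str]:
    """Return one LEFT JOIN clause per entry in `f_key_references`.

    Entries without a plain string `field_name` (fixed values, multilingual
    fields) are skipped in this sketch.
    """
    clauses = []
    for spec in field_mapping.values():
        source = spec.get("field_name")
        if not isinstance(source, str):
            continue
        src_table, src_field = _split(source)
        for target in spec.get("f_key_references", []):
            tgt_table, tgt_field = _split(target)
            # Plain string interpolation keeps the sketch short; real code
            # should quote identifiers with the database driver's helpers.
            clauses.append(
                f"LEFT JOIN {tgt_table} "
                f"ON {src_table}.{src_field} = {tgt_table}.{tgt_field}"
            )
    return clauses
```

For the `alternate_identifier` example above, the sketch produces `LEFT JOIN fototeca.vuelos ON fototeca.vista_ckan.cod_vuelo = fototeca.vuelos.cod_vuelo`.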

## DCAT Profiles
This plugin also contains custom [`ckanext-dcat` profiles](./ckanext/schemingdcat/profiles) to serialize a CKAN dataset to a:
Expand Down Expand Up @@ -832,6 +978,9 @@ To define which profiles to use you can:
Note that in both cases the order in which you define them matters, as it is the order in which the profiles will be run.

### Multilingual RDF support
**Requirement**:
- If you want to use custom schemas with multilingual support, you need to use `ckanext-fluent`. A fixed version is available at [mjanez/ckanext-fluent](https://github.com/mjanez/ckanext-fluent).

To add multilingual values from CKAN to RDF, the [`SchemingDCATRDFProfile` method `_object_value`](./ckanext/schemingdcat/profiles/base.py) can be called with the optional parameter `multilang=true` (defaults to `false`).
If `_object_value` is called with `multilang=true` but no language attribute is found, the value is added as a Literal with the default language (`en`).
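
The fallback rule described above can be modelled in plain Python. The sketch below does not call `_object_value`, `ckanext-dcat` or `rdflib`; `to_language_literals` is a hypothetical function that only mirrors the rule, assuming multilingual values arrive as a `{language: text}` dictionary.

```python
from typing import Any, List, Optional, Tuple

DEFAULT_LANG = "en"  # default language named in the text above


def to_language_literals(
    value: Any, multilang: bool = False, default_lang: str = DEFAULT_LANG
) -> List[Tuple[str, Optional[str]]]:
    """Return (text, language) pairs; language is None for plain literals."""
    if not multilang:
        # Without multilang the value is emitted as a single untagged literal.
        return [(str(value), None)]
    if isinstance(value, dict):
        # One literal per language, skipping empty translations.
        return [(text, lang) for lang, text in value.items() if text]
    # multilang requested but no language information: use the default.
    return [(str(value), default_lang)]
```

Called with `{"es": "Vuelos", "en": "Flights"}` and `multilang=True`, the sketch yields one pair per language. A bare string with `multilang=True` is tagged with `en`.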

Expand Down
23 changes: 17 additions & 6 deletions ckanext/schemingdcat/assets/css/schemingdcat.css
Original file line number Diff line number Diff line change
Expand Up @@ -953,6 +953,23 @@ img.spatial_uri-icon {
vertical-align: middle;
}

/* Rows dataset/resource metadata */
.truncate-text {
  white-space: nowrap;
  overflow: hidden;
  text-overflow: ellipsis;
  display: block;
  max-width: 100%;
}
.truncate-link {
  white-space: nowrap;
  overflow: hidden;
  text-overflow: ellipsis;
  display: inline-block;
  max-width: 100%;
  vertical-align: middle;
}

/* Links list*/
.link-list {
list-style-type: "🔗 ";
Expand Down Expand Up @@ -1445,12 +1462,6 @@ img.item_image {
table-layout: fixed; /* Fixes the table layout */
width: 100%; /* Ensures the table takes up 100% of the container */
}
/* Apply text truncation to cells in metadata_info */
.truncate-text {
overflow: hidden; /* Hides the text overflow */
text-overflow: ellipsis; /* Adds ellipsis if text is too long */
}

.table tr.toggle-separator {
display: table-row;
}
Expand Down
4 changes: 4 additions & 0 deletions ckanext/schemingdcat/i18n/ckanext-schemingdcat.pot
Original file line number Diff line number Diff line change
Expand Up @@ -852,6 +852,10 @@ msgstr ""
msgid "Last Modified"
msgstr ""

#: ckanext/harvest/templates/source/search.html:25
msgid "Only <code>sysadmin</code> users can manage harvest sources. Check the <a href='https://github.com/mjanez/ckanext-harvest'>ckanext-harvest</a> documentation for more information."
msgstr ""

#: ckanext/harvest/templates/source/search.html:50
msgid "Search harvest sources..."
msgstr ""
Expand Down
Binary file modified ckanext/schemingdcat/i18n/en/LC_MESSAGES/ckanext-schemingdcat.mo
Binary file not shown.
Original file line number Diff line number Diff line change
Expand Up @@ -856,6 +856,10 @@ msgstr "Name Descending"
msgid "Last Modified"
msgstr "Last Modified"

#: ckanext/harvest/templates/source/search.html:25
msgid "Only <code>sysadmin</code> users can manage harvest sources. Check the <a href='https://github.com/mjanez/ckanext-harvest'>ckanext-harvest</a> documentation for more information."
msgstr "Only <code>sysadmin</code> users can manage harvest sources. Check the <a href='https://github.com/mjanez/ckanext-harvest'>ckanext-harvest</a> documentation for more information."

#: ckanext/harvest/templates/source/search.html:50
msgid "Search harvest sources..."
msgstr "Search harvest sources..."
Expand Down
Binary file modified ckanext/schemingdcat/i18n/es/LC_MESSAGES/ckanext-schemingdcat.mo
Binary file not shown.
Original file line number Diff line number Diff line change
Expand Up @@ -858,6 +858,10 @@ msgstr "Nombre descendente"
msgid "Last Modified"
msgstr "Última modificación"

#: ckanext/harvest/templates/source/search.html:25
msgid "Only <code>sysadmin</code> users can manage harvest sources. Check the <a href='https://github.com/mjanez/ckanext-harvest'>ckanext-harvest</a> documentation for more information."
msgstr "Solo los usuarios <code>sysadmin</code> pueden gestionar las fuentes de cosecha. Comprueba la documentación de <a href='https://github.com/mjanez/ckanext-harvest'>ckanext-harvest</a> para más información."

#: ckanext/harvest/templates/source/search.html:50
msgid "Search harvest sources..."
msgstr "Buscar fuentes de cosecha..."
Expand Down
12 changes: 6 additions & 6 deletions ckanext/schemingdcat/logic/auth/ckan.py
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
import logging
import typing

import ckan.plugins.toolkit as toolkit
import ckan.plugins as p
import ckan.logic as logic
import ckan.logic.auth as auth

Expand All @@ -12,7 +12,7 @@
_check_access = logic.check_access

# Credits to logic.auth.ckan: https://github.com/kartoza/ckanext-dalrrd-emc-dcpr
@toolkit.chained_auth_function
@p.toolkit.chained_auth_function
def package_update(next_auth, context, data_dict=None):
"""Custom auth for the package_update action.

Expand All @@ -24,7 +24,7 @@ def package_update(next_auth, context, data_dict=None):
if user.sysadmin:
final_result = next_auth(context, data_dict)
elif data_dict is not None:
# NOTE: we do not call toolkit.get_action("package_show") here but rather do it
# NOTE: we do not call p.toolkit.get_action("package_show") here but rather do it
# the same as vanilla CKAN which uses a custom way to retrieve the object from
# the context - this is in order to ensure other extensions
# (e.g. ckanext.harvest) are able to function correctly
Expand All @@ -40,7 +40,7 @@ def package_update(next_auth, context, data_dict=None):
org_id = data_dict.get("owner_org", package.owner_org)
if org_id is not None:
# Using the schemingdcat_member_list action to obtain correct roles
members = toolkit.get_action("schemingdcat_member_list")(
members = p.toolkit.get_action("schemingdcat_member_list")(
data_dict={"id": org_id, "object_type": "user"}
)
#log.debug('members:%s', members)
Expand All @@ -62,7 +62,7 @@ def package_update(next_auth, context, data_dict=None):
return final_result


@toolkit.chained_auth_function
@p.toolkit.chained_auth_function
def package_patch(
next_auth: typing.Callable, context: typing.Dict, data_dict: typing.Dict
):
Expand Down Expand Up @@ -93,7 +93,7 @@ def authorize_package_publish(
# beforehand, so we deny
owner_org = data_.get("owner_org", data_.get("group_id"))
if owner_org is not None:
members = toolkit.get_action("member_list")(
members = p.toolkit.get_action("member_list")(
data_dict={"id": owner_org, "object_type": "user"}
)
admin_member_ids = [
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -2451,7 +2451,7 @@ dataset_fields:
es: Condiciones atmosféricas (AC)
value: http://inspire.ec.europa.eu/theme/ac
- label:
en: Land Cover (LC)
en: Land cover (LC)
es: Cubierta terrestre (LC)
value: http://inspire.ec.europa.eu/theme/lc
- label:
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -2147,7 +2147,7 @@ dataset_fields:
es: Condiciones atmosféricas (AC)
value: http://inspire.ec.europa.eu/theme/ac
- label:
en: Land Cover (LC)
en: Land cover (LC)
es: Cubierta terrestre (LC)
value: http://inspire.ec.europa.eu/theme/lc
- label:
Expand Down
20 changes: 20 additions & 0 deletions ckanext/schemingdcat/templates/package/base.html
Original file line number Diff line number Diff line change
@@ -0,0 +1,20 @@
{% ckan_extends %}

{% block breadcrumb_content %}
  {% if pkg %}
    {% set org = h.get_organization(pkg.organization.id) %}
    {% set dataset = h.dataset_display_name(pkg) %}
    {% if org %}
      {% set organization = h.get_translated(org, 'title') or org.name %}
      {% set group_type = org.type %}
      <li>{% link_for h.humanize_entity_type('organization', group_type, 'breadcrumb') or _('Organizations'), named_route=group_type ~ '.index' %}</li>
      <li>{% link_for organization|truncate(30), named_route=group_type ~ '.read', id=org.name, title=organization %}</li>
    {% else %}
      <li>{% link_for _(dataset_type.title()), named_route=dataset_type ~ '.search' %}</li>
    {% endif %}
    <li{{ self.breadcrumb_content_selected() }}>{% link_for dataset|truncate(30), named_route=pkg.type ~ '.read', id=pkg.name, title=dataset %}</li>
  {% else %}
    <li>{% link_for _(dataset_type.title()), named_route=dataset_type ~ '.search' %}</li>
    <li class="active"><a href="">{{ h.humanize_entity_type('package', dataset_type, 'create label') or _('Create Dataset') }}</a></li>
  {% endif %}
{% endblock %}
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
<span class="truncate-text">{{ data[field.field_name] }}</span>
Original file line number Diff line number Diff line change
@@ -1 +1 @@
{{ h.link_to(data[field.field_name], data[field.field_name], rel=field.display_property, target='_blank') }}
{{ h.link_to(data[field.field_name], data[field.field_name], rel=field.display_property, target='_blank', class='truncate-link') }}
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
{% if data[field.field_name] %}
{% set url_name = 'EPSG:' + h.schemingdcat_prettify_url_name(data[field.field_name]) %}
{{ h.link_to(url_name, data[field.field_name], rel=field.display_property, target='_blank') }}
{{ h.link_to(url_name, data[field.field_name], rel=field.display_property, target='_blank', class='truncate-link') }}
{% endif %}
Original file line number Diff line number Diff line change
@@ -1,3 +1,3 @@
{% set url_name = h.schemingdcat_prettify_url_name(data[field.field_name]) %}

{{ h.link_to(url_name, data[field.field_name], rel=field.display_property, target='_blank') }}
{{ h.link_to(url_name, data[field.field_name], rel=field.display_property, target='_blank', class='truncate-link') }}
Original file line number Diff line number Diff line change
Expand Up @@ -29,13 +29,13 @@
<ul class="{{ _class }}">
{% for val, label in filtered_choices %}
<li>
<a class="{% block link_schema_list_item_class %}{{ val }}{% endblock %}" href="{{ val }}">{{ label }}</a>
<a class="{% block link_schema_list_item_class %}{{ val }}{% endblock %} truncate-link" href="{{ val }}">{{ label }}</a>
</li>
{% endfor %}
</ul>
{% else %}
{% for val, label in filtered_choices %}
<a class="{{ val }}" href="{{ val }}">{{ label }}</a>
<a class="{{ val }} truncate-link" href="{{ val }}">{{ label }}</a>
{% endfor %}
{% endif %}
{% endblock %}
Original file line number Diff line number Diff line change
Expand Up @@ -27,7 +27,7 @@
{% set value = h.scheming_clean_json_value(value.replace(' ', '')) %}
{% set label = h.scheming_clean_json_value(label.replace(' ', '')) %}
<li>
<a href="{{ value }}">{{ label }}</a>
<a href="{{ value }}" class="truncate-link">{{ label }}</a>
</li>
{% endfor %}
</ul>
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -20,7 +20,7 @@
{% for value in values if value|length %}
{% set value = h.scheming_clean_json_value(value.replace(' ', '')) %}
<li>
<a href="{{ value }}">{{ value }}</a>
<a href="{{ value }}" class="truncate-link">{{ value }}</a>
</li>
{% endfor %}
</ul>
Expand Down