Merge pull request #60 from Open-Telecoms-Data/jh/docs-update-post-testing

chore(docs): update howto, development and README following internal …
odscjen authored Jul 25, 2024
2 parents 550431a + 105ae6d commit 3714fd0
Showing 3 changed files with 36 additions and 165 deletions.
139 changes: 3 additions & 136 deletions README.md
@@ -1,138 +1,5 @@
# OFDS Deduplication Tool
# OFDS Deduplication Tool [Beta]

A tool to consolidate multiple data sets formatted using the Open Fibre Data Standard. Implemented as a QGIS plugin.
A [QGIS](https://www.qgis.org/) plugin to consolidate (deduplicate, combine) two fibre optic network data sets formatted using the [Open Fibre Data Standard](https://open-fibre-data-standard.readthedocs.io).

## Development Environment

## Developing the QGIS Plugin

Handy links to look at:

- https://www.qgistutorials.com/en/docs/3/building_a_python_plugin.html
- https://docs.qgis.org/testing/en/docs/pyqgis_developer_cookbook/plugins/index.html#developing-python-plugins

### Setup

Tools you'll need:

- QGIS 3.28+
- Qt5 Dev Tools, which should include:
- Qt5 Designer
- `pyuic5` tool

Install QGIS dev package:

```bash
sudo apt install qgis-dev # Ubuntu
sudo dnf install qgis-devel # Fedora
```

### Enabling & Running the plugin in QGIS

You'll need to symlink your project directory into QGIS's local plugins directory, making the directory if it doesn't already exist, i.e.:

From within the project git repo/directory:

```bash
cd path/to/ofds_consolidation_tool/

# NOTE: The project folder *must* only use underscores and letters, no dashes

QGIS_PLUGINS_DIR="$HOME/.local/share/QGIS/QGIS3/profiles/default/python/plugins"
mkdir -p "$QGIS_PLUGINS_DIR"
ln -s "$PWD" "$QGIS_PLUGINS_DIR"
```

There are a couple of useful helper plugins for developing your plugin, `Plugin Reloader` and `First Aid`, see: https://docs.qgis.org/testing/en/docs/pyqgis_developer_cookbook/plugins/ide_debugging.html#useful-plugins-for-writing-python-plugins

Be sure to configure your IDE/Python environment with access to QGIS python libraries, e.g.:

```bash
export PYTHONPATH="$PYTHONPATH:/usr/share/qgis/python/plugins:/usr/share/qgis/python"
```

See: https://docs.qgis.org/testing/en/docs/pyqgis_developer_cookbook/plugins/ide_debugging.html#a-note-on-configuring-your-ide-on-linux-and-windows

In QGIS, go to `Plugins > Manage and Install Plugins`. Search for 'ofds', and activate our plugin in the list. A button should appear on the menu that says "Consolidate OFDS".

When you make changes to the plugin code, you can reload the plugin using `Plugins > Plugin Reloader` (the first time, configure it to reload the ofds_consolidation_tool plugin. After that you can use ctrl+F5).

### UI changes

You can open the `gui.ui` file in Qt5 Designer to make changes to the UI. After each change, run:

```
pyuic5 gui.ui > gui.py
```

### Commit Messages

To enable automatic Semantic Versioning, please use [Angular-like commit convention](https://www.conventionalcommits.org/en/v1.0.0/#summary).

To get started quickly, follow a structure like:

```
<type>(<scope>): Short description
Long description
```

Where `<type>` is one of `feat` (minor version bump), `fix`
(patch version bump) or `chore` (no version change, e.g. documentation changes).

The `<scope>` is per-project, and up to us to decide what to use, e.g. `tool` for tool changes, `docs` for documentation changes, `gui` for pure GUI changes. Use more as needed/relevant. For example:

```
feat(tool): A new feature
A longer description about this new feature and all its
wonderful new featurelets.
```
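
The convention above can be checked mechanically. The following is a minimal sketch, assuming the subject line always has the `<type>(<scope>): description` shape; `parse_subject` and `BUMP_BY_TYPE` are hypothetical names used for illustration and are not part of this repository or of any release tooling.

```python
import re

# Hypothetical mapping, taken from the description above:
# feat -> minor bump, fix -> patch bump, chore -> no version change.
BUMP_BY_TYPE = {"feat": "minor", "fix": "patch", "chore": None}

# Matches "<type>(<scope>): Short description".
SUBJECT_RE = re.compile(
    r"^(?P<type>[a-z]+)\((?P<scope>[^)]+)\): (?P<description>.+)$"
)


def parse_subject(subject):
    """Return (type, scope, description, bump) or raise ValueError."""
    match = SUBJECT_RE.match(subject)
    if match is None:
        raise ValueError(f"not in <type>(<scope>): form: {subject!r}")
    commit_type = match["type"]
    if commit_type not in BUMP_BY_TYPE:
        raise ValueError(f"unknown commit type: {commit_type!r}")
    bump = BUMP_BY_TYPE[commit_type]
    return commit_type, match["scope"], match["description"], bump


print(parse_subject("feat(tool): A new feature"))
```

Only the first line of the commit message is inspected here; the long description is free text.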

### Code Format

Please format code with [Black](https://black.readthedocs.io/en/stable/) formatter/convention.

NOTE: We must only use *relative* imports for importing within this codebase, due to the odd nature of the QGIS plugin environment.

### Dev tools virtual environment

Dev tools, e.g. pytest, are installed in a virtual environment, but we still need access to the global Python environment to access QGIS's PyQGIS libraries. To do this, create the venv with additional access to system packages:

```bash
python -m venv --system-site-packages --upgrade-deps .venv
```

If needed, replace `python` in the command above with the path to the Python installation used by QGIS.

Then activate the virtualenv, and install dev tools:

```bash
source .venv/bin/activate
python -m pip install --upgrade pip
python -m pip install -r dev_requirements.txt
```

#### Updating dev tools

To update dev tools, edit `dev_requirements.in` with any changes needed, then use `pip-compile` to update the pinned versions, and finally upgrade the venv:

```bash
pip-compile dev_requirements.in
python -m pip install --upgrade -r dev_requirements.txt
```

#### Running Tests

Setup:

```bash
source .venv/bin/activate # Activate your virtual environment
python -m pip install -e . # Install the module as editable to make tests work
```

To run tests, run PyTest as normal:

```bash
pytest
```
For more information see the [OFDS Consolidation Tool docs](https://open-telecoms-data.github.io/ofds_consolidation_tool/)
8 changes: 4 additions & 4 deletions docs/development.md
@@ -15,8 +15,8 @@ QGIS plugins are written in pure Python, and external libraries must be bundled

To read more about how to develop plugins for QGIS in general, see:

- https://www.qgistutorials.com/en/docs/3/building_a_python_plugin.html
- https://docs.qgis.org/testing/en/docs/pyqgis_developer_cookbook/plugins/index.html#developing-python-plugins
- [QGIS docs: building a Python plugin](https://www.qgistutorials.com/en/docs/3/building_a_python_plugin.html)
- [QGIS developer cookbook: developing Python plugins](https://docs.qgis.org/testing/en/docs/pyqgis_developer_cookbook/plugins/index.html#developing-python-plugins)

## Setup

@@ -54,13 +54,13 @@ Configure your IDE/Python environment with access to QGIS python libraries:
export PYTHONPATH="$PYTHONPATH:/usr/share/qgis/python/plugins:/usr/share/qgis/python"
```

See: https://docs.qgis.org/testing/en/docs/pyqgis_developer_cookbook/plugins/ide_debugging.html#a-note-on-configuring-your-ide-on-linux-and-windows
See: [QGIS developer cookbook: configuring your IDE on Linux and Windows](https://docs.qgis.org/testing/en/docs/pyqgis_developer_cookbook/plugins/ide_debugging.html#a-note-on-configuring-your-ide-on-linux-and-windows)

In QGIS, go to `Plugins > Manage and Install Plugins`. Search for 'ofds', and activate our plugin in the list. A button should appear on the menu that says "Consolidate OFDS".

Install the `Plugin Reloader` plugin so you can reload any code changes you make without having to restart QGIS. #todo

You may also find it helpful to install the `First Aid` plugin; see: https://docs.qgis.org/testing/en/docs/pyqgis_developer_cookbook/plugins/ide_debugging.html#useful-plugins-for-writing-python-plugins
You may also find it helpful to install the `First Aid` plugin; see: [QGIS developer cookbook: Useful plugins](https://docs.qgis.org/testing/en/docs/pyqgis_developer_cookbook/plugins/ide_debugging.html#useful-plugins-for-writing-python-plugins)

When you make changes to the plugin code, you can reload the plugin using `Plugins > Plugin Reloader`. The first time, configure it to reload the ofds_consolidation_tool plugin. After that you can use ctrl/cmd+F5.

54 changes: 29 additions & 25 deletions docs/howto.md
@@ -11,24 +11,25 @@ nav_order: 2

## Install QGIS and the plugin

The consolidation tool is a [QGIS](https://qgis.org/) plugin. First, follow the [installation instructions for QGIS]() for your operating system. The tool works with QGIS version 3.28 and higher. If you have an older version of QGIS installed, you need to upgrade it.
The consolidation tool is a [QGIS](https://qgis.org/) plugin. The tool works with QGIS version 3.28 and higher. If you have an older version of QGIS installed, you need to upgrade it.

<!--
To install the plugin, you need an internet connection, but after that it will work offline.
The plugin is currently in development beta, and not yet available from the QGIS plugin libraries.

1. Open QGIS. Go to Plugins > Manage and Install Plugins.
2. Search for "OFDS Consolidation Tool" and click "Install Plugin".
-->
To download and install the plugin, you need an internet connection, but after that it will work offline.

The plugin is currently in development beta, and not yet available from the QGIS plugin libraries. To install the plugin for testing, follow the instructions in [Development > Enabling and running the plugin]().
1. Open the [Consolidation tool GitHub repository](https://github.com/Open-Telecoms-Data/ofds_consolidation_tool), ensuring you are on the main branch.
2. Click `Code` > `Download ZIP` and download a zipped version of the repository.
3. Open QGIS. Go to `Plugins > Manage and Install Plugins > Install from ZIP`.
4. Search for and select "ofds_consolidation_tool-main.zip" and click `Install Plugin`.
5. The tool can now be accessed via the `Consolidate OFDS` button that has appeared in the `Plugins` toolbar.

If any changes are made to the tool in the GitHub repository, you will need to repeat these steps to update your installation.

## Your data

You should start with node and span data for the two networks you want to compare.

The network data need to be in [geoJSON]() format compatible with the [Open Fibre Data Standard](https://open-fibre-data-standard.readthedocs.io).

To convert OFDS JSON data into geoJSON, [use this conversion tool](). To convert data from other formats into OFDS, or to validate your data against the Open Fibre Data Standard, please [see these other things]().
The network data need to be in [geoJSON](https://geojson.org/) format compatible with the [Open Fibre Data Standard](https://open-fibre-data-standard.readthedocs.io).

For example, a node network:

@@ -100,34 +101,37 @@ and a span network:
}
```

To convert OFDS JSON data into geoJSON, [use this conversion tool](https://ofds.cove.opendataservices.coop/). To validate your data against the Open Fibre Data Standard, please [use this validation tool](https://ofds.cove.opendataservices.coop/). To convert data from KML to OFDS, [this beta kml2ofds tool](https://github.com/stevesong/kml2ofds) is available.
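
Before loading files into QGIS, a quick structural check can catch the wrong file type early. The following is a rough sketch only, not a substitute for the validation tool linked above: it assumes node files contain `Point` features and span files contain `LineString` features, and `check_feature_collection` is a hypothetical helper, not part of the plugin.

```python
import json

# Assumption: node layers hold Point features, span layers hold LineString features.
EXPECTED_GEOMETRY = {"nodes": "Point", "spans": "LineString"}


def check_feature_collection(data, kind):
    """Return a list of problems found in a parsed GeoJSON document."""
    problems = []
    if data.get("type") != "FeatureCollection":
        problems.append("top-level type is not FeatureCollection")
        return problems
    expected = EXPECTED_GEOMETRY[kind]
    for index, feature in enumerate(data.get("features", [])):
        geometry = feature.get("geometry") or {}
        found = geometry.get("type")
        if found != expected:
            problems.append(f"feature {index}: expected {expected}, got {found}")
    return problems


# Hypothetical sample data, for illustration only.
sample = json.loads(
    '{"type": "FeatureCollection", "features": ['
    '{"type": "Feature", "geometry": {"type": "Point", "coordinates": [-0.17, 5.6]},'
    ' "properties": {"name": "example"}}]}'
)
# An empty list means no structural problems were found.
print(check_feature_collection(sample, "nodes"))
```

To check a file on disk, parse it with `json.load` and pass the result to the same function.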

## Consolidating networks

1. Add your two span and node data files as layers to the project by going to Layer > Add Layer > Add Vector Layer.
2. Navigate to your geojson files one at a time under 'Source' and press 'Add' for each one. They should appear under the Layers list, and appear visually on the map window.
3. Optionally, adjust the colours and thicknesses of each layer by double clicking on each layer in the Layers list.
1. Add your two span and node data files as layers to the project by going to `Layer > Add Layer > Add Vector Layer`.
2. Navigate to your geojson files one at a time under `Source` and press `Add` for each one. Do not adjust the default options; in particular, FLATTEN_NESTED_ATTRIBUTES must be set to NO.
3. All four files should now appear in the Layers panel, and appear visually on the Map view.

Tip: To view a map underneath the nodes and spans, go to the Browser window > XYZ tiles and double click Open Street Map or other map tiles of your choice. (TODO: is this the default? How to add maps if there are none?) In the Layers list, make sure the nodes and spans are _above_ the map layer to see them. Adding the map is not necessary for using the tool, but it may make it easier to understand your data.
Tip: To view a map underneath the nodes and spans, go to the Browser panel > `XYZ tiles` and double click `Open Street Map` or other map tiles of your choice. In the Layers panel, make sure the nodes and spans are _above_ the map layer to see them. Adding the map is not necessary for using the tool, but it may make it easier to understand your data.

4. Click "Consolidate OFDS" in the toolbar.
4. Click `Consolidate OFDS` in the toolbar.
5. Select the layers for the spans and nodes of each network using the dropdown menus in the Select Inputs tab.
6. Change any settings you need (see [settings](#settings)).
7. The tool presents data on nodes and spans which are geographically close to each other, pair by pair, along with a confidence score for how likely they are to be duplicates. Click "Consolidate" to confirm the pair presented are duplicates and should be merged. Click "Keep Both" to confirm the pair are _not_ duplicates, and should not be merged. If you're not sure, click "Next". You can use the "Next" and "Previous" buttons to cycle through the comparisons until you have marked them all as either "Consolidate" or "Keep Both".
8. When you've reviewed all of the comparisons, click "Finish".
9. Choose where you would like to save the consolidated node and span GeoJSON files.
7. The tool presents data on nodes and spans which are geographically close to each other, pair by pair, along with a confidence score for how likely they are to be duplicates (see [scoring](#scoring)). The pair being compared will be highlighted in yellow in the tool's map inserts. Click `Consolidate` to confirm the pair presented are duplicates and should be merged. Click `Keep Both` to confirm the pair are _not_ duplicates, and should not be merged. If you're not sure, click `Next`. You can use the `Next` and `Previous` buttons to cycle through the comparisons until you have marked them all as either `Consolidate` or `Keep Both`. If there are multiple potential matches, once you have confirmed one match, all other potential matches will be automatically assigned to `Keep Both`. If you then try to consolidate one of these pairs, the tool will warn you and give you the opportunity to change which of the pairs is consolidated.
8. When you've reviewed all of the node comparisons, click `Finish` and repeat step 7 for the spans. The results of your node consolidation will be used to select potential span matches; only spans with consolidated nodes will be presented.
9. Once you've reviewed all of the span comparisons, click `Finish`.
10. Choose where you would like to save the consolidated node and span GeoJSON files.

### Settings

When two features are consolidated, for some fields the data cannot be merged or combined, and only the data from one network will be kept. The network you select for the "Primary Network" will be the one for which data is kept in this case.

You can adjust some thresholds based on the accuracy and completeness of the data you are comparing, and how much oversight you want over the consolidation.

* **Node match radius:** compare nodes within this distance of each other. On data with high precision and accuracy for geographic elements, you may wish to set this number low; for less precise or inaccurate data, a higher number means more comparisons will be made.
* **Node match radius:** compare nodes within this distance of each other. For data with high precision and accuracy for geographic elements, you may wish to set this number low; for less precise or inaccurate data, a higher number means more comparisons will be made.
* **Ask above (%):** the confidence score above which the tool should prompt you to consolidate. Below this score, pairs are assumed to not be matches, and both are kept in the final output.
* **Auto consolidate above (%):** the confidence score above which the tool should automatically consolidate nodes/spans without prompting.
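
The two percentage thresholds split every candidate pair into three outcomes. The following is a minimal sketch of that split, assuming "above" means strictly greater than the threshold; `decide` and the threshold values are hypothetical and are not the plugin's own code or defaults.

```python
def decide(confidence_pct, ask_above, auto_consolidate_above):
    """Classify a candidate pair by its confidence score (0-100)."""
    if confidence_pct > auto_consolidate_above:
        return "consolidate"  # merged without prompting
    if confidence_pct > ask_above:
        return "ask"  # presented to the user for a decision
    return "keep both"  # assumed not to be a match


# Hypothetical thresholds, for illustration only.
for score in (35, 70, 97):
    print(score, decide(score, ask_above=50, auto_consolidate_above=95))
```

Raising `ask_above` reduces the number of prompts at the cost of keeping more possible duplicates; lowering `auto_consolidate_above` merges more pairs without review.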

### Scoring

The tool first compares all pairs nodes in the two networks which are within the [node match radius](#settings) of each other. It then updates the span data to use the consolidated nodes, and then compares all pairs of spans which have the same start and end nodes.
The tool first compares all pairs of nodes in the two networks which are within the [node match radius](#settings) of each other. It then updates the span data to use the consolidated nodes, and then compares all pairs of spans which have the same start and end nodes.
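
To illustrate the first step, the sketch below selects candidate node pairs by distance. It is an assumption-laden illustration, not the plugin's implementation: it assumes the radius is given in kilometres, uses the haversine formula on longitude/latitude coordinates, and the names `haversine_km` and `candidate_pairs` are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance in kilometres between two lon/lat points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def candidate_pairs(nodes_a, nodes_b, radius_km):
    """Return (id_a, id_b) for every cross-network pair within radius_km."""
    pairs = []
    for id_a, (lon_a, lat_a) in nodes_a.items():
        for id_b, (lon_b, lat_b) in nodes_b.items():
            if haversine_km(lon_a, lat_a, lon_b, lat_b) <= radius_km:
                pairs.append((id_a, id_b))
    return pairs


# Hypothetical node ids mapped to (longitude, latitude), as in GeoJSON.
network_a = {"a1": (0.0, 0.0)}
network_b = {"b1": (0.0, 0.001), "b2": (1.0, 0.0)}
print(candidate_pairs(network_a, network_b, radius_km=1.0))
```

The nested loop compares every node in one network with every node in the other, which is fine for a sketch but would need a spatial index for large networks.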

The **overall confidence score** of the similarity between two features is generated by comparing the values of each field of each feature. Confidence scores for each pair of fields are generated, which are then combined to generate the overall score.

@@ -142,15 +146,15 @@ When doing a manual comparison, the overall confidence score, and the breakdown

### Output

The final outputs of the tool are geoJSON files saved to your computer locally.
The final outputs of the tool are geoJSON files saved locally to your computer.

Each feature has an additional `provenance` object, containing the following:

* `wasDerivedFrom`: array of the ids of the two features that were consolidated.
* `generatedAtTime`: date or datetime the network was generated.
* `wasDerivedFrom`: an array of the ids of the two features that were consolidated.
* `generatedAtTime`: the date or datetime the network was generated.
* `confidence`: a score between 0 and 1 representing how likely it is that the two source features are the same.
* `similarFields`: array of field names for fields with high similarity scores between the two features.
* `manual`: bool; true means the merge was confirmed manually by the user; false means it was done by the tool.
* `similarFields`: an array of field names for fields with high similarity scores between the two features.
* `manual`: a boolean value; true means the merge was confirmed manually by the user; false means it was done by the tool.
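
These fields make it possible to audit a consolidated file afterwards. As a small sketch, the function below flags merges that were made by the tool with a confidence below a chosen cut-off; `needs_review`, the cut-off, and the example values are hypothetical, and the function takes the `provenance` object directly rather than assuming where it sits inside a feature.

```python
def needs_review(provenance, min_confidence=0.9):
    """True for merges made by the tool with confidence below min_confidence."""
    return (not provenance["manual"]) and provenance["confidence"] < min_confidence


# Hypothetical provenance object using the field names listed above.
example = {
    "wasDerivedFrom": ["node-a", "node-b"],  # hypothetical ids
    "generatedAtTime": "2024-07-25",
    "confidence": 0.82,
    "similarFields": ["name"],
    "manual": False,
}
print(needs_review(example))
```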

```
{