diff --git a/README.md b/README.md index 62475a7..19ae79b 100644 --- a/README.md +++ b/README.md @@ -1,138 +1,5 @@ -# OFDS Deduplication Tool +# OFDS Deduplication Tool [Beta] -A tool to consolidate multiple data sets formatted using the Open Fibre Data Standard. Implemented as a QGIS plugin. +A [QGIS](https://www.qgis.org/) plugin to consolidate (deduplicate, combine) two fibre optic network data sets formatted using the [Open Fibre Data Standard](https://open-fibre-data-standard.readthedocs.io). -## Development Environment - -## Developing the QGIS Plugin - -Handy links to look at: - -- https://www.qgistutorials.com/en/docs/3/building_a_python_plugin.html -- https://docs.qgis.org/testing/en/docs/pyqgis_developer_cookbook/plugins/index.html#developing-python-plugins - -### Setup - -Tools you'll need: - -- QGIS 3.28+ -- Qt5 Dev Tools, which should include: - - Qt5 Designer - - `pyuic5` tool - -Install QGIS dev package: - -```bash -sudo apt install qgis-dev # Ubuntu -sudo dnf install qgis-devel # Fedora -``` - -### Enabling & Running the plugin in QGIS - -You'll need to symlink your project directory into QGIS's local plugins directory, making the directory if it doesn't already exist, i.e.: - -From within the project git repo/directory: - -```bash -cd path/to/ofds_consolidation_tool/ - -# NOTE: The project folder *must* only use underscores and letters, no dashes - -QGIS_PLUGINS_DIR="$HOME/.local/share/QGIS/QGIS3/profiles/default/python/plugins" -mkdir -p "$QGIS_PLUGINS_DIR" -ln -s "$PWD" "$QGIS_PLUGINS_DIR" -``` - -There are a couple of useful helper plugins for developing your plugin, `Plugin Reloader` and `First Aid`, see: https://docs.qgis.org/testing/en/docs/pyqgis_developer_cookbook/plugins/ide_debugging.html#useful-plugins-for-writing-python-plugins - -Be sure to configure your IDE/Python environment with access to QGIS python libraries, e.g.: - -```bash -export PYTHONPATH="$PYTHONPATH:/usr/share/qgis/python/plugins:/usr/share/qgis/python" -``` - -See: https://docs.qgis.org/testing/en/docs/pyqgis_developer_cookbook/plugins/ide_debugging.html#a-note-on-configuring-your-ide-on-linux-and-windows - -In QGIS, go to `Plugins > Manage and Install Plugins`. Search for 'odfs', and activate our plugin in the list. A button should appear on the menu that says "Consolidate OFDS". - -When you make changes to the plugin code, you can reload the plugin using `Plugins > Plugin Reloader` (the first time, configure it to reload the ofds_consolidation_tool plugin. After that you can use ctrl+F5). - -### UI changes - -You can open the `gui.ui` file in Qt5 Designer to make changes to the UI. After each change, run: - -``` -pyuic5 gui.ui > gui.py -``` - -### Commit Messages - -To enable automatic Semantic Versioning, please use [Angular-like commit convention](https://www.conventionalcommits.org/en/v1.0.0/#summary). - -To get started quickly, follow a structure like: - -``` -(): Short description - -Long description -``` - -Where `` is one of `feat` (minor version bump), `fix` -(patch version bump) or `chore` (no version change, e.g. documentation changes). - -The `` is per-project, and up to us to decide what to use, e.g. `tool` for tool changes, `docs` for documentation changes, `gui` for pure GUI changes. Use more as needed/relevent. For example: - -``` -feat(tool): A new feature - -A longer description about this new feature and all it's -wonderful new featurelets. -``` - -### Code Format - -Please format code with [Black](https://black.readthedocs.io/en/stable/) formatter/convention. - -NOTE: We must only use *relative* imports for importing within this codebase, due to the odd nature of the QGIS plugin environment. - -### Dev tools virtual environment - -Dev tools i.e. pytest are installed in a virtual environment, but we still need access to the global python environment to access QGIS's PyQGIS libraries. To do this, create the venv with additional access to system packages: - -```bash -python -m venv --system-site-packages --upgrade-deps .venv -``` - -or replace `/bin/python` with the path to your Python installation used by QGIS. - -Then activate the virtualenv, and install dev tools: - -```bash -source .venv/bin/activate -python -m pip install --upgrade pip -python -m pip install -r dev_requirements.txt -``` - -#### Updating dev tools - -To update dev tools, edit `dev_requirements.in` with any changes needed, then use `pip-compile` to update the pinned versions, and finally upgrade the venv: - -```bash -pip-compile requirements_dev.in -python -m pip install --upgrade -r dev_requirements.txt -``` - -#### Running Tests - -Setup: - -```bash -source .venv/bin/activate # Activate your virtual environment -python -m pip install -e . # Install the module as editable to make tests work -``` - -To run tests, run PyTest as normal: - -```bash -pytest -``` \ No newline at end of file +For more information see the [OFDS Consolidation Tool docs](https://open-telecoms-data.github.io/ofds_consolidation_tool/) \ No newline at end of file diff --git a/docs/development.md b/docs/development.md index bb6315e..3da5ce0 100644 --- a/docs/development.md +++ b/docs/development.md @@ -15,8 +15,8 @@ QGIS plugins are written in pure Python, and external libraries must be bundled To read more about how to develop plugins for QGIS in general, see: -- https://www.qgistutorials.com/en/docs/3/building_a_python_plugin.html -- https://docs.qgis.org/testing/en/docs/pyqgis_developer_cookbook/plugins/index.html#developing-python-plugins +- [QGIS docs: building a Python plugin](https://www.qgistutorials.com/en/docs/3/building_a_python_plugin.html) +- [QGIS developer cookbook: developing Python plugins](https://docs.qgis.org/testing/en/docs/pyqgis_developer_cookbook/plugins/index.html#developing-python-plugins) ## Setup @@ -54,13 +54,13 @@ Configure your IDE/Python environment with access to QGIS python libraries: export PYTHONPATH="$PYTHONPATH:/usr/share/qgis/python/plugins:/usr/share/qgis/python" ``` -See: https://docs.qgis.org/testing/en/docs/pyqgis_developer_cookbook/plugins/ide_debugging.html#a-note-on-configuring-your-ide-on-linux-and-windows +See: [QGIS developer cookbook: configuring your IDE on Linux and Windows](https://docs.qgis.org/testing/en/docs/pyqgis_developer_cookbook/plugins/ide_debugging.html#a-note-on-configuring-your-ide-on-linux-and-windows) In QGIS, go to `Plugins > Manage and Install Plugins`. Search for 'odfs', and activate our plugin in the list. A button should appear on the menu that says "Consolidate OFDS". Install the `Plugin Reloader` plugin so you can reload any code changes you make without having to restart QGIS. #todo -You may also find it helpful to install the `First Aid` plugin; see: https://docs.qgis.org/testing/en/docs/pyqgis_developer_cookbook/plugins/ide_debugging.html#useful-plugins-for-writing-python-plugins +You may also find it helpful to install the `First Aid` plugin; see: [QGIS developer cookbook: Useful plugins](https://docs.qgis.org/testing/en/docs/pyqgis_developer_cookbook/plugins/ide_debugging.html#useful-plugins-for-writing-python-plugins) When you make changes to the plugin code, you can reload the plugin using `Plugins > Plugin Reloader`. The first time, configure it to reload the ofds_consolidation_tool plugin. After that you can use ctrl/cmd+F5. diff --git a/docs/howto.md b/docs/howto.md index 252b0e3..ba17aa6 100644 --- a/docs/howto.md +++ b/docs/howto.md @@ -11,24 +11,25 @@ nav_order: 2 ## Install QGIS and the plugin -The consolidation tool is a [QGIS](https://qgis.org/) plugin. First, follow the [installation instructions for QGIS]() for your operating system. The tool works with QGIS version 3.28 and higher. If you have an older version of QGIS installed, you need to upgrade it. +The consolidation tool is a [QGIS](https://qgis.org/) plugin. The tool works with QGIS version 3.28 and higher. If you have an older version of QGIS installed, you need to upgrade it. - +To download and install the plugin, you need an internet connection, but after that it will work offline. -The plugin is currently in development beta, and not yet available from the QGIS plugin libraries. To install the plugin for testing, follow the instructions in [Development > Enabling and running the plugin](). +1. Open the [Consolidation tool Github repository](https://github.com/Open-Telecoms-Data/ofds_consolidation_tool) ensuring you are on the main branch. +2. Click `Code` > `Download ZIP` and download a zipped version of the repository. +3. Open QGIS. Go to `Plugins > Manage and Install Plugins > Install from ZIP`. +4. Search for and select "ofds_consolidation_tool-main.zip" and click `Install Plugin`. +5. The tool can now be accessed via the `Consolidate OFDS` button that has appeared in the `Plugins` toolbar. + +If any changes are made to the tool in the Github repository you will need to repeat these steps to update your installation. ## Your data You should start with node and span data for the two networks you want to compare. -The network data need to be in [geoJSON]() format compatible with the [Open Fibre Data Standard](https://open-fibre-data-standard.readthedocs.io). - -To convert OFDS JSON data into geoJSON, [use this conversion tool](). To convert data from other formats into OFDS, or to validate your data against the Open Fibre Data Standard, please [see these other things](). +The network data need to be in [geoJSON](https://geojson.org/) format compatible with the [Open Fibre Data Standard](https://open-fibre-data-standard.readthedocs.io). An example of this is; a node network: @@ -100,20 +101,23 @@ and a span network: } ``` +To convert OFDS JSON data into geoJSON, [use this conversion tool](https://ofds.cove.opendataservices.coop/). To validate your data against the Open Fibre Data Standard, please [use this validation tool](https://ofds.cove.opendataservices.coop/). To convert data from kml to OFDS [this beta kml2ofds tool](https://github.com/stevesong/kml2ofds) is available. + ## Consolidating networks -1. Add your two span and node data files as layers to the project by going to Layer > Add Layer > Add Vector Layer. -2. Navigate to your geojson files one at a time under 'Source' and press 'Add' for each one. They should appear under the Layers list, and appear visually on the map window. -3. Optionally, adjust the colours and thicknesses of each layer by double clicking on each layer in the Layers list. +1. Add your two span and node data files as layers to the project by going to `Layer > Add Layer > Add Vector Layer`. +2. Navigate to your geojson files one at a time under `Source` and press `Add` for each one. Do not adjust the default options, in particular FLATTEN_NESTED_ATTRIBUTES must be set to NO. +3. All four files should now appear in the Layers panel, and appear visually on the Map view. -Tip: To view a map underneath the nodes and spans, go to the Browser window > XYZ tiles and double click Open Street Map or other map tiles of your choice. (TODO: is this the default? How to add maps if there are none?) In the Layers list, make sure the nodes and spans are _above_ the map layer to see them. Adding the map is not necessary for using the tool, but it may make it easier to understand your data. +Tip: To view a map underneath the nodes and spans, go to the Browser panel > `XYZ tiles` and double click `Open Street Map` or other map tiles of your choice. In the Layers panel, make sure the nodes and spans are _above_ the map layer to see them. Adding the map is not necessary for using the tool, but it may make it easier to understand your data. -4. Click "Consolidate OFDS" in the toolbar. +4. Click `Consolidate OFDS` in the toolbar. 5. Select the layers for the spans and nodes of each network using the dropdown menus in the Select Inputs tab. 6. Change any settings you need (see [settings](#settings)). -7. The tool presents data on nodes and spans which are geographically close to each other, pair by pair, along with a confidence score for how likely they are to be duplicates. Click "Consolidate" to confirm the pair presented are duplicates and should be merged. Click "Keep Both" to confirm the pair are _not_ duplicates, and should not be merged. If you're not sure, click "Next". You can use the "Next" and "Previous" buttons to cycle through the comparisons until you have marked them all as either "Consolidate" or "Keep Both". -8. When you've reviewed all of the comparisons, click "Finish". -9. Choose where you would like to save the consolidated node and span GeoJSON files. +7. The tool presents data on nodes and spans which are geographically close to each other, pair by pair, along with a confidence score for how likely they are to be duplicates (see [scoring](#scoring)). The pair being compared will be highlighted in yellow in the tools map inserts. Click `Consolidate` to confirm the pair presented are duplicates and should be merged. Click `Keep Both` to confirm the pair are _not_ duplicates, and should not be merged. If you're not sure, click `Next`. You can use the `Next` and `Previous` buttons to cycle through the comparisons until you have marked them all as either `Consolidate` or `Keep Both`. If there are multiple potential matches, once you have confirmed one match all other potential matches will be automatically assigned to `Keep Both`. If you then try and consolidate one of these pairs the tool will warn you and give you the opportunity to change which of the pairs is consolidated. +8. When you've reviewed all of the nodes comparisons, click `Finish` and repeat step 7 for the spans. The results of your nodes consolidation will be used to select potential span matches, only spans with consolidated nodes will be presented. +9. Once you've reviewed all of the spans comparisons, click `Finish`. +10. Choose where you would like to save the consolidated node and span GeoJSON files. ### Settings @@ -121,13 +125,13 @@ When two features are consolidated, for some fields the data cannot be merged or You can adjust some thresholds based on the accuracy and completeness of the data you are comparing, and how much oversight you want over the consolidation. -* **Node match radius:** compare nodes within this distance of each other. On data with high precision and accuracy for geographic elements, you may wish to set this number low; for less precise or inaccurate data, a higher number means more comparisons will be made. +* **Node match radius:** compare nodes within this distance of each other. For data with high precision and accuracy for geographic elements, you may wish to set this number low; for less precise or inaccurate data, a higher number means more comparisons will be made. * **Ask above (%):** the confidence score above which the tool should prompt you to consolidate. Below this score, pairs are assumed to not be matches, and both are kept in the final output. * **Auto consolidate above (%):** the confidence score above which the tool should automatically consolidate nodes/spans without prompting. ### Scoring -The tool first compares all pairs nodes in the two networks which are within the [node match radius](#settings) of each other. It then updates the span data to use the consolidated nodes, and then compares all pairs of spans which have the same start and end nodes. +The tool first compares all pairs of nodes in the two networks which are within the [node match radius](#settings) of each other. It then updates the span data to use the consolidated nodes, and then compares all pairs of spans which have the same start and end nodes. The **overall confidence score** of the similarity between two features is generated by comparing the values of each field of each feature. Confidence scores for each pair of fields are generated, which are then combined to generate the overall score. @@ -142,15 +146,15 @@ When doing a manual comparison, the overall confidence score, and the breakdown ### Output -The final output of the tool are geoJSON files saved to your computer locally. +The final output of the tool are geoJSON files saved locally to your computer. Each feature has an additional `provenance` object, containing the following: -* `wasDerivedFrom`: array of the ids of the two features that were consolidated. -* `generatedAtTime`: date or datetime the network was generated. +* `wasDerivedFrom`: an array of the ids of the two features that were consolidated. +* `generatedAtTime`: the date or datetime the network was generated. * `confidence`: a score between 0 and 1 representing how likely the two source features are the same. -* `similarFields`: array of field names for fields with high similarity scores between the two features. -* `manual`: bool; true means the merge was confirmed manually by the user; false means it was done by the tool. +* `similarFields`: an array of field names for fields with high similarity scores between the two features. +* `manual`: a boolean value; true means the merge was confirmed manually by the user; false means it was done by the tool. ``` {