
Reconciliation for UK Charities and other nonprofit organisations, with elasticsearch back end.





Find that charity

Elasticsearch-powered search engine for finding charities and other non-profit organisations. It provides:

  • Import of data from nearly 20 sources in the UK, ensuring that duplicates are matched to one record.
  • An elasticsearch index that can be queried.
  • Org-ids added to organisations.
  • A reconciliation API for searching organisations, based on an optimised search query.
  • A facility for uploading a CSV of charity names and adding a best-guess charity number to each.
  • HTML pages for searching for a charity.


Installation

  1. Clone repository
  2. Create virtual environment (python -m venv env)
  3. Activate virtual environment (source env/bin/activate on Linux/macOS, or env\Scripts\activate on Windows)
  4. Install requirements (pip install -r requirements.txt)
  5. Install postgres
  6. Start postgres
  7. Create 2 postgres databases - one for admin (e.g. ftc_admin) and one for data (e.g. ftc_data)
  8. Install elasticsearch 7 - you may need to increase available memory (see below)
  9. Start elasticsearch
  10. Create .env file in root directory. Contents based on .env.example.
  11. Create the database tables (python ./ migrate --database=data && python ./ migrate --database=admin && python ./ createcachetable --database=admin)
  12. Import data on charities (python ./ import_charities)
  13. Import data on nonprofit companies (python ./ import_ch)
  14. Import data on other non-profit organisations (python ./ import_all)
  15. Add organisations to elasticsearch index (python ./ es_index). Don't use the default search_index command, as it won't set up aliases correctly.
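To illustrate why aliases matter in the last step, here is a minimal sketch of one common pattern: build a fresh index, then atomically repoint an alias at it. This is not necessarily what es_index does. The function name and the index and alias names are hypothetical; only the `actions` body shape of the elasticsearch `_aliases` endpoint is standard.

```python
from typing import Any, Dict, List, Optional


def build_alias_swap(alias: str, old_index: Optional[str], new_index: str) -> Dict[str, Any]:
    """Build a body for the elasticsearch `_aliases` endpoint.

    All actions in one request are applied atomically, so searches against
    `alias` never see a moment where it points at no index.
    """
    actions: List[Dict[str, Any]] = []
    if old_index is not None:
        # Detach the alias from the index being retired (hypothetical name).
        actions.append({"remove": {"index": old_index, "alias": alias}})
    # Attach the alias to the freshly built index.
    actions.append({"add": {"index": new_index, "alias": alias}})
    return {"actions": actions}
```

The resulting dictionary would be sent as the JSON body of a POST to `/_aliases`. On a first run there is no old index, so only the `add` action is emitted.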

Dokku Installation

1. Set up dokku server

SSH into server and run:

# create app
dokku apps:create ftc

# postgres
sudo dokku plugin:install postgres
dokku postgres:create ftc-db-data
dokku postgres:link ftc-db-data ftc --alias "DATABASE_URL"
dokku postgres:create ftc-db-admin
dokku postgres:link ftc-db-admin ftc --alias "DATABASE_ADMIN_URL"

# elasticsearch
sudo dokku plugin:install elasticsearch
echo 'vm.max_map_count=262144' | sudo tee -a /etc/sysctl.conf; sudo sysctl -p
export ELASTICSEARCH_IMAGE="elasticsearch"
dokku elasticsearch:create ftc-es
dokku elasticsearch:link ftc-es ftc
# configure elasticsearch 7:

# setup elasticsearch increased memory (might be needed)
nano /var/lib/dokku/services/elasticsearch/ftc-es/config/jvm.options
# replace `-Xms512m` with `-Xms2g`
# replace `-Xmx512m` with `-Xmx2g`
# restart elasticsearch
dokku elasticsearch:restart ftc-es

# Redirect
dokku plugin:install
dokku redirect:set ftc

# letsencrypt
sudo dokku plugin:install
dokku letsencrypt:set ftc email [email protected]
dokku letsencrypt:enable ftc
dokku letsencrypt:cron-job --add

2. Add as a git remote and push

On local machine:

git remote add dokku dokku@SERVER_HOST:ftc
git push dokku main

3. Setup and run import

On Dokku server run:

# setup
dokku run ftc python ./ migrate --database=data
dokku run ftc python ./ migrate --database=admin
dokku run ftc python ./ createcachetable --database=admin

# run import
dokku run ftc python ./ charity_setup
dokku run ftc python ./ import_oscr
dokku run ftc python ./ import_charities
dokku run ftc python ./ import_ch
dokku run ftc python ./ import_other_data
dokku run ftc python ./ import_all
dokku run ftc python ./ es_index
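The commands above address two databases by the aliases `data` and `admin`, and the Dokku links expose their connection strings as `DATABASE_URL` and `DATABASE_ADMIN_URL`. As a rough sketch of how those two pieces could fit together in Django settings (the project's real settings may differ; `parse_postgres_url` and `build_databases` are hypothetical helpers, written with the standard library only):

```python
import os
from typing import Dict, Mapping
from urllib.parse import unquote, urlparse


def parse_postgres_url(url: str) -> Dict[str, str]:
    """Turn postgres://user:password@host:port/name into a Django DATABASES entry."""
    parts = urlparse(url)
    return {
        "ENGINE": "django.db.backends.postgresql",
        "NAME": unquote(parts.path.lstrip("/")),
        "USER": unquote(parts.username or ""),
        "PASSWORD": unquote(parts.password or ""),
        "HOST": parts.hostname or "",
        "PORT": str(parts.port or ""),
    }


def build_databases(env: Mapping[str, str] = os.environ) -> Dict[str, Dict[str, str]]:
    """Map the two environment variables onto the `data` and `admin` aliases."""
    return {
        "data": parse_postgres_url(env["DATABASE_URL"]),
        "admin": parse_postgres_url(env["DATABASE_ADMIN_URL"]),
    }
```

In a settings module this might be used as `DATABASES = build_databases()`. Note that Django also expects a `default` entry in `DATABASES`, which this sketch leaves out.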


The server uses Django. Run it with the following command:

python ./ runserver

The server offers the following API endpoints:

  • /reconcile: a reconciliation service API conforming to the OpenRefine reconciliation API specification.

  • /charity/12345: look up information about a particular charity.
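As a sketch of how a client might talk to /reconcile, the helpers below build the form body for a batch of queries and pick the top-scoring candidate from a response. They assume the batch form of the OpenRefine reconciliation protocol (a `queries` form parameter holding JSON keyed `q0`, `q1`, ..., and a response with a `result` list per key). The function names are hypothetical, and the exact fields this server returns are not confirmed here.

```python
import json
from typing import Any, Dict, Iterable, Optional
from urllib.parse import urlencode


def build_reconcile_body(names: Iterable[str]) -> str:
    """Encode a batch of names as the form body of a reconciliation request."""
    queries = {f"q{i}": {"query": name} for i, name in enumerate(names)}
    return urlencode({"queries": json.dumps(queries)})


def best_matches(response_json: Dict[str, Any]) -> Dict[str, Optional[Dict[str, Any]]]:
    """Map each query key to its highest-scoring candidate, or None if there are none."""
    best: Dict[str, Optional[Dict[str, Any]]] = {}
    for key, payload in response_json.items():
        candidates = payload.get("result", [])
        best[key] = max(candidates, key=lambda c: c.get("score", 0), default=None)
    return best
```

The body would be sent as a POST with content type `application/x-www-form-urlencoded` to the server's /reconcile URL, and the decoded JSON response passed to `best_matches`.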



To do:

  • tests for ensuring data is correctly imported
  • server tests
  • use results of server/ to produce the best reconciliation search query for use in the server (recon_test_7 seems the best at the moment)
  • threshold for when to use the result vs discard
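The last item, a threshold for using or discarding a result, could look something like the sketch below. It accepts the top candidate only if its score clears a minimum and is clearly ahead of the runner-up. The function name, the `score` key, and both default values are assumptions for illustration, not tuned figures from this project.

```python
from typing import Any, Dict, List, Optional


def accept_match(
    candidates: List[Dict[str, Any]],
    min_score: float = 80.0,   # hypothetical cut-off
    min_margin: float = 10.0,  # hypothetical lead required over the runner-up
) -> Optional[Dict[str, Any]]:
    """Return the top candidate if it is a confident match, otherwise None."""
    ranked = sorted(candidates, key=lambda c: c["score"], reverse=True)
    if not ranked:
        return None
    top = ranked[0]
    if top["score"] < min_score:
        # Best candidate is too weak to trust.
        return None
    if len(ranked) > 1 and top["score"] - ranked[1]["score"] < min_margin:
        # Two candidates are too close to call, so discard rather than guess.
        return None
    return top
```

Requiring a margin as well as a minimum score guards against organisations with near-identical names, where the top score alone says little about which record is correct.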

Future development:

  • upload a CSV file and reconcile each row with a charity
  • allow updating a charity with additional possible names
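For the CSV idea, one possible first step is turning the uploaded rows into batches of reconciliation queries. The sketch below assumes a header row and a column called `name` (both hypothetical), skips blank names, and keys each query by its row position so results can be written back to the right row.

```python
import csv
import io
from typing import Dict, List


def csv_to_query_batches(
    csv_text: str,
    name_column: str = "name",  # hypothetical column holding the charity name
    batch_size: int = 10,
) -> List[Dict[str, Dict[str, str]]]:
    """Split CSV rows into batches of reconciliation queries keyed by row position."""
    reader = csv.DictReader(io.StringIO(csv_text))
    batches: List[Dict[str, Dict[str, str]]] = []
    current: Dict[str, Dict[str, str]] = {}
    for row_number, row in enumerate(reader):
        name = (row.get(name_column) or "").strip()
        if not name:
            continue  # nothing to reconcile for this row
        current[f"q{row_number}"] = {"query": name}
        if len(current) == batch_size:
            batches.append(current)
            current = {}
    if current:
        batches.append(current)
    return batches
```

Each batch can then be serialised to JSON and sent to the reconciliation endpoint in a single request, which keeps the number of round trips low for large files.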


# run the tests with coverage and build the HTML report
coverage run -m pytest && coverage html
# serve the report on port 8001
python -m http.server -d htmlcov 8001