CrowdEval

CrowdEval is an experimental crowdsourced fact-checking application, created as part of my MComp Computer Science dissertation project. The idea for the project was provided by the project supervisor, Carolina Scarton.

CrowdEval ships with two Docker environments:

a development environment, in which the infrastructure is containerised but the app and frontend run natively
a fully containerised production environment

Installation of Development Environment

Dependencies:

Python 3.9.4
Poetry

cd into the webapp directory
Run poetry install
- On macOS Big Sur, if NumPy fails to install, ensure the environment variable SYSTEM_VERSION_COMPAT is set to 1 and try again
cp .env.dev .env
- Enter a SECRET_KEY, which can be any string
- Enter the TWITTER_ API keys. This has to be a new-style Twitter project because we use v2.0 of the Twitter API
  - The callback URL for development should be localhost:5000/login/twitter/authorized
- Enter the RECAPTCHA_ API keys
- The remainder of the file is already configured for the development environment and shouldn't need to be changed
Run docker-compose up to start the containerised infrastructure
Once the Elasticsearch service has started, run:
```
$ poetry run flask create-index -i posts -c infrastructure/elasticsearch/posts.json
```
This will create the required posts index.
Install frontend dependencies with npm install
To start asset compilation and the Flask dev server, run
```
$ npm run start
```
The app will be started on localhost:5000
Migrate the database with poetry run flask db upgrade
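
The create-index step above has to wait until Elasticsearch is accepting connections. A minimal sketch of a readiness check that could be run beforehand is shown below. The URL `http://localhost:9200` is an assumption (Elasticsearch's default HTTP port) and is not taken from this repository's configuration; `wait_for_http` is a hypothetical helper, not part of CrowdEval.

```python
import time
import urllib.error
import urllib.request


def _http_ok(url):
    """Return True if `url` answers with a 2xx status, False on any error."""
    try:
        with urllib.request.urlopen(url, timeout=5) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or HTTP error: not ready yet.
        return False


def wait_for_http(url, timeout_s=60.0, interval_s=2.0, probe=None):
    """Poll `url` until the probe succeeds or `timeout_s` elapses.

    `probe` is injectable so the loop can be exercised without a network.
    Returns True if the service became ready, False on timeout.
    """
    probe = probe or _http_ok
    deadline = time.monotonic() + timeout_s
    while True:
        if probe(url):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)


# Example usage (assumed URL, adjust to your docker-compose setup):
#   if wait_for_http("http://localhost:9200"):
#       print("Elasticsearch is up; the posts index can now be created")
```

The probe is always attempted at least once, so a service that is already up returns immediately even with a zero timeout.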

Deploying to production

Working from the root directory:

cp webapp/.env.prod webapp/.env
- Enter a SECRET_KEY, which should be a random ~32-character secret
- Enter the TWITTER_ API keys. This has to be a new-style Twitter project because we use v2.0 of the Twitter API
  - The callback URL should be <your hostname>/login/twitter/authorized
- Enter the RECAPTCHA_ API keys
- The remainder of the file is already configured for the production environment and shouldn't need to be changed
docker-compose up

The application should now be built and started via Gunicorn.
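
For the SECRET_KEY step above, one way to produce a random 32-character secret is Python's standard `secrets` module. This is only a sketch; `generate_secret_key` is a hypothetical helper name and any cryptographically secure generator would do equally well.

```python
import secrets


def generate_secret_key(n_bytes=16):
    """Return a hex string of 2 * n_bytes characters from the OS CSPRNG."""
    return secrets.token_hex(n_bytes)


# Print a fresh 32-character key to paste into webapp/.env as SECRET_KEY.
print(generate_secret_key())
```

Hex output avoids characters that would need quoting or escaping in a .env file.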

Administrative commands

In development, these commands should be prefixed with poetry run (or run poetry shell once to activate the virtual environment, after which the commands can be run as is).

In production, start a shell in the crowdeval service, e.g.:
docker-compose run --entrypoint "bash -l" crowdeval
and then run the commands from there. Note that flask must be invoked as ./venv/bin/flask to ensure the correct version is used.

Seeding with dummy data

The system can import Kochkina et al.'s PHEME dataset, which has been pre-processed and stored in /seeds/kochkina_et_al_PHEME. To import it:

$ flask import-tweet-seeds seeds/kochkina_et_al_PHEME

This will create a .veracities.json file, which can then be used to seed ratings that are random but biased towards each item's veracity in the dataset:

$ flask seed-ratings
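
To illustrate what "random but biased" ratings can mean, here is a minimal sketch of weighted sampling. It is not the implementation behind `flask seed-ratings`: the 1 to 5 rating scale, the veracity labels, and the weights are all assumptions made up for this example, as are the names `RATINGS`, `WEIGHTS`, and `sample_rating`.

```python
import random

# Hypothetical rating scale (1 = least credible, 5 = most credible).
RATINGS = [1, 2, 3, 4, 5]

# Hypothetical weights per veracity label: each list gives the relative
# likelihood of drawing the rating at the same position in RATINGS.
WEIGHTS = {
    "true": [1, 1, 2, 4, 8],
    "false": [8, 4, 2, 1, 1],
    "unverified": [1, 2, 4, 2, 1],
}


def sample_rating(veracity, rng=random):
    """Draw one rating, skewed towards the given veracity label."""
    return rng.choices(RATINGS, weights=WEIGHTS[veracity], k=1)[0]


# Example: ten simulated ratings for a post labelled "false".
rng = random.Random(42)  # seeded so repeated runs give the same draws
print([sample_rating("false", rng) for _ in range(10)])
```

Every rating remains possible for every label, so the seeded data still contains disagreement between raters, while the average leans towards the label.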

Regenerating explore cache

The explore-by-rating pages are served from Redis, and the cache must be regenerated manually:

$ flask recache-explore

In production, this is probably best run as a scheduled task on the host machine via cron, with an entry such as the following (note the hard-coded path to the docker-compose.yml file):

*/5 * * * * /usr/local/bin/docker-compose -f /data/crowdeval/docker-compose.yml run --entrypoint "venv/bin/flask recache-explore" crowdeval >> crowdeval-cron.log 2>&1

Quality Control

Tests and linting can be run with

$ flask test
$ flask lint
