Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Updating README and local config #4211

Open
wants to merge 12 commits into
base: qat
Choose a base branch
from
Open

Conversation

sethstoudenmier
Copy link
Contributor

@sethstoudenmier sethstoudenmier commented Oct 18, 2024

Description:
Update the README and other aspects of the usaspending-api codebase to support local development.

Technical details:
Below is a brief list of changes going file by file in order to give a high-level reasoning of "why".

  • .env.template
    • The wrong database name was used for the Broker DB
  • .gitignore
    • Support change of the docker-compose.yaml to use more relative paths
  • Dockerfile
    • The es_configure job relies on curl to query the ES indexes
    • Without curl the elasticsearch_indexer job fails to create the templates for the ES indexes and thus creates many mis-typed fields on the indexes
  • Dockerfile.spark
    • Similar changes that were made to the Dockerfile to change the base image used
    • Also had to update the version of OpenJDK
    • Some new installs were needed to support the different image
    • Old installs were left in place to avoid breaking things
  • Dockerfile.testing
    • Update to the OpenJDK version similar to Dockerfile.spark
    • Remove the logic that changed the working directory in favor of having the volume named /usaspending-api instead of /dockermount
  • Makefile
    • Update references of docker-compose to instead use docker compose (read more here)
    • Some paths updated to relative paths
  • README.md
    • Some cleanup
    • Made changes to reflect the steps that I took when setting everything up
  • docker-compose.yaml
    • Removed the version as that is deprecated
    • Updated to use relative paths; main reasoning here is to avoid use of parameters such as $PWD in windows and also more WSL friendly
    • Updated the usaspending-test volume to be /usaspending-api in the container to support hot reload when testing
  • usaspending_api/search/tests/integration/test_spending_by_category_categories.py
    • This test never mocked the ES Transaction index; as a result if you tried to run this locally and you had data on your ES index it would fail because it would retrieve data
    • Would be good to update the test cases to return actual data instead of a count of zero, but that is out of the scope of this PR

Requirements for PR merge:

  1. Unit & integration tests updated
  2. API documentation updated
  3. Necessary PR reviewers:
    • Backend
  4. Matview impact assessment completed
  5. Frontend impact assessment completed
  6. Data validation completed
  7. N/A Appropriate Operations ticket(s) created
  8. N/A Jira Ticket

Area for explaining above N/A when needed:

@sethstoudenmier sethstoudenmier marked this pull request as ready for review October 21, 2024 21:13
ayubshahab
ayubshahab previously approved these changes Oct 29, 2024
# NOTE: The .env file is used to provide environment variable values that replace variables in the compose file
# Because the .env file does not live in the same place as the compose file, we have to tell compose explicitly
# where it is with "--project_directory". Since this is called from the root Makefile, using ./ points to the dir
# of that Makefile
docker-compose ${profiles} --project-directory . --file ${docker_compose_file} ${args}
docker compose ${profiles} --project-directory . --file ${docker_compose_file} ${args}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why did we remove the - between docker compose? I have historically always used docker-compose. Is there a difference? For example: docker-compose up usaspending-db usaspending-es

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ayubshahab Recent updates to docker have added compose as a sub-command of docker itself instead of being a standalone executable.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@aguest-kc
Copy link
Contributor

@sethstoudenmier What do you think about adding an init.sql script to run some common SQL commands when setting up the DB for the first time, since we're updating the docs? There are some common issues and their solutions don't seem to be documented publicly.

  • ALTER USER usaspending SET SEARCH_PATH TO "public", "raw", "int", "temp", "rpt";
    • This prevents a common error about tables outside of the public schema not being found.
  • Create the etl_user
    • This user is needed for migrations to run.
  • Create the readonly user
    • This user is needed for migrations to run

@sethstoudenmier
Copy link
Contributor Author

@sethstoudenmier What do you think about adding an init.sql script to run some common SQL commands when setting up the DB for the first time, since we're updating the docs? There are some common issues and their solutions don't seem to be documented publicly.

  • ALTER USER usaspending SET SEARCH_PATH TO "public", "raw", "int", "temp", "rpt";

    • This prevents a common error about tables outside of the public schema not being found.
  • Create the etl_user

    • This user is needed for migrations to run.
  • Create the readonly user

    • This user is needed for migrations to run

Funny you should say that, I actually had that in place originally. The problem is that it conflicts with the Travis CI/CD build process when we use the --reuse-db flag because it attempts to create users that already exist. Instead I opted to document that in the README.

@aguest-kc
Copy link
Contributor

Ahh.. that makes sense with the CI/CD process. I missed the changes to the README, because it was collapsed. I'm happy with those steps being documented!

Copy link
Contributor

@zachflanders-frb zachflanders-frb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👏

RUN yum -y groupinstall "Development Tools"
# Install dependencies
RUN apt update && \
apt install -y coreutils gcc wget openssl libssl-dev libbz2-dev build-essential
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
apt install -y coreutils gcc wget openssl libssl-dev libbz2-dev build-essential
apt install -y coreutils gcc wget openssl libssl-dev libbz2-dev build-essential curl

@@ -259,10 +263,10 @@ services:

spark-worker:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found I had to add the following environmental variables to the spark worker:

      PYTHONPATH: /project
      DJANGO_SETTINGS_MODULE: "usaspending_api.settings"
      DATABASE_URL: postgres://${USASPENDING_DB_USER}:${USASPENDING_DB_PASSWORD}@${USASPENDING_DB_HOST}:${USASPENDING_DB_PORT}/data_store_api

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants