From 336286f2795203213548028e526d3899a8a3f6d8 Mon Sep 17 00:00:00 2001
From: Tadhg O'Higgins <2626258+tadhg-ohiggins@users.noreply.github.com>
Date: Fri, 22 Mar 2024 16:11:53 -0700
Subject: [PATCH] Jadudm/materialized views (#3511)

* Adds materialized views to the startup

Uses the run.sh refactored form to add materialized views. These are dropped first, then created on every deploy. This is because we might choose to change the view between deploys. In that case, we should completely destroy it. The Django command fac materialized_views --refresh can be used to refresh the view(s) on any cycle desired.

* Linting

* Added unmanaged model for materialized view

* Switched from using dissemination tables in search module to materialized view

* An increment on the materialized view.

* Adding *all* the columns.

Because, if we're doing it, we should go all the way.

* Code cleaning

* Added passthrough to materialized view

* Adding in the .profile change

We dropped/added views in run.sh, but did not add it to .profile. Before deploying, this would be a good idea.

* Incremental

* Adds the workflow for materialized views

Configures a matrix for the 3 core environments on a schedule, and allows us to run via workflow_dispatch for a single target environment

* Add materialized view migration file.

* Linting fixes.

* Updated test cases to use Dissemination Combined view

* Ensure search is run against dissemination combined

* Updated test cases to use dissemination combined

* Code improvement

* Added testing environment to TestAdminAPI

* Skipping TestAdminAPI for now

* Disabling more test for now.

Before going to prod, all these tests must pass

* Remove skipped API tests

* Add source for materialized view sh functions.

* Remove materialized view commands from .profile.

* Fixing names query

The names query was potentially not right. It appears to now perform better (in terms of time to execute) as well as "do the right thing".

* Adding in index creation.
Timing data in summary reports.

* Approximate 4x speedup in SF-SAC generation

This walks the DisseminationCombined table *only once*, which reduces the number of times we traverse 4.4M rows. This currently has a weird alternation in the view, to test the two different exports.

* Adding .profile

I wonder if this actually is an issue?

* Yep, it matters.

This causes a timeout. Has to run post-deploy.

* Allows for alternation...

In seconds 0-9 of a minute, we get the original report generator. 10-19, the new... 20-29 the old... This makes testing the two against each other in preview possible.

* Workflow changes for preview

* For want of a 'd'

* Updates tests to accommodate timing info

* Removing "TESTING"

We should be using a default dockerized set of values for the connection string.

* Linting.

* Revert workflow test changes

* Set materialized views creation post deployment

* Updating tests

* Troubleshooting commit.

And some linting fixes.

* Troubleshooting commit.

* Troubleshooting commit.

* Proof of concept

* Switched Admin API Test to using django.db library

* More tests

* Still testing

* ....more testing

* bug fix

* Trying another approach of creating test tables

* Temporarily skipping for speed of troubleshooting

* Undo some changes made for testing purpose

* Commented out the new code in an attempt to isolate the issue

* Commenting more code to isolate the issue

* Looking for env variable values

* Reverting back previous changes now that the issue is isolated

* Typo

* Fixed linting

* Ensure summary report is using combined view

* Re-structure script to ensure db tables are created before materialized views

* Code cleaning + improvement

* Reverting tests that were skipped for debugging purpose

* Switched to concurrent refresh mode for materialized view

* Fixing PK

* Removing model fields to align with view

* Removing ID

* Possibly correct?

* Hotswap the view.

* Removing unique column

* remove unused files and commands

* clean up stray comments

* lint

* bring refresh back. it's used for a lot of tests.

* restore MV shell

* re-remove shell file, remove concurrently from MV refresh sql

* build MV on start i guess

* bring back passthrough

* lint

* no more shell script

* Update backend/run.sh

---------

Co-authored-by: Matt Jadud
Co-authored-by: Hassan D. M. Sambo
Co-authored-by: Alex Steel <130377221+asteel-gsa@users.noreply.github.com>
Co-authored-by: Daniel Swick <2365503+danswick@users.noreply.github.com>
---
 .github/workflows/materialize-views.yml       |  32 +
 backend/audit/test_workbooks_should_fail.py   |   2 +-
 backend/dissemination/api_versions.py         |  11 +
 .../management/commands/materialized_views.py |  22 +
 .../migrations/0015_disseminationcombined.py  | 623 ++++++++++++++++++
 backend/dissemination/models.py               | 352 +++++++++-
 backend/dissemination/search.py               |  36 +-
 .../dissemination/searchlib/search_alns.py    |   8 +-
 .../searchlib/search_direct_funding.py        |   4 +-
 .../searchlib/search_findings.py              |  28 +-
 .../dissemination/searchlib/search_general.py |  65 +--
 .../searchlib/search_major_program.py         |   4 +-
 .../sql/create_materialized_views.sql         | 179 +++++
 .../sql/drop_materialized_views.sql           |   3 +
 .../sql/refresh_materialized_views.sql        |   1 +
 backend/dissemination/summary_reports.py      | 216 ++++--
 backend/dissemination/test_search.py          | 143 +++-
 backend/dissemination/test_summary_reports.py |  42 +-
 backend/dissemination/test_views.py           |  54 +-
 backend/dissemination/views.py                |   3 +-
 .../admin_api_v1_1_0/create_access_tables.sql |   2 +
 .../api/admin_api_v1_1_0/drop_views.sql       |   2 +-
 22 files changed, 1656 insertions(+), 176 deletions(-)
 create mode 100644 .github/workflows/materialize-views.yml
 create mode 100644 backend/dissemination/management/commands/materialized_views.py
 create mode 100644 backend/dissemination/migrations/0015_disseminationcombined.py
 create mode 100644 backend/dissemination/sql/create_materialized_views.sql
 create mode 100644 backend/dissemination/sql/drop_materialized_views.sql
 create mode 100644 backend/dissemination/sql/refresh_materialized_views.sql

diff --git a/.github/workflows/materialize-views.yml b/.github/workflows/materialize-views.yml
new file mode 100644
index 0000000000..111b095585
--- /dev/null
+++ b/.github/workflows/materialize-views.yml
@@ -0,0 +1,32 @@
+---
+name: Run the Materialize Views Django Function
+on:
+  workflow_dispatch:
+    inputs:
+      environment:
+        required: true
+        type: choice
+        description: The environment the workflow should run on.
+        options:
+          - dev
+          - staging
+          - preview
+          - production
+
+jobs:
+  dispatch-materialize-views:
+    if: ${{ github.event.inputs.environment != '' }}
+    name: Run Materialize Views on ${{ inputs.environment }}
+    runs-on: ubuntu-latest
+    environment: ${{ inputs.environment }}
+    env:
+      space: ${{ inputs.environment }}
+    steps:
+      - name: Run Command
+        uses: cloud-gov/cg-cli-tools@main
+        with:
+          cf_username: ${{ secrets.CF_USERNAME }}
+          cf_password: ${{ secrets.CF_PASSWORD }}
+          cf_org: gsa-tts-oros-fac
+          cf_space: ${{ env.space }}
+          command: cf run-task gsa-fac -k 2G -m 2G --name dispatch_create_materialized_views --command "python manage.py materialized_views --create"
diff --git a/backend/audit/test_workbooks_should_fail.py b/backend/audit/test_workbooks_should_fail.py
index 278be9aae5..e8f5354edf 100644
--- a/backend/audit/test_workbooks_should_fail.py
+++ b/backend/audit/test_workbooks_should_fail.py
@@ -1,7 +1,7 @@
-from django.test import SimpleTestCase
 import os
 from functools import reduce
 import re
+from django.test import SimpleTestCase
 from django.core.exceptions import ValidationError
 
 from audit.intakelib import (
diff --git a/backend/dissemination/api_versions.py b/backend/dissemination/api_versions.py
index 1c21efbab7..ccd6c7ec61 100644
--- a/backend/dissemination/api_versions.py
+++ b/backend/dissemination/api_versions.py
@@ -1,6 +1,7 @@
 from psycopg2._psycopg import connection
 from config import settings
 import logging
+import os
 
 logger = logging.getLogger(__name__)
 
@@ -23,6 +24,16 @@ def get_conn_string():
     return conn_string
 
 
+def exec_sql_at_path(dir, filename):
+    conn = connection(get_conn_string())
+    conn.autocommit = True
+    path = os.path.join(dir, filename)
+    with conn.cursor() as curs:
+        logger.info(f"EXEC SQL {path}")
+        sql = open(path, "r").read()
+        curs.execute(sql)
+
+
 def exec_sql(location, version, filename):
     conn = connection(get_conn_string())
     conn.autocommit = True
diff --git a/backend/dissemination/management/commands/materialized_views.py b/backend/dissemination/management/commands/materialized_views.py
new file mode 100644
index 0000000000..67acc95b5d
--- /dev/null
+++ b/backend/dissemination/management/commands/materialized_views.py
@@ -0,0 +1,22 @@
+from django.core.management.base import BaseCommand
+from dissemination import api_versions
+
+
+class Command(BaseCommand):
+    help = """
+    Runs sql scripts to recreate access tables for the postgrest API.
+    """
+
+    def add_arguments(self, parser):
+        parser.add_argument("-c", "--create", action="store_true", default=False)
+        parser.add_argument("-d", "--drop", action="store_true", default=False)
+        parser.add_argument("-r", "--refresh", action="store_true", default=False)
+
+    def handle(self, *args, **options):
+        path = "dissemination/sql"
+        if options["create"]:
+            api_versions.exec_sql_at_path(path, "create_materialized_views.sql")
+        elif options["drop"]:
+            api_versions.exec_sql_at_path(path, "drop_materialized_views.sql")
+        elif options["refresh"]:
+            api_versions.exec_sql_at_path(path, "refresh_materialized_views.sql")
diff --git a/backend/dissemination/migrations/0015_disseminationcombined.py b/backend/dissemination/migrations/0015_disseminationcombined.py
new file mode 100644
index 0000000000..4abeeaec41
--- /dev/null
+++ b/backend/dissemination/migrations/0015_disseminationcombined.py
@@ -0,0 +1,623 @@
+# Generated by Django 5.0.2 on 2024-03-14 21:56
+
+from django.db import migrations, models
+
+
+class
Migration(migrations.Migration): + dependencies = [ + ("dissemination", "0014_tribalapiaccesskeyids"), + ] + + operations = [ + migrations.CreateModel( + name="DisseminationCombined", + fields=[ + ("id", models.BigAutoField(primary_key=True, serialize=False)), + ( + "report_id", + models.TextField( + help_text="GSAFAC generated identifier", + unique=True, + verbose_name="Report ID", + ), + ), + ( + "auditee_certify_name", + models.TextField( + help_text="Data sources: SF-SAC 1997-2000: I/6/g; SF-SAC 2001-2003: I/6/g; SF-SAC 2004-2007: I/6/g; SF-SAC 2008-2009: I/5/g; SF-SAC 2010-2012: I/5/g; SF-SAC 2013-2015: certifications; SF-SAC 2016-2018: certifications; SF-SAC 2019-2021: certifications; SF-SAC 2022: certifications Census mapping: GENERAL, AUDITEECERTIFYNAME", + verbose_name="Name of Auditee Certifying Official", + ), + ), + ( + "auditee_certify_title", + models.TextField( + help_text="Data sources: SF-SAC 1997-2000: I/6/g; SF-SAC 2001-2003: I/6/g; SF-SAC 2004-2007: I/6/g; SF-SAC 2008-2009: I/5/g; SF-SAC 2010-2012: I/5/g; SF-SAC 2013-2015: certifications; SF-SAC 2016-2018: certifications; SF-SAC 2019-2021: certifications; SF-SAC 2022: certifications Census mapping: GENERAL, AUDITEECERTIFYTITLE", + verbose_name="Title of Auditee Certifying Official", + ), + ), + ( + "auditor_certify_name", + models.TextField( + help_text="Data sources: SF-SAC 1997-2000: I/6/g; SF-SAC 2001-2003: I/6/g; SF-SAC 2004-2007: I/6/g; SF-SAC 2008-2009: I/5/g; SF-SAC 2010-2012: I/5/g; SF-SAC 2013-2015: certifications; SF-SAC 2016-2018: certifications; SF-SAC 2019-2021: certifications; SF-SAC 2022: certifications Census mapping: UNKNOWN", + verbose_name="Name of Auditor Certifying Official", + ), + ), + ( + "auditor_certify_title", + models.TextField( + help_text="Data sources: SF-SAC 1997-2000: I/6/g; SF-SAC 2001-2003: I/6/g; SF-SAC 2004-2007: I/6/g; SF-SAC 2008-2009: I/5/g; SF-SAC 2010-2012: I/5/g; SF-SAC 2013-2015: certifications; SF-SAC 2016-2018: certifications; SF-SAC 2019-2021: 
certifications; SF-SAC 2022: certifications Census mapping: UNKNOWN", + verbose_name="Title of Auditor Certifying Official", + ), + ), + ( + "auditee_contact_name", + models.TextField( + help_text="Data sources: SF-SAC 1997-2000: I/6/c; SF-SAC 2001-2003: I/6/c; SF-SAC 2004-2007: I/6/c; SF-SAC 2008-2009: I/5/c; SF-SAC 2010-2012: I/5/c; SF-SAC 2013-2015: I/5/c; SF-SAC 2016-2018: I/5/c; SF-SAC 2019-2021: I/5/c; SF-SAC 2022: I/5/c Census mapping: GENERAL, AUDITEECONTACT", + verbose_name="Name of Auditee Contact", + ), + ), + ( + "auditee_email", + models.TextField( + help_text="Data sources: SF-SAC 1997-2000: I/6/f; SF-SAC 2001-2003: I/6/f; SF-SAC 2004-2007: I/6/f; SF-SAC 2008-2009: I/5/f; SF-SAC 2010-2012: I/5/f; SF-SAC 2013-2015: I/5/f; SF-SAC 2016-2018: I/5/e; SF-SAC 2019-2021: I/5/e; SF-SAC 2022: I/5/e Census mapping: GENERAL, AUDITEEEMAIL", + verbose_name="Auditee Email address", + ), + ), + ( + "auditee_name", + models.TextField( + help_text="Data sources: SF-SAC 1997-2000: I/6/a; SF-SAC 2001-2003: I/6/a; SF-SAC 2004-2007: I/6/a; SF-SAC 2008-2009: I/5/a; SF-SAC 2010-2012: I/5/a; SF-SAC 2013-2015: I/5/a; SF-SAC 2016-2018: I/5/a; SF-SAC 2019-2021: I/5/a; SF-SAC 2022: I/5/a Census mapping: GENERAL, AUDITEENAME", + verbose_name="Name of the Auditee", + ), + ), + ( + "auditee_phone", + models.TextField( + help_text="Data sources: SF-SAC 1997-2000: I/6/d; SF-SAC 2001-2003: I/6/d; SF-SAC 2004-2007: I/6/d; SF-SAC 2008-2009: I/5/d; SF-SAC 2010-2012: I/5/d; SF-SAC 2013-2015: I/5/d; SF-SAC 2016-2018: I/5/d; SF-SAC 2019-2021: I/5/d; SF-SAC 2022: I/5/d Census mapping: GENERAL, AUDITEEPHONE", + verbose_name="Auditee Phone Number", + ), + ), + ( + "auditee_contact_title", + models.TextField( + help_text="Data sources: SF-SAC 1997-2000: I/6/c; SF-SAC 2001-2003: I/6/c; SF-SAC 2004-2007: I/6/c; SF-SAC 2008-2009: I/5/c; SF-SAC 2010-2012: I/5/c; SF-SAC 2013-2015: I/5/c; SF-SAC 2016-2018: I/5/c; SF-SAC 2019-2021: I/5/c; SF-SAC 2022: I/5/c Census mapping: GENERAL, AUDITEETITLE", + 
verbose_name="Title of Auditee Contact", + ), + ), + ( + "auditee_address_line_1", + models.TextField( + help_text="Data sources: SF-SAC 1997-2000: I/6/b; SF-SAC 2001-2003: I/6/b; SF-SAC 2004-2007: I/6/b; SF-SAC 2008-2009: I/5/b; SF-SAC 2010-2012: I/5/b; SF-SAC 2013-2015: I/5/b; SF-SAC 2016-2018: I/5/b; SF-SAC 2019-2021: I/5/b; SF-SAC 2022: I/5/b Census mapping: GENERAL, STREET1", + verbose_name="Auditee Street Address", + ), + ), + ( + "auditee_city", + models.TextField( + help_text="Data sources: SF-SAC 1997-2000: I/6/b; SF-SAC 2001-2003: I/6/b; SF-SAC 2004-2007: I/6/b; SF-SAC 2008-2009: I/5/b; SF-SAC 2010-2012: I/5/b; SF-SAC 2013-2015: I/5/b; SF-SAC 2016-2018: I/5/b; SF-SAC 2019-2021: I/5/b; SF-SAC 2022: I/5/b Census mapping: GENERAL, CITY", + verbose_name="Auditee City", + ), + ), + ( + "auditee_state", + models.TextField( + help_text="Data sources: SF-SAC 1997-2000: I/6/b; SF-SAC 2001-2003: I/6/b; SF-SAC 2004-2007: I/6/b; SF-SAC 2008-2009: I/5/b; SF-SAC 2010-2012: I/5/b; SF-SAC 2013-2015: I/5/b; SF-SAC 2016-2018: I/5/b; SF-SAC 2019-2021: I/5/b; SF-SAC 2022: I/5/b Census mapping: GENERAL, STATE", + verbose_name="Auditee State", + ), + ), + ( + "auditee_ein", + models.TextField( + verbose_name="Primary Employer Identification Number" + ), + ), + ( + "auditee_uei", + models.TextField( + help_text="Data sources: SF-SAC 2022: I/4/g Census mapping: GENERAL, UEI", + verbose_name="Auditee UEI", + ), + ), + ("is_additional_ueis", models.TextField()), + ( + "auditee_zip", + models.TextField( + help_text="Data sources: SF-SAC 1997-2000: I/6/b; SF-SAC 2001-2003: I/6/b; SF-SAC 2004-2007: I/6/b; SF-SAC 2008-2009: I/5/b; SF-SAC 2010-2012: I/5/b; SF-SAC 2013-2015: I/5/b; SF-SAC 2016-2018: I/5/b; SF-SAC 2019-2021: I/5/b; SF-SAC 2022: I/5/b Census mapping: GENERAL, ZIPCODE", + verbose_name="Auditee Zip Code", + ), + ), + ( + "auditor_phone", + models.TextField( + help_text="Data sources: SF-SAC 1997-2000: I/7/d; SF-SAC 2001-2003: I/7/d; SF-SAC 2004-2007: I/7/d; SF-SAC 
2008-2009: I/6/d; SF-SAC 2010-2012: I/6/d; SF-SAC 2013-2015: I/6/e; SF-SAC 2016-2018: I/6/e; SF-SAC 2019-2021: I/6/e; SF-SAC 2022: I/6/e Census mapping: GENERAL, CPAPHONE (AND) Data sources: SF-SAC 2008-2009: I/8/d; SF-SAC 2010-2012: I/8/d; SF-SAC 2013-2015: I/8/i; SF-SAC 2016-2018: I/8/i; SF-SAC 2019-2021: I/6/h/ix; SF-SAC 2022: I/6/h/ix Census mapping: MULTIPLE CPAS INFO, CPAPHONE", + verbose_name="CPA phone number", + ), + ), + ( + "auditor_state", + models.TextField( + help_text="Data sources: SF-SAC 1997-2000: I/7/b; SF-SAC 2001-2003: I/7/b; SF-SAC 2004-2007: I/7/b; SF-SAC 2008-2009: I/6/b; SF-SAC 2010-2012: I/6/b; SF-SAC 2013-2015: I/6/c; SF-SAC 2016-2018: I/6/c; SF-SAC 2019-2021: I/6/c; SF-SAC 2022: I/6/c Census mapping: GENERAL, CPASTATE (AND) Data sources: SF-SAC 2008-2009: I/8/b; SF-SAC 2010-2012: I/8/b; SF-SAC 2013-2015: I/8/e; SF-SAC 2016-2018: I/8/e; SF-SAC 2019-2021: I/6/h/v; SF-SAC 2022: I/6/h/v Census mapping: MULTIPLE CPAS INFO, CPASTATE", + verbose_name="CPA State", + ), + ), + ( + "auditor_city", + models.TextField( + help_text="Data sources: SF-SAC 1997-2000: I/7/b; SF-SAC 2001-2003: I/7/b; SF-SAC 2004-2007: I/7/b; SF-SAC 2008-2009: I/6/b; SF-SAC 2010-2012: I/6/b; SF-SAC 2013-2015: I/6/c; SF-SAC 2016-2018: I/6/c; SF-SAC 2019-2021: I/6/c; SF-SAC 2022: I/6/c Census mapping: GENERAL, CPACITY (AND) Data sources: SF-SAC 2008-2009: I/8/b; SF-SAC 2010-2012: I/8/b; SF-SAC 2013-2015: I/8/d; SF-SAC 2016-2018: I/8/d; SF-SAC 2019-2021: I/6/h/iv; SF-SAC 2022: I/6/h/iv Census mapping: MULTIPLE CPAS INFO, CPACITY", + verbose_name="CPA City", + ), + ), + ( + "auditor_contact_title", + models.TextField( + help_text="Data sources: SF-SAC 1997-2000: I/7/c; SF-SAC 2001-2003: I/7/c; SF-SAC 2004-2007: I/7/c; SF-SAC 2008-2009: I/6/c; SF-SAC 2010-2012: I/6/c; SF-SAC 2013-2015: I/6/d; SF-SAC 2016-2018: I/6/d; SF-SAC 2019-2021: I/6/d; SF-SAC 2022: I/6/d Census mapping: GENERAL, CPATITLE (AND) Data sources: SF-SAC 2008-2009: I/8/c; SF-SAC 2010-2012: I/8/c; SF-SAC 
2013-2015: I/8/h; SF-SAC 2016-2018: I/8/h; SF-SAC 2019-2021: I/6/h/viii; SF-SAC 2022: I/6/h/viii Census mapping: MULTIPLE CPAS INFO, CPATITLE", + verbose_name="Title of CPA Contact", + ), + ), + ( + "auditor_address_line_1", + models.TextField( + help_text="Data sources: SF-SAC 1997-2000: I/7/b; SF-SAC 2001-2003: I/7/b; SF-SAC 2004-2007: I/7/b; SF-SAC 2008-2009: I/6/b; SF-SAC 2010-2012: I/6/b; SF-SAC 2013-2015: I/6/c; SF-SAC 2016-2018: I/6/c; SF-SAC 2019-2021: I/6/c; SF-SAC 2022: I/6/c Census mapping: GENERAL, CPASTREET1 (AND) Data sources: SF-SAC 2008-2009: I/8/b; SF-SAC 2010-2012: I/8/b; SF-SAC 2013-2015: I/8/c; SF-SAC 2016-2018: I/8/c; SF-SAC 2019-2021: I/6/h/iii; SF-SAC 2022: I/6/h/iii Census mapping: MULTIPLE CPAS INFO, CPASTREET1", + verbose_name="CPA Street Address", + ), + ), + ( + "auditor_zip", + models.TextField( + help_text="Data sources: SF-SAC 1997-2000: I/7/b; SF-SAC 2001-2003: I/7/b; SF-SAC 2004-2007: I/7/b; SF-SAC 2008-2009: I/6/b; SF-SAC 2010-2012: I/6/b; SF-SAC 2013-2015: I/6/c; SF-SAC 2016-2018: I/6/c; SF-SAC 2019-2021: I/6/c; SF-SAC 2022: I/6/c Census mapping: GENERAL, CPAZIPCODE (AND) Data sources: SF-SAC 2008-2009: I/8/b; SF-SAC 2010-2012: I/8/b; SF-SAC 2013-2015: I/8/f; SF-SAC 2016-2018: I/8/f; SF-SAC 2019-2021: I/6/h/vi; SF-SAC 2022: I/6/h/vi Census mapping: MULTIPLE CPAS INFO, CPAZIPCODE", + verbose_name="CPA Zip Code", + ), + ), + ( + "auditor_country", + models.TextField( + help_text="Data sources: SF-SAC 2019-2021: I/6/c; SF-SAC 2022: I/6/c Census mapping: GENERAL, CPACOUNTRY", + verbose_name="CPA Country", + ), + ), + ( + "auditor_contact_name", + models.TextField( + help_text="Data sources: SF-SAC 1997-2000: I/7/c; SF-SAC 2001-2003: I/7/c; SF-SAC 2004-2007: I/7/c; SF-SAC 2008-2009: I/6/c; SF-SAC 2010-2012: I/6/c; SF-SAC 2013-2015: I/6/d; SF-SAC 2016-2018: I/6/d; SF-SAC 2019-2021: I/6/d; SF-SAC 2022: I/6/d Census mapping: GENERAL, CPACONTACT (AND) Data sources: SF-SAC 2008-2009: I/8/c; SF-SAC 2010-2012: I/8/c; SF-SAC 2013-2015: I/8/g; 
SF-SAC 2016-2018: I/8/g; SF-SAC 2019-2021: I/6/h/vii; SF-SAC 2022: I/6/h/vii Census mapping: MULTIPLE CPAS INFO, CPACONTACT", + verbose_name="Name of CPA Contact", + ), + ), + ( + "auditor_email", + models.TextField( + help_text="Data sources: SF-SAC 1997-2000: I/7/f; SF-SAC 2001-2003: I/7/f; SF-SAC 2004-2007: I/7/f; SF-SAC 2008-2009: I/6/f; SF-SAC 2010-2012: I/6/f; SF-SAC 2013-2015: I/6/g; SF-SAC 2016-2018: I/6/f; SF-SAC 2019-2021: I/6/f; SF-SAC 2022: I/6/f Census mapping: GENERAL, CPAEMAIL (AND) Data sources: SF-SAC 2008-2009: I/8/f; SF-SAC 2010-2012: I/8/f; SF-SAC 2013-2015: I/8/k; SF-SAC 2016-2018: I/8/k; SF-SAC 2019-2021: I/6/h/x; SF-SAC 2022: I/6/h/x Census mapping: MULTIPLE CPAS INFO, CPAEMAIL", + verbose_name="CPA mail address (optional)", + ), + ), + ( + "auditor_firm_name", + models.TextField( + help_text="Data sources: SF-SAC 1997-2000: I/7/a; SF-SAC 2001-2003: I/7/a; SF-SAC 2004-2007: I/7/a; SF-SAC 2008-2009: I/6/a; SF-SAC 2010-2012: I/6/a; SF-SAC 2013-2015: I/6/a; SF-SAC 2016-2018: I/6/a; SF-SAC 2019-2021: I/6/a; SF-SAC 2022: I/6/a Census mapping: GENERAL, CPAFIRMNAME (AND) Data sources: SF-SAC 2008-2009: I/8/a; SF-SAC 2010-2012: I/8/a; SF-SAC 2013-2015: I/8/a; SF-SAC 2016-2018: I/8/a; SF-SAC 2019-2021: I/6/h/i; SF-SAC 2022: I/6/h/i Census mapping: MULTIPLE CPAS INFO, CPAFIRMNAME", + verbose_name="CPA Firm Name", + ), + ), + ( + "auditor_foreign_address", + models.TextField( + help_text="Data sources: SF-SAC 2019-2021: I/6/c; SF-SAC 2022: I/6/c Census mapping: GENERAL, CPAFOREIGN", + verbose_name="CPA Address - if international", + ), + ), + ( + "auditor_ein", + models.TextField( + help_text="Data sources: SF-SAC 2013-2015: I/6/b; SF-SAC 2016-2018: I/6/b; SF-SAC 2019-2021: I/6/b; SF-SAC 2022: I/6/b Census mapping: GENERAL, AUDITOR_EIN (AND) Data sources: SF-SAC 2013-2015: I/8/b; SF-SAC 2016-2018: I/8/b; SF-SAC 2019-2021: I/6/h/ii; SF-SAC 2022: I/6/h/ii Census mapping: MULTIPLE CPAS INFO, CPAEIN", + verbose_name="CPA Firm EIN (only available for audit 
years 2013 and beyond)", + ), + ), + ( + "cognizant_agency", + models.TextField( + help_text="Data sources: SF-SAC 1997-2000: I/9; SF-SAC 2001-2003: I/9 Census mapping: GENERAL, COGAGENCY", + null=True, + verbose_name="Two digit Federal agency prefix of the cognizant agency", + ), + ), + ( + "oversight_agency", + models.TextField( + help_text="Data sources: SF-SAC 1997-2000: I/9; SF-SAC 2001-2003: I/9 Census mapping: GENERAL, OVERSIGHTAGENCY", + null=True, + verbose_name="Two digit Federal agency prefix of the oversight agency", + ), + ), + ( + "date_created", + models.DateField( + help_text="Census mapping: GENERAL, INITIAL DATE RECEIVED", + verbose_name="The first date an audit component or Form SF-SAC was received by the Federal audit Clearinghouse (FAC).", + ), + ), + ( + "ready_for_certification_date", + models.DateField( + verbose_name="The date at which the audit transitioned to 'ready for certification'" + ), + ), + ( + "auditor_certified_date", + models.DateField( + verbose_name="The date at which the audit transitioned to 'auditor certified'" + ), + ), + ( + "auditee_certified_date", + models.DateField( + verbose_name="The date at which the audit transitioned to 'auditee certified'" + ), + ), + ( + "submitted_date", + models.DateField( + verbose_name="The date at which the audit transitioned to 'submitted'" + ), + ), + ( + "fac_accepted_date", + models.DateField( + verbose_name="The date at which the audit transitioned to 'accepted'" + ), + ), + ( + "fy_end_date", + models.DateField( + help_text="Data sources: SF-SAC 1997-2000: Part I, Item 1; SF-SAC 2001-2003: Part I, Item 1; SF-SAC 2004-2007: Part I, Item 1; SF-SAC 2008-2009: Part I, Item 1; SF-SAC 2010-2012: Part I, Item 1; SF-SAC 2013-2015: Part I, Item 1; SF-SAC 2016-2018: Part I, Item 1; SF-SAC 2019-2021: I/1/b; SF-SAC 2022: I/1/b Census mapping: GENERAL, FYENDDATE", + verbose_name="Fiscal Year End Date", + ), + ), + ( + "fy_start_date", + models.DateField( + help_text="Data sources: SF-SAC 
2019-2021: Part I, Item 1(a); SF-SAC 2022: Part I, Item 1(a) Census mapping: GENERAL, FYSTARTDATE", + verbose_name="Fiscal Year Start Date", + ), + ), + ( + "audit_year", + models.TextField( + help_text="Census mapping: GENERAL, AUDITYEAR", + verbose_name="Audit year from fy_start_date.", + ), + ), + ( + "audit_type", + models.TextField( + help_text="Data sources: SF-SAC 1997-2000: I/2; SF-SAC 2001-2003: I/2; SF-SAC 2004-2007: I/2; SF-SAC 2008-2009: I/2; SF-SAC 2010-2012: I/2; SF-SAC 2013-2015: I/2; SF-SAC 2016-2018: I/2; SF-SAC 2019-2021: I/2; SF-SAC 2022: I/2 Census mapping: GENERAL, AUDITTYPE", + verbose_name="Type of Audit", + ), + ), + ( + "gaap_results", + models.TextField(verbose_name="GAAP Results by Auditor"), + ), + ( + "sp_framework_basis", + models.TextField( + help_text="Data sources: SF-SAC 2016-2018: III/2/a/ii; SF-SAC 2019-2021: III/2/a/i; SF-SAC 2022: III/2/a/i Census mapping: GENERAL, SP_FRAMEWORK", + verbose_name="Special Purpose Framework that was used as the basis of accounting", + ), + ), + ( + "is_sp_framework_required", + models.TextField( + help_text="Data sources: SF-SAC 2016-2018: III/2/a/iii; SF-SAC 2019-2021: III/2/a/ii; SF-SAC 2022: III/2/a/ii Census mapping: GENERAL, SP_FRAMEWORK_REQUIRED", + verbose_name="Indicate whether or not the special purpose framework used as basis of accounting by state law or tribal law", + ), + ), + ( + "sp_framework_opinions", + models.TextField( + help_text="Data sources: SF-SAC 2016-2018: III/2/a/iv; SF-SAC 2019-2021: III/2/a/iii; SF-SAC 2022: III/2/a/iii Census mapping: GENERAL, TYPEREPORT_SP_FRAMEWORK", + verbose_name="The auditor's opinion on the special purpose framework", + ), + ), + ( + "is_going_concern_included", + models.TextField( + help_text="Data sources: SF-SAC 1997-2000: II/2; SF-SAC 2001-2003: II/2; SF-SAC 2004-2007: II/2; SF-SAC 2008-2009: II/2; SF-SAC 2010-2012: II/2; SF-SAC 2013-2015: II/2; SF-SAC 2016-2018: III/2/b; SF-SAC 2019-2021: III/2/b; SF-SAC 2022: III/2/b Census mapping: 
GENERAL, GOINGCONCERN", + verbose_name="Whether or not the audit contained a going concern paragraph on financial statements", + ), + ), + ( + "is_internal_control_deficiency_disclosed", + models.TextField( + help_text="Data sources: SF-SAC 1997-2000: II/3; SF-SAC 2001-2003: II/3; SF-SAC 2004-2007: II/3; SF-SAC 2008-2009: II/3; SF-SAC 2010-2012: II/3; SF-SAC 2013-2015: II/3; SF-SAC 2016-2018: III/2/c; SF-SAC 2019-2021: III/2/c; SF-SAC 2022: III/2/c Census mapping: GENERAL, SIGNIFICANTDEFICIENCY", + verbose_name="Whether or not the audit disclosed a significant deficiency on financial statements", + ), + ), + ( + "is_internal_control_material_weakness_disclosed", + models.TextField( + help_text="Data sources: SF-SAC 1997-2000: II/4; SF-SAC 2001-2003: II/4; SF-SAC 2004-2007: II/4; SF-SAC 2008-2009: II/4; SF-SAC 2010-2012: II/4; SF-SAC 2013-2015: II/4; SF-SAC 2016-2018: III/2/d; SF-SAC 2019-2021: III/2/d; SF-SAC 2022: III/2/d Census mapping: GENERAL, MATERIALWEAKNESS" + ), + ), + ( + "is_material_noncompliance_disclosed", + models.TextField( + help_text="Data sources: SF-SAC 1997-2000: II/5; SF-SAC 2001-2003: II/5; SF-SAC 2004-2007: II/5; SF-SAC 2008-2009: II/5; SF-SAC 2010-2012: II/5; SF-SAC 2013-2015: II/5; SF-SAC 2016-2018: III/2/e; SF-SAC 2019-2021: III/2/e; SF-SAC 2022: III/2/e Census mapping: GENERAL, MATERIALNONCOMPLIANCE", + verbose_name="Whether or not the audit disclosed a material noncompliance on financial statements", + ), + ), + ("is_aicpa_audit_guide_included", models.TextField()), + ( + "dollar_threshold", + models.BigIntegerField( + help_text="Data sources: SF-SAC 1997-2000: III/2; SF-SAC 2001-2003: III/3; SF-SAC 2004-2007: III/2; SF-SAC 2008-2009: III/2; SF-SAC 2010-2012: III/2; SF-SAC 2013-2015: III/2; SF-SAC 2016-2018: III/3/b; SF-SAC 2019-2021: III/3/b; SF-SAC 2022: III/3/b Census mapping: GENERAL, DOLLARTHRESHOLD", + verbose_name="Dollar Threshold to distinguish between Type A and Type B programs.", + ), + ), + ( + "is_low_risk_auditee", + 
models.TextField( + help_text="Data sources: SF-SAC 1997-2000: III/3; SF-SAC 2001-2003: III/4; SF-SAC 2004-2007: III/3; SF-SAC 2008-2009: III/3; SF-SAC 2010-2012: III/3; SF-SAC 2013-2015: III/3; SF-SAC 2016-2018: III/3/c; SF-SAC 2019-2021: III/3/c; SF-SAC 2022: III/3/c Census mapping: GENERAL, LOWRISK", + verbose_name="Indicate whether or not the auditee qualified as a low-risk auditee", + ), + ), + ( + "agencies_with_prior_findings", + models.TextField( + verbose_name="List of agencues with prior findings" + ), + ), + ( + "entity_type", + models.TextField( + help_text="Census mapping: GENERAL, ENTITY_TYPE", + verbose_name="Self reported type of entity (i.e., States, Local Governments, Indian Tribes, Institutions of Higher Education, NonProfit)", + ), + ), + ( + "number_months", + models.TextField( + help_text="Data sources: SF-SAC 1997-2000: I/3; SF-SAC 2001-2003: I/3; SF-SAC 2004-2007: I/3; SF-SAC 2008-2009: I/3; SF-SAC 2010-2012: I/3; SF-SAC 2013-2015: I/3; SF-SAC 2016-2018: I/3; SF-SAC 2019-2021: I/3; SF-SAC 2022: I/3 Census mapping: GENERAL, NUMBERMONTHS", + verbose_name="Number of Months Covered by the 'Other' Audit Period", + ), + ), + ( + "audit_period_covered", + models.TextField( + help_text="Data sources: SF-SAC 1997-2000: I/3; SF-SAC 2001-2003: I/3; SF-SAC 2004-2007: I/3; SF-SAC 2008-2009: I/3; SF-SAC 2010-2012: I/3; SF-SAC 2013-2015: I/3; SF-SAC 2016-2018: I/3; SF-SAC 2019-2021: I/3; SF-SAC 2022: I/3 Census mapping: GENERAL, PERIODCOVERED", + verbose_name="Audit Period Covered by Audit", + ), + ), + ( + "total_amount_expended", + models.BigIntegerField( + help_text="Data sources: SF-SAC 1997-2000: III/6/c- Total; SF-SAC 2001-2003: III/10/d -Total; SF-SAC 2004-2007: III/9/e -Total; SF-SAC 2008-2009: III/9/e -Total; SF-SAC 2010-2012: III/9/f -Total; SF-SAC 2013-2015: III/6/d -Total; SF-SAC 2016-2018: II/1/e- Total; SF-SAC 2019-2021: II/1/e - Total; SF-SAC 2022: II/1/e - Total Census mapping: GENERAL, TOTFEDEXPEND", + verbose_name="Total Federal 
Expenditures", + ), + ), + ( + "type_audit_code", + models.TextField(verbose_name="Determines if audit is A133 or UG"), + ), + ( + "is_public", + models.BooleanField( + default=False, + verbose_name="True for public records, False for non-public records", + ), + ), + ( + "data_source", + models.TextField( + verbose_name="Data origin; GSA, Census, or TESTDATA" + ), + ), + ( + "additional_award_identification", + models.TextField( + help_text="Data sources: SF-SAC 2016-2018: II/1/c; SF-SAC 2019-2021: II/1/c; SF-SAC 2022: II/1/c Census mapping: CFDA INFO, AWARDIDENTIFICATION", + verbose_name="Other data used to identify the award which is not a CFDA number (e.g., program year, contract number)", + ), + ), + ( + "amount_expended", + models.BigIntegerField( + help_text="Data sources: SF-SAC 1997-2000: III/6/c; SF-SAC 2001-2003: III/10/d; SF-SAC 2004-2007: III/9/e; SF-SAC 2008-2009: III/9/e; SF-SAC 2010-2012: III/9/f; SF-SAC 2013-2015: III/6/d; SF-SAC 2016-2018: II/1/e; SF-SAC 2019-2021: II/1/e; SF-SAC 2022: II/1/e Census mapping: CFDA INFO, AMOUNT", + verbose_name="Amount Expended for the Federal Program", + ), + ), + ( + "award_reference", + models.TextField( + verbose_name="Order that the award line was reported" + ), + ), + ( + "cluster_name", + models.TextField( + help_text="Data sources: SF-SAC 2016-2018: II/1/f; SF-SAC 2019-2021: II/1/f; SF-SAC 2022: II/1/f Census mapping: CFDA INFO, CLUSTERNAME", + verbose_name="The name of the cluster", + ), + ), + ( + "cluster_total", + models.BigIntegerField( + help_text="Data sources: SF-SAC 2016-2018: II/1/h; SF-SAC 2019-2021: II/1/h; SF-SAC 2022: II/1/h Census mapping: CFDA INFO, CLUSTERTOTAL", + verbose_name="Total Federal awards expended for each individual Federal program is auto-generated by summing the amount expended for all line items with the same Cluster Name", + ), + ), + ( + "federal_agency_prefix", + models.TextField(verbose_name="2-char code refers to an agency"), + ), + ( + "federal_award_extension", + 
models.TextField( + verbose_name="3-digit extn for a program defined by the agency" + ), + ), + ( + "aln", + models.TextField( + verbose_name="2-char agency code concatenated to 3-digit program extn" + ), + ), + ( + "federal_program_name", + models.TextField( + help_text="Data sources: SF-SAC 1997-2000: III/6/b; SF-SAC 2001-2003: III/10/c; SF-SAC 2004-2007: III/9/d; SF-SAC 2008-2009: III/9/d; SF-SAC 2010-2012: III/9/e; SF-SAC 2013-2015: III/6/c; SF-SAC 2016-2018: II/1/d; SF-SAC 2019-2021: II/1/d; SF-SAC 2022: II/1/d Census mapping: CFDA INFO, FEDERALPROGRAMNAME", + verbose_name="Name of Federal Program", + ), + ), + ( + "federal_program_total", + models.BigIntegerField( + help_text="Data sources: SF-SAC 2016-2018: II/1/g; SF-SAC 2019-2021: II/1/g; SF-SAC 2022: II/1/g Census mapping: CFDA INFO, PROGRAMTOTAL", + verbose_name="Total Federal awards expended for each individual Federal program is auto-generated by summing the amount expended for all line items with the same CFDA Prefix and Extension", + ), + ), + ( + "findings_count", + models.IntegerField( + help_text="Data sources: SF-SAC 2013-2015: III/6/k; SF-SAC 2016-2018: III/1/c; SF-SAC 2019-2021: III/1/c; SF-SAC 2022: III/1/c Census mapping: CFDA INFO, FINDINGSCOUNT", + verbose_name="Number of findings for the federal program (only available for audit years 2013 and beyond)", + ), + ), + ( + "is_direct", + models.TextField( + help_text="Data sources: SF-SAC 2001-2003: III/10/e; SF-SAC 2004-2007: III/9/f; SF-SAC 2008-2009: III/9/f; SF-SAC 2010-2012: III/9/g; SF-SAC 2013-2015: III/6/h; SF-SAC 2016-2018: II/1/k; SF-SAC 2019-2021: II/1/k; SF-SAC 2022: II/1/k Census mapping: CFDA INFO, DIRECT", + verbose_name="Indicate whether or not the award was received directly from a Federal awarding agency", + ), + ), + ( + "is_loan", + models.TextField( + help_text="Data sources: SF-SAC 2013-2015: III/6/f; SF-SAC 2016-2018: II/1/i; SF-SAC 2019-2021: II/1/i; SF-SAC 2022: II/1/i Census mapping: CFDA INFO, LOANS", + 
verbose_name="Indicate whether or not the program is a Loan or Loan Guarantee (only available for audit years 2013 and beyond)", + ), + ), + ( + "is_major", + models.TextField( + help_text="Data sources: SF-SAC 1997-2000: III/7/a; SF-SAC 2001-2003: III/10/f; SF-SAC 2004-2007: III/9/g; SF-SAC 2008-2009: III/9/g; SF-SAC 2010-2012: III/9/h; SF-SAC 2013-2015: III/6/i; SF-SAC 2016-2018: III/1/a; SF-SAC 2019-2021: III/1/a; SF-SAC 2022: III/1/a Census mapping: CFDA INFO, MAJORPROGRAM", + verbose_name="Indicate whether or not the Federal program is a major program", + ), + ), + ( + "is_passthrough_award", + models.TextField( + help_text="Data sources: SF-SAC 2016-2018: II/1/n; SF-SAC 2019-2021: II/1/n; SF-SAC 2022: II/1/n Census mapping: CFDA INFO, PASSTHROUGHAWARD", + verbose_name="Indicates whether or not funds were passed through to any subrecipients for the Federal program", + ), + ), + ( + "loan_balance", + models.TextField( + help_text="Data sources: SF-SAC 2016-2018: II/1/j; SF-SAC 2019-2021: II/1/j; SF-SAC 2022: II/1/j Census mapping: CFDA INFO, LOANBALANCE", + verbose_name="The loan or loan guarantee (loan) balance outstanding at the end of the audit period. 
A response of ‘N/A’ is acceptable.", + ), + ), + ( + "audit_report_type", + models.TextField( + help_text="Data sources: SF-SAC 2004-2007: III/9/h; SF-SAC 2008-2009: III/9/h; SF-SAC 2010-2012: III/9/i; SF-SAC 2013-2015: III/6/j; SF-SAC 2016-2018: III/1/b; SF-SAC 2019-2021: III/1/b; SF-SAC 2022: III/1/b Census mapping: CFDA INFO, TYPEREPORT_MP", + verbose_name="Type of Report Issued on the Major Program Compliance", + ), + ), + ( + "other_cluster_name", + models.TextField( + help_text="Census mapping: CFDA INFO, OTHERCLUSTERNAME", + verbose_name="The name of the cluster (if not listed in the Compliance Supplement)", + ), + ), + ( + "passthrough_amount", + models.BigIntegerField( + help_text="Data sources: SF-SAC 2016-2018: II/1/o; SF-SAC 2019-2021: II/1/o; SF-SAC 2022: II/1/o Census mapping: CFDA INFO, PASSTHROUGHAMOUNT", + null=True, + verbose_name="Amount passed through to subrecipients", + ), + ), + ( + "state_cluster_name", + models.TextField( + help_text="Census mapping: CFDA INFO, STATECLUSTERNAME", + verbose_name="The name of the state cluster", + ), + ), + ( + "reference_number", + models.TextField( + help_text="Data sources: SF-SAC 2013-2015: III/7/d; SF-SAC 2016-2018: III/4/e; SF-SAC 2019-2021: III/4/e; SF-SAC 2022: III/4/e Census mapping: FINDINGS, FINDINGSREFNUMS", + verbose_name="Findings Reference Numbers", + ), + ), + ( + "is_material_weakness", + models.TextField( + help_text="Data sources: SF-SAC 2013-2015: III/7/h; SF-SAC 2016-2018: III/4/i; SF-SAC 2019-2021: III/4/i; SF-SAC 2022: III/4/i Census mapping: FINDINGS, MATERIALWEAKNESS", + verbose_name="Material Weakness finding", + ), + ), + ( + "is_modified_opinion", + models.TextField( + help_text="Data sources: SF-SAC 2013-2015: III/7/f; SF-SAC 2016-2018: III/4/g; SF-SAC 2019-2021: III/4/g; SF-SAC 2022: III/4/g Census mapping: FINDINGS, MODIFIEDOPINION", + verbose_name="Modified Opinion finding", + ), + ), + ( + "is_other_findings", + models.TextField( + help_text="Data sources: SF-SAC 2013-2015: 
III/7/j; SF-SAC 2016-2018: III/4/k; SF-SAC 2019-2021: III/4/k; SF-SAC 2022: III/4/k Census mapping: FINDINGS, OTHERFINDINGS", + verbose_name="Other findings", + ), + ), + ( + "is_other_matters", + models.TextField( + help_text="Data sources: SF-SAC 2013-2015: III/7/g; SF-SAC 2016-2018: III/4/h; SF-SAC 2019-2021: III/4/h; SF-SAC 2022: III/4/h Census mapping: FINDINGS, OTHERNONCOMPLIANCE", + verbose_name="Other non-compliance", + ), + ), + ( + "is_questioned_costs", + models.TextField( + help_text="Data sources: SF-SAC 2013-2015: III/7/k; SF-SAC 2016-2018: III/4/l; SF-SAC 2019-2021: III/4/l; SF-SAC 2022: III/4/l Census mapping: FINDINGS, QCOSTS", + verbose_name="Questioned Costs", + ), + ), + ( + "is_repeat_finding", + models.TextField( + help_text="Data sources: SF-SAC 2016-2018: III/4/m; SF-SAC 2019-2021: III/4/m; SF-SAC 2022: III/4/m Census mapping: FINDINGS, REPEATFINDING", + verbose_name="Indicates whether or not the audit finding was a repeat of an audit finding in the immediate prior audit", + ), + ), + ( + "is_significant_deficiency", + models.TextField( + help_text="Data sources: SF-SAC 1997-2000: II/3; SF-SAC 2001-2003: II/3; SF-SAC 2004-2007: II/3; SF-SAC 2008-2009: II/3; SF-SAC 2010-2012: II/3; SF-SAC 2013-2015: II/3; SF-SAC 2016-2018: III/2/c; SF-SAC 2019-2021: III/2/c; SF-SAC 2022: III/2/c Census mapping: FINDINGS, SIGNIFICANTDEFICIENCY", + verbose_name="Significant Deficiency finding", + ), + ), + ( + "prior_finding_ref_numbers", + models.TextField( + help_text="Data sources: SF-SAC 2016-2018: III/4/n; SF-SAC 2019-2021: III/4/n; SF-SAC 2022: III/4/n Census mapping: FINDINGS, PRIORFINDINGREFNUMS", + verbose_name="Audit finding reference numbers from the immediate prior audit", + ), + ), + ( + "type_requirement", + models.TextField( + help_text="Data sources: SF-SAC 2013-2015: III/7/e; SF-SAC 2016-2018: III/4/f; SF-SAC 2019-2021: III/4/f; SF-SAC 2022: III/4/f Census mapping: FINDINGS, TYPEREQUIREMENT", + verbose_name="Type Requirement Failure", + ), + ), 
+ ], + options={ + "db_table": "dissemination_combined", + "managed": False, + }, + ), + ] diff --git a/backend/dissemination/models.py b/backend/dissemination/models.py index 1de4f46d5a..4db5031b88 100644 --- a/backend/dissemination/models.py +++ b/backend/dissemination/models.py @@ -1,6 +1,5 @@ from django.db import models from django.utils import timezone - from . import docs from .hist_models import census_2019, census_2022 # noqa: F401 @@ -620,3 +619,354 @@ class MigrationInspectionRecord(models.Model): passthrough = models.JSONField(blank=True, null=True) general = models.JSONField(blank=True, null=True) secondary_auditor = models.JSONField(blank=True, null=True) + + +class DisseminationCombined(models.Model): + """ + Represents the 'dissemination_combined' materialized view. + """ + + # Meta options + class Meta: + managed = False + db_table = "dissemination_combined" + + # General Information + report_id = models.TextField( + "Report ID", + help_text=REPORT_ID_FK_HELP_TEXT, + unique=True, + ) + auditee_certify_name = models.TextField( + "Name of Auditee Certifying Official", + help_text=docs.auditee_certify_name, + ) + auditee_certify_title = models.TextField( + "Title of Auditee Certifying Official", + help_text=docs.auditee_certify_title, + ) + auditor_certify_name = models.TextField( + "Name of Auditor Certifying Official", + help_text=docs.auditor_certify_name, + ) + auditor_certify_title = models.TextField( + "Title of Auditor Certifying Official", + help_text=docs.auditor_certify_title, + ) + auditee_contact_name = models.TextField( + "Name of Auditee Contact", + help_text=docs.auditee_contact, + ) + auditee_email = models.TextField( + "Auditee Email address", + help_text=docs.auditee_email, + ) + auditee_name = models.TextField("Name of the Auditee", help_text=docs.auditee_name) + auditee_phone = models.TextField( + "Auditee Phone Number", help_text=docs.auditee_phone + ) + auditee_contact_title = models.TextField( + "Title of Auditee Contact", + 
help_text=docs.auditee_title, + ) + auditee_address_line_1 = models.TextField( + "Auditee Street Address", help_text=docs.street1 + ) + auditee_city = models.TextField("Auditee City", help_text=docs.city) + auditee_state = models.TextField("Auditee State", help_text=docs.state) + auditee_ein = models.TextField( + "Primary Employer Identification Number", + ) + + auditee_uei = models.TextField("Auditee UEI", help_text=docs.uei_general) + + is_additional_ueis = models.TextField() + + auditee_zip = models.TextField( + "Auditee Zip Code", + help_text=docs.zip_code, + ) + auditor_phone = models.TextField("CPA phone number", help_text=docs.auditor_phone) + + auditor_state = models.TextField("CPA State", help_text=docs.auditor_state) + auditor_city = models.TextField("CPA City", help_text=docs.auditor_city) + auditor_contact_title = models.TextField( + "Title of CPA Contact", + help_text=docs.auditor_title, + ) + auditor_address_line_1 = models.TextField( + "CPA Street Address", + help_text=docs.auditor_street1, + ) + auditor_zip = models.TextField( + "CPA Zip Code", + help_text=docs.auditor_zip_code, + ) + auditor_country = models.TextField("CPA Country", help_text=docs.auditor_country) + auditor_contact_name = models.TextField( + "Name of CPA Contact", + help_text=docs.auditor_contact, + ) + auditor_email = models.TextField( + "CPA mail address (optional)", + help_text=docs.auditor_email, + ) + auditor_firm_name = models.TextField( + "CPA Firm Name", help_text=docs.auditor_firm_name + ) + # Once loaded, would like to add these as regular addresses and just change this to a country field + auditor_foreign_address = models.TextField( + "CPA Address - if international", + help_text=docs.auditor_foreign, + ) + auditor_ein = models.TextField( + "CPA Firm EIN (only available for audit years 2013 and beyond)", + help_text=docs.auditor_ein, + ) + + # Agency + cognizant_agency = models.TextField( + "Two digit Federal agency prefix of the cognizant agency", + 
help_text=docs.cognizant_agency, + null=True, + ) + oversight_agency = models.TextField( + "Two digit Federal agency prefix of the oversight agency", + help_text=docs.oversight_agency, + null=True, + ) + + # Dates + date_created = models.DateField( + "The first date an audit component or Form SF-SAC was received by the Federal audit Clearinghouse (FAC).", + help_text=docs.initial_date_received, + ) + ready_for_certification_date = models.DateField( + "The date at which the audit transitioned to 'ready for certification'", + ) + auditor_certified_date = models.DateField( + "The date at which the audit transitioned to 'auditor certified'", + ) + auditee_certified_date = models.DateField( + "The date at which the audit transitioned to 'auditee certified'", + ) + submitted_date = models.DateField( + "The date at which the audit transitioned to 'submitted'", + ) + fac_accepted_date = models.DateField( + "The date at which the audit transitioned to 'accepted'", + ) + + fy_end_date = models.DateField("Fiscal Year End Date", help_text=docs.fy_end_date) + fy_start_date = models.DateField( + "Fiscal Year Start Date", help_text=docs.fy_start_date + ) + audit_year = models.TextField( + "Audit year from fy_start_date.", + help_text=docs.audit_year_general, + ) + + audit_type = models.TextField( + "Type of Audit", + help_text=docs.audit_type, + ) + + # Audit Info + gaap_results = models.TextField( + "GAAP Results by Auditor", + ) # Concatenation of choices + sp_framework_basis = models.TextField( + "Special Purpose Framework that was used as the basis of accounting", + help_text=docs.sp_framework, + ) + is_sp_framework_required = models.TextField( + "Indicate whether or not the special purpose framework used as basis of accounting by state law or tribal law", + help_text=docs.sp_framework_required, + ) + sp_framework_opinions = models.TextField( + "The auditor's opinion on the special purpose framework", + help_text=docs.type_report_special_purpose_framework, + ) + 
is_going_concern_included = models.TextField( + "Whether or not the audit contained a going concern paragraph on financial statements", + help_text=docs.going_concern, + ) + is_internal_control_deficiency_disclosed = models.TextField( + "Whether or not the audit disclosed a significant deficiency on financial statements", + help_text=docs.significant_deficiency_general, + ) + is_internal_control_material_weakness_disclosed = models.TextField( + help_text=docs.material_weakness_general + ) + is_material_noncompliance_disclosed = models.TextField( + "Whether or not the audit disclosed a material noncompliance on financial statements", + help_text=docs.material_noncompliance, + ) + + is_aicpa_audit_guide_included = models.TextField() + dollar_threshold = models.BigIntegerField( + "Dollar Threshold to distinguish between Type A and Type B programs.", + help_text=docs.dollar_threshold, + ) + is_low_risk_auditee = models.TextField( + "Indicate whether or not the auditee qualified as a low-risk auditee", + help_text=docs.low_risk, + ) + agencies_with_prior_findings = models.TextField( + "List of agencies with prior findings", + ) # Concatenation of agency codes + # End of Audit Info + + entity_type = models.TextField( + "Self reported type of entity (i.e., States, Local Governments, Indian Tribes, Institutions of Higher Education, NonProfit)", + help_text=docs.entity_type, + ) + number_months = models.TextField( + "Number of Months Covered by the 'Other' Audit Period", + help_text=docs.number_months, + ) + audit_period_covered = models.TextField( + "Audit Period Covered by Audit", help_text=docs.period_covered + ) + total_amount_expended = models.BigIntegerField( + "Total Federal Expenditures", + help_text=docs.total_fed_expenditures, + ) + + type_audit_code = models.TextField( + "Determines if audit is A133 or UG", + ) + + is_public = models.BooleanField( + "True for public records, False for non-public records", default=False + ) + # Choices are: GSA, Census, or
TESTDATA + data_source = models.TextField("Data origin; GSA, Census, or TESTDATA") + + # Federal Award Details + additional_award_identification = models.TextField( + "Other data used to identify the award which is not a CFDA number (e.g., program year, contract number)", + help_text=docs.award_identification, + ) + amount_expended = models.BigIntegerField( + "Amount Expended for the Federal Program", + help_text=docs.amount, + ) + award_reference = models.TextField( + "Order that the award line was reported", + ) + cluster_name = models.TextField( + "The name of the cluster", + help_text=docs.cluster_name, + ) + cluster_total = models.BigIntegerField( + "Total Federal awards expended for each individual Federal program is auto-generated by summing the amount expended for all line items with the same Cluster Name", + help_text=docs.cluster_total, + ) + federal_agency_prefix = models.TextField( + "2-char code refers to an agency", + ) + federal_award_extension = models.TextField( + "3-digit extn for a program defined by the agency", + ) + aln = models.TextField( + "2-char agency code concatenated to 3-digit program extn", + ) + federal_program_name = models.TextField( + "Name of Federal Program", + help_text=docs.federal_program_name, + ) + federal_program_total = models.BigIntegerField( + "Total Federal awards expended for each individual Federal program is auto-generated by summing the amount expended for all line items with the same CFDA Prefix and Extension", + help_text=docs.program_total, + ) + findings_count = models.IntegerField( + "Number of findings for the federal program (only available for audit years 2013 and beyond)", + help_text=docs.findings_count, + ) + is_direct = models.TextField( + "Indicate whether or not the award was received directly from a Federal awarding agency", + help_text=docs.direct, + ) + is_loan = models.TextField( + "Indicate whether or not the program is a Loan or Loan Guarantee (only available for audit years 2013 and beyond)", + 
help_text=docs.loans, + ) + is_major = models.TextField( + "Indicate whether or not the Federal program is a major program", + help_text=docs.major_program, + ) + is_passthrough_award = models.TextField( + "Indicates whether or not funds were passed through to any subrecipients for the Federal program", + help_text=docs.passthrough_award, + ) + loan_balance = models.TextField( + "The loan or loan guarantee (loan) balance outstanding at the end of the audit period. A response of ‘N/A’ is acceptable.", + help_text=docs.loan_balance, + ) + audit_report_type = models.TextField( + "Type of Report Issued on the Major Program Compliance", + help_text=docs.type_report_major_program_cfdainfo, + ) + other_cluster_name = models.TextField( + "The name of the cluster (if not listed in the Compliance Supplement)", + help_text=docs.other_cluster_name, + ) + passthrough_amount = models.BigIntegerField( + "Amount passed through to subrecipients", + help_text=docs.passthrough_amount, + null=True, + ) + state_cluster_name = models.TextField( + "The name of the state cluster", + help_text=docs.state_cluster_name, + ) + + # Finding Details + reference_number = models.TextField( + "Findings Reference Numbers", + help_text=docs.finding_ref_nums_findings, + ) + is_material_weakness = models.TextField( + "Material Weakness finding", + help_text=docs.material_weakness_findings, + ) + is_modified_opinion = models.TextField( + "Modified Opinion finding", help_text=docs.modified_opinion + ) + is_other_findings = models.TextField( + "Other findings", help_text=docs.other_findings + ) + is_other_matters = models.TextField( + "Other non-compliance", help_text=docs.other_non_compliance + ) + is_questioned_costs = models.TextField( + "Questioned Costs", help_text=docs.questioned_costs_findings + ) + is_repeat_finding = models.TextField( + "Indicates whether or not the audit finding was a repeat of an audit finding in the immediate prior audit", + help_text=docs.repeat_finding, + ) + 
is_significant_deficiency = models.TextField( + "Significant Deficiency finding", + help_text=docs.significant_deficiency_findings, + ) + prior_finding_ref_numbers = models.TextField( + "Audit finding reference numbers from the immediate prior audit", + help_text=docs.prior_finding_ref_nums, + ) + + # each element in the list is a FK to Finding + type_requirement = models.TextField( + "Type Requirement Failure", + help_text=docs.type_requirement_findings, + ) + + passthrough_id = models.TextField( + "Identifying Number Assigned by the Pass-through Entity", + help_text=docs.passthrough_id, + ) + passthrough_name = models.TextField( + "Name of Pass-through Entity", + help_text=docs.passthrough_name, + ) diff --git a/backend/dissemination/search.py b/backend/dissemination/search.py index b3b12b7e28..bb35f8a562 100644 --- a/backend/dissemination/search.py +++ b/backend/dissemination/search.py @@ -6,6 +6,7 @@ from .searchlib.search_findings import search_findings from .searchlib.search_direct_funding import search_direct_funding from .searchlib.search_major_program import search_major_program +from dissemination.models import DisseminationCombined logger = logging.getLogger(__name__) @@ -16,6 +17,23 @@ # https://books.agiliq.com/projects/django-orm-cookbook/en/latest/subquery.html +def is_only_general_params(params_dict): + params_set = set(list(params_dict.keys())) + gen_set = set( + [ + "audit_years", + "auditee_state", + "names", + "uei_or_eins", + "start_date", + "end_date", + "agency_name", + "cog_or_oversight", + ] + ) + return params_set.issubset(gen_set) + + def search(params): """ Given any (or no) search fields, build and execute a query on the General table and return the results. 
@@ -30,12 +48,18 @@ def search(params): ############## # GENERAL - results = search_general(params) - results = _sort_results(results, params) - results = search_alns(results, params) - results = search_findings(results, params) - results = search_direct_funding(results, params) - results = search_major_program(results, params) + + if is_only_general_params(params): + results = search_general(DisseminationCombined, params) + results = _sort_results(results, params) + else: + results = search_general(DisseminationCombined, params) + results = _sort_results(results, params) + results = search_alns(results, params) + results = search_findings(results, params) + results = search_direct_funding(results, params) + results = search_major_program(results, params) + results = results.distinct("report_id", params.get("order_by", "fac_accepted_date")) t1 = time.time() diff --git a/backend/dissemination/searchlib/search_alns.py b/backend/dissemination/searchlib/search_alns.py index a36122a406..2f50c1f290 100644 --- a/backend/dissemination/searchlib/search_alns.py +++ b/backend/dissemination/searchlib/search_alns.py @@ -63,14 +63,12 @@ def _build_aln_q(full_alns, agency_numbers): q = Q() if agency_numbers: # Build a filter for the agency numbers. E.g. 
given 93 and 45 - q |= Q( - federalaward__federal_agency_prefix__in=[an.prefix for an in agency_numbers] - ) + q |= Q(federal_agency_prefix__in=[an.prefix for an in agency_numbers]) if full_alns: for full_aln in full_alns: - q |= Q(federalaward__federal_agency_prefix=full_aln.prefix) & Q( - federalaward__federal_award_extension=full_aln.program + q |= Q(federal_agency_prefix=full_aln.prefix) & Q( + federal_award_extension=full_aln.program ) return q diff --git a/backend/dissemination/searchlib/search_direct_funding.py b/backend/dissemination/searchlib/search_direct_funding.py index f3fbb69744..c2841f614d 100644 --- a/backend/dissemination/searchlib/search_direct_funding.py +++ b/backend/dissemination/searchlib/search_direct_funding.py @@ -15,9 +15,9 @@ def search_direct_funding(general_results, params): for field in direct_funding_fields: match field: case "direct_funding": - q |= Q(federalaward__is_direct="Y") + q |= Q(is_direct="Y") case "passthrough_funding": - q |= Q(federalaward__is_direct="N") + q |= Q(is_direct="N") case _: pass diff --git a/backend/dissemination/searchlib/search_findings.py b/backend/dissemination/searchlib/search_findings.py index 6d0aab9fbd..3e85f794cd 100644 --- a/backend/dissemination/searchlib/search_findings.py +++ b/backend/dissemination/searchlib/search_findings.py @@ -17,27 +17,27 @@ def search_findings(general_results, params): case "all_findings": # This can be achieved via federalaward__findings_count__gt=0, # But, it's faster to chain ORs in the Finding table than it is to walk the FederalAward table. 
- q |= Q(finding__is_modified_opinion="Y") - q |= Q(finding__is_other_findings="Y") - q |= Q(finding__is_material_weakness="Y") - q |= Q(finding__is_significant_deficiency="Y") - q |= Q(finding__is_other_matters="Y") - q |= Q(finding__is_questioned_costs="Y") - q |= Q(finding__is_repeat_finding="Y") + q |= Q(is_modified_opinion="Y") + q |= Q(is_other_findings="Y") + q |= Q(is_material_weakness="Y") + q |= Q(is_significant_deficiency="Y") + q |= Q(is_other_matters="Y") + q |= Q(is_questioned_costs="Y") + q |= Q(is_repeat_finding="Y") case "is_modified_opinion": - q |= Q(finding__is_modified_opinion="Y") + q |= Q(is_modified_opinion="Y") case "is_other_findings": - q |= Q(finding__is_other_findings="Y") + q |= Q(is_other_findings="Y") case "is_material_weakness": - q |= Q(finding__is_material_weakness="Y") + q |= Q(is_material_weakness="Y") case "is_significant_deficiency": - q |= Q(finding__is_significant_deficiency="Y") + q |= Q(is_significant_deficiency="Y") case "is_other_matters": - q |= Q(finding__is_other_matters="Y") + q |= Q(is_other_matters="Y") case "is_questioned_costs": - q |= Q(finding__is_questioned_costs="Y") + q |= Q(is_questioned_costs="Y") case "is_repeat_finding": - q |= Q(finding__is_repeat_finding="Y") + q |= Q(is_repeat_finding="Y") case _: pass filtered_general_results = general_results.filter(q) diff --git a/backend/dissemination/searchlib/search_general.py b/backend/dissemination/searchlib/search_general.py index e56e29ffb6..97aebe6f5f 100644 --- a/backend/dissemination/searchlib/search_general.py +++ b/backend/dissemination/searchlib/search_general.py @@ -1,55 +1,54 @@ -from django.db.models import Q -from dissemination.models import General import time from math import ceil import logging +from django.db.models import Q logger = logging.getLogger(__name__) -def search_general(params=None): - params = params or {} +def search_general(base_model, params=None): + params = params or dict() # Time general reduction t0 = time.time() 
############## # Initialize query. - r_base = General.objects.all() + r_base = base_model.objects.all() ############## # Audit years # query.add(_get_audit_years_match_query(audit_years), Q.AND) q_audit_year = _get_audit_years_match_query(params.get("audit_years", None)) - r_audit_year = General.objects.filter(q_audit_year) + r_audit_year = base_model.objects.filter(q_audit_year) ############## # State q_state = _get_auditee_state_match_query(params.get("auditee_state", None)) - r_state = General.objects.filter(q_state) + r_state = base_model.objects.filter(q_state) ############## # Names q_names = _get_names_match_query(params.get("names", None)) - r_names = General.objects.filter(q_names) + r_names = base_model.objects.filter(q_names) ############## # UEI/EIN q_uei = _get_uei_or_eins_match_query(params.get("uei_or_eins", None)) - r_uei = General.objects.filter(q_uei) + r_uei = base_model.objects.filter(q_uei) ############## # Start/end dates q_start_date = _get_start_date_match_query(params.get("start_date", None)) - r_start_date = General.objects.filter(q_start_date) + r_start_date = base_model.objects.filter(q_start_date) q_end_date = _get_end_date_match_query(params.get("end_date", None)) - r_end_date = General.objects.filter(q_end_date) + r_end_date = base_model.objects.filter(q_end_date) ############## # Cog/Over q_cogover = _get_cog_or_oversight_match_query( params.get("agency_name", None), params.get("cog_or_oversight", None) ) - r_cogover = General.objects.filter(q_cogover) + r_cogover = base_model.objects.filter(q_cogover) ############## # Intersection @@ -123,31 +122,55 @@ def _get_auditee_state_match_query(auditee_state): return Q(auditee_state__in=[auditee_state]) -def _get_names_match_query(names): +def _get_names_match_query(names_list): """ Given a list of (potential) names, return the query object that searches auditee and firm names. 
""" - if not names: + if not names_list: return Q() name_fields = [ - "auditee_city", + # "auditee_city", "auditee_contact_name", + "auditee_certify_name", "auditee_email", "auditee_name", - "auditee_state", - "auditor_city", + # "auditee_state", + # "auditor_city", "auditor_contact_name", + "auditor_certify_name", "auditor_email", "auditor_firm_name", - "auditor_state", + # "auditor_state", ] names_match = Q() - # turn ["name1", "name2", "name3"] into "name1 name2 name3" - names = " ".join(names) + # The search terms are coming in as a string in a list. + # E.g. the search text "college berea" returns nothing, + # when it should return entries for "Berea College". That is + # because it comes in as + # ["college berea"] + # + # This has to be flattened to a list of singleton terms. + flattened = [] + for term in names_list: + for sub in term.split(): + flattened.append(sub) + + # Now, for each field (e.g. "auditee_contact_name") + # build up an AND over the terms. We want something where all of the + # terms appear. + # Then, do an OR over all of the fields. If that combo appears in + # any of the fields, we want to return it. for field in name_fields: - names_match.add(Q(**{"%s__search" % field: names}), Q.OR) + field_q = Q() + for name in flattened: + field_q.add(Q(**{f"{field}__icontains": name}), Q.AND) + names_match.add(field_q, Q.OR) + + # Now, "college berea" and "university state ohio" return + # the appropriate terms. It is also significantly faster than what + # we had before. 
return names_match diff --git a/backend/dissemination/searchlib/search_major_program.py b/backend/dissemination/searchlib/search_major_program.py index e1caf4cf7b..3dbd241b53 100644 --- a/backend/dissemination/searchlib/search_major_program.py +++ b/backend/dissemination/searchlib/search_major_program.py @@ -16,9 +16,9 @@ def search_major_program(general_results, params): major_program_fields = params.get("major_program", []) if True in major_program_fields: - q |= Q(federalaward__is_major="Y") + q |= Q(is_major="Y") elif False in major_program_fields: - q |= Q(federalaward__is_major="N") + q |= Q(is_major="N") filtered_general_results = general_results.filter(q).distinct() diff --git a/backend/dissemination/sql/create_materialized_views.sql b/backend/dissemination/sql/create_materialized_views.sql new file mode 100644 index 0000000000..12ff646d59 --- /dev/null +++ b/backend/dissemination/sql/create_materialized_views.sql @@ -0,0 +1,179 @@ +CREATE SEQUENCE IF NOT EXISTS dissemination_combined_id_seq + START WITH 1 + INCREMENT BY 1 + NO MINVALUE + NO MAXVALUE; + +----------------------- +-- dissemination_combined +-- This materialized view is used primarily by search.
+CREATE MATERIALIZED VIEW IF NOT EXISTS + dissemination_combined_temp AS + SELECT + nextval('dissemination_combined_id_seq') AS id, + dg.report_id, + dfa.award_reference, + df.reference_number, + -- Build a composite ALN in case we want/need it + concat(dfa.federal_agency_prefix,'.',dfa.federal_award_extension) as aln, + -- All of diss_general as of 20240313 + dg.agencies_with_prior_findings, + dg.audit_period_covered, + dg.audit_type, + dg.audit_year, + dg.auditee_address_line_1, + dg.auditee_certified_date, + dg.auditee_certify_name, + dg.auditee_certify_title, + dg.auditee_city, + dg.auditee_contact_name, + dg.auditee_contact_title, + dg.auditee_ein, + dg.auditee_email, + dg.auditee_name, + dg.auditee_phone, + dg.auditee_state, + dg.auditee_uei, + dg.auditee_zip, + dg.auditor_address_line_1, + dg.auditor_certified_date, + dg.auditor_certify_name, + dg.auditor_certify_title, + dg.auditor_city, + dg.auditor_contact_name, + dg.auditor_contact_title, + dg.auditor_country, + dg.auditor_ein, + dg.auditor_email, + dg.auditor_firm_name, + dg.auditor_foreign_address, + dg.auditor_phone, + dg.auditor_state, + dg.auditor_zip, + dg.cognizant_agency, + dg.data_source, + dg.date_created, + dg.dollar_threshold, + dg.entity_type, + dg.fac_accepted_date, + dg.fy_end_date, + dg.fy_start_date, + dg.gaap_results, + dg.is_additional_ueis, + dg.is_aicpa_audit_guide_included, + dg.is_going_concern_included, + dg.is_internal_control_deficiency_disclosed, + dg.is_internal_control_material_weakness_disclosed, + dg.is_low_risk_auditee, + dg.is_material_noncompliance_disclosed, + dg.is_public, + dg.is_sp_framework_required, + dg.number_months, + dg.oversight_agency, + dg.ready_for_certification_date, + dg.sp_framework_basis, + dg.sp_framework_opinions, + dg.submitted_date, + dg.total_amount_expended, + dg.type_audit_code, + -- All of diss_federalaward + dfa.additional_award_identification, + dfa.amount_expended, + dfa.cluster_name, + dfa.cluster_total, + dfa.federal_agency_prefix, + 
dfa.federal_award_extension, + dfa.federal_program_name, + dfa.federal_program_total, + dfa.findings_count, + dfa.is_direct, + dfa.is_loan, + dfa.is_major, + dfa.is_passthrough_award, + dfa.loan_balance, + dfa.audit_report_type, + dfa.other_cluster_name, + dfa.passthrough_amount, + dfa.state_cluster_name, + -- All of diss_finding + df.is_material_weakness, + df.is_modified_opinion, + df.is_other_findings, + df.is_other_matters, + df.is_questioned_costs, + df.is_repeat_finding, + df.is_significant_deficiency, + df.prior_finding_ref_numbers, + df.type_requirement, + -- ALL of Passthrough + dp.passthrough_name, + dp.passthrough_id + FROM + dissemination_federalaward dfa + LEFT JOIN dissemination_general dg + ON dfa.report_id = dg.report_id + LEFT JOIN dissemination_finding df + ON dfa.report_id = df.report_id + AND dfa.award_reference = df.award_reference + LEFT JOIN dissemination_passthrough dp + ON dfa.report_id = dp.report_id + AND dfa.award_reference = dp.award_reference + ; + + +DROP MATERIALIZED VIEW IF EXISTS dissemination_combined; +ALTER SEQUENCE dissemination_combined_id_seq RESTART; +ALTER MATERIALIZED VIEW dissemination_combined_temp RENAME TO dissemination_combined; + +CREATE INDEX IF NOT EXISTS dc_report_id_idx + on dissemination_combined (report_id); + +CREATE INDEX IF NOT EXISTS dc_auditee_certify_name_idx + ON dissemination_combined + ((lower(auditee_certify_name))); + +CREATE INDEX IF NOT EXISTS dc_auditee_name_idx + ON dissemination_combined + ((lower(auditee_name))); + +CREATE INDEX IF NOT EXISTS dc_auditor_certify_name_idx + ON dissemination_combined + ((lower(auditor_certify_name))); + +CREATE INDEX IF NOT EXISTS dc_auditor_contact_name_idx + ON dissemination_combined + ((lower(auditor_contact_name))); + +CREATE INDEX IF NOT EXISTS dc_auditor_firm_name_idx + ON dissemination_combined + ((lower(auditor_firm_name))); + +CREATE INDEX IF NOT EXISTS dc_auditee_email_idx + on dissemination_combined ((lower(auditee_email))); + +CREATE INDEX IF NOT 
EXISTS dc_auditor_email_idx + on dissemination_combined ((lower(auditor_email))); + +CREATE INDEX IF NOT EXISTS dc_start_date_idx + ON dissemination_combined (fy_start_date); + +CREATE INDEX IF NOT EXISTS dc_end_date_idx + ON dissemination_combined (fy_end_date); + +CREATE INDEX IF NOT EXISTS dc_auditee_uei_idx + ON dissemination_combined (auditee_uei); + +CREATE INDEX IF NOT EXISTS dc_auditee_ein_idx + ON dissemination_combined (auditee_ein); + +CREATE INDEX IF NOT EXISTS dc_federal_agency_prefix_idx + on dissemination_combined (federal_agency_prefix); + +CREATE INDEX IF NOT EXISTS dc_federal_award_extension_idx + on dissemination_combined (federal_award_extension); + +CREATE INDEX IF NOT EXISTS dc_audit_year_idx + on dissemination_combined (audit_year); + +CREATE INDEX IF NOT EXISTS dc_aln_idx + on dissemination_combined (aln); diff --git a/backend/dissemination/sql/drop_materialized_views.sql b/backend/dissemination/sql/drop_materialized_views.sql new file mode 100644 index 0000000000..6b528f795c --- /dev/null +++ b/backend/dissemination/sql/drop_materialized_views.sql @@ -0,0 +1,3 @@ +DROP MATERIALIZED VIEW IF EXISTS dissemination_combined; + +DROP SEQUENCE IF EXISTS dissemination_combined_id_seq; diff --git a/backend/dissemination/sql/refresh_materialized_views.sql b/backend/dissemination/sql/refresh_materialized_views.sql new file mode 100644 index 0000000000..17bbd171c1 --- /dev/null +++ b/backend/dissemination/sql/refresh_materialized_views.sql @@ -0,0 +1 @@ +REFRESH MATERIALIZED VIEW dissemination_combined; diff --git a/backend/dissemination/summary_reports.py b/backend/dissemination/summary_reports.py index a808d74a62..5f17161f3e 100644 --- a/backend/dissemination/summary_reports.py +++ b/backend/dissemination/summary_reports.py @@ -1,8 +1,9 @@ from datetime import datetime -import openpyxl as pyxl import io import logging import uuid +import time +import openpyxl as pyxl from boto3 import client as boto3_client from botocore.client import ClientError, 
Config @@ -24,6 +25,7 @@ Note, Passthrough, SecondaryAuditor, + DisseminationCombined, ) logger = logging.getLogger(__name__) @@ -177,7 +179,7 @@ "award_reference", "federal_agency_prefix", "federal_award_extension", - "_aln", + "aln", "findings_count", "additional_award_identification", "federal_program_name", @@ -197,9 +199,9 @@ ], "finding": [ "report_id", - "_federal_agency_prefix", - "_federal_award_extension", - "_aln", + "federal_agency_prefix", + "federal_award_extension", + "aln", "award_reference", "reference_number", "type_requirement", @@ -265,11 +267,20 @@ cannot_read_tribal_disclaimer = "This document includes one or more Tribal entities that have chosen to keep their data private per 2 CFR 200.512(b)(2). It doesn't include their audit findings text, corrective action plan, or notes to SEFA." -def _get_tribal_report_ids(report_ids): +def _get_model_by_name(name): + for m in models: + if m.__name__.lower() == name: + return m + return None + + +def get_tribal_report_ids(report_ids): """Filters the given report_ids with only ones that are tribal""" + t0 = time.time() objects = General.objects.all().filter(report_id__in=report_ids, is_public=False) - - return [obj.report_id for obj in objects] + objs = [obj.report_id for obj in objects] + t1 = time.time() + return (objs, t1 - t0) def set_column_widths(worksheet): @@ -332,37 +343,10 @@ def insert_dissem_coversheet(workbook, contains_tribal, include_private): def _get_attribute_or_data(obj, field_name): - if field_name.startswith("_"): - if field_name == "_aln": - if isinstance(obj, FederalAward): - return ( - getattr(obj, "federal_agency_prefix") - + "." - + getattr(obj, "federal_award_extension") - ) - elif isinstance(obj, Finding): - fa = FederalAward.objects.get( - report_id=getattr(obj, "report_id"), - award_reference=getattr(obj, "award_reference"), - ) - return ( - getattr(fa, "federal_agency_prefix") - + "." 
- + getattr(fa, "federal_award_extension") - ) - else: - field_name = field_name[1:] - if isinstance(obj, Finding): - fa = FederalAward.objects.get( - report_id=getattr(obj, "report_id"), - award_reference=getattr(obj, "award_reference"), - ) - return getattr(fa, field_name) - else: - value = getattr(obj, field_name) - if isinstance(value, General): - value = value.report_id - return value + value = getattr(obj, field_name) + if isinstance(value, General): + value = value.report_id + return value def gather_report_data_dissemination(report_ids, tribal_report_ids, include_private): @@ -381,40 +365,129 @@ def gather_report_data_dissemination(report_ids, tribal_report_ids, include_priv ... } """ - + t0 = time.time() # Make report IDs unique report_ids = set(report_ids) + all_names = set(field_name_ordered.keys()) + names_in_dc = set(["general", "federalaward", "finding", "passthrough"]) + names_not_in_dc = all_names - names_in_dc + data = initialize_data_structure(names_in_dc.union(names_not_in_dc)) - data = {} + process_combined_results( + report_ids, names_in_dc, data, include_private, tribal_report_ids + ) - for model in models: - model_name = model.__name__.lower() + process_non_combined_results( + report_ids, names_not_in_dc, data, include_private, tribal_report_ids + ) - # This pulls directly from the model - # fields = model._meta.get_fields() - # field_names = [f.name for f in fields] - # This uses the ordered columns above - field_names = field_name_ordered[model_name] + return (data, time.time() - t0) - data[model_name] = {"field_names": field_names, "entries": []} +def initialize_data_structure(names): + data = {} + for model_name in names: + data[model_name] = { + "field_names": field_name_ordered[model_name], + "entries": [], + } + return data + + +def process_combined_results( + report_ids, names_in_dc, data, include_private, tribal_report_ids +): + # Grab all the rows from the combined table into a local structure. + # We'll do this in memory. 
This table flattens general, federalaward, and findings + # so we can move much faster on those tables without extra lookups. + dc_results = DisseminationCombined.objects.all().filter(report_id__in=report_ids) + # Different tables want to be visited/filtered differently. + visited = set() + # Do all of the names in the DisseminationCombined at the same time. + # That way, we only go through the results once. + for obj in dc_results: + for model_name in names_in_dc: + field_names = field_name_ordered[model_name] + report_id = getattr(obj, "report_id") + award_reference = getattr(obj, "award_reference") + reference_number = getattr(obj, "reference_number") + passthrough_name = getattr(obj, "passthrough_name") + + # WATCH THIS IF/ELIF + # It is making sure we do not double-disseminate some rows. + #### + # GENERAL + if model_name == "general" and report_id in visited: + pass + #### + # PASSTHROUGH + # We should never disseminate something that has no name. + elif model_name == "passthrough" and passthrough_name is None: + pass + #### + # FEDERAL AWARD + # This condition is actually filtering out the damage to the + # data from the race hazard we had at the start of 2024. + # NOTE + # We cannot filter `passthrough` here. Each award reference row has + # a one-to-many relationship with passthrough. + elif ( + model_name == "federalaward" + and f"{report_id}-{award_reference}" in visited + ): + pass + #### + # FINDING + elif model_name == "finding" and ( + award_reference is None or reference_number is None + ): + # And we don't include rows in finding where there are none. + pass + else: + # Track to limit duplication + if model_name == "general": + visited.add(report_id) + # Handle special tracking for federal awards, so we don't duplicate award # rows. 
+ if model_name == "federalaward": + visited.add(f"{report_id}-{award_reference}") + # Omit rows for private tribal data when the user doesn't have perms + if ( + model_name in restricted_model_names + and not include_private + and report_id in tribal_report_ids + ): + pass + else: + data[model_name]["entries"].append( + [getattr(obj, field_name) for field_name in field_names] + ) + + +def process_non_combined_results( + report_ids, names_not_in_dc, data, include_private, tribal_report_ids +): + for model_name in names_not_in_dc: + model = _get_model_by_name(model_name) + print(model_name) + field_names = field_name_ordered[model_name] objects = model.objects.all().filter(report_id__in=report_ids) + # Walk the objects for obj in objects: report_id = _get_attribute_or_data(obj, "report_id") - # Omit rows for private tribal data when the user doesn't have perms if ( model_name in restricted_model_names and not include_private and report_id in tribal_report_ids ): - continue - - data[model_name]["entries"].append( - [_get_attribute_or_data(obj, field_name) for field_name in field_names] - ) - - return data + pass + else: + data[model_name]["entries"].append( + [ + _get_attribute_or_data(obj, field_name) + for field_name in field_names + ] + ) def gather_report_data_pre_certification(i2d_data): @@ -492,6 +565,7 @@ def gather_report_data_pre_certification(i2d_data): def create_workbook(data, protect_sheets=False): + t0 = time.time() workbook = pyxl.Workbook() for sheet_name in data.keys(): @@ -517,11 +591,12 @@ def create_workbook(data, protect_sheets=False): # remove sheet that is created during workbook construction workbook.remove_sheet(workbook.get_sheet_by_name("Sheet")) - - return workbook + t1 = time.time() + return (workbook, t1 - t0) def persist_workbook(workbook): + t0 = time.time() s3_client = boto3_client( service_name="s3", region_name=settings.AWS_S3_PRIVATE_REGION_NAME, @@ -552,28 +627,33 @@ def persist_workbook(workbook): except ClientError: 
logger.warn(f"Unable to put summary report file {filename} in S3!") raise - - return f"temp/{filename}" + t1 = time.time() + return (f"temp/{filename}", t1 - t0) def generate_summary_report(report_ids, include_private=False): - tribal_report_ids = _get_tribal_report_ids(report_ids) - data = gather_report_data_dissemination( + t0 = time.time() + (tribal_report_ids, ttri) = get_tribal_report_ids(report_ids) + (data, tgrdd) = gather_report_data_dissemination( report_ids, tribal_report_ids, include_private ) - workbook = create_workbook(data) + (workbook, tcw) = create_workbook(data) insert_dissem_coversheet(workbook, bool(tribal_report_ids), include_private) - filename = persist_workbook(workbook) - + (filename, tpw) = persist_workbook(workbook) + t1 = time.time() + logger.info( + f"SUMMARY_REPORTS generate_summary_report\n\ttotal: {t1-t0} ttri: {ttri} tgrdd: {tgrdd} tcw: {tcw} tpw: {tpw}" + ) return filename +# Ignore performance profiling for the presub. def generate_presubmission_report(i2d_data): data = gather_report_data_pre_certification(i2d_data) - workbook = create_workbook(data, protect_sheets=True) + (workbook, _) = create_workbook(data, protect_sheets=True) insert_precert_coversheet(workbook) workbook.security.workbookPassword = str(uuid.uuid4()) workbook.security.lockStructure = True - filename = persist_workbook(workbook) + (filename, _) = persist_workbook(workbook) return filename diff --git a/backend/dissemination/test_search.py b/backend/dissemination/test_search.py index e022bf2c7f..30fd712ec0 100644 --- a/backend/dissemination/test_search.py +++ b/backend/dissemination/test_search.py @@ -1,7 +1,13 @@ +import os +from django.db import connection from django.test import TestCase - -from dissemination.models import General, FederalAward, Finding -from dissemination.search import search_general, search_alns, search +from dissemination.models import DisseminationCombined, Finding, General, FederalAward +from dissemination.search import ( + search_general, 
+ search_alns, + search, + is_only_general_params, +) from model_bakery import baker @@ -34,6 +40,21 @@ def assert_results_contain_private_and_public(cls, results): class SearchGeneralTests(TestCase): + def test_is_only_general_params_works(self): + params = { + "uei_or_eins": "not_important", + "agency_name": "not_important", + "start_date": "not_important", + } + bad_params = { + "uei_or_eins": "not_important", + "findings": "not_important", + "start_date": "not_important", + } + + self.assertTrue(is_only_general_params(params)) + self.assertFalse(is_only_general_params(bad_params)) + def test_empty_query(self): """ Given empty query parameters, search_general should return all records @@ -44,19 +65,20 @@ def test_empty_query(self): baker.make(General, is_public=True, _quantity=public_count) baker.make(General, is_public=False, _quantity=private_count) - results = search_general() + results = search_general(General) assert_results_contain_private_and_public(self, results) self.assertEqual(len(results), public_count + private_count) def test_name_matches_auditee_name(self): """ - Given an entity name, search_general should return records with a matching auditee_name + Given an entity name, search_general(General) should return records with a matching auditee_name """ auditee_name = "auditeeeeeeee" baker.make(General, is_public=True, auditee_name=auditee_name) results = search_general( + General, {"names": [auditee_name]}, ) @@ -71,6 +93,7 @@ def test_name_matches_auditor_firm_name(self): baker.make(General, is_public=True, auditor_firm_name=auditor_firm_name) results = search_general( + General, {"names": [auditor_firm_name]}, ) @@ -88,7 +111,7 @@ def test_name_multiple(self): baker.make(General, is_public=True, auditee_name="city of bronze") baker.make(General, is_public=True, auditee_name="bronze city") - results = search_general({"names": names}) + results = search_general(General, {"names": names}) assert_all_results_public(self, results) 
self.assertEqual(len(results), 2) @@ -102,7 +125,7 @@ def test_name_matches_inexact(self): ) baker.make(General, is_public=True, auditor_firm_name="not this one") - results = search_general({"names": ["UNIVERSITY"]}) + results = search_general(General, {"names": ["UNIVERSITY"]}) assert_all_results_public(self, results) self.assertEqual(len(results), 1) @@ -116,6 +139,7 @@ def test_uei_or_ein_matches_uei(self): baker.make(General, is_public=True, auditee_uei=auditee_uei) results = search_general( + General, {"uei_or_eins": [auditee_uei]}, ) @@ -130,6 +154,7 @@ def test_uei_or_ein_matches_ein(self): baker.make(General, is_public=True, auditee_ein=auditee_ein) results = search_general( + General, {"uei_or_eins": [auditee_ein]}, ) @@ -151,7 +176,7 @@ def test_uei_or_ein_multiple(self): baker.make(General, is_public=True, auditee_uei="not-looking-for-this-uei") baker.make(General, is_public=True, auditee_ein="not-looking-for-this-ein") - results = search_general({"uei_or_eins": uei_or_eins}) + results = search_general(General, {"uei_or_eins": uei_or_eins}) assert_all_results_public(self, results) self.assertEqual(len(results), 2) @@ -175,10 +200,11 @@ def test_date_range(self): search_end_date = datetime.date(2023, 6, 15) results = search_general( + General, { "start_date": search_start_date, "end_date": search_end_date, - } + }, ) assert_all_results_public(self, results) @@ -201,10 +227,11 @@ def test_cognizant_agency(self): baker.make(General, is_public=True, oversight_agency="01") results = search_general( + General, { "cog_or_oversight": "cog", "agency_name": "01", - } + }, ) assert_all_results_public(self, results) @@ -222,10 +249,11 @@ def test_oversight_agency(self): baker.make(General, is_public=True, oversight_agency="02") results = search_general( + General, { "cog_or_oversight": "oversight", "agency_name": "01", - } + }, ) assert_all_results_public(self, results) @@ -242,25 +270,28 @@ def test_audit_year(self): baker.make(General, is_public=True, 
audit_year="2022") results = search_general( + General, { "audit_years": [2016], - } + }, ) assert_all_results_public(self, results) self.assertEqual(len(results), 0) results = search_general( + General, { "audit_years": [2020], - } + }, ) assert_all_results_public(self, results) self.assertEqual(len(results), 1) results = search_general( + General, { "audit_years": [2020, 2021, 2022], - } + }, ) assert_all_results_public(self, results) self.assertEqual(len(results), 3) @@ -279,22 +310,48 @@ def test_auditee_state(self): ) # there should be on result for AL - results = search_general({"auditee_state": "AL"}) + results = search_general(General, {"auditee_state": "AL"}) assert_all_results_public(self, results) self.assertEqual(len(results), 1) self.assertEqual(results[0], al) # there should be no results for WI - results = search_general({"auditee_state": "WI"}) + results = search_general(General, {"auditee_state": "WI"}) assert_all_results_public(self, results) self.assertEqual(len(results), 0) -class SearchALNTests(TestCase): +class TestMaterializedViewBuilder(TestCase): + def setUp(self): + super().setUp() + self.execute_sql_file("dissemination/sql/create_materialized_views.sql") + + def tearDown(self): + self.execute_sql_file("dissemination/sql/drop_materialized_views.sql") + super().tearDown() + + def execute_sql_file(self, relative_path): + """Execute the SQL commands in the file at the given path.""" + full_path = os.path.join(os.getcwd(), relative_path) + try: + with open(full_path, "r") as file: + sql_commands = file.read() + with connection.cursor() as cursor: + cursor.execute(sql_commands) + except Exception as e: + print(f"Error executing SQL command: {e}") + + def refresh_materialized_view(self): + """Refresh the materialized view""" + self.execute_sql_file("dissemination/sql/refresh_materialized_views.sql") + + +class SearchALNTests(TestMaterializedViewBuilder): def test_aln_search(self): """Given an ALN (or ALNs), search_general should only return 
records with awards under one of these ALNs.""" + prefix_object = baker.make( General, is_public=True, report_id="2022-04-TSTDAT-0000000001" ) @@ -324,28 +381,37 @@ def test_aln_search(self): federal_agency_prefix="00", federal_award_extension="000", ) + self.refresh_materialized_view() # Just a prefix params_prefix = {"alns": ["12"]} - results_general_prefix = search_general(params_prefix) + results_general_prefix = search_general(DisseminationCombined, params_prefix) results_alns_prefix = search_alns(results_general_prefix, params_prefix) self.assertEqual(len(results_alns_prefix), 1) - self.assertEqual(results_alns_prefix[0], prefix_object) + # Check if the prefix_object's report_id is in the results + self.assertIn(prefix_object.report_id, results_alns_prefix[0].report_id) # Prefix + extension params_extention = {"alns": ["98.765"]} - results_general_extention = search_general(params_extention) + results_general_extention = search_general( + DisseminationCombined, params_extention + ) results_alns_extention = search_alns( results_general_extention, params_extention ) self.assertEqual(len(results_alns_extention), 1) - self.assertEqual(results_alns_extention[0], extension_object) + self.assertIn(extension_object.report_id, results_alns_extention[0].report_id) # Both params_both = {"alns": ["12", "98.765"]} - results_general_both = search_general(params_both) + results_general_both = search_general(DisseminationCombined, params_both) results_alns_both = search_alns(results_general_both, params_both) + self.assertEqual(len(results_alns_both), 2) + result_report_ids = set(result.report_id for result in results_alns_both) + self.assertSetEqual( + result_report_ids, {prefix_object.report_id, extension_object.report_id} + ) def test_no_associated_awards(self): """ @@ -354,8 +420,8 @@ def test_no_associated_awards(self): # General record with one award. 
gen_object = baker.make( General, - is_public=True, report_id="2022-04-TSTDAT-0000000001", + is_public=True, audit_year="2024", ) baker.make( @@ -366,9 +432,9 @@ def test_no_associated_awards(self): federal_award_extension="000", findings_count=1, ) - + self.refresh_materialized_view() params = {"alns": ["99"], "audit_years": ["2024"]} - results_general = search_general(params) + results_general = search_general(DisseminationCombined, params) results_alns = search_alns(results_general, params) self.assertEqual(len(results_alns), 0) @@ -500,7 +566,7 @@ def test_alns_no_findings(self): ) -class SearchAdvancedFilterTests(TestCase): +class SearchAdvancedFilterTests(TestMaterializedViewBuilder): def test_search_findings(self): """ When making a search on a particular type of finding, search_general should only return records with a finding of that type. @@ -517,16 +583,26 @@ def test_search_findings(self): # For every field, create a General object with an associated Finding with a 'Y' in that field. 
gen_objects = [] + award_objects = [] finding_objects = [] for field in findings_fields: general = baker.make( General, is_public=True, ) - finding = baker.make(Finding, report_id=general, **field) + award = baker.make( + FederalAward, + report_id=general, + findings_count=1, + award_reference="2023-001", + ) + finding = baker.make( + Finding, report_id=general, award_reference="2023-001", **field + ) finding_objects.append(finding) gen_objects.append(general) - + award_objects.append(award) + self.refresh_materialized_view() # One field returns the one appropriate general params = {"findings": ["is_modified_opinion"]} results = search(params) @@ -563,16 +639,17 @@ def test_search_direct_funding(self): is_public=True, ) baker.make(FederalAward, report_id=general_passthrough, is_direct="N") + self.refresh_materialized_view() params = {"direct_funding": ["direct_funding"]} results = search(params) self.assertEqual(len(results), 1) - self.assertEqual(results[0], general_direct) + self.assertEqual(results[0].report_id, general_direct.report_id) params = {"direct_funding": ["passthrough_funding"]} results = search(params) self.assertEqual(len(results), 1) - self.assertEqual(results[0], general_passthrough) + self.assertEqual(results[0].report_id, general_passthrough.report_id) # One can search on both, even if there's not much reason to. 
params = {"direct_funding": ["direct_funding", "passthrough_funding"]} @@ -594,13 +671,13 @@ def test_search_major_program(self): is_public=True, ) baker.make(FederalAward, report_id=general_non_major, is_major="N") - + self.refresh_materialized_view() params = {"major_program": [True]} results = search(params) self.assertEqual(len(results), 1) - self.assertEqual(results[0], general_major) + self.assertEqual(results[0].report_id, general_major.report_id) params = {"major_program": [False]} results = search(params) self.assertEqual(len(results), 1) - self.assertEqual(results[0], general_non_major) + self.assertEqual(results[0].report_id, general_non_major.report_id) diff --git a/backend/dissemination/test_summary_reports.py b/backend/dissemination/test_summary_reports.py index 176335aaf5..a751c5af6f 100644 --- a/backend/dissemination/test_summary_reports.py +++ b/backend/dissemination/test_summary_reports.py @@ -1,24 +1,28 @@ -from django.test import TestCase - +from dissemination.test_search import TestMaterializedViewBuilder from dissemination.summary_reports import ( can_read_tribal_disclaimer, cannot_read_tribal_disclaimer, gather_report_data_dissemination, generate_summary_report, - _get_tribal_report_ids, + get_tribal_report_ids, insert_dissem_coversheet, ) -from dissemination.models import General, CapText, Note, FindingText +from dissemination.models import FederalAward, General, CapText, Note, FindingText from model_bakery import baker import openpyxl as pyxl -class SummaryReportTests(TestCase): +class SummaryReportTests(TestMaterializedViewBuilder): def test_generate_summary_report_returns_filename(self): """The filename returned should be correctly formatted""" general = baker.make(General, _quantity=100) report_ids = [g.report_id for g in general] + + for g in general: + baker.make(FederalAward, report_id=g) + self.refresh_materialized_view() + filename = generate_summary_report(report_ids) self.assertTrue(filename.startswith, "fac-summary-report-") @@ 
-31,8 +35,16 @@ def test_get_tribal_report_ids(self): public_report_ids = [g.report_id for g in public_general] tribal_report_ids = [g.report_id for g in tribal_general] + for g in public_general: + baker.make(FederalAward, report_id=g) + for g in tribal_general: + baker.make(FederalAward, report_id=g) + + self.refresh_materialized_view() + + (ls, _) = get_tribal_report_ids(public_report_ids + tribal_report_ids) self.assertEqual( - len(_get_tribal_report_ids(public_report_ids + tribal_report_ids)), + len(ls), 2, ) @@ -41,8 +53,14 @@ def test_get_tribal_report_ids_no_tribal(self): public_general = baker.make(General, _quantity=3, is_public=True) public_report_ids = [g.report_id for g in public_general] + for g in public_general: + baker.make(FederalAward, report_id=g) + + self.refresh_materialized_view() + + (ls, _) = get_tribal_report_ids(public_report_ids) self.assertEqual( - len(_get_tribal_report_ids(public_report_ids)), + len(ls), 0, ) @@ -51,8 +69,14 @@ def test_get_tribal_report_ids_no_public(self): tribal_general = baker.make(General, _quantity=2, is_public=False) tribal_report_ids = [g.report_id for g in tribal_general] + for g in tribal_general: + baker.make(FederalAward, report_id=g) + + self.refresh_materialized_view() + + (ls, _) = get_tribal_report_ids(tribal_report_ids) self.assertListEqual( - _get_tribal_report_ids(tribal_report_ids), + ls, tribal_report_ids, ) @@ -100,7 +124,7 @@ def _test_gather_report_data_dissemination_helper(self, include_private): baker.make(FindingText, report_id=tribal_general) # Get the data that constitutes the summary workbook - data = gather_report_data_dissemination( + (data, _) = gather_report_data_dissemination( public_report_ids + tribal_report_ids, tribal_report_ids, include_private, diff --git a/backend/dissemination/test_views.py b/backend/dissemination/test_views.py index fae7e3e540..6d8b74d9f8 100644 --- a/backend/dissemination/test_views.py +++ b/backend/dissemination/test_views.py @@ -10,6 +10,8 @@ 
SingleAuditReportFile, generate_sac_report_id, ) + +from dissemination.test_search import TestMaterializedViewBuilder from dissemination.models import ( General, FederalAward, @@ -146,8 +148,9 @@ def test_private_returns_302_for_permissioned(self, mock_file_exists): self.assertIn(file.filename, response.url) -class SearchViewTests(TestCase): +class SearchViewTests(TestMaterializedViewBuilder): def setUp(self): + super().setUp() self.anon_client = Client() self.auth_client = Client() self.perm_client = Client() @@ -181,7 +184,11 @@ def test_anonymous_returns_private_and_public(self): """Anonymous users should see all reports (public and private included).""" public = baker.make(General, is_public=True, audit_year=2023, _quantity=5) private = baker.make(General, is_public=False, audit_year=2023, _quantity=5) - + for p in public: + baker.make(FederalAward, report_id=p) + for p in private: + baker.make(FederalAward, report_id=p) + self.refresh_materialized_view() response = self.anon_client.post(self._search_url(), {}) self.assertContains(response, "Results: 10") @@ -198,7 +205,11 @@ def test_non_permissioned_returns_private_and_public(self): """Non-permissioned users should see all reports (public and private included).""" public = baker.make(General, is_public=True, audit_year=2023, _quantity=5) private = baker.make(General, is_public=False, audit_year=2023, _quantity=5) - + for p in public: + baker.make(FederalAward, report_id=p) + for p in private: + baker.make(FederalAward, report_id=p) + self.refresh_materialized_view() response = self.auth_client.post(self._search_url(), {}) self.assertContains(response, "Results: 10") @@ -214,6 +225,11 @@ def test_non_permissioned_returns_private_and_public(self): def test_permissioned_returns_all(self): public = baker.make(General, is_public=True, audit_year=2023, _quantity=5) private = baker.make(General, is_public=False, audit_year=2023, _quantity=5) + for p in public: + baker.make(FederalAward, report_id=p) + for p in 
private: + baker.make(FederalAward, report_id=p) + self.refresh_materialized_view() response = self.perm_client.post(self._search_url(), {}) @@ -583,8 +599,9 @@ def test_summary_context(self): ) -class SummaryReportDownloadViewTests(TestCase): +class SummaryReportDownloadViewTests(TestMaterializedViewBuilder): def setUp(self): + super().setUp() self.anon_client = Client() self.perm_client = Client() @@ -613,7 +630,7 @@ def _summary_report_url(self): return reverse("dissemination:MultipleSummaryReportDownload") def _mock_filename(self): - return "some-report-name.xlsx" + return "some-report-name.xlsx", None def _mock_download_url(self): return "http://example.com/gsa-fac-private-s3/temp/some-report-name.xlsx" @@ -633,7 +650,9 @@ def test_empty_results_returns_404(self, mock_persist_workbook): """ Searches with no results should return a 404, not an empty excel file. """ - self._make_general(is_public=False, auditee_uei="123456789012") + general = self._make_general(is_public=False, auditee_uei="123456789012") + baker.make(FederalAward, report_id=general) + self.refresh_materialized_view() response = self.anon_client.post( self._summary_report_url(), {"uei_or_ein": "NotTheOther1"} ) @@ -650,8 +669,11 @@ def test_no_permissions_returns_404_on_private( mock_persist_workbook.return_value = self._mock_filename() mock_get_download_url.return_value = self._mock_download_url() - self._make_general(is_public=False) + general = self._make_general(is_public=False) + baker.make(FederalAward, report_id=general) + self.refresh_materialized_view() response = self.anon_client.post(self._summary_report_url(), {}) + mock_persist_workbook.assert_called_once() self.assertRedirects( response, self._mock_download_url(), @@ -666,14 +688,16 @@ def test_permissions_returns_file_on_private( self, mock_persist_workbook, mock_get_download_url ): """ - Permissioned users recieve a file if there are private results. + Permissioned users receive a file if there are private results. 
""" mock_persist_workbook.return_value = self._mock_filename() mock_get_download_url.return_value = self._mock_download_url() - self._make_general(is_public=False) - + general = self._make_general(is_public=False) + baker.make(FederalAward, report_id=general) + self.refresh_materialized_view() response = self.perm_client.post(self._summary_report_url(), {}) + mock_persist_workbook.assert_called_once() self.assertRedirects( response, self._mock_download_url(), @@ -693,9 +717,12 @@ def test_empty_search_params_returns_file( mock_persist_workbook.return_value = self._mock_filename() mock_get_download_url.return_value = self._mock_download_url() - self._make_general(is_public=True) + general = self._make_general(is_public=True) + baker.make(FederalAward, report_id=general) + self.refresh_materialized_view() response = self.anon_client.post(self._summary_report_url(), {}) + mock_persist_workbook.assert_called_once() self.assertRedirects( response, self._mock_download_url(), @@ -716,13 +743,16 @@ def test_many_results_returns_file( mock_get_download_url.return_value = self._mock_download_url() for i in range(4): - self._make_general( + general = self._make_general( is_public=True, report_id=generate_sac_report_id(end_date="2023-12-31", count=str(i)), ) + baker.make(FederalAward, report_id=general) + self.refresh_materialized_view() with self.settings(SUMMARY_REPORT_DOWNLOAD_LIMIT=2): response = self.anon_client.post(self._summary_report_url(), {}) + mock_persist_workbook.assert_called_once() self.assertRedirects( response, self._mock_download_url(), diff --git a/backend/dissemination/views.py b/backend/dissemination/views.py index 573f64ce5a..fe65785202 100644 --- a/backend/dissemination/views.py +++ b/backend/dissemination/views.py @@ -31,7 +31,9 @@ AdditionalUei, OneTimeAccess, ) + from dissemination.summary_reports import generate_summary_report + from support.decorators import newrelic_timing_metric from users.permissions import can_read_tribal @@ -382,7 +384,6 @@ 
def post(self, request): if len(results) == 0: raise Http404("Cannot generate summary report. No results found.") report_ids = [result.report_id for result in results] - filename = generate_summary_report(report_ids, include_private) download_url = get_download_url(filename) diff --git a/backend/support/api/admin_api_v1_1_0/create_access_tables.sql b/backend/support/api/admin_api_v1_1_0/create_access_tables.sql index ce9dda43ee..785ad81064 100644 --- a/backend/support/api/admin_api_v1_1_0/create_access_tables.sql +++ b/backend/support/api/admin_api_v1_1_0/create_access_tables.sql @@ -6,6 +6,7 @@ -- This is because administrative keys can read/write -- to some tables in the database. They can read internal and -- in-flight data. + DROP TABLE IF EXISTS support_administrative_key_uuids; CREATE TABLE support_administrative_key_uuids @@ -45,3 +46,4 @@ INSERT INTO support_administrative_key_uuids '2023-12-08' ) ; + diff --git a/backend/support/api/admin_api_v1_1_0/drop_views.sql b/backend/support/api/admin_api_v1_1_0/drop_views.sql index 8e8e9b55ab..41236e55d7 100644 --- a/backend/support/api/admin_api_v1_1_0/drop_views.sql +++ b/backend/support/api/admin_api_v1_1_0/drop_views.sql @@ -1,6 +1,6 @@ begin; - drop table if exists admin_api_v1_1_0.audit_access; + drop view if exists admin_api_v1_1_0.audit_access; commit;