2883 update logic in notes to sefa workbook generator to use census models (#2899)

* #2883 Updated logic to use Census models

* #2883 Updated workbooks

* Update backend/census_historical_migration/workbooklib/notes_to_sefa.py

Co-authored-by: Phil Dominguez <[email protected]>

* Update backend/census_historical_migration/test_core_xforms.py

Co-authored-by: Phil Dominguez <[email protected]>

---------

Co-authored-by: Phil Dominguez <[email protected]>
sambodeme and phildominguez-gsa authored Nov 30, 2023
1 parent 786fee7 commit f698ca8
Showing 45 changed files with 160 additions and 65 deletions.
Binary file not shown.
@@ -1236,7 +1236,7 @@
"report_id": "2022-06-CENSUS-0000180818",
"rows": [],
"singletons": {
"accounting_policies": "The accompanying schedule of expenditures of federal awards (the Schedule) includes the federal award activity of the Board of Education of Queen Annes County, Maryland under programs of the federal government for the year ended June 30, 2022.The information in this Schedule is presented in accordance with the requirements of Title 2 U.S. Code of Federal Regulations Part 200, Uniform Administrative Requirements, Cost Principles, and Audit Requirements for Federal Awards (Uniform Guidance).Because the Schedule presents only a selected portion of the operations of the Board of Education of Queen Annes County, Maryland it is not intended to and does not present the financial position, changes in net assets, or cash flows of the Board of Education of Queen Annes County, Maryland.Expenditures reported on the Schedule are reported on the modified accrual basis of accounting.Such expenditures are recognized following the cost principles contained in the Uniform Guidance, wherein certain types of expenditures are not allowable or are limited as to reimbursement.",
"accounting_policies": "The accompanying schedule of expenditures of federal awards (the Schedule) includes the federal award activity of the Board of Education of Queen Annes County, Maryland under programs of the federal government for the year ended June 30, 2022. The information in this Schedule is presented in accordance with the requirements of Title 2 U.S. Code of Federal Regulations Part 200, Uniform Administrative Requirements, Cost Principles, and Audit Requirements for Federal Awards (Uniform Guidance). Because the Schedule presents only a selected portion of the operations of the Board of Education of Queen Annes County, Maryland it is not intended to and does not present the financial position, changes in net assets, or cash flows of the Board of Education of Queen Annes County, Maryland. Expenditures reported on the Schedule are reported on the modified accrual basis of accounting. Such expenditures are recognized following the cost principles contained in the Uniform Guidance, wherein certain types of expenditures are not allowable or are limited as to reimbursement.",
"auditee_uei": "XQGMGEP47M41",
"is_minimis_rate_used": "N",
"rate_explained": "The auditee did not use the de minimis cost rate."
Binary file not shown.
@@ -2200,7 +2200,7 @@
],
"values": [
"Loan/loan guarantee outstanding balances",
"Loans outstanding at the beginning of the year are included in the Federal Expenditures presented in the schedule. The balance of the loans outstanding at December 31, 2022 consists of: RURAL RENTAL HOUSING LOANS (10.415) - Balances outstanding at the end of the audit period were 9585530. SECTION 515 MULTI-FAMILY HOUSING PRESERVATION REVOLOVING LOAN FUND DEMONSTRATION PROGRAM (10.415) - Balances outstanding at the end of the audit period were 1675039. THE RURAL DEVELOPMENT (RD) MULTI-FAMILY HOUSINGREVITALIZATION DEMONSTRATION PROGRAM(MPR) (10.447) - Balances outstanding at the end of the audit period were 866122. MORTGAGE INSURANCE FOR THE PURCHASE OR REFINANCING OF EXISTING MULTIFAMILY HOUSING PROJECTS (14.155) - Balances outstanding at the end of the audit period were 20503643. SUPPORTIVE HOUSING FOR THE ELDERLY (14.157) - Balances outstanding at the end of the audit period were 660803. HOME INVESTMENT PARTNERSHIPS PROGRAM (14.239) - Balances outstanding at the end of the audit period were 3048996. ASSISTED HOUSING STABILITY AND ENERGY AND GREEN RETROFIT INVESTMENTS PROGRAM (RECOVERY ACT FUNDED) (14.318) - Balances outstanding at the end of the audit period were 970138."
"Loans outstanding at the beginning of the year are included in the Federal Expenditures presented in the schedule. The balance of the loans outstanding at December 31, 2022 consists of: RURAL RENTAL HOUSING LOANS (10.415) - Balances outstanding at the end of the audit period were 9585530. SECTION 515 MULTI-FAMILY HOUSING PRESERVATION REVOLOVING LOAN FUND DEMONSTRATION PROGRAM (10.415) - Balances outstanding at the end of the audit period were 1675039. THE RURAL DEVELOPMENT (RD) MULTI-FAMILY HOUSING REVITALIZATION DEMONSTRATION PROGRAM (MPR) (10.447) - Balances outstanding at the end of the audit period were 866122. MORTGAGE INSURANCE FOR THE PURCHASE OR REFINANCING OF EXISTING MULTIFAMILY HOUSING PROJECTS (14.155) - Balances outstanding at the end of the audit period were 20503643. SUPPORTIVE HOUSING FOR THE ELDERLY (14.157) - Balances outstanding at the end of the audit period were 660803. HOME INVESTMENT PARTNERSHIPS PROGRAM (14.239) - Balances outstanding at the end of the audit period were 3048996. ASSISTED HOUSING STABILITY AND ENERGY AND GREEN RETROFIT INVESTMENTS PROGRAM (RECOVERY ACT FUNDED) (14.318) - Balances outstanding at the end of the audit period were 970138."
]
},
{
@@ -2215,7 +2215,7 @@
}
],
"singletons": {
"accounting_policies": "The accompanying Schedule of Expenditures of Federal Awards (the Schedule) includes the federal award activity of Wisconsin Housing Preservation Corp. & Subsidiaries under programs of the federal government for the year ended December 31, 2022 and is presented on the accrual basis of accounting. The information in this Schedule is presented in accordance with the requirements of Title 2 U.S. Code of Federal Regulations (CFR) Part 200, Uniform Administrative Requirements, Cost Principles, and Audit Requirements for Federal Awards (Uniform Guidance).Because the Schedule presents only a selected portion of the operations of Wisconsin Housing Preservation Corp. & Subsidiaries, it is not intended to and does not present the financial position, changes in net assets, or cash flows of Wisconsin Housing Preservation Corp. & Subsidiaries. Expenditures reported on the Schedule are recognized following the cost principles contained in the Uniform Guidance, wherein certain types of expenditures are not allowable or are limited as to reimbursement.",
"accounting_policies": "The accompanying Schedule of Expenditures of Federal Awards (the Schedule) includes the federal award activity of Wisconsin Housing Preservation Corp. & Subsidiaries under programs of the federal government for the year ended December 31, 2022 and is presented on the accrual basis of accounting. The information in this Schedule is presented in accordance with the requirements of Title 2 U.S. Code of Federal Regulations (CFR) Part 200, Uniform Administrative Requirements, Cost Principles, and Audit Requirements for Federal Awards (Uniform Guidance). Because the Schedule presents only a selected portion of the operations of Wisconsin Housing Preservation Corp. & Subsidiaries, it is not intended to and does not present the financial position, changes in net assets, or cash flows of Wisconsin Housing Preservation Corp. & Subsidiaries. Expenditures reported on the Schedule are recognized following the cost principles contained in the Uniform Guidance, wherein certain types of expenditures are not allowable or are limited as to reimbursement.",
"auditee_uei": "KVLYDC1KL4W7",
"is_minimis_rate_used": "N",
"rate_explained": "The auditee did not use the de minimis cost rate."
Binary file not shown.
@@ -356,9 +356,9 @@
}
],
"singletons": {
"accounting_policies": "Expenses reported on the Schedule are reported on the accrual basis of accounting. Such expenses are recognized following the cost principles contained in the Uniform Guidance, wherein certain types of expenses are not allowable or are limited as to reimbursement.The amounts passed through to subrecipients, if any, are reported on in this schedule when disbursed in accordance with 2 CFR 200.502(a), which differs from the accrual basis of accounting used under generally accepted accounting principles.",
"accounting_policies": "Expenses reported on the Schedule are reported on the accrual basis of accounting. Such expenses are recognized following the cost principles contained in the Uniform Guidance, wherein certain types of expenses are not allowable or are limited as to reimbursement.The amounts passed through to subrecipients, if any, are reported on in this schedule when disbursed in accordance with 2 CFR \u00a7200.502(a), which differs from the accrual basis of accounting used under generally accepted accounting principles.",
"auditee_uei": "GM27HGUL61B5",
"is_minimis_rate_used": "Both",
"is_minimis_rate_used": "N",
"rate_explained": "The District did not charge indirect costs to its federal programs."
}
},
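Note on the hunk above: the section sign (`\u00a7`) now survives in `accounting_policies`. That is consistent with the accounting policies text no longer passing through the ASCII-only cleanup idiom found in notes_to_sefa.py, although the commit does not state this explicitly. A minimal sketch of what that encode/decode idiom does to such text; the helper name `strip_non_ascii` is hypothetical and not part of the repository:

```python
def strip_non_ascii(text):
    """Drop every character that cannot be represented in ASCII."""
    # Encoding to UTF-8 turns the section sign into two bytes >= 0x80;
    # decoding as ASCII with errors="ignore" silently discards them.
    return text.encode("utf-8").decode("ascii", "ignore")


original = "in accordance with 2 CFR \u00a7200.502(a)"
stripped = strip_non_ascii(original)
print(stripped)
```

The stripped form matches the older fixture value, which is why a lossy cleanup of this kind is worth recording as a transformation when it is applied.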
Binary file not shown.
48 changes: 48 additions & 0 deletions backend/census_historical_migration/test_core_xforms.py
@@ -0,0 +1,48 @@
from django.test import SimpleTestCase
from .exception_utils import DataMigrationError

from .workbooklib.notes_to_sefa import xform_is_minimis_rate_used


class TestXformIsMinimisRateUsed(SimpleTestCase):
    def test_rate_used(self):
        """Test that the function returns 'Y' when the rate is used."""
        self.assertEqual(
            # Ensure extra whitespace is acceptable
            xform_is_minimis_rate_used("The auditee used the de minimis cost rate."),
            "Y",
        )

        self.assertEqual(
            xform_is_minimis_rate_used(
                "The School has elected to use the 10-percent de minimis indirect cost rate as allowed under the Uniform Guidance."
            ),
            "Y",
        )

    def test_rate_not_used(self):
        """Test that the function returns 'N' when the rate is not used."""
        self.assertEqual(
            xform_is_minimis_rate_used(
                "The auditee did not use the de minimis cost rate."
            ),
            "N",
        )
        self.assertEqual(
            xform_is_minimis_rate_used(
                "The Board has elected not to use the 10 percent de minimus indirect cost as allowed under the Uniform Guidance."
            ),
            "N",
        )

    def test_ambiguous_or_unclear_raises_exception(self):
        """Test that the function raises an exception when rate usage is ambiguous or unclear."""
        with self.assertRaises(DataMigrationError):
            xform_is_minimis_rate_used(
                "The information regarding the de minimis rate is not clear."
            )

        with self.assertRaises(DataMigrationError):
            xform_is_minimis_rate_used(
                "It is unknown whether the de minimis rate was applied."
            )
167 changes: 107 additions & 60 deletions backend/census_historical_migration/workbooklib/notes_to_sefa.py
@@ -1,12 +1,15 @@
from census_historical_migration.workbooklib.excel_creation_utils import (
    set_uei,
from ..exception_utils import DataMigrationError
from ..transforms.xform_string_to_string import string_to_string
from ..models import ELECNOTES as Notes
from ..workbooklib.excel_creation_utils import (
    get_audit_header,
    set_range,
    map_simple_columns,
    generate_dissemination_test_table,
    set_workbook_uei,
)
from census_historical_migration.base_field_maps import SheetFieldMap
from census_historical_migration.workbooklib.templates import sections_to_template_paths
from census_historical_migration.workbooklib.census_models.census import dynamic_import
from ..base_field_maps import SheetFieldMap
from ..workbooklib.templates import sections_to_template_paths
from audit.fixtures.excel import FORM_SECTIONS

import openpyxl as pyxl
@@ -18,91 +21,135 @@
logger = logging.getLogger(__name__)

mappings = [
    SheetFieldMap("note_title", "title", "title", None, str),
    SheetFieldMap("note_content", "content", "content", None, str),
    SheetFieldMap("note_title", "TITLE", "title", None, str),
    SheetFieldMap("note_content", "CONTENT", "content", None, str),
]


def cleanup_string(s):
    if s is None:
        return ""
    else:
        s = s.rstrip()
        # s = unidecode.unidecode(s)
        s = str(s.encode("utf-8").decode("ascii", "ignore"))
        return s
def xform_cleanup_string(s):
    """Transforms a string to a string, cleaning up unicode characters."""
    value = string_to_string(s)
    if value:
        # FIXME-MSHD: This is a transformation that we may want to record
        return str(value.encode("utf-8").decode("ascii", "ignore"))
    return ""


def generate_notes_to_sefa(dbkey, year, outfile):
    logger.info(f"--- generate notes to sefa {dbkey} {year}---")
    Gen = dynamic_import("Gen", year)
    Notes = dynamic_import("Notes", year)
    wb = pyxl.load_workbook(sections_to_template_paths[FORM_SECTIONS.NOTES_TO_SEFA])
def xform_is_minimis_rate_used(rate_content):
    """Determines if the de minimis rate was used based on the given text."""

    g = set_uei(Gen, wb, dbkey)
    # WARNING: ANY RESULTS FROM THIS FUNCTION MUST BE RECORDED AS A TRANSFORMATION
    # We're asking a Y/N question in the collection.
    # Census just let them type some stuff. This is an
    # attempt to generate a Y/N value from the content.
    # This means the data is *not* true to what was intended, but
    # it *is* good enough for us to use for testing.

    # The mapping is weird.
    # Patterns that indicate the de minimis rate was NOT used
    not_used_patterns = [
        r"did\s+not\s+use",
        r"not\s+to\s+use",
        r"not\s+use",
        r"not\s+elected",
        r"elected\s+not\s+to\s+use",
        r"does\s+not\s+use",
        r"has\s+not\s+elected",
        r"has\s+not\s+charged.*not\s+applicable",
        r"did\s+not\s+charge\s+indirect\s+costs",  # FIXME-MSHD: Is this correct? see dbkey: 251020 year:22
    ]

    # Patterns that indicate the de minimis rate WAS used
    used_patterns = [r"used", r"elected\s+to\s+use", r"uses.*allowed"]

    # Check for each pattern in the respective lists
    for pattern in not_used_patterns:
        if re.search(pattern, rate_content, re.IGNORECASE):
            # FIXME-MSHD: RECORD THIS TRANSFORMATION
            return "N"
    for pattern in used_patterns:
        if re.search(pattern, rate_content, re.IGNORECASE):
            # FIXME-MSHD: RECORD THIS TRANSFORMATION
            return "Y"

    # I am raising an exception here because we cannot clearly determine if the de minimis rate was used.
    # return "Both"
    raise DataMigrationError("Unable to determine if the de minimis rate was used.")


def _get_accounting_policies(dbkey):
    # https://facdissem.census.gov/Documents/DataDownloadKey.xlsx
    # The TYPEID column determines which field in the form a given row corresponds to.
    # TYPEID=1 is the description of significant accounting policies.
    """Get the accounting policies for a given dbkey."""
    try:
        note = Notes.objects.get(DBKEY=dbkey, TYPE_ID="1")
        content = string_to_string(note.CONTENT)
    except Notes.DoesNotExist:
        logger.info(f"No accounting policies found for dbkey: {dbkey}")
        content = ""
    return content


def _get_minimis_cost_rate(dbkey):
    """Get the De Minimis cost rate for a given dbkey."""
    # https://facdissem.census.gov/Documents/DataDownloadKey.xlsx
    # The TYPEID column determines which field in the form a given row corresponds to.
    # TYPEID=2 is the De Minimis cost rate.
    try:
        note = Notes.objects.get(DBKEY=dbkey, TYPE_ID="2")
        rate = string_to_string(note.CONTENT)
    except Notes.DoesNotExist:
        logger.info(f"De Minimis cost rate not found for dbkey: {dbkey}")
        rate = ""
    return rate


def _get_notes(dbkey):
    """Get the notes for a given dbkey."""
    # https://facdissem.census.gov/Documents/DataDownloadKey.xlsx
    # The TYPEID column determines which field in the form a given row corresponds to.
    # TYPEID=3 is for notes, which have sequence numbers... that must align somewhere.
    policies = (
        Notes.select().where((Notes.dbkey == g.dbkey) & (Notes.type_id == 1)).get()
    )
    rate = Notes.select().where((Notes.dbkey == g.dbkey) & (Notes.type_id == 2)).get()
    notes = (
        Notes.select()
        .where((Notes.dbkey == g.dbkey) & (Notes.type_id == 3))
        .order_by(Notes.seq_number)
    )
    return Notes.objects.filter(DBKEY=dbkey, TYPE_ID="3").order_by("SEQ_NUMBER")

    rate_content = cleanup_string(rate.content)
    policies_content = cleanup_string(policies.content)

    if rate_content == "":
        rate_content = "FILLED FOR TESTING"
    if policies_content == "":
        policies_content = "FILLED FOR TESTING"
def generate_notes_to_sefa(dbkey, year, outfile):
    """
    Generates notes to SEFA workbook for a given dbkey.
    """
    logger.info(f"--- generate notes to sefa {dbkey} {year}---")

    # WARNING
    # This is being faked. We're asking a Y/N question in the collection.
    # Census just let them type some stuff. So, this is a rough
    # attempt to generate a Y/N value from the content.
    # This means the data is *not* true to what was intended, but
    # it *is* good enough for us to use for testing.
    is_used = "Huh"
    if (
        re.search("did not use", rate_content)
        or re.search("not to use", rate_content)
        or re.search("not use", rate_content)
        or re.search("not elected", rate_content)
    ):
        is_used = "N"
    elif re.search("used", rate_content):
        is_used = "Y"
    else:
        is_used = "Both"
    wb = pyxl.load_workbook(sections_to_template_paths[FORM_SECTIONS.NOTES_TO_SEFA])

    audit_header = get_audit_header(dbkey)
    set_workbook_uei(wb, audit_header.UEI)

    notes = _get_notes(dbkey)
    rate_content = _get_minimis_cost_rate(dbkey)
    policies_content = _get_accounting_policies(dbkey)
    is_minimis_rate_used = xform_is_minimis_rate_used(rate_content)

    set_range(wb, "accounting_policies", [policies_content])
    set_range(wb, "is_minimis_rate_used", [is_used])
    set_range(wb, "is_minimis_rate_used", [is_minimis_rate_used])
    set_range(wb, "rate_explained", [rate_content])

    # Map the rest as notes.
    map_simple_columns(wb, mappings, notes)

    # Add a Y/N column
    # def set_range(wb, range_name, values, default=None, conversion_fun=str):

    # FIXME-MSHD: We do not have a match for contains_chart_or_table in historical data ?
    # If there is no match in historic data, then this is not a transformation.
    # Should this be recorded ?
    set_range(wb, "contains_chart_or_table", map(lambda v: "N", notes), "N", str)
    wb.save(outfile)

    table = generate_dissemination_test_table(
        Gen, "notes_to_sefa", dbkey, mappings, notes
        audit_header, "notes_to_sefa", dbkey, mappings, notes
    )

    table["singletons"]["accounting_policies"] = policies_content
    table["singletons"]["is_minimis_rate_used"] = is_used
    table["singletons"]["is_minimis_rate_used"] = is_minimis_rate_used
    table["singletons"]["rate_explained"] = rate_content
    table["singletons"]["auditee_uei"] = g.uei
    table["singletons"]["auditee_uei"] = audit_header.UEI

    return (wb, table)
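
Note on the classification logic in this diff: `xform_is_minimis_rate_used` checks the negative patterns before the positive ones, which matters because a phrase such as "has not used" also matches the bare `used` pattern. The following is a minimal standalone sketch of that ordering, not the repository code. The `DataMigrationError` class here is a stand-in (assumed to behave like a plain exception), the function name `classify_de_minimis` is hypothetical, and only a subset of the patterns from the diff is included.

```python
import re


class DataMigrationError(Exception):
    """Stand-in for the project's exception class (assumed to be a plain Exception)."""


# Subset of the negative patterns from the diff; checked first so that
# "has not used" is not misread as a positive match on "used".
NOT_USED_PATTERNS = [
    r"did\s+not\s+use",
    r"not\s+to\s+use",
    r"not\s+use",
    r"not\s+elected",
    r"did\s+not\s+charge\s+indirect\s+costs",
]
USED_PATTERNS = [r"used", r"elected\s+to\s+use", r"uses.*allowed"]


def classify_de_minimis(text):
    """Return "N" or "Y", or raise if neither pattern list matches."""
    if any(re.search(p, text, re.IGNORECASE) for p in NOT_USED_PATTERNS):
        return "N"
    if any(re.search(p, text, re.IGNORECASE) for p in USED_PATTERNS):
        return "Y"
    raise DataMigrationError("Unable to determine if the de minimis rate was used.")


print(classify_de_minimis("The auditee has not used the de minimis rate."))
```

Under this ordering, the fixture text "The District did not charge indirect costs to its federal programs." classifies as "N", in line with the fixture change from "Both" to "N" earlier in the commit. Text that matches neither list raises instead of falling back to "Both".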
