Improved add_uid function (#178)
Closes #177
jcadam14 authored May 6, 2024
1 parent 88a4d2e commit 25bdc17
Showing 1 changed file with 5 additions and 5 deletions.
10 changes: 5 additions & 5 deletions src/regtech_data_validator/create_schemas.py
@@ -172,12 +172,12 @@ def validate(schema: DataFrameSchema, submission_df: pd.DataFrame) -> tuple[bool
 def add_uid(results_df: pd.DataFrame, submission_df: pd.DataFrame) -> pd.DataFrame:
     if results_df.empty:
         return results_df
-    all_uids = []
-    sub_uids = submission_df['uid'].tolist()
-    for index, row in results_df.iterrows():
-        all_uids.append(sub_uids[int(row['record_no']) - 1])
 
-    results_df.insert(1, "uid", all_uids, True)
+    # uses pandas column operation to get list of record_no - 1 values, which would be indexes in the submission, since
+    # record_no is index offset by 1, and the uid column values for that into a new series that is then
+    # assigned to the results uid column. Much simpler and faster than looping over and assigning row by row.
+
+    results_df['uid'] = submission_df.loc[results_df['record_no'] - 1, 'uid'].values
     return results_df
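
Reviewer note: the vectorized lookup in this change can be tried on toy data with the sketch below. It is not the project's code. The name `add_uid_sketch`, the `validation_id` column, and the sample values are made up for illustration, and the `astype(int)` cast is an addition that mirrors the `int(...)` call in the removed loop. It also assumes `submission_df` has the default 0-based integer index, because `.loc` selects by label rather than by position.

```python
import pandas as pd


def add_uid_sketch(results_df: pd.DataFrame, submission_df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical stand-in for add_uid, for illustration only."""
    if results_df.empty:
        return results_df
    # record_no is 1-based, so subtracting 1 gives the 0-based row label.
    # Assumption: submission_df uses the default RangeIndex, since .loc is label-based.
    labels = results_df["record_no"].astype(int) - 1
    # .values drops the submission index so the uids align with results_df by position.
    results_df["uid"] = submission_df.loc[labels, "uid"].values
    return results_df


# Toy data (made up): three submitted records, three findings that reference them.
submission_df = pd.DataFrame({"uid": ["A1", "B2", "C3"]})
results_df = pd.DataFrame(
    {"validation_id": ["E0001", "E0002", "E0001"], "record_no": [3, 1, 3]}
)

out = add_uid_sketch(results_df, submission_df)
print(out["uid"].tolist())  # ['C3', 'A1', 'C3']
```

One behavioral difference worth keeping in mind: the removed `results_df.insert(1, "uid", ...)` call placed `uid` as the second column, while plain column assignment appends `uid` as the last column when it does not already exist.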

