read_flat_vidrl: add column map to script
The column map will be more complicated with the need to ingest two
slightly different flat files (_flat_file.csv and _reference_panel.csv)
as discussed in #161 (comment).

I also found myself constantly toggling back and forth between the
separate column_map.tsv and the upload script to figure out how the
columns are being used, so it makes more sense to just hard-code the
column map in the script.
joverlee521 committed Nov 5, 2024
1 parent 63af127 commit 27757d8
Showing 2 changed files with 10 additions and 16 deletions.
6 changes: 0 additions & 6 deletions source-data/vidrl_flat_file_column_map.tsv

This file was deleted.

20 changes: 10 additions & 10 deletions tdb/vidrl_upload.py
@@ -54,15 +54,6 @@
}
}

-def parse_tsv_mapping_to_dict(tsv_file):
-    map_dict = {}
-    with open(tsv_file, 'r') as f:
-        for line in f:
-            (key, value) = line.split('\t')
-            key = key.lower()
-            map_dict[key] = value.rstrip('\n')
-    return map_dict
-

def parse_human_serum_references(human_serum_data, subtype):
"""
@@ -320,7 +311,16 @@ def read_flat_vidrl(path, fstem, assay_type):
     Read the flat CSV file with *fstem* in the provided *path* and convert
     to the expected TSV file at `data/tmp/<fstem>.tsv` for tdb/elife_upload.
     """
-    column_map = parse_tsv_mapping_to_dict("source-data/vidrl_flat_file_column_map.tsv")
+    # The new column names need to be one of the ELIFE_COLUMNS in order to be
+    # included in the temporary output file that's then passed to elife_upload.py
+    column_map = {
+        "virus": "virus_strain",
+        "virus.passage": "virus_passage",
+        "antisera.passage": "serum_passage",
+        "ferret": "serum_id",
+        "value": "titer",
+        "antisera.name": "serum_strain"
+    }
     filepath = path + fstem + ".csv"
 
     titer_measurements = pd.read_csv(filepath, usecols=column_map.keys()) \
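
The hunk is cut off before it shows how `column_map` is applied, so the following is only a rough sketch of the idea: keep the mapped flat-file columns and rename them to the names the downstream upload expects. It uses the standard-library `csv` module rather than pandas so that it runs without dependencies; the function name `select_and_rename`, the sample rows, and the extra `notes` column are hypothetical and are not part of the commit.

```python
import csv
import io

# Column map copied from the diff above: flat-file header -> ELIFE column name.
COLUMN_MAP = {
    "virus": "virus_strain",
    "virus.passage": "virus_passage",
    "antisera.passage": "serum_passage",
    "ferret": "serum_id",
    "value": "titer",
    "antisera.name": "serum_strain",
}


def select_and_rename(csv_text, column_map):
    """Keep only the mapped columns of each CSV row and rename them.

    Hypothetical helper for illustration; the real script uses pandas.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    # Fail early if the flat file lacks any column the map expects.
    missing = set(column_map) - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"missing expected columns: {sorted(missing)}")
    return [
        {new: row[old] for old, new in column_map.items()}
        for row in reader
    ]


# Made-up sample input; "notes" stands in for any unmapped column.
SAMPLE = (
    "virus,virus.passage,antisera.name,antisera.passage,ferret,value,notes\n"
    "A/Example/1/2024,E3,A/Example/2/2023,C2,F1234,160,ignored\n"
)

rows = select_and_rename(SAMPLE, COLUMN_MAP)
```

With the map defined next to the code that consumes it, a reader can see in one place which source columns are kept and what they become, which is the motivation given in the commit message.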