Output Tables
Peter van Galen edited this page Apr 26, 2022 · 6 revisions
Here we describe the two result tables generated by the downstream command.
The first table, barcode_UMI_results.csv, contains results at the level of transcripts. That is, each row corresponds to the V(D)J alignment for a single UMI coming from a cell. The barcode_UMI_results.csv table contains the following columns.
Column name | Description |
---|---|
barcode | Cell barcode |
UMI | UMI |
UMI_count | Number of different UMI sequences corresponding to the same barcode and CDR3 sequence |
UMI_count_BC | Number of different UMI sequences corresponding to the same barcode |
consensus_count | Number of reads contributing to the consensus sequence |
sequence_id | BC + UMI + cluster to which the consensus and alignment correspond |
sequence | TCR 150nt consensus sequence |
rev_comp | Whether the sequence is reverse complemented (True or False) |
productive | A rearranged IG or TR (genomic or cDNA) whose coding region has an open reading frame, with no stop codon and no defect described in the initiation codon, splicing sites and/or regulatory elements, and with an in-frame junction (True or False) |
v_call | V allele assignment(s). If ties, multiple calls are separated by "," |
d_call | D allele assignment(s). If ties, multiple calls are separated by "," |
j_call | J allele assignment(s). If ties, multiple calls are separated by "," |
sequence_alignment | IMGT-numbered V(D)J nucleotide sequence |
germline_alignment | Full IMGT-numbered germline V(D)J nucleotide sequence |
junction | CDR3 nucleotide sequence plus the two anchor codons at the beginning (C) and end (F or W) |
junction_aa | Junction AA sequence |
v_cigar | CIGAR string for the alignment of the V allele |
d_cigar | CIGAR string for the alignment of the D allele |
j_cigar | CIGAR string for the alignment of the J allele |
stop_codon | Whether a stop codon is present in the sample V(D)J nucleotide sequence |
vj_in_frame | Whether the sample junction region nucleotide sequence is in-frame |
locus | TRA, TRB, TRD or TRG |
junction_length | Number of junction nucleotides |
np1_length | Number of nucleotides between sample V and D sequences |
np2_length | Number of nucleotides between sample D and J sequences |
v_sequence_start | Position of the first V nucleotide in "sequence" |
v_sequence_end | Position of the last V nucleotide in "sequence" |
v_germline_start | Position of "v_sequence_start" in IMGT numbered germline V(D)J sequence |
v_germline_end | Position of "v_sequence_end" in IMGT numbered germline V(D)J sequence |
d_sequence_start | Position of the first D nucleotide in "sequence" |
d_sequence_end | Position of the last D nucleotide in "sequence" |
d_germline_start | Position of "d_sequence_start" in IMGT numbered germline V(D)J sequence |
d_germline_end | Position of "d_sequence_end" in IMGT numbered germline V(D)J sequence |
j_sequence_start | Position of the first J nucleotide in "sequence" |
j_sequence_end | Position of the last J nucleotide in "sequence" |
j_germline_start | Position of "j_sequence_start" in IMGT numbered germline V(D)J sequence |
j_germline_end | Position of "j_sequence_end" in IMGT numbered germline V(D)J sequence |
v_score | Alignment bit score for the V allele |
v_identity | Alignment identity for the V allele |
v_support | E value for the alignment of the V allele |
d_score | Alignment bit score for the D allele |
d_identity | Alignment identity for the D allele |
d_support | E value for the alignment of the D allele |
j_score | Alignment bit score for the J allele |
j_identity | Alignment identity for the J allele |
j_support | E value for the alignment of the J allele |
fwr1 | FWR1 nucleotide sequence |
fwr2 | FWR2 nucleotide sequence |
fwr3 | FWR3 nucleotide sequence |
fwr4 | FWR4 nucleotide sequence |
cdr1 | CDR1 nucleotide sequence |
cdr2 | CDR2 nucleotide sequence |
cdr3 | CDR3 nucleotide sequence |
error_rate | Consensus building error rate |
weighted_mean_error_rate_BC | Mean consensus building error per barcode, weighted by consensus counts |
weighted_mean_error_rate_BC_cdr3 | Mean consensus building error per barcode and CDR3 sequence, weighted by consensus counts |
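
The summary columns in this table can be related back to the per-UMI rows. The following is a minimal sketch, assuming that UMI_count counts distinct UMIs per barcode and CDR3 sequence and that the weighted mean error rate uses consensus_count as the weight, as the descriptions above state. The toy values and the `summarize` helper are hypothetical, and this is not the pipeline's own code.

```python
import csv
import io
from collections import defaultdict

# Toy rows using a subset of the barcode_UMI_results.csv columns (values are made up).
toy_csv = """barcode,UMI,cdr3,consensus_count,error_rate
AAAC,UMI1,TGTGCC,10,0.01
AAAC,UMI2,TGTGCC,30,0.03
AAAC,UMI3,TGTAGC,5,0.02
GGGT,UMI1,TGTGCC,8,0.00
"""

def summarize(rows):
    """Group per-UMI rows by (barcode, cdr3) and derive summary values.

    Hypothetical helper: mirrors the column descriptions, not the pipeline code.
    """
    umis = defaultdict(set)              # distinct UMIs per (barcode, cdr3)
    reads = defaultdict(int)             # summed consensus_count per group
    weighted_error = defaultdict(float)  # sum of consensus_count * error_rate
    for row in rows:
        key = (row["barcode"], row["cdr3"])
        n_reads = int(row["consensus_count"])
        umis[key].add(row["UMI"])
        reads[key] += n_reads
        weighted_error[key] += n_reads * float(row["error_rate"])
    return {
        key: {
            "UMI_count": len(umis[key]),
            # Guard against an all-zero weight, which should not occur in real data.
            "weighted_mean_error_rate": (
                weighted_error[key] / reads[key] if reads[key] else 0.0
            ),
        }
        for key in umis
    }

summary = summarize(csv.DictReader(io.StringIO(toy_csv)))
for (barcode, cdr3), values in sorted(summary.items()):
    print(barcode, cdr3, values["UMI_count"], values["weighted_mean_error_rate"])
```

To apply the same grouping to real output, replace `io.StringIO(toy_csv)` with an open file handle for barcode_UMI_results.csv.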
The second table, barcode_results.csv, contains selected TRA and TRB calls at the cell barcode level. Results are summarized so that there is no more than one call for the TRB gene and two calls for the TRA gene per cell (see FAQ for more information). Calls are selected based on UMI and read counts. The barcode_results.csv table contains the following columns.
Column name | Description |
---|---|
BC | Cell barcode |
TCR_Recovery | TCR gene(s) recovered for a barcode: TRA only / TRB only / TRA and TRB |
TRB_CDR3 | TRB CDR3 AA sequence |
TRB_CDR3nuc | TRB CDR3 nucleotide sequence |
TRB_CDR3_UMIcount | Number of different UMI sequences corresponding to the same barcode and TRB CDR3 sequence |
TRB_CDR3_error | Mean consensus building error corresponding to a barcode and TRB CDR3 sequence, weighted by consensus counts |
TRB_nReads | Number of reads corresponding to the TRB call |
TRBV | TRB V allele assignment(s). If ties, multiple calls are separated by "," |
TRBD | TRB D allele assignment(s). If ties, multiple calls are separated by "," |
TRBJ | TRB J allele assignment(s). If ties, multiple calls are separated by "," |
TRA_CDR3 | TRA CDR3 AA sequence |
TRA_CDR3nuc | TRA CDR3 nucleotide sequence |
TRA_CDR3_UMIcount | Number of different UMI sequences corresponding to the same barcode and TRA CDR3 sequence |
TRA_CDR3_error | Mean consensus building error corresponding to a barcode and TRA CDR3 sequence, weighted by consensus counts |
TRA_nReads | Number of reads corresponding to the TRA call |
TRAV | TRA V allele assignment(s). If ties, multiple calls are separated by "," |
TRAJ | TRA J allele assignment(s). If ties, multiple calls are separated by "," |
TRA.2_CDR3 | TRA 2nd allele CDR3 AA sequence |
TRA.2_CDR3nuc | TRA 2nd allele CDR3 nucleotide sequence |
TRA.2_CDR3_UMIcount | Number of different UMI sequences corresponding to the same barcode and TRA 2nd allele CDR3 sequence |
TRA.2_CDR3_error | Mean consensus building error corresponding to a barcode and TRA 2nd allele CDR3 sequence, weighted by consensus counts |
TRA.2_nReads | Number of reads corresponding to the TRA 2nd allele call |
TRAV.2 | TRA V 2nd allele assignment(s). If ties, multiple calls are separated by "," |
TRAJ.2 | TRA J 2nd allele assignment(s). If ties, multiple calls are separated by "," |
InRNAseq | Cell barcode overlaps with RNAseq data: TRUE = Yes, FALSE = No |
RNAannotation | Cell barcode annotation coming from RNAseq data |
TRA_CDR3_CloneSize | Number of cells (barcodes overlapping with RNAseq) with the same TRA CDR3 sequence |
TRA_CDR3_CloneSize_Norm | Number of cells with the same TRA CDR3 sequence divided by the total number of cells with TRA CDR3 sequence |
TRA_CloneID | Unique ID per TRA clone, ordered by clone size |
TRB_CDR3_CloneSize | Number of cells (barcodes overlapping with RNAseq) with the same TRB CDR3 sequence |
TRB_CDR3_CloneSize_Norm | Number of cells with the same TRB CDR3 sequence divided by the total number of cells with TRB CDR3 sequence |
TRB_CloneID | Unique ID per TRB clone, ordered by clone size |
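
The clone size columns can be illustrated with a short sketch. It assumes that only barcodes overlapping with RNAseq are counted, that the normalization denominator is the number of such cells with a CDR3 sequence for the same chain, and that clone IDs are assigned from 1 in order of decreasing clone size. The toy values, the `clone_sizes` helper, and the alphabetical tie-break are hypothetical, and the pipeline may differ in these details.

```python
import csv
import io
from collections import Counter

# Toy rows using a subset of the barcode_results.csv columns (values are made up).
toy_csv = """BC,TRB_CDR3,InRNAseq
AAAC,CASSLG,TRUE
AAAG,CASSLG,TRUE
AAAT,CASSPQ,TRUE
AACC,CASSLG,FALSE
AACG,,TRUE
"""

def clone_sizes(rows, cdr3_col="TRB_CDR3"):
    """Count cells per CDR3 sequence and rank clones by size.

    Hypothetical helper: follows the column descriptions, not the pipeline code.
    """
    # Keep cells that overlap with RNAseq and have a CDR3 call for this chain.
    cells = [r for r in rows if r["InRNAseq"] == "TRUE" and r[cdr3_col]]
    sizes = Counter(r[cdr3_col] for r in cells)
    total = len(cells)
    # Largest clone first; ties broken alphabetically (an assumption).
    ranked = sorted(sizes.items(), key=lambda item: (-item[1], item[0]))
    return {
        cdr3: {"clone_id": rank, "size": size, "size_norm": size / total}
        for rank, (cdr3, size) in enumerate(ranked, start=1)
    }

clones = clone_sizes(list(csv.DictReader(io.StringIO(toy_csv))))
for cdr3, values in clones.items():
    print(cdr3, values["clone_id"], values["size"], round(values["size_norm"], 3))
```

Passing `cdr3_col="TRA_CDR3"` would apply the same counting to the TRA chain, provided that column is present in the input.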