
Output Tables

Peter van Galen edited this page Apr 26, 2022 · 6 revisions

Here we describe the two result tables generated by the downstream command.

Barcode UMI results

The first table, barcode_UMI_results.csv, contains results at the level of transcripts. That is, each row corresponds to the V(D)J alignment for a single UMI coming from a cell. The barcode_UMI_results.csv table contains the following columns.

Column name Description
barcode Cell barcode
UMI UMI
UMI_count Number of different UMI sequences corresponding to the same barcode and CDR3 sequence
UMI_count_BC Number of different UMI sequences corresponding to the same barcode
consensus_count Number of reads contributing to the consensus sequence
sequence_id BC + UMI + cluster to which the consensus and alignment correspond
sequence TCR 150nt consensus sequence
rev_comp Whether the alignment is on the reverse complement of the query sequence (True or False)
productive A rearranged IG or TR (genomic or cDNA) whose coding region has an open reading frame, with no stop codon and no defect described in the initiation codon, splicing sites and/or regulatory elements, and with an in-frame junction (True or False)
v_call V allele assignment(s). If ties, multiple calls are separated by ","
d_call D allele assignment(s). If ties, multiple calls are separated by ","
j_call J allele assignment(s). If ties, multiple calls are separated by ","
sequence_alignment IMGT-numbered V(D)J nucleotide sequence
germline_alignment Full IMGT-numbered germline V(D)J nucleotide sequence
junction CDR3 nucleotide sequence plus the two conserved anchor codons at the beginning (C) and end (F or W)
junction_aa Junction AA sequence
v_cigar CIGAR string for the alignment of the V allele
d_cigar CIGAR string for the alignment of the D allele
j_cigar CIGAR string for the alignment of the J allele
stop_codon Whether a stop codon is present in the sample V(D)J nucleotide sequence (True or False)
vj_in_frame Whether the sample junction region nucleotide sequence is in-frame (True or False)
locus TRA, TRB, TRD or TRG
junction_length Number of junction nucleotides
np1_length Number of nucleotides between sample V and D sequences
np2_length Number of nucleotides between sample D and J sequences
v_sequence_start Position of the first V nucleotide in "sequence"
v_sequence_end Position of the last V nucleotide in "sequence"
v_germline_start Position of "v_sequence_start" in IMGT numbered germline V(D)J sequence
v_germline_end Position of "v_sequence_end" in IMGT numbered germline V(D)J sequence
d_sequence_start Position of the first D nucleotide in "sequence"
d_sequence_end Position of the last D nucleotide in "sequence"
d_germline_start Position of "d_sequence_start" in IMGT numbered germline V(D)J sequence
d_germline_end Position of "d_sequence_end" in IMGT numbered germline V(D)J sequence
j_sequence_start Position of the first J nucleotide in "sequence"
j_sequence_end Position of the last J nucleotide in "sequence"
j_germline_start Position of "j_sequence_start" in IMGT numbered germline V(D)J sequence
j_germline_end Position of "j_sequence_end" in IMGT numbered germline V(D)J sequence
v_score Alignment bit score for the V allele
v_identity Alignment identity for the V allele
v_support E value for the alignment of the V allele
d_score Alignment bit score for the D allele
d_identity Alignment identity for the D allele
d_support E value for the alignment of the D allele
j_score Alignment bit score for the J allele
j_identity Alignment identity for the J allele
j_support E value for the alignment of the J allele
fwr1 FWR1 nucleotide sequence
fwr2 FWR2 nucleotide sequence
fwr3 FWR3 nucleotide sequence
fwr4 FWR4 nucleotide sequence
cdr1 CDR1 nucleotide sequence
cdr2 CDR2 nucleotide sequence
cdr3 CDR3 nucleotide sequence
error_rate Consensus building error rate
weighted_mean_error_rate_BC Mean consensus building error per barcode, weighted by consensus counts
weighted_mean_error_rate_BC_cdr3 Mean consensus building error per barcode and CDR3 sequence, weighted by consensus counts
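
The relationship between the per-UMI rows and the summary columns (UMI_count_BC, UMI_count, weighted_mean_error_rate_BC) can be illustrated with a minimal sketch. This is not the pipeline's own code: the rows are made-up toy data, only a handful of columns are included, and grouping by the cdr3 nucleotide column is an assumption about how "same barcode and CDR3 sequence" is keyed.

```python
import csv
import io
from collections import defaultdict

# Toy rows for illustration only (not real output); most columns are omitted.
TOY_CSV = """barcode,UMI,cdr3,consensus_count,error_rate
AAAC,UMI1,GCCAGC,10,0.02
AAAC,UMI2,GCCAGC,30,0.04
AAAC,UMI3,GCTAGT,20,0.01
CCGT,UMI4,GCCAGC,5,0.10
"""


def summarize(rows):
    """Recompute UMI counts and the consensus-count-weighted mean error per barcode."""
    umis_bc = defaultdict(set)        # barcode -> distinct UMIs
    umis_bc_cdr3 = defaultdict(set)   # (barcode, cdr3) -> distinct UMIs
    err_sum_bc = defaultdict(float)   # barcode -> sum(error_rate * consensus_count)
    weight_bc = defaultdict(int)      # barcode -> sum(consensus_count)
    for row in rows:
        bc, umi, cdr3 = row["barcode"], row["UMI"], row["cdr3"]
        n_reads = int(row["consensus_count"])
        umis_bc[bc].add(umi)
        umis_bc_cdr3[(bc, cdr3)].add(umi)
        err_sum_bc[bc] += float(row["error_rate"]) * n_reads
        weight_bc[bc] += n_reads
    umi_count_bc = {bc: len(u) for bc, u in umis_bc.items()}
    umi_count = {key: len(u) for key, u in umis_bc_cdr3.items()}
    # Assumes every barcode has at least one read, so the weight is never zero.
    weighted_err_bc = {bc: err_sum_bc[bc] / weight_bc[bc] for bc in weight_bc}
    return umi_count_bc, umi_count, weighted_err_bc


rows = list(csv.DictReader(io.StringIO(TOY_CSV)))
umi_count_bc, umi_count, weighted_err_bc = summarize(rows)
```

To try this on real output, replace the io.StringIO(TOY_CSV) source with an open file handle to barcode_UMI_results.csv. The per-barcode and per-CDR3 weighted error (weighted_mean_error_rate_BC_cdr3) would follow the same pattern with (barcode, cdr3) as the grouping key.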

Barcode results

The second table, barcode_results.csv, contains selected TRA and TRB calls at the cell barcode level. Results are summarized so that there is no more than one call for the TRB gene and two calls for the TRA gene per cell (see FAQ for more information). Calls are selected based on UMI and read counts. The barcode_results.csv table contains the following columns.

Column name Description
BC Cell barcode
TCR_Recovery TCR gene(s) recovered for a barcode: TRA only / TRB only / TRA and TRB
TRB_CDR3 TRB CDR3 AA sequence
TRB_CDR3nuc TRB CDR3 nucleotide sequence
TRB_CDR3_UMIcount Number of different UMI sequences corresponding to the same barcode and TRB CDR3 sequence
TRB_CDR3_error Mean consensus building error corresponding to a barcode and TRB CDR3 sequence, weighted by consensus counts
TRB_nReads Number of reads corresponding to the TRB call
TRBV TRB V allele assignment(s). If ties, multiple calls are separated by ","
TRBD TRB D allele assignment(s). If ties, multiple calls are separated by ","
TRBJ TRB J allele assignment(s). If ties, multiple calls are separated by ","
TRA_CDR3 TRA CDR3 AA sequence
TRA_CDR3nuc TRA CDR3 nucleotide sequence
TRA_CDR3_UMIcount Number of different UMI sequences corresponding to the same barcode and TRA CDR3 sequence
TRA_CDR3_error Mean consensus building error corresponding to a barcode and TRA CDR3 sequence, weighted by consensus counts
TRA_nReads Number of reads corresponding to the TRA call
TRAV TRA V allele assignment(s). If ties, multiple calls are separated by ","
TRAJ TRA J allele assignment(s). If ties, multiple calls are separated by ","
TRA.2_CDR3 TRA 2nd allele CDR3 AA sequence
TRA.2_CDR3nuc TRA 2nd allele CDR3 nucleotide sequence
TRA.2_CDR3_UMIcount Number of different UMI sequences corresponding to the same barcode and TRA 2nd allele CDR3 sequence
TRA.2_CDR3_error Mean consensus building error corresponding to a barcode and TRA 2nd allele CDR3 sequence, weighted by consensus counts
TRA.2_nReads Number of reads corresponding to the TRA 2nd allele call
TRAV.2 TRA V 2nd allele assignment(s). If ties, multiple calls are separated by ","
TRAJ.2 TRA J 2nd allele assignment(s). If ties, multiple calls are separated by ","
InRNAseq Cell barcode overlaps with RNAseq data: TRUE = Yes, FALSE = No
RNAannotation Cell barcode annotation coming from RNAseq data
TRA_CDR3_CloneSize Number of cells (barcodes overlapping with RNAseq) with the same TRA CDR3 sequence
TRA_CDR3_CloneSize_Norm Number of cells with the same TRA CDR3 sequence divided by the total number of cells with TRA CDR3 sequence
TRA_CloneID Unique ID per TRA clone, ordered by clone size
TRB_CDR3_CloneSize Number of cells (barcodes overlapping with RNAseq) with the same TRB CDR3 sequence
TRB_CDR3_CloneSize_Norm Number of cells with the same TRB CDR3 sequence divided by the total number of cells with TRB CDR3 sequence
TRB_CloneID Unique ID per TRB clone, ordered by clone size
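
As a rough illustration of how the clone-level columns relate to each other, the sketch below derives a clone size, a normalized clone size, and a size-ordered clone ID from barcode-level records. It is a sketch under stated assumptions, not the pipeline's implementation: the records are made-up toy data, the function name clone_sizes is hypothetical, the denominator is restricted to barcodes that overlap with RNAseq, IDs start at 1, and ties in clone size are broken alphabetically.

```python
from collections import Counter

# Toy barcode-level records for illustration only (not real output).
toy_cells = [
    {"BC": "AAAC", "TRB_CDR3": "CASSLGF", "InRNAseq": True},
    {"BC": "CCGT", "TRB_CDR3": "CASSLGF", "InRNAseq": True},
    {"BC": "GGTA", "TRB_CDR3": "CASSPQF", "InRNAseq": True},
    {"BC": "TTAG", "TRB_CDR3": "CASSLGF", "InRNAseq": False},  # not in RNAseq
    {"BC": "ACGT", "TRB_CDR3": "", "InRNAseq": True},          # no TRB recovered
]


def clone_sizes(cells, cdr3_key="TRB_CDR3"):
    """Return {cdr3: {"CloneSize", "CloneSize_Norm", "CloneID"}} for one chain."""
    # Assumption: only barcodes overlapping with RNAseq and carrying a CDR3 count.
    eligible = [c for c in cells if c["InRNAseq"] and c[cdr3_key]]
    sizes = Counter(c[cdr3_key] for c in eligible)
    total = len(eligible)
    # Largest clone gets ID 1; alphabetical tie-break is an assumption.
    ranked = sorted(sizes, key=lambda seq: (-sizes[seq], seq))
    clone_id = {seq: rank for rank, seq in enumerate(ranked, start=1)}
    return {
        seq: {
            "CloneSize": sizes[seq],
            "CloneSize_Norm": sizes[seq] / total,
            "CloneID": clone_id[seq],
        }
        for seq in sizes
    }


trb_clones = clone_sizes(toy_cells, cdr3_key="TRB_CDR3")
```

Passing cdr3_key="TRA_CDR3" would give the corresponding TRA values, provided the records carry that field.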