There are only a finite number of 10x Genomics cell barcodes: 737280. When data are collected over many months in different sequencing batches, some cell barcodes will recur, because every batch draws its cell labels from the same set of 737280 barcodes. Basically,
Each mapped read in a 10x Genomics Single Cell 3’ v2 Gene Expression Library can be annotated by four labels: (1) a sample barcode, (2) a cell-barcode index, (3) a Unique Molecular Identifier (UMI), and (4) a gene ID.
A 16 bp cell-barcode index is randomly selected out of a set containing 737280 possible combinations. In scRNA-seq data, a cell is identified by a unique cell barcode.
For a real data set, a barcode appears between one and four times.
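To make the recurrence concrete, here is a minimal sketch of how one might count how many batches each barcode shows up in. It is written in Python purely for illustration; the patient labels, the toy barcodes, and the helper `barcode_multiplicity` are all made up and are not part of scClassify or DropletUtils.

```python
from collections import Counter

# Hypothetical toy data: 16 bp cell barcodes observed in three batches.
# Real batches would contain thousands of barcodes each.
batches = {
    "patient_A": ["AAACCTGAGAAACCAT", "AAACCTGAGAAACCGC", "AAACCTGAGAAACCTA"],
    "patient_B": ["AAACCTGAGAAACCAT", "AAACCTGAGAAACGAG"],
    "patient_C": ["AAACCTGAGAAACCAT", "AAACCTGAGAAACCGC"],
}


def barcode_multiplicity(batches):
    """Return a Counter mapping each barcode to the number of batches it occurs in."""
    counts = Counter()
    for barcodes in batches.values():
        # Use a set so a barcode is counted at most once per batch.
        counts.update(set(barcodes))
    return counts


counts = barcode_multiplicity(batches)

# Barcodes that occur in more than one batch would collide if used as column names.
repeated = {bc: n for bc, n in counts.items() if n > 1}
print(repeated)
```

Any barcode listed in `repeated` is one that cannot serve as a unique column name on its own once the batches are combined.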
scClassify doesn't allow empty column names in the test matrix, which is what `read10xCounts` produces by default. Because this is multi-batch data, setting the column names of the matrix to the barcodes also fails, since the same barcode can occur in more than one batch.
The non-obvious workaround is to paste the patient ID onto each cell barcode to ensure uniqueness.
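As a rough sketch of that workaround, the snippet below builds a combined identifier per cell and checks that the result really is unique. It is in Python for illustration only; in R the same idea would be a `paste()` over the two vectors. The function name `make_unique_cell_ids`, the separator, and the example values are assumptions, not anything provided by scClassify.

```python
def make_unique_cell_ids(patient_ids, barcodes, sep="_"):
    """Prefix each cell barcode with its patient ID and verify uniqueness.

    patient_ids and barcodes are parallel sequences with one entry per cell.
    """
    if len(patient_ids) != len(barcodes):
        raise ValueError("patient_ids and barcodes must have the same length")

    cell_ids = [f"{patient}{sep}{barcode}" for patient, barcode in zip(patient_ids, barcodes)]

    # Fail loudly if prefixing did not resolve every collision,
    # for example when one patient was sequenced in several batches.
    if len(set(cell_ids)) != len(cell_ids):
        raise ValueError("cell IDs are still not unique after prefixing")
    return cell_ids


# Hypothetical example: the same barcode appears for two different patients.
patients = ["patient_A", "patient_A", "patient_B"]
barcodes = ["AAACCTGAGAAACCAT", "AAACCTGAGAAACCGC", "AAACCTGAGAAACCAT"]

cell_ids = make_unique_cell_ids(patients, barcodes)
print(cell_ids)
```

The explicit uniqueness check matters because a patient ID alone may not be enough if one patient contributes several batches; in that case a batch or sample label would be the safer prefix.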
This could be much smoother for end-users.