hicMergeMatrixBins changes chromosome sizes #906

Open
qolba opened this issue Jul 22, 2024 · 2 comments

Comments

@qolba

qolba commented Jul 22, 2024

Hi, I have a question about the behavior of hicMergeMatrixBins and would appreciate some clarification.

I am using HiCExplorer 3.7.5 with h5 files.

I've noticed that the hicMergeMatrixBins command changes the chromosome sizes. For example:

(hicexplorer) user@naboo:/home/dir$ hicInfo -m mymatrix.h5
# Matrix information file. Created with HiCExplorer's hicInfo version 3.7.5
File: mymatrix.h5
Size: 623,472
Bin_length: 5000
Sum of matrix: 290013357.0
Chromosomes:length: chr1: 248387328 bp; chr2: 242696752 bp; chr3: 201105948 bp; chr4: 193574945 bp; chr5: 182045439 bp; chr6: 172126628 bp; chr7: 160567428 bp; chr8: 146259331 bp; chr9: 150617247 bp; chr10: 134758134 bp; chr11: 135127769 bp; chr12: 133324548 bp; chr13: 113566686 bp; chr14: 101161492 bp; chr15: 99753195 bp; chr16: 96330374 bp; chr17: 84276897 bp; chr18: 80542538 bp; chr19: 61707364 bp; chr20: 66210255 bp; chr21: 45090682 bp; chr22: 51324926 bp; chrX: 154259566 bp; chrY: 62460029 bp; chrM: 16569 bp;
Non-zero elements: 399,928,510
Minimum (non zero): 1.0
Maximum: 69427.0
NaN bins: 0

(hicexplorer) user@naboo:/home/dir$ hicMergeMatrixBins -m mymatrix.h5 -o mymatrix_nb10.h5 -nb 10

(hicexplorer) user@naboo:/home/dir$ hicInfo -m mymatrix_nb10.h5
# Matrix information file. Created with HiCExplorer's hicInfo version 3.7.5
File: mymatrix_nb10.h5
Size: 62,348
Bin_length: 50000
Sum of matrix: 290006984.0
Chromosomes:length: chr1: 248387328 bp; chr2: 242696752 bp; chr3: 201100000 bp; chr4: 193574945 bp; chr5: 182045439 bp; chr6: 172126628 bp; chr7: 160550000 bp; chr8: 146250000 bp; chr9: 150600000 bp; chr10: 134750000 bp; chr11: 135127769 bp; chr12: 133324548 bp; chr13: 113550000 bp; chr14: 101150000 bp; chr15: 99750000 bp; chr16: 96330374 bp; chr17: 84276897 bp; chr18: 80542538 bp; chr19: 61700000 bp; chr20: 66200000 bp; chr21: 45090682 bp; chr22: 51324926 bp; chrX: 154250000 bp; chrY: 62450000 bp; chrM: 16569 bp;
Non-zero elements: 210,406,959
Minimum (non zero): 1.0
Maximum: 322638.0
NaN bins: 920

As you can see, some chromosomes (for instance chr3, chr7 and chr8) became shorter after merging, while others (for instance chr1, chr2 and chr4) kept their original length.

Could you kindly explain the reasoning behind this behavior? I rely on HiCExplorer output for downstream analysis, and this issue adds some complexity. I would greatly appreciate knowing in which cases I should expect this behavior to occur.
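
For reference, the pattern in the two hicInfo outputs above can be checked with a short script. This is only a sketch that describes the posted numbers: it does not call HiCExplorer and says nothing about why hicMergeMatrixBins behaves this way. The names `parse_lengths`, `BEFORE` and `AFTER` are made up for illustration, and the leftover-bin count assumes that each chromosome in the 5 kb matrix has ceil(length / 5000) bins, which is not confirmed anywhere in this thread.

```python
import re

# Chromosome lengths copied from the two hicInfo outputs above.
BEFORE = (
    "chr1: 248387328 bp; chr2: 242696752 bp; chr3: 201105948 bp; chr4: 193574945 bp; "
    "chr5: 182045439 bp; chr6: 172126628 bp; chr7: 160567428 bp; chr8: 146259331 bp; "
    "chr9: 150617247 bp; chr10: 134758134 bp; chr11: 135127769 bp; chr12: 133324548 bp; "
    "chr13: 113566686 bp; chr14: 101161492 bp; chr15: 99753195 bp; chr16: 96330374 bp; "
    "chr17: 84276897 bp; chr18: 80542538 bp; chr19: 61707364 bp; chr20: 66210255 bp; "
    "chr21: 45090682 bp; chr22: 51324926 bp; chrX: 154259566 bp; chrY: 62460029 bp; "
    "chrM: 16569 bp;"
)
AFTER = (
    "chr1: 248387328 bp; chr2: 242696752 bp; chr3: 201100000 bp; chr4: 193574945 bp; "
    "chr5: 182045439 bp; chr6: 172126628 bp; chr7: 160550000 bp; chr8: 146250000 bp; "
    "chr9: 150600000 bp; chr10: 134750000 bp; chr11: 135127769 bp; chr12: 133324548 bp; "
    "chr13: 113550000 bp; chr14: 101150000 bp; chr15: 99750000 bp; chr16: 96330374 bp; "
    "chr17: 84276897 bp; chr18: 80542538 bp; chr19: 61700000 bp; chr20: 66200000 bp; "
    "chr21: 45090682 bp; chr22: 51324926 bp; chrX: 154250000 bp; chrY: 62450000 bp; "
    "chrM: 16569 bp;"
)

FINE_BIN = 5000                  # Bin_length reported for mymatrix.h5
MERGED_BIN = 50000               # Bin_length reported for mymatrix_nb10.h5
NB = MERGED_BIN // FINE_BIN      # the -nb 10 argument


def parse_lengths(text):
    """Turn 'chr1: 123 bp; chr2: 456 bp;' into {'chr1': 123, 'chr2': 456}."""
    return {name: int(length) for name, length in re.findall(r"(\w+): (\d+) bp", text)}


before = parse_lengths(BEFORE)
after = parse_lengths(AFTER)

# Chromosomes whose reported length differs between the two matrices.
changed = [name for name in before if before[name] != after[name]]

for name in before:
    old, new = before[name], after[name]
    floored = old // MERGED_BIN * MERGED_BIN
    # ASSUMPTION: the 5 kb matrix holds ceil(old / 5000) bins for this chromosome.
    fine_bins = -(-old // FINE_BIN)
    leftover = fine_bins % NB
    status = "changed" if name in changed else "same"
    print(
        f"{name}: {old} -> {new} [{status}] "
        f"lost {old - new} bp, equals floor to {MERGED_BIN}: {new == floored}, "
        f"leftover 5 kb bins: {leftover}"
    )
```

On the numbers posted above, every shortened chromosome ends up at its original length rounded down to a multiple of the new 50000 bp bin length. The leftover-bin column is printed so that the changed and unchanged chromosomes can be compared side by side.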

@joachimwolff
Collaborator

Hi,

That should not happen. Did you use a chromosome size file for creating the matrices?

@qolba
Author

qolba commented Jul 24, 2024

Those matrices were created with HiC-Pro, which uses a chrom.size file at some point during matrix creation.
So the native file format was hicpro (matrix + .bed), and I then converted it with:

(hicexplorer) user@naboo:/home/dir$ hicConvertFormat -m mymatrix.matrix --bedFileHicpro mymatrix_abs.bed --inputFormat hicpro --outputFormat h5 -o mymatrix.h5
