Home
Welcome to the pHASE-Stitcher wiki!
Use the methods (Part-A: Step 01-A) from this page (https://github.com/everestial/phase-Extender/wiki/phase-Extender-Tutorial).
Note: Parameters that are not specified are set to their default values.
- Call for help -
python3 phase-Stitcher.py -h
- Example 01 (trio data set) - assuming the ms02g sample is the F1 hybrid, MA605 is the mother and Sp21 is the father.
python3 phase-Stitcher.py --nt 1 --input example_01/haplotype_file01.txt --mat MA605 --pat Sp21 --f1Sample ms02g --culLH maxSum --lods 3
- Example 02 - when several samples are provided as maternal vs. paternal background.
python3 phase-Stitcher.py --nt 1 --input example_01/haplotype_file01.txt --mat MA605,MA622 --pat Sp21,Sp164 --f1Sample ms02g --outPatMatID Sp,My --culLH maxSum --lods 3
- Example 03(a) - maternal, paternal background can also be assigned by matching prefixes of several samples at once.
python3 phase-Stitcher.py --nt 1 --input example_01/haplotype_file01.txt --mat pre:MA6 --pat pre:Sp1 --f1Sample ms02g --outPatMatID Sp,My --culLH maxPd --lods 3
Code in example 03(a) will pick samples MA605,MA625 as maternal background variants and samples Sp154,Sp164 as paternal background variants. It is also possible to provide multiple comma-separated prefixes to control sample assignment to a particular parental background.
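The prefix matching described above can be pictured with a short Python sketch. This is only an illustration of the idea, not phase-Stitcher's own code: the helper name `select_by_prefix` and the sample list are assumptions made for the example.

```python
def select_by_prefix(sample_names, spec):
    """Return the samples selected by a --mat/--pat style value.

    `spec` is either comma-separated sample names ("MA605,MA622") or
    "pre:" followed by comma-separated prefixes ("pre:MA,Nc").
    Hypothetical helper, written only to illustrate the matching rule.
    """
    if spec.startswith("pre:"):
        # str.startswith accepts a tuple, so several prefixes are checked at once
        prefixes = tuple(spec[len("pre:"):].split(","))
        return [name for name in sample_names if name.startswith(prefixes)]
    wanted = spec.split(",")
    return [name for name in sample_names if name in wanted]


# Assumed sample names, taken from the samples mentioned in example 03(a)
samples = ["ms02g", "MA605", "MA625", "Sp154", "Sp164"]

maternal = select_by_prefix(samples, "pre:MA6")
paternal = select_by_prefix(samples, "pre:Sp1")
print(maternal)  # ['MA605', 'MA625']
print(paternal)  # ['Sp154', 'Sp164']
```

A shorter prefix selects more samples, so it is worth checking which sample names share a prefix before relying on it.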
- Example 03(b) - with larger dataset
python3 phase-Stitcher.py --nt 1 --input example_02/haplotype_file02.txt --mat pre:MA,Nc --pat pre:Sp --f1Sample ms02g --culLH maxPd --lods 25
Code in example 03(b) will pick all Mayodan samples as maternal background variants and all Spiterstulen samples as paternal background variants.
- Example 03(c) - use --hapStats yes to include the descriptive statistics of the final haplotype file.
python3 phase-Stitcher.py --nt 1 --input example_02/haplotype_file02.txt --mat pre:MA,Nc --pat pre:Sp --f1Sample ms02g --culLH maxPd --lods 25 --hapStats yes --outPatMatID Sp,My
Example run, with the command:
python3 phase-Stitcher.py --nt 1 --input example_01/haplotype_file01.txt --mat pre:MA62 --pat pre:Sp1 --f1Sample ms02g --outPatMatID Sp,My --culLH maxSum --lods 3
Initial phase state of sample ms02g in the input file is:
ms02g_PI ms02g_PG_al
9 C|T
9 C|A
9 C|T
9 T|C
9 T|C
9 C|C
9 A|A
9 G|A
9 C|T
9 C|T
Segregated (final) phase state of sample ms02g (with computed LOD) in the output files is:
## result in long format
CHROM POS REF all-alleles ms02g:PI ms02g:PG_al log2odds Sp_hap My_hap
2 14689421 T T,C 9 C|T 6.89 C T
2 14689451 A A,C 9 C|A 6.89 C A
2 14689538 T T,C 9 C|T 6.89 C T
2 14689544 C C,T 9 T|C 6.89 T C
2 14689583 C C,T 9 T|C 6.89 T C
2 14689641 C C,A 9 C|C 6.89 C C
2 14689652 A A,T 9 A|A 6.89 A A
2 14689658 A A,G 9 G|A 6.89 G A
2 14689688 T T,C 9 C|T 6.89 C T
2 14689691 T T,C 9 C|T 6.89 C T
## result in wide format
CHROM POS_Range ms02g:PI hap_left hap_right log2odds Sp_hap My_hap
2 14689421-14689691 9 C-C-C-T-T-C-A-G-C-C T-A-T-C-C-C-A-A-T-T 6.89 C-C-C-T-T-C-A-G-C-C T-A-T-C-C-C-A-A-T-T
## statistics of the final haplotype
CHROM phasedBlock unphasedBlock numVarsInPhasedBlock numVarsInUnPhasedBlock log2oddsInPhasedBlock log2oddsInUnPhasedBlock totalNumOfBlock totalNumOfVars
2 9 . 10 . 6.89 . 1 10
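The relation between the long and wide formats above can be sketched in a few lines of Python: each phased genotype is split at `|`, and the left and right alleles of one block are joined with `-`. The tuple layout and the function name `collapse_block` are assumptions for illustration only; the rows are copied from the long-format result shown above.

```python
# Rows copied from the long-format result: (CHROM, POS, PI, PG_al)
long_rows = [
    ("2", 14689421, "9", "C|T"),
    ("2", 14689451, "9", "C|A"),
    ("2", 14689538, "9", "C|T"),
    ("2", 14689544, "9", "T|C"),
    ("2", 14689583, "9", "T|C"),
    ("2", 14689641, "9", "C|C"),
    ("2", 14689652, "9", "A|A"),
    ("2", 14689658, "9", "G|A"),
    ("2", 14689688, "9", "C|T"),
    ("2", 14689691, "9", "C|T"),
]


def collapse_block(rows):
    """Collapse the long-format rows of one phased block into one wide row."""
    chrom, first_pos, block_index, _ = rows[0]
    last_pos = rows[-1][1]
    # Split every phased genotype "X|Y" into its left and right allele
    left = "-".join(genotype.split("|")[0] for *_, genotype in rows)
    right = "-".join(genotype.split("|")[1] for *_, genotype in rows)
    return chrom, f"{first_pos}-{last_pos}", block_index, left, right


wide = collapse_block(long_rows)
print(wide)
# ('2', '14689421-14689691', '9', 'C-C-C-T-T-C-A-G-C-C', 'T-A-T-C-C-C-A-A-T-T')
```

The resulting tuple matches the CHROM, POS_Range, ms02g:PI, hap_left and hap_right columns of the wide-format row.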
What do the results tell us?
- We used maxSum as the method to estimate the likelihood and set the cutoff threshold at 3 to segregate the haplotype.
- Since |computed lods| > lods threshold, phase-Stitcher proceeds with haplotype segregation for the given RBphased haplotype.
- Since the computed lods (6.89) is a positive value, the left haplotype is assigned to the paternal background and the right haplotype to the maternal background.
- If |computed lods| < lods threshold, the haplotype block won't be segregated into maternal vs. paternal background.
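The decision rule in the list above can be written as a small Python sketch. It is not phase-Stitcher's implementation: the function name `assign_haplotypes` is hypothetical, the sign convention (positive value means the left haplotype is paternal) is inferred from the example output, where a log2odds of 6.89 goes with the left haplotype appearing under Sp_hap, and treating a value exactly equal to the threshold as unsegregated is an assumption.

```python
def assign_haplotypes(left_hap, right_hap, log2odds, lods_cutoff):
    """Assign the two haplotypes of a block to parental backgrounds.

    Returns None when the block stays unsegregated.
    Hypothetical helper; sign convention inferred from the example output.
    """
    # Assumption: a value equal to the cutoff is treated as not passing it
    if abs(log2odds) <= lods_cutoff:
        return None
    if log2odds > 0:
        # Assumed convention: positive value, left haplotype is paternal
        return {"paternal": left_hap, "maternal": right_hap}
    return {"paternal": right_hap, "maternal": left_hap}


result = assign_haplotypes(
    "C-C-C-T-T-C-A-G-C-C", "T-A-T-C-C-C-A-A-T-T", log2odds=6.89, lods_cutoff=3
)
print(result["paternal"])  # C-C-C-T-T-C-A-G-C-C
print(result["maternal"])  # T-A-T-C-C-C-A-A-T-T
```

With the same haplotypes and a computed value of 1.5, the function returns None, which corresponds to a block that is left unsegregated.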