Skip to content

Commit

Permalink
let hmmratac output narrowpeak with summit and score from foldchange …
Browse files Browse the repository at this point in the history
…signals
  • Loading branch information
taoliu committed May 16, 2024
1 parent 769b8e1 commit 3b9c17c
Show file tree
Hide file tree
Showing 10 changed files with 11,116 additions and 11,093 deletions.
12 changes: 7 additions & 5 deletions MACS3/Commands/hmmratac_cmd.py
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
# Time-stamp: <2024-05-15 10:18:27 Tao Liu>
# Time-stamp: <2024-05-15 19:40:45 Tao Liu>

"""Description: Main HMMR command
Expand Down Expand Up @@ -460,7 +460,7 @@ def run( args ):

options.info( f"# Write accessible regions in a narrowPeak file: {options.name}_accessible_regions.narrowPeak")
with open( accessible_file, "w" ) as ofhd:
save_accessible_regions( states_path, ofhd, options.openregion_minlen )
save_accessible_regions( states_path, ofhd, options.openregion_minlen, fc_bdg )

options.info( f"# Finished")

Expand Down Expand Up @@ -565,7 +565,7 @@ def generate_states_path(predicted_proba_file, binsize, i_open, i_nuc, i_bg):
prev_bin_end = end_pos
return ret_states_path

def save_accessible_regions(states_path, accessible_region_file, openregion_minlen):
def save_accessible_regions(states_path, accessible_region_file, openregion_minlen, bdgscore):
# Function to add regions to the list
def add_regions(i, regions):
for j in range(i, i+3):
Expand All @@ -588,7 +588,9 @@ def add_regions(i, regions):
# Generate broadpeak object
openpeak = PeakIO()
for region in accessible_regions[:-1]:
openpeak.add(chromosome=region[0], start=region[1], end=region[2],
summit=(region[1]+region[2])//2) # use middle point for summit for now
openpeak.add(chromosome=region[0], start=region[1], end=region[2])

# refine peak summit and score using bedGraphTrackI with scores
openpeak = bdgscore.refine_peaks( openpeak )
openpeak.write_to_narrowPeak(accessible_region_file)
return
21 changes: 21 additions & 0 deletions docs/source/docs/hmmratac.md
Original file line number Diff line number Diff line change
Expand Up @@ -44,6 +44,27 @@ space in the 'tmp' directory of your system.
Please use `macs3 hmmratac --help` to see all the options. Here we
list the essential ones.

## Output

The final output file from `hmmratac` is in narrowPeak format
containing the accessible regions (open state in `hmmratac` HMM). The
columns are:

1. chromosome name
2. start position of the accessible region
3. end position of the accesssible region
4. peak name
5. peak score. The score is the maximum foldchange (signal/average
signal) within the peak. By default, the signal is the total
pileup of all types of fragments from short to tri-nuc size
fragments.
6. Not used
7. Not used
8. Not used
9. peak summit position. It's the relative position from the start
position to the peak summit which is defined as the position with
the maximum foldchange score.

## Essential Options

### `-i INPUT_FILE [INPUT_FILE ...]` / `--input INPUT_FILE [INPUT_FILE ...]`
Expand Down
Loading

0 comments on commit 3b9c17c

Please sign in to comment.