Same result for "-ratio_ortho = 0.3", "0.5", and "= 0.7". Is this expected? #15

Open
V-JJ opened this issue Apr 11, 2023 · 0 comments

V-JJ commented Apr 11, 2023

Hello!

We ran Broccoli with three different ratio_ortho values: 0.5 (default), 0.3 and 0.7.
The results were IDENTICAL for all three values. Is this expected? We've double-checked that our commands and jobs ran correctly.

Here is the command (shown for r=0.3):

# Input data location
proteome_dir=input_dataset_v1
nthr=8
r=0.3

# ML phylogeny method
phylo_method=ml

mkdir -p ML_parameter_r03

python broccoli.py -dir "$proteome_dir" -phylogenies "$phylo_method" -threads "$nthr" -ratio_ortho "$r" \
        -path_diamond "$HOME/Programs/diamond_2.1.4/diamond" \
        -path_fasttree "$HOME/Programs/FastTree/FastTree"

mv dir_* ML_parameter_r03
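
One way to check whether the runs really are identical is to hash every file in each result directory and list the paths whose contents differ. The following is only a minimal sketch: `ML_parameter_r03` comes from the command above, while `ML_parameter_r05` and `ML_parameter_r07` are assumed names for the other two runs, and the helper functions `hash_tree` and `differing_files` are hypothetical, not part of Broccoli.

```python
import hashlib
from pathlib import Path


def hash_tree(root):
    """Map each file path (relative to root) to the SHA-256 digest of its contents."""
    root = Path(root)
    digests = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            rel = path.relative_to(root).as_posix()
            digests[rel] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def differing_files(dir_a, dir_b):
    """Return relative paths that are missing on one side or differ in content."""
    a, b = hash_tree(dir_a), hash_tree(dir_b)
    return sorted(name for name in a.keys() | b.keys() if a.get(name) != b.get(name))


if __name__ == "__main__":
    # ML_parameter_r03 is taken from the command above;
    # the r05 and r07 directory names are assumptions.
    runs = ["ML_parameter_r03", "ML_parameter_r05", "ML_parameter_r07"]
    missing = [run for run in runs if not Path(run).is_dir()]
    if missing:
        print("missing run directories:", missing)
    else:
        for other in runs[1:]:
            diff = differing_files(runs[0], other)
            print(f"{runs[0]} vs {other}:", diff if diff else "identical")
```

An empty list for a pair of runs means every file is byte-identical. A non-empty list names the files that differ, which shows which step's output responds to the parameter.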

And here is the stdout of each run. No errors were reported.

r=0.3

            Broccoli v1.1


 --- STEP 1: kmer clustering

 # parameters
 input dir     : input_dataset_v1
 kmer size     : 100
 kmer nb aa    : 15

 # check input files
 3 input files
 95947 sequences

 # kmer clustering
 3 proteomes on 8 threads
 -> 85022 proteins saved for the next step


 --- STEP 2: phylomes

 # parameters
 e_value     : 0.001
 nb_hits     : 6
 gaps        : 0.7
 phylogenies : maximum likelihood
 threads     : 8

 # check input files
 3 input fasta files
 85022 sequences

 # build phylomes ... be patient
 done


 --- STEP 3: network analysis

 ## parameters
 species overlap  : 0.5
 min edge weight  : 0.1
 min nb hits      : 2
 chimeric edges   : 0.5
 chimeric species : 3
 threads          : 8

 ## get ortho and para
 extract ortho from similarity
 extract ortho from trees
 remove ortho found only once
 extract para from trees

 ## network analysis
 build network:
      _ 68710 nodes
      _ 139250 edges
 load similarity search outputs
 compute lcc for each node
 apply LPA and corrections:
      _ 14984 connected components
      _ 17377 communities
      _ 2 chimeric proteins
      _ 1437 spurious hits removed


 --- STEP 4: orthologous pairs

 ## parameters
 ratio ortho  : 0.3
 not same sp  : False
 threads      : 8

 ## load data
 load NO tree results
 load tree results
 load OGs

 ## analyse 15509 orthologous groups 1 by 1
 done

r=0.5

            Broccoli v1.1


 --- STEP 1: kmer clustering

 # parameters
 input dir     : input_dataset_v1
 kmer size     : 100
 kmer nb aa    : 15

 # check input files
 3 input files
 95947 sequences

 # kmer clustering
 3 proteomes on 8 threads
 -> 85022 proteins saved for the next step


 --- STEP 2: phylomes

 # parameters
 e_value     : 0.001
 nb_hits     : 6
 gaps        : 0.7
 phylogenies : maximum likelihood
 threads     : 8

 # check input files
 3 input fasta files
 85022 sequences

 # build phylomes ... be patient
 done

 --- STEP 3: network analysis

 ## parameters
 species overlap  : 0.5
 min edge weight  : 0.1
 min nb hits      : 2
 chimeric edges   : 0.5
 chimeric species : 3
 threads          : 8

 ## get ortho and para
 extract ortho from similarity
 extract ortho from trees
 remove ortho found only once
 extract para from trees

 ## network analysis
 build network:
      _ 68710 nodes
      _ 139250 edges
 load similarity search outputs
 compute lcc for each node
 apply LPA and corrections:
      _ 14984 connected components
      _ 17377 communities
      _ 2 chimeric proteins
      _ 1437 spurious hits removed


 --- STEP 4: orthologous pairs

 ## parameters
 ratio ortho  : 0.5
 not same sp  : False
 threads      : 8

 ## load data
 load NO tree results
 load tree results
 load OGs

 ## analyse 15509 orthologous groups 1 by 1
 done

r=0.7

            Broccoli v1.1


 --- STEP 1: kmer clustering

 # parameters
 input dir     : input_dataset_v1
 kmer size     : 100
 kmer nb aa    : 15

 # check input files
 3 input files
 95947 sequences

 # kmer clustering
 3 proteomes on 8 threads
 -> 85022 proteins saved for the next step


 --- STEP 2: phylomes

 # parameters
 e_value     : 0.001
 nb_hits     : 6
 gaps        : 0.7
 phylogenies : maximum likelihood
 threads     : 8

 # check input files
 3 input fasta files
 85022 sequences

 # build phylomes ... be patient
 done


 --- STEP 3: network analysis

 ## parameters
 species overlap  : 0.5
 min edge weight  : 0.1
 min nb hits      : 2
 chimeric edges   : 0.5
 chimeric species : 3
 threads          : 8

 ## get ortho and para
 extract ortho from similarity
 extract ortho from trees
 remove ortho found only once
 extract para from trees

 ## network analysis
 build network:
      _ 68710 nodes
      _ 139250 edges
 load similarity search outputs
 compute lcc for each node
 apply LPA and corrections:
      _ 14984 connected components
      _ 17377 communities
      _ 2 chimeric proteins
      _ 1437 spurious hits removed


 --- STEP 4: orthologous pairs

 ## parameters
 ratio ortho  : 0.7
 not same sp  : False
 threads      : 8

 ## load data
 load NO tree results
 load tree results
 load OGs

 ## analyse 15509 orthologous groups 1 by 1
 done
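
Reading the three logs side by side, the only non-whitespace difference appears to be the `ratio ortho` line under STEP 4; every count reported in steps 1 to 3, and the 15509 groups analysed in step 4, are the same. A small sketch of how that comparison could be automated is given below. The function names are hypothetical, and the two strings are short excerpts of the logs above rather than the full stdout files.

```python
import difflib


def normalized_lines(log_text):
    """Strip surrounding whitespace and drop blank lines so layout differences are ignored."""
    return [line.strip() for line in log_text.splitlines() if line.strip()]


def log_differences(log_a, log_b):
    """Return lines present in only one log, prefixed with '- ' (first) or '+ ' (second)."""
    diff = difflib.ndiff(normalized_lines(log_a), normalized_lines(log_b))
    return [line for line in diff if line.startswith(("- ", "+ "))]


if __name__ == "__main__":
    # Excerpts of the STEP 4 blocks from the r=0.3 and r=0.5 logs.
    log_r03 = """
 --- STEP 4: orthologous pairs

 ## parameters
 ratio ortho  : 0.3
 not same sp  : False
 threads      : 8

 ## analyse 15509 orthologous groups 1 by 1
 done
"""
    log_r05 = log_r03.replace("ratio ortho  : 0.3", "ratio ortho  : 0.5")
    for line in log_differences(log_r03, log_r05):
        print(line)
```

Because the logs only echo parameters and summary counts, identical logs do not by themselves prove that the step 4 output files are identical; they only show that nothing reported on stdout changed between runs.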

Thanks in advance,

V-JJ changed the title from 'Same result for "-ratio_ortho = 0.3" and "= 0.7". Is this expected?' to 'Same result for "-ratio_ortho = 0.3", "0.5", and "= 0.7". Is this expected?' on Apr 11, 2023