No phylogenies after step 2 #17

Open
JesseGibson opened this issue Mar 26, 2024 · 1 comment

@JesseGibson

Hi,

I'm trying to run Broccoli on the example dataset using the command

python ~/Broccoli/broccoli.py -dir ~/Broccoli/example_dataset/ -ext '.fasta'

with diamond version 4.0.515 and fasttree version 2.1.11, but I am getting completely empty results. Step 1 seems to read the proteome files and perform the kmer clustering just fine, but by step 3 the graph clustering analysis reports 0 nodes and 0 edges. Any idea what could be happening? Here's the complete output:

            Broccoli v1.1


 --- STEP 1: kmer clustering

 # parameters
 input dir     : /home/gary/Broccoli/example_dataset
 kmer size     : 100
 kmer nb aa    : 15

 # check input files
 6 input files
 879 sequences

 # kmer clustering
 6 proteomes on 1 threads
 -> 868 proteins saved for the next step


 --- STEP 2: phylomes

 # parameters
 e_value     : 0.001
 nb_hits     : 6
 gaps        : 0.7
 phylogenies : neighbor joining
 threads     : 1

 # check input files
 6 input fasta files
 868 sequences

 # build phylomes ... be patient
 done


 --- STEP 3: network analysis

 ## parameters
 species overlap  : 0.5
 min edge weight  : 0.1
 min nb hits      : 2
 chimeric edges   : 0.5
 chimeric species : 3
 threads          : 1

 ## get ortho and para
 extract ortho from similarity
 extract ortho from trees
 remove ortho found only once
 extract para from trees

 ## network analysis
 build network:
      _ 0 nodes
      _ 0 edges
 load similarity search outputs
 compute lcc for each node
 apply LPA and corrections:
      _ 0 connected components
      _ 0 communities
      _ 0 chimeric proteins
      _ 0 spurious hits removed


 --- STEP 4: orthologous pairs

 ## parameters
 ratio ortho  : 0.5
 not same sp  : False
 threads      : 1

 ## load data
 load NO tree results
 load tree results
 load OGs

 ## analyse 0 orthologous groups 1 by 1
 done

Thanks!
Jesse

@rderelle
Owner

Hi Jesse,

my apologies for this late reply.

It might be due to the new version of Diamond you are using. What do the files in dir_step2 and dir_step3 look like? Are they empty?

Thanks
Romain
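
The check asked for above (whether the files in dir_step2 and dir_step3 are empty) can be done with a short script. This is only a minimal sketch: it assumes the two directories sit in the directory Broccoli was run from, and the helper names `summarize_dir` and `report` are made up for illustration, not part of Broccoli.

```python
from pathlib import Path


def summarize_dir(directory):
    """Return sorted (relative_path, size_in_bytes) pairs for every file under directory.

    Returns None if the directory does not exist.
    """
    root = Path(directory)
    if not root.is_dir():
        return None
    return sorted(
        (str(p.relative_to(root)), p.stat().st_size)
        for p in root.rglob("*")
        if p.is_file()
    )


def report(base="."):
    """Print file counts and sizes for dir_step2 and dir_step3 under base.

    Assumption: both directories were written to the working directory
    of the Broccoli run; pass another base path if they are elsewhere.
    """
    for name in ("dir_step2", "dir_step3"):
        files = summarize_dir(Path(base) / name)
        if files is None:
            print(f"{name}: not found under {base}")
            continue
        empty = [f for f, size in files if size == 0]
        print(f"{name}: {len(files)} files, {len(empty)} empty")
        for f, size in files:
            print(f"  {size:>10} bytes  {f}")


if __name__ == "__main__":
    report()
```

Run it from the directory where the `broccoli.py` command was launched and paste the output into the thread. Zero-byte files in dir_step2 would suggest that the similarity search or tree building produced nothing, which would fit the 0 nodes and 0 edges reported in step 3.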
