Skip to content
This repository has been archived by the owner on Jul 10, 2020. It is now read-only.

ToDo from feedback by He and Thiseas #2

Open
7 of 9 tasks
nevrome opened this issue Jun 8, 2020 · 4 comments
Open
7 of 9 tasks

ToDo from feedback by He and Thiseas #2

nevrome opened this issue Jun 8, 2020 · 4 comments

Comments

@nevrome
Copy link
Member

nevrome commented Jun 8, 2020

He

  • For the merge module, it may be better to have a checkpoint, about the number of bed/bim/fam/janno file found in each package path. That may be the job of package maintaining though, but would be helpful in some cases. For example when there is an extra ._PACKAGE.bim file, the file list will also include it and create error in plink.
  • For the covert module, would it be better to have the output directory specified as well? My understanding is that the poseidon packages are maintained in the project folder and people can convert/merge them for their own usage. Now it only writes converted files in the input folder. You can set it as a non-mandatory parameter. If the fourth parameter is found, then write files in the given path, or automatically write in the input path.
  • For the convert module as well, why you generate a .pedind file for the convertf run? I thought we can use .fam file as the input individual file, but maybe I’m wrong.

@TCLamnidis

  • No check for too many arguments. Could lead to odd results that "worked" when a user provides multiple input files for eg.

  • Log folder could get messy if someone submits jobs in a loop, as multiple jobs will have same time stamp. could cause issues when jobs are sbatched. Perhaps add a hash to the end of the folder? Or maybe deter people from looping with a check for this logdir existing and an error?

  • _print_packages also prints commented out lines or comments in the lines. Bug or feature?

  • poseidon_merge

    • line 101,146: No need to create empty bash array. easier to use unset _janno_files. the current fomrulation will trip text editors into thinking _janno_files is a function definition. (minor thing)
    • _plink_merge, line 167: --keep-allele-order? I remembed there was the issue with plink that it would sometimes flip alleles to Maj/Min rather than Ref/Alt which is what we generally use. could cause issues in downstream merging with Boston datasets in non-Poseidon format. Elina Salmela was the one that ran into these issues. Maybe she could explain how she solved it? maybe it isn't exactly relevant and I'm misremembering.
@nevrome
Copy link
Member Author

nevrome commented Jun 21, 2020

@TCLamnidis

  • I don't understand the unset _janno_files thing - could you elaborate? Isn't it necessary to create an empty array when I want to append to it?

@AyGhal

  • If I remember correctly this .pedind file is indeed necessary due to column order and we can consider this question raised by He above as solved, ja?
  • Could you comment on the --keep-allele-order thing? I'm sure Elina would not mind commenting this. I even believe she has a Github account, but I don't know the name.

@AyGhal
Copy link
Collaborator

AyGhal commented Jun 26, 2020

@nevrome

  • Yes, the pedind file is necessary because of the location(column) of population label in plink & eigensoft.

  • --keep-allele-order option was added.

@TCLamnidis
Copy link
Member

Yes. There is no need to create an empty array before appending to it. See below:

$ echo ${test[@]}      # test is empty

$ test+=("banana")    # append "banana"
$ echo ${test[@]}
banana
$ test+=("banana")    # append again
$ echo ${test[@]}
banana banana
$ test+=("banana")    # and again
$ echo ${test[@]}
banana banana banana
$ unset test          # unset to make empty again
$ test+=("banana")    # append again
$ echo ${test[@]}
banana

The current code isn't buggy or anything, so maybe this is just aesthetics/me being stuck in my ways, but the the syntax highlighting on my text editor did get tripped up by it.

@TCLamnidis
Copy link
Member

@esalmela could you comment on the --keep-allele-order? I cannot remember the outcome of our discussions on the plink merging and converting to Eigenstrat.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants