Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

List of things to consider cleaning up in the light curve notebook #177

Closed
17 tasks done
troyraen opened this issue Nov 22, 2023 · 9 comments · Fixed by #198
Closed
17 tasks done

List of things to consider cleaning up in the light curve notebook #177

troyraen opened this issue Nov 22, 2023 · 9 comments · Fixed by #198

Comments

@troyraen
Copy link
Contributor

troyraen commented Nov 22, 2023

Collecting a list of things we may consider doing as we finish up the notebook light_curve_generator.md.

Function docstrings

Notebook text

Code

Note that several MAST functions still iterate over individual sample objects rather than loading in bulk. This is a known issue being handled separately (see #165, #166, #179).

General

fluxconversions.py

- [ ] See if we can use Astropy methods in place of any functions in this file (prompted by #181 (comment)). moved to #188

heasarc_functions.py

mast_functions.py

panstarrs.py

plot_functions.py moved to #199

sample_lc.py

- [ ] Update it to, e.g., use sample_table. (This is a script version of the parallel section of the notebook, meant to be called from command line, but has not been kept up to date.) Alternately, consider making the modules directly callable from the command line and write a .sh script to execute them. This should speed up bulk runs because ZTF could use multiple workers. (moved to #195)

sample_selection.py

- [ ] nonunique_sample function: either remove it or update it to return a Table like clean_sample does. deferred (not used in this notebook)

tde_functions.py

TESS_Kepler_functions.py

ztf_functions.py

@troyraen
Copy link
Contributor Author

troyraen commented Dec 6, 2023

@bsipocz what's your input on the Formatting section? What standards should we apply for .py and .md files?

@jkrick
Copy link
Contributor

jkrick commented Dec 6, 2023

re: sample_lc.py and nonunique_sample inside of sample_selection.py
Both of these are not part of light_curve_generator.md instead ML_AGNzoo.md

What to do with them depends on our goal for the ML notebooks. I think they should start with a pre-made sample of light curves, and not need to run light_curve_generator. My reasoning is twofold. 1) speed of running the ML notebooks will be faster without first having to generate the light curves. 2) for anyone wanting to use the ML notebooks they wouldn't have to understand the extra pieces of code that generate the light curves, they could just focus on the code in the ML notebooks themselves.

If that is ok with everyone, then we should remove sample_lc.py (but make sure @xoubish has a copy somewhere since she needs it to generate the initial sample), and also remove nonunique_sample

@bsipocz
Copy link
Member

bsipocz commented Dec 6, 2023

what's your input on the Formatting section? What standards should we apply for .py and .md files?

I would say we could add a CI job to check on the .py formatting (though would not enable pre-commit or anything annoying that messes with the commits/branches, only a failing CI job on the PR). But I don't know anything from the top of my head for md files, will have to google for that one.

@troyraen
Copy link
Contributor Author

troyraen commented Dec 6, 2023

I would say we could add a CI job to check on the .py formatting

I'm in favor of that, but I'd like to keep the scope of this Issue more narrowly focused on cleaning up just the light curve notebook.

I have specific formatting preferences, especially for python code. While I know that your preferences are different than mine, I don't know exactly what your preferences are. So I'd like our group to come to a consensus before I make any broad changes there. Do you want to discuss specifics to be applied to this notebook? Or do you think we should leave the formatting alone for now?

@bsipocz
Copy link
Member

bsipocz commented Dec 6, 2023

I have specific formatting preferences, especially for python code

Something that is PEP8 compatible. I don't like to be as strict and whitespace heavy as black and would rather just make flake8 or similar pass on the codebase, but I agree, getting a consensus would be nice and I suppose can be done quickly. Shall we put it on the agenda for this afternoon?

@troyraen
Copy link
Contributor Author

troyraen commented Dec 6, 2023

@jkrick That's all fine with me. We'll just need to figure out where to store the pre-made sample of light curves. (I know we're using Google Drive for this right now, but want to figure out a more general solution.)

@jkrick
Copy link
Contributor

jkrick commented Dec 6, 2023

I have added formatting preferences to the agenda for this afternoon, but I would be super happy to just adopt whatever you two think is best. This is definitely your domain.

@jkrick
Copy link
Contributor

jkrick commented Dec 6, 2023

And I am pretty sure I have a gitlab issue in with the Navteca people about where we should be storing data as a long term solution instead of google drive.

@jkrick
Copy link
Contributor

jkrick commented Dec 16, 2023

After looking over everything in the directory, I fully agree that MAST_functions.py can be removed. I don't see it used anywhere.

troyraen added a commit that referenced this issue Jan 11, 2024
Cleanup light curve notebook (complete Issue #177)
github-actions bot pushed a commit that referenced this issue Jan 11, 2024
Cleanup light curve notebook (complete Issue #177) 6491ad0
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants