Spectroscopy Notebook Updates #281
Thanks for working on the JWST spectra and fixing and cleaning up the rest of the notebook.
As a general question: do we want to delete the fits files from the various archives after we have downloaded them and read them into our df_spec? I think the answer is yes, but can also see the reasoning in keeping them so you don't have to re-download every time you make a slight change to your sample. Especially on Fornax, we are going to run into space issues if we don't delete the fits files. @troyraen @bsipocz what do you recommend?
spectroscopy/spectra_generator.md
Outdated
    ```python
    %%time
    - ## Get DESI and BOSS spectra with SPARCL
    + ## Get DESI and BOSS and SDSS spectra with SPARCL
I don't understand why we would want duplicates in the final plot for DR16 and DR17. Of the example spectra in this notebook, the two spectra are mostly overlapping. Does the difference give us some sort of information? Without knowing whether the difference is due to a different reduction or a second observation, it is hard to evaluate what we would learn. I vote for no DR16 and just using SPARCL for DESI and BOSS.
removed SDSS-DR16 here to avoid duplication.
spectroscopy/spectra_generator.md
Outdated
    - sample_table = clean_sample(coords, labels , verbose=1)
    + sample_table = clean_sample(coords, labels, precision=2.0* u.arcsecond , verbose=1)

    print("Number of sources in sample table: {}".format(len(sample_table)))
This line is repetitive because the same information is printed out at the bottom of clean_sample
ok, removed that line.
line 174 fails for me because there is no directory 'data', can you please add a line to try to make a 'data' directory and catch the error if it already exists?
Added.
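The change itself is not shown in this thread. A minimal sketch of the kind of guard requested above, assuming the notebook writes to a relative `data` directory (the path is an assumption):

```python
import os

# Directory name taken from the review comment; the exact path is an assumption.
datadir = "data"

# exist_ok=True makes this a no-op when the directory already exists,
# so no try/except around FileExistsError is needed.
os.makedirs(datadir, exist_ok=True)
```

Using `exist_ok=True` keeps the cell safe to re-run, which matters in a notebook where cells are often executed more than once.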
spectroscopy/spectra_generator.md
Outdated
    ```python
    %%time
    ## Get Spectra for HST
    df_jwst = JWST_get_spec(sample_table , search_radius_arcsec = 0.5, datadir = "./data/")
Can we make this less verbose by default? It has a lot of output that is good for debugging but not necessary for most users, i.e., listing the FITS files it is downloading, plots, etc. None of the other get_spec functions include plots.
The "Downloading URL..." output is coming from the mastquery. I can suppress that.
I added a verbose argument to the function. If set to True, the user can enable the extra output. Otherwise, I was able to suppress the output of the mastquery function by using:

    import io
    from contextlib import redirect_stdout

    trap = io.StringIO()
    with redirect_stdout(trap):
        myfun()
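For illustration, the two pieces described above (a verbose flag plus output suppression) could be combined roughly as follows. This is only a sketch: `noisy_download` and `get_spec` are hypothetical stand-ins, not the functions in this PR.

```python
import io
from contextlib import redirect_stdout


def noisy_download():
    """Hypothetical stand-in for a third-party call that prints progress."""
    print("Downloading URL ...")
    return "spectrum.fits"


def get_spec(verbose=False):
    """Call the noisy function, hiding its printed output unless verbose."""
    if verbose:
        return noisy_download()
    trap = io.StringIO()
    with redirect_stdout(trap):  # stdout is captured in `trap` and discarded
        return noisy_download()


result = get_spec(verbose=False)  # prints nothing
```

Note that `redirect_stdout` only captures Python-level writes to `sys.stdout`; messages sent to stderr or emitted through the `logging` module are not affected.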
> The "Downloading URL..." output is coming from the mastquery. I can suppress that.

We should fix that in astroquery, too; it should not generate such noise without explicit opt-in.
(can you please open an issue for it?)
done: astropy/astroquery#3029
    @@ -265,12 +272,8 @@ We show flux in mJy as a function of time for all available bands for each objec
    ```python
    ### Plotting ####
    create_figures(df_spec = df_spec,
    - bin_factor=10,
    + bin_factor=5,
Something is wrong with the x-axis wavelengths of the JWST spectra. I am getting the clear prism from 0.0001 to 0.0005 microns. I'm guessing there is a unit problem here.
Also, can the plotting be changed to eliminate the downward excursions on the log plots? I.e., are there zeros in the spectra, or are those error bars that extend to the bottom of the log plots, or something else?
- I changed the plotting. Opened a can of worms, the units were somehow wrong. I made it consistent now by using the astropy units throughout.
- these are very small values that are making the plot ugly. I can try to sigma-clip them out...
for #2, I sigma clipped and also removed negative fluxes (just for plotting)
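The exact clipping used in the PR is not shown here. As a rough sketch of the idea, assuming a single-pass clip around the median with a MAD-based scatter (the function name and defaults are hypothetical), a plot-only cleanup could look like this:

```python
import numpy as np


def clean_for_log_plot(wave, flux, sigma=3.0):
    """Return wave/flux with non-positive and outlying fluxes removed (plot only)."""
    wave = np.asarray(wave, dtype=float)
    flux = np.asarray(flux, dtype=float)
    keep = np.isfinite(flux) & (flux > 0)  # log axes cannot show flux <= 0
    if not keep.any():
        return wave[keep], flux[keep]
    # Single-pass clip around the median, using a robust (MAD-based) scatter.
    med = np.median(flux[keep])
    mad = np.median(np.abs(flux[keep] - med))
    scatter = 1.4826 * mad
    if scatter > 0:
        keep &= np.abs(flux - med) <= sigma * scatter
    return wave[keep], flux[keep]


wave = [1, 2, 3, 4, 5, 6, 7, 8]
flux = [1.0, 1.1, 0.9, -0.2, 1.0, 1e-6, 1.05, 0.95]
w, f = clean_for_log_plot(wave, flux)  # drops the negative and the tiny value
```

The stored spectra stay untouched; only the arrays handed to the plotting call are filtered. `astropy.stats.sigma_clip` is an alternative for iterative clipping.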
I would think the normal workflow is to hoard data one is actively working on, so I would not delete (but I'm a dinosaur of an astronomer). So I would instead propagate this issue upstream to say we/the users will need access to some temporary space for all of these. Temporary in the sense of a scratch space, so no backups, or maybe even no survival of a large restart, but to be around for a few weeks while someone is actively working on a use case.
I agree with that.
ok, I made the following updates according to the comments above:
@jkrick: Ready to review again... then I will merge.
I am working on the Herschel module and am at 10G and not even done downloading tar files for a single target, Arp220 (Herschel likes to give you lots of files... too many files... but I can't control that). I think we should delete tar files.
Since this is a Fornax notebook I think we have to make it usable on the Fornax Console, which means respecting the 10G user disk space.
If the full notebook needs less than 5G(?) disk space, then my vote is to write this function and make it an optional thing, so it's available for the user to run or not as they wish. (Choosing 5G to leave space for other things.)
Do you know how much disk space would be needed for all the Herschel stuff you want to download?
I agree that's a good thing to push for. I don't think we should count on that to come in time for this notebook to rely on it.
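One possible shape for the optional cleanup function discussed above, as a sketch only: the function names, the file patterns, and the dry-run default are all assumptions, not something agreed in this thread.

```python
import shutil
from pathlib import Path


def dir_size_bytes(datadir):
    """Total size of all regular files below datadir."""
    return sum(p.stat().st_size for p in Path(datadir).rglob("*") if p.is_file())


def remove_downloads(datadir, patterns=("*.fits", "*.tar"), dry_run=True):
    """Delete files matching patterns below datadir; return the paths affected."""
    root = Path(datadir)
    matched = sorted({p for pat in patterns for p in root.rglob(pat) if p.is_file()})
    if not dry_run:
        for path in matched:
            path.unlink()
    return matched


# shutil.disk_usage reports total/used/free bytes for the filesystem holding the path.
free_gb = shutil.disk_usage(".").free / 1e9
```

Defaulting to `dry_run=True` lets a user see what would be removed before committing, which fits the "optional, run it if you wish" suggestion; `dir_size_bytes` gives a quick check against the 10G limit.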
That looks to be a case for using an equivalency? https://docs.astropy.org/en/stable/units/equivalencies.html#spectral-flux-and-luminosity-density-units
should all be resolved now. Issued ticket to astropy/specutils to fix the bug regarding reading in wrong units: astropy/specutils#1143
Ran through it successfully.
The following changes were made:
- Include JWST MSA and Slit spectra (main update!)
- SPARCL query (DESI search) now includes SDSS-DR16 that can be used for comparison to DR17 search from SDSS archive. Note that according to the SPARCL webpage, DR17 is not yet included.
- Imported nddata package (astropy) in desi_functions.py. This was a bug.
- Bug fix in HST (and accordingly JWST) search functions: the order of files in the output of the MAST download function is not the same as the order of files in the product input list. This led to a mismatch if there are multiple spectra (especially the case for JWST). This has now been fixed.
- The clean_sample() function now has a precision argument where the user can set the precision (as astropy unit) at which coordinates are taken to be the same. This was implemented because the JWST MSA slits were too close together and in the current hard-coded threshold would be counted as overlapping.
- The SDSS_get_spec() function now has a data_release argument where the user can choose the SDSS data release. Currently, up to DR17 is supported. DR18 is not supported, yet, due to some inconsistencies on the astroquery backend (an issue has been submitted, the likely culprit is a broken link to DR18).
- Fixed bug related to specutils, which reads in the wrong unit for the JWST spectrum
- General updates in the documentation of spectra_generator.md.

c07ad7b
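The file-order bug described in the list above is the kind of problem that goes away when downloads are matched to products by filename rather than by list position. A hedged sketch of that idea follows; the function name, the plain-list inputs, and the example paths are hypothetical, and the actual fix in the PR may differ.

```python
from pathlib import Path


def match_downloads(product_filenames, local_paths):
    """Map each requested filename to its local path by basename, not by index."""
    by_name = {Path(p).name: p for p in local_paths}
    return {name: by_name.get(name) for name in product_filenames}


requested = ["a_x1d.fits", "b_x1d.fits"]
# The download step may report files in a different order than requested.
downloaded = ["./data/mast/b/b_x1d.fits", "./data/mast/a/a_x1d.fits"]
paths = match_downloads(requested, downloaded)
```

A filename with no matching download maps to `None`, so a missing file shows up explicitly instead of silently shifting every later spectrum by one.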