
MAINT: cleanup rendering warnings #321

Merged Sep 12, 2024 (9 commits)
14 changes: 12 additions & 2 deletions conf.py
@@ -14,7 +14,7 @@
# -- Project information -----------------------------------------------------

project = 'Fornax Demo Notebooks'
copyright = '2022-2023, Fornax developers'
copyright = '2022-2024, Fornax developers'
author = 'Fornax developers'


@@ -36,7 +36,17 @@
# This pattern also affects html_static_path and html_extra_path.
exclude_patterns = ['_build', 'notes', '.tox', '.tmp', '.pytest_cache']

# MyST-NB configuration
# Top level README file's sole purpose is for the repo. We also don't include
# the data and output directories that are to be populated while running the notebooks.
exclude_patterns += ['README.md', '*/data/*', '*/output/*']

# We exclude the documentation index.md as its sole purpose is for their CI.
exclude_patterns += ['documentation/index.md',]

# Not yet included in the rendering:
exclude_patterns += ['documentation/notebook_review_process.md', 'spectroscopy/*', '*/code_src/*']

# Myst-NB configuration
nb_execution_timeout = 900

# -- Options for HTML output -------------------------------------------------
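As a sanity check, the combined exclude list above can be exercised outside of Sphinx. The sketch below approximates Sphinx's pattern matching with the stdlib `fnmatch` module (Sphinx uses its own matcher, so edge cases may differ); the file paths are illustrative.

```python
from fnmatch import fnmatch

# The combined exclude patterns from conf.py above.
exclude_patterns = [
    '_build', 'notes', '.tox', '.tmp', '.pytest_cache',
    'README.md', '*/data/*', '*/output/*',
    'documentation/index.md',
    'documentation/notebook_review_process.md', 'spectroscopy/*', '*/code_src/*',
]

def is_excluded(docname):
    """Rough stand-in for Sphinx's exclude_patterns matching."""
    return any(fnmatch(docname, pat) for pat in exclude_patterns)

# Illustrative paths: the first two should be excluded, the notebook should not.
for path in ['README.md', 'light_curves/data/sample.ecsv',
             'light_curves/light_curve_generator.md']:
    print(path, is_excluded(path))
```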
6 changes: 3 additions & 3 deletions forced_photometry/multiband_photometry.md
@@ -12,7 +12,7 @@ kernelspec:
---

# Automated Multiband Forced Photometry on Large Datasets
***


## Learning Goals:
By the end of this tutorial, you will be able to:
@@ -177,7 +177,7 @@ print("Number of objects: ", len(cosmos_table))

+++

#### Use the fornax cloud access API to obtain the IRAC data from the IRSA S3 bucket.
### Use the fornax cloud access API to obtain the IRAC data from the IRSA S3 bucket.

Details here may change as the prototype code is moved into the appropriate libraries, and as the data holdings move to the appropriate NGAP storage rather than IRSA resources.

@@ -250,7 +250,7 @@ fornax_download(spitzer, access_url_column='sia_url', fname_filter='go2_sci',
data_subdirectory='IRAC', verbose=False)
```

#### Use IVOA image search and Fornax download to obtain Galex from the MAST archive
### Use IVOA image search and Fornax download to obtain Galex from the MAST archive

```{code-cell} ipython3
# the Galex mosaic of COSMOS is broken into 4 separate images
8 changes: 2 additions & 6 deletions light_curves/ML_AGNzoo.md
Contributor:

I've had this notebook running for about an hour and it's still not done. It has been stuck on the second cell in section '4) Repeating the above, this time with ZTF + WISE manifold' for most of that time. It hasn't crashed (`top` shows the CPU is still in use), though there are a bunch of warnings. I don't know whether this is normal/expected or not.

Member Author (@bsipocz, Aug 27, 2024):

I can't comment on that, as I got stuck on the data download both locally and in CI; maybe open a separate issue for it.

Member Author:

Yes, I can confirm that I also timeout on that cell locally.

Contributor:

Opened #324

@@ -15,7 +15,7 @@ kernelspec:

By the IPAC Science Platform Team, last edit: Feb 16th, 2024

***



## Learning Goals
@@ -97,7 +97,7 @@ colors = [
custom_cmap = LinearSegmentedColormap.from_list("custom_theme", colors[1:])
```

***



## 1) Loading data
@@ -879,7 +879,3 @@ Datasets:
Packages:
* [`SOMPY`](https://github.com/sevamoo/SOMPY)
* [`umap`](https://github.com/lmcinnes/umap)



[Top of Page](#top)
4 changes: 2 additions & 2 deletions light_curves/light_curve_classifier.md
@@ -12,7 +12,7 @@ kernelspec:
---

# Light Curve Classifier
***


## Learning Goals
By the end of this tutorial, you will be able to:
@@ -55,7 +55,7 @@ Trained classifiers as well as estimates of their accuracy and plots of confusio
As of 2024 August, this notebook takes ~170s to run to completion on Fornax using the 'Astrophysics Default Image' and the 'Large' server with 16GB RAM/ 4CPU.

## Authors
Jessica Krick, Shooby Hemmati, Troy Raen, Brigitta Sipocz, Andreas Faisst, Vandana Desai, Dave Shoop
Jessica Krick, Shoubaneh Hemmati, Troy Raen, Brigitta Sipőcz, Andreas Faisst, Vandana Desai, David Shupe

## Acknowledgements
Stephanie La Massa
69 changes: 35 additions & 34 deletions light_curves/light_curve_generator.md
@@ -12,60 +12,61 @@ kernelspec:
---

# Make Multi-Wavelength Light Curves Using Archival Data
***


## Learning Goals
By the end of this tutorial, you will be able to:
• Automatically load a catalog of target sources
• Automatically & efficiently search NASA and non-NASA resources for the light curves of up to ~500 targets
• Store & manipulate light curves in a Pandas MultiIndex dataframe
• Plot all light curves on the same plot
* Automatically load a catalog of target sources
* Automatically & efficiently search NASA and non-NASA resources for the light curves of up to ~500 targets
* Store & manipulate light curves in a Pandas MultiIndex dataframe
* Plot all light curves on the same plot


## Introduction:
• A user has a sample of interesting targets for which they would like to see a plot of available archival light curves. We start with a small set of changing look AGN from Yang et al., 2018, which are automatically downloaded. Changing look AGN are cases where the broad emission lines appear or disappear (and not just that the flux is variable).
* A user has a sample of interesting targets for which they would like to see a plot of available archival light curves. We start with a small set of changing look AGN from Yang et al., 2018, which are automatically downloaded. Changing look AGN are cases where the broad emission lines appear or disappear (and not just that the flux is variable).

• We model light curve plots after van Velzen et al. 2021. We search through a curated list of time-domain NASA holdings as well as non-NASA sources. HEASARC catalogs used are Fermi and Beppo-Sax, IRSA catalogs used are ZTF and WISE, and MAST catalogs used are Pan-STARRS, TESS, Kepler, and K2. Non-NASA sources are Gaia and IceCube. This list is generalized enough to include many types of targets to make this notebook interesting for many types of science. All of these time-domain archives are searched in an automated and efficient fashion using astroquery, pyvo, pyarrow or APIs.
* We model light curve plots after van Velzen et al. 2021. We search through a curated list of time-domain NASA holdings as well as non-NASA sources. HEASARC catalogs used are Fermi and Beppo-Sax, IRSA catalogs used are ZTF and WISE, and MAST catalogs used are Pan-STARRS, TESS, Kepler, and K2. Non-NASA sources are Gaia and IceCube. This list is generalized enough to include many types of targets to make this notebook interesting for many types of science. All of these time-domain archives are searched in an automated and efficient fashion using astroquery, pyvo, pyarrow or APIs.

• Light curve data storage is a tricky problem. Currently we are using a MultiIndex Pandas dataframe, as the best existing choice for right now. One downside is that we need to manually track the units of flux and time instead of relying on an astropy storage scheme which would be able to do some of the units worrying for us (even astropy can't do all magnitude to flux conversions). Astropy does not currently have a good option for multi-band light curve storage.
* Light curve data storage is a tricky problem. Currently we are using a MultiIndex Pandas dataframe, as the best existing choice for right now. One downside is that we need to manually track the units of flux and time instead of relying on an astropy storage scheme which would be able to do some of the units worrying for us (even astropy can't do all magnitude to flux conversions). Astropy does not currently have a good option for multi-band light curve storage.

• This notebook walks through the individual steps required to collect the targets and their light curves and create figures. It also shows how to speed up the collection of light curves using python's `multiprocessing`. This is expected to be sufficient for up to ~500 targets. For a larger number of targets, consider using the bash script demonstrated in the neighboring notebook [scale_up](scale_up.md).
* This notebook walks through the individual steps required to collect the targets and their light curves and create figures. It also shows how to speed up the collection of light curves using python's `multiprocessing`. This is expected to be sufficient for up to ~500 targets. For a larger number of targets, consider using the bash script demonstrated in the neighboring notebook [scale_up](scale_up.md).

• ML work using these time-series light curves is in two neighboring notebooks: [ML_AGNzoo](ML_AGNzoo.md) and [light_curve_classifier](light_curve_classifier.md).
* ML work using these time-series light curves is in two neighboring notebooks: [ML_AGNzoo](ML_AGNzoo.md) and [light_curve_classifier](light_curve_classifier.md).
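The MultiIndex storage described above can be sketched as follows; the column names, band labels, and values are illustrative, not necessarily the notebook's exact schema.

```python
import pandas as pd

# Hypothetical light-curve records: (objectid, band, time, flux, err).
# Units must be tracked manually (e.g. time in MJD, flux in mJy), as noted above.
records = [
    ("obj1", "ztf_g", 59000.0, 12.1, 0.05),
    ("obj1", "w1",    59001.5, 10.7, 0.10),
    ("obj2", "ztf_g", 59000.2, 15.3, 0.07),
]
df_lc = pd.DataFrame(records, columns=["objectid", "band", "time", "flux", "err"])
df_lc = df_lc.set_index(["objectid", "band", "time"])

# Select every band of one object, or one band across all objects:
print(df_lc.loc["obj1"])
print(df_lc.xs("ztf_g", level="band"))
```

The hierarchical index makes per-object and per-band slicing cheap, which is what the plotting and ML notebooks need most.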

As written, this notebook is expected to require at least 2 CPU and 8G RAM.
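The `multiprocessing` speed-up mentioned above can be sketched like this, with `fetch_light_curve` as a hypothetical stand-in for the notebook's real archive-query functions:

```python
from multiprocessing import Pool

def fetch_light_curve(target):
    # Hypothetical stand-in for a real archive query (ZTF, WISE, TESS, ...).
    name, ra, dec = target
    return name, [(59000.0 + day, 1.0) for day in range(3)]  # (time, flux) pairs

def collect_light_curves(targets, processes=2):
    # Each worker process handles one target at a time; results are
    # gathered into a dict keyed by target name.
    with Pool(processes=processes) as pool:
        return dict(pool.map(fetch_light_curve, targets))

if __name__ == "__main__":
    targets = [("NGC 4151", 182.6357, 39.4058), ("3C 273", 187.2779, 2.0524)]
    light_curves = collect_light_curves(targets)
    print(sorted(light_curves))
```

For a handful up to ~500 targets this pattern is enough; beyond that, the [scale_up](scale_up.md) notebook's bash-script approach is the better fit.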

## Input:
• choose from a list of known changing look AGN from the literature
* choose from a list of known changing look AGN from the literature
OR -
• input your own sample
* input your own sample

## Output:
• an archival optical + IR + neutrino light curve
* an archival optical + IR + neutrino light curve

## Authors:
Jessica Krick, Shoubaneh Hemmati, Andreas Faisst, Troy Raen, Brigitta Sipőcz, Dave Shupe
Jessica Krick, Shoubaneh Hemmati, Andreas Faisst, Troy Raen, Brigitta Sipőcz, David Shupe

## Acknowledgements:
Suvi Gezari, Antara Basu-zych, Stephanie LaMassa
MAST, HEASARC, & IRSA Fornax teams

## Imports:
• `acstools` to work with HST magnitude to flux conversion
• `astropy` to work with coordinates/units and data structures
• `astroquery` to interface with archives APIs
• `hpgeom` to locate coordinates in HEALPix space
• `lightkurve` to search TESS, Kepler, and K2 archives
• `matplotlib` for plotting
• `multiprocessing` to use the power of multiple CPUs to get work done faster
• `numpy` for numerical processing
• `pandas` for their data structure DataFrame and all the accompanying functions
• `pyarrow` to work with Parquet files for WISE and ZTF
• `pyvo` for accessing Virtual Observatory(VO) standard data
• `requests` to get information from URLs
• `scipy` to do statistics
• `tqdm` to track progress on long running jobs
• `urllib` to handle archive searches with website interface
* `acstools` to work with HST magnitude to flux conversion
* `astropy` to work with coordinates/units and data structures
* `astroquery` to interface with archives APIs
* `hpgeom` to locate coordinates in HEALPix space
* `lightkurve` to search TESS, Kepler, and K2 archives
* `matplotlib` for plotting
* `multiprocessing` to use the power of multiple CPUs to get work done faster
* `numpy` for numerical processing
* `pandas` with their `[aws]` extras for their data structure DataFrame and all the accompanying functions
* `pyarrow` to work with Parquet files for WISE and ZTF
* `pyvo` for accessing Virtual Observatory(VO) standard data
* `requests` to get information from URLs
* `scipy` to do statistics
* `tqdm` to track progress on long running jobs
* `urllib` to handle archive searches with website interface


This cell will install them if needed:

@@ -433,11 +434,11 @@ _ = create_figures(df_lc = parallel_df_lc, # either df_lc (serial call) or paral

This work made use of:

• Astroquery; Ginsburg et al., 2019, 2019AJ....157...98G
• Astropy; Astropy Collaboration 2022, Astropy Collaboration 2018, Astropy Collaboration 2013, 2022ApJ...935..167A, 2018AJ....156..123A, 2013A&A...558A..33A
• Lightkurve; Lightkurve Collaboration 2018, 2018ascl.soft12013L
• acstools; https://zenodo.org/record/7406933#.ZBH1HS-B0eY
• unWISE light curves; Meisner et al., 2023, 2023AJ....165...36M
* Astroquery; Ginsburg et al., 2019, 2019AJ....157...98G
* Astropy; Astropy Collaboration 2022, Astropy Collaboration 2018, Astropy Collaboration 2013, 2022ApJ...935..167A, 2018AJ....156..123A, 2013A&A...558A..33A
* Lightkurve; Lightkurve Collaboration 2018, 2018ascl.soft12013L
* acstools; https://zenodo.org/record/7406933#.ZBH1HS-B0eY
* unWISE light curves; Meisner et al., 2023, 2023AJ....165...36M

```{code-cell} ipython3

5 changes: 3 additions & 2 deletions light_curves/requirements_ML_AGNzoo.txt
Contributor:

Starting with a clean environment, I had to make a few additions to this file.

@@ -2,11 +2,12 @@
# beginning of the notebook, make sure the lists are consistent and only
# contain dependencies that are actually used in the notebook.
tqdm
numpy
numpy<2 # SOMPY incompatibility
scipy
pandas
pandas[parquet]
matplotlib
Contributor:

Suggested change (add after matplotlib):
scikit-image # sompy requires scikit-image

Member Author:

I missed this one, thanks! will rather add it to fada9e7

scikit-learn
scikit-image
astropy
umap-learn
git+https://github.com/sevamoo/SOMPY
8 changes: 7 additions & 1 deletion light_curves/requirements_light_curve_classifier.txt
@@ -2,12 +2,18 @@
# beginning of the notebook, make sure the lists are consistent and only
# contain dependencies that are actually used in the notebook.
numpy
Contributor:

In the science_demo environment, I had to also install numba.

Suggested change (add before numpy):
numba # required by Arsenal (sktime)

Member Author:

It gets pulled without issues both locally and in CI, so I would rather not list it as we don't use numba directly.
(see the log in #309)

pandas
pandas[parquet]
matplotlib
astropy
sktime
tqdm
googledrivedownloader
scikit-learn
acstools
## Optional indirect dependencies required by functionalities used in the notebook
# Required by functionality we use from acstools
scikit-image
# Required by functionality we use from sktime
numba
# Required for sensible progress bars
ipywidgets
5 changes: 3 additions & 2 deletions light_curves/requirements_light_curve_generator.txt
@@ -5,8 +5,7 @@ requests
tqdm
numpy
scipy
pandas
pyarrow
pandas[aws, parquet]
matplotlib
hpgeom
astropy
@@ -15,5 +14,7 @@ astroquery>=0.4.8.dev0
acstools
lightkurve
alerce
# Required by functionality we use from acstools
scikit-image
# Required for sensible progress bars
ipywidgets
2 changes: 1 addition & 1 deletion spectroscopy/explore_Euclid_data.md
@@ -1,6 +1,6 @@
<!-- #region -->
# Explore Euclid Data
***


## Learning Goals
By the end of this tutorial, you will be able to:
4 changes: 2 additions & 2 deletions spectroscopy/spectra_generator.md
@@ -14,7 +14,7 @@ jupyter:

<!-- #region -->
# Extract Multi-Wavelength Spectroscopy from Archival Data
***


## Learning Goals
By the end of this tutorial, you will be able to:
@@ -75,7 +75,7 @@ The ones with an asterisk (*) are the challenging ones.
As of 2024 August, this notebook takes ~300s to run to completion on Fornax using the 'Astrophysics Default Image' and the 'Large' server with 16GB RAM/ 4CPU.

## Authors:
Andreas Faisst, Jessica Krick, Shoubaneh Hemmati, Troy Raen, Brigitta Sipőcz, Dave Shupe
Andreas Faisst, Jessica Krick, Shoubaneh Hemmati, Troy Raen, Brigitta Sipőcz, David Shupe

## Acknowledgements:
...
2 changes: 1 addition & 1 deletion tox.ini
@@ -22,7 +22,7 @@ allowlist_externals =
commands =
pip freeze

buildhtml: git clone --depth 1 https://github.com/nasa-fornax/fornax-documentation.git documentation
buildhtml: bash -c 'if [[ ! -d documentation ]]; then git clone --depth 1 https://github.com/nasa-fornax/fornax-documentation.git documentation; else cd documentation; git fetch --all; git pull; cd ..; fi'
buildhtml: sphinx-build -b html . _build/html -D nb_execution_mode=off -nT --keep-going
# SED magic to remove the toctree captions from the rendered index page while keeping them in the sidebar TOC
buildhtml: sed -E -i.bak '/caption-text/{N; s/.+caption-text.+\n<ul>/<ul>/; P;D;}' _build/html/index.html
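The clone-or-update one-liner added to tox.ini can also be written as a standalone function; this sketch uses `git -C` to run each command inside the checkout instead of the `cd`/`cd ..` dance (the function name is ours, not part of the repo):

```shell
#!/usr/bin/env bash
set -euo pipefail

# Clone the docs repo on the first run; update the existing checkout afterwards.
clone_or_update() {
    local repo_url=$1 target=$2
    if [ ! -d "$target" ]; then
        git clone --depth 1 "$repo_url" "$target"
    else
        git -C "$target" fetch --all
        git -C "$target" pull
    fi
}
```

Either form is idempotent, which is what tox needs: rerunning buildhtml in a dirty work directory no longer fails on an existing `documentation` checkout.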