Commit

Merge pull request #198 from /issues/177
Cleanup light curve notebook (complete Issue #177) 6491ad0
troyraen committed Jan 11, 2024
1 parent 42539d2 commit e9b9af8
Showing 9 changed files with 255 additions and 261 deletions.
Binary file modified .doctrees/environment.pickle
Binary file not shown.
Binary file modified .doctrees/light_curves/light_curve_generator.doctree
Binary file not shown.
122 changes: 61 additions & 61 deletions _sources/forced_photometry/multiband_photometry.ipynb

Large diffs are not rendered by default.

82 changes: 41 additions & 41 deletions _sources/light_curves/ML_AGNzoo.ipynb
@@ -2,7 +2,7 @@
"cells": [
{
"cell_type": "markdown",
- "id": "67ecc6b8",
+ "id": "0e4da272",
"metadata": {},
"source": [
"# How do AGNs selected with different techniques compare? \n",
@@ -17,7 +17,7 @@
{
"cell_type": "code",
"execution_count": null,
- "id": "34011258",
+ "id": "449054d8",
"metadata": {},
"outputs": [],
"source": [
@@ -68,7 +68,7 @@
},
{
"cell_type": "markdown",
- "id": "29c88b1c",
+ "id": "abc40524",
"metadata": {},
"source": [
"Here we load a parquet file of light curves generated using the multiband_lc notebook. One can build the sample from different sources in the literature and grab the data from archives of interest."
@@ -77,7 +77,7 @@
{
"cell_type": "code",
"execution_count": null,
- "id": "2882f5dc",
+ "id": "6d85ad99",
"metadata": {},
"outputs": [],
"source": [
@@ -88,7 +88,7 @@
},
{
"cell_type": "markdown",
- "id": "38e38d14",
+ "id": "ec2e1d57",
"metadata": {},
"source": [
"## What is in this sample?\n",
@@ -99,7 +99,7 @@
{
"cell_type": "code",
"execution_count": null,
- "id": "e67b1eb1",
+ "id": "572a967c",
"metadata": {},
"outputs": [],
"source": [
@@ -134,7 +134,7 @@
},
{
"cell_type": "markdown",
- "id": "fb39af35",
+ "id": "1729f977",
"metadata": {},
"source": [
"In this particular example, the three largest subsamples are AGNs selected from [gamma ray observations by the Fermi Large Area Telescope](https://ui.adsabs.harvard.edu/abs/2015yCat..18100014A/similar) (more than 98% of which are blazars), AGNs selected from optical spectra in the [SDSS quasar sample DR16Q](https://www.sdss4.org/dr17/algorithms/qso_catalog/) with a redshift criterion (z<2), and a subset of AGNs selected in the MIR WISE bands based on their variability ([csv in data folder credit RChary](https://ui.adsabs.harvard.edu/abs/2019AAS...23333004P/abstract)). We also include some smaller samples from the literature to see where they sit compared to the rest of the population and whether they are localized on the 2D projection. These include the Changing Look AGNs from the literature (e.g., [LaMassa et al. 2015](https://ui.adsabs.harvard.edu/abs/2015ApJ...800..144L/abstract), [Lyu et al. 2022](https://ui.adsabs.harvard.edu/abs/2022ApJ...927..227L/abstract), [Hon et al. 2022](https://ui.adsabs.harvard.edu/abs/2022MNRAS.511...54H/abstract)), a sample that showed variability in GALEX UV images ([Wasleske et al. 2022](https://ui.adsabs.harvard.edu/abs/2022ApJ...933...37W/abstract)), a sample of variable sources identified in optical Palomar observations ([Baldassare et al. 2020](https://ui.adsabs.harvard.edu/abs/2020ApJ...896...10B/abstract)), and the optically variable AGNs in the COSMOS field from a three-year program on the VLT ([De Cicco et al. 2019](https://ui.adsabs.harvard.edu/abs/2019A%26A...627A..33D/abstract)). We also include 30 Tidal Disruption Event coordinates identified from ZTF light curves ([Hammerstein et al. 2023](https://iopscience.iop.org/article/10.3847/1538-4357/aca283/meta))."
@@ -143,7 +143,7 @@
{
"cell_type": "code",
"execution_count": null,
- "id": "66e7e3ef",
+ "id": "7c722f45",
"metadata": {},
"outputs": [],
"source": [
@@ -162,7 +162,7 @@
},
{
"cell_type": "markdown",
- "id": "7610afce",
+ "id": "045cb6b6",
"metadata": {},
"source": [
"The histogram shows the number of light curves that ended up in the multi-index data frame from each of the archive calls in different wavebands/filters. Note that the IceCube peak in the figure above should be corrected, as it also includes non-detections."
@@ -171,7 +171,7 @@
{
"cell_type": "code",
"execution_count": null,
- "id": "d7803ac8",
+ "id": "a215a464",
"metadata": {},
"outputs": [],
"source": [
@@ -206,15 +206,15 @@
},
{
"cell_type": "markdown",
- "id": "d4c7a62c",
+ "id": "069cc8b0",
"metadata": {},
"source": [
"The histogram shows which bands have the highest number of observed light curves, but what may matter more for finding variability or changing-look behavior in light curves is the cadence and the average baseline of the observations. For instance, Panstarrs has a large number of light curve detections in our sample, but the figure above shows that its average number of visits and its baseline are considerably smaller than those of ZTF. WISE has the longest baseline of observations, which makes it suitable for finding longer-term variability in objects."
]
},
{
"cell_type": "markdown",
- "id": "21d86498",
+ "id": "ff07b410",
"metadata": {},
"source": [
"## Looking at ZTF lightcurves alone\n",
@@ -225,7 +225,7 @@
{
"cell_type": "code",
"execution_count": null,
- "id": "1b8693a8",
+ "id": "d1c39a42",
"metadata": {},
"outputs": [],
"source": [
@@ -259,7 +259,7 @@
},
{
"cell_type": "markdown",
- "id": "391aca63",
+ "id": "c746161c",
"metadata": {},
"source": [
"Combining the three bands into one longer array, in order of increasing wavelength, provides both the SED shape and the variability in each band from the light curve. The figure below demonstrates this as well as our normalization choice. We normalize the data in the ZTF R band, as it has a higher average number of visits than the G and I bands. We remove outliers before measuring the mean and max of the light curve and normalizing by it. This normalization can be skipped if one is merely interested in comparing the brightnesses of the data in this sample, but because the dependence on flux is strong, normalization helps when looking for variability and comparing the shapes of light curves."
@@ -268,7 +268,7 @@
{
"cell_type": "code",
"execution_count": null,
- "id": "03f48817",
+ "id": "0b16923b",
"metadata": {},
"outputs": [],
"source": [
@@ -306,7 +306,7 @@
},
{
"cell_type": "markdown",
- "id": "eff9b72f",
+ "id": "9e549fbc",
"metadata": {},
"source": [
"Now we can train a UMAP with the processed data vectors above. Different choices for the number of neighbors, minimum distance, and metric can be made, and the parameter space can be explored. We show here our preferred combination given this data. We choose the Manhattan distance (also called [the L1 distance](https://en.wikipedia.org/wiki/Taxicab_geometry)) as it is well suited to the kind of grid we interpolated on; for instance, we want the distance not to change if there are observations missing. Another metric appropriate for our purpose in time domain analysis is Dynamic Time Warping ([DTW](https://en.wikipedia.org/wiki/Dynamic_time_warping)), which is insensitive to a shift in time. This is helpful because we interpolate the observations onto a grid starting from time 0, and when discussing variability we care less about when it happens and more about whether it happens and how strong it is. As measuring the DTW distance takes longer than the other metrics, we show examples here with Manhattan and only show one example exploring the parameter space including a DTW metric in the last cell of this notebook."
@@ -315,7 +315,7 @@
{
"cell_type": "code",
"execution_count": null,
- "id": "89a122f7",
+ "id": "8ac38d5b",
"metadata": {},
"outputs": [],
"source": [
@@ -351,7 +351,7 @@
},
{
"cell_type": "markdown",
- "id": "bb3ba587",
+ "id": "5b432d1d",
"metadata": {},
"source": [
"The left panel is color coded by the origin of the sample. The middle panel shows the sum of the mean brightnesses in the three bands (arbitrary units), demonstrating that after normalization we see no correlation with brightness. The panel on the right is color coded by a statistical measure of variability (i.e., the fractional variation, [see here](https://ned.ipac.caltech.edu/level5/Sept01/Peterson2/Peter2_1.html)). As in the plots above, it is not easy to see all the data points and correlations, so the next two cells measure the probability of belonging to each original sample as well as the mean statistical property on an interpolated grid on this reduced 2D projected surface."
@@ -360,7 +360,7 @@
{
"cell_type": "code",
"execution_count": null,
- "id": "796ebb86",
+ "id": "99f5c14c",
"metadata": {},
"outputs": [],
"source": [
@@ -405,7 +405,7 @@
{
"cell_type": "code",
"execution_count": null,
- "id": "bd8b4bb5",
+ "id": "f4327539",
"metadata": {},
"outputs": [],
"source": [
@@ -440,7 +440,7 @@
},
{
"cell_type": "markdown",
- "id": "2933b348",
+ "id": "dbd1e9f5",
"metadata": {},
"source": [
"The figure above shows how, with ZTF light curves alone, we can separate some of these AGN samples and where they overlap. We can do a similar exercise with other dimensionality reduction techniques. Below we show two SOMs, one with normalized light curves and another with no normalization. The advantage of UMAPs over SOMs is that in practice you may change the parameters to separate classes of vastly different data points, as distance is preserved on a UMAP. On a SOM, however, only the topology of the higher dimensions is preserved and not distance; hence the change on the 2D grid does not need to be smooth, and there might be large jumps from one cell to the next. On the other hand, an advantage of the SOM is that by definition it has a grid, so no posterior interpolation (as we did above) is needed to map more data or to measure probabilities, etc."
@@ -449,7 +449,7 @@
{
"cell_type": "code",
"execution_count": null,
- "id": "be397cb5",
+ "id": "31622d65",
"metadata": {},
"outputs": [],
"source": [
@@ -462,7 +462,7 @@
{
"cell_type": "code",
"execution_count": null,
- "id": "8adb248d",
+ "id": "4356cdc1",
"metadata": {},
"outputs": [],
"source": [
@@ -561,7 +561,7 @@
},
{
"cell_type": "markdown",
- "id": "64cc4dfa",
+ "id": "5f3a4c44",
"metadata": {},
"source": [
"The above SOMs are colored by the mean fractional variation of the light curves in all bands (a measure of AGN variability). The crosses are different samples mapped to the trained SOM to see if they are distinguishable on a normalized light curve SOM."
@@ -570,7 +570,7 @@
{
"cell_type": "code",
"execution_count": null,
- "id": "fdb65d5e",
+ "id": "bbef7fb2",
"metadata": {},
"outputs": [],
"source": [
@@ -590,7 +590,7 @@
{
"cell_type": "code",
"execution_count": null,
- "id": "0e661978",
+ "id": "92d0413c",
"metadata": {},
"outputs": [],
"source": [
@@ -689,15 +689,15 @@
},
{
"cell_type": "markdown",
- "id": "0b4307c4",
+ "id": "a107c02e",
"metadata": {},
"source": [
"Skipping the normalization of the light curves shows, for example, that the sources in the De Cicco et al. 2019 sample, from the three-year VLT observations of the COSMOS field, are all fainter than the rest."
]
},
{
"cell_type": "markdown",
- "id": "acf58875",
+ "id": "1ff72531",
"metadata": {},
"source": [
"# Repeating the above, this time with Panstarrs observations"
@@ -706,7 +706,7 @@
{
"cell_type": "code",
"execution_count": null,
- "id": "2a70f4e2",
+ "id": "90652dd9",
"metadata": {},
"outputs": [],
"source": [
@@ -741,7 +741,7 @@
{
"cell_type": "code",
"execution_count": null,
- "id": "eb3ad6a1",
+ "id": "01351fe6",
"metadata": {},
"outputs": [],
"source": [
@@ -777,7 +777,7 @@
{
"cell_type": "code",
"execution_count": null,
- "id": "bba46630",
+ "id": "8d9df081",
"metadata": {},
"outputs": [],
"source": [
@@ -812,7 +812,7 @@
},
{
"cell_type": "markdown",
- "id": "99139bd8",
+ "id": "510f8947",
"metadata": {},
"source": [
"# ZTF + WISE"
@@ -821,7 +821,7 @@
{
"cell_type": "code",
"execution_count": null,
- "id": "4deaf085",
+ "id": "0de41162",
"metadata": {},
"outputs": [],
"source": [
@@ -846,7 +846,7 @@
{
"cell_type": "code",
"execution_count": null,
- "id": "897e8e71",
+ "id": "dd639f8b",
"metadata": {},
"outputs": [],
"source": [
@@ -881,7 +881,7 @@
{
"cell_type": "code",
"execution_count": null,
- "id": "c662c895",
+ "id": "0d0de1a6",
"metadata": {},
"outputs": [],
"source": [
@@ -916,7 +916,7 @@
},
{
"cell_type": "markdown",
- "id": "7c950e07",
+ "id": "3b41e410",
"metadata": {},
"source": [
"# WISE alone"
@@ -925,7 +925,7 @@
{
"cell_type": "code",
"execution_count": null,
- "id": "09681aee",
+ "id": "e693e1ca",
"metadata": {},
"outputs": [],
"source": [
@@ -950,7 +950,7 @@
{
"cell_type": "code",
"execution_count": null,
- "id": "7b309cc4",
+ "id": "37071111",
"metadata": {},
"outputs": [],
"source": [
@@ -986,7 +986,7 @@
{
"cell_type": "code",
"execution_count": null,
- "id": "fd44d81c",
+ "id": "856c1db8",
"metadata": {},
"outputs": [],
"source": [
@@ -1023,7 +1023,7 @@
{
"cell_type": "code",
"execution_count": null,
- "id": "6ecbdd76",
+ "id": "871fd64e",
"metadata": {},
"outputs": [],
"source": [
@@ -1072,7 +1072,7 @@
{
"cell_type": "code",
"execution_count": null,
- "id": "ac6afa6b",
+ "id": "9ad75e25",
"metadata": {},
"outputs": [],
"source": []
