-
Notifications
You must be signed in to change notification settings - Fork 26
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Merge pull request #132 from e-koch/the_docs
The Docs
- Loading branch information
Showing
57 changed files
with
1,897 additions
and
384 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Submodule astropy_helpers
updated
76 files
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,87 @@ | ||
|
||
********** | ||
Bispectrum | ||
********** | ||
|
||
Overview | ||
-------- | ||
|
||
The `bispectrum <https://en.wikipedia.org/wiki/Bispectrum>`_ is the Fourier transform of the three-point covariance function. It represents the next higher-order expansion upon the more commonly used two-point statistics, whose autocorrelation function is the power spectrum. The bispectrum is computed using: | ||
|
||
.. math:: | ||
B(k_1, k_2) = F^{\ast}(k_1 + k_2)\,F(k_1)\,F(k_2) | ||
where :math:`\ast` denotes the complex conjugate, :math:`F` is the Fourier transform of some signal, and :math:`k_1,\,k_2` are wavenumbers. | ||
|
||
The bispectrum retains phase information which is lost in the power spectrum, and is therefore useful for investigating phase coherence and coupling. | ||
|
||
The use of the bispectrum in the ISM was introduced by :ref:`Burkhart et al. 2009 <ref-burkhart2009>`, and recently extended in :ref:`Burkhart et al. 2016 <ref-burkhart2016>`. | ||
|
||
The phase information retained by the bispectrum requires it to be a complex quantity. A real, normalized version can be expressed through the `bicoherence <https://en.wikipedia.org/wiki/Bicoherence>`_. The bicoherence is a measure of phase coupling alone, where the maximal values of 1 and 0 represent complete coupled and uncoupled, respectively. The form that is used here is defined by :ref:`Hagihira et al. 2001 <ref-hagihira2001>`: | ||
|
||
.. math:: | ||
b(k_1, k_2) = \frac{|B(k_1, k_2)|}{\sum_{k_1, k_2} |F(k_1)F(k_2)F^{\ast}(k_1 + k_2)|} | ||
The denominator normalizes out the "power" at the modes :math:`k_1,\,k_2`; this is effectively dividing out the value of the power spectrum, leaving a fractional difference that is entirely the result of the phase coupling. Alternatively, the denominator can be thought of as the value attained if the modes :math:`k_1\,k_2` are completely phase coupled, and therefore is the maximal value attainable. | ||
|
||
In the recent work of :ref:`Burkhart et al. 2016 <ref-burkhart2016>`, they used a *phase coherence technique* (PCI), alternative to the bispectrum, to measure phase coherence as a function of radial and azimuthal changes (so called *lags* averaged over radius or angle in the bispectrum plane). This is defined as | ||
|
||
.. math:: | ||
C(R) = \frac{L_{\rm PRS}(R) - L_{\rm ORG}(R)}{L_{\rm PRS}(R) - L_{\rm PCS}(R)} | ||
where :math:`L(R)` is the relevant averaged quantity calculated at different lags (denoted generally by :math:`R`). ORG is the original signal, PRS is the signal with randomized phases, and PCS is the signal with phases set to be correlated. For each value of R, this yields a normalized value (:math:`C(R)`) between 0 and 1, similar to the bispectrum. **Note: PCI will be added to the existing TurbuStat code, but is not yet complete.**. | ||
|
||
Using | ||
----- | ||
|
||
**The data in this tutorial are available** `here <https://girder.hub.yt/#user/57b31aee7b6f080001528c6d/folder/57e55670a909a80001d301ae>`_. | ||
|
||
We need to import the `~turbustat.statistics.BiSpectrum` code, along with a few other common packages: | ||
|
||
>>> from turbustat.statistics import BiSpectrum | ||
>>> from astropy.io import fits | ||
|
||
And we load in the data: | ||
|
||
>>> moment0 = fits.open("Design4_21_0_0_flatrho_0021_13co.moment0.fits")[0] # doctest: +SKIP | ||
|
||
While the bispectrum can be extended to sample in N-dimensions, the current implementation requires a 2D input. In all previous work, the computation is performed on an integrated intensity or column density map. | ||
|
||
First, the `~turbustat.statistics.BiSpectrum` class is initialized: | ||
|
||
>>> bispec = BiSpectrum(moment0) # doctest: +SKIP | ||
|
||
The bispectrum requires only the image, not a header, so passing any arbitrary 2D array will work. | ||
|
||
The entire computation is performed through `~turbustat.statistics.BiSpectrum.run`: | ||
|
||
>>> bispec.run(verbose=True, nsamples=1.e3) # doctest: +SKIP | ||
|
||
.. image:: images/bispectrum_design4.png | ||
|
||
Even in small 2D image (128x128 here), the number of possible combinations for :math:`k_1,\,k_2` is already massive (the maximum value of :math:`k_1,\,k_2` is half of the largest dimension size in the image). To minimize the computational time, we can randomly sample some number of *phases* for each value of :math:`k_1,\,k_2` (so :math:`k_1 + k_2`, the coupling term, changes). This is set by `nsamples`. There is shot noise associated with this random sampling, and so different `nsamples` should be tested. In this, structure begins to become more apparent with around 1000, and improves beyond this (at the expense of time). | ||
|
||
`~turbustat.statistics.BiSpectrum.run` is really only hiding a single step: `~turbustat.statistics.BiSpectrum.compute_bispectrum`. There are two additional inputs which may be set: | ||
|
||
>>> bispec.run(nsamples=1.e3, mean_subtract=True, seed=4242424) # doctest: +SKIP | ||
|
||
The two additional arguments here are `seed`, to set the random seed for sampling, and `mean_subtract`, which first removes the mean from the data before computing the bispectrum. The removes the "zero frequency" power defined based on the largest scale in the image and removes much the phase coupling along the :math:`k_1 = k_2` line, highlighting the non-linear mode interactions: | ||
|
||
.. image:: images/bispectrum_w_and_wo_meansub_coherence.png | ||
|
||
The figure shows the effect on the bicoherence from subtracting the mean. The colorbar is limited between 0 and 1, with black representing 1. | ||
|
||
References | ||
---------- | ||
|
||
.. _ref-burkhart2009: | ||
|
||
`Burkhart et al. 2009 <https://ui.adsabs.harvard.edu/#abs/2009ApJ...693..250B/abstract>`_ | ||
|
||
.. _ref-burkhart2016: | ||
|
||
`Burkhart et al. 2016 <https://ui.adsabs.harvard.edu/#abs/2016ApJ...827...26B/abstract>`_ | ||
|
||
.. _ref-hagihira2001: | ||
|
||
`Hagihira et al. 2001 <https://www.ncbi.nlm.nih.gov/pubmed/11574365>`_ |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,75 @@ | ||
|
||
************** | ||
Delta-Variance | ||
************** | ||
|
||
Overview | ||
-------- | ||
|
||
The :math:`\Delta`-variance technique was introduced by :ref:`Stutzki et al. 1998 <_ref-stutzki1998>` as a generalization of *Allan-variance*, a technique used to study the drift in instrumentation. They found that molecular cloud structures are well characterized by fractional Brownian motion structure, which results from a power-law power spectrum with a random phase distribution. The technique was extended by :ref:`Bensch et al. 2001 <_ref-bensch2001>` to account for the effects of beam smoothing and edge effects on a discretely sampled grid. With this, they identified a functional form to recover the index of the near power-law relation. The technique was extended again by :ref:`Ossenkopf et al. 2008a <_ref-ossenkopf2008a>`, where the computation using filters of different scales was moved in to the Fourier domain, allowing for a significant improvement in speed. The following description uses their formulation. | ||
|
||
Delta-variance measures the amount of structure on a given range of scales. Each delta-variance point is calculated by filtering an image with a spherically symmetric kernel - a French-hat or Mexican hat (Ricker) kernels - and computing the variance of the filtered map. Due to the effects of a finite grid that typically does not have periodic boundaries and the effects of noise, :ref:`Ossenkopf et al. 2008a <_ref-ossenkopf2008a>` proposed a convolution based method that splits the kernel into its central peak and outer annulus, convolves the separate regions, and subtracts the annulus-convolved map from the peak-convolved map. The Mexican-hat kernel separation can be defined using two Gaussian functions. A weight map was also introduced to minimize noise effects where there is low S/N regions in the data. Altogether, this is expressed as: | ||
|
||
.. math:: | ||
F(r) = \frac{G_{\rm core}(r)}{W_{\rm core}(r)} - \frac{G_{\rm ann}(r)}{W_{\rm ann}(r)} | ||
where :math:`r` is the kernel size, :math:`G` is the convolved image map, and :math:`W` is the convolved weight map. The delta-variance is then, | ||
|
||
.. math:: | ||
\sigma_{\delta}(r) = \frac{\Sigma_{\rm map} \mathrm{Var}(F(r)) W_{\rm tot}(r)}{\Sigma_{\rm map} W_{\rm tot}(r)} | ||
where :math:`W_{\rm tot}(r) = W_{\rm core}(r)\,W_{\rm ann}(r)`. | ||
|
||
Since the kernel is separated into two components, the ratio between their widths can be set independently. :ref:`Ossenkopf et al. 2008a <_ref-ossenkopf2008a>` find an optimal ratio of 1.5 for the Mexican-hat kernel, which is the element used in TurbuStat. | ||
|
||
Performing this operation yields a power-law-like relation between the scales :math:`r` and the delta-variance. | ||
|
||
Using | ||
----- | ||
|
||
**The data in this tutorial are available** `here <https://girder.hub.yt/#user/57b31aee7b6f080001528c6d/folder/57e55670a909a80001d301ae>`_. | ||
|
||
We need to import the `~turbustat.statistics.DeltaVariance` code, along with a few other common packages: | ||
|
||
>>> from turbustat.statistics import DeltaVariance | ||
>>> from astropy.io import fits | ||
|
||
Then, we load in the data and the associated error array: | ||
|
||
>>> moment0 = fits.open("Design4_21_0_0_flatrho_0021_13co.moment0.fits")[0] # doctest: +SKIP | ||
>>> moment0_err = fits.open("Design4_21_0_0_flatrho_0021_13co.moment0_error.fits")[0] # doctest: +SKIP | ||
|
||
Next, we initialize the `~turbustat.statistics.DeltaVariance` class: | ||
|
||
>>> delvar = DeltaVariance(moment0, weights=moment0_err) # doctest: +SKIP | ||
|
||
The weight array is optional, but is recommended. Note that this is not the exact form of a weight array used by :ref:`Ossenkopf et al. 2008b <_ref-ossenkopf2008b>`, who used the sqrt of the number of elements along the line of sight used to create the integrated intensity map. This doesn't take into account the varying S/N of each element used, however. In the case with the simulated data, the two are nearly identical, since the noise value associated with each element is constant. If no weights are given, a uniform array of ones is used. | ||
|
||
By default, 25 lag values will be used, logarithmically spaced between 3 pixels to half of the minimum axis size. Alternative lags can be specified by setting the `lags` keyword. If a `~numpy.ndarray` is passed, it is assumed to be in pixel units. Lags can also be given in angular units. | ||
|
||
The entire process is performed through `~turbustat.statistics.DeltaVariance.run`: | ||
|
||
>>> delvar.run(verbose=True, ang_units=True, unit=u.arcmin) # doctest: +SKIP | ||
|
||
.. image:: images/delvar_design4.png | ||
|
||
`ang_units` sets showing angular scales in the plot, and `unit` is the unit to show them in. | ||
|
||
References | ||
---------- | ||
|
||
.. _ref-stutzki1998: | ||
|
||
`Stutzki et al. 1998 <https://ui.adsabs.harvard.edu/#abs/1998A&A...336..697S/abstract>`_ | ||
|
||
.. _ref-bensch2001: | ||
|
||
`Bensch, F. <https://ui.adsabs.harvard.edu/#abs/2001A&A...366..636B/abstract>`_ | ||
|
||
.. _ref-ossenkopf2008a: | ||
|
||
`Ossenkopf at al. 2008a <https://ui.adsabs.harvard.edu/#abs/2008A&A...485..917O/abstract>`_ | ||
|
||
.. _ref-ossenkopf2008b: | ||
|
||
`Ossenkopf at al. 2008b <https://ui.adsabs.harvard.edu/#abs/2008A&A...485..719O/abstract>`_ |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,119 @@ | ||
|
||
*********** | ||
Dendrograms | ||
*********** | ||
|
||
Overview | ||
-------- | ||
|
||
In general, dendrograms provide a hierarchical description of datasets, which may be used to identify clusters of similar objects or variables. This is known as `hierarchical clustering <https://en.wikipedia.org/wiki/Hierarchical_clustering>`_. In the case of position-position-velocity (PPV) cubes, a dendrogram is a hierarchical decomposition of the emission in the cube. This decomposition was introduced by :ref:`Rosolowsky et al. 2008 <ref-rosolowsky2008>` to calculate the multiscale properties of molecular gas in nearby clouds. The tree structure is comprised of branches and leaves. Branches are the connections, while leaves are the tips of the branches. | ||
|
||
:ref:`Burkhart et al. 2013 <ref-burkhart2013>` introduced two statistics for comparing the dendrograms of two cubes: the relationship between the number of leaves and branches in the tree versus the minimum branch length, and a histogram comparison of the peak intensity in a branch or leaf. The former statistic shows a power-law like turn-off with increasing branch length. | ||
|
||
Using | ||
----- | ||
|
||
**The data in this tutorial are available** `here <https://girder.hub.yt/#user/57b31aee7b6f080001528c6d/folder/57e55670a909a80001d301ae>`_. | ||
|
||
**Requires the optional astrodendro package to be installed. See documentation** `here <http://dendrograms.org/>`_ | ||
|
||
Importing the dendrograms code, along with a few other common packages: | ||
|
||
>>> from turbustat.statistics import Dendrogram_Stats | ||
>>> from astropy.io import fits | ||
>>> import astropy.units as u | ||
>>> from astrodendro import Dendrogram # doctest: +SKIP | ||
>>> import matplotlib.pyplot as plt | ||
>>> import numpy as np | ||
|
||
And we load in the data: | ||
|
||
>>> cube = fits.open("Design4_21_0_0_flatrho_0021_13co.fits")[0] # doctest: +SKIP | ||
|
||
Before running the statistics side, we can first compute the dendrogram itself to see what we're dealing with: | ||
|
||
>>> d = Dendrogram.compute(cube, min_value=0.005, min_delta=0.1, min_npix=50, verbose=True) # doctest: +SKIP | ||
>>> ax = plt.subplot(111) # doctest: +SKIP | ||
>>> d.plotter().plot_tree(ax) # doctest: +SKIP | ||
>>> plt.ylabel("Intensity (K)") # doctest: +SKIP | ||
|
||
.. image:: images/design4_dendrogram.png | ||
|
||
We see a number of leaves of varying height throughout the tree. Their minimum height is set by `min_delta`. As we increase this value, the tree becomes *pruned*: more and more structure will be deleted, leaving only the brightest regions on the tree. | ||
|
||
**While this tutorial uses a PPV cube, a 2D image may also be used! The same tutorial code can be used for both, likely with changes needed for the choice of `min_delta`.** | ||
|
||
The statistics are computed through `~turbustat.statistics.Dendrogram_Stats`: | ||
|
||
>>> dend_stat = Dendrogram_Stats(cube, min_deltas=np.logspace(-2, 0, 50), dendro_params={"min_value": 0.005, "min_npix": 50}) # doctest: +SKIP | ||
|
||
I've specified the values that `min_delta` should take. These are completely dependent on the range of intensities within your data cube. I've also specified the minimum number of pixels are region must have (`min_npix`) and the minimum intensity of the data to consider (`min_value`). | ||
|
||
To run the statistics, we use `~turbustat.statistics.Dendrogram_Stats.run`: | ||
|
||
>>> dend_stat.run(verbose=True) # doctest: +SKIP | ||
OLS Regression Results | ||
============================================================================== | ||
Dep. Variable: y R-squared: 0.973 | ||
Model: OLS Adj. R-squared: 0.971 | ||
Method: Least Squares F-statistic: 640.6 | ||
Date: Mon, 03 Oct 2016 Prob (F-statistic): 1.60e-15 | ||
Time: 17:08:42 Log-Likelihood: 9.6972 | ||
No. Observations: 20 AIC: -15.39 | ||
Df Residuals: 18 BIC: -13.40 | ||
Df Model: 1 | ||
Covariance Type: nonrobust | ||
============================================================================== | ||
coef std err t P>|t| [95.0% Conf. Int.] | ||
------------------------------------------------------------------------------ | ||
const -0.5729 0.101 -5.688 0.000 -0.785 -0.361 | ||
x1 -3.7769 0.149 -25.311 0.000 -4.090 -3.463 | ||
============================================================================== | ||
Omnibus: 1.882 Durbin-Watson: 0.386 | ||
Prob(Omnibus): 0.390 Jarque-Bera (JB): 1.135 | ||
Skew: -0.256 Prob(JB): 0.567 | ||
Kurtosis: 1.951 Cond. No. 6.02 | ||
============================================================================== | ||
|
||
.. image:: images/design4_dendrogram_stats.png | ||
|
||
On the left is the relationship between the value of `min_delta` and the number of features in the tree. On the right is a stack of histograms, showing the distribution of peak intensities for all values of `min_delta`. The results of the linear fit are also printed, where `x1` is the slope of the power-law tail. | ||
|
||
Computing dendrograms can be time-consuming when working with large datasets. We can avoid recomputing a dendrogram by loading from an HDF5 file: | ||
|
||
>>> dend_stat = Dendrogram_Stats.load_dendrogram("design4_dendrogram.hdf5", min_deltas=np.logspace(-2, 0, 50)) # doctest: +SKIP | ||
|
||
Saving the dendrogram structure is explained in the `astrodendro documentation <http://dendrograms.org/>`_. **The saved dendrogram must have `min_delta` set to the minimum of the given `min_deltas`. Otherwise pruning is ineffective.** | ||
|
||
If the dendrogram isn't saved (say you have just run it in the same terminal), you may pass the computed dendrogram into `~turbustat.statistics.Dendrogram_Stats.run`: | ||
>>> d = Dendrogram.compute(cube, min_value=0.005, min_delta=0.01, min_npix=50, verbose=True) # doctest: +SKIP | ||
>>> dend_stat = Dendrogram_Stats(cube, min_deltas=np.logspace(-2, 0, 50)) # doctest: +SKIP | ||
>>> dend_stat.run(verbose=True, dendro_obj=d) # doctest: +SKIP | ||
|
||
Once the statistics have been run, the results can be saved as a pickle file: | ||
>>> dend_stat.save_results(output_name="Design4_Dendrogram_Stats.pkl", keep_data=False) # doctest: +SKIP | ||
|
||
`keep_data=False` will avoid saving the entire cube, and is the default setting. | ||
|
||
Saving can also be enabled with `~turbustat.statistics.Dendrogram_Stats.run`: | ||
>>> dend_stat.run(save_results=True, output_name="Design4_Dendrogram_Stats.pkl") # doctest: +SKIP | ||
|
||
The results may then be reloaded: | ||
>>> dend_stat = Dendrogram_Stats.load_results("Design4_Dendrogram_Stats.pkl") # doctest: +SKIP | ||
|
||
Note that the dendrogram and data are **NOT** saved, and only the statistic outputs will be accessible. | ||
|
||
References | ||
---------- | ||
|
||
.. _ref-rosolowsky2008: | ||
|
||
`Rosolowsky et al. 2008 <https://ui.adsabs.harvard.edu/#abs/2008ApJ...679.1338R/abstract>`_ | ||
|
||
.. _ref-goodman2009: | ||
|
||
`Goodman et al. 2008 <https://ui.adsabs.harvard.edu/#abs/2009Natur.457...63G/abstract>`_ | ||
|
||
.. _ref-burkhart2013: | ||
|
||
`Burkhart et al. 2013 <https://ui.adsabs.harvard.edu/#abs/2013ApJ...770..141B/abstract>`_ |
Oops, something went wrong.