The KDEpy project uses GitHub Actions to build wheels for Linux, macOS and Windows. Wheels are built using the cibuildwheel Python package. Once development is done, the commands below commit the release and push a tag for it; from there, CI automatically builds the wheels and distributes them to PyPI.
$ <run tests and linting>
$ git commit -m "Release of v0.X.Y"
$ git tag v0.X.Y
$ git push origin v0.X.Y
The list below roughly shows what needs to be done.
- univariate BaseKDE (todo: check if more common code can be moved)
- univariate NaiveKDE
- univariate TreeKDE
- univariate FFTKDE (implement linbin even faster in Cython)
- univariate DiffusionKDE
- Refactor kernel functions and add a solver for effective bandwidth
- Implement univariate, fixed bandwidth KDEs naively
- Implement weighted, fixed bandwidth, univariate KDEs naively
- Implement variable bandwidth KDEs naively
- Implement TreeKDE, test against other implementations
- Implement Scott and Silverman rules for bandwidth selection
- Make sure that speed and functionality match statsmodels, scikit-learn and scipy
- Implement methods taking care of boundaries
- Make sure TreeKDE works without finite support too
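
To make the "naive" items and the Silverman rule in the list above concrete, here is a minimal sketch using only NumPy. The function names `silverman_bandwidth` and `naive_gaussian_kde` are hypothetical and are not KDEpy's API; the sketch assumes 1-D data, a Gaussian kernel and a single fixed bandwidth. It evaluates every kernel at every grid point, so the cost is O(n * m) for n data points and m grid points.

```python
import numpy as np


def silverman_bandwidth(data):
    """Silverman's rule of thumb for a Gaussian kernel (1-D data).

    Hypothetical helper for illustration, not part of KDEpy.
    """
    data = np.asarray(data, dtype=float)
    n = data.size
    std = data.std(ddof=1)
    q75, q25 = np.percentile(data, [75, 25])
    iqr = q75 - q25
    # Use the smaller of the two spread estimates; fall back to the
    # standard deviation if the interquartile range is zero.
    spread = min(std, iqr / 1.34) if iqr > 0 else std
    return 0.9 * spread * n ** (-1 / 5)


def naive_gaussian_kde(data, grid, bw, weights=None):
    """Evaluate a weighted, fixed-bandwidth Gaussian KDE on `grid`.

    Hypothetical helper for illustration, not part of KDEpy.
    """
    data = np.asarray(data, dtype=float)
    grid = np.asarray(grid, dtype=float)
    if weights is None:
        weights = np.full(data.size, 1.0 / data.size)
    else:
        weights = np.asarray(weights, dtype=float)
        weights = weights / weights.sum()  # normalize so the weights sum to 1
    # Scaled distances between every grid point and every data point,
    # shape (len(grid), len(data)).
    u = (grid[:, None] - data[None, :]) / bw
    kernel = np.exp(-0.5 * u**2) / np.sqrt(2 * np.pi)
    # Weighted sum of kernels at each grid point, rescaled by the bandwidth.
    return kernel @ weights / bw


# Usage: estimate a density from samples drawn from a standard normal.
rng = np.random.default_rng(42)
samples = rng.normal(size=200)
x = np.linspace(-4, 4, 101)
y = naive_gaussian_kde(samples, x, silverman_bandwidth(samples))
```

A naive evaluation like this is mainly useful as a reference to test faster tree-based or FFT-based implementations against.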
I hope to follow these guidelines for this project:
- Import as few external dependencies as possible, ideally only NumPy.
- Use test driven development, have tests and docs for every method.
- Cite literature and implement recent methods.
- Unless it's a bottleneck computation, readability trumps speed.
- Employ object orientation, but resist the temptation to implement many methods - stick to the basics.
- Follow PEP8.
- sklearn/neighbors/kde.py
- scipy/stats/kde.py
- statsmodels/nonparametric/*
- seaborn/distributions.py
- https://github.com/cooperlab/AdaptiveKDE
- https://github.com/tillahoffmann/asymmetric_kde
- http://pythonhosted.org/PyQt-Fit/KDE_tut.html
- https://github.com/Daniel-B-Smith/KDE-for-SciPy
- MATLAB: adaptive kernel density estimation in one-dimension
- MATLAB: Kernel Density Estimator for High Dimensions
- Silverman, B. W. Density Estimation for Statistics and Data Analysis. Boca Raton: Chapman and Hall, 1986. -- Page 99 for reference to kd-tree
- Wand, M. P., and M. C. Jones. Kernel Smoothing. London; New York: Chapman and Hall/CRC, 1995. -- Page 182 for computation using linbin and fft
- Wiki - Kernel density estimation
- Wiki - Variable kernel density estimation
- Wiki - Kernel (statistics)
- Histograms and kernel density estimation KDE 2
- Jakevdp - Kernel Density Estimation in Python
- arXiv - Efficient statistical classification of satellite measurements
- arXiv - Unified Treatment of the Asymptotics of Asymmetric Kernel Density Estimators
- arXiv - A Review of Kernel Density Estimation with Applications to Econometrics
- A Reliable Data-Based Bandwidth Selection Method for Kernel Density Estimation
- Kernel Density Estimation via Diffusion
- Variable Kernel Density Estimation
- Bayesian Approach to Bandwidth Selection for Multivariate Kernel Density Estimation
- Bootstrap Bandwidth Selection in Kernel Density Estimation
- Kernel Estimator and Bandwidth Selection for Density and its Derivatives
- Variable Kernel Density Estimation - 20 slides
- Lecture Notes on Nonparametrics - 25 pages
- Applied Smoothing Techniques - Part 1: Kernel Density Estimation - 20 pages
- Kernel density estimation - 26 slides
- Density Estimation - 32 pages
- An Algorithm for Finding Best Matches in Logarithmic Expected Time, Friedman et al., DOI 10.1145/355744.355745
- https://www.ics.uci.edu/~ihler/code/kde.html
- http://www-stat.wharton.upenn.edu/~lzhao/papers/MyPublication/Fast_jcgs.2010.pdf
- https://indico.cern.ch/event/397113/contributions/1837849/attachments/1213965/1771772/main.pdf
- http://www.cs.ubc.ca/~nando/papers/empirical.pdf
- http://iopscience.iop.org/article/10.1088/1742-6596/762/1/012042/pdf