More Kriging improvements (log likelihood, noise variance, more tests) #379

Draft
wants to merge 3 commits into
base: master

archermarx
Contributor

This addresses #375 as well as adds some more testing utilities to move us toward more rigorous and comprehensive property-based testing of surrogates.

1. Noise variance

The Kriging object now has a new field, noise_variance, which defaults to zero. This value is added to the main diagonal of the covariance matrix, which allows Kriging to model noisy functions.

Here's what including noise variance looks like:

using Random, Plots, Surrogates

Random.seed!(1234)
lb = 0.0
ub = 10.0
func = x -> sin(x)
n_samples = 50
x = sample(n_samples, lb, ub, SobolSample())

# Add some random noise to the function
y = func.(x) .+ 0.1 * randn(n_samples)

# Build a kriging surrogate without noise
my_k = Kriging(x, y, lb, ub, noise_variance = 0.0)

# Evaluate the true function, the surrogate, and its standard error on a fine grid
x_fine = LinRange(lb, ub, 10000)
y_fine = func.(x_fine)
pred_fine = my_k.(x_fine)
errs = std_error_at_point.(my_k, x_fine)

# Plot the true function, the surrogate with a 3-standard-error ribbon, and the data
p = Plots.plot(x_fine, y_fine, label = "True function", legend = :outertop)
Plots.plot!(x_fine, pred_fine, label = "surrogate", ribbon = 3 * errs, ylims = (-2, 2))
Plots.scatter!(my_k.x, my_k.y, mc = :black, label = "Data")

[Image: plot_258, Kriging surrogate built with noise_variance = 0.0]

This interpolates the data but doesn't tell us much about how the underlying signal varies.

If we instead include a nonzero noise variance, we get a much more accurate picture of the underlying function:

my_k = Kriging(x, y, lb, ub, noise_variance = 0.1)

[Image: plot_257, Kriging surrogate built with noise_variance = 0.1]
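To make the mechanism concrete, here is a minimal 1-D sketch of what adding a noise variance to the diagonal does. It does not use the Surrogates.jl code: the helper names `corr`, `noisy_corr_matrix`, and `krige_predict` are hypothetical, and the power-exponential correlation with `theta = 1.0` and `p = 1.99` is an assumption chosen only to mirror Kriging's hyperparameter names.

```julia
using LinearAlgebra

# Illustrative power-exponential correlation (an assumption, not the package code).
corr(a, b; theta = 1.0, p = 1.99) = exp(-theta * abs(a - b)^p)

# Correlation matrix with `noise_variance` added to the main diagonal.
function noisy_corr_matrix(x; noise_variance = 0.0, theta = 1.0, p = 1.99)
    R = [corr(xi, xj; theta, p) for xi in x, xj in x]
    return R + noise_variance * I
end

# Ordinary-kriging mean prediction at `x_new`, with a constant mean estimated from the data.
function krige_predict(x, y, x_new; noise_variance = 0.0, theta = 1.0, p = 1.99)
    F = cholesky(Symmetric(noisy_corr_matrix(x; noise_variance, theta, p)))
    e = ones(length(x))
    mu = dot(e, F \ y) / dot(e, F \ e)
    r = [corr(x_new, xi; theta, p) for xi in x]
    return mu + dot(r, F \ (y .- mu))
end

x = collect(0.0:1.0:5.0)
y = sin.(x) .+ [0.05, -0.08, 0.1, -0.03, 0.07, -0.06]  # fixed "noise" for reproducibility

exact = krige_predict(x, y, x[3])                          # reproduces y[3]
smoothed = krige_predict(x, y, x[3]; noise_variance = 0.1) # no longer passes through y[3]
```

With a zero noise variance the prediction at a training point equals the training value, so the model interpolates. With a positive value the prediction is pulled away from the individual observation, which is the regression behavior seen in the second plot.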

2. Log likelihood function for kriging

There is now a Surrogates.kriging_log_likelihood(x, y, p, theta, noise_variance) function which is differentiable and can be used for hyperparameter optimization.
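As a rough illustration of how such a function can drive hyperparameter selection, the sketch below implements the textbook concentrated log likelihood for 1-D ordinary kriging (constants dropped) and scans a few candidate values of theta. The function name `concentrated_log_likelihood` is hypothetical, the argument order only mimics the signature above, and whether the PR's implementation matches this formula term for term is an assumption.

```julia
using LinearAlgebra

# Concentrated log likelihood for 1-D ordinary kriging, up to an additive constant.
# The mean and process variance are replaced by their closed-form estimates.
function concentrated_log_likelihood(x, y, p, theta, noise_variance = 0.0)
    n = length(x)
    R = [exp(-theta * abs(xi - xj)^p) for xi in x, xj in x] + noise_variance * I
    F = cholesky(Symmetric(R))
    e = ones(n)
    mu = dot(e, F \ y) / dot(e, F \ e)     # estimated constant mean
    resid = y .- mu
    sigma2 = dot(resid, F \ resid) / n     # estimated process variance
    return -0.5 * n * log(sigma2) - 0.5 * logdet(F)
end

x = collect(range(0.0, 10.0; length = 15))
y = sin.(x)

# Coarse scan over theta; a small noise variance keeps the factorization stable.
thetas = [0.01, 0.1, 1.0, 10.0]
lls = [concentrated_log_likelihood(x, y, 1.99, t, 1e-6) for t in thetas]
best_theta = thetas[argmax(lls)]
```

A grid scan is used here only to keep the example dependency-free; in practice the same objective could be handed to a gradient-based optimizer, which is where differentiability matters.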

3. Property-based testing

I have created a new file, test_utils.jl, with two functions: _random_surrogate and _check_interpolation. The first generates a random surrogate (random dimension and number of points) of the given type, optionally using a provided function. The second checks that the input surrogate correctly interpolates its input data. Interpolation is an important property to check for most surrogates, and in cases where the surrogate regresses rather than interpolates (Kriging with noise_variance not equal to zero, LinearSurrogate, SecondOrderPolynomialSurrogate), we can check that the regression behavior holds instead. I have added interpolation-checking tests to all surrogates except for Wendland, GEK, GEKPLS, and Earth. In testing, I found that Wendland doesn't seem to work at all, so that should be fixed.
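The idea behind the interpolation check can be sketched without Surrogates.jl. Everything below is hypothetical and is not the content of test_utils.jl: `check_interpolation` stands in for _check_interpolation, `PiecewiseLinear` is a toy interpolating model, and `mean_model` is a toy regressor that should fail the check.

```julia
using Random, Test

# Hypothetical helper: true when `surrogate` reproduces every training value.
function check_interpolation(surrogate, xs, ys; atol = 1e-8)
    return all(isapprox(surrogate(x), y; atol = atol) for (x, y) in zip(xs, ys))
end

# Toy interpolating model: piecewise-linear interpolation on sorted 1-D data.
struct PiecewiseLinear
    x::Vector{Float64}
    y::Vector{Float64}
end

function (s::PiecewiseLinear)(q)
    i = clamp(searchsortedlast(s.x, q), 1, length(s.x) - 1)
    t = (q - s.x[i]) / (s.x[i + 1] - s.x[i])
    return (1 - t) * s.y[i] + t * s.y[i + 1]
end

# Toy regressor: always predicts the sample mean, so it should not interpolate.
mean_model(ys) = q -> sum(ys) / length(ys)

Random.seed!(1234)
@testset "interpolation property" begin
    for trial in 1:20
        n = rand(2:30)              # random number of points, as in property-based testing
        xs = sort(rand(n)) .* 10
        ys = sin.(xs)
        @test check_interpolation(PiecewiseLinear(xs, ys), xs, ys)
        @test !check_interpolation(mean_model(ys), xs, ys)
    end
end
```

Randomizing the number of points on every trial is what makes this a property check rather than a single hand-picked case; the negated test covers models that are expected to regress.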

4. Miscellaneous

I changed Kriging's default p from 2.0 to 1.99 to improve numerical stability.

Still to do

  • Add examples for kriging hyperparameter optimization to docs
  • Add example for kriging noise variance to docs
  • Fix Wendland surrogate
  • Many doc examples are broken, would be good to fix those.

- kriging logpdf added

- modified default kriging params

- add tests for interpolation condition for most surrogates

- wendland doesn't work, need to figure out why

- slightly modify sampling
@codecov

codecov bot commented Jul 14, 2022

Codecov Report

Merging #379 (28a6c45) into master (6eb339e) will decrease coverage by 0.27%.
The diff coverage is 68.29%.

@@            Coverage Diff             @@
##           master     #379      +/-   ##
==========================================
- Coverage   79.18%   78.91%   -0.28%     
==========================================
  Files          16       16              
  Lines        2306     2319      +13     
==========================================
+ Hits         1826     1830       +4     
- Misses        480      489       +9     
| Impacted Files | Coverage Δ |
|---|---|
| src/LinearSurrogate.jl | 100.00% <ø> (ø) |
| src/Kriging.jl | 87.80% <64.86%> (-6.84%) ⬇️ |
| src/Radials.jl | 90.99% <100.00%> (+2.70%) ⬆️ |
| src/Sampling.jl | 100.00% <100.00%> (ø) |
| src/Optimization.jl | 71.97% <0.00%> (-0.28%) ⬇️ |



# Need autodiff to ignore these checks. When optimizing hyperparameters, these won't
# matter as the optimization will be constrained to satisfy these by default.
Zygote.ignore() do
Member

Use ChainRulesCore.@ignore_derivatives
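For reference, a minimal sketch of what that replacement could look like. The function `checked_scale` and its positivity check are hypothetical stand-ins, since the actual checks in the PR are not shown here; the only assumption about the library is that ChainRulesCore provides the `@ignore_derivatives` macro, which wraps an expression so that AD backends skip it.

```julia
using ChainRulesCore

# Hypothetical function with an argument check that AD should not differentiate through.
function checked_scale(theta, x)
    ChainRulesCore.@ignore_derivatives begin
        theta > 0 || throw(ArgumentError("theta must be positive"))
    end
    return theta * x
end

value = checked_scale(2.0, 3.0)
```

Outside of differentiation the block runs normally, so invalid arguments still raise an error. Depending on ChainRulesCore rather than Zygote also keeps the check independent of any particular AD backend.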

@vikram-s-narayan
Contributor

@archermarx - One of the tasks you've listed is:

Many doc examples are broken, would be good to fix those.

Is this related to #363? If yes, are you working on this?

I'm studying the issue and want to make sure that we both don't work on the same thing :)

Thanks!

@archermarx
Contributor Author

I'm not sure yet, I haven't dug too deeply down into the issue, but most of the examples in the documentation display no results.

@vikram-s-narayan
Copy link
Contributor

vikram-s-narayan commented Jul 22, 2022

I'm not sure yet, I haven't dug too deeply down into the issue, but most of the examples in the documentation display no results.

Okay thanks. I'll work on the documentation examples.

@ChrisRackauckas
Member

Is this still needed?

3 participants