
Make GEKPLS differentiable #371

Closed
vikram-s-narayan opened this issue Jul 8, 2022 · 1 comment

Comments

@vikram-s-narayan (Contributor)

This issue is related to #364. The goal is to make GEKPLS differentiable so that users can perform gradient-based optimization on its hyperparameters, similar to what is suggested in #368.
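For context, once GEKPLS is differentiable the intended workflow might look roughly like the following. This is a sketch, not a confirmed API: it assumes the user's data (`X`, `y`, `grads`, `n_comp`, `delta_x`, `xlimits`, `extra_points`) is already in scope, and it pairs Optim.jl's LBFGS with a Zygote-supplied gradient.

```julia
using Zygote, Optim

# Negative reduced likelihood as a function of the kernel hyperparameters theta.
neg_rlfv(theta) = -GEKPLS(X, y, grads, n_comp, delta_x, xlimits,
                          extra_points, theta).reduced_likelihood_function_value

# Gradient-based tuning of theta; `inplace = false` lets the gradient closure
# return a fresh array instead of mutating a buffer.
result = optimize(neg_rlfv,
                  theta -> Zygote.gradient(neg_rlfv, theta)[1],
                  [0.01, 0.1], LBFGS(); inplace = false)
theta_opt = Optim.minimizer(result)
```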

@vikram-s-narayan (Contributor, Author) commented Jul 12, 2022

Doing something like the following:

function min_rlfv(theta)
    g = GEKPLS(X, y, grads, n_comp, delta_x, xlimits, extra_points, theta)
    return -g.reduced_likelihood_function_value
end

Zygote.gradient(min_rlfv, [0.01, 0.1])

throws the following error:

ERROR: Need an adjoint for constructor LinearAlgebra.QRCompactWYQ{Float64, Matrix{Float64}}. Gradient is of type LinearAlgebra.Transpose{Float64, Matrix{Float64}}

The stacktrace leads to the following function in src/GEKPLS.jl (comments removed for readability). The highlighted line in my code editor is Q, G = qr(Ft).

function _reduced_likelihood_function(theta, kernel_type, d, nt, ij, y_norma, noise = 0.0)
    reduced_likelihood_function_value = -Inf
    nugget = 1000000.0 * eps() #a jitter for numerical stability; 
    if kernel_type == "squar_exp" 
        r = squar_exp(theta, d)
    end
    R = (I + zeros(nt, nt)) .* (1.0 + nugget + noise)
    for k in 1:size(ij)[1]
        R[ij[k, 1], ij[k, 2]] = r[k]
        R[ij[k, 2], ij[k, 1]] = r[k]
    end
    C = cholesky(R).L 
    F = ones(nt, 1) 
    Ft = C \ F
    Q, G = qr(Ft)
    Q = Array(Q)
    Yt = C \ y_norma
    beta = G \ [(transpose(Q) ⋅ Yt)]
    rho = Yt .- (Ft .* beta)
    gamma = transpose(C) \ rho
    sigma2 = sum((rho) .^ 2, dims = 1) / nt
    detR = prod(diag(C) .^ (2.0 / nt))
    reduced_likelihood_function_value = -nt * log10(sum(sigma2)) - nt * log10(detR)
    return beta, gamma, reduced_likelihood_function_value
end
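One direction that might sidestep the missing adjoint (an idea to consider, not a confirmed fix): F is always ones(nt, 1), so Ft = C \ F is a single column, and the thin QR of a single column has a closed form — Q is the column normalized to unit length and G is its Euclidean norm — which Zygote can differentiate without any custom rule. A sketch under that assumption; `thin_qr_column` is a hypothetical helper, and LAPACK's qr may flip the signs of both Q and G, but the signs cancel in beta.

```julia
using LinearAlgebra

# Thin QR of a single-column matrix: Ft = Q * G with Q the unit column and
# G its Euclidean norm. Any sign flip relative to LAPACK's qr cancels in
# beta = G \ (Q' * Yt), so the likelihood value is unaffected.
function thin_qr_column(Ft::AbstractMatrix)
    g = norm(Ft)
    return Ft ./ g, g
end

# Possible drop-in replacement for the failing line:
#   Q, G = thin_qr_column(Ft)
#   beta = [(transpose(Q) ⋅ Yt) / G]
```

An alternative would be a custom adjoint for qr via Zygote.@adjoint or a ChainRulesCore.rrule, but the closed form above avoids touching the QRCompactWYQ type entirely.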

Any pointers or thoughts on how this can be fixed?

Thank you!
