
Improved GEKPLS Function #456

Closed
wants to merge 6 commits into from
Changes from 4 commits
1 change: 1 addition & 0 deletions docs/pages.jl
Original file line number Diff line number Diff line change
Expand Up @@ -15,6 +15,7 @@ pages = ["index.md"
"Variable Fidelity" => "variablefidelity.md",
"Gradient Enhanced Kriging" => "gek.md",
"GEKPLS" => "gekpls.md",
"Improved GEKPLS" => "Improvedgekpls.md",
"MOE" => "moe.md",
"Parallel Optimization" => "parallel.md",
]
Expand Down
55 changes: 55 additions & 0 deletions docs/src/Improvedgekpls.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,55 @@
# Improved GEKPLS Function

Gradient Enhanced Kriging with Partial Least Squares Method (GEKPLS) is a surrogate modelling technique that brings down computation time and returns improved accuracy for high-dimensional problems. The Julia implementation of GEKPLS is adapted from the Python version by [SMT](https://github.com/SMTorg) which is based on this [paper](https://arxiv.org/pdf/1708.02663.pdf).

# Modifications for Improved GEKPLS Function:

To enhance the GEKPLS function, the sampling method was changed from `SobolSample()` to `HaltonSample()`.
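
The swap is a one-line difference at the sampling step. A minimal sketch, assuming the `Surrogates` sampling API used in the example further below (`sample(n, lb, ub, sampler)`); the bounds here are illustrative, not from the water-flow problem:

```julia
using Surrogates

# Illustrative bounds: 5 points in the unit square [0, 1]^2.
lb = [0.0, 0.0]
ub = [1.0, 1.0]

x_sobol = sample(5, lb, ub, SobolSample())    # quasi-random points from a digital net
x_halton = sample(5, lb, ub, HaltonSample())  # quasi-random points from prime-base Halton sequences

# Both samplers return low-discrepancy point sets over [lb, ub];
# only the construction of the sequence differs.
```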
**Member:**

That's not an enhancement; that's just a change in the sampling method.

**Contributor Author:**

Yeah, but gekpls.jl might improve if we use Halton sampling instead of Sobol sampling.

**Member:**

Yes, but that's a user choice. That's not inherent to GEKPLS but just an option for it.

**Contributor Author:**

Yes! Multiple choices might work for multiple applications. But at the core, we should focus on optimizing complex mathematical operations like matrix multiplication, and also on optimal featurization, which works better for real-world applications.

**Member:**

What's your proof that this is a better default? Can you show some head-to-head benchmarks?

**Contributor Author:**

I recently opened a PR, #457, where I proposed changes to the readme file of tensor_prod.md. But I think the kind of example functions we are using are also not fully justified for the benchmark label. We should use better examples to explain the surrogate optimization and, as @sathvikbhagavan rightly said, we need to dive deeper into explaining the mathematical concepts.

**Member:**

Yes, the changes in tensor_prod.md are in the right direction.

**Contributor Author (@Spinachboul, Dec 29, 2023):**

@sathvikbhagavan
So is it ready to merge, or do you think it can still be changed in a few spots? I think I did not explain the function mathematically and only described it textually. Should I add the whole concept image available online?
Link: https://projecteuclid.org/journals/tohoku-mathematical-journal/volume-17/issue-2/The-tensor-product-of-function-algebras/10.2748/tmj/1178243579.pdf

**Member:**

Can you clean up the PRs: use either this one or #457 and close the other, remove the extra tutorial that is not needed, and then we can have a round of reviews?

**Contributor Author (@Spinachboul, Dec 29, 2023):**

@sathvikbhagavan and @ChrisRackauckas
Yeah, so let's close this one, as I am still experimenting with the sampling methods for a better RMSE! I will perhaps open another PR for the same.



```@example gekpls_water_flow

using Surrogates
using Zygote

# Water-flow test function: flow rate through a borehole, 8 input dimensions.
function water_flow(x)
    r_w = x[1]
    r = x[2]
    T_u = x[3]
    H_u = x[4]
    T_l = x[5]
    H_l = x[6]
    L = x[7]
    K_w = x[8]
    log_val = log(r / r_w)
    return (2 * pi * T_u * (H_u - H_l)) /
           (log_val * (1 + (2 * L * T_u / (log_val * r_w^2 * K_w)) + T_u / T_l))
end

n = 1000
lb = [0.05, 100, 63070, 990, 63.1, 700, 1120, 9855]
ub = [0.15, 50000, 115600, 1110, 116, 820, 1680, 12045]
x = sample(n, lb, ub, HaltonSample())
grads = gradient.(water_flow, x)
y = water_flow.(x)
n_test = 100
x_test = sample(n_test, lb, ub, GoldenSample())
y_true = water_flow.(x_test)
n_comp = 2
delta_x = 0.0001
extra_points = 2
initial_theta = [0.01 for i in 1:n_comp]
g = GEKPLS(x, y, grads, n_comp, delta_x, lb, ub, extra_points, initial_theta)
y_pred = g.(x_test)
rmse = sqrt(sum(((y_pred - y_true) .^ 2) / n_test)) # root mean squared error
println(rmse) # 0.0347
```


| **Sampling Method** | **RMSE**             | **Differences**                                                                                                                                                                                  |
|---------------------|----------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| **Sobol Sampling**  | 0.021472963465423097 | Utilizes digital nets to generate quasi-random numbers, offering low-discrepancy points for improved coverage. Requires careful handling, especially in higher dimensions.                          |
| **Halton Sampling** | 0.02144270998045834  | Uses a deterministic sequence based on prime numbers to generate points, allowing for quasi-random, low-discrepancy sampling. Simpler to implement, but may exhibit correlations in some dimensions, affecting coverage. |
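
The two rows above differ only in the sampler passed to `sample`. A hedged sketch of the head-to-head comparison, assuming `water_flow`, the bounds, the test set, and the GEKPLS hyperparameters from the example above are already defined; the exact RMSE values depend on the fitted model:

```julia
# Sketch: refit GEKPLS once per sampler and compare test RMSE.
# Assumes water_flow, lb, ub, x_test, y_true, n_comp, delta_x,
# extra_points, and initial_theta from the example above.
for sampler in (SobolSample(), HaltonSample())
    x = sample(1000, lb, ub, sampler)
    grads = gradient.(water_flow, x)
    y = water_flow.(x)
    g = GEKPLS(x, y, grads, n_comp, delta_x, lb, ub, extra_points, initial_theta)
    y_pred = g.(x_test)
    rmse = sqrt(sum(((y_pred - y_true) .^ 2) / length(y_true)))
    println(typeof(sampler), " => RMSE ", rmse)
end
```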