diff --git a/Project.toml b/Project.toml index 78869596..ddfbbd5b 100644 --- a/Project.toml +++ b/Project.toml @@ -19,7 +19,7 @@ Aqua = "0.8" Cubature = "1.5" Distributions = "0.25.71" ExtendableSparse = "1" -Flux = "0.14" +Flux = "0.13, 0.14" ForwardDiff = "0.10.19" GLM = "1.5" IterativeSolvers = "0.9" diff --git a/docs/src/BraninFunction.md b/docs/src/BraninFunction.md index 7ac9c584..c58abe12 100644 --- a/docs/src/BraninFunction.md +++ b/docs/src/BraninFunction.md @@ -1,13 +1,13 @@ # Branin Function -The Branin Function is commonly used as a test function for metamodelling in computer experiments, especially in the context of optimization. +The Branin function is commonly used as a test function for metamodelling in computer experiments, especially in the context of optimization. The expression of the Branin Function is given as: ``f(x) = (x_2 - \frac{5.1}{4\pi^2}x_1^{2} + \frac{5}{\pi}x_1 - 6)^2 + 10(1-\frac{1}{8\pi})\cos(x_1) + 10`` where ``x = (x_1, x_2)`` with ``-5\leq x_1 \leq 10, 0 \leq x_2 \leq 15`` -First of all we will import these two packages `Surrogates` and `Plots`. +First of all, we will import these two packages: `Surrogates` and `Plots`. ```@example BraninFunction using Surrogates @@ -50,7 +50,7 @@ scatter!(xs, ys) plot(p1, p2, title="True function") ``` -Now it's time to try fitting different surrogates and then we will plot them. +Now it's time to try fitting different surrogates, and then we will plot them. We will have a look at the radial basis surrogate `Radial Basis Surrogate`. : ```@example BraninFunction @@ -65,7 +65,7 @@ scatter!(xs, ys, marker_z=zs) plot(p1, p2, title="Radial Surrogate") ``` -Now, we will have a look on `Inverse Distance Surrogate`: +Now, we will have a look at `Inverse Distance Surrogate`: ```@example BraninFunction InverseDistance = InverseDistanceSurrogate(xys, zs, lower_bound, upper_bound) ``` diff --git a/docs/src/InverseDistance.md b/docs/src/InverseDistance.md index f90bc3f2..2ed3115d 100644 --- a/docs/src/InverseDistance.md +++ b/docs/src/InverseDistance.md @@ -1,4 +1,4 @@ -The **Inverse Distance Surrogate** is an interpolating method and in this method the unknown points are calculated with a weighted average of the sampling points. This model uses the inverse distance between the unknown and training points to predict the unknown point. We do not need to fit this model because the response of an unknown point x is computed with respect to the distance between x and the training points. +The **Inverse Distance Surrogate** is an interpolating method, and in this method, the unknown points are calculated with a weighted average of the sampling points. This model uses the inverse distance between the unknown and training points to predict the unknown point. We do not need to fit this model because the response of an unknown point x is computed with respect to the distance between x and the training points. Let's optimize the following function to use Inverse Distance Surrogate: @@ -53,7 +53,7 @@ plot!(InverseDistance, label="Surrogate function", xlims=(lower_bound, upper_bo Having built a surrogate, we can now use it to search for minima in our original function `f`. -To optimize using our surrogate we call `surrogate_optimize` method. We choose to use Stochastic RBF as optimization technique and again Sobol sampling as sampling technique. +To optimize using our surrogate we call `surrogate_optimize` method. We choose to use Stochastic RBF as the optimization technique and again Sobol sampling as the sampling technique. 
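Before running the optimization below, it may help to recall how the inverse distance predictor itself works: the response at a new point is a distance-weighted average of the sampled values, as described at the top of this page. A minimal, self-contained sketch of that idea (an illustration only, not the package's internal implementation):

```julia
# Shepard-style inverse distance weighting in 1D (illustration only).
function idw_predict(xnew, xs, ys; p = 2)
    ws = [1 / abs(xnew - xi)^p for xi in xs]
    i = findfirst(isinf, ws)
    i === nothing || return ys[i]   # xnew coincides with a sampled point
    return sum(ws .* ys) / sum(ws)
end
```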
```@example Inverse_Distance1D @show surrogate_optimize(f, SRBF(), lower_bound, upper_bound, InverseDistance, SobolSample()) @@ -65,7 +65,7 @@ plot!(InverseDistance, label="Surrogate function", xlims=(lower_bound, upper_bo ## Inverse Distance Surrogate Tutorial (ND): -First of all we will define the `Schaffer` function we are going to build surrogate for. Notice, one how its argument is a vector of numbers, one for each coordinate, and its output is a scalar. +First of all we will define the `Schaffer` function we are going to build a surrogate for. Notice, how its argument is a vector of numbers, one for each coordinate, and its output is a scalar. ```@example Inverse_DistanceND using Plots # hide @@ -84,7 +84,7 @@ end ### Sampling -Let's define our bounds, this time we are working in two dimensions. In particular we want our first dimension `x` to have bounds `-5, 10`, and `0, 15` for the second dimension. We are taking 60 samples of the space using Sobol Sequences. We then evaluate our function on all of the sampling points. +Let's define our bounds, this time we are working in two dimensions. In particular we want our first dimension `x` to have bounds `-5, 10`, and `0, 15` for the second dimension. We are taking 60 samples of the space using Sobol Sequences. We then evaluate our function on all the sampling points. ```@example Inverse_DistanceND n_samples = 60 @@ -124,7 +124,7 @@ plot(p1, p2, title="Surrogate") # hide ### Optimizing -With our surrogate we can now search for the minima of the function. +With our surrogate, we can now search for the minima of the function. Notice how the new sampled points, which were created during the optimization process, are appended to the `xys` array. This is why its size changes. diff --git a/docs/src/LinearSurrogate.md b/docs/src/LinearSurrogate.md index 2acff979..2e1ca9de 100644 --- a/docs/src/LinearSurrogate.md +++ b/docs/src/LinearSurrogate.md @@ -28,7 +28,7 @@ plot!(f, label="True function", xlims=(lower_bound, upper_bound)) ## Building a Surrogate -With our sampled points we can build the **Linear Surrogate** using the `LinearSurrogate` function. +With our sampled points, we can build the **Linear Surrogate** using the `LinearSurrogate` function. We can simply calculate `linear_surrogate` for any value. @@ -51,7 +51,7 @@ plot!(my_linear_surr_1D, label="Surrogate function", xlims=(lower_bound, upper_ Having built a surrogate, we can now use it to search for minima in our original function `f`. -To optimize using our surrogate we call `surrogate_optimize` method. We choose to use Stochastic RBF as optimization technique and again Sobol sampling as sampling technique. +To optimize using our surrogate we call `surrogate_optimize` method. We choose to use Stochastic RBF as the optimization technique and again Sobol sampling as the sampling technique. ```@example linear_surrogate1D @show surrogate_optimize(f, SRBF(), lower_bound, upper_bound, my_linear_surr_1D, SobolSample()) @@ -63,7 +63,7 @@ plot!(my_linear_surr_1D, label="Surrogate function", xlims=(lower_bound, upper_ ## Linear Surrogate tutorial (ND) -First of all we will define the `Egg Holder` function we are going to build surrogate for. Notice, one how its argument is a vector of numbers, one for each coordinate, and its output is a scalar. +First of all we will define the `Egg Holder` function we are going to build a surrogate for. Notice, one how its argument is a vector of numbers, one for each coordinate, and its output is a scalar. 
```@example linear_surrogateND using Plots # hide @@ -104,7 +104,7 @@ plot(p1, p2, title="True function") # hide ``` ### Building a surrogate -Using the sampled points we build the surrogate, the steps are analogous to the 1-dimensional case. +Using the sampled points, we build the surrogate, the steps are analogous to the 1-dimensional case. ```@example linear_surrogateND my_linear_ND = LinearSurrogate(xys, zs, lower_bound, upper_bound) @@ -119,7 +119,7 @@ plot(p1, p2, title="Surrogate") # hide ``` ### Optimizing -With our surrogate we can now search for the minima of the function. +With our surrogate, we can now search for the minima of the function. Notice how the new sampled points, which were created during the optimization process, are appended to the `xys` array. This is why its size changes. diff --git a/docs/src/Salustowicz.md b/docs/src/Salustowicz.md index c89d7316..9a4ea083 100644 --- a/docs/src/Salustowicz.md +++ b/docs/src/Salustowicz.md @@ -39,7 +39,7 @@ scatter(x, y, label="Sampled points", xlims=(lower_bound, upper_bound), legend=: plot!(xs, salustowicz.(xs), label="True function", legend=:top) ``` -Now, let's fit Salustowicz Function with different Surrogates: +Now, let's fit the Salustowicz function with different surrogates: ```@example salustowicz1D InverseDistance = InverseDistanceSurrogate(x, y, lower_bound, upper_bound) diff --git a/docs/src/abstractgps.md b/docs/src/abstractgps.md index 9dd77376..2f711af5 100644 --- a/docs/src/abstractgps.md +++ b/docs/src/abstractgps.md @@ -1,12 +1,12 @@ # Gaussian Process Surrogate Tutorial !!! note - This surrogate requires the 'SurrogatesAbstractGPs' module which can be added by inputting "]add SurrogatesAbstractGPs" from the Julia command line. + This surrogate requires the 'SurrogatesAbstractGPs' module, which can be added by inputting "]add SurrogatesAbstractGPs" from the Julia command line. Gaussian Process regression in Surrogates.jl is implemented as a simple wrapper around the [AbstractGPs.jl](https://github.com/JuliaGaussianProcesses/AbstractGPs.jl) package. AbstractGPs comes with a variety of covariance functions (kernels). See [KernelFunctions.jl](https://github.com/JuliaGaussianProcesses/KernelFunctions.jl/) for examples. !!! tip - The examples below demonstrate the use of AbstractGPs with out-of-the-box settings without hyperparameter optimization (i.e. without changing parameters like lengthscale, signal variance and noise variance.) Beyond hyperparameter optimization, careful initialization of hyperparameters and priors on the parameters is required for this surrogate to work properly. For more details on how to fit GPs in practice, check out [A Practical Guide to Gaussian Processes](https://infallible-thompson-49de36.netlify.app/). + The examples below demonstrate the use of AbstractGPs with out-of-the-box settings without hyperparameter optimization (i.e. without changing parameters like lengthscale, signal variance, and noise variance). Beyond hyperparameter optimization, careful initialization of hyperparameters and priors on the parameters is required for this surrogate to work properly. For more details on how to fit GPs in practice, check out [A Practical Guide to Gaussian Processes](https://infallible-thompson-49de36.netlify.app/). Also see this [example](https://juliagaussianprocesses.github.io/AbstractGPs.jl/stable/examples/1-mauna-loa/#Hyperparameter-Optimization) to understand hyperparameter optimization with AbstractGPs. 
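As a concrete illustration of setting hyperparameters by hand, the sketch below builds a GP prior with an explicit lengthscale and signal variance using KernelFunctions.jl and AbstractGPs.jl. The constructor keywords mentioned in the comment are assumptions, not confirmed API; consult the SurrogatesAbstractGPs documentation for the exact interface.

```julia
using AbstractGPs, KernelFunctions

# Squared-exponential kernel with lengthscale 0.5 and signal variance 2.0.
kernel = 2.0 * with_lengthscale(SqExponentialKernel(), 0.5)
prior = GP(kernel)

# The prior (together with an observation-noise variance) would then be passed
# to the surrogate constructor, roughly along the lines of
#     AbstractGPSurrogate(x, y; gp = prior, Σy = 0.01)
# where the keyword names are assumptions made for this sketch.
```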
## 1D Example diff --git a/docs/src/ackley.md b/docs/src/ackley.md index 3c903c76..fe38d4e9 100644 --- a/docs/src/ackley.md +++ b/docs/src/ackley.md @@ -64,4 +64,4 @@ plot!(xs, ackley.(xs), label="True function", legend=:top) plot!(xs, my_rad.(xs), label="Radial basis optimized", legend=:top) ``` -The DYCORS methods successfully finds the minimum. +The DYCORS method successfully finds the minimum. diff --git a/docs/src/cantilever.md b/docs/src/cantilever.md index b25f72b9..d259f89d 100644 --- a/docs/src/cantilever.md +++ b/docs/src/cantilever.md @@ -42,7 +42,7 @@ plot(p1, p2, title="True function") ``` -Fitting different Surrogates: +Fitting different surrogates: ```@example beam mypoly = PolynomialChaosSurrogate(xys, zs, lb, ub) loba = LobachevskySurrogate(xys, zs, lb, ub) diff --git a/docs/src/gek.md b/docs/src/gek.md index 869628dd..7d780f41 100644 --- a/docs/src/gek.md +++ b/docs/src/gek.md @@ -1,6 +1,6 @@ ## Gradient Enhanced Kriging -Gradient-enhanced Kriging is an extension of kriging which supports gradient information. GEK is usually more accurate than kriging, however, it is not computationally efficient when the number of inputs, the number of sampling points, or both, are high. This is mainly due to the size of the corresponding correlation matrix that increases proportionally with both the number of inputs and the number of sampling points. +Gradient-enhanced Kriging is an extension of kriging which supports gradient information. GEK is usually more accurate than kriging. However, it is not computationally efficient when the number of inputs, the number of sampling points, or both, are high. This is mainly due to the size of the corresponding correlation matrix, which increases proportionally with both the number of inputs and the number of sampling points. Let's have a look at the following function to use Gradient Enhanced Surrogate: ``f(x) = sin(x) + 2*x^2`` @@ -15,7 +15,7 @@ default() ### Sampling -We choose to sample f in 8 points between 0 to 1 using the `sample` function. The sampling points are chosen using a Sobol sequence, this can be done by passing `SobolSample()` to the `sample` function. +We choose to sample f in 8 points between 0 and 1 using the `sample` function. The sampling points are chosen using a Sobol sequence, this can be done by passing `SobolSample()` to the `sample` function. ```@example GEK1D n_samples = 10 @@ -34,7 +34,7 @@ plot!(f, label="True function", xlims=(lower_bound, upper_bound), legend=:top) ### Building a surrogate -With our sampled points we can build the Gradient Enhanced Kriging surrogate using the `GEK` function. +With our sampled points, we can build the Gradient Enhanced Kriging surrogate using the `GEK` function. ```@example GEK1D @@ -47,7 +47,7 @@ plot!(my_gek, label="Surrogate function", ribbon=p->std_error_at_point(my_gek, p ## Gradient Enhanced Kriging Surrogate Tutorial (ND) -First of all let's define the function we are going to build a surrogate for. +First of all, let's define the function we are going to build a surrogate for. ```@example GEK_ND using Plots # hide @@ -69,7 +69,7 @@ end ### Sampling -Let's define our bounds, this time we are working in two dimensions. In particular we want our first dimension `x` to have bounds `0, 10`, and `0, 10` for the second dimension. We are taking 80 samples of the space using Sobol Sequences. We then evaluate our function on all of the sampling points. +Let's define our bounds, this time we are working in two dimensions. 
In particular, we want our first dimension `x` to have bounds `0, 10`, and `0, 10` for the second dimension. We are taking 80 samples of the space using Sobol Sequences. We then evaluate our function on all the sampling points. ```@example GEK_ND n_samples = 45 @@ -91,7 +91,7 @@ plot(p1, p2, title="True function") # hide ``` ### Building a surrogate -Using the sampled points we build the surrogate, the steps are analogous to the 1-dimensional case. +Using the sampled points, we build the surrogate, the steps are analogous to the 1-dimensional case. ```@example GEK_ND grad1 = x1 -> 2*(300*(x[1])^5 - 300*(x[1])^2*x[2] + x[1] -1) diff --git a/docs/src/gekpls.md b/docs/src/gekpls.md index d76e0a8f..5ea4af75 100644 --- a/docs/src/gekpls.md +++ b/docs/src/gekpls.md @@ -1,6 +1,6 @@ ## GEKPLS Surrogate Tutorial -Gradient Enhanced Kriging with Partial Least Squares Method (GEKPLS) is a surrogate modelling technique that brings down computation time and returns improved accuracy for high-dimensional problems. The Julia implementation of GEKPLS is adapted from the Python version by [SMT](https://github.com/SMTorg) which is based on this [paper](https://arxiv.org/pdf/1708.02663.pdf). +Gradient Enhanced Kriging with Partial Least Squares Method (GEKPLS) is a surrogate modeling technique that brings down computation time and returns improved accuracy for high-dimensional problems. The Julia implementation of GEKPLS is adapted from the Python version by [SMT](https://github.com/SMTorg) which is based on this [paper](https://arxiv.org/pdf/1708.02663.pdf). The following are the inputs when building a GEKPLS surrogate: diff --git a/docs/src/gramacylee.md b/docs/src/gramacylee.md index 1221673f..ad92e00e 100644 --- a/docs/src/gramacylee.md +++ b/docs/src/gramacylee.md @@ -1,6 +1,6 @@ ## Gramacy & Lee Function -Gramacy & Lee Function is a continuous function. It is not convex. The function is defined on 1-dimensional space. It is an unimodal. The function can be defined on any input domain but it is usually evaluated on +the Gramacy & Lee function is a continuous function. It is not convex. The function is defined on a 1-dimensional space. It is unimodal. The function can be defined on any input domain, but it is usually evaluated on ``x \in [-0.5, 2.5]``. The Gramacy & Lee is as follows: @@ -25,7 +25,7 @@ function gramacylee(x) end ``` -Let's sample f in 25 points between -0.5 and 2.5 using the `sample` function. The sampling points are chosen using a Sobol Sample, this can be done by passing `SobolSample()` to the `sample` function. +Let's sample f in 25 points between -0.5 and 2.5 using the `sample` function. The sampling points are chosen using a Sobol sample, this can be done by passing `SobolSample()` to the `sample` function. ```@example gramacylee1D n = 25 @@ -38,7 +38,7 @@ scatter(x, y, label="Sampled points", xlims=(lower_bound, upper_bound), ylims=(- plot!(xs, gramacylee.(xs), label="True function", legend=:top) ``` -Now, let's fit Gramacy & Lee Function with different Surrogates: +Now, let's fit Gramacy & Lee function with different surrogates: ```@example gramacylee1D my_pol = PolynomialChaosSurrogate(x, y, lower_bound, upper_bound) diff --git a/docs/src/index.md b/docs/src/index.md index 51538ba6..e0484b70 100644 --- a/docs/src/index.md +++ b/docs/src/index.md @@ -8,7 +8,7 @@ The construction of a surrogate model can be seen as a three-step process: 2. Construction of the surrogate model 3. Surrogate optimization -The sampling methods are super important for the behavior of the Surrogate. 
Sampling can be done through [QuasiMonteCarlo.jl](https://github.com/SciML/QuasiMonteCarlo.jl), all the functions available there can be used in Surrogates.jl. +The sampling methods are super important for the behavior of the surrogate. Sampling can be done through [QuasiMonteCarlo.jl](https://github.com/SciML/QuasiMonteCarlo.jl), all the functions available there can be used in Surrogates.jl. The available surrogates are: @@ -27,7 +27,7 @@ That is, simultaneously looking for a minimum **and** sampling the most unknown The available optimization methods are: - Stochastic RBF (SRBF) -- Lower confidence bound strategy (LCBS) +- Lower confidence-bound strategy (LCBS) - Expected improvement (EI) - Dynamic coordinate search (DYCORS) diff --git a/docs/src/kriging.md b/docs/src/kriging.md index d64a6f09..b6e93fa9 100644 --- a/docs/src/kriging.md +++ b/docs/src/kriging.md @@ -1,10 +1,10 @@ ## Kriging surrogate tutorial (1D) -Kriging or Gaussian process regression is a method of interpolation for which the interpolated values are modeled by a Gaussian process. +Kriging or Gaussian process regression, is a method of interpolation in which the interpolated values are modeled by a Gaussian process. We are going to use a Kriging surrogate to optimize $f(x)=(6x-2)^2sin(12x-4)$. (function from Forrester et al. (2008)). -First of all import `Surrogates` and `Plots`. +First of all, import `Surrogates` and `Plots`. ```@example kriging_tutorial1d using Surrogates using Plots @@ -12,7 +12,7 @@ default() ``` ### Sampling -We choose to sample f in 4 points between 0 and 1 using the `sample` function. The sampling points are chosen using a Sobol sequence, this can be done by passing `SobolSample()` to the `sample` function. +We choose to sample f in 4 points between 0 and 1 using the `sample` function. The sampling points are chosen using a Sobol sequence; This can be done by passing `SobolSample()` to the `sample` function. ```@example kriging_tutorial1d # https://www.sfu.ca/~ssurjano/forretal08.html @@ -33,9 +33,9 @@ plot!(xs, f.(xs), label="True function", legend=:top) ``` ### Building a surrogate -With our sampled points we can build the Kriging surrogate using the `Kriging` function. +With our sampled points, we can build the Kriging surrogate using the `Kriging` function. -`kriging_surrogate` behaves like an ordinary function which we can simply plot. A nice statistical property of this surrogate is being able to calculate the error of the function at each point, we plot this as a confidence interval using the `ribbon` argument. +`kriging_surrogate` behaves like an ordinary function, which we can simply plot. A nice statistical property of this surrogate is being able to calculate the error of the function at each point. We plot this as a confidence interval using the `ribbon` argument. ```@example kriging_tutorial1d kriging_surrogate = Kriging(x, y, lower_bound, upper_bound); @@ -47,7 +47,7 @@ plot!(xs, kriging_surrogate.(xs), label="Surrogate function", ribbon=p->std_erro ### Optimizing Having built a surrogate, we can now use it to search for minima in our original function `f`. -To optimize using our surrogate we call `surrogate_optimize` method. We choose to use Stochastic RBF as optimization technique and again Sobol sampling as sampling technique. +To optimize using our surrogate, we call `surrogate_optimize` method. We choose to use Stochastic RBF as the optimization technique and again Sobol sampling as the sampling technique. 
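The same call also accepts the other optimization techniques listed on the index page. For example, a sketch using expected improvement instead of Stochastic RBF, with the same arguments as the SRBF call below:

```julia
# Sketch: identical interface, different acquisition strategy.
surrogate_optimize(f, EI(), lower_bound, upper_bound, kriging_surrogate, SobolSample())
```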
```@example kriging_tutorial1d @show surrogate_optimize(f, SRBF(), lower_bound, upper_bound, kriging_surrogate, SobolSample()) @@ -60,7 +60,7 @@ plot!(xs, kriging_surrogate.(xs), label="Surrogate function", ribbon=p->std_erro ## Kriging surrogate tutorial (ND) -First of all let's define the function we are going to build a surrogate for. Notice how its argument is a vector of numbers, one for each coordinate, and its output is a scalar. +First of all, let's define the function we are going to build a surrogate for. Notice how its argument is a vector of numbers, one for each coordinate, and its output is a scalar. ```@example kriging_tutorialnd using Plots # hide @@ -81,7 +81,7 @@ end ``` ### Sampling -Let's define our bounds, this time we are working in two dimensions. In particular we want our first dimension `x` to have bounds `-5, 10`, and `0, 15` for the second dimension. We are taking 50 samples of the space using Sobol Sequences. We then evaluate our function on all of the sampling points. +Let's define our bounds, this time we are working in two dimensions. In particular, we want our first dimension `x` to have bounds `-5, 10`, and `0, 15` for the second dimension. We are taking 50 samples of the space using Sobol sequences. We then evaluate our function on all the sampling points. ```@example kriging_tutorialnd n_samples = 10 @@ -103,7 +103,7 @@ scatter!(xs, ys) # hide plot(p1, p2, title="True function") # hide ``` ### Building a surrogate -Using the sampled points we build the surrogate, the steps are analogous to the 1-dimensional case. +Using the sampled points, we build the surrogate, the steps are analogous to the 1-dimensional case. ```@example kriging_tutorialnd kriging_surrogate = Kriging(xys, zs, lower_bound, upper_bound, p=[2.0, 2.0], theta=[0.03, 0.003]) @@ -118,7 +118,7 @@ plot(p1, p2, title="Surrogate") # hide ``` ### Optimizing -With our surrogate we can now search for the minima of the branin function. +With our surrogate, we can now search for the minima of the branin function. Notice how the new sampled points, which were created during the optimization process, are appended to the `xys` array. This is why its size changes. diff --git a/docs/src/lobachevsky.md b/docs/src/lobachevsky.md index d8ad9dcc..76584aa6 100644 --- a/docs/src/lobachevsky.md +++ b/docs/src/lobachevsky.md @@ -26,9 +26,9 @@ plot!(f, label="True function", xlims=(lower_bound, upper_bound)) ``` ## Building a surrogate -With our sampled points we can build the Lobachevsky surrogate using the `LobachevskySurrogate` function. +With our sampled points, we can build the Lobachevsky surrogate using the `LobachevskySurrogate` function. -`lobachevsky_surrogate` behaves like an ordinary function which we can simply plot. Alpha is the shape parameters and n specify how close you want lobachevsky function to radial basis function. +`lobachevsky_surrogate` behaves like an ordinary function, which we can simply plot. Alpha is the shape parameter, and n specifies how close you want Lobachevsky function to be to the radial basis function. ```@example LobachevskySurrogate_tutorial alpha = 2.0 @@ -41,7 +41,7 @@ plot!(lobachevsky_surrogate, label="Surrogate function", xlims=(lower_bound, up ## Optimizing Having built a surrogate, we can now use it to search for minima in our original function `f`. -To optimize using our surrogate we call `surrogate_optimize` method. We choose to use Stochastic RBF as optimization technique and again Sobol sampling as sampling technique. 
+To optimize using our surrogate we call `surrogate_optimize` method. We choose to use Stochastic RBF as the optimization technique and again Sobol sampling as the sampling technique. ```@example LobachevskySurrogate_tutorial @show surrogate_optimize(f, SRBF(), lower_bound, upper_bound, lobachevsky_surrogate, SobolSample()) @@ -55,7 +55,7 @@ In the example below, it shows how to use `lobachevsky_surrogate` for higher dim # Lobachevsky Surrogate Tutorial (ND): -First of all we will define the `Schaffer` function we are going to build surrogate for. Notice, one how its argument is a vector of numbers, one for each coordinate, and its output is a scalar. +First of all, we will define the `Schaffer` function we are going to build surrogate for. Notice, one how its argument is a vector of numbers, one for each coordinate, and its output is a scalar. ```@example LobachevskySurrogate_ND using Plots # hide @@ -74,7 +74,7 @@ end ## Sampling -Let's define our bounds, this time we are working in two dimensions. In particular we want our first dimension `x` to have bounds `0, 8`, and `0, 8` for the second dimension. We are taking 60 samples of the space using Sobol Sequences. We then evaluate our function on all of the sampling points. +Let's define our bounds, this time we are working in two dimensions. In particular, we want our first dimension `x` to have bounds `0, 8`, and `0, 8` for the second dimension. We are taking 60 samples of the space using Sobol Sequences. We then evaluate our function on all of the sampling points. ```@example LobachevskySurrogate_ND n_samples = 60 @@ -98,7 +98,7 @@ plot(p1, p2, title="True function") # hide ## Building a surrogate -Using the sampled points we build the surrogate, the steps are analogous to the 1-dimensional case. +Using the sampled points, we build the surrogate, the steps are analogous to the 1-dimensional case. ```@example LobachevskySurrogate_ND Lobachevsky = LobachevskySurrogate(xys, zs, lower_bound, upper_bound, alpha = [2.4,2.4], n=8) @@ -114,7 +114,7 @@ plot(p1, p2, title="Surrogate") # hide ## Optimizing -With our surrogate we can now search for the minima of the function. +With our surrogate, we can now search for the minima of the function. Notice how the new sampled points, which were created during the optimization process, are appended to the `xys` array. This is why its size changes. diff --git a/docs/src/lp.md b/docs/src/lp.md index 8a7902c8..1e47e97d 100644 --- a/docs/src/lp.md +++ b/docs/src/lp.md @@ -32,7 +32,7 @@ plot(x, y, seriestype=:scatter, label="Sampled points", xlims=(lb, ub), ylims=(0 plot!(xs,f.(xs,p), label="True function", legend=:top) ``` -Fitting different Surrogates: +Fitting different surrogates: ```@example lp my_pol = PolynomialChaosSurrogate(x,y,lb,ub) loba_1 = LobachevskySurrogate(x,y,lb,ub) diff --git a/docs/src/moe.md b/docs/src/moe.md index 0bcd432f..a448bc6e 100644 --- a/docs/src/moe.md +++ b/docs/src/moe.md @@ -1,7 +1,7 @@ ## Mixture of Experts (MOE) !!! note - This surrogate requires the 'SurrogatesMOE' module which can be added by inputting "]add SurrogatesMOE" from the Julia command line. + This surrogate requires the 'SurrogatesMOE' module, which can be added by inputting "]add SurrogatesMOE" from the Julia command line. The Mixture of Experts (MOE) Surrogate model represents the interpolating function as a combination of other surrogate models. SurrogatesMOE is a Julia implementation of the [Python version from SMT](https://smt.readthedocs.io/en/latest/_src_docs/applications/moe.html). 
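Conceptually, an MOE prediction is a weighted combination of local expert surrogates, with the weights supplied by a gating model. A rough sketch of that idea (the principle only, not the SurrogatesMOE internals):

```julia
# Gate-weighted combination of expert surrogates (conceptual sketch only).
function moe_predict(x, experts, gate)
    w = gate(x)   # one weight per expert at the query point, summing to 1
    return sum(w[i] * experts[i](x) for i in eachindex(experts))
end
```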
@@ -43,7 +43,7 @@ RAD_1D = RadialBasis(x, y, lb, ub, rad = linearRadial(), scale_factor = 1.0, spa RAD_at0 = RAD_1D(0.0) #true value should be 5.0 ``` -As we can see, the prediction is far away from the ground truth. Now, how does the MOE perform? +As we can see, the prediction is far from the ground truth. Now, how does the MOE perform? ```@example MOE_1D expert_types = [ @@ -57,7 +57,7 @@ MOE_1D_RAD_RAD = MOE(x, y, expert_types) MOE_at0 = MOE_1D_RAD_RAD(0.0) ``` -As we can see the accuracy is significantly better. +As we can see, the accuracy is significantly better. ### Under the Hood - How SurrogatesMOE Works diff --git a/docs/src/multi_objective_opt.md b/docs/src/multi_objective_opt.md index b9150963..eb066664 100644 --- a/docs/src/multi_objective_opt.md +++ b/docs/src/multi_objective_opt.md @@ -1,6 +1,6 @@ -# Multi objective optimization +# Multi-objective optimization -## Case 1: Non colliding objective functions +## Case 1: Non-colliding objective functions ```@example multi_obj using Surrogates diff --git a/docs/src/neural.md b/docs/src/neural.md index d4638522..3a8311b8 100644 --- a/docs/src/neural.md +++ b/docs/src/neural.md @@ -1,11 +1,11 @@ # Neural network tutorial !!! note - This surrogate requires the 'SurrogatesFlux' module which can be added by inputting "]add SurrogatesFlux" from the Julia command line. + This surrogate requires the 'SurrogatesFlux' module, which can be added by inputting "]add SurrogatesFlux" from the Julia command line. It's possible to define a neural network as a surrogate, using Flux. This is useful because we can call optimization methods on it. -First of all we will define the `Schaffer` function we are going to build surrogate for. +First of all we will define the `Schaffer` function we are going to build a surrogate for. ```@example Neural_surrogate using Plots @@ -26,7 +26,7 @@ end ## Sampling -Let's define our bounds, this time we are working in two dimensions. In particular we want our first dimension `x` to have bounds `0, 8`, and `0, 8` for the second dimension. We are taking 60 samples of the space using Sobol Sequences. We then evaluate our function on all of the sampling points. +Let's define our bounds, this time we are working in two dimensions. In particular we want our first dimension `x` to have bounds `0, 8`, and `0, 8` for the second dimension. We are taking 60 samples of the space using Sobol Sequences. We then evaluate our function on all the sampling points. ```@example Neural_surrogate n_samples = 60 @@ -50,8 +50,8 @@ plot(p1, p2, title="True function") # hide ## Building a surrogate -You can specify your own model, optimization function, loss functions and epochs. -As always, getting the model right is hardest thing. +You can specify your own model, optimization function, loss functions, and epochs. +As always, getting the model right is the hardest thing. ```@example Neural_surrogate model1 = Chain( diff --git a/docs/src/parallel.md b/docs/src/parallel.md index 9bff2f4e..fccffbbd 100755 --- a/docs/src/parallel.md +++ b/docs/src/parallel.md @@ -11,7 +11,7 @@ To enable parallel optimization, we make use of an Ask-Tell interface. The user To ensure that points of interest returned by `potential_optimal_points` are sufficiently far from each other, the function makes use of *virtual points*. They are used as follows: 1. `potential_optimal_points` is told to return `n` points. 2. The point with the highest merit function value is selected. -3. 
This point is now treated as a virtual point and is assigned a temporary value that changes the landscape of the merit function. How the the temporary value is chosen depends on the strategy used. (see below) +3. This point is now treated as a virtual point and is assigned a temporary value that changes the landscape of the merit function. How the temporary value is chosen depends on the strategy used. (see below) 4. The point with the new highest merit is selected. 5. The process is repeated until `n` points have been selected. @@ -22,9 +22,9 @@ The following strategies are available for virtual point selection for all optim - "Mean Constant Liar (MeanConstantLiar)": - The virtual point is assigned using the mean of the merit function across all evaluated points. - "Maximum Constant Liar (MaximumConstantLiar)": - - The virtual point is assigned using the great known value of the merit function across all evaluated points. + - The virtual point is assigned using the greatest known value of the merit function across all evaluated points. -For Kriging surrogates, specifically, the above and follow strategies are available: +For Kriging surrogates, specifically, the above and following strategies are available: - "Kriging Believer (KrigingBeliever): - The virtual point is assigned using the mean of the Kriging surrogate at the virtual point. @@ -34,7 +34,7 @@ For Kriging surrogates, specifically, the above and follow strategies are availa - The virtual point is assigned using 3$\sigma$ below the mean of the Kriging surrogate at the virtual point. -In general, MinimumConstantLiar and KrigingBelieverLowerBound tend to favor exploitation while MaximumConstantLiar and KrigingBelieverUpperBound tend to favor exploration. MeanConstantLiar and KrigingBeliever tend to be a compromise between the two. +In general, MinimumConstantLiar and KrigingBelieverLowerBound tend to favor exploitation, while MaximumConstantLiar and KrigingBelieverUpperBound tend to favor exploration. MeanConstantLiar and KrigingBeliever tend to be compromises between the two. ## Examples diff --git a/docs/src/polychaos.md b/docs/src/polychaos.md index 24b36857..51644509 100644 --- a/docs/src/polychaos.md +++ b/docs/src/polychaos.md @@ -9,7 +9,7 @@ we are trying to fit. Under the hood, PolyChaos.jl has been used. It is possible to specify a type of polynomial for each dimension of the problem. ### Sampling -We choose to sample f in 25 points between 0 and 10 using the `sample` function. The sampling points are chosen using a Low Discrepancy, this can be done by passing `HaltonSample()` to the `sample` function. +We choose to sample f in 25 points between 0 and 10 using the `sample` function. The sampling points are chosen using a Low Discrepancy. This can be done by passing `HaltonSample()` to the `sample` function. ```@example polychaos using Surrogates diff --git a/docs/src/radials.md b/docs/src/radials.md index aa88629f..420f6683 100644 --- a/docs/src/radials.md +++ b/docs/src/radials.md @@ -45,7 +45,7 @@ radial_surrogate = RadialBasis(x, y, lower_bound, upper_bound, rad = cubicRadial val = radial_surrogate(5.4) ``` -Currently available radial basis functions are `linearRadial` (the default), `cubicRadial`, `multiquadricRadial`, and `thinplateRadial`. +Currently, available radial basis functions are `linearRadial` (the default), `cubicRadial`, `multiquadricRadial`, and `thinplateRadial`. 
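If it helps to compare them, the other bases can be constructed on the same data in exactly the same way. A sketch reusing the `x`, `y`, and bounds defined above (the basis constructors are called here with their default parameters):

```julia
rb_multi = RadialBasis(x, y, lower_bound, upper_bound, rad = multiquadricRadial())
rb_thin = RadialBasis(x, y, lower_bound, upper_bound, rad = thinplateRadial())
rb_multi(5.4), rb_thin(5.4)
```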
Now, we will simply plot `radial_surrogate`: @@ -60,7 +60,7 @@ plot!(radial_surrogate, label="Surrogate function", xlims=(lower_bound, upper_b Having built a surrogate, we can now use it to search for minima in our original function `f`. -To optimize using our surrogate we call `surrogate_optimize` method. We choose to use Stochastic RBF as optimization technique and again Sobol sampling as sampling technique. +To optimize using our surrogate, we call `surrogate_optimize` method. We choose to use Stochastic RBF as the optimization technique and again Sobol sampling as the sampling technique. ```@example RadialBasisSurrogate @show surrogate_optimize(f, SRBF(), lower_bound, upper_bound, radial_surrogate, SobolSample()) @@ -72,7 +72,7 @@ plot!(radial_surrogate, label="Surrogate function", xlims=(lower_bound, upper_b ## Radial Basis Surrogate tutorial (ND) -First of all we will define the `Booth` function we are going to build the surrogate for: +First of all, we will define the `Booth` function we are going to build the surrogate for: $f(x) = (x_1 + 2*x_2 - 7)^2 + (2*x_1 + x_2 - 5)^2$ @@ -132,7 +132,7 @@ plot(p1, p2, title="Surrogate") # hide ``` ### Optimizing -With our surrogate we can now search for the minima of the function. +With our surrogate, we can now search for the minima of the function. Notice how the new sampled points, which were created during the optimization process, are appended to the `xys` array. This is why its size changes. diff --git a/docs/src/randomforest.md b/docs/src/randomforest.md index 8609bb85..4086a0b5 100644 --- a/docs/src/randomforest.md +++ b/docs/src/randomforest.md @@ -1,11 +1,11 @@ ## Random forests surrogate tutorial !!! note - This surrogate requires the 'SurrogatesRandomForest' module which can be added by inputting "]add SurrogatesRandomForest" from the Julia command line. + This surrogate requires the 'SurrogatesRandomForest' module, which can be added by inputting "]add SurrogatesRandomForest" from the Julia command line. Random forests is a supervised learning algorithm that randomly creates and merges multiple decision trees into one forest. -We are going to use a Random forests surrogate to optimize $f(x)=sin(x)+sin(10/3 * x)$. +We are going to use a random forests surrogate to optimize $f(x)=sin(x)+sin(10/3 * x)$. First of all import `Surrogates` and `Plots`. ```@example RandomForestSurrogate_tutorial @@ -30,9 +30,9 @@ plot!(f, label="True function", xlims=(lower_bound, upper_bound), legend=:top) ``` ### Building a surrogate -With our sampled points we can build the Random forests surrogate using the `RandomForestSurrogate` function. +With our sampled points, we can build the Random forests surrogate using the `RandomForestSurrogate` function. -`randomforest_surrogate` behaves like an ordinary function which we can simply plot. Additionally you can specify the number of trees created +`randomforest_surrogate` behaves like an ordinary function, which we can simply plot. Additionally, you can specify the number of trees created using the parameter num_round ```@example RandomForestSurrogate_tutorial @@ -45,7 +45,7 @@ plot!(randomforest_surrogate, label="Surrogate function", xlims=(lower_bound, u ### Optimizing Having built a surrogate, we can now use it to search for minima in our original function `f`. -To optimize using our surrogate we call `surrogate_optimize` method. We choose to use Stochastic RBF as optimization technique and again Sobol sampling as sampling technique. +To optimize using our surrogate, we call `surrogate_optimize` method. 
We choose to use Stochastic RBF as the optimization technique and again Sobol sampling as the sampling technique. ```@example RandomForestSurrogate_tutorial @show surrogate_optimize(f, SRBF(), lower_bound, upper_bound, randomforest_surrogate, SobolSample()) @@ -57,7 +57,7 @@ plot!(randomforest_surrogate, label="Surrogate function", xlims=(lower_bound, u ## Random Forest ND -First of all we will define the `Bukin Function N. 6` function we are going to build surrogate for. +First of all we will define the `Bukin Function N. 6` function we are going to build a surrogate for. ```@example RandomForestSurrogateND using Plots # hide @@ -76,7 +76,7 @@ end ### Sampling -Let's define our bounds, this time we are working in two dimensions. In particular we want our first dimension `x` to have bounds `-5, 10`, and `0, 15` for the second dimension. We are taking 50 samples of the space using Sobol Sequences. We then evaluate our function on all of the sampling points. +Let's define our bounds, this time we are working in two dimensions. In particular we want our first dimension `x` to have bounds `-5, 10`, and `0, 15` for the second dimension. We are taking 50 samples of the space using Sobol Sequences. We then evaluate our function on all the sampling points. ```@example RandomForestSurrogateND n_samples = 50 @@ -100,7 +100,7 @@ plot(p1, p2, title="True function") # hide ### Building a surrogate -Using the sampled points we build the surrogate, the steps are analogous to the 1-dimensional case. +Using the sampled points, we build the surrogate, the steps are analogous to the 1-dimensional case. ```@example RandomForestSurrogateND using SurrogatesRandomForest @@ -117,7 +117,7 @@ plot(p1, p2, title="Surrogate") # hide ### Optimizing -With our surrogate we can now search for the minima of the function. +With our surrogate, we can now search for the minima of the function. Notice how the new sampled points, which were created during the optimization process, are appended to the `xys` array. This is why its size changes. diff --git a/docs/src/rosenbrock.md b/docs/src/rosenbrock.md index 046bb4c0..a882b438 100644 --- a/docs/src/rosenbrock.md +++ b/docs/src/rosenbrock.md @@ -38,7 +38,7 @@ scatter!(xs, ys) plot(p1, p2, title="True function") ``` -Fitting different Surrogates: +Fitting different surrogates: ```@example rosen mypoly = PolynomialChaosSurrogate(xys, zs, lb, ub) loba = LobachevskySurrogate(xys, zs, lb, ub) diff --git a/docs/src/secondorderpoly.md b/docs/src/secondorderpoly.md index 97826e85..790f34d3 100644 --- a/docs/src/secondorderpoly.md +++ b/docs/src/secondorderpoly.md @@ -3,7 +3,7 @@ The square polynomial model can be expressed by: ``y = Xβ + ϵ`` Where X is the matrix of the linear model augmented by adding 2d columns, -containing pair by pair product of variables and variables squared. +containing pair by pair products of variables and variables squared. ```@example second_order_tut using Surrogates @@ -40,4 +40,4 @@ scatter(x, y, label="Sampled points") plot!(f, label="True function", xlims=(lb, ub)) plot!(sec, label="Surrogate function", xlims=(lb, ub)) ``` -The optimization method successfully found the minima. +The optimization method successfully found the minimum. 
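To make the augmented design matrix from the start of this page concrete, here is one possible reading of that description for two inputs; the helper name is hypothetical and only illustrates which columns are added:

```julia
# One row of the augmented design matrix for inputs (x1, x2): the linear-model
# columns [1, x1, x2] extended with the squares and the pairwise product.
second_order_row(x1, x2) = [1.0, x1, x2, x1^2, x2^2, x1 * x2]
```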
diff --git a/docs/src/sphere_function.md b/docs/src/sphere_function.md index 0ad974ec..7c6257d4 100644 --- a/docs/src/sphere_function.md +++ b/docs/src/sphere_function.md @@ -42,7 +42,7 @@ plot!(xs, rad_1d_cubic.(xs), label="Radial surrogate with cubic", legend=:top) plot!(xs, rad_1d_multiquadric.(xs), label="Radial surrogate with multiquadric", legend=:top) ``` -Fitting Lobachevsky Surrogate with different values of hyperparameters alpha: +Fitting Lobachevsky Surrogate with different values of hyperparameter alpha: ```@example sphere_function loba_1 = LobachevskySurrogate(x,y,lb,ub) loba_2 = LobachevskySurrogate(x,y,lb,ub,alpha = 1.5, n = 6) diff --git a/docs/src/tutorials.md b/docs/src/tutorials.md index eec1fc58..3dd7d8fa 100644 --- a/docs/src/tutorials.md +++ b/docs/src/tutorials.md @@ -35,7 +35,7 @@ approx = my_radial_basis((1.0,1.4)) ## Kriging standard error Let's now use the Kriging surrogate, which is a single-output Gaussian process. This surrogate has a nice feature: not only does it approximate the solution at a -point, it also calculates the standard error at such point. +point, it also calculates the standard error at such a point. Let's see an example: ```@example kriging @@ -55,7 +55,7 @@ approx = my_krig(5.4) std_err = std_error_at_point(my_krig,5.4) ``` -Let's now optimize the Kriging surrogate using Lower confidence bound method, this is just a one-liner: +Let's now optimize the Kriging surrogate using the lower confidence bound method. This is just a one-liner: ```@example kriging surrogate_optimize(f,LCBS(),lb,ub,my_krig,RandomSample(); maxiters = 10, num_new_samples = 10) diff --git a/docs/src/variablefidelity.md b/docs/src/variablefidelity.md index c375a097..18047510 100644 --- a/docs/src/variablefidelity.md +++ b/docs/src/variablefidelity.md @@ -1,7 +1,7 @@ # Variable fidelity Surrogates -With the variable fidelity surrogate, we can specify two different surrogates: one for high fidelity data and one for low fidelity data. -By default, the first half samples are considered high fidelity and the second half low fidelity. +With the variable fidelity surrogate, we can specify two different surrogates: one for high-fidelity data and one for low-fidelity data. +By default, the first half of the samples are considered high-fidelity and the second half low-fidelity. ```@example variablefid using Surrogates diff --git a/docs/src/welded_beam.md b/docs/src/welded_beam.md index 8e22a508..c207bce4 100644 --- a/docs/src/welded_beam.md +++ b/docs/src/welded_beam.md @@ -1,4 +1,4 @@ -#Welded beam function +# Welded beam function The welded beam function is defined as: ``f(h,l,t) = \sqrt{\frac{a^2 + b^2 + abl}{\sqrt{0.25(l^2+(h+t)^2)}}}`` diff --git a/docs/src/wendland.md b/docs/src/wendland.md index b3de1667..88e97521 100644 --- a/docs/src/wendland.md +++ b/docs/src/wendland.md @@ -19,7 +19,7 @@ x = sample(n,lower_bound,upper_bound,SobolSample()) y = f.(x) ``` -We choose to sample f in 30 points between 5 to 25 using `sample` function. The sampling points are chosen using a Sobol sequence, this can be done by passing `SobolSample()` to the `sample` function. +We choose to sample f in 30 points between 5 and 25 using `sample` function. The sampling points are chosen using a Sobol sequence, this can be done by passing `SobolSample()` to the `sample` function. 
## Building Surrogate diff --git a/lib/SurrogatesFlux/Project.toml b/lib/SurrogatesFlux/Project.toml index 8fc1c5de..5abdf5a5 100644 --- a/lib/SurrogatesFlux/Project.toml +++ b/lib/SurrogatesFlux/Project.toml @@ -8,7 +8,7 @@ Flux = "587475ba-b771-5e3f-ad9e-33799f191a9c" Surrogates = "6fc51010-71bc-11e9-0e15-a3fcc6593c49" [compat] -Flux = "0.14" +Flux = "0.13, 0.14" Surrogates = "6" julia = "1.10" diff --git a/src/Optimization.jl b/src/Optimization.jl index b7d7e3f8..cfc3e0ba 100755 --- a/src/Optimization.jl +++ b/src/Optimization.jl @@ -57,13 +57,13 @@ function merit_function(point, w, surr::AbstractSurrogate, s_max, s_min, d_max, end """ -The main idea is to pick the new evaluations from a set of candidate points where each candidate point is generated as an N(0, sigma^2) +The main idea is to pick the new evaluations from a set of candidate points, where each candidate point is generated as an N(0, sigma^2) distributed perturbation from the current best solution. The value of sigma is modified based on progress and follows the same logic as in many trust region methods: we increase sigma if we make a lot of progress (the surrogate is accurate) and decrease sigma when we aren’t able to make progress (the surrogate model is inaccurate). -More details about how sigma is updated is given in the original papers. +More details about how sigma is updated are given in the original papers. After generating the candidate points, we predict their objective function value and compute the minimum distance to the previously evaluated point. @@ -543,7 +543,7 @@ function potential_optimal_points(::SRBF, strategy, lb::Number, ub::Number, # Loop until we have n_parallel new points while new_addition < n_parallel - #3) Evaluate merit function at the sampled points in parallel + #3) Evaluate merit function at the sampled points in parallel evaluation_of_merit_function = merit_function.(new_sample, w, tmp_surr, s_max, s_min, d_max, d_min, box_size) @@ -1117,10 +1117,10 @@ Combining radial basis function surrogates and dynamic coordinate search in high Engineering Optimization, 45(5): 529–555, 2013. This is an extension of the SRBF strategy that changes how the candidate points are generated. The main idea is that many objective -functions depend only on a few directions so it may be advantageous to +functions depend only on a few directions, so it may be advantageous to perturb only a few directions. In particular, we use a perturbation probability to perturb a given coordinate and decrease this probability after each function -evaluation so fewer coordinates are perturbed later in the optimization. +evaluation, so fewer coordinates are perturbed later in the optimization. """ function surrogate_optimize(obj::Function, ::DYCORS, lb, ub, surrn::AbstractSurrogate, sample_type::SamplingAlgorithm; maxiters = 100,