Automatic step sizes for SVRG #207
Conversation
- This needs to be more exhaustively tested to make sure it works on real datasets.
- The regularization strength might have to be added to the
This PR provides the infrastructure for computing an optimal step size and batch size for SVRG based on the GLM configuration. The optimal hyperparameters depend on the L-smoothness of the loss function. This means that for each model configuration (observation noise, link function, regularization), one may need to compute a different estimate of the smoothness parameters. Here, I implemented a look-up table that should be easy to extend whenever new estimates become available (for example, if we derive the L-smoothness for Gamma + softplus observations).
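As a hedged sketch of the look-up-table idea (hypothetical names, not the actual NeMoS API): a registry could map (observation model, link function) pairs to smoothness estimators, and be extended as new bounds are derived. The Poisson + softplus bound below reuses the constants that appear in this PR's code.

```python
# Hypothetical sketch, not NeMoS's actual API: a registry mapping
# (observation model, link function) pairs to L-smoothness estimators.
import numpy as np

def l_smooth_poisson_softplus(X, y):
    # Bound on the largest eigenvalue of the loss Hessian for Poisson
    # observations with a softplus inverse link (constants from this PR).
    XDX = X.T @ ((0.17 * y[:, None] + 0.25) * X) / y.shape[0]
    return np.sort(np.linalg.eigvalsh(XDX))[-1]

# Extend with new entries (e.g. Gamma + softplus) once a bound is derived.
SMOOTHNESS_LOOKUP = {
    ("Poisson", "softplus"): l_smooth_poisson_softplus,
}

def get_smoothness(obs_model, link, X, y):
    try:
        return SMOOTHNESS_LOOKUP[(obs_model, link)](X, y)
    except KeyError:
        raise NotImplementedError(
            f"No L-smoothness estimate for {obs_model} + {link} yet."
        )
```

Unsupported configurations fail loudly, which keeps the extension point obvious.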
- It is hard for me to review this (not knowing the math) without an example. Do we want to add an example as part of this PR or do a separate one to add an example for "how to use SVRG"?
- We really do need to get the math for this up somewhere.
- Same point as Initialization #252: it looks like black was run on some test scripts for the first time.
- The new test glm (and test population glm) functions are hard to parse. They look like they're doing a lot -- would it make sense to break them up?
**Fitting Large Models**

For very large models, you may consider using the Stochastic Variance Reduced Gradient
would be nice to point to example in the docs here
I agree, but for a future PR in the documentation. I'll link to this comment in the docs project
**Fitting Large Models**

For very large models, you may consider using the Stochastic Variance Reduced Gradient
same point about example and doc link
subsequent pr
This is ready for another round; I should have addressed everything. I added the PDF as an asset, just so we have it in a place that is easy to find. However, I would not add it to the documentation or any public site yet, since it is very much a work in progress. I want to close this PR quickly, before the math is polished.
@billbrod some of the things I resolved had pending comments that I did not submit. I added the SVRG example to one of the issues in the docs project. The rest should be addressed. If you feel like the SVRG description in the GLM class can still be improved, go ahead and change it, I think that's the upper bound of my English writing skills :)
"Please, consider using the power method by setting the `n_power_iters` parameter "
"(default behavior).",

Suggested change:
"Please, consider using the power method by setting the `n_power_iters` parameter ",
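For context on the power method this message refers to: a minimal illustrative implementation of power iteration for estimating the largest eigenvalue of a symmetric matrix (this sketch is not the PR's actual code; function name and iteration count are assumptions).

```python
import numpy as np

def power_iteration(A, n_iters=50, seed=0):
    """Estimate the largest eigenvalue of a symmetric PSD matrix A
    without computing a full eigendecomposition."""
    rng = np.random.default_rng(seed)
    v = rng.normal(size=A.shape[0])
    v /= np.linalg.norm(v)
    for _ in range(n_iters):
        # Repeated multiplication aligns v with the dominant eigenvector.
        w = A @ v
        v = w / np.linalg.norm(w)
    # The Rayleigh quotient gives the eigenvalue estimate.
    return v @ A @ v
```

Each iteration costs one matrix-vector product, which is why it scales better than a direct `eigvalsh` call on large design matrices.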
# Calculate the Hessian directly and find the largest eigenvalue
XDX = X.T.dot((0.17 * y.reshape(y.shape[0], 1) + 0.25) * X) / y.shape[0]
return jnp.sort(jnp.linalg.eigvalsh(XDX))[-1]
except RuntimeError as e:
This should be described in the docstring, under Raises (both here and in the user-facing function)
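A hedged sketch of what that docstring section could look like (hypothetical function name, NumPy-style docstring; the body is a stub, not the PR's real logic):

```python
def softplus_poisson_l_smooth(X, y, n_power_iters=None):
    """Estimate the L-smoothness constant for Poisson + softplus.

    Raises
    ------
    RuntimeError
        If ``n_power_iters`` is None and the direct eigenvalue
        computation of the Hessian fails.
    """
    # Sketch only; see the PR diff for the real implementation.
    raise NotImplementedError
```

The same Raises entry would be mirrored in the user-facing function, as the review comment suggests.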
"n_power_iter, expectation",
[
    (None, pytest.warns(UserWarning, match="Direct computation of the eigenvalues")),
    (1, does_not_raise()),
What's the behavior with `n_power_iter=0`?
This is good to go once we add the RuntimeError for the eigenvalue to the docstrings.
I think the glm docstring is clear enough for now -- once we add the relevant doc, I think we can basically remove that info from the docstring (moving it to the tutorial) and point to the tutorial.
Merged b85f408 into flatironinstitute:development.
Attempt to automatically determine the batch and step sizes for SVRG when fitting a GLM with Poisson observations and a softplus inverse link function.
Based on this paper.
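For intuition only (this sketch does not reproduce the linked paper's exact formulas, and the helper name and constant are assumptions): SVRG convergence analyses typically pick a step size that scales inversely with the smoothness constant L of the loss.

```python
def svrg_step_size(l_smooth, c=4.0):
    # Hypothetical helper: a conservative step size on the order of 1/L,
    # as is standard in SVRG analyses (c is an illustrative constant).
    return 1.0 / (c * l_smooth)

# A less smooth loss (larger L) forces a smaller step.
svrg_step_size(2.0)   # 0.125
svrg_step_size(10.0)  # 0.025
```

This is why the smoothness estimate (look-up table or power method) is the key quantity the PR needs to compute before a step size can be chosen automatically.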