
Commit

Big variance-related bug fix to xtrace
peekxc committed Dec 4, 2024
1 parent 3a0ebb6 commit 4a7c940
Showing 7 changed files with 94 additions and 437 deletions.
75 changes: 40 additions & 35 deletions docs/_site/basic/quickstart.html

Large diffs are not rendered by default.

29 changes: 20 additions & 9 deletions docs/_site/search.json
@@ -916,9 +916,9 @@
{
"objectID": "basic/quickstart.html",
"href": "basic/quickstart.html",
"title": "primate quickstart",
"title": "Quickstart",
"section": "",
"text": "primate is a package that contains a variety of algorithms for estimating quantities from matrix functions, with a focus on implicit matrix representations and support for common quantities of interest, such as the trace or the diagonal.\nBelow is a quick introduction to:",
"text": "primate contains a variety of algorithms for estimating quantities from matrices and matrix functions, with a focus on common quantities of interest, such as the trace or the diagonal. This page briefly outlines the variety of options primate offers to approximate these quantities, and how to configure these approximations per use case.",
"crumbs": [
"Reference",
"Quickstart"
@@ -927,9 +927,9 @@
{
"objectID": "basic/quickstart.html#trace-estimation",
"href": "basic/quickstart.html#trace-estimation",
"title": "primate quickstart",
"title": "Quickstart",
"section": "Trace estimation",
"text": "Trace estimation\nA variety of trace estimators are available in the primate.trace module. These algorithms estimate the quantity:\n \\mathrm{tr}(A) = \\sum\\limits_{i=1}^n A_{ii} = \\sum\\limits_{i=1}^n \\lambda_i \nThe following example demonstrates a variety of trace estimators, including the Girard-Hutchinson estimator (hutch), the improved Hutch++ (hutchpp), and XTrace (xtrace):\n\nfrom primate.trace import hutch, hutchpp, xtrace\nfrom primate.random import symmetric\nrng = np.random.default_rng(1234) # for reproducibility \nA = symmetric(150, pd=True, seed=rng) # random PD matrix \n\nprint(f\"Trace : {A.trace():6f}\") ## Actual trace\nprint(f\"Hutch : {hutch(A):6f}\") ## Crude Monte-Carlo estimator\nprint(f\"Hutch++ : {hutchpp(A):6f}\") ## Monte-Carlo estimator w/ deflation \nprint(f\"XTrace : {xtrace(A):6f}\") ## Epperly's algorithm\n\nTrace : 74.884261\nHutch : 74.420094\nHutch++ : 74.944518\nXTrace : 74.884261\n\n\nHere, A can be a numpy matrix, a sparse matrix, or a LinearOperator. If the spectral sum of interest is not just the sum of the eigenvalues, but rather the sum under composition with some spectral function f(\\lambda), i.e. the quantity of interest is of the form:\n \\mathrm{tr}(f(A)) = \\sum\\limits_{i=1}^n f(A)_{ii} = \\sum\\limits_{i=1}^n f(\\lambda_i) \nThen you may alternatively use the matrix_function API to construct a LinearOperator by passing in the corresponding spectral function as a Callable or by passing a string representing the name of one of the following built-in spectral functions:\n\n\n\n\n\n\n\n\n\nname\nmatrix function\nApplications\nnumpy call\n\n\n\n\nidentity\nA\nBasic matrix operations\nA\n\n\nlog\n\\log(A)\nDeterminant, entropy-like measures\nlogm(A)\n\n\nexp\ne^A\nDynamical systems, graph diffusion\nexpm(A)\n\n\ninv\nA^{-1}\nStability analysis, linear systems\ninv(A)\n\n\nsign\n\\text{sgn}_\\epsilon(A)\nRank approximation, low-rank modeling\nU @ U.T\n\n\nsqrt\nA^{1/2}\nDiffusion, kernel methods\nsqrtm(A)\n\n\nsquare\nA^2\nEnergy measures, stability\nA @ A\n\n\nPCA\nP_A (eigenspace)\nDim. reduction, feature extraction\nCustom projection matrix\n\n\nTikhonov\n(A + \\lambda I)^{-1}\nRegularized inversion, stability\ninv(A + lambda * eye(n))\n\n\nheat\ne^{-tA}\nDiffusion on graphs, spectral clustering\nexpm(-t * A)\n\n\npagerank\n(I - \\alpha A)^{-1}v\nNetwork centrality, web ranking\nIterative solver\n\n\n\n\nFor example, one might compute the log-determinant of a positive definite matrix as follows:\n\nfrom primate.operators import matrix_function, IdentityOperator\nM = matrix_function(A, fun=\"log\") # or fun=np.log \n\nprint(f\"logdet(A) : {np.log(np.linalg.det(A)):6f}\")\nprint(f\"tr(log(A)): {np.sum(np.log(np.linalg.eigvalsh(A))):6f}\")\nprint(f\"Hutch : {hutch(M):6f}\")\nprint(f\"Hutch++ : {hutchpp(M):6f}\")\nprint(f\"XTrace : {xtrace(M):6f}\")\n\nlogdet(A) : -151.529763\ntr(log(A)): -151.529763\nHutch : -152.753783\nHutch++ : -151.046977\nXTrace : -151.528096",
"text": "Trace estimation\nThe implicit matrix trace estimation problem seeks to estimate the quantity:\n \\mathrm{tr}(A) = \\sum\\limits_{i=1}^n A_{ii} = \\sum\\limits_{i=1}^n \\lambda_i \nA variety of trace estimators are available in the primate.trace module, including the Girard-Hutchinson estimator (hutch), the improved Hutch++ (hutchpp), and XTrace (xtrace):\n\nfrom primate.trace import hutch, hutchpp, xtrace\nfrom primate.random import symmetric\nrng = np.random.default_rng(1234) # for reproducibility \nA = symmetric(150, pd=True, seed=rng) # random PD matrix \n\nprint(f\"Trace : {A.trace():6f}\") ## Actual trace\nprint(f\"Hutch : {hutch(A):6f}\") ## Crude Monte-Carlo estimator\nprint(f\"Hutch++ : {hutchpp(A):6f}\") ## Monte-Carlo estimator w/ deflation \nprint(f\"XTrace : {xtrace(A):6f}\") ## Epperly's algorithm\n\nTrace : 74.884261\nHutch : 74.774909\nHutch++ : 74.848081\nXTrace : 78.027808\n\n\nHere, A can be a numpy matrix, a sparse matrix, or a LinearOperator. If the spectral sum of interest is not just the sum of the eigenvalues, but rather the sum under composition with some spectral function f(\\lambda), i.e. the quantity of interest is of the form:\n \\mathrm{tr}(f(A)) = \\sum\\limits_{i=1}^n f(A)_{ii} = \\sum\\limits_{i=1}^n f(\\lambda_i) \nThen you may alternatively use the matrix_function API to construct a LinearOperator by passing in the corresponding spectral function as a Callable or by passing a string representing the name of one of the following built-in spectral functions:\n\n\n\n\n\n\n\n\n\nname\nmatrix function\nApplications\nnumpy call\n\n\n\n\nidentity\nA\nBasic matrix operations\nA\n\n\nlog\n\\log(A)\nDeterminant, entropy-like measures\nlogm(A)\n\n\nexp\ne^A\nDynamical systems, graph diffusion\nexpm(A)\n\n\ninv\nA^{-1}\nStability analysis, linear systems\ninv(A)\n\n\nsign\n\\text{sgn}_\\epsilon(A)\nRank approximation, low-rank modeling\nU @ U.T\n\n\nsqrt\nA^{1/2}\nDiffusion, kernel methods\nsqrtm(A)\n\n\nsquare\nA^2\nEnergy measures, stability\nA @ A\n\n\nPCA\nP_A (eigenspace)\nDim. reduction, feature extraction\nCustom projection matrix\n\n\nTikhonov\n(A + \\lambda I)^{-1}\nRegularized inversion, stability\ninv(A + lambda * eye(n))\n\n\nheat\ne^{-tA}\nDiffusion on graphs, spectral clustering\nexpm(-t * A)\n\n\npagerank\n(I - \\alpha A)^{-1}v\nNetwork centrality, web ranking\nIterative solver\n\n\n\n\nFor example, one might compute the log-determinant of a positive definite matrix as follows:\n\nfrom primate.operators import matrix_function, IdentityOperator\nM = matrix_function(A, fun=\"log\") # or fun=np.log \n\nprint(f\"logdet(A) : {np.log(np.linalg.det(A)):6f}\")\nprint(f\"tr(log(A)): {np.sum(np.log(np.linalg.eigvalsh(A))):6f}\")\nprint(f\"Hutch : {hutch(M):6f}\")\nprint(f\"Hutch++ : {hutchpp(M):6f}\")\nprint(f\"XTrace : {xtrace(M):6f}\")\n\nlogdet(A) : -151.529763\ntr(log(A)): -151.529763\nHutch : -152.346095\nHutch++ : -149.577572\nXTrace : -160.472063",
"crumbs": [
"Reference",
"Quickstart"
@@ -938,9 +938,9 @@
{
"objectID": "basic/quickstart.html#diagonal-estimation",
"href": "basic/quickstart.html#diagonal-estimation",
"title": "primate quickstart",
"title": "Quickstart",
"section": "Diagonal estimation",
"text": "Diagonal estimation\nThe diagonals of matrices and matrix functions (implicitly or explicitly represented) can also be estimated via nearly identical API used for the trace.\n\nfrom primate.estimators import arr_summary\nfrom primate.diagonal import diag, xdiag\n\nd1 = A.diagonal()\nd2 = diag(A, rtol=1e-4)\nd3 = xdiag(A)\n\nprint(f\"Diagonal (true): {arr_summary(d1)}\")\nprint(f\"Diagonal Hutch : {arr_summary(d2)}\")\nprint(f\"Diagonal XDiag : {arr_summary(d3)}\")\n\nDiagonal (true): [0.48,0.53,...,0.48]\nDiagonal Hutch : [0.49,0.58,...,0.52]\nDiagonal XDiag : [0.48,0.53,...,0.48]",
"text": "Diagonal estimation\nThe diagonals of matrices and matrix functions (implicitly or explicitly represented) can also be estimated via nearly identical API used for the trace.\n\nfrom primate.estimators import arr_summary\nfrom primate.diagonal import diag, xdiag\n\nd1 = A.diagonal()\nd2 = diag(A, rtol=1e-4)\nd3 = xdiag(A)\n\nprint(f\"Diagonal (true): {arr_summary(d1)}\")\nprint(f\"Diagonal Hutch : {arr_summary(d2)}\")\nprint(f\"Diagonal XDiag : {arr_summary(d3)}\")\n\nDiagonal (true): [0.48,0.53,...,0.48]\nDiagonal Hutch : [0.49,0.53,...,0.51]\nDiagonal XDiag : [0.48,0.54,...,0.48]",
"crumbs": [
"Reference",
"Quickstart"
@@ -949,7 +949,7 @@
{
"objectID": "basic/quickstart.html#matrix-function-approximation",
"href": "basic/quickstart.html#matrix-function-approximation",
"title": "primate quickstart",
"title": "Quickstart",
"section": "Matrix function approximation",
"text": "Matrix function approximation\nIn primate, the matrix function f(A) is not constructed explicitly but instead the action v \\mapsto f(A)v is approximated with a fixed-degree Krylov expansion. This can be useful when, for example, the matrix A itself is so large that the corresponding (typically dense) matrix function f(A) \\in \\mathbb{R}^{n \\times n} simply is too large to be explicitly represented. If you just want to approximate the action of a matrix function for a single vector v \\in \\mathbb{R}^n, simply supply the vector and the matrix alongside the matrix_function call:\n\nfrom primate.operators import matrix_function\nv = np.random.uniform(size=A.shape[0])\ny = matrix_function(A, fun=np.exp, v=v)\nprint(f\"f(A)v = {arr_summary(y.ravel())}\")\n\nf(A)v = [0.52,0.90,...,0.24]\n\n\nAlternatively, if you prefer an object-oriented approach (or you plan on doing multiple matvecs), you can construct a MatrixFunction instance and use it like any other LinearOperator:\n\nfrom primate.operators import MatrixFunction\nExpA = MatrixFunction(A, fun=np.exp)\ny = ExpA @ v\nprint(f\"exp(A)v = {arr_summary(y)}\")\n\nexp(A)v = [0.52,0.90,...,0.24]\n\n\nIf you don’t supply a vector v to the matrix_function call, a MatrixFunction instance is constructed using whatever additional arguments are passed in and returned. Note some function specializations are inherently more difficult to approximate and can depend on the smoothness of f and the conditioning of the corresponding operator f(A); in general, a MatrixFunction instance with degree k approximates the action v \\mapsto f(A)v about as well as the operator p(A), where p is a degree 2k-1 polynomial interpolant of f.\n\nfrom scipy.linalg import expm\nExpA = expm(A)\nExpA0 = MatrixFunction(A, fun=np.exp, deg=5, orth=0)\nExpA1 = MatrixFunction(A, fun=np.exp, deg=20, orth=0)\nExpA2 = MatrixFunction(A, fun=np.exp, deg=50, orth=50)\n\nw = ExpA @ v\nx = ExpA0 @ v\ny = ExpA1 @ v \nz = ExpA2 @ v\n\nprint(f\"Deg-5 approx error (no reorth.) : {np.linalg.norm(w - x)}\")\nprint(f\"Deg-20 approx error (no reorth.) : {np.linalg.norm(w - y)}\")\nprint(f\"Deg-50 approx error (full reorth.) : {np.linalg.norm(w - z)}\")\n\nDeg-5 approx error (no reorth.) : 0.0001254249077837913\nDeg-20 approx error (no reorth.) : 8.246535086916179e-14\nDeg-50 approx error (full reorth.) : 9.442567959828599e-14\n\n\nAs you can see, for smoother matrix functions (like \\mathrm{exp}(A)), even a low degree Krylov expansion can be more than sufficient for many application purposes—all without any re-orthogonalization! See the matrix function guide for more background on this.",
"crumbs": [
@@ -971,9 +971,20 @@
{
"objectID": "basic/quickstart.html#configuring-the-output",
"href": "basic/quickstart.html#configuring-the-output",
"title": "primate quickstart",
"title": "Quickstart",
"section": "Configuring the output",
"text": "Configuring the output\n\nPassing full=True returns additional information about the computation in the form of EstimatorResult (along with the estimate itself), which contains information about execution itself, convergence information of the estimator, and other status messages.\nFor example, with the default converge=\"confidene\" criterion, the margin of error of a default-constructed confidence interval is returned:\n\nest, info = hutch(A, converge=\"confidence\", full=True)\nprint(info.message)\n\nEst: 74.443 +/- 1.393 (95% CI, #S:32)\n\n\nA more visual way of viewing the sample values and the corresponding estimate as a function of the sample size is to plot the sequence with the figure_sequence function (note this requires saving the samples with record=True):\n\nfrom primate.plotting import figure_sequence\n\nest, info = hutch(A, full=True, record=True)\np = figure_sequence(info.estimator.values)\nshow(p)",
"text": "Configuring the output\n\nPassing full=True returns additional information about the computation in the form of EstimatorResult (along with the estimate itself), which contains information about execution itself, convergence information of the estimator, and other status messages.\nFor example, with the default converge=\"confidene\" criterion, the margin of error of a default-constructed confidence interval is returned:\n\nest, info = hutch(A, converge=\"confidence\", full=True)\nprint(info.message)\n\nEst: 76.801 +/- 1.313 (95% CI, #S:32)\n\n\nA more visual way of viewing the sample values and the corresponding estimate as a function of the sample size is to plot the sequence with the figure_sequence function (note this requires saving the samples with record=True):\n\nfrom primate.plotting import figure_sequence\n\nest, info = hutch(A, full=True, record=True)\np = figure_sequence(info.estimator.values)\nshow(p)\n\n\n \n\n\n\n\n\nYou can also pass a callback function, which receives as its only argument an EstimatorResult instance. This can be useful for quickly monitoring convergence status, saving intermediate information, etc.\n\nfrom primate.estimators import EstimatorResult\ndef hutch_callback(result: EstimatorResult):\n if (result.nit % 150) == 0:\n print(result.criterion.message(result.estimator))\n\nest, info = hutch(A, converge=\"confidence\", callback=hutch_callback)",
"crumbs": [
"Reference",
"Quickstart"
]
},
{
"objectID": "basic/quickstart.html#implicit-matrix-representations",
"href": "basic/quickstart.html#implicit-matrix-representations",
"title": "Quickstart",
"section": "Implicit matrix representations",
"text": "Implicit matrix representations\nIn all of the examples below, by the term matrix we mean generically anything that comes with an associated v \\mapsto Av capability.",
"crumbs": [
"Reference",
"Quickstart"
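
The new "Implicit matrix representations" entry above defines a matrix as anything with a v ↦ Av action, and the trace section notes that `A` may be a `LinearOperator`. A minimal sketch passing a matrix-free SciPy `LinearOperator` to `hutch`; the diagonal-scaling operator here is purely illustrative:

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator
from primate.trace import hutch

n = 150
d = np.linspace(0.1, 1.0, n)               # spectrum of an implicit diagonal operator

# Matrix-free action v -> diag(d) @ v; the matrix itself is never materialized
D = LinearOperator((n, n), matvec=lambda v: d * np.ravel(v), dtype=np.float64)

print(f"true trace : {d.sum():.6f}")
print(f"hutch      : {hutch(D):.6f}")
```
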
30 changes: 13 additions & 17 deletions docs/basic/quickstart.qmd
@@ -1,19 +1,10 @@
---
title: "`primate` quickstart"
title: "Quickstart"
---

`primate` is a package that contains a variety of algorithms for estimating quantities from matrix functions, with a focus
on implicit matrix representations and support for common quantities of interest, such as the trace or the diagonal.

Below is a quick introduction to:

- Trace estimation
- Diagonal estimation
- Matrix function approximation
- Configuring the output
- Configuring the execution

<br/>
`primate` contains a variety of algorithms for estimating quantities from matrices and matrix functions, with a focus
on common quantities of interest, such as the trace or the diagonal. This page briefly outlines the variety of options
`primate` offers to approximate these quantities, and how to configure these approximations per use case.

```{python}
#| echo: false
@@ -25,14 +16,18 @@ import numpy as np
np.random.seed(1234)
```

## Implicit matrix representations

In all of the examples below, by the term _matrix_ we mean generically anything that comes with an associated
$v \mapsto Av$ capability.

## Trace estimation

A variety of _trace_ estimators are available in the `primate.trace` module. These algorithms estimate the quantity:
The _implicit matrix trace estimation_ problem seeks to estimate the quantity:

$$ \mathrm{tr}(A) = \sum\limits_{i=1}^n A_{ii} = \sum\limits_{i=1}^n \lambda_i $$

The following example demonstrates a variety of trace estimators, including the Girard-Hutchinson estimator (`hutch`), the
improved Hutch++ (`hutchpp`), and XTrace (`xtrace`):
A variety of _trace_ estimators are available in the `primate.trace` module, including the Girard-Hutchinson estimator (`hutch`), the improved Hutch++ (`hutchpp`), and XTrace (`xtrace`):

```{python}
from primate.trace import hutch, hutchpp, xtrace
@@ -163,9 +158,10 @@ show(p)
```
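
Besides plotting with `figure_sequence`, the recorded samples in `info.estimator.values` can be inspected directly. A brief sketch, assuming `values` is a flat sequence of per-sample estimates (as its use with `figure_sequence` above suggests):

```python
import numpy as np
from primate.trace import hutch
from primate.random import symmetric

rng = np.random.default_rng(1234)
A = symmetric(150, pd=True, seed=rng)

est, info = hutch(A, full=True, record=True)
samples = np.asarray(info.estimator.values)    # recorded per-sample estimates (assumed 1-d)

# Running Monte-Carlo mean and a rough 95% margin of error from the recorded samples
running = np.cumsum(samples) / np.arange(1, len(samples) + 1)
moe = 1.96 * np.std(samples, ddof=1) / np.sqrt(len(samples))
print(f"returned estimate: {est:.3f}")
print(f"running mean     : {running[-1]:.3f} +/- {moe:.3f} ({len(samples)} samples)")
```
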

You can also pass a callback function, which receives as its only argument an `EstimatorResult` instance.
This can be useful for quickly monitoring convergence information.
This can be useful for quickly monitoring convergence status, saving intermediate information, etc.

```{python}
from primate.estimators import EstimatorResult
def hutch_callback(result: EstimatorResult):
if (result.nit % 150) == 0:
print(result.criterion.message(result.estimator))
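
The callback above prints a convergence message every 150 samples; per the revised sentence, the same hook can also save intermediate state. A short sketch, assuming the callback fires once per sample and that `hutch` returns only the estimate when `full=True` is not passed:

```python
import numpy as np
from primate.estimators import EstimatorResult
from primate.trace import hutch
from primate.random import symmetric

rng = np.random.default_rng(1234)
A = symmetric(150, pd=True, seed=rng)

history = []
def saving_callback(result: EstimatorResult):
    # Store (sample count, criterion message) every 10 samples instead of printing
    if (result.nit % 10) == 0:
        history.append((result.nit, result.criterion.message(result.estimator)))

est = hutch(A, converge="confidence", callback=saving_callback)
for nit, msg in history:
    print(nit, msg)
```
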
