Commit a17b234: Big update
jameschapman19 committed Oct 26, 2023
1 parent cab8d57
Showing 12 changed files with 85 additions and 85 deletions.
34 changes: 17 additions & 17 deletions docs/source/examples/plot_dcca.py
@@ -1,14 +1,14 @@
"""
-Deep Canonical Correlation Analysis (CCALoss) using `cca_zoo`
+Deep Canonical Correlation Analysis (CCA) using `cca_zoo`
========================================================
-This script showcases how to implement various Deep CCALoss methods and their
+This script showcases how to implement various Deep CCA methods and their
variants using the `cca_zoo` library, a dedicated tool for canonical
correlation analysis and its related techniques. The MNIST dataset is used
as an example, where images are split into two halves to treat as separate representations.
Key Features:
-- Demonstrates the training process of multiple Deep CCALoss variants.
+- Demonstrates the training process of multiple Deep CCA variants.
- Visualizes the results of each variant for comparative analysis.
- Leverages `cca_zoo` for canonical correlation analysis techniques.
"""
@@ -48,9 +48,9 @@


# %%
-# Deep CCALoss
+# Deep CCA
# ----------------------------
-# Deep CCALoss is a method that learns nonlinear transformations of two representations
+# Deep CCA is a method that learns nonlinear transformations of two representations
# such that the resulting latent representations are maximally correlated.

dcca = DCCA(latent_dimensions=LATENT_DIMS, encoders=[encoder_1, encoder_2])
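The hunk stops at the constructor; as a minimal sketch of the step that follows in the full script (assuming the EPOCHS constant and the split-MNIST train_loader/val_loader from the elided setup cell), training uses the same Lightning pattern shown for the other variants in this file:

import pytorch_lightning as pl

trainer = pl.Trainer(
    max_epochs=EPOCHS,  # assumed constant from the elided setup cell
    enable_checkpointing=False,
    enable_model_summary=False,
    enable_progress_bar=False,
)
trainer.fit(dcca, train_loader, val_loader)  # loaders built from the split-MNIST halves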
@@ -88,9 +88,9 @@
plt.show()

# %%
-# Deep CCALoss EY
+# Deep CCA EY
# ----------------------------
-# Deep CCALoss EY is a variant of Deep CCALoss that uses an explicit objective function
+# Deep CCA EY is a variant of Deep CCA that uses an explicit objective function
# based on the eigenvalue decomposition of the cross-covariance matrix.
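For orientation (this equation is a standard result, not part of the diff), the linear CCA problem that this objective reformulates is the generalized eigenvalue problem on the covariance blocks of two centred views:

\begin{pmatrix} 0 & C_{xy} \\ C_{yx} & 0 \end{pmatrix}
\begin{pmatrix} w_x \\ w_y \end{pmatrix}
= \lambda
\begin{pmatrix} C_{xx} & 0 \\ 0 & C_{yy} \end{pmatrix}
\begin{pmatrix} w_x \\ w_y \end{pmatrix}

where C_{xy} is the cross-covariance of the two views and C_{xx}, C_{yy} are the within-view covariances; DCCA_EY replaces the constrained problem with the explicit loss described above, which makes minibatch training possible.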

dcca_eg = DCCA_EY(
@@ -108,9 +108,9 @@
plt.show()

# %%
-# Deep CCALoss by Non-Linear Orthogonal Iterations
+# Deep CCA by Non-Linear Orthogonal Iterations
# ----------------------------------------------
-# Deep CCALoss by Non-Linear Orthogonal Iterations (DCCA_NOI) is another variant of Deep CCALoss
+# Deep CCA by Non-Linear Orthogonal Iterations (DCCA_NOI) is another variant of Deep CCA
# that uses an iterative algorithm to orthogonalize the latent representations.

dcca_noi = DCCA_NOI(latent_dimensions=LATENT_DIMS, encoders=[encoder_1, encoder_2])
@@ -126,9 +126,9 @@
plt.show()

# %%
-# Deep CCALoss by Stochastic Decorrelation Loss
+# Deep CCA by Stochastic Decorrelation Loss
# ----------------------------------------------
-# Deep CCALoss by Stochastic Decorrelation Loss (DCCA_SDL) is yet another variant of Deep CCALoss
+# Deep CCA by Stochastic Decorrelation Loss (DCCA_SDL) is yet another variant of Deep CCA
# that uses a stochastic gradient descent algorithm to minimize a decorrelation loss function.

dcca_sdl = DCCA_SDL(
@@ -146,10 +146,10 @@
plt.show()

# %%
-# Deep CCALoss by Barlow Twins
+# Deep CCA by Barlow Twins
# ----------------------------------------------
-# Deep CCALoss by Barlow Twins is a self-supervised learning method that learns representations
-# that are invariant to augmentations of the same data. It can be seen as a special case of Deep CCALoss
+# Deep CCA by Barlow Twins is a self-supervised learning method that learns representations
+# that are invariant to augmentations of the same data. It can be seen as a special case of Deep CCA
# where the two representations are random augmentations of the same input.

barlowtwins = BarlowTwins(
@@ -167,10 +167,10 @@
plt.show()

# %%
-# Deep CCALoss by VICReg
+# Deep CCA by VICReg
# ----------------------------------------------
-# Deep CCALoss by VICReg is a self-supervised learning method that learns representations
-# that are invariant to distortions of the same data. It can be seen as a special case of Deep CCALoss
+# Deep CCA by VICReg is a self-supervised learning method that learns representations
+# that are invariant to distortions of the same data. It can be seen as a special case of Deep CCA
# where the two representations are random distortions of the same input.

dcca_vicreg = DCCA_SDL(
20 changes: 10 additions & 10 deletions docs/source/examples/plot_dcca_custom_data.py
@@ -1,16 +1,16 @@
"""
-Working with Custom Datasets in CCALoss-Zoo
+Working with Custom Datasets in CCA-Zoo
=======================================
This script provides a guide on how to leverage custom multiview datasets with
-CCALoss-Zoo. It walks through various methods, including the use of provided
+CCA-Zoo. It walks through various methods, including the use of provided
utilities and the creation of a bespoke dataset class.
Key Features:
-- Transforming numpy arrays into CCALoss-Zoo compatible datasets.
+- Transforming numpy arrays into CCA-Zoo compatible datasets.
- Validating custom datasets.
- Creating a custom dataset class from scratch.
-- Training a Deep CCALoss model on custom datasets.
+- Training a Deep CCA model on custom datasets.
"""

import numpy as np
@@ -19,10 +19,10 @@
# %%
# Converting Numpy Arrays into Datasets
# -------------------------------------
-# For those looking for a straightforward method, the `NumpyDataset` class from CCALoss-Zoo
+# For those looking for a straightforward method, the `NumpyDataset` class from CCA-Zoo
# is a convenient way to convert numpy arrays into valid datasets. It accepts multiple
# numpy arrays, each representing a distinct view, and an optional list of labels.
-# Subsequently, these datasets can be converted into dataloaders for use in CCALoss-Zoo models.
+# Subsequently, these datasets can be converted into dataloaders for use in CCA-Zoo models.

from cca_zoo.deep import DCCA, architectures
from cca_zoo.deep.data import NumpyDataset, check_dataset, get_dataloaders
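The arrays themselves are elided from this hunk; a hedged sketch of how the `numpy_dataset` validated below can be built (shapes are illustrative, not the script's actual values):

import numpy as np

X = np.random.normal(size=(100, 10))  # view 1: 100 samples, 10 features
Y = np.random.normal(size=(100, 12))  # view 2: same 100 samples, 12 features
numpy_dataset = NumpyDataset([X, Y])  # an optional list of labels may also be passed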
@@ -37,7 +37,7 @@
# Dataset Validation
# ------------------
# Before proceeding, it's a good practice to validate the constructed dataset.
-# The `check_dataset` function ensures that the dataset adheres to CCALoss-Zoo's
+# The `check_dataset` function ensures that the dataset adheres to CCA-Zoo's
# expected format.

check_dataset(numpy_dataset)
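The bespoke dataset class is elided from the diff (the next hunk resumes inside its `__getitem__`); a minimal sketch of such a class, assuming the dict-of-views batch structure that the `NumpyDataset` wrapper also produces (the file's actual class may differ):

import torch
from torch.utils.data import Dataset


class CustomDataset(Dataset):
    # A toy two-view dataset; each item is a dict holding one sample per view.
    def __init__(self, n_samples=100):
        self.view_1 = torch.rand(n_samples, 10)
        self.view_2 = torch.rand(n_samples, 12)

    def __len__(self):
        return len(self.view_1)

    def __getitem__(self, index):
        return {"views": (self.view_1[index], self.view_2[index])}


custom_dataset = CustomDataset()
check_dataset(custom_dataset)  # the same validation step shown above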
@@ -71,14 +71,14 @@ def __getitem__(self, index):
# Convert Custom Dataset into DataLoader
# --------------------------------------
# The `get_dataloaders` function can now be used to transform the custom dataset
-# into dataloaders suitable for CCALoss-Zoo.
+# into dataloaders suitable for CCA-Zoo.

train_loader = get_dataloaders(custom_dataset, batch_size=2)

# %%
-# Training with Deep CCALoss
+# Training with Deep CCA
# -----------------------
-# Once the dataloaders are set, it's time to configure and train a Deep CCALoss model.
+# Once the dataloaders are set, it's time to configure and train a Deep CCA model.

LATENT_DIMS = 1
EPOCHS = 10
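The rest of the training cell is elided; a sketch under the assumption that the custom views have 10 and 12 features (matching the toy class above):

from cca_zoo.deep import DCCA, architectures
import pytorch_lightning as pl

encoder_1 = architectures.Encoder(latent_dimensions=LATENT_DIMS, feature_size=10)
encoder_2 = architectures.Encoder(latent_dimensions=LATENT_DIMS, feature_size=12)
dcca = DCCA(latent_dimensions=LATENT_DIMS, encoders=[encoder_1, encoder_2])

trainer = pl.Trainer(max_epochs=EPOCHS, enable_checkpointing=False)
trainer.fit(dcca, train_loader)  # train_loader from get_dataloaders above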
22 changes: 11 additions & 11 deletions docs/source/examples/plot_dcca_multi.py
@@ -1,14 +1,14 @@
"""
-Multiview Deep CCALoss Extensions
+Multiview Deep CCA Extensions
=============================
This script showcases how to train extensions of Deep Canonical Correlation Analysis
-(Deep CCALoss) that can handle more than two representations of data, using CCALoss-Zoo's functionalities.
+(Deep CCA) that can handle more than two representations of data, using CCA-Zoo's functionalities.
Features:
-- Deep MCCALoss (Multiset CCALoss)
-- Deep GCCALoss (Generalized CCALoss)
-- Deep TCCALoss (Tied CCALoss)
+- Deep MCCA (Multiset CCA)
+- Deep GCCA (Generalized CCA)
+- Deep TCCA (Tied CCA)
"""

@@ -33,34 +33,34 @@
encoder_2 = architectures.Encoder(latent_dimensions=LATENT_DIMS, feature_size=392)

# %%
-# Deep MCCALoss (Multiset CCALoss)
+# Deep MCCA (Multiset CCA)
# ------------------------
-# A multiview extension of CCALoss, aiming to find latent spaces that are maximally correlated across multiple representations.
+# A multiview extension of CCA, aiming to find latent spaces that are maximally correlated across multiple representations.

dcca_mcca = DCCA(
latent_dimensions=LATENT_DIMS,
encoders=[encoder_1, encoder_2],
-    objective=objectives.MCCALoss,
+    objective=objectives.MCCA,
)
trainer_mcca = pl.Trainer(max_epochs=EPOCHS, enable_checkpointing=False, enable_model_summary=False, enable_progress_bar=False)
trainer_mcca.fit(dcca_mcca, train_loader, val_loader)

# %%
-# Deep GCCALoss (Generalized CCALoss)
+# Deep GCCA (Generalized CCA)
# ---------------------------
# A method that finds projections of multiple representations such that the variance explained
# by the canonical components is maximized.

dcca_gcca = DCCA(
latent_dimensions=LATENT_DIMS,
encoders=[encoder_1, encoder_2],
-    objective=objectives.GCCALoss,
+    objective=objectives.GCCA,
)
trainer_gcca = pl.Trainer(max_epochs=EPOCHS, enable_checkpointing=False, enable_model_summary=False, enable_progress_bar=False)
trainer_gcca.fit(dcca_gcca, train_loader, val_loader)

# %%
-# Deep TCCALoss (Tied CCALoss)
+# Deep TCCA (Tied CCA)
# --------------------
# An approach where representations share the same weight parameters during training.

6 changes: 3 additions & 3 deletions docs/source/examples/plot_dvcca.py
@@ -1,5 +1,5 @@
"""
-Deep Variational CCALoss and Deep Canonically Correlated Autoencoders
+Deep Variational CCA and Deep Canonically Correlated Autoencoders
====================================================================
This example demonstrates multiview models which can reconstruct their inputs
@@ -40,7 +40,7 @@
# )
#
# # %%
-# # Deep Variational CCALoss
+# # Deep Variational CCA
# # ----------------------------
# encoder_1 = architectures.Encoder(
# latent_dimensions=LATENT_DIMS,
@@ -79,7 +79,7 @@
# plt.show()
#
# # %%
-# # Deep Variational CCALoss (Private)
+# # Deep Variational CCA (Private)
# # -------------------------------
# private_encoder_1 = architectures.Encoder(
# latent_dimensions=LATENT_DIMS,
26 changes: 13 additions & 13 deletions docs/source/examples/plot_gradient.py
@@ -1,12 +1,12 @@
"""
-Gradient-based CCALoss and CCA_EYLoss
+Gradient-based CCA and CCA_EY
============================
This script demonstrates how to use gradient-based methods
-to perform canonical correlation analysis (CCALoss) on high-dimensional data.
-We will compare the performance of CCALoss and CCA_EYLoss, which is a variant of CCALoss
+to perform canonical correlation analysis (CCA) on high-dimensional data.
+We will compare the performance of CCA and CCA_EY, which is a variant of CCA
that uses stochastic gradient descent to solve the optimization problem.
-We will also explore the effect of different batch sizes on CCA_EYLoss and plot
+We will also explore the effect of different batch sizes on CCA_EY and plot
the loss function over iterations.
"""

@@ -49,9 +49,9 @@
Y_test = Y[test_idx]

# %%
-# CCALoss
+# CCA
# ---
-# We create a CCALoss object with the number of latent dimensions as 1
+# We create a CCA object with the number of latent dimensions as 1
cca = CCA(latent_dimensions=latent_dims)

# We record the start time of the model fitting
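The timing calls are elided from the hunk; the standard pattern, assuming `time` is imported in the elided preamble:

import time

start_time = time.time()
cca.fit([X_train, Y_train])
elapsed_time = time.time() - start_time  # reported in the plot titles below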
@@ -69,16 +69,16 @@
score_display = ScoreScatterDisplay.from_estimator(
cca, [X_train, Y_train], [X_test, Y_test]
)
score_display.plot(title=f"CCALoss (Time: {elapsed_time:.2f} s)")
score_display.plot(title=f"CCA (Time: {elapsed_time:.2f} s)")
plt.show()

# %%
-# CCA_EYLoss with different batch sizes
+# CCA_EY with different batch sizes
# --------------------------------
# We create a list of batch sizes to try out
batch_sizes = [200, 100, 50, 20, 10]

-# We loop over the batch sizes and create a CCA_EYLoss object for each one
+# We loop over the batch sizes and create a CCA_EY object for each one
for batch_size in batch_sizes:
ccaey = CCA_EY(
latent_dimensions=latent_dims,
@@ -104,19 +104,19 @@
ccaey, [X_train, Y_train], [X_test, Y_test]
)
score_display.plot(
title=f"CCA_EYLoss (Batch size: {batch_size}, Time: {elapsed_time:.2f} s)"
title=f"CCA_EY (Batch size: {batch_size}, Time: {elapsed_time:.2f} s)"
)
plt.show()

# %%
# Comparison
# ----------
-# We can see that CCA_EYLoss achieves a higher correlation than CCALoss on the test set,
+# We can see that CCA_EY achieves a higher correlation than CCA on the test set,
# indicating that it can handle high-dimensional data better by using gradient descent.
-# We can also see that the batch size affects the performance of CCA_EYLoss, with smaller batch sizes
+# We can also see that the batch size affects the performance of CCA_EY, with smaller batch sizes
# leading to higher correlations but also higher variance. This is because smaller batch sizes
# allow for more frequent updates and exploration of the parameter space, but also introduce more noise
# and instability in the optimization process. A trade-off between batch size and learning rate may be needed
-# to achieve the best results. We can also see that CCA_EYLoss converges faster than CCALoss, as it takes less time
+# to achieve the best results. We can also see that CCA_EY converges faster than CCA, as it takes less time
# to fit the model. The loss function plots show how the objective value decreases over iterations for different
# batch sizes, and we can see that smaller batch sizes tend to have more fluctuations and slower convergence.
4 changes: 2 additions & 2 deletions docs/source/examples/plot_hyperparameter_selection.py
@@ -1,9 +1,9 @@
"""
-Kernel CCALoss Hyperparameter Tuning
+Kernel CCA Hyperparameter Tuning
================================
This script demonstrates hyperparameter optimization for Kernel Canonical
-Correlation Analysis (Kernel CCALoss) using both grid search and randomized search methods.
+Correlation Analysis (Kernel CCA) using both grid search and randomized search methods.
Note:
- The grid search approach involves exhaustively trying every combination of provided parameters.
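The search code itself is elided from this diff; a hedged sketch of grid search over kernel hyperparameters using CCA-Zoo's model-selection helpers (module and class names are assumptions based on the library's documented API, data and values are illustrative, and the actual script may differ):

import numpy as np
from cca_zoo.model_selection import GridSearchCV
from cca_zoo.nonparametric import KCCA

X = np.random.normal(size=(200, 15))  # simulated views; shapes illustrative
Y = np.random.normal(size=(200, 15))

c_values = [0.9, 0.99]
# One candidate list per view, following the [values_view_1, values_view_2]
# convention used in plot_kernel_cca.py below.
param_grid = {"kernel": ["linear"], "c": [c_values, c_values]}
kcca_tuned = GridSearchCV(
    KCCA(latent_dimensions=1), param_grid=param_grid, cv=3
).fit([X, Y])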
10 changes: 5 additions & 5 deletions docs/source/examples/plot_kernel_cca.py
@@ -1,9 +1,9 @@
"""
-Exploring Canonical Correlation Analysis (CCALoss) with Kernel & Nonparametric Methods
+Exploring Canonical Correlation Analysis (CCA) with Kernel & Nonparametric Methods
=================================================================================
This script provides a walkthrough on using kernel and nonparametric techniques
-to perform Canonical Correlation Analysis (CCALoss) on a simulated dataset.
+to perform Canonical Correlation Analysis (CCA) on a simulated dataset.
"""

# %%
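Only the signature of the custom kernel survives in the hunk headers below; a hedged reconstruction of what such a function can look like (the body here is illustrative, not the file's actual code):

import numpy as np


def my_kernel(X, Y, param=0, **kwargs):
    # A simple custom kernel: linear kernel plus a tunable additive offset.
    return np.dot(X, Y.T) + param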
@@ -52,7 +52,7 @@ def my_kernel(X, Y, param=0, **kwargs):
).fit([X, Y])

# %%
-# Linear Kernel-based CCALoss
+# Linear Kernel-based CCA
# -----------------------
c_values = [0.9, 0.99]
param_grid_linear = {"kernel": ["linear"], "c": [c_values, c_values]}
@@ -66,7 +66,7 @@ def my_kernel(X, Y, param=0, **kwargs):
).fit([X, Y])

# %%
-# Polynomial Kernel-based CCALoss
+# Polynomial Kernel-based CCA
# ---------------------------
degrees = [2, 3]
param_grid_poly = {
@@ -88,7 +88,7 @@ def my_kernel(X, Y, param=0, **kwargs):
)

# %%
-# Gaussian/RBF Kernel-based CCALoss
+# Gaussian/RBF Kernel-based CCA
# -----------------------------
gammas = [1e-1, 1e-2]
param_grid_rbf = {