Unification of shape checks #250

Open
msluszniak opened this issue Apr 6, 2024 · 9 comments
Labels
suggestion ideas and/or plans to put forward for consideration.

Comments

@msluszniak
Contributor

As mentioned here.

@msluszniak msluszniak added the suggestion ideas and/or plans to put forward for consideration. label Apr 6, 2024
@krstopro
Member

Is this really completed?

@josevalim
Contributor

Errr, wrong issue.

@josevalim josevalim reopened this May 16, 2024
@josevalim
Contributor

At the same time, I am not sure if there is something to unify here? I think checking for shapes should be done per module, as they can provide better error messages within their context?

@krstopro
Member

> At the same time, I am not sure if there is something to unify here? I think checking for shapes should be done per module, as they can provide better error messages within their context?

Could be the case. My idea was to have several functions for the checks that keep repeating, e.g. checking that x is of rank 2, or that x and y agree on the first axis, etc.
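The kind of reusable helpers suggested here could look like the following sketch. The names `check_rank` and `check_matching_axis` are hypothetical, not Scholar's API, and the sketch is in Python for concreteness (Scholar itself is Elixir); note how the error messages are necessarily generic, which is the trade-off raised in the next reply.

```python
# Hypothetical generic shape-check helpers; names and messages are
# illustrative, not part of Scholar or Nx.

def check_rank(shape, expected_rank):
    """Raise if a tensor shape does not have the expected rank."""
    if len(shape) != expected_rank:
        raise ValueError(
            f"expected tensor of rank {expected_rank}, got rank {len(shape)}"
        )

def check_matching_axis(shape_x, shape_y, axis=0):
    """Raise if two tensor shapes disagree on a given axis."""
    if shape_x[axis] != shape_y[axis]:
        raise ValueError(
            f"expected tensors to agree on axis {axis}, "
            f"got {shape_x[axis]} and {shape_y[axis]}"
        )

check_rank((10, 3), 2)               # passes: rank 2
check_matching_axis((10, 3), (10,))  # passes: both have 10 on axis 0
```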

@josevalim
Contributor

The problem is that the error message ends up being generic: "expected tensors to have matching leading dimensions" while today we can say "expected the data to have the same dimension as the query" etc. We actually had those helpers inside Nx itself at the beginning but we removed them because of that once we introduced case. :)

@JoaquinIglesiasTurina
Contributor

I am not sure whether this comment is fully on-topic, but for linear models I find the behaviour of the API inconsistent when it comes to data shapes.

You can fit all linear models with a target shaped {n_samples, 1}, but when you call predict, some models raise an ArgumentError with the following message:
"dot/zip expects shapes to be compatible, dimension 1 of left-side (1) does not equal dimension 0 of right-side (10)".
The remaining models have a multioutput option, so they can take an {n_samples, 1} shaped target, leading to this inconsistency.

The models yielding this error are:

  • Scholar.Linear.BayesianRidgeRegression
  • Scholar.Linear.IsotonicRegression
  • Scholar.Linear.LinearRegression
  • Scholar.Linear.SVM

The models not yielding this error are:

  • Linear.PolynomialRegression
  • Linear.RidgeRegression

A livebook showcasing this behaviour of linear models is available here.

I think all linear models should work fine with {n_samples, 1} and {n_samples} shaped targets. If you all agree with that, I would be happy to work on this issue of linear models.
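The mismatch described above can be reconstructed in NumPy, whose dot semantics match Nx's here (an analogy for illustration, not Scholar's actual predict code): coefficients fitted from a {n_samples, 1} target keep an extra axis, and a predict step written for flat coefficients then mis-orients the dot.

```python
import numpy as np

n_samples, n_features = 10, 3
x = np.ones((n_samples, n_features))

# Coefficients fitted from a {n_samples, 1} target keep an extra axis:
coef = np.ones((n_features, 1))

# A dot written for 1-D coefficients then fails, analogous to the
# ArgumentError above: dimension 1 of left (1) != dimension 0 of right (10).
try:
    np.dot(coef, x)  # {3, 1} . {10, 3} is incompatible
except ValueError as e:
    print(e)

# With a flat {n_features} coefficient vector, the same call site works:
coef_flat = coef.ravel()
np.dot(x, coef_flat)  # {10, 3} . {3} -> {10}
```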

@msluszniak
Contributor Author

Sure, that is really strange behaviour and not supposed to happen, so feel free to work on it. I'd appreciate that :)

@JoaquinIglesiasTurina
Contributor

I've taken a further look at the issue with linear models. I think there is a decision to be made on how to handle the situation:

  • Should the output of an {n_samples, 1} and {n_samples} shaped target be the same? Or should it be different?

Meaning, RidgeRegression returns {1, n_samples} shaped coefficients for {n_samples, 1} shaped targets, while for {n_samples} targets it returns {n_samples} shaped coefficients.
This is the behaviour of scikit's ordinary least squares. It can be achieved by changing some Nx.dot/2 calls to Nx.dot/4 calls.

A different approach would be to shape-check the target and flatten it prior to fitting the model. This would ensure that {n_samples, 1} and {n_samples} shaped targets yield equally shaped coefficients.
This approach has some points of significance:

  • Inconsistency with scikit's API (I don't know if this is a problem).
  • It raises the question: how do we handle linear models with multioutput options?
  • It is potentially a breaking change to RidgeRegression's API.

Mathematically, I favor the second option: when linear models are described mathematically, the target is a column vector, and fitting a column vector should yield the same results as fitting a plain vector.
However, that is the riskier approach; it can lead to some inconsistencies, and leaving things as they are is the safer choice.

Looking forward to your comments.
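The second option (flatten before fitting) can be sketched as follows, with NumPy least squares standing in for Scholar's fit. The `fit` helper and its shape check are hypothetical, shown only to illustrate that both target shapes would then yield identically shaped coefficients.

```python
import numpy as np

# Sketch of the proposed approach: flatten a {n_samples, 1} target before
# fitting so it behaves exactly like a {n_samples} target.
rng = np.random.default_rng(0)
x = rng.normal(size=(20, 3))
true_coef = np.array([1.0, 2.0, 3.0])
y_1d = x @ true_coef          # target shaped {n_samples}
y_2d = y_1d.reshape(-1, 1)    # same target, shaped {n_samples, 1}

def fit(x, y):
    # The proposed shape check: a single-column target is flattened.
    if y.ndim == 2 and y.shape[1] == 1:
        y = y.ravel()
    coef, *_ = np.linalg.lstsq(x, y, rcond=None)
    return coef

# Both target shapes now yield {n_features} shaped coefficients.
assert fit(x, y_1d).shape == fit(x, y_2d).shape == (3,)
```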

@msluszniak
Contributor Author

I think we may try the second direction, and if it introduces some breaking changes to RidgeRegression, we will fix them.

4 participants