[Tracking] Improvements to measures #17
Comments
Decision: What do we do about measures like One option is return a A related question is whether instead of returning a single value, in the case of |
Some other improvements on my wish list:
|
A few features that would be nice:
So a user can easily find: |
Good idea: JuliaAI/MLJBase.jl#301
Not sure I understand. We have a
I don't see why not. Not sure "scale-dependent" is the best description. Is this terminology common? How about a trait called
Sure. Similar to above. |
Just like there are many models/measures in MLJ.jl, there are many distributions in Distributions.jl. For example:

```julia
struct Arcsine{T<:Real} <: ContinuousUnivariateDistribution
    a::T
    b::T
    Arcsine{T}(a::T, b::T) where {T<:Real} = new{T}(a, b)
end
```

Then that is automatically a subtype, and the hierarchy can be explored with:

```julia
using Distributions
subtypes(Distribution)
subtypes(UnivariateDistribution)
subtypes(ContinuousDistribution)
subtypes(ContinuousUnivariateDistribution)
```

I'm throwing ideas out here, but could it help if measures were similarly organized? Then if a user wants to find all measures, etc.:

```julia
subtypes(Measure)
subtypes(RegressionMeasure)
subtypes(ScaleDependentRegressionMeasure)
subtypes(PercentBasedRegressionMeasure)
```
|
You're right, it is not a flattering description (though informative). |
There is also an issue w/ asymmetry for percent errors: |
Right. The API specifies that |
Regarding having a hierarchy of types: we don't really need this. You can do queries based on the traits. I think there is a tendency towards traits because other packages can extend them without your package being a dependency, and so forth. Also, adding traits is much easier than changing type hierarchies you didn't quite get right. @oxinabox may want to comment here. |
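To make the trait-versus-hierarchy point concrete, here is a minimal sketch of the trait-based alternative. All names here (`orientation`, `target_kind`, `regression_measures`) are hypothetical, chosen for illustration only, not the actual MLJ API:

```julia
# Hypothetical sketch: traits instead of a type hierarchy. Any package
# can extend the trait functions for its own measure types without
# this package becoming a dependency of the type's owner.
abstract type Measure end

struct RMS <: Measure end
struct Accuracy <: Measure end

# trait functions (assumed names)
orientation(::RMS) = :loss
orientation(::Accuracy) = :score

target_kind(::RMS) = :continuous
target_kind(::Accuracy) = :finite

# a "hierarchy query" then becomes a filter over trait values:
regression_measures(ms) = filter(m -> target_kind(m) == :continuous, ms)
```

For example, `regression_measures(Measure[RMS(), Accuracy()])` keeps only `RMS()`, with no `RegressionMeasure` abstract type needed.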
You can see all the traits in the current API with this example:

```julia
julia> info(rms)
root mean squared; aliases: `rms`.
(name = "rms",
 target_scitype = Union{AbstractArray{Continuous,1}, AbstractArray{Count,1}},
 supports_weights = true,
 prediction_type = :deterministic,
 orientation = :loss,
 reports_each_observation = false,
 aggregation = MLJBase.RootMeanSquare(),
 is_feature_dependent = false,
 docstring = "root mean squared; aliases: `rms`.",
 distribution_type = missing,)
```
|
Can the current API tell me which measures work w/ regression (continuous |
Sure. Measures for a `Finite` target:

```julia
julia> measures(m -> AbstractVector{Finite} <: m.target_scitype)
19-element Array{NamedTuple{(:name, :target_scitype, :supports_weights, :prediction_type, :orientation, :reports_each_observation, :aggregation, :is_feature_dependent, :docstring, :distribution_type),T} where T<:Tuple,1}:
 (name = area_under_curve, ...)
 (name = accuracy, ...)
 (name = balanced_accuracy, ...)
 (name = cross_entropy, ...)
 (name = FScore, ...)
 (name = false_discovery_rate, ...)
 (name = false_negative, ...)
 (name = false_negative_rate, ...)
 (name = false_positive, ...)
 (name = false_positive_rate, ...)
 (name = misclassification_rate, ...)
 (name = negative_predictive_value, ...)
 (name = positive_predictive_value, ...)
 (name = true_negative, ...)
 (name = true_negative_rate, ...)
 (name = true_positive, ...)
 (name = true_positive_rate, ...)
 (name = BrierScore{UnivariateFinite}, ...)
 (name = confusion_matrix, ...)
```

Measures for a `Continuous` target:

```julia
julia> measures(m -> AbstractVector{Continuous} <: m.target_scitype)
15-element Array{NamedTuple{(:name, :target_scitype, :supports_weights, :prediction_type, :orientation, :reports_each_observation, :aggregation, :is_feature_dependent, :docstring, :distribution_type),T} where T<:Tuple,1}:
 (name = l1, ...)
 (name = l2, ...)
 (name = mae, ...)
 (name = mape, ...)
 (name = rms, ...)
 (name = rmsl, ...)
 (name = rmslp1, ...)
 (name = rmsp, ...)
 (name = HuberLoss(), ...)
 (name = L1EpsilonInsLoss(), ...)
 (name = L2EpsilonInsLoss(), ...)
 (name = LPDistLoss(), ...)
 (name = LogitDistLoss(), ...)
 (name = PeriodicLoss(), ...)
 (name = QuantileLoss(), ...)
```
|
Ahhh! That's what I was looking for. Thanks! |
There is also EvalMetrics.jl to look at; see JuliaAI/MLJBase.jl#316 |
This would eliminate some type instabilities in the |
I don't think it's a serious contender after looking at their code in some amount of detail (too narrow a focus, when we would like something as generic as possible); some of their core methods could possibly be adapted (they explicitly said they were happy with that). |
What's the status of the integration with EvalMetrics.jl? |
I don't think we should; possibly we can use some of their code for a few specific metrics, but last I checked it's not really interesting for us (e.g. not generic enough). |
Comment from @ven-k on slack: when defining the struct for losses, including the `y` slightly improved the time taken. The two variants gave similar benchmarks, but the mean time of the former was 0.1 to 0.01 μs less than that of the latter, and it passes only `yhat` in each epoch. Also, adding to the above, we could have a wrapper function. |
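The code snippets from that slack message did not survive in this thread. As a hypothetical reconstruction (all names here are invented for illustration), the two variants being compared were presumably along these lines:

```julia
# Variant 1: a callable loss struct that captures the ground truth `y`
# once at construction time, so only `ŷ` is passed in each epoch.
struct CapturedL2{T<:Real}
    y::Vector{T}
end
(loss::CapturedL2)(ŷ) = sum(abs2, ŷ .- loss.y) / length(loss.y)

# Variant 2: a plain function taking both arguments on every call.
l2(ŷ, y) = sum(abs2, ŷ .- y) / length(y)

# In a training loop, variant 1 needs only the predictions:
y    = [1.0, 2.0, 3.0]
loss = CapturedL2(y)
loss([1.0, 2.0, 4.0])   # same value as l2([1.0, 2.0, 4.0], y)
```

The design trade-off is that variant 1 ties a loss instance to one dataset, which fits a fixed training loop but not a general-purpose measure API.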
LossFunctions fix: We can make measures from LossFunctions behave exactly like all the others when called by importing their names into scope, instead of |
I just tried ... There is a growing literature on probabilistic predictions for regression models (i.e. predicting a conditional distribution).

PS: here are all 47 measures I currently get:

```julia
using MLJ;
a = measures()
[println(a[i]) for i in 1:length(measures())]
```

```
(name = area_under_curve, ...)
(name = accuracy, ...)
(name = balanced_accuracy, ...)
(name = cross_entropy, ...)
(name = FScore, ...)
(name = false_discovery_rate, ...)
(name = false_negative, ...)
(name = false_negative_rate, ...)
(name = false_positive, ...)
(name = false_positive_rate, ...)
(name = l1, ...)
(name = l2, ...)
(name = log_cosh, ...)
(name = mae, ...)
(name = mape, ...)
(name = matthews_correlation, ...)
(name = misclassification_rate, ...)
(name = negative_predictive_value, ...)
(name = positive_predictive_value, ...)
(name = rms, ...)
(name = rmsl, ...)
(name = rmslp1, ...)
(name = rmsp, ...)
(name = true_negative, ...)
(name = true_negative_rate, ...)
(name = true_positive, ...)
(name = true_positive_rate, ...)
(name = BrierScore{UnivariateFinite}, ...)
(name = DWDMarginLoss(), ...)
(name = ExpLoss(), ...)
(name = L1HingeLoss(), ...)
(name = L2HingeLoss(), ...)
(name = L2MarginLoss(), ...)
(name = LogitMarginLoss(), ...)
(name = ModifiedHuberLoss(), ...)
(name = PerceptronLoss(), ...)
(name = SigmoidLoss(), ...)
(name = SmoothedL1HingeLoss(), ...)
(name = ZeroOneLoss(), ...)
(name = HuberLoss(), ...)
(name = L1EpsilonInsLoss(), ...)
(name = L2EpsilonInsLoss(), ...)
(name = LPDistLoss(), ...)
(name = LogitDistLoss(), ...)
(name = PeriodicLoss(), ...)
(name = QuantileLoss(), ...)
(name = confusion_matrix, ...)
```
|
I believe we already have negative log-likelihood, aka log-loss. It is called `cross_entropy`:

```
search: cross_entropy

  cross_entropy

  Cross entropy loss with probabilities clamped between eps() and 1-eps(); aliases: cross_entropy.

  ce = CrossEntropy(; eps=eps())
  ce(ŷ, y)

  Given an abstract vector of distributions ŷ and an abstract vector of true observations y, return the
  corresponding cross-entropy loss (aka log loss) scores.

  Since the score is undefined in the case where the true observation has predicted probability zero,
  probabilities are clipped between eps and 1-eps, where eps can be specified.

  If sᵢ is the predicted probability for the true class yᵢ then the score for that example is given by

  -log(clamp(sᵢ, eps, 1-eps))

  For more information, run info(cross_entropy).
```

```julia
julia> yhat = UnivariateFinite(["yes", "no"], rand(5), pool=missing, augment=true)
5-element MLJBase.UnivariateFiniteArray{Multiclass{2},String,UInt8,Float64,1}:
 UnivariateFinite{Multiclass{2}}(yes=>0.374, no=>0.626)
 UnivariateFinite{Multiclass{2}}(yes=>0.532, no=>0.468)
 UnivariateFinite{Multiclass{2}}(yes=>0.428, no=>0.572)
 UnivariateFinite{Multiclass{2}}(yes=>0.691, no=>0.309)
 UnivariateFinite{Multiclass{2}}(yes=>0.539, no=>0.461)

julia> y = rand(classes(yhat), 5)
5-element Array{CategoricalArrays.CategoricalValue{String,UInt8},1}:
 "no"
 "no"
 "yes"
 "no"
 "yes"

julia> cross_entropy(yhat, y)
5-element Array{Float64,1}:
 0.4691627141887623
 0.7594675442682963
 0.8484769383284205
 1.1752213731506886
 0.6185977143266518
```
|
@ablaom

```julia
using MLJ
X, y = @load_boston
train, test = partition(eachindex(y), .7, rng=333);
@load LinearRegressor pkg = GLM
mdl = LinearRegressor()
mach = machine(mdl, X, y)
fit!(mach, rows=train, verbosity=0)
ŷ = predict(mach, rows=test)
cross_entropy(ŷ, y[test])
```

```
ERROR: MethodError: no method matching (::MLJBase.CrossEntropy{Float64})(::Array{Distributions.Normal{Float64},1}, ::Array{Float64,1})
Closest candidates are:
  Any(::MLJBase.UnivariateFiniteArray{S,V,R,P,1}, ::AbstractArray{T,1} where T) where {S, V, R, P} at /Users/AZevelev/.julia/packages/MLJBase/Ov46j/src/measures/finite.jl:64
  Any(::AbstractArray{var"#s577",1} where var"#s577"<:UnivariateFinite, ::AbstractArray{T,1} where T) at /Users/AZevelev/.julia/packages/MLJBase/Ov46j/src/measures/finite.jl:57
Stacktrace:
 [1] top-level scope at none:1
```
|
Ah yes. |
For the continuous case, I doubt it would be called |
Lighthouse has some measures we may want to include: JuliaAI/MLJBase.jl#586 |
Community discussion on mitigating metric code fragmentation |
(Sorry, super old message, but...)
You can use
(see also JuliaLang/julia#35162). Going through the thread, the option to have `auc` return a vector seems pretty weird to me: if it returns one internally to eliminate a type instability, fine, but not to the user. Maybe a way to do this is to implement a |
No, there doesn't seem to be much stomach for this suggestion.
Done. I had forgotten about NaNs, though. |
Would be nice to have various metrics from |
edit See this important issue
The measures part of MLJBase could do with some TLC. It is not the shiniest part of the MLJ code base; it was written in a bit of a hurry, because nothing much could go forward without something in place and the existing packages came up short.
I think the API is more or less fine, but the way things are implemented is less than ideal, leading to:
(i) code redundancy

(ii) less functionality: measures that could support weights or implement `reports_each_observation` don't.

Recall that a measure with `reports_each_observation` means that `m(v1, v2)` returns a vector of measurements; otherwise a single scalar is returned. So it doesn't really make sense for `auc`, for example, to `report_each_observation` (which it doesn't). However, `mae` should (but doesn't).

I propose we make the following assumption, which will allow us to resolve these issues for the majority of measures:

> If a measure `m(v1, v2)` implements `reports_each_observation`, then it is understood to be the sum or mean value of some scalar version `m(s1, s2)`.

For such measures, then, we need only implement the scalar method `m(s1, s2)`, and the other methods `m(v1, v2)` and `m(v1, v2, w)` can be generated automatically.

For other measures, such as `auc` and the `rms` family, `m(v1, v2)` (and optionally `m(v1, v2, w)`) must be explicitly implemented, as at present.

In addition to the docs, there is a lot about the measure design in this discussion.
Details
To "automatically generate" the extra methods, we could do something like this:
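The code block that originally followed this sentence was lost from the thread. As a hedged sketch only (the function name `define_vector_fallbacks` and the weighting convention are assumptions, not the actual implementation), the generation could look something like:

```julia
# Hypothetical sketch: given a measure type M whose instances implement
# a scalar method m(s1, s2), generate the vector method m(v1, v2) and
# the weighted method m(v1, v2, w) automatically.
function define_vector_fallbacks(M::Type)
    @eval begin
        # per-observation measurements, as for `reports_each_observation`
        (m::$M)(v1::AbstractVector, v2::AbstractVector) =
            [m(s1, s2) for (s1, s2) in zip(v1, v2)]
        # weighted version: rescale per-observation measurements so that
        # uniform weights reproduce the unweighted result
        (m::$M)(v1::AbstractVector, v2::AbstractVector, w::AbstractVector) =
            w .* m(v1, v2) .* (length(w) / sum(w))
    end
end

# usage with a toy scalar measure:
struct MAE end
(m::MAE)(s1::Real, s2::Real) = abs(s1 - s2)
define_vector_fallbacks(MAE)
```

After the call, `MAE()([1.0, 3.0], [2.0, 1.0])` yields the per-observation vector `[1.0, 2.0]`, with no hand-written vector method.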