
MAPE #296

Closed
wants to merge 13 commits into from

Conversation

azev77
Contributor

@azev77 azev77 commented May 9, 2020

No description provided.

@azev77
Contributor Author

azev77 commented May 9, 2020

@OkonSamuel I think this one has the stuff

@codecov-io

codecov-io commented May 9, 2020

Codecov Report

Merging #296 into dev will increase coverage by 0.09%.
The diff coverage is 100.00%.


@@            Coverage Diff             @@
##              dev     JuliaAI/MLJBase.jl#296      +/-   ##
==========================================
+ Coverage   82.50%   82.59%   +0.09%     
==========================================
  Files          30       30              
  Lines        2023     2034      +11     
==========================================
+ Hits         1669     1680      +11     
  Misses        354      354              
Impacted Files Coverage Δ
src/MLJBase.jl 100.00% <ø> (ø)
src/measures/finite.jl 98.75% <ø> (ø)
src/data/datasets_synthetic.jl 95.95% <100.00%> (ø)
src/measures/continuous.jl 100.00% <100.00%> (ø)

Continue to review full report at Codecov.

Legend:
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update c64b7e8...1d76a1d. Read the comment docs.

@tlienart
Collaborator

tlienart commented May 10, 2020

ah crap, I commented on the wrong PR; please see the comments in #293. There are a few numerical issues to sort out, and there should be tests for these corner cases.

@azev77
Contributor Author

azev77 commented May 10, 2020

@tlienart

1. Added `@inbounds`. This could be added in other places in the same program, but I don't want to touch other people's code.

2. Regarding a tolerance: this is an issue with any measure based on percentage errors, see #95 (currently RMSP does the same thing). New code:

function (::MAPE)(ŷ::Vec{<:Real}, y::Vec{<:Real}; tol = eps())
    check_dimensions(ŷ, y)
    ret = zero(eltype(y))
    count = 0
    @inbounds for i in eachindex(y)
        ayi = abs(y[i])
        if ayi > tol
        #if y[i] != zero(eltype(y))
            dev = abs((y[i] - ŷ[i]) / ayi)
            #dev = abs((y[i] - ŷ[i]) / y[i])
            ret += dev
            count += 1
        end
    end
    return ret / count
end
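As a quick illustration of the tolerance behaviour, here is a self-contained sketch of the same loop; the MLJBase `Vec` annotation and `check_dimensions` are replaced with plain equivalents so it runs standalone, and `mape_sketch` is a hypothetical name:

```julia
# Standalone sketch of the MAPE loop above (not the MLJBase version).
function mape_sketch(ŷ, y; tol = eps())
    length(ŷ) == length(y) || throw(DimensionMismatch("ŷ and y must match"))
    ret = zero(float(eltype(y)))
    count = 0
    @inbounds for i in eachindex(y)
        ayi = abs(y[i])
        if ayi > tol                  # observations with |y| ≤ tol are skipped
            ret += abs((y[i] - ŷ[i]) / ayi)
            count += 1
        end
    end
    return ret / count
end

y = [-1, 0, 1, 2]                     # the zero target is skipped entirely
ŷ = [-2, 5, 2, 2]
mape_sketch(ŷ, y)                     # (1 + 1 + 0) / 3 ≈ 0.667
```

Note that with `tol = 0` and a target vector that is all zeros, `count` stays at `0` and the function returns `NaN`, which is the corner case discussed below.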

3. Regarding the corner case where `count == 0`: that is an inevitable, known problem with any percentage-based score measure (including RMSP).

@tlienart
Collaborator

Great, thanks. @ablaom will probably complain about how you pass the tolerance parameter; it should be part of the MAPE struct.

Re inevitable errors etc, I would suggest showing a warning?

Re improving RMSP to also have a threshold + use inbounds, we should do it 👍🏼

@azev77
Contributor Author

azev77 commented May 10, 2020

None of the current measures has a tolerance parameter as part of its struct.
How can this be done?
I also don't know how to throw a warning...

@azev77
Contributor Author

azev77 commented May 10, 2020

Perhaps the current code could be written more parsimoniously using mean(), median(), etc.?
(unless loops are faster?)

Define the error vector: e := ŷ - y

Define the percentage-error vector: p := e ./ y, restricted to indices where |y[i]| > tol

etcetera?

@tlienart
Collaborator

tlienart commented May 10, 2020

I think @ablaom can better comment on this PR; he's the one who thought carefully about measures.

I think the PR is great.

There's a wider discussion of performance, warnings, etc., but it can be done separately. And yes, afaik mean is two to three times faster than the loop, possibly because it uses SIMD(?); not sure.

A warning can be shown with `@warn`, but it's better to do this for everything rather than have only one metric do it, so it could be a separate issue.

PS: performance is not really a problem with metrics...
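The relative-speed claim is easy to check directly. A quick sketch using BenchmarkTools (assuming the package is installed; `mae_loop` and `mae_vec` are hypothetical names for illustration, and the actual ratio is machine-dependent):

```julia
using BenchmarkTools, Statistics

y = randn(10_000)
ŷ = randn(10_000)

# explicit loop, as in the PR's MAPE implementation
function mae_loop(ŷ, y)
    ret = zero(eltype(y))
    @inbounds for i in eachindex(y)
        ret += abs(y[i] - ŷ[i])
    end
    return ret / length(y)
end

# vectorized one-liner, as proposed above
mae_vec(ŷ, y) = mean(abs.(ŷ .- y))

@btime mae_loop($ŷ, $y)   # timings vary by machine; the vectorized
@btime mae_vec($ŷ, $y)    # version also allocates a temporary array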

@azev77
Contributor Author

azev77 commented May 10, 2020

> PS: performance is not really a problem with metrics...

I fully agree, that's why I wasn't sure about the need for @inbounds, but why not...

@azev77
Contributor Author

azev77 commented May 10, 2020

@ablaom @tlienart see how much easier this could be:
(the only things missing are weights & tol)

function err(ŷ, y)
    return ŷ - y
end

function perr(ŷ, y; tol = eps())
    e = err(ŷ, y)
    idx = abs.(y) .> tol            # drop observations with |y| below the tolerance
    return 100 .* e[idx] ./ y[idx]
end

using Statistics
mse(ŷ, y)  = err(ŷ, y) |> (x)->x.^2 |> mean
rmse(ŷ, y) = err(ŷ, y) |> x -> x.^2 |> mean |> sqrt
mae(ŷ, y)  = err(ŷ, y) |> x -> abs.(x) |> mean
mdae(ŷ, y) = err(ŷ, y) |> x -> abs.(x) |> median
rmdse(ŷ, y) = err(ŷ, y) |> x -> x.^2 |> median |> sqrt
maxae(ŷ, y) = err(ŷ, y) |> x -> abs.(x) |> maximum

mape(ŷ, y)    = perr(ŷ, y) |> x -> abs.(x) |> mean
mdape(ŷ, y)   = perr(ŷ, y) |> x -> abs.(x) |> median
rmspe(ŷ, y)   = perr(ŷ, y) |> x -> x.^2 |> mean |> sqrt
rmdspe(ŷ, y)  = perr(ŷ, y) |> x -> x.^2 |> median |> sqrt

"https://en.wikipedia.org/wiki/Symmetric_mean_absolute_percentage_error"
smape(ŷ, y)   = 200 .* abs.(err(ŷ, y)) ./ (ŷ .+ y) |> mean
smdape(ŷ, y)  = 200 .* abs.(err(ŷ, y)) ./ (ŷ .+ y) |> median

# TEST
y    = [-1, 0, 1, 2, 3, 4]
ŷ    = [-3, 1, 2, 2, 4, 5]

e = err(ŷ, y)
p = perr(ŷ, y)

mse(ŷ, y)
rmse(ŷ, y)
mae(ŷ, y) 
mdae(ŷ, y) 
rmdse(ŷ, y)
maxae(ŷ, y)

mape(ŷ, y)
mdape(ŷ, y)
rmspe(ŷ, y)
rmdspe(ŷ, y)
smape(ŷ, y)
smdape(ŷ, y) 

@ablaom ablaom left a comment
Member
@azev77 Thanks very much indeed.

Yes, you need to make the tolerance part of the struct. Measures can't have kwargs. There is an example here:

https://github.com/alan-turing-institute/MLJBase.jl/blob/7f023250dd497c3ead048c06f894bcbeee0113f7/src/measures/finite.jl#L9

You may also want to add weight support but I won't insist.

It's not necessary in this PR, but ideally mape should be implemented as a reports_per_observation measure, as should all measures that are simply sum or mean aggregates of some function of a single observation (so, not rms and not auc). However, this improvement could be part of a larger refactoring project to eliminate a lot of redundant code for such loss functions. I'll open an issue.

(Yes, would be great if you can add this tolerance to Root mean squared percentage loss as well.)

@azev77
Contributor Author

azev77 commented May 10, 2020

I'm not sure about the struct, and I took out the kwarg but put in a default tol.
Can a measure have arguments w/ default values?

struct MAPE <: Measure 
    tol::Real
end

"""
     mape(ŷ, y)
Mean Absolute Percentage Error:
``\\text{MAPE} =  m^{-1}∑ᵢ|{(yᵢ-ŷᵢ) \\over yᵢ}|`` 
where the sum is over indices such that `yᵢ≂̸0` and `m` is the number
of such indices.
For more information, run `info(mape)`.
"""
const mape = MAPE()

metadata_measure(MAPE;
    name                     = "mape",
    target_scitype           = Union{Vec{Continuous},Vec{Count}},
    prediction_type          = :deterministic,
    orientation              = :loss,
    reports_each_observation = false,
    is_feature_dependent     = false,
    supports_weights         = false,
    docstring                = "Mean Absolute Percentage Error; aliases: `mape`.")

function (::MAPE)(ŷ::Vec{<:Real}, y::Vec{<:Real}, tol = eps())
    check_dimensions(ŷ, y)
    ret = zero(eltype(y))
    count = 0
    @inbounds for i in eachindex(y)
        ayi = abs(y[i])
        if ayi > tol
        #if y[i] != zero(eltype(y))
            dev = abs((y[i] - ŷ[i]) / ayi)
            #dev = abs((y[i] - ŷ[i]) / y[i])
            ret += dev
            count += 1
        end
    end
    return ret / count
end

@ablaom
Member

ablaom commented May 11, 2020

No, the idea is that you choose the tolerance at time of instantiation of the measure. As in

mape_dangerous = MAPE(tol=0)
mape_dangerous(yhat, y) 

You're missing a keyword constructor for the callable measure instances, as in `CrossEntropy(; eps=eps()) = CrossEntropy(eps)`. Please see the cross_entropy example.

The signature of the callable object `m` can only be `m(yhat, y)` or `m(yhat, y, w)`.

Clear?

@azev77
Contributor Author

azev77 commented May 11, 2020

@ablaom sorry, not clear.

  1. Can you show me w/ the mape() example?
  2. Include it w/ the new upgraded measures.jl (if we're gonna throw out most of the code anyway)?

@ablaom
Member

ablaom commented May 11, 2020

> 1. Can you show me w/ the mape() example?

See the new mape fork. Here's the diff: 9aabed7

> 2. Include it w/ the new upgraded measures.jl (if we're gonna throw out most of the code anyway)?

Are you volunteering to carry out the refactor at JuliaAI/StatisticalMeasures.jl#17, then? I'm not going to get around to that for some time.
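For reference, the constructor pattern described above (tolerance stored in the struct, set at instantiation via a keyword constructor, read by the callable) can be sketched as follows. This is assembled from the discussion in this thread, not copied from the actual diff, and assumes the MLJBase context (`Measure`, `Vec`, `check_dimensions`):

```julia
struct MAPE <: Measure
    tol::Real
end

# keyword constructor, so users can write MAPE(tol=0)
MAPE(; tol = eps()) = MAPE(tol)

# the callable reads the tolerance from the instance, not from a kwarg
function (m::MAPE)(ŷ::Vec{<:Real}, y::Vec{<:Real})
    check_dimensions(ŷ, y)
    ret = zero(eltype(y))
    count = 0
    @inbounds for i in eachindex(y)
        ayi = abs(y[i])
        if ayi > m.tol
            ret += abs((y[i] - ŷ[i]) / ayi)
            count += 1
        end
    end
    return ret / count
end

mape_dangerous = MAPE(tol = 0)
mape_dangerous(ŷ, y)
```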

@azev77
Contributor Author

azev77 commented May 11, 2020

@ablaom

  1. Thanks for the code (I think I incorporated it).
  2. While I'd love to help w/ the refactor, I don't know enough about the details of MLJ.
    I spilled my guts here.

@ablaom ablaom mentioned this pull request May 11, 2020
@ablaom
Member

ablaom commented May 11, 2020

Closed in favour of #302 - essentially identical

@ablaom ablaom closed this May 11, 2020
@ablaom
Member

ablaom commented May 11, 2020

@azev77 Many thanks for the PR.
