mnlogit weird standard errors #14

vincentarelbundock · 2023-06-29T11:56:50Z

No description provided.

artiom-matvei · 2024-11-19T00:36:42Z

I noted this too while working on fixing the mnlogit tests. It is caused by Jacobians actually. See the difference below.
These values can be reproduced by running the code provided below for Python and for R ( (and debugging both).

Python version

###test_comparisons_01() in mnlogit
    penguins_clean = penguins_with_nulls.select(
        ["island", "bill_length_mm", "flipper_length_mm"]
    ).drop_nulls()

    # Define island categories and create a mapping
    island_categories = ["Biscoe", "Dream", "Torgersen"]
    island_mapping = {island: code for code, island in enumerate(island_categories)}

    # Map 'island' to integer codes
    penguins_clean = penguins_clean.with_columns(
        pl.col("island").replace_strict(island_mapping)
    )

    mod = smf.mnlogit(
        "island ~ bill_length_mm + flipper_length_mm", data=penguins_clean
    ).fit()
    unknown = (
        comparisons(mod)
        .with_columns(pl.col("group").replace(island_mapping))
        .sort(["rowid", "term", "group"])
    )
    known = (
        pl.read_csv("tests/r/test_statsmodels_mnlogit_comparisons_01.csv")
        .with_columns(pl.col("group").replace(island_mapping))
        .sort(["rowid", "term", "group"])
    )
    new_column_names = {col: col.replace('.', '_') for col in known.columns}
    known = known.rename(new_column_names)
    assert_series_equal(known["estimate"].head(), unknown["estimate"].head(), rtol=2)

# python jacobian
    array([[ 3.26007929e-03, -1.11398030e+05, -1.07005746e+07,
         7.82170746e+07,  2.78495302e+07,  1.34757804e+00],
       [ 3.00860803e+05,  1.11398009e+05,  3.15899457e+07,
         4.00666771e+07,  1.42659178e+07, -5.02687187e+00],
       [-3.00860806e+05,  2.03275902e-02, -2.08893711e+07,
        -1.18283752e+08, -4.21154480e+07,  3.67929389e+00],
       ...,
       [ 1.94566192e-02,  6.31280874e+04,  6.06390551e+06,
         6.95824154e+07,  2.47751212e+07,  4.27175339e-01],
       [ 1.19234156e+05, -6.31280929e+04,  1.25194134e+07,
        -2.27053711e+07, -8.08434379e+06, -1.51166669e+00],
       [-1.19234176e+05,  5.47722901e-03, -1.85833189e+07,
        -4.68770444e+07, -1.66907774e+07,  1.08449133e+00]])

R version

#R jacobian
                 [,1]          [,2]          [,3]          [,4]          [,5]          [,6]
   [1,]  4.211085e-03  0.1129098874  7.622065e-01  6.449953e-03  2.390449e-01  1.167441e+00
   [2,]  2.317994e-03 -0.0002889116  4.311471e-01  1.240511e-02  4.599770e-01  2.307350e+00
   [3,] -1.458913e-02 -0.7474941310 -2.844880e+00  2.018562e-02  7.334876e-01  3.936196e+00
   [4,] -1.768363e-02 -0.7516309380 -3.412940e+00  1.977302e-02  5.853414e-01  3.816193e+00
   [5,] -6.148701e-03 -0.3694016210 -1.168253e+00  1.883557e-02  6.798748e-01  3.578758e+00
   [6,]  3.851085e-03  0.0970890213  6.970463e-01  6.842971e-03  2.519078e-01  1.238578e+00
   [7,] -1.807982e-02 -0.8520217380 -3.525564e+00  2.090703e-02  7.173019e-01  4.076872e+00
   [8,] -1.324527e-02 -0.5110488046 -2.556336e+00  1.326720e-02  2.654043e-01  2.560569e+00
   [9,]  7.328301e-03  0.1888463791  1.392377e+00  1.094683e-02  4.361391e-01  2.079898e+00
  [10,] -4.511196e-03 -0.2639279320 -8.390823e-01  1.642779e-02  5.682850e-01  3.055570e+00
  ...

# Load necessary packages
library(nnet)
library(dplyr)
library(marginaleffects)

# Load and prepare the data
penguins <- read.csv("https://vincentarelbundock.github.io/Rdatasets/csv/palmerpenguins/penguins.csv")
penguins_clean <- penguins %>%
  select(island, bill_length_mm, flipper_length_mm) %>%
  na.omit()
penguins_clean$island <- relevel(factor(penguins_clean$island), ref = "Biscoe")

# Fit the model
model_r <- multinom(island ~ bill_length_mm  + flipper_length_mm, data = penguins_clean)

# Extract and display coefficients in case the models do not match
comparisons(model_r)

vincentarelbundock · 2024-11-20T14:36:21Z

I'm not sure exactly and will have to invest a lot of time to figure things out. I don't have time now, so do what you can on this issue, then post something on Github to explain the state of work and things. I'll take a look when I can.

Then, you can move on to other things.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

mnlogit weird standard errors #14

mnlogit weird standard errors #14

vincentarelbundock commented Jun 29, 2023

artiom-matvei commented Nov 19, 2024 •

edited

Loading

vincentarelbundock commented Nov 20, 2024

mnlogit weird standard errors #14

mnlogit weird standard errors #14

Comments

vincentarelbundock commented Jun 29, 2023

artiom-matvei commented Nov 19, 2024 • edited Loading

Python version

R version

vincentarelbundock commented Nov 20, 2024

artiom-matvei commented Nov 19, 2024 •

edited

Loading