[Bug]: the result of Lasso learner is different from others #149
-
Describe the bug

Hi, the DML package is really useful for me and I am using it to conduct my master thesis. I have tried LightGBM/RF/XGBoost/Lasso as learners. The results of LightGBM/RF/XGBoost are similar, but the results of Lasso are rather different. The following is a part of the results. Can you help me with this issue?

Minimum reproducible code snippet

```r
LassoFormula = xnames[1]
for (name in xnames[-1]) {
  LassoFormula = ...
}
formula(LassoFormula)  # create the formula
################################
LassoDMLLasso = function(yname) {
  data_dml_flex = DoubleMLData$new(model_data, ...)
}
```

Expected Result

I think the results of different learners should be similar.

Actual Result
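As additional context for the snippet above, here is a minimal sketch of a comparable lasso setup with the R DoubleML package. The names model_data, xnames, yname and dname are placeholders, and the choice of the mlr3 learner regr.cv_glmnet with lambda.min is illustrative rather than the exact configuration used here:

```r
library(DoubleML)
library(mlr3)
library(mlr3learners)

# model_data is assumed to be a data.table; yname / dname / xnames stand in
# for the outcome, treatment and covariate columns
data_dml = DoubleMLData$new(model_data,
                            y_col = yname,
                            d_cols = dname,
                            x_cols = xnames)

# Cross-validated lasso for both nuisance functions
ml_l = lrn("regr.cv_glmnet", s = "lambda.min")
ml_m = lrn("regr.cv_glmnet", s = "lambda.min")

dml_plr = DoubleMLPLR$new(data_dml, ml_l, ml_m, n_folds = 3)
dml_plr$fit()
dml_plr$summary()
```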
Versions
-
Hello @victoriasunsun,

thank you for opening this issue/discussion. We moved your issue to the discussions because we believe that it's not really concerning a bug. I guess it rather depends on the learners' performance. In our example notebooks, e.g., on the 401(k) example, the performance of lasso is comparable to the other learners (which does not necessarily have to be the case).

When applying double machine learning for causal inference, ML methods are used to approximate potentially high-dimensional and/or complex nuisance functions. When you obtain different results with different ML methods, it is a good idea to check the first-stage predictions. If the different ML methods have a similar prediction quality in the first stage and the estimates for the causal parameters are still very different, this would be problematic.

It's a bit hard to really see what's going wrong in your example. IMO there's no guarantee that all learners lead to the same results.
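To make such a first-stage check concrete, here is a minimal sketch on the 401(k) data shipped with the package. It assumes the store_predictions argument of fit() and that the stored predictions are accessible under the entry ml_l; the exact argument and entry names may differ across package versions, so treat this as a sketch rather than exact API usage:

```r
library(DoubleML)
library(mlr3)
library(mlr3learners)

data_dml = fetch_401k(return_type = "DoubleMLData")

# Fit the partially linear model with a given nuisance learner and keep
# the cross-fitted first-stage predictions
fit_with = function(learner_name) {
  obj = DoubleMLPLR$new(data_dml, lrn(learner_name), lrn(learner_name),
                        n_folds = 3)
  obj$fit(store_predictions = TRUE)
  obj
}

plr_lasso  = fit_with("regr.cv_glmnet")
plr_forest = fit_with("regr.ranger")

# Out-of-fold prediction quality for the outcome nuisance E[Y | X]:
# similar RMSEs but very different coefficient estimates would be a warning sign
rmse = function(y, yhat) sqrt(mean((y - yhat)^2))
y = data_dml$data[[data_dml$y_col]]
c(lasso  = rmse(y, plr_lasso$predictions$ml_l),
  forest = rmse(y, plr_forest$predictions$ml_l))
```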
The tree-based methods are more flexible in terms of fitting the nuisance components due to their nonlinearity, whereas lasso is based on a linear model. Also, an interacted model used in lasso, like the one in your code, does not necessarily achieve the same flexibility as the tree-based methods, because the generated interactions might not match the structure of the trees in random forests etc. To be more specific: maybe the effect of some covariates is nonlinear or involves interactions that the lasso specification does not capture.

The performance of the learners might also depend on the choice of the parameters. Note that for the cross-fitting the sample is split into three folds, so each nuisance model is trained on only a part of the data. And of course, the ratio of p (number of variables) and n (number of observations) plays a role, too. How many natural and constructed covariates do you use?

Have you tried out a standard linear regression and logistic regression for estimation of the nuisance parts? It might be interesting to see whether the performance is similar to lasso, which might happen in case p is not big compared to n. Also, you may want to use the option for storing the first-stage predictions, which makes it easier to evaluate them.
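As a rough illustration of that comparison, here is a minimal sketch on the 401(k) data. The learner names regr.lm and regr.cv_glmnet come from mlr3learners and are used here for both nuisance parts as an assumption; with a binary treatment, a classification learner such as classif.log_reg or classif.cv_glmnet could be used for the treatment nuisance instead:

```r
library(DoubleML)
library(mlr3)
library(mlr3learners)

data_dml = fetch_401k(return_type = "DoubleMLData")

# Unpenalized linear regression for both nuisance parts
plr_ols = DoubleMLPLR$new(data_dml, lrn("regr.lm"), lrn("regr.lm"),
                          n_folds = 3)
plr_ols$fit()

# Cross-validated lasso for both nuisance parts
plr_lasso = DoubleMLPLR$new(data_dml,
                            lrn("regr.cv_glmnet", s = "lambda.min"),
                            lrn("regr.cv_glmnet", s = "lambda.min"),
                            n_folds = 3)
plr_lasso$fit()

# If p is small relative to n, the two coefficient estimates should be close
plr_ols$summary()
plr_lasso$summary()
```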
I hope this helps you a bit in finding out what's going on in your application. Let us know if you gain some insights and want to share some lessons learned...

Thanks again and best,
Philipp