[ENH] mtype for multi-indexed `pandas` and `polars` dataframes
#460
Labels: API design, feature request, module:datatypes, module:regression
Specification and API consolidation discussion related to `polars` support for probabilistic predictions.

PR #399 makes it clear that, for full `polars` support, we need an internal representation for `polars` returns of `predict_interval` and `predict_quantiles`, and some means to convert between the `pandas` and `polars` representations.

The PR proposes to extend the `Table` mtype, though I think that is dangerous: it would redefine the abstract datatype to include multi-index columns, which would then, by a "chain of architecture", affect all mtypes in the `Table` scitype, and it is unclear what would have to happen to, for instance, the `pd.Series` or `numpy` based ones.

However, the best way to proceed does not seem clear to me - hence the discussion issue.
Personally, I see two options to maintain a clearer structure, both leaving `Table` mtypes unchanged:

1. Extend the `Proba` mtype with `polars` based ones. However, this would result in one `polars` based mtype per `predict` output, which now seems a less clean design than it did when there were only `pandas` based ones.
2. Introduce a new scitype, `TableMI` (or similarly named), whose ADT is tables that can have a column multi-index (and, potentially, a row multi-index, but perhaps only later, as we do not need this now). To start, it can have three mtypes: `polars` eager, `polars` lazy, and `pd.DataFrame` based.

We would then use the new concrete data structure in a converter in `predict_interval` and `predict_quantiles`, after the output has been produced.
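
To make the conversion problem concrete, here is a minimal sketch of a round trip between multi-index columns and flat string column names, which is the kind of step a `pandas` to `polars` converter would need, since `polars` column names are plain strings. It uses `pandas` only. The helper names `flatten_columns` and `restore_columns`, the `"__"` separator, and the toy three-level column layout (variable, coverage, lower/upper) are assumptions for illustration, not part of the PR or of any existing API.

```python
from __future__ import annotations

import pandas as pd

SEP = "__"  # hypothetical separator; assumes no level value contains it


def flatten_columns(df: pd.DataFrame) -> tuple[pd.DataFrame, list[tuple]]:
    """Return a copy with flat string column names, plus the original tuples."""
    original = list(df.columns)  # iterating a MultiIndex yields tuples
    flat = df.copy()
    flat.columns = [SEP.join(str(level) for level in col) for col in original]
    return flat, original


def restore_columns(flat: pd.DataFrame, original: list[tuple]) -> pd.DataFrame:
    """Re-attach the stored column tuples as a MultiIndex."""
    out = flat.copy()
    out.columns = pd.MultiIndex.from_tuples(original)
    return out


# toy stand-in for an interval prediction return (layout is an assumption)
columns = pd.MultiIndex.from_tuples([("y", 0.9, "lower"), ("y", 0.9, "upper")])
pred_int = pd.DataFrame([[1.0, 3.0], [2.0, 4.0]], columns=columns)

flat, original = flatten_columns(pred_int)
print(list(flat.columns))  # ['y__0.9__lower', 'y__0.9__upper']

restored = restore_columns(flat, original)
print(restored.columns.equals(pred_int.columns))  # True
```

The sketch stores the original column tuples alongside the flat frame rather than parsing them back out of the strings, which avoids guessing level dtypes (for example, turning `"0.9"` back into a float). A dedicated mtype would need to carry that metadata in some form, for instance through a fixed naming convention.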