Take care of deprecation warnings from dependencies #482

Merged 21 commits on Mar 6, 2024
Commits
ef5f01a  Call .map() if pandas version is 2.1.0 or greater (timmens, Mar 3, 2024)
5b0be83  Add compat.py module (timmens, Mar 4, 2024)
de9a346  Ignore deprecation warning about MemoizeJac, raised in cyipopt (timmens, Mar 4, 2024)
5fc72cb  Use np.prod instead of np.product (timmens, Mar 4, 2024)
ce60fac  Fix faulty ignore of deprecation warning (timmens, Mar 4, 2024)
4ce3e40  Call unique() and value_counts() on Series and not list (timmens, Mar 4, 2024)
f18dbfc  Do not pass maxiter option to scipy truncated newton (timmens, Mar 4, 2024)
d3d07cb  Ignore deprecation of array to scalar conversion (timmens, Mar 4, 2024)
3c072c3  Use ffill() instead of fillna(method=ffill) (timmens, Mar 4, 2024)
7af5cb6  Use replace with categorical columns properly (timmens, Mar 4, 2024)
7345c05  Remove usage of DataFrameGroupBy.grouper (timmens, Mar 4, 2024)
07a8dd7  Convert dataframe to float before assigning np.inf values (timmens, Mar 4, 2024)
4c17eca  Add action to CI that runs certain tests with pandas 1 (timmens, Mar 4, 2024)
bc88062  Remove unused packages from pandas test environment (timmens, Mar 4, 2024)
f326a5b  Update environments (timmens, Mar 4, 2024)
606dddf  Update Python version matrix for CI tests (timmens, Mar 5, 2024)
598c7d3  Merge branch 'main' into deprecation-warnings (timmens, Mar 5, 2024)
0eac844  Implement requested changes from review (timmens, Mar 5, 2024)
bb0d0e3  Merge branch 'deprecation-warnings' of https://github.com/OpenSourceE… (timmens, Mar 5, 2024)
df15ada  Describe why we convert to float (timmens, Mar 6, 2024)
50c2c24  Add comment to main.yml (timmens, Mar 6, 2024)
1 change: 1 addition & 0 deletions .envs/testenv-linux.yml
@@ -23,6 +23,7 @@ dependencies:
- scipy>=1.2.1 # run, tests
- sqlalchemy # run, tests
- tranquilo>=0.0.4 # dev, tests
- seaborn # dev, tests
- pip: # dev, tests, docs
- DFO-LS # dev, tests
- Py-BOBYQA # dev, tests
1 change: 1 addition & 0 deletions .envs/testenv-others.yml
@@ -22,6 +22,7 @@ dependencies:
- scipy>=1.2.1 # run, tests
- sqlalchemy # run, tests
- tranquilo>=0.0.4 # dev, tests
- seaborn # dev, tests
- pip: # dev, tests, docs
- DFO-LS # dev, tests
- Py-BOBYQA # dev, tests
31 changes: 31 additions & 0 deletions .envs/testenv-pandas.yml
@@ -0,0 +1,31 @@
---
name: estimagic
channels:
- conda-forge
- nodefaults
dependencies:
- pandas<2.0.0
- nlopt # dev, tests
- pip # dev, tests, docs
- pytest # dev, tests
- pytest-cov # tests
- pytest-xdist # dev, tests
- statsmodels # dev, tests
- bokeh<=2.4.3 # run, tests
- click # run, tests
- cloudpickle # run, tests
- joblib # run, tests
- numpy>=1.17.0 # run, tests
- plotly # run, tests
- pybaum >= 0.1.2 # run, tests
- scipy>=1.2.1 # run, tests
- sqlalchemy # run, tests
- tranquilo>=0.0.4 # dev, tests
- seaborn # dev, tests
- pip: # dev, tests, docs
- DFO-LS # dev, tests
- Py-BOBYQA # dev, tests
- fides==0.7.4 # dev, tests
- kaleido # dev, tests
- simoptlib==1.0.1 # dev, tests
- -e ../
10 changes: 9 additions & 1 deletion .envs/update_envs.py
@@ -34,13 +34,21 @@ def main():
test_env_others = deepcopy(test_env)
test_env_others.insert(_insert_idx, " - cyipopt<=1.2.0")

## test environment for pandas version 1
test_env_pandas = deepcopy(test_env)
test_env_pandas = [line for line in test_env_pandas if "pandas" not in line]
test_env_pandas.insert(_insert_idx, " - pandas<2.0.0")

# create docs testing environment

docs_env = [line for line in lines if _keep_line(line, "docs")]
docs_env.append(" - -e ../") # add local installation

# write environments
for name, env in zip(["linux", "others"], [test_env_linux, test_env_others]):
for name, env in zip(
["linux", "others", "pandas"],
[test_env_linux, test_env_others, test_env_pandas],
):
# Specify newline to avoid wrong line endings on Windows.
# See: https://stackoverflow.com/a/69869641
Path(f".envs/testenv-{name}.yml").write_text(
31 changes: 31 additions & 0 deletions .github/workflows/main.yml
@@ -24,6 +24,7 @@ jobs:
- '3.9'
- '3.10'
- '3.11'
- '3.12'
steps:
- uses: actions/checkout@v3
- name: create build environment
@@ -72,6 +73,36 @@
run: |
micromamba activate estimagic
pytest -m "not slow and not jax"
run-tests-with-old-pandas:
# This job is only for testing if estimagic works with older pandas versions, as
# many pandas functions we use will be deprecated in pandas 3. estimagic's behavior
# for older versions is handled in src/estimagic/compat.py.
name: Run tests for ${{ matrix.os}} on ${{ matrix.python-version }} with pandas 1
runs-on: ${{ matrix.os }}
strategy:
fail-fast: false
matrix:
os:
- ubuntu-latest
python-version:
- '3.11'
steps:
- uses: actions/checkout@v3
- name: create build environment
uses: mamba-org/provision-with-micromamba@main
with:
environment-file: ./.envs/testenv-pandas.yml
environment-name: estimagic
cache-env: true
extra-specs: |
python=${{ matrix.python-version }}
- name: run pytest
shell: bash -l {0}
run: |
micromamba activate estimagic
pytest tests/visualization
pytest tests/parameters
pytest tests/inference
code-in-docs:
name: Run code snippets in documentation
runs-on: ubuntu-latest
1 change: 1 addition & 0 deletions environment.yml
@@ -35,6 +35,7 @@ dependencies:
- sphinx-panels # docs
- sphinxcontrib-bibtex # docs
- tranquilo>=0.0.4 # dev, tests
- seaborn # dev, tests
- pip: # dev, tests, docs
- DFO-LS # dev, tests
- Py-BOBYQA # dev, tests
2 changes: 2 additions & 0 deletions pyproject.toml
@@ -81,6 +81,7 @@ filterwarnings = [
"ignore:Method .ptp is deprecated and will be removed in a future version. Use numpy.ptp instead.",
"ignore:In a future version of pandas all arguments of concat except for the argument 'objs' will be keyword-only",
"ignore:Please use `MemoizeJac` from the `scipy.optimize` namespace",
"ignore:`scipy.optimize.optimize.MemoizeJac` is deprecated",
"ignore:Some algorithms did not converge. Their walltime has been set to a very high value instead of infinity because Timedeltas do notsupport infinite values",
"ignore:In a future version, the Index constructor will not infer numeric dtypes when passed object-dtype sequences",
"ignore:distutils Version classes are deprecated. Use packaging.version instead",
@@ -91,6 +92,7 @@ filterwarnings = [
"ignore:Widget.widget_types is deprecated",
"ignore:Widget.widgets is deprecated",
"ignore:Parallelization together with",
"ignore:Conversion of an array with ndim > 0 to a scalar is deprecated",
]
addopts = ["--doctest-modules"]
markers = [
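The entries above are pytest `filterwarnings` patterns: an action, a colon, and a regular expression matched against the start of the warning message. Roughly equivalent to the following sketch (illustrative only, not pytest's actual machinery):

```python
import warnings

# Ignore any warning whose message starts with the given pattern
# (the pattern is interpreted as a regular expression).
warnings.filterwarnings(
    "ignore",
    message="Conversion of an array with ndim > 0 to a scalar is deprecated",
)
```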
33 changes: 33 additions & 0 deletions src/estimagic/compat.py
@@ -0,0 +1,33 @@
"""Compatibility module.

Contains wrapper functions to handle compatibility issues between different versions of
external libraries.

"""

from estimagic.config import IS_PANDAS_VERSION_NEWER_OR_EQUAL_TO_2_1_0


def pd_df_map(df, func, na_action=None, **kwargs):
"""Apply a function to a Dataframe elementwise.

pandas has depricated the .applymap() function with version 2.1.0. This function
calls either .map() (if pandas version is greater or equal to 2.1.0) or .applymap()
(if pandas version is smaller than 2.1.0).

Args:
df (pd.DataFrame): A pandas DataFrame.
func (callable): Python function, returns a single value from a single value.
na_action (str): If 'ignore', propagate NaN values, without passing them to
func. If None, pass NaN values to func. Default is None.
**kwargs: Additional keyword arguments to pass to func.

Returns:
pd.DataFrame: Transformed DataFrame.

"""
if IS_PANDAS_VERSION_NEWER_OR_EQUAL_TO_2_1_0:
out = df.map(func, na_action=na_action, **kwargs)
else:
out = df.applymap(func, na_action=na_action, **kwargs)

return out
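A brief usage sketch of the wrapper defined above (the DataFrame and formatter here are made up for illustration):

```python
import pandas as pd

from estimagic.compat import pd_df_map

df = pd.DataFrame({"a": [1.0, 2.0], "b": [3.0, 4.0]})
formatted = pd_df_map(df, "{:.2f}".format)
# On pandas >= 2.1.0 this dispatches to DataFrame.map, on older versions to
# DataFrame.applymap, so no deprecation warning is raised on either version.
```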
19 changes: 15 additions & 4 deletions src/estimagic/config.py
@@ -1,4 +1,6 @@
from pathlib import Path
import pandas as pd
from packaging import version

import plotly.express as px

@@ -19,9 +21,9 @@
CRITERION_PENALTY_SLOPE = 0.1
CRITERION_PENALTY_CONSTANT = 100

# =====================================================================================
# ======================================================================================
# Check Available Packages
# =====================================================================================
# ======================================================================================

try:
from petsc4py import PETSc # noqa: F401
@@ -103,9 +105,18 @@
IS_NUMBA_INSTALLED = True


# =================================================================================
# ======================================================================================
# Check if pandas version is newer or equal to version 2.1.0
# ======================================================================================

IS_PANDAS_VERSION_NEWER_OR_EQUAL_TO_2_1_0 = version.parse(
pd.__version__
) >= version.parse("2.1.0")


# ======================================================================================
# Dashboard Defaults
# =================================================================================
# ======================================================================================

Y_RANGE_PADDING = 0.05
Y_RANGE_PADDING_UNITS = "absolute"
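The flag is computed with `packaging.version` rather than a plain string comparison because version strings do not sort correctly as text. A small sketch of the difference (version numbers chosen only for illustration):

```python
from packaging import version

# Lexicographic comparison gets multi-digit components wrong ...
assert ("2.9.0" >= "2.10.0") is True  # wrong answer
# ... while parsed versions compare numerically.
assert (version.parse("2.9.0") >= version.parse("2.10.0")) is False
```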
3 changes: 2 additions & 1 deletion src/estimagic/optimization/optimize_result.py
@@ -5,6 +5,7 @@
import pandas as pd

from estimagic.utilities import to_pickle
from estimagic.compat import pd_df_map


@dataclass
@@ -128,7 +129,7 @@ def _format_convergence_report(report, algorithm):
report = pd.DataFrame.from_dict(report)
columns = ["one_step", "five_steps"]

table = report[columns].applymap(_format_float).astype(str)
table = pd_df_map(report[columns], _format_float).astype(str)

for col in "one_step", "five_steps":
table[col] = table[col] + _create_stars(report[col])
2 changes: 0 additions & 2 deletions src/estimagic/optimization/scipy_optimizers.py
@@ -355,7 +355,6 @@ def scipy_truncated_newton(
upper_bounds,
*,
stopping_max_criterion_evaluations=STOPPING_MAX_CRITERION_EVALUATIONS,
stopping_max_iterations=STOPPING_MAX_ITERATIONS,
convergence_absolute_criterion_tolerance=CONVERGENCE_ABSOLUTE_CRITERION_TOLERANCE,
convergence_absolute_params_tolerance=CONVERGENCE_ABSOLUTE_PARAMS_TOLERANCE,
convergence_absolute_gradient_tolerance=CONVERGENCE_ABSOLUTE_GRADIENT_TOLERANCE,
@@ -381,7 +380,6 @@ def scipy_truncated_newton(
"xtol": convergence_absolute_params_tolerance,
"gtol": convergence_absolute_gradient_tolerance,
"maxfun": stopping_max_criterion_evaluations,
"maxiter": stopping_max_iterations,
"maxCGit": max_hess_evaluations_per_iteration,
"stepmx": max_step_for_line_search,
"minfev": func_min_estimate,
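For context, the `maxiter` entry was dropped because newer SciPy releases warn when the truncated-Newton (TNC) method receives it and expect `maxfun` instead; this reading is inferred from the commit message, not from SciPy's changelog. A minimal sketch of a warning-free call:

```python
from scipy.optimize import minimize

res = minimize(
    lambda x: (x**2).sum(),
    x0=[1.0, 2.0],
    method="TNC",
    # Limit function evaluations via maxfun; passing maxiter here is what
    # triggered the deprecation warning this commit avoids.
    options={"maxfun": 100},
)
```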
8 changes: 4 additions & 4 deletions src/estimagic/parameters/block_trees.py
@@ -37,8 +37,8 @@ def matrix_to_block_tree(matrix, outer_tree, inner_tree):
shapes_outer = [np.shape(a) for a in flat_outer_np]
shapes_inner = [np.shape(a) for a in flat_inner_np]

block_bounds_outer = np.cumsum([int(np.product(s)) for s in shapes_outer[:-1]])
block_bounds_inner = np.cumsum([int(np.product(s)) for s in shapes_inner[:-1]])
block_bounds_outer = np.cumsum([int(np.prod(s)) for s in shapes_outer[:-1]])
block_bounds_inner = np.cumsum([int(np.prod(s)) for s in shapes_inner[:-1]])

blocks = []
for leaf_outer, s1, submat in zip(
@@ -94,8 +94,8 @@ def hessian_to_block_tree(hessian, f_tree, params_tree):
shapes_f = [np.shape(a) for a in flat_f_np]
shapes_p = [np.shape(a) for a in flat_p_np]

block_bounds_f = np.cumsum([int(np.product(s)) for s in shapes_f[:-1]])
block_bounds_p = np.cumsum([int(np.product(s)) for s in shapes_p[:-1]])
block_bounds_f = np.cumsum([int(np.prod(s)) for s in shapes_f[:-1]])
block_bounds_p = np.cumsum([int(np.prod(s)) for s in shapes_p[:-1]])

sub_block_trees = []
for s0, subarr in zip(shapes_f, np.split(hessian, block_bounds_f, axis=0)):
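`np.product` is a deprecated alias of `np.prod` (removed in NumPy 2.0), so the swap is behavior-preserving. A quick check with a made-up shape:

```python
import numpy as np

shape = (3, 4)
# Number of elements implied by the shape; identical to what np.product returned.
assert int(np.prod(shape)) == 12
```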
2 changes: 1 addition & 1 deletion src/estimagic/parameters/consolidate_constraints.py
@@ -592,7 +592,7 @@ def _drop_redundant_linear_constraints(weights, rhs):
new_rhs (pd.DataFrame)

"""
weights["dupl_group"] = weights.groupby(list(weights.columns)).grouper.group_info[0]
weights["dupl_group"] = weights.groupby(list(weights.columns)).ngroup()
rhs["dupl_group"] = weights["dupl_group"]
weights.set_index("dupl_group", inplace=True)

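pandas is deprecating direct access to `DataFrameGroupBy.grouper`; `ngroup()` yields the same per-row group labels through public API. A small sketch with toy data (not the real constraint weights):

```python
import pandas as pd

weights = pd.DataFrame({"a": [1, 1, 2], "b": [3, 3, 4]})
# One integer id per row, shared by rows that fall in the same group.
group_ids = weights.groupby(list(weights.columns)).ngroup()
assert group_ids.tolist() == [0, 0, 1]
```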
4 changes: 2 additions & 2 deletions src/estimagic/parameters/parameter_groups.py
@@ -35,10 +35,10 @@ def get_params_groups_and_short_names(params, free_mask, max_group_size=8):
names.append(name)

# if every parameter has its own group, they should all actually be in one group
if len(pd.unique(groups)) == len(groups):
if len(set(groups)) == len(groups):
groups = ["Parameters"] * len(groups)

counts = pd.value_counts(groups)
counts = pd.Series(groups).value_counts()
to_be_split = counts[counts > max_group_size]
for group_name, n_occurrences in to_be_split.items():
split_group_names = _split_long_group(
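Recent pandas deprecates calling the module-level helpers `pd.unique` and `pd.value_counts` on plain Python lists; using `set` and `Series.value_counts` keeps the same results. A sketch with a toy group list:

```python
import pandas as pd

groups = ["a", "a", "b"]
counts = pd.Series(groups).value_counts()
# Same counts the deprecated pd.value_counts(groups) used to return: a -> 2, b -> 1.
assert counts["a"] == 2 and counts["b"] == 1
```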
2 changes: 1 addition & 1 deletion src/estimagic/visualization/deviation_plot.py
@@ -68,7 +68,7 @@ def deviation_plot(
names=["problem", "algorithm", runtime_measure],
)
)
.fillna(method="ffill")
.ffill()
.reset_index()
)
average_deviations = (
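`fillna(method="ffill")` is deprecated in recent pandas in favor of the dedicated `ffill()` method, which fills each gap with the last preceding value. A tiny check with made-up data:

```python
import numpy as np
import pandas as pd

s = pd.Series([1.0, np.nan, 3.0])
# Forward fill: the NaN at position 1 is replaced by the previous value.
assert s.ffill().tolist() == [1.0, 1.0, 3.0]
```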
22 changes: 12 additions & 10 deletions src/estimagic/visualization/estimation_table.py
@@ -3,6 +3,7 @@
from functools import partial
from pathlib import Path
from warnings import warn
from estimagic.compat import pd_df_map

import numpy as np
import pandas as pd
@@ -305,7 +306,7 @@ def render_latex(
ci_in_body = False

if ci_in_body:
body.loc[("",)] = body.loc[("",)].applymap("{{{}}}".format).values
body.loc[("",)] = pd_df_map(body.loc[("",)], "{{{}}}".format).values
if body.columns.nlevels > 1:
column_groups = body.columns.get_level_values(0)
else:
@@ -1383,22 +1384,23 @@ def _apply_number_format(df_raw, number_format, format_integers):
if isinstance(processed_format, (list, tuple)):
df_formatted = df_raw.copy(deep=True).astype("float")
for formatter in processed_format[:-1]:
df_formatted = df_formatted.applymap(formatter.format).astype("float")
df_formatted = df_formatted.astype("float").applymap(
processed_format[-1].format
df_formatted = pd_df_map(df_formatted, formatter.format).astype("float")
df_formatted = pd_df_map(
df_formatted.astype("float"), processed_format[-1].format
)
elif isinstance(processed_format, str):
df_formatted = df_raw.astype("str").applymap(
partial(_format_non_scientific_numbers, format_string=processed_format)
df_formatted = pd_df_map(
df_raw.astype("str"),
partial(_format_non_scientific_numbers, format_string=processed_format),
)
elif callable(processed_format):
df_formatted = df_raw.applymap(processed_format)
df_formatted = pd_df_map(df_raw, processed_format)

# Don't format integers: set to original value
if not format_integers:
integer_locs = df_raw.applymap(_is_integer)
df_formatted[integer_locs] = (
df_raw[integer_locs].astype(float).applymap("{:.0f}".format)
integer_locs = pd_df_map(df_raw, _is_integer)
df_formatted[integer_locs] = pd_df_map(
df_raw[integer_locs].astype(float), "{:.0f}".format
)
return df_formatted

6 changes: 3 additions & 3 deletions src/estimagic/visualization/profile_plot.py
@@ -160,13 +160,13 @@ def create_solution_times(df, runtime_measure, converged_info, return_tidy=True)
problem, algorithm and runtime_measure. The values are either the number
of evaluations or the walltime each algorithm needed to achieve the
desired precision. If the desired precision was not achieved the value is
set to np.inf (for n_evaluations) or 7000 days (for walltime since there
no infinite value is allowed).
set to np.inf.

"""
solution_times = df.groupby(["problem", "algorithm"])[runtime_measure].max()
solution_times = solution_times.unstack()
solution_times[~converged_info] = np.inf
# We convert the dtype to float to support the use of np.inf
solution_times = solution_times.astype(float).where(converged_info, other=np.inf)

if not return_tidy:
solution_times = solution_times.stack().reset_index()
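Assigning `np.inf` into an integer-typed frame triggers pandas 2.x warnings about silent upcasting, hence the explicit float conversion followed by `where`. A minimal sketch with hypothetical data:

```python
import numpy as np
import pandas as pd

solution_times = pd.DataFrame({"algo": [10, 20]})  # integer dtype
converged_info = pd.DataFrame({"algo": [True, False]})
# Convert to float first, then mark non-converged entries as infinite.
result = solution_times.astype(float).where(converged_info, other=np.inf)
# result["algo"] is [10.0, inf] and no upcasting warning is raised.
```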
4 changes: 2 additions & 2 deletions tests/inference/test_bootstrap.py
@@ -65,9 +65,9 @@ def expected():
def seaborn_example():
out = {}

df = sns.load_dataset("exercise", index_col=0)
raw = sns.load_dataset("exercise", index_col=0)
replacements = {"1 min": 1, "15 min": 15, "30 min": 30}
df = df.replace({"time": replacements})
df = raw.assign(time=raw.time.cat.rename_categories(replacements).astype(int))
df["constant"] = 1

lower_ci = pd.Series([90.709236, 0.151193], index=["constant", "time"])
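Recent pandas deprecates the old behavior of `.replace` on categorical columns, so the fixture now renames the categories directly and then casts. A sketch mirroring the change (the data here is made up, not the seaborn exercise dataset):

```python
import pandas as pd

raw = pd.DataFrame({"time": pd.Categorical(["1 min", "15 min", "30 min"])})
replacements = {"1 min": 1, "15 min": 15, "30 min": 30}
# Rename the categories themselves, then cast the column to plain integers.
df = raw.assign(time=raw.time.cat.rename_categories(replacements).astype(int))
assert df["time"].tolist() == [1, 15, 30]
```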