Function 'builtins.int' is not supported? #20
Comments
I solved this problem with a silly-looking piece of code. Instead of comparing str to int, I had used int(str) to convert str to int, which, of course, is not supported! I have now tried the following modification, which uses a str-to-str comparison:

from sklearn.pipeline import make_pipeline
from sklearn2pmml.preprocessing import CastTransformer, DaysSinceYearTransformer, ExpressionTransformer

def make_modify_date_pipeline():
    return make_pipeline(
        # Insert date separators; empty or too-late values fall back to 2022-12-30
        ExpressionTransformer("X[0][:4] + '-' + X[0][4:6] + '-' + X[0][6:8] if len(X[0]) > 0 and X[0][0:8] < '20221230' else '2022-12-30'"),
        CastTransformer(dtype = "datetime64[D]"),
        DaysSinceYearTransformer(year = 2022)
    )

Of course, I'm still curious about how to gracefully convert to int to accomplish this task! |
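To check the expression outside of a pipeline, its string logic can be mirrored in plain Python. This is only a sketch: the function name `modify_date` is hypothetical and not part of sklearn2pmml, and it assumes the input is meant to be an eight-digit `YYYYMMDD` string.

```python
def modify_date(value: str) -> str:
    """Plain-Python mirror of the expression passed to ExpressionTransformer."""
    cutoff = "20221230"
    # Lexicographic comparison: it matches date order only when both
    # strings are fixed-width, zero-padded digit strings.
    if len(value) > 0 and value[0:8] < cutoff:
        return value[:4] + "-" + value[4:6] + "-" + value[6:8]
    # Empty input, or a date on or after the cutoff, falls back to the default.
    return "2022-12-30"
```

Calling `modify_date("20220115")` exercises the formatting branch, while `modify_date("")` and `modify_date("20230101")` exercise the fallback.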
The expression translator component of the JPMML-Python library currently does not support inline Python/Numpy/Pandas/Scipy functions whose translation would necessitate creating an external function definition (in the form of one or more [...]). For example, it's possible to use [...]. However, there is currently no way to represent type casts using the [...] |
This issue is about a component that lives inside the JPMML-Python library, so moving it there. |
As a quick workaround for the example workflow, I would suggest a two-step approach, where the incoming user input is first kept as [...]

from sklearn.pipeline import make_pipeline
from sklearn2pmml.preprocessing import CastTransformer, ExpressionTransformer

def make_modify_date_pipeline():
    # Step 1: sanitize the raw integer; missing or out-of-range values fall back to 20221230
    int_sanitizer = ExpressionTransformer("X[0] if (pandas.notnull(X[0]) and X[0] > 0 and X[0] < 20221230) else 20221230")
    # Step 2: cast the integer to a string, then insert the date separators
    int2string_caster = CastTransformer(dtype = str)
    str_sanitizer = ExpressionTransformer("X[0][:4] + '-' + X[0][4:6] + '-' + X[0][6:8]")
    return make_pipeline(int_sanitizer, int2string_caster, str_sanitizer)

Alternatively, it would be possible to replace the [...]

int_sanitizer = ContinuousDomain(missing_value_treatment = "as_value", missing_value_replacement = 20221230, invalid_value_treatment = "as_missing", outlier_treatment = "as_missing", low_value = 0, high_value = 20221230) |
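The sanitize-then-cast idea can be tried out without a pipeline. The following plain-Python sketch mirrors the three transformers step by step; the helper names (`sanitize_int`, `format_date`, `modify_date`) are hypothetical, and the missing-value check is an assumed stand-in for `pandas.notnull`.

```python
import math

CUTOFF = 20221230  # same constant as in the expressions above


def sanitize_int(value):
    """Step 1: keep integers strictly between 0 and CUTOFF, else use CUTOFF."""
    # Assumed stand-in for pandas.notnull: treat None and NaN as missing.
    is_missing = value is None or (isinstance(value, float) and math.isnan(value))
    if not is_missing and 0 < value < CUTOFF:
        return int(value)
    return CUTOFF


def format_date(text):
    """Step 2 (after the cast to str): insert the date separators."""
    return text[:4] + "-" + text[4:6] + "-" + text[6:8]


def modify_date(value):
    """Chain the steps in the same order as the pipeline."""
    return format_date(str(sanitize_int(value)))
```

One thing this sketch makes visible: a small positive value such as `5` passes the range check and then formats poorly, so a tighter lower bound may be worth considering if such inputs can occur.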
My intuition is that string-to-string comparison should NOT be allowed here. The natural operational type of strings is categorical, and the main characteristic of categorical values is that they are unordered. Therefore, numeric-like comparisons should not be allowed. I personally think that string-to-string comparisons (within a Python expression) should raise an error. Maybe if the Python side allows such a "hack", then the (J)PMML side should be way more strict here, in order to ensure that there will be no surprises during model deployment. |
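The kind of surprise at stake can be illustrated in plain Python, where strings compare lexicographically. This is a sketch of Python behavior only, and the helper name `orders_agree` is hypothetical; it says nothing about how a PMML engine would treat the same comparison.

```python
def orders_agree(a: str, b: str) -> bool:
    """Return True if comparing a and b as strings gives the same
    result as comparing them as integers."""
    return (a < b) == (int(a) < int(b))


# Fixed-width, zero-padded digit strings: both orders agree.
same_width = orders_agree("20220115", "20221230")

# Different widths: "9" sorts after "2...", although 9 is the smaller number.
different_width = orders_agree("9", "20221230")
```

Here `same_width` is `True` and `different_width` is `False`, which is why the str-to-str workaround only behaves like a date comparison while every input has exactly eight digits.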
Hello Villu,
I am having problems with the sklearn2pmml conversion.
It seems that the int in the following code is not being used correctly.
Of course, we've talked about this before, and you gave tips on using CastTransformer.
I have upgraded to the latest sklearn2pmml version. Do you mean that I should change the sklearn version? (That is impossible, because I am working on the company's notebook and I am not allowed to change the sklearn version!)
So, is there any other way to complete this operation? The reason I ask is that I cannot compare str with int, so int is needed. In pure Python I would have many ways to solve this, but in a pipeline I don't know how to handle it!