Do you have a plan to optimize the leaky_relu and tanh ops for TFLM? #527
Comments
We are in the process of enabling the kernel below for TFLM on the Himax WE1 board, as one of our NN models uses the Tanh kernel heavily.
@mfarag13, @JaccovG, @Hakim7267 Please comment on this.
Hi @JaccovG,

Here is the output of one test from the tflm leaky_relu_test.cc:

```
Entering prepare of is_mli_applicable
fixed 8: 0x40  tensor->data.int8: 0x40
Exiting prepare of is_mli_applicable
Entering EvalMLI
params->alpha: 1.0*2^-1
Exiting EvalMLI
expected_data[i] (1.0*2^0) near output_data[i] (1.5999993*2^1) failed at examples/kernel_add_test/add_test.cc:103
```
@JaccovG Could you please review the patch and let us know your inputs.
@JaccovG Gentle reminder!
I'm not able to access the Google Drive link. Could you share it as a GitHub commit, or as a PR?
@JaccovG
@JaccovG Gentle reminder!
Sorry for my late reply, I was very busy.
@JaccovG I have created the q7 format for the slope tensor; if this is wrong, could you please suggest a fix.
I couldn't quickly find how you did the conversion. It is fine to use the q7 format as long as you shift the mantissa to match an exponent of 7. So what you need to check is whether the slope value is correctly converted to the fixed-point value.
@JaccovG Please refer here for the conversion part.
@JaccovG Also, please refer to the code below to see how the slope_tensor is constructed.
We are using the Himax board to run our custom model, which uses the leaky_relu and tanh ops, on the ARC processor. Currently we run on TFLM with the C reference code, and inference takes a lot of cycles, so could you please accelerate these ops in TFLM?