This repository has been archived by the owner on Oct 7, 2024. It is now read-only.

Speedup on Android #1

Open
eyalhoc opened this issue Jul 8, 2019 · 16 comments
Labels
enhancement (New feature or request), help wanted (Extra attention is needed)

Comments

@eyalhoc

eyalhoc commented Jul 8, 2019

I'm running the app on a Samsung S8.
I get about 0.5 fps at the highest quality and around 2 fps at the lowest.
I'm trying to speed it up but have little experience with this.
From what I found in the documentation, the GPU should give about a 2x speedup but only works with tflite.
Does anyone intend to contribute in this direction, or can anyone help with some guidance?

@FilippoAleotti
Owner

FilippoAleotti commented Jul 8, 2019

Hi,
I built a similar application with tflite in the past. Unfortunately, that app was meant to work in "shoot and process" mode, not on a live stream. However, I already plan to add a tflite version as soon as I have some spare time. The main difference is in how the tf engine is invoked, since the API has changed.

@eyalhoc
Author

eyalhoc commented Jul 8, 2019 via email

@FilippoAleotti
Owner

Hi Eyal,
No, I didn't publish that app. In the next few days I'll try to upload a tflite version of this app.
Best regards

@eyalhoc
Author

eyalhoc commented Jul 8, 2019 via email

@eyalhoc
Author

eyalhoc commented Jul 12, 2019 via email

@FilippoAleotti
Owner

FilippoAleotti commented Jul 12, 2019

Hi Eyal,
I tried two days ago, but right now the app crashes (probably due to a bug in the code I added). Unfortunately, I'm quite busy now and have very little time to spend on debugging. If you have time to take a look at it, I can push the buggy tflite version to a separate folder/branch.

@eyalhoc
Author

eyalhoc commented Jul 12, 2019 via email

@FilippoAleotti
Owner

FilippoAleotti commented Jul 12, 2019

I just created a tflite branch. I also uploaded the script I used to export the model.

@eyalhoc
Author

eyalhoc commented Jul 14, 2019

I checked the tflite model and it works for me. On my Galaxy S8, inference takes around 2-2.5 seconds per frame.
I tried to use the GPU and got this error:

java.lang.IllegalArgumentException: Internal error: Failed to apply delegate: WARNING: op code #55 cannot be handled by this delegate.  Only the first 2 ops will run on the GPU, and the remaining 134 on the CPU.GpuDelegate Prepare: ReadValue: value is a constant tensor: 193Node number 136 (GpuDelegate) failed to prepare.
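
As a rough comparison with the frame rates quoted at the top of the thread, the 2-2.5 seconds per frame measured in this comment can be converted to frames per second. This is only an arithmetic sketch; the helper name is hypothetical, and the thread does not say which quality setting the measurement was taken at.

```python
def seconds_per_frame_to_fps(seconds):
    """Convert a per-frame latency in seconds to frames per second."""
    if seconds <= 0:
        raise ValueError("latency must be positive")
    return 1.0 / seconds


# Latency range reported in the comment above (CPU, Galaxy S8).
low = seconds_per_frame_to_fps(2.5)
high = seconds_per_frame_to_fps(2.0)
print(f"{low:.1f}-{high:.1f} fps")  # prints "0.4-0.5 fps"
```

That is roughly the 0.5 fps reported in the opening comment for the highest quality setting.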

@eyalhoc
Author

eyalhoc commented Jul 14, 2019

I updated to:
implementation 'org.tensorflow:tensorflow-lite:0.0.1-gpu-experimental'

Got this error:
java.lang.IllegalArgumentException: Internal error: Failed to apply delegate: Next operations are not supported by GPU delegate:
MAXIMUM: Operation is not supported.
First 2 operations will run on the GPU, and the remaining 134 on the CPU.TfLiteGlDelegate Prepare: ReadValue: value is a constant tensor: 193Node number 136 (TfLiteGlDelegate) failed to prepare.

@FilippoAleotti
Owner

FilippoAleotti commented Jul 15, 2019

Thank you for your support. So the problem is the maximum operation used in leaky_relu. A possible solution is to replace the leaky_relu defined in script/layers.py with the native TensorFlow op tf.nn.leaky_relu, which should be supported by tflite (see https://www.tensorflow.org/lite/guide/ops_compatibility).

If you want to try again, I've already updated the frozen tflite model in the asset folder using the native operation.
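
To illustrate why the swap should not change the network's output, here is a minimal pure-Python sketch. It assumes the custom layer in script/layers.py computes max(x, alpha * x); the thread does not show that code, so this form, the function names, and alpha = 0.2 are assumptions for illustration. In TensorFlow the two variants would correspond to tf.maximum(x, alpha * x), which exports as a MAXIMUM op, and tf.nn.leaky_relu(x, alpha=alpha).

```python
def leaky_relu_via_maximum(x, alpha=0.2):
    # Assumed form of the custom layer: an elementwise maximum,
    # which is what shows up as MAXIMUM in the exported graph.
    return max(x, alpha * x)


def leaky_relu_piecewise(x, alpha=0.2):
    # Textbook definition of leaky ReLU, which a native op implements directly.
    return x if x >= 0 else alpha * x


# For 0 < alpha < 1 the two formulations agree on every input.
for x in [-3.0, -0.5, 0.0, 0.5, 3.0]:
    assert leaky_relu_via_maximum(x) == leaky_relu_piecewise(x)
```

The equivalence only holds for 0 < alpha < 1; for alpha > 1 the maximum would pick the wrong branch.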

FilippoAleotti added the enhancement and help wanted labels on Jul 16, 2019

@FilippoAleotti
Owner

Hi Eyal, do you have any news? I've spent some time debugging but I can't figure out why the app crashes. However, with the native tf.nn.leaky_relu the problem with the maximum operation is gone.

@eyalhoc
Author

eyalhoc commented Jul 23, 2019 via email

@FilippoAleotti
Owner

Thank you for your reply.

About the layers on the GPU: I noticed that too, but it seems strange, since we do not use any unusual operations. For instance, the Core ML model for iOS is generated and runs on the GPU without problems.

About the quality: are you sure you are not using a lower-resolution map (for instance, the quarter-resolution one)?

In any case, if I can help in some way, just ask.

@eyalhoc
Author

eyalhoc commented Jul 23, 2019 via email

@rdcdt1

rdcdt1 commented Sep 1, 2020

Hi,
It's a beautiful project, and it's sad that it runs so slowly on Android.
I looked and the app only uses 1 thread; maybe it would be faster with multithreading.
I found a project that uses multithreading or the GPU (https://github.com/danielecolautti/PyDNetMobileMultithreading). It improves the speed a lot (10 fps), but the depth map is wrong.

3 participants