Replace yolov5 as palm detection network #19

Open · Cursky opened this issue Aug 24, 2022 · 1 comment
Cursky commented Aug 24, 2022

Hello, the hand landmark detection network is excellent: so fast and so accurate. But because it relies on datasets of bare hands without gloves, the results are not satisfactory when the detected hands wear gloves. Since, in the mediapipe keypoint detection pipeline, the first step is to detect the palm position, I want to replace the palm detection model with one that works on gloved palms, and I chose yolov5 as the detection model. Your project is great, but it would be even better if you could explain how to substitute one's own retrained model. I hope you can give me some guidance; I will describe my ideas and the replacement process here.

First:

I retrained a gloved-palm detection model with yolov5. I used yolov5-6.0 (https://github.com/ultralytics/yolov5/releases) and then exported the ONNX model following this OAK document: https://www.oakchina.cn/2022/01/22/yolov5-blob/.
This is the conversion command (I used the yolov5s model):
python export.py --simplify --opset 12 --include onnx --batch-size 1 --imgsz 640 --weights yolov5s.pt

According to the document, we are only interested in the last three convolution layers, so we append a Sigmoid node after each of them:

import onnx

onnx_model = onnx.load("plam.onnx")

# Collect the indices of all Conv nodes; the last three are the detection heads.
conv_indices = []
for i, n in enumerate(onnx_model.graph.node):
    if "Conv" in n.name:
        conv_indices.append(i)

input1, input2, input3 = conv_indices[-3:]

# Append a Sigmoid activation after each of the three detection heads;
# their outputs become the tensors read by the decoding step.
sigmoid1 = onnx.helper.make_node(
    'Sigmoid',
    inputs=[onnx_model.graph.node[input1].output[0]],
    outputs=['output1_yolov5'],
)

sigmoid2 = onnx.helper.make_node(
    'Sigmoid',
    inputs=[onnx_model.graph.node[input2].output[0]],
    outputs=['output2_yolov5'],
)

sigmoid3 = onnx.helper.make_node(
    'Sigmoid',
    inputs=[onnx_model.graph.node[input3].output[0]],
    outputs=['output3_yolov5'],
)

onnx_model.graph.node.append(sigmoid1)
onnx_model.graph.node.append(sigmoid2)
onnx_model.graph.node.append(sigmoid3)

onnx.save(onnx_model, "plams.onnx")
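
For reference, a quick sanity check (a minimal sketch of mine, not part of the OAK guide) to confirm the three Sigmoid nodes were appended and the graph is still valid:

import onnx

m = onnx.load("plams.onnx")
# List the outputs of the Sigmoid nodes; the last three should be
# output1_yolov5, output2_yolov5 and output3_yolov5.
sigmoids = [n for n in m.graph.node if n.op_type == "Sigmoid"]
print([n.output[0] for n in sigmoids][-3:])
# Structural check of the modified graph.
onnx.checker.check_model(m)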

Then I used the online model converter at http://blobconverter.luxonis.com/. This is my command when converting:
[screenshot of the blobconverter settings]

Now I have plams.blob.
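
In case it helps, here is a minimal sketch of the same conversion done with the blobconverter Python package instead of the web UI. The optimizer flags are assumptions taken from the usual OAK yolov5 recipe (scale by 255, reversed channel order, select the three Sigmoid outputs), and the shave count is also an assumption to adjust for your pipeline:

import blobconverter  # pip install blobconverter

blob_path = blobconverter.from_onnx(
    model="plams.onnx",
    data_type="FP16",
    shaves=6,  # assumption: tune to your DepthAI pipeline
    optimizer_params=[
        "--reverse_input_channels",            # reverse the channel order
        "--scale_values=[255.0,255.0,255.0]",  # inputs normalized to [0,1]
        "--output=output1_yolov5,output2_yolov5,output3_yolov5",
    ],
)
print(blob_path)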

geaxgx (Owner) commented Aug 24, 2022

Please note that the mediapipe palm detection model does more than find a bounding box around the palm. It also finds keypoints in the palm that are used to compute the rotated rectangle around the whole hand. This rotated rectangle (actually a square) is bigger than the initial bounding box but, most importantly, is oriented so that the wrist keypoint is always on the lower side. This is what the landmark model expects as input. You can visualize these keypoints by running ./demo.py --no_lm and then pressing the key "2".
If the hands presented to the camera always have the same orientation (for instance when you raise your hands vertically), you can hardcode the rotation that brings the hand into the orientation expected by the landmark model, as sketched below (in the case of a vertically raised hand, a rotation is not even needed since the hand is already correctly oriented).
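
To make that concrete, here is a minimal sketch (not this repo's actual code; the function name and the 2.6 enlargement factor are assumptions, the latter borrowed from typical mediapipe settings) of turning a yolov5 palm box into the square region the landmark model expects, in the vertical-hand case where no rotation is needed:

def palm_box_to_hand_square(x1, y1, x2, y2, scale=2.6):
    # Center of the detected palm box.
    cx, cy = (x1 + x2) / 2.0, (y1 + y2) / 2.0
    # Enlarge the region: the whole hand is bigger than the palm alone.
    half = max(x2 - x1, y2 - y1) * scale / 2.0
    # With a vertically raised hand the wrist is already on the lower side,
    # so the square can stay axis-aligned (rotation angle = 0).
    return (cx - half, cy - half, cx + half, cy + half)

For any other fixed hand orientation, you would rotate this square around (cx, cy) by your known, hardcoded angle before cropping and feeding the landmark model.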

Another point I want to mention: this repo is currently not using the latest version of the palm detection model. There are two more recent versions (lite and full) of palm detection that I am not using because, during my tests, I did not find any noticeable accuracy improvement and the new models were slower (PINTO0309/tflite2tensorflow#19 (comment)).
I suspect google used the same dataset to train their models, so I wouldn't expect much improvement for the detection of hands with gloves, but maybe it is worth a try.

From your tests, when wearing gloves, once the palm is detected, does the landmark model do a good job?
