[ GPU/OpenCL ] Split kernel registration from forwarding method #2785
base: main
📝 TAOS-CI Version: 1.5.20200925. Thank you for submitting PR #2785. Please follow the 1commit/1PR (one commit per PR) policy to get comments quickly from reviewers. Your PR must pass all verification processes of cibot before reviewers start the review process. If you are a new member of this project, please read the manuals in the documentation folder and the wiki page. To monitor the progress of your PR in more detail, visit http://ci.nnstreamer.ai/.
cibot: @EunjuYang, nntrainer/layers/cl_layers/concat_cl.h does not include Doxygen tags such as @file @brief @author @bug. You must include the Doxygen tags in the source code. Please refer to a Doxygen manual at http://github.com/nnstreamer/TAOS-CI/blob/main/ci/doc/doxygen-documentation.md
f43b253 to d48da33
cibot: @EunjuYang, nntrainer/layers/cl_layers/layer_impl_cl.h does not include Doxygen tags such as @file @brief @author @bug. You must include the Doxygen tags in the source code. Please refer to a Doxygen manual at http://github.com/nnstreamer/TAOS-CI/blob/main/ci/doc/doxygen-documentation.md
@EunjuYang, 💯 All CI checkers are successfully verified. Thanks.
d48da33 to bff721b
@EunjuYang, 💯 All CI checkers are successfully verified. Thanks.
bff721b to 4354f53
@EunjuYang, 💯 All CI checkers are successfully verified. Thanks.
Nice work. LGTM!
    << "OpenCL Error: Fail to register concat_cl_axis3_fp16 kernel";
  layer_kernel_ptrs.emplace_back(kernel_concat_ptr);

  return true;
Quick question: can ConcatLayerCl::registerClKernels() be called twice?
If it were called a second time, would it throw a runtime error or return true?
I assumed it is only called once in add_default_object, which is called by registerer; the registerer is called once. However, it seems better to check it. I will update it.
ClContext &ClContext::Global() {
static ClContext instance;
// initializing commandqueue and context
bool result = instance.clInit();
if (!result) {
ml_loge("cl_context: opencl command queue creation failed");
}
/// in g++ there is a bug that hangs up if caller throws,
/// so registerer is noexcept although it'd better not
/// https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70298
std::call_once(global_cl_context_init_flag, registerer, std::ref(instance));
return instance;
}
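The once-only behavior discussed above can be sketched in isolation. This is a minimal illustration, not the project's code: `DemoContext`, `registerCount()`, and the counter are hypothetical names introduced here, and the `std::once_flag` is kept function-local for brevity, whereas the quoted code uses `global_cl_context_init_flag`.

```cpp
#include <functional>
#include <mutex>

// Hypothetical stand-in for ClContext; only the once-only pattern is shown.
class DemoContext {
public:
  static DemoContext &Global() {
    static DemoContext instance; // constructed once (thread-safe since C++11)
    static std::once_flag init_flag;
    // registerer runs on the first call only; later calls skip it.
    std::call_once(init_flag, registerer, std::ref(instance));
    return instance;
  }

  // Number of times registerer has actually run.
  int registerCount() const { return register_count_; }

private:
  DemoContext() = default;

  // noexcept mirrors the note in the quoted code about throwing callers.
  static void registerer(DemoContext &ctx) noexcept { ++ctx.register_count_; }

  int register_count_ = 0;
};
```

Calling `DemoContext::Global()` repeatedly returns the same instance and leaves the counter at one, which is the property the reply relies on when it says the registerer is called once.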
I added a condition to check it in the registerClKernels() as well.
if (!layer_kernel_ptrs.empty())
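A guard of this kind might look roughly like the sketch below. Everything except the `layer_kernel_ptrs` name and the `registerClKernels()` method name is an assumption made for illustration: `DemoKernel`, `DemoLayerCl`, `createKernel()`, `kernelCount()`, and the kernel names are hypothetical, and the choice to clear partial results on failure is one possible policy, not necessarily what the PR does.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical kernel handle; stands in for the real OpenCL kernel wrapper.
struct DemoKernel {
  std::string name;
};

class DemoLayerCl {
public:
  // Returns true only if every kernel is registered (or already was).
  static bool registerClKernels() {
    // Guard: a repeated call is a no-op that reports success.
    if (!layer_kernel_ptrs.empty())
      return true;

    const char *names[] = {"demo_kernel_a", "demo_kernel_b", "demo_kernel_c"};
    for (const char *name : names) {
      auto kernel = createKernel(name);
      if (!kernel) {
        // Drop partial results so a later retry starts from a clean state.
        layer_kernel_ptrs.clear();
        return false;
      }
      layer_kernel_ptrs.emplace_back(std::move(kernel));
    }
    return true;
  }

  static std::size_t kernelCount() { return layer_kernel_ptrs.size(); }

private:
  // Hypothetical factory; it always succeeds in this sketch.
  static std::shared_ptr<DemoKernel> createKernel(const std::string &name) {
    return std::make_shared<DemoKernel>(DemoKernel{name});
  }

  static std::vector<std::shared_ptr<DemoKernel>> layer_kernel_ptrs;
};

std::vector<std::shared_ptr<DemoKernel>> DemoLayerCl::layer_kernel_ptrs;
```

With this shape, a second call to `registerClKernels()` returns true without adding duplicate entries, which answers the reviewer's question for the guarded case.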
- This commit is a draft.
- This commit splits kernel registration from the forwarding function.
- This is WIP. This commit contains example updates for concat_cl and fc_layer_cl.

Self evaluation:
Build test: [X]Passed [ ]Failed [ ]Skipped
Run test: [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: Eunju Yang <[email protected]>
- This commit updates reshape_cl.cpp/.h to inherit LayerImplCl.
- This commit implements registerClKernels(), which is called in context_cl.cpp.
- Update fc_layer_cl.h (removing a redundant variable).
- Update register_kernels to return true only when all kernels are successfully registered.
- Add a conditional check for whether a kernel is already registered.

Self evaluation:
Build test: [X]Passed [ ]Failed [ ]Skipped
Run test: [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: Eunju Yang <[email protected]>
- This commit fixes an fp16-related bug in concat_cl.cpp.
- Add the `ENABLE_FP16` condition.
- Update __fp16 to _FP16.

Signed-off-by: Eunju Yang <[email protected]>
4354f53 to da18596
6dc250f to e0bc0d8
@EunjuYang, 💯 All CI checkers are successfully verified. Thanks.
- clang-format is applied.
- Revert Android.mk.
- Fix a bug in registerClKernels.

Signed-off-by: Eunju Yang <[email protected]>
e0bc0d8 to da83297
@EunjuYang, 💯 All CI checkers are successfully verified. Thanks.
Thank you for the hard work! 👍
  int dim = int(input1_batch_size * input1_width * input1_height *
                (input1_channels + input2_channels));
  int dim = int(input1_batch_size * input1_channels * input1_width *
                (input1_height + input2_height));

  opencl::Buffer inputA(cl_context_ref.context_inst_,
How about changing opencl::Buffer to take the Tensor itself? Then the constructor can call clCreateBuffer according to the tensor's data type, and we do not need to consider the type here.
Buffer::Buffer(ContextManager &context_manager, int size_in_bytes,
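One way to read the suggestion above is that the buffer derives its byte size from the tensor's element type, so call sites no longer compute it. The sketch below only illustrates that idea under stated assumptions: `DemoDataType`, `DemoTensor`, `elementSize()`, and `DemoBuffer` are hypothetical and much simpler than the real Tensor and opencl::Buffer, and no OpenCL call is made.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical, simplified types; the real Tensor and opencl::Buffer differ.
enum class DemoDataType { FP32, FP16 };

struct DemoTensor {
  std::size_t num_elements;
  DemoDataType type;
};

// Element width in bytes for each supported type.
inline std::size_t elementSize(DemoDataType type) {
  switch (type) {
  case DemoDataType::FP16:
    return sizeof(std::uint16_t); // half precision is stored in 2 bytes
  case DemoDataType::FP32:
    return sizeof(float);
  }
  return 0; // not reached for valid enum values
}

class DemoBuffer {
public:
  // A real implementation would pass size_in_bytes_ to clCreateBuffer here.
  explicit DemoBuffer(const DemoTensor &tensor)
    : size_in_bytes_(tensor.num_elements * elementSize(tensor.type)) {}

  std::size_t sizeInBytes() const { return size_in_bytes_; }

private:
  std::size_t size_in_bytes_;
};
```

A caller would then write `DemoBuffer buf(tensor);` and leave the fp16 versus fp32 distinction to the constructor.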
concat_cl, reshape_cl, and fc_layer_cl only.

Self evaluation:
Build test: [X]Passed [ ]Failed [ ]Skipped
Run test: [X]Passed [ ]Failed [ ]Skipped