-
Notifications
You must be signed in to change notification settings - Fork 3.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Instanced BoundingBoxRenderer #15911
base: master
Are you sure you want to change the base?
Conversation
## Background Currently in BoundingBoxRenderer, it renders in a loop, calling `engine.drawElementsType` for each bounding box in `renderList`, making every bounding box 1 or 2 [draw call](https://doc.babylonjs.com/features/featuresDeepDive/scene/optimize_your_scene/#reducing-draw-calls). Since it renders on web, it would suffer the same performance issue like meshes when draw calls increases. ## Proposal Since only the `worldMatrix` changes for different bounding boxes, it should benefit from instancing, like [thin instances](https://doc.babylonjs.com/features/featuresDeepDive/mesh/copies/thinInstances) for meshes. 1. Allocate buffers for instancing. 2. Loop renderList and fill buffers. 3. Copy buffers to gpu. 4. Render instanced. With instancing the draw call needed for each BoundingBoxRenderer could be reduced to 1 or 2 (in case `this.showBackLines` enabled) Example: https://playground.babylonjs.com/#37HN68#4 ## Alternatives Since there are very few[1] vertex needed, and there is already [`vectorsWorld`](https://github.com/BabylonJS/Babylon.js/blob/73ab4d666c0506a5fe551ce852a23e0e8ec0d15a/packages/dev/core/src/Culling/boundingBox.ts#L43) in `BoundingBox`, which [updates](https://github.com/BabylonJS/Babylon.js/blob/73ab4d666c0506a5fe551ce852a23e0e8ec0d15a/packages/dev/core/src/Culling/boundingBox.ts#L166) every time world matrix updates, the computation of matrices can be skipped, and reuse computation result of `vectorsWorld` to rebuild vertex buffer every time it renders. In this case `_indexBuffer` needs to be reconstructed is count of bounding box mismatches. [1]: 24 points in case of `CreateBoxVertexData`, or 8 points if constructed manually, since uvs and normals are not needed for rendering as lines ## Compatibility 1. It's disabled by default for backwards compatibility. 2. It could result in a difference in rendering result if `showBackLines` enabled, because drawing the black/white part of each box one after the other can be different from drawing the black part of all boxes and then the white part. 3. Events of `onBeforeBoxRenderingObservable` and `onAfterBoxRenderingObservable` would only be triggered once for one rendering, instead of once every bounding box. Events would be triggered with a dummy box to keep backwards compatibility. ## References Forum post: <https://forum.babylonjs.com/t/batched-or-instanced-boundingboxrenderer/54977>
Please make sure to label your PR with "bug", "new feature" or "breaking change" label(s). |
Snapshot stored with reference name: Test environment: To test a playground add it to the URL, for example: https://snapshots-cvgtc2eugrd3cgfd.z01.azurefd.net/refs/pull/15911/merge/index.html#WGZLGJ#4600 Links to test babylon tools with this snapshot: https://playground.babylonjs.com/?snapshot=refs/pull/15911/merge To test the snapshot in the playground with a playground ID add it after the snapshot query string: https://playground.babylonjs.com/?snapshot=refs/pull/15911/merge#BCU1XR#0 |
Visualization tests for WebGPU (Experimental) |
WebGL2 visualization test reporter: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great work!
* If true, the buffer would be updated instead of recreated when rendering a frame, | ||
* this could reduce memory pressure. | ||
*/ | ||
public updatable = true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we need this variable? Just keep the "updatable=true" case in the code, as I can't see why a user would want to set updatable=false
(recreating the vertex buffer every frame will likely always be slower than updating it).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The docs of Mesh said that using static buffer could have better performances, but in recent tests on a chrome, recreating it leads to more memory pressure, so this is kept and defaults to true
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, performance is better when static=true
if you create the buffers once and reuse them. But in your case, you recreate the buffers every frame when updatable=false
. So even if the buffers are created with static=false
, it's probably slower than creating the buffers once with static=false
and updating them afterwards. You should run some tests to make sure, as we don't want to keep a code path that will never be used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, I'll run some profiling on chrome and firefox, but I does not have access to safari profiler. Also, since the buffer would be recreated when size change or useInstances
change, the buffer recreation code is needed, so there is few extra code for this option.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Summary:
Configuration | Browser | _renderInstanced | Heap |
---|---|---|---|
updatable=true |
Chrome | 65.1% | 102 MB |
Firefox | 69% | 156.33 MB | |
updatable=false |
Chrome | 69.0% | 103 MB |
Firefox | 72% | 154.20 MB |
Details
updatable
=true
:
Chrome:
_renderInstanced: 65.1%
Heap: 102MB
Firefox:
_renderInstanced: 69%
Heap: 156.33MB
updatable
=false
:
Chrome:
_renderInstanced: 69.0%
Heap: 103MB
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let s not make upadtable a configuration and optimize for the most common case as it adds extra complexity and maintenance.
Apply suggestions from code review Co-authored-by: Popov72 <[email protected]>
@@ -90,6 +91,15 @@ Object.defineProperty(AbstractMesh.prototype, "showBoundingBox", { | |||
configurable: true, | |||
}); | |||
|
|||
const mat1 = Matrix.Identity(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we rename mat1 vec and vec2 to better let the devs know they are temp values ? also whey not using babylon TMP values MathTmp.Vector3[0]
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Extra work needs to be done to evaluate that the use of TmpVectors
does not conflict with other internal uses, maybe in getWorldMatrix
and other places, also, MathTmp
is not exported.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NP, so lets rename only
* If true, the buffer would be updated instead of recreated when rendering a frame, | ||
* this could reduce memory pressure. | ||
*/ | ||
public updatable = true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let s not make upadtable a configuration and optimize for the most common case as it adds extra complexity and maintenance.
Background
Currently in BoundingBoxRenderer, it renders in a loop, calling
engine.drawElementsType
for each bounding box inrenderList
, making every bounding box 1 or 2 draw call.Since it renders on web, it would suffer the same performance issue like meshes when draw calls increases.
Proposal
Since only the
worldMatrix
changes for different bounding boxes, it should benefit from instancing, like thin instances for meshes.With instancing the draw call needed for each BoundingBoxRenderer could be reduced to 1 or 2 (in case
this.showBackLines
enabled)Example:
https://playground.babylonjs.com/#37HN68#14
Alternatives
Since there are very few[1] vertex needed, and there is already
vectorsWorld
inBoundingBox
, which updates every time world matrix updates, the computation of matrices can be skipped, and reuse computation result ofvectorsWorld
to rebuild vertex buffer every time it renders. In this case_indexBuffer
needs to be reconstructed is count of bounding box mismatches.[1]: 24 points in case of
CreateBoxVertexData
, or 8 points if constructed manually, since uvs and normals are not needed for rendering as linesCompatibility
showBackLines
enabled, because drawing the black/white part of each box one after the other can be different from drawing the black part of all boxes and then the white part.onBeforeBoxRenderingObservable
andonAfterBoxRenderingObservable
would only be triggered oncefor one rendering, instead of once every bounding box. Events would be triggered with a dummy box to keep backwards compatibility.
References
Forum post: https://forum.babylonjs.com/t/batched-or-instanced-boundingboxrenderer/54977