Asynchronous Render List Generation and Frame-Independent Task Scheduling #2887

Open: 55 commits to be merged into base branch dev

douira (Collaborator) commented Nov 22, 2024

Uses an octree to generate render lists independently of the slow graph search, which now runs asynchronously.

  • The occlusion culler runs in a separate thread, which greatly speeds up render list generation. Correct culling results are ensured by combining different types of BFS; synchronous occlusion culling is used as a last resort if the camera teleports or moves extremely quickly.
  • Tasks are ordered by a combined score of how long they've been pending, their distance from the camera, their type, and whether they're visible in the camera frustum. The number of tasks that can be scheduled is now independent of the frame rate and is instead limited by their estimated duration and size.
  • An upload limit ensures that the submitted tasks don't exceed the upload buffer's size. This isn't a hard limit, and this PR doesn't implement a new way of handling task buffers, to avoid expanding the scope too far.
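
For readers who want a concrete picture of the scoring and the soft limits described above, here is a minimal sketch. It is not the PR's code: the class and field names, the weights, and the use of a single `important` flag as a stand-in for the task type are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Illustrative sketch only; names and weights are hypothetical, not taken from the PR.
class TaskSchedulingSketch {
    static final class PendingTask {
        final String name;
        final double distance;      // distance from the camera
        final long pendingMillis;   // how long the task has been waiting
        final boolean important;    // stand-in for the task type (assumption)
        final boolean inFrustum;    // whether the section is in the camera frustum
        final long estimatedNanos;  // estimated task duration
        final long estimatedBytes;  // estimated upload size

        PendingTask(String name, double distance, long pendingMillis, boolean important,
                    boolean inFrustum, long estimatedNanos, long estimatedBytes) {
            this.name = name;
            this.distance = distance;
            this.pendingMillis = pendingMillis;
            this.important = important;
            this.inFrustum = inFrustum;
            this.estimatedNanos = estimatedNanos;
            this.estimatedBytes = estimatedBytes;
        }
    }

    // Lower score means the task is scheduled earlier. The weights are made up.
    static double score(PendingTask t) {
        double score = t.distance;
        score -= t.pendingMillis * 0.01;   // tasks that waited longer move forward
        if (t.important) score -= 32.0;    // some task types get a head start
        if (!t.inFrustum) score += 64.0;   // sections outside the frustum are deferred
        return score;
    }

    // Picks tasks in score order until the duration or upload budget would be exceeded.
    // The limit is soft: the first task is always accepted, even if it exceeds a budget.
    static List<PendingTask> schedule(List<PendingTask> pending,
                                      long durationBudgetNanos, long uploadBudgetBytes) {
        List<PendingTask> sorted = new ArrayList<>(pending);
        sorted.sort(Comparator.comparingDouble(TaskSchedulingSketch::score));

        List<PendingTask> scheduled = new ArrayList<>();
        long duration = 0;
        long upload = 0;
        for (PendingTask t : sorted) {
            boolean overBudget = duration + t.estimatedNanos > durationBudgetNanos
                    || upload + t.estimatedBytes > uploadBudgetBytes;
            if (!scheduled.isEmpty() && overBudget) {
                break;
            }
            scheduled.add(t);
            duration += t.estimatedNanos;
            upload += t.estimatedBytes;
        }
        return scheduled;
    }
}
```

Because the budgets are expressed in estimated time and bytes rather than in tasks per frame, the amount of work picked does not depend on how often `schedule` is called.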

Testing has not shown regressions. Frame rate generally improves a little on systems that were not limited by render list generation, and a lot on systems that were (see testing thread).

Combining this with #2886 will avoid even more of the asynchronous work, since with this PR a graph search isn't necessary to schedule tasks. (The two PRs will have merge conflicts, but these should be easy to resolve.)

Companion PR in Iris: IrisShaders/Iris#2539

Makes #2780 unnecessary, since the octree traversal is correctly ordered already.
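
To illustrate why an octree traversal can come out correctly ordered without a separate sorting step, here is a small sketch of one common technique: at every node, the child octant containing the camera is visited first, and the remaining children are visited in the order given by XOR-ing the loop index with that octant's index. The description above does not say how the PR achieves its ordering, so this is an assumption for illustration; the `Node` class and its methods are hypothetical, and frustum and occlusion tests are left out.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; not the data structure used in the PR.
class OctreeTraversalSketch {
    // A node covers a cube of `size` sections per axis starting at (x, y, z).
    // `size` must be a power of two; a node with size 1 is a single section.
    static final class Node {
        final int x, y, z, size;
        final Node[] children = new Node[8]; // null means the octant is empty

        Node(int x, int y, int z, int size) {
            this.x = x;
            this.y = y;
            this.z = z;
            this.size = size;
        }

        // Octant index of a point relative to this node's center: bit 0 = x, bit 1 = y, bit 2 = z.
        private int octantOf(double px, double py, double pz) {
            int half = size / 2;
            return (px >= x + half ? 1 : 0) | (py >= y + half ? 2 : 0) | (pz >= z + half ? 4 : 0);
        }

        // Adds the section at (sx, sy, sz), which must lie inside this node.
        void insert(int sx, int sy, int sz) {
            if (size == 1) {
                return; // the leaf's existence marks the section as present
            }
            int half = size / 2;
            int index = octantOf(sx, sy, sz);
            if (children[index] == null) {
                children[index] = new Node(
                        x + ((index & 1) != 0 ? half : 0),
                        y + ((index & 2) != 0 ? half : 0),
                        z + ((index & 4) != 0 ? half : 0),
                        half);
            }
            children[index].insert(sx, sy, sz);
        }

        // Appends all present sections to `out`, nearest octants first.
        void collect(double camX, double camY, double camZ, List<int[]> out) {
            if (size == 1) {
                out.add(new int[] {x, y, z});
                return;
            }
            int nearest = octantOf(camX, camY, camZ);
            // i ^ nearest visits the camera's octant first and the opposite octant last;
            // an octant is never visited before one that lies between it and the camera.
            for (int i = 0; i < 8; i++) {
                Node child = children[i ^ nearest];
                if (child != null) {
                    child.collect(camX, camY, camZ, out);
                }
            }
        }
    }
}
```

With sections at x = 0, 1 and 3 in a tree of size 4, a camera at x = 3.5 yields them in the order 3, 1, 0, and a camera at x = 0.5 yields 0, 1, 3, in both cases without sorting the result.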

…led yet, it just does the tree frustum test right after each bfs for testing purposes atm)
…g RenderSection objects, improve tree render list generation performance
…trees of varying accuracy to present as few sections as possible while not generating any errors when the camera is in motion.
…ssues when the distance changes (for example under water)
- rebuild tasks are scheduled in a queue and pruned each frame
- async culling tasks and results are classes in their own package
- chunk rebuild tasks are prioritized based on their distance to the camera, their type, how long the task has been pending, and whether the section is currently visible (in the frustum)
- generally cleaned up the update method in RSM
…the frame rate and section count when the world updates frequently
…plicity.

also changed the frustum test to be before adding a section to the queue, and not after.
from my measurements by looking at how long the OcclusionCuller call inside FrustumCullTask takes, this has no impact on performance.
…ta all at once instead of for each section separately
… variants of,

add support for very tall worlds
…estimation, limit upload size based on previous mesh task result size or an estimate of it,

the limit behavior changes depending on which type of upload buffer is used
… not automatically correctly ordered since they're not coming from a tree
Labels
T-enhancement Type: Enhancement