Memoryless tiler #477

Merged
merged 4 commits into from
Nov 21, 2024

eugenegff
Member

RSC_TILER + TextureFlags::Tiler[Depth]Memoryless => MTLStorageModeMemoryless and VK_IMAGE_USAGE_TRANSIENT_ATTACHMENT_BIT
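The mapping in the description can be pictured with a small sketch. It is only an illustration under stated assumptions: `StorageMode`, `SketchFlags` and `chooseStorageMode` are hypothetical stand-ins invented here, not the OgreNext, Metal or Vulkan declarations, and the actual PR code may decide differently.

```cpp
#include <cstdint>

// Hypothetical stand-ins for illustration only. These are NOT the real
// OgreNext, Metal or Vulkan declarations.
enum class StorageMode
{
    Private,    // regular GPU memory backs the texture
    Memoryless  // contents live only in on-chip tile memory
};

namespace SketchFlags
{
    constexpr std::uint32_t RenderToTexture = 1u << 0u;
    constexpr std::uint32_t TilerMemoryless = 1u << 1u;
}  // namespace SketchFlags

// Memoryless storage is picked only when the GPU is a tiler AND the
// texture explicitly requested it. Everything else keeps regular storage.
constexpr StorageMode chooseStorageMode( bool isTiler, std::uint32_t textureFlags )
{
    return ( isTiler && ( textureFlags & SketchFlags::TilerMemoryless ) != 0u )
               ? StorageMode::Memoryless
               : StorageMode::Private;
}
```

On a real backend the `Memoryless` result would correspond to MTLStorageModeMemoryless on Metal and to adding VK_IMAGE_USAGE_TRANSIENT_ATTACHMENT_BIT on Vulkan, as the description states.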

eugenegff and others added 2 commits November 16, 2024 01:40
@eugenegff eugenegff marked this pull request as draft November 16, 2024 09:31
@dyunchik
Contributor

First of all, we should be able to globally disable the use of memoryless storage, because some very heavy scenes can trigger an out-of-tile-memory error and simply not render anything (we had such cases on Metal in Live Home 3D). For this purpose we introduced the "WindowMemoryless" param, which is almost the same as "memoryless_depth_buffer". But if "WindowMemoryless"/"memoryless_depth_buffer" is false, then we have to remove MTLStorageModeMemoryless from the depth buffer and from the MSAA buffer as well. That way Metal is able to split rendering into several passes with intermediate results stored in these buffers, and Metal does do this. I don't know about Vulkan, but I think it behaves the same way.

@dyunchik
Contributor

dyunchik commented Nov 16, 2024

Second, we should be able to use memoryless storage for the depth buffer alone when we render to an MSAA texture. For this purpose we need something like TextureFlags::TilerDepthMemoryless.
For example:

compositor_node PlanarReflectionsFSAAReflectiveRenderingNode
{
    in 0 rt_renderwindow
    in 1 planarReflect_msaa_rtt

    target planarReflect_msaa_rtt
    {
        pass render_scene
        {
            load
            {
                all             clear
                clear_colour    0.2 0.4 0.6 1
            }
            store
            {
                colour  store_or_resolve
                depth   dont_care
                stencil dont_care
            }

            overlays        off
            visibility_mask 0xfffffffe

            rq_first    0
            rq_last     250
        }
    }

    target rt_renderwindow
    {
        pass render_quad
        {
            load { all dont_care }
            material Ogre/Copy/4xFP32
            input 0 planarReflect_msaa_rtt
        }

        pass generate_mipmaps
        {
            mipmap_method api_default
        }
    }
}

Here we use an ordinary MSAA texture, planarReflect_msaa_rtt, together with a memoryless depth buffer. The texture bound to the planarReflect_msaa_rtt target is created with the following code:

// root, orientationMode, resolution and sampleDesc are defined elsewhere in the application.
Ogre::RenderSystem *rs = root->getRenderSystem();
if( rs )
{
    Ogre::TextureGpuManager *textureGpuManager = rs->getTextureGpuManager();
    if( textureGpuManager )
    {
        // Only request memoryless depth when the GPU is a tiler.
        const RenderSystemCapabilities *capabilities = rs->getCapabilities();
        const bool isTiler = capabilities->hasCapability( RSC_IS_TILER );
        const size_t uniqueId = Id::generateNewId<CSceneManager>();
        Ogre::PixelFormatGpu pixelFormat = Ogre::PFG_BGRA8_UNORM_SRGB;
        uint32 textureFlags = TextureFlags::RenderToTexture;
        if( isTiler )
            textureFlags |= TextureFlags::TilerDepthMemoryless;
        // The colour texture itself is an ordinary (non-memoryless) MSAA render target.
        msReflectionSharedMSAAWorkTexture = textureGpuManager->createTexture(
            "ReflectionSharedMSAAWorkTexture" + StringConverter::toString( uniqueId ),
            Ogre::GpuPageOutStrategy::Discard, textureFlags, TextureTypes::Type2D );
        msReflectionSharedMSAAWorkTexture->setOrientationMode( orientationMode );
        msReflectionSharedMSAAWorkTexture->setResolution( resolution, resolution );
        msReflectionSharedMSAAWorkTexture->setPixelFormat( pixelFormat );
        msReflectionSharedMSAAWorkTexture->setSampleDescription( sampleDesc );
        msReflectionSharedMSAAWorkTexture->setNumMipmaps( 1u );
        msReflectionSharedMSAAWorkTexture->_transitionTo( GpuResidency::Resident, (uint8 *)0 );
    }
}

So, to be flexible, it would be great to have two texture flags: TilerMemoryless and TilerDepthMemoryless.

@dyunchik
Contributor

dyunchik commented Nov 16, 2024

Hm, maybe a special memoryless depth pool is enough and we can avoid TilerDepthMemoryless.

@dyunchik
Contributor

However, we still need the ability to disable memoryless mode for implicit MSAA along with depth, for example when too many triangles are visible and we run out of tile memory.

@darksylinc
Member

darksylinc commented Nov 16, 2024

Because some very heavy scenes can give an out of tile memory error and simply not render anything (we had such cases on Metal in Live Home 3D).

Oh! I didn't know this. Then yes, we need a global toggle.

The only issue I see is that such a toggle requires quite a few resources to be recreated (if you want it to be toggled at runtime and not at startup).

Regarding Vulkan, ARM Mali has a 180MB limit (memoryless or not). That 180MB applies to the data that gets sent from VS to PS, i.e. position + interpolator data.

ARM published more details about this yesterday, see page 13.
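To get a rough feel for the 180MB figure mentioned above, here is a back-of-the-envelope sketch. Everything except the 180MB number is an assumption made for illustration: the budget is treated as 180 * 1024 * 1024 bytes, position and every interpolator are assumed to be unpacked float4 values, and the helper names are hypothetical. Real drivers may pack or compress this data differently.

```cpp
#include <cstdint>

// 180MB figure taken from the comment above, interpreted here (assumption)
// as 180 * 1024 * 1024 bytes.
constexpr std::uint64_t kVaryingBudgetBytes = 180ull * 1024ull * 1024ull;

// Rough estimate of the VS -> PS data for one frame, assuming the position
// and each interpolator are stored as an unpacked float4 (16 bytes each).
constexpr std::uint64_t estimateVaryingBytes( std::uint64_t numVertices,
                                              std::uint32_t numInterpolators )
{
    return numVertices * ( 16ull + 16ull * numInterpolators );
}

// True when the estimate stays within the assumed budget.
constexpr bool fitsVaryingBudget( std::uint64_t numVertices,
                                  std::uint32_t numInterpolators )
{
    return estimateVaryingBytes( numVertices, numInterpolators ) <= kVaryingBudgetBytes;
}
```

Under these assumptions, one million shaded vertices with four interpolators come to 80,000,000 bytes and fit, while three million such vertices (240,000,000 bytes) would exceed the budget.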

Hm, may be the special memoryless depth pool is enough and we may avoid TilerDepthMemoryless

Yes, the memoryless depth pool should be enough.

- Added extensive validation:
  - Load & Store actions are validated
  - TextureGpu::copyTo validates
  - AsyncTextureTicket validates
  - StagingTexture validates
- Validation is used on non-tilers too, to ease development and
cross-platform compatibility & testing.
- Replaced TextureFlags::TilerDepthMemoryless with
DepthBuffer::POOL_MEMORYLESS
- Eugene's code was more aggressive in deducing if memoryless could be
used, but this would break advanced use cases (i.e. where UAV / Compute
Shaders are involved). While it might be possible to deduce that a
NotTexture + RenderTexture + not UAV should be memoryless, such a specific
flag combination implies the user knows what they're doing and can
request memoryless explicitly. Discardable does not mean it's safe
for the texture to be memoryless, since Discardable is for content that
can be discarded for the next frame, but it may still be accessed during
the same frame. Explicit MSAA textures are almost never intended to be
memoryless.
- Make implicit MSAA textures always memoryless
- "WindowMemoryless" replaced with per-window setting
memoryless_depth_buffer
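The load and store validation listed above can be sketched roughly as follows. This is a simplified illustration, not the checks the commit actually adds: the enums are reduced, hypothetical stand-ins (the real OgreNext enums have more values), and the rule assumed here is simply that a texture without backing memory can neither be loaded from nor stored to memory.

```cpp
// Simplified, hypothetical stand-ins for illustration only.
enum class LoadAction
{
    DontCare,
    Clear,
    Load  // reads previous contents back from memory
};

enum class StoreAction
{
    DontCare,
    Store,           // writes contents to the texture's own memory
    Resolve,         // writes only the resolved result to another texture
    StoreAndResolve  // does both
};

// A memoryless texture has no previous contents to read back.
constexpr bool isLoadActionValid( bool memoryless, LoadAction action )
{
    return !memoryless || action != LoadAction::Load;
}

// A memoryless texture has nowhere to store its own contents, but an MSAA
// resolve into a different (non-memoryless) texture is still possible.
constexpr bool isStoreActionValid( bool memoryless, StoreAction action )
{
    return !memoryless || action == StoreAction::DontCare ||
           action == StoreAction::Resolve;
}
```

Because the functions depend only on the memoryless request and not on whether the GPU is a tiler, the same checks can run on desktop hardware, which matches the stated goal of validating on non-tilers too.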
@eugenegff eugenegff marked this pull request as ready for review November 19, 2024 11:08
@eugenegff
Member Author

eugenegff commented Nov 19, 2024

Because some very heavy scenes can give an out of tile memory error and simply not render anything (we had such cases on Metal in Live Home 3D).

Oh! I didn't know this. Then yes, we need a global toggle.

So, what should we do? Take the current iteration as is and add support for opting out of memoryless window color buffers later, or keep tweaking? In its current state, with memoryless depth buffers and render textures, it is already quite useful.

And adds option to be toggled via RenderSystem option.
This enables the user to fix potential rendering crashes if their scenes
are so heavy that a flush is necessary (which means TilerMemoryless flag
must be ignored).

Fix clang format
Remove use of auto keyword
@darksylinc
Member

darksylinc commented Nov 21, 2024

OK I just pushed the necessary change:

Added "Allow Memoryless RTT" RenderSystem option which calls TextureGpuManager::setAllowMemoryless.
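A minimal sketch of what such a global switch implies for texture creation, assuming it works by ignoring the memoryless request when disabled. The names below (`SketchFlags`, `effectiveTextureFlags`) are hypothetical and only illustrate the idea; the real TextureGpuManager::setAllowMemoryless implementation may work differently.

```cpp
#include <cstdint>

// Hypothetical stand-ins for illustration only, not the OgreNext values.
namespace SketchFlags
{
    constexpr std::uint32_t RenderToTexture = 1u << 0u;
    constexpr std::uint32_t TilerMemoryless = 1u << 1u;
}  // namespace SketchFlags

// When the global switch is off, the memoryless request is dropped so the
// texture gets regular backing memory and the GPU can flush mid-pass.
constexpr std::uint32_t effectiveTextureFlags( bool allowMemoryless,
                                               std::uint32_t requestedFlags )
{
    return allowMemoryless ? requestedFlags
                           : ( requestedFlags & ~SketchFlags::TilerMemoryless );
}
```

Dropping the flag in one central place keeps application code unchanged: callers keep requesting memoryless textures, and the switch decides whether the request is honoured.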

If all is well, it can be merged.

PS: I noticed clang format script was ignoring the Vulkan folder 😱
We'll fix that once that is merged to avoid any potential merge conflict (fortunately the changes are not much).

@eugenegff eugenegff merged commit b718ca1 into master Nov 21, 2024
2 checks passed
@eugenegff eugenegff deleted the memoryless_tiler branch November 21, 2024 10:24
3 participants