Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement material system #1105

Merged
merged 1 commit into from
Jul 27, 2024
Merged

Implement material system #1105

merged 1 commit into from
Jul 27, 2024

Conversation

VReaperV
Copy link
Contributor

@VReaperV VReaperV commented Apr 27, 2024

Base changes for implementing a material system.

The goal of this system is to improve performance by reducing the amount of state changes and data uploads from CPU->GPU.

The system works by generating materials for draw surfaces/shaders/shader stages after the map is loaded, as well as all the necessary data to draw each material in one drawcall. Currently this only works for world BSP surfaces, everything else is drawn the normal way.

Material

A material in this case represents a distinct global OpenGL state: this includes state bits (as used by GL_State(), however without the alphaTest bits), culling mode, shader program, material ordering and VBO/IBO bindings. Each material is also allocated 2 blocks of memory in a buffer, this memory holds the data for each surface.

Material system

The material system holds material packs, which map materials to various shader sorts: depth pre-pass, opaque pass, and translucent pass.
It also contains sky shaders and a list of dynamic surfaces.

A dynamic surface is a surface which can have its data changed after the map is loaded. This includes non-constant color, tex matrices, animated and video images, as well as any active expressions used by RB_EvalExpression().

All surface data is held in a shader storage buffer object. This buffer consists of 2 blocks which have the following memory layout:
Material0_surface0_stage0: uniform0 // Start of material 0 data
..
Material0_surface0_stage0: uniformn
Material0_surface0_stage0: optional padding
Material0_surface0_stage1: uniform0
..
Material0_surface0_stagex: uniformn
Material0_surface0_stagex: optional padding
..
Material1: optional padding
Material1_surface0_stage0: uniform0
etc.

The optional padding for per-surface data is needed for it to align with std430 layout. The optional padding for materials is needed because each material can have a different data size per surface, but the offset in the shader has to be an integer multiple of its materials size. The shader accesses the buffer using the offset that equals to: bufferOffset / paddedSize, bufferOffset is the offset in components i. e. multiples of 4 bytes because that's the minimum accessible memory in std430, paddedSize is the size of this shaders Material struct in components, aligned under std430.

The buffer is split into a static surface block, followed by a dynamic surface block. The dynamic surface block is updated once per frame. The whole block has to be updated because the BSP is entirely skipped for world surfaces.

There are 4 distinct steps done by the material system:

  1. Shader post-processing. This will move the uniforms around for shaders marked as using the material system, and then generate the memory layout data (offsets and sizes), as well as generate code that will be inserted into the shader code instead of the uniforms. Uniforms marked as global are skipped: these uniforms are set per-shader rather than per-surface because they don't change within a rendered view.
  2. Material generation. This will iterate through all of the world BSP surfaces except sky and portal surfaces using R_AddWorldSurfaces(). Then for each stage material data is obtained and if a matching material doesn't exist it will be added, then the surface data is added to the material. Surfaces will also be marked as either static or dynamic. Portal surfaces are copied so they can be rendered later, but until GPU culling is there it has to be done in an inefficient way.
  3. Buffer generation. This will generate the buffer with all of the per-surface data, and then the command buffer.
  4. Rendering. The material rendering is called from RB_RenderDrawSurfaces() with appropriate shader sorts. All of the materials matching this sort will the nbe rendered in order.

The command buffer contains data for multiDrawIndirect drawcalls. This contains the indexes into vertex and index array buffers, as well as baseInstance, which is used to access the buffer block in the shader. This data is static.

When rendering, all of the surfaces of a rendered material will be rendered with one such call.

While this means that all of the map surfaces are rendered every frame, it ends being much faster than walking the BSP tree and incurring all the state changes for maps with worse vis. Some maps that are good for testing this are https://users.unvanquished.net/~sweet/pkg/map-station12-b7_0.dpk and https://users.unvanquished.net/~kai/map-example_0.1.dpk.
The impact may vary depending on the device, but because currently at least on some devices perf can be worse with this system enabled depending on map, it is disabled by default. I expect that to work better or on par with what we currently have once GPU culling is added.

It can also later be extended to entities etc.

Some example performance differences:

map scene FPS(material system on) FPS(master branch)
nova -1896 336 248 0 0 210 170
habitat default spectator view 200 150
station12-b7 -1377 -195 10 6 8 110 70
example 22728 21405 25414 -139 45 170 43

There are a few issues that are out of scope of this pr: proper portal culling, heatHaze stages (the only map that I know of that use heatHaze is usstremor, but there the heatHaze stage is on a non-world entity, so it works fine anyway), quakeFog stage.

To use the material system, do /r_arb_bindless_texture 1; r_useMaterialSystem 1; vid_restart.

@illwieckz
Copy link
Member

illwieckz commented Apr 27, 2024

I get a lot of those warnings:

Unvanquished/daemon/src/engine/renderer/gl_shader.h:478:37: warning: '&&' within '||' [-Wlogical-op-parentheses]
                if ( _shader->UseMaterialSystem() && _global || !_shader->UseMaterialSystem() ) {
                     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~ ~~
Unvanquished/daemon/src/engine/renderer/Material.cpp:1503:70: warning: '&&' within '||' [-Wlogical-op-parentheses]
        if ( tr.hasSkybox && ( fromSort <= shaderSort_t::SS_ENVIRONMENT_FOG && toSort >= shaderSort_t::SS_ENVIRONMENT_FOG
                               ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

@slipher
Copy link
Member

slipher commented Apr 28, 2024

"Material" is sometimes used as a synonym or generic term for (q3)shader. Maybe we could call these metamaterials?

@VReaperV
Copy link
Contributor Author

I get a lot of those warnings:

This shouldbe fixed now.

@VReaperV
Copy link
Contributor Author

"Material" is sometimes used as a synonym or generic term for (q3)shader. Maybe we could call these metamaterials?

Yeah, I thought the name might cause some confusion, so a rename might be in order.

@illwieckz
Copy link
Member

Unvanquished/daemon/src/engine/renderer/gl_shader.h:399:3: warning: field '_global' will be initialized after field '_components' [-Wreorder-ctor]
                _global( global ),
                ^~~~~~~~~~~~~~~~~
                _std430Alignment( std430Alignment )
Unvanquished/daemon/src/engine/renderer/tr_backend.cpp:136:7: warning: logical not is only applied to the left hand side of this comparison [-Wlogical-not-parentheses]
        if ( !bundle->numImages == 0 ) {
             ^                  ~~
Unvanquished/daemon/src/engine/renderer/TextureManager.cpp:177:1: warning: non-void function does not return a value in all control paths [-Wreturn-type]
}
^
Unvanquished/daemon/src/engine/renderer/Material.cpp:1524:74: warning: '&&' within '||' [-Wlogical-op-parentheses]
        if ( tr.hasSkybox && ( ( fromSort <= shaderSort_t::SS_ENVIRONMENT_FOG ) && ( toSort >= shaderSort_t::SS_ENVIRONMENT_FOG )
                               ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Unvanquished/daemon/src/engine/renderer/Material.cpp:1525:64: warning: '&&' within '||' [-Wlogical-op-parentheses]
                                            || ( fromSort <= shaderSort_t::SS_ENVIRONMENT_NOFOG ) && toSort >= ( shaderSort_t::SS_ENVIRONMENT_NOFOG ) ) ) {
                                            ~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Unvanquished/daemon/src/engine/renderer/tr_backend.cpp:124:18: warning: expression result unused [-Wunused-value]
                bundle->image[0];
                ~~~~~~~~~~~~~ ~^

It's a cherry-picking as I get dozens and dozens of other warnings like unused variables, const type qualifer on return having no effect. They are likely harmless but produces a lot noise making harder to spot the warnings that can be problematic for real.

@illwieckz
Copy link
Member

illwieckz commented Apr 28, 2024

More details (GCC) on what is probably the same of the first one of the previous comment (Clang):

Unvanquished/daemon/src/engine/renderer/gl_shader.h: In constructorGLUniform::GLUniform(GLShader*, const char*, const char*, GLuint, GLuint, bool, int, bool)’:
/home/illwieckz/dev/buildme-unvanquished/Unvanquished/daemon/src/engine/renderer/gl_shader.h:387:20: warning: ‘GLUniform::_globalwill be initialized after [-Wreorder]
  387 |         const bool _global = false; // This uniform won't go into materials SSBO if true
      |                    ^~~~~~~
Unvanquished/daemon/src/engine/renderer/gl_shader.h:381:19: warning:   ‘const int GLUniform::_components’ [-Wreorder]
  381 |         const int _components;
      |                   ^~~~~~~~~~~
Unvanquished/daemon/src/engine/renderer/gl_shader.h:392:9: warning:   when initialized here [-Wreorder]
  392 |         GLUniform( GLShader *shader, const char *name, const char* type, const GLuint std430Size, const GLuint std430Alignment,
      |         ^~~~

@illwieckz
Copy link
Member

Also this:

diff --git a/src/engine/renderer/TextureManager.h b/src/engine/renderer/TextureManager.h
index d31892774..d28c78cb3 100644
--- a/src/engine/renderer/TextureManager.h
+++ b/src/engine/renderer/TextureManager.h
@@ -36,6 +36,7 @@ SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
 #ifndef TEXTURE_MANAGER_H
 #define TEXTURE_MANAGER_H
 
+#include <queue>
 #include <vector>
 #include <unordered_set>
 #include "GL/glew.h"

It is caught by older compilers (GCC9, ICC…):

Unvanquished/daemon/src/engine/renderer/TextureManager.h:86:8: error: ‘queuein namespacestddoes not name a template type
   86 |   std::queue<Texture*> texQueue;
      |        ^~~~~
Unvanquished/daemon/src/engine/renderer/TextureManager.h:42:1: note: ‘std::queueis defined in header<queue>’; did you forget to#include <queue>’?
   41 | #include "GL/glew.h"
  +++ |+#include <queue>
Unvanquished/daemon/src/engine/renderer/TextureManager.h(86): error: qualified name is not allowed
  		std::queue<Texture*> texQueue;
  		^
Unvanquished/daemon/src/engine/renderer/TextureManager.h(86): error #77: this declaration has no storage class or type specifier
  		std::queue<Texture*> texQueue;
  		^

@illwieckz
Copy link
Member

With AMDGPU-PRO OGLP 23.40:

Warn: Compile log:
ERROR: 0:94: 'gl_DrawIDARB' : required extension not requested: GL_ARB_shader_draw_parameters
ERROR: 0:94: '' : missing #endif 
ERROR: 0:94: '' : compilation terminated 
ERROR: 3 compilation errors.  No code generated.
Warn: Unhandled exception (15ShaderException): Couldn't compile vertex shader: generic2D 

https://pastebin.com/JXGTBaky

With Mesa radeonsi 23.2.1:

Warn: Compile log:
0:94(14): error: `gl_DrawIDARB' undeclared
0:95(20): error: `gl_BaseInstanceARB' undeclared
 Warn: Unhandled exception (15ShaderException): Couldn't compile vertex shader: generic2D 

https://pastebin.com/0qNYBnX1

@VReaperV
Copy link
Contributor Author

Yeah, I'll fix most of those later, when everything is more ready. I'm not worried about the ones that don't actually change how the code is executed right now.
The && within || warn is just bullshit however.

@VReaperV
Copy link
Contributor Author

With AMDGPU-PRO OGLP 23.40:

Warn: Compile log:
ERROR: 0:94: 'gl_DrawIDARB' : required extension not requested: GL_ARB_shader_draw_parameters
ERROR: 0:94: '' : missing #endif 
ERROR: 0:94: '' : compilation terminated 
ERROR: 3 compilation errors.  No code generated.
Warn: Unhandled exception (15ShaderException): Couldn't compile vertex shader: generic2D 

https://pastebin.com/JXGTBaky

With Mesa radeonsi 23.2.1:

Warn: Compile log:
0:94(14): error: `gl_DrawIDARB' undeclared
0:95(20): error: `gl_BaseInstanceARB' undeclared
 Warn: Unhandled exception (15ShaderException): Couldn't compile vertex shader: generic2D 

https://pastebin.com/0qNYBnX1

It looks like AddExtension() for GL_ARB_shader_draw_parameters will have to be set as version -1, because it relies on the variables/functions/etc in an extension having the same names as in the core version.

@illwieckz
Copy link
Member

With both drivers case I get this though:

...using GL_ARB_shader_draw_parameters

@VReaperV
Copy link
Contributor Author

Most or all of these warns are caught by CI anyway, so they shouldn't be a problem.

@VReaperV
Copy link
Contributor Author

With both drivers case I get this though:

...using GL_ARB_shader_draw_parameters

I'm pretty sure it works that way will all extensions that we use. But the shader never declares the extension as required in this case.

@illwieckz
Copy link
Member

illwieckz commented Apr 28, 2024

Anyway with -1 I got this with Mesa radeonsi:

unvanquished_2024-04-28_183358_000

With OGLP the computer hangs for several minute, then later when the computer becomes usable again the game window blinks forever with some textures painted in the whole window (but behind 2D elements), doing a screenshot just captures black.

Edit: dmesg log: https://pastebin.com/pPpVWza8

@VReaperV
Copy link
Contributor Author

Some graphics settings are currently broken on this branch, so might be that.

@illwieckz
Copy link
Member

I get the same with medium preset, and with low preset I get:

Tess_Begin (stageIteratorFunc=0x555555721680 <Tess_StageIteratorSky()>, stageIteratorFunc2=0x0, surfaceShader=<optimized out>, lightShader=0x0, skipTangentSpaces=false, lightmapNum=-1, fogNum=-1, bspSurface=false) at Unvanquished/daemon/src/engine/renderer/tr_shade.cpp:551
551			tess.numSurfaceStages = state->numStages;

Thread 1 (Thread 0x7ffff554ba80 (LWP 3119916) "daemon"):
#0  Tess_Begin (stageIteratorFunc=0x555555721680 <Tess_StageIteratorSky()>, stageIteratorFunc2=0x0, surfaceShader=<optimized out>, lightShader=0x0, skipTangentSpaces=false, lightmapNum=-1, fogNum=-1, bspSurface=false) at Unvanquished/daemon/src/engine/renderer/tr_shade.cpp:551
#1  0x00005555556c0aad in MaterialSystem::RenderMaterials (this=<optimized out>, fromSort=fromSort@entry=shaderSort_t::SS_ENVIRONMENT_FOG, toSort=toSort@entry=shaderSort_t::SS_OPAQUE) at Unvanquished/daemon/src/engine/renderer/Material.cpp:1533
#2  0x00005555556a10ee in RB_RenderView (depthPass=false) at Unvanquished/daemon/src/engine/renderer/tr_backend.cpp:5012
#3  DrawViewCommand::ExecuteSelf (this=0x7fffd7b96354) at Unvanquished/daemon/src/engine/renderer/tr_backend.cpp:5911
#4  0x00005555556930aa in RB_ExecuteRenderCommands (data=<optimized out>) at Unvanquished/daemon/src/engine/renderer/tr_backend.cpp:6166
#5  0x00005555556b22bb in R_SyncRenderThread () at Unvanquished/daemon/src/engine/renderer/tr_cmds.cpp:209
#6  0x00005555556b22f2 in R_SyncRenderThread () at Unvanquished/daemon/src/engine/renderer/tr_cmds.cpp:217
#7  0x00005555556c9ebd in RE_GenerateTexture (pic=<optimized out>, width=256, height=256) at Unvanquished/daemon/src/engine/renderer/tr_image.cpp:3069
#8  0x00005555556002bf in <lambda(std::vector<unsigned char, std::allocator<unsigned char> >, int, int, qhandle_t&)>::operator() (__closure=<optimized out>, handle=@0x7fffffffc3bc: 0, y=<optimized out>, x=<optimized out>, data=...) at Unvanquished/daemon/src/engine/client/cl_cgame.cpp:1297
#9  Util::apply_impl<CGameVM::QVMSyscall(int, Util::Reader&, IPC::Channel&)::<lambda(std::vector<unsigned char>, int, int, qhandle_t&)>, std::tuple<std::vector<unsigned char, std::allocator<unsigned char> >&&, int&&, int&&, int&>, 0, 1, 2, 3> (func=..., tuple=...) at Unvanquished/daemon/src/common/Util.h:125
#10 Util::apply<CGameVM::QVMSyscall(int, Util::Reader&, IPC::Channel&)::<lambda(std::vector<unsigned char>, int, int, qhandle_t&)>, std::tuple<std::vector<unsigned char, std::allocator<unsigned char> >&&, int&&, int&&, int&> > (func=..., tuple=...) at Unvanquished/daemon/src/common/Util.h:130
#11 IPC::detail::HandleMsg<CGameVM::QVMSyscall(int, Util::Reader&, IPC::Channel&)::<lambda(std::vector<unsigned char>, int, int, qhandle_t&)>, IPC::Message<IPC::Id<0, 68>, std::vector<unsigned char, std::allocator<unsigned char> >, int, int>, IPC::Reply<int> > (func=..., reader=..., channel=...) at Unvanquished/daemon/src/common/IPC/Channel.h:217
#12 IPC::HandleMsg<IPC::SyncMessage<IPC::Message<IPC::Id<0, 68>, std::vector<unsigned char, std::allocator<unsigned char> >, int, int>, IPC::Reply<int> >, CGameVM::QVMSyscall(int, Util::Reader&, IPC::Channel&)::<lambda(std::vector<unsigned char>, int, int, qhandle_t&)> >(IPC::Channel &, Util::Reader, <lambda(std::vector<unsigned char, std::allocator<unsigned char> >, int, int, qhandle_t&)> &&) (channel=..., reader=..., func=...) at Unvanquished/daemon/src/common/IPC/Channel.h:241
#13 0x0000555555603d11 in CGameVM::QVMSyscall (this=<optimized out>, syscallNum=<optimized out>, reader=..., channel=...) at Unvanquished/daemon/src/engine/client/cl_cgame.cpp:1296
#14 0x0000555555606065 in CGameVM::Syscall (this=<optimized out>, id=<optimized out>, reader=..., channel=...) at Unvanquished/daemon/src/engine/client/cl_cgame.cpp:1056
#15 0x000055555560f76d in VM::VMBase::SendMsg<IPC::SyncMessage<IPC::Message<IPC::Id<(unsigned short)0, (unsigned short)9>, cgClientState_t>, IPC::Reply<> >, cgClientState_t&>(cgClientState_t&)::{lambda(unsigned int, Util::Reader)#1}::operator()(unsigned int, Util::Reader) (reader=..., id=68, this=<optimized out>) at Unvanquished/daemon/src/engine/framework/VirtualMachine.h:142
#16 IPC::detail::SendMsg<VM::VMBase::SendMsg<IPC::SyncMessage<IPC::Message<IPC::Id<(unsigned short)0, (unsigned short)9>, cgClientState_t>, IPC::Reply<> >, cgClientState_t&>(cgClientState_t&)::{lambda(unsigned int, Util::Reader)#1}&, IPC::Message<IPC::Id<(unsigned short)0, (unsigned short)9>, cgClientState_t>, IPC::Reply<>, cgClientState_t&>(IPC::Channel&, VM::VMBase::SendMsg<IPC::SyncMessage<IPC::Message<IPC::Id<(unsigned short)0, (unsigned short)9>, cgClientState_t>, IPC::Reply<> >, cgClientState_t&>(cgClientState_t&)::{lambda(unsigned int, Util::Reader)#1}&, IPC::SyncMessage<IPC::Message<IPC::Id<(unsigned short)0, (unsigned short)9>, cgClientState_t>, IPC::Reply<> >, cgClientState_t&) (channel=..., messageHandler=...) at Unvanquished/daemon/src/common/IPC/Channel.h:174
#17 0x00005555555feefc in IPC::SendMsg<IPC::SyncMessage<IPC::Message<IPC::Id<(unsigned short)0, (unsigned short)9>, cgClientState_t>, IPC::Reply<> >, VM::VMBase::SendMsg<IPC::SyncMessage<IPC::Message<IPC::Id<(unsigned short)0, (unsigned short)9>, cgClientState_t>, IPC::Reply<> >, cgClientState_t&>(cgClientState_t&)::{lambda(unsigned int, Util::Reader)#1}, cgClientState_t&>(IPC::Channel&, VM::VMBase::SendMsg<IPC::SyncMessage<IPC::Message<IPC::Id<(unsigned short)0, (unsigned short)9>, cgClientState_t>, IPC::Reply<> >, cgClientState_t&>(cgClientState_t&)::{lambda(unsigned int, Util::Reader)#1}&&, cgClientState_t&) (messageHandler=..., channel=...) at Unvanquished/daemon/src/common/IPC/Channel.h:232
#18 VM::VMBase::SendMsg<IPC::SyncMessage<IPC::Message<IPC::Id<(unsigned short)0, (unsigned short)9>, cgClientState_t>, IPC::Reply<> >, cgClientState_t&>(cgClientState_t&) (this=0x555555ae5840 <cgvm>) at Unvanquished/daemon/src/engine/framework/VirtualMachine.h:140
#19 CGameVM::CGameRocketFrame (this=0x555555ae5840 <cgvm>) at Unvanquished/daemon/src/engine/client/cl_cgame.cpp:1043
#20 0x000055555562a93d in SCR_DrawScreenField () at Unvanquished/daemon/src/engine/client/cl_scrn.cpp:278
#21 0x000055555562a9a3 in SCR_UpdateScreen () at Unvanquished/daemon/src/engine/client/cl_scrn.cpp:318
#22 0x000055555561f0a5 in CL_Frame (msec=msec@entry=4) at Unvanquished/daemon/src/engine/client/cl_main.cpp:2089
#23 0x00005555555b240e in Com_Frame () at Unvanquished/daemon/src/engine/qcommon/common.cpp:1004
#24 0x00005555555ac9bd in Application::ClientApplication::Frame (this=0x555555881e00 <Application::GetApp()::app>) at Unvanquished/daemon/src/engine/client/ClientApplication.cpp:96
#25 0x00005555555abf85 in main (argc=<optimized out>, argv=<optimized out>) at Unvanquished/daemon/src/engine/framework/System.cpp:780

@illwieckz
Copy link
Member

illwieckz commented Apr 28, 2024

In fact it probably happened on vid_restart, it doesn't crash with low preset after an engine shutdown.

@illwieckz
Copy link
Member

illwieckz commented Apr 28, 2024

With low preset:

unvanquished_2024-04-28_194354_000

The black lines around the 23 number are blinking (Edit: it's likely part of the 23 texture, probably missing transparency like the blue shield textures).

@VReaperV
Copy link
Contributor Author

I get the same with medium preset, and with low preset I get:

Ah, I need to add that to Free(). It's just accessing memory for drawSurfs that isn't there anymore probably.

@VReaperV
Copy link
Contributor Author

The black lines around the 23 number are blinking (Edit: it's likely part of the 23 texture, probably missing transparency like the blue shield textures).

I think it's because of polygon offset. I still need to add it as part of the material state.

@VReaperV
Copy link
Contributor Author

With low preset:

That looks about right though (ignore the incorrect textures in some places for now, need to fix a thing or two for that).

Also, does the lighting look right to you? I had to invert the sign on u_InverseLightFactor, I'm not sure why it's not working correctly without that.

@VReaperV VReaperV force-pushed the material branch 2 times, most recently from b08d9a5 to 32093f4 Compare April 29, 2024 18:18
@illwieckz
Copy link
Member

I got this with llvmpipe, maybe that explains why I get black textures with radeonsi?

Warn: Compile log:
0:164(1): error: image/sampler variables may only be declared as function parameters or uniform-qualified global variables
0:164(1): error: opaque variables must be declared uniform
0:164(24): error: cannot initialize u_ColorMap variable opaque
0:164(24): error: cannot construct opaque type `sampler2D'
0:165(23): warning: `u_ColorMap' used uninitialized
Warn: Unhandled exception (15ShaderException): Couldn't compile fragment shader: genericMaterial

https://pastebin.com/NkEYRiNs

@illwieckz
Copy link
Member

I have no idea how those line happened to be there:

 161: void	main()
 162: {
 163: sampler2D u_ColorMap = sampler2D(u_ColorMap_initial); // <---
 164: 	vec4 color = texture2D(u_ColorMap, var_TexCoords);
 174: #if defined(USE_DEPTH_FADE) || defined(USE_VERTEX_SPRITE)
 175: sampler2D u_DepthMap = sampler2D(u_DepthMap_initial); // <---
 176: 	float depth = texture2D(u_DepthMap, gl_FragCoord.xy / r_FBufSize).x;
 177: 	float fadeDepth = 0.5 * var_FadeDepth.x / var_FadeDepth.y + 0.5;

@VReaperV
Copy link
Contributor Author

I got this with llvmpipe, maybe that explains why I get black textures with radeonsi?

Warn: Compile log:
0:164(1): error: image/sampler variables may only be declared as function parameters or uniform-qualified global variables
0:164(1): error: opaque variables must be declared uniform
0:164(24): error: cannot initialize u_ColorMap variable opaque
0:164(24): error: cannot construct opaque type `sampler2D'
0:165(23): warning: `u_ColorMap' used uninitialized
Warn: Unhandled exception (15ShaderException): Couldn't compile fragment shader: genericMaterial

https://pastebin.com/NkEYRiNs

That's because it's lacking the bindless textures extension (it's currently required because there's too many textures to bind them separately, without packing a lot of them into a few larger textures). I'll need to add a check to not load the material shaders if not all of the required extensions are supported.

@VReaperV
Copy link
Contributor Author

VReaperV commented Apr 30, 2024

I have no idea how those line happened to be there:

 161: void	main()
 162: {
 163: sampler2D u_ColorMap = sampler2D(u_ColorMap_initial); // <---
 164: 	vec4 color = texture2D(u_ColorMap, var_TexCoords);
 174: #if defined(USE_DEPTH_FADE) || defined(USE_VERTEX_SPRITE)
 175: sampler2D u_DepthMap = sampler2D(u_DepthMap_initial); // <---
 176: 	float depth = texture2D(u_DepthMap, gl_FragCoord.xy / r_FBufSize).x;
 177: 	float fadeDepth = 0.5 * var_FadeDepth.x / var_FadeDepth.y + 0.5;

That part is auto-generated in ShaderPostProcess(). Initially I just had it use #define uniform samplerType( uniform ), kinda like for the other uniforms, but it was crashing the AMD shader compiler or something. This is effectively the same thing.

@VReaperV
Copy link
Contributor Author

I still need to fix a small part of the material rendering code to make textures resident, so that might be why you're seeing black textures there.

@@ -1139,6 +1184,120 @@ void GLShaderManager::CompileAndLinkGPUShaderProgram( GLShader *shader, shaderPr
LinkProgram( program->program );
}

// This will generate all the extra code for material system shaders
std::string GLShaderManager::ShaderPostProcess( GLShader *shader, const std::string& shaderText, const bool isVertexShader ) {
Copy link
Member

@illwieckz illwieckz Apr 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not writing a real .glsl file to be used to prefix the GLSL shaders with fragmentInline/vertexInline mechanism, using ifdef to select various code path and set defines based on the tests you need? Edit: This is basically an conditional “include” mechanism.

A good example of different shaders using a single source but different “inline” parts is generic and generic2D, that both uses generic_fp.glsl while using different compileMacros and vertexInlines. See GLShader_generic2D and GLShader_generic in gl_shader.cpp.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not writing a real .glsl file to be used to prefix the GLSL shaders with fragmentInline/vertexInline mechanism, using ifdef to select various code path and set defines based on the tests you need? Edit: This is basically an “include” mechanism.

Because I don't want to add a macro for almost every uniform. It would also mean moving all those uniform definitions to a new file for each shader, which is bad. It would also mean having to align memory for every shader manually, and with any change to uniforms too.

Also, having 2 different inlines isn't gonna work here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not writing a real .glsl file to be used to prefix the GLSL shaders with fragmentInline/vertexInline mechanism, using ifdef to select various code path and set defines based on the tests you need? Edit: This is basically an conditional “include” mechanism.

Two more issues with that is you'd have to keep all uniforms in the exact same order everywhere, manually, and it probably won't work for textures.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose this is somewhat the case now since I did have to put some things into separate shader code files.

@VReaperV VReaperV force-pushed the material branch 2 times, most recently from 94fc72e to 7dfac81 Compare July 23, 2024 11:28
@VReaperV
Copy link
Contributor Author

r_useMaterialSystem also renamed to r_materialSystem, in line with other r_* cvars.

@illwieckz
Copy link
Member

r_useMaterialSystem also renamed to r_materialSystem, in line with other r_* cvars.

Great!

@illwieckz
Copy link
Member

The remaining Mesa radeonsi error has a pending fix:

I seen the proposed piglit test has been updated at the same time so it's probably for this too.

Once I merge the two fixes, everything works again.

I guess we will have to blacklist the broken Mesa versions for bindless textures the day we enable the feature.

@VReaperV
Copy link
Contributor Author

Good, once that is fixed I'll enable the extension but default.

Yeah, a blacklist for driver versions and print a warning about a driver update required to run the extension properly.

@illwieckz
Copy link
Member

illwieckz commented Jul 24, 2024

Here is an example of exhaustive search reporting the buggy driver version and disabling the extension when it's detected:

index a5dc70924..d9f7f9796 100644
--- a/src/engine/sys/sdl_glimp.cpp
+++ b/src/engine/sys/sdl_glimp.cpp
@@ -2023,8 +2023,45 @@ static void GLimp_InitExtensions()
 	// made required in OpenGL 4.3
 	glConfig2.computeShaderAvailable = LOAD_EXTENSION_WITH_TEST( ExtFlag_NONE, ARB_compute_shader, r_arb_compute_shader.Get() );
 
-	// not required by any OpenGL version
-	glConfig2.bindlessTexturesAvailable = LOAD_EXTENSION_WITH_TEST( ExtFlag_NONE, ARB_bindless_texture, r_arb_bindless_texture.Get() );
+	{
+		bool foundMesa241 = false;
+		std::string mesaVersion = "";
+
+		static const std::vector<std::pair<std::string, std::vector<std::string>>> versions = {
+			{ "1.0", { "1.0-devel", "1.0-rc1", "1.0-rc2", "1.0-rc3", "1.0-rc4", "1.0", } },
+			{ "1.",  { "1.1", "1.2", "1.3", "1.4", } },
+			{ "2.0", { "2.0-devel", "2.0-rc1", } },
+		};
+
+		for ( auto& vp : versions )
+		{
+			mesaVersion = Str::Format( "Mesa 24.%s", vp.first );
+
+			if ( Q_stristr( glConfig.version_string, mesaVersion.c_str() ) )
+			{
+				for ( auto& v : vp.second )
+				{
+					mesaVersion = Str::Format( "Mesa 24.%s", v );
+
+					if ( Q_stristr( glConfig.version_string, mesaVersion.c_str() ) )
+					{
+						foundMesa241 = true;
+						break;
+					}
+				}
+
+				break;
+			}
+		}
+
+		if ( foundMesa241 )
+		{
+			logger.Notice( "...found buggy %s driver", mesaVersion );
+		}
+
+		// not required by any OpenGL version
+		glConfig2.bindlessTexturesAvailable = LOAD_EXTENSION_WITH_TEST( ExtFlag_NONE, ARB_bindless_texture, r_arb_bindless_texture.Get() && !foundMesa241 );
+	}
 
 	// made required in OpenGL 4.6
 	glConfig2.shaderDrawParametersAvailable = LOAD_EXTENSION_WITH_TEST( ExtFlag_NONE, ARB_shader_draw_parameters, r_arb_shader_draw_parameters.Get() );

This does:

...found buggy Mesa 24.2.0-devel driver 
...ignoring GL_ARB_bindless_texture 

The idea of being exhaustive is to make sure to not trigger that known bug when bisecting something else in Mesa.

With some hope, the fixes will be merged before Mesa 24.2.0 is released and even maybe before Mesa 24.2.0-rc2 is out, and in worst case we can just add some version strings to the array.

@VReaperV
Copy link
Contributor Author

I was thinking of just a warning print, but I guess it's better to disable it as well on those versions since we know it doesn't work properly there.

@illwieckz
Copy link
Member

I edited the patch to make it more performant.

@VReaperV
Copy link
Contributor Author

VReaperV commented Jul 24, 2024

If it's working properly on your end with buggy/non-buggy versions I'll add it to the pr later, unless you wanna add that yourself.

@illwieckz
Copy link
Member

illwieckz commented Jul 24, 2024

We already do such kind of disablement for other buggy drivers / hardware, you can look for foundIntel and foundNvidia340.

Running the game with bindless textures and material features enabled on that broken Mesa can crash the whole computer, so we better disable the feature.

test what it prevents
foundIntel rendering glitches
foundNvidia340 engine crash
foundMesa241 computer crash

So this is the failure with the most negative impact we are implementing a workaround for, we better avoid the bug if we can.

With this patch the engine runs properly with both buggy and non-buggy driver (the engine properly disabling the broken feature when needed). It means by merging such patch it makes possible to also enable r_arb_bindless_texture by default.

Before enabling it by default it would be good to have some feedback from tests with other drivers, especially the Nvidia one and the AMD proprietary drivers. I believe @slipher has some Nvidia.

Current status (taking in account the above workaround is merged), it doesn't mean the feature is finally used, it means the game doesn't crash and renders properly with -set r_arb_bindless_texture on -set r_useMaterialSystem on:

System Driver Hardware Status Tester
🐧️ Linux 🕊️ Mesa 24.0.9 🔴️ AMD Radeon PRO W7600 ✅️ illwieckz
🐧️ Linux 🔒️ OGLP 24.10 🔴️ AMD Radeon PRO W7600 ❌️ illwieckz
🪟️ Windows 🔒️ Adrenalin 🔴️ AMD ❔️
🐧️ Linux 🕊️ Mesa 24.0.9 🟢️ Nvidia Quadro K1100M ✅️ illwieckz
🐧️ Linux 🔒️ Nvidia 470 🟢️ Nvidia Quadro K1100M ✅️ illwieckz
🪟️ Windows 🔒️ Nvidia 🟢️ Nvidia ✅️ Reaper
🐧️ Linux 🕊️ Mesa 🔵️ Intel ❔️
🪟️ Windows 🔒️ Intel 🔵️ Intel ❔️

@illwieckz
Copy link
Member

AMD OGLP on Linux produces a black screen with bindless textures. The problem is that both AMD OGLP on Linux and AMD Adrenalin on Windows has GL_VENDOR set to ATI Technologies Inc. so we can't know if it's Adrenalin or OGLP. We can do an #ifndef _WIN32 but then Unvanquished on OGLP on Wine on Linux would get the black screen.

@VReaperV
Copy link
Contributor Author

We already do such kind of disablement for other buggy drivers / hardware, you can look for foundIntel and foundNvidia340.

I'll look at those, thanks.

Nvidia works fine with it on Windows, I can maybe test it on an Intel iGPU later as well.

@VReaperV VReaperV assigned VReaperV and unassigned VReaperV Jul 25, 2024
Implement material system, set world BSP surfaces to use it. Material system works by generating enough materials (a distinct GL state that has to be set for the whole draw call) for each stage of every drawSurf_t.
All of the per-surface data that was previously set through uniforms for each drawSurf_t is now sourced from a buffer.
The remaining (global) uniforms like u_ViewUp and u_ViewOrigin are still set as usual.
Updates for surfaces whose parameters change with time are done by mapping the corresponding part of the memory and copying all of the data into it, then flushing it all at once.

Implement bindless texture support.
@VReaperV
Copy link
Contributor Author

Here is an example of exhaustive search reporting the buggy driver version and disabling the extension when it's detected:

I've now added this patch.

Also added a warn if r_smp is not 0 while material system is enabled.

@VReaperV
Copy link
Contributor Author

Nvidia works fine with it on Windows, I can maybe test it on an Intel iGPU later as well.

That intel iGPU apparently doesn't support bindless textures, so I'll be unable to test that.

@illwieckz
Copy link
Member

illwieckz commented Jul 26, 2024

In Khronos wiki: https://www.khronos.org/opengl/wiki/Bindless_Texture

I read this:

It is implemented across much of the OpenGL 4.x hardware spectrum. Intel is mostly absent, but both AMD and NVIDIA have a lot of hardware that can handle it.

But I don't know if that information is obsolete or not (that mention is from 2015, the page has not been updated since 2021).

@illwieckz
Copy link
Member

If it's ready to merge I can make a PR enabling bindless and detecting drivers for selective disablement after this is merged.

Is there something remaining to do?

@illwieckz
Copy link
Member

illwieckz commented Jul 26, 2024

According to the Intel Xe on Windows 10 supports bindless texture: https://opengl.gpuinfo.org/displayreport.php?id=13101

I don't own any Intel Xe.

I have some Intel HD (from UHD generation) but I doubt it supports it, for example this public listing for an HD 520 doesn't list the bindless texture extension: https://opengl.gpuinfo.org/displayreport.php?id=13055 neither this one for an UHD 620: https://opengl.gpuinfo.org/displayreport.php?id=12782

@VReaperV
Copy link
Contributor Author

gpuinfo.org is the one to check for hardware capabilities/extension availability yes.

I have some Intel HD (from UHD generation) but I doubt it supports it, for example this public listing for an HD 520 doesn't list the bindless texture extension: https://opengl.gpuinfo.org/displayreport.php?id=13055 neither this one for an UHD 620: https://opengl.gpuinfo.org/displayreport.php?id=12782

It's the report for OpenGL ES, probably emulated or something there. The version for OpenGL has it - https://opengl.gpuinfo.org/displayreport.php?id=12781.
Anyway, the page to check for extension availability is this one - https://opengl.gpuinfo.org/listreports.php?extension=GL_ARB_bindless_texture, just search for intel there.

@VReaperV
Copy link
Contributor Author

If it's ready to merge I can make a PR enabling bindless and detecting drivers for selective disablement after this is merged.

Sure, that'd work well.

Is there something remaining to do?

No, I'm just waiting for any more reviews or an LGTM :D

Copy link
Member

@illwieckz illwieckz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@VReaperV VReaperV merged commit f960177 into DaemonEngine:master Jul 27, 2024
9 checks passed
@VReaperV
Copy link
Contributor Author

Thx, biggest part of the changes done now finally.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

4 participants