Refactor dx shaders #1621

Merged
merged 3 commits into LizardByte:nightly on Oct 6, 2023
Conversation

ns6089 (Contributor) commented on Sep 11, 2023

Description

Based on #1602

Shader refactoring.
Mainly deduplication, but it also makes it pretty trivial to add yuv444 and type 2 chroma subsampling (also called topleft; recommended for bt.2020, and it looks smoother since it uses more pixels for averaging).

Screenshot

Issues Fixed or Closed

Type of Change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Dependency update (updates to dependencies)
  • Documentation update (changes to documentation)
  • Repository update (changes to repository files, e.g. .github/...)

Checklist

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have added or updated the in code docstring/documentation-blocks for new or existing methods/components

Branch Updates

LizardByte requires that branches be up-to-date before merging. This means that after any PR is merged, this branch must be updated before it can be merged. You must also Allow edits from maintainers.

  • I want maintainers to keep my branch updated

ns6089 (Contributor, Author) commented on Sep 12, 2023

Should be ready (after #1602).

The shader file naming scheme is done with future work in mind.
I have these files locally:
[screenshot: local shader file listing]

cgutman (Collaborator) commented on Sep 17, 2023

Looks like there are some conflicts. Can you rebase?

ns6089 (Contributor, Author) commented on Sep 18, 2023

> Looks like there are some conflicts. Can you rebase?

Not real conflicts, just the commits from the rotation PR. Done.

float3 rgb_top_left = image.Sample(def_sampler, input.tex_right_left_top.xz).rgb;
float3 rgb_top_right = image.Sample(def_sampler, input.tex_right_left_top.yz).rgb;
float3 rgb_bottom_left = image.Sample(def_sampler, input.tex_right_left_bottom.xz).rgb;
float3 rgb_bottom_right = image.Sample(def_sampler, input.tex_right_left_bottom.yz).rgb;
float3 rgb = CONVERT_FUNCTION((rgb_top_left + rgb_top_right + rgb_bottom_left + rgb_bottom_right) * 0.25);
Collaborator:
Maybe I'm missing something, but based on this illustration, it seems like we're doing top and center instead of left and top-left.

Contributor (Author):
Alright, buckle in.
Let's concentrate on "Type 0"; this is the only one we currently use, and it's what ffmpeg calls AVCHROMA_LOC_LEFT.
[image: numbered source pixels, with the area occupied by the target chroma subsample marked as a red square]
I've numbered the pixels in the source image; the red square is the area that the target chroma subsample occupies.

We calculate that chroma subsample using a box filter: in other words, we just average all the source pixels that are "covered" by the target pixel; if the coverage is partial, the corresponding pixel's weight is reduced.

Going back to our subsample, we need pixels 2 and 5 with weight 1, and pixels 1, 3, 4 and 6 with weight 0.5.

This could be done with 6 texture fetches, but we're too smart (for our own good) and do it in 2.
If we ask the linear texture sampler for a point between pixels 2, 3, 5 and 6, it will give us the average of these 4 pixels with even weights. The same can be done with pixels 1, 2, 4 and 5.
Now if we sum these two fetches, we get the weights we want, because pixels 2 and 5 are counted twice; we just need to divide by 2.

So our vertex shader generates these in-between texture coordinates. Type 2, or AVCHROMA_LOC_TOPLEFT, is handled in a similar manner: 9 texture fetches are reduced to 4 optimized ones.
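
To make the two-fetch trick concrete, here is a minimal sketch of the pixel-shader side for Type 0. This is not the shader from this PR; the function name, parameters, and bindings are illustrative, and it assumes the vertex shader has already placed one texture coordinate between pixels 1, 2, 4, 5 and the other between pixels 2, 3, 5, 6.

Texture2D image : register(t0);          // bindings are illustrative
SamplerState def_sampler : register(s0); // must use bilinear (linear) filtering

// Hypothetical sketch, not the code under review.
float3 sample_chroma_type0(float2 tex_between_1245, float2 tex_between_2356)
{
    // Each bilinear fetch returns the even average of the 4 surrounding pixels.
    float3 left = image.Sample(def_sampler, tex_between_1245).rgb;
    float3 right = image.Sample(def_sampler, tex_between_2356).rgb;
    // The sum counts pixels 2 and 5 twice and pixels 1, 3, 4, 6 once;
    // halving it normalizes the weights (0.25 for 2 and 5, 0.125 for the rest).
    return (left + right) * 0.5;
}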

Contributor (Author):

This is also why literally any scaling breaks our chroma: that requires a box filter with a different radius, and the whole math needs to be redone.
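
For illustration, the general case looks roughly like the sketch below: once the chroma sample's footprint is scaled, each source pixel it touches needs its own coverage weight instead of the fixed 1 and 0.5 weights above. Everything here is hypothetical, including the coverage helper; nothing like it exists in the current shaders.

Texture2D image;  // same source texture as in the snippets above

// Hypothetical helper: overlap area of the 1x1 source pixel starting at
// pixel_min with the footprint rectangle (footprint.xy = min corner,
// footprint.zw = max corner, in texel units).
float coverage(float2 pixel_min, float4 footprint)
{
    float2 lo = max(pixel_min, footprint.xy);
    float2 hi = min(pixel_min + 1.0, footprint.zw);
    float2 overlap = max(hi - lo, 0.0);
    return overlap.x * overlap.y;
}

// Hypothetical general box filter over an arbitrary footprint.
float3 box_filter(int2 first_pixel, int2 last_pixel, float4 footprint)
{
    float3 sum = 0;
    float weight_sum = 0;
    for (int y = first_pixel.y; y <= last_pixel.y; ++y)
    {
        for (int x = first_pixel.x; x <= last_pixel.x; ++x)
        {
            float w = coverage(float2(x, y), footprint);
            sum += w * image.Load(int3(x, y, 0)).rgb;
            weight_sum += w;
        }
    }
    return sum / weight_sum;
}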

Contributor (Author):

And the last part, which we currently don't do: that linear texture sampling and shader averaging must be done in linear light. Since we currently do it in sRGB gamma (unless HDR), we end up with less saturated colors than we should. But that's harder to notice than the outright blocky artifacts we had until recently.
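
As a rough illustration of that point (not something the current shaders do), averaging in linear light means linearizing the values before the box filter and re-encoding afterwards. The sketch below uses a plain gamma-2.2 approximation of the sRGB transfer function; the real curve is piecewise, and the bilinear fetch itself would still average gamma-encoded values unless the texture were sampled through an sRGB view.

// Hypothetical sketch; gamma 2.2 is only an approximation of sRGB.
float3 srgb_to_linear(float3 c) { return pow(c, 2.2); }
float3 linear_to_srgb(float3 c) { return pow(c, 1.0 / 2.2); }

float3 average_in_linear_light(float3 a, float3 b)
{
    // Average light intensity rather than gamma-encoded values, then re-encode.
    return linear_to_srgb((srgb_to_linear(a) + srgb_to_linear(b)) * 0.5);
}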

Collaborator:

Ok, thanks for the detailed explanation.

ns6089 mentioned this pull request on Sep 22, 2023
cgutman merged commit 974c4bd into LizardByte:nightly on Oct 6, 2023 (43 checks passed)
ns6089 mentioned this pull request on Oct 29, 2023