GS/HW: Reduce sw/hdr colclip in more cases. #11809

Merged 1 commit on Sep 14, 2024


@lightningterror lightningterror commented Sep 13, 2024

Description of Changes

GS/HW: Reduce sw/hdr colclip in more cases.
When blending with `Cs*Alpha + Cd*(1 - Alpha)` or `Cd*Alpha + Cs*(1 - Alpha)` and an alpha of 128 or lower, we don't need HDR or SW colclip for the draw: the result is a weighted average of Cs and Cd, so its colour range stays within 0-1 (0-255) and cannot overflow.
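
To make the range argument concrete, here is a minimal sketch that brute-forces every Cs/Cd pair. It is a simplified fixed-point model and not code from this PR: it assumes 8-bit colours, alpha 128 meaning 1.0, and the algebraically equivalent form `Cd + (Cs - Cd) * Alpha`. The names `blend_unclamped`, `can_leave_range` and `check_threshold` are hypothetical.

```cpp
#include <cassert>

// Simplified model (assumption, not PCSX2 code): colours are 0..255 and
// alpha 128 represents 1.0. Computes Cs*Alpha + Cd*(1 - Alpha), rewritten as
// Cd + (Cs - Cd) * Alpha, without clamping or wrapping the result.
int blend_unclamped(int cs, int cd, int alpha)
{
    return cd + (cs - cd) * alpha / 128;
}

// Returns true if at least one Cs/Cd pair leaves 0..255 for this alpha,
// which is the situation where colclip handling would matter.
// Iterating over all pairs also covers the swapped Cd*Alpha + Cs*(1 - Alpha) form.
bool can_leave_range(int alpha)
{
    for (int cs = 0; cs <= 255; ++cs)
    {
        for (int cd = 0; cd <= 255; ++cd)
        {
            const int c = blend_unclamped(cs, cd, alpha);
            if (c < 0 || c > 255)
                return true;
        }
    }
    return false;
}

// Checks the threshold described above: no overflow up to 128, overflow possible at 129.
void check_threshold()
{
    for (int alpha = 0; alpha <= 128; ++alpha)
        assert(!can_leave_range(alpha));
    assert(can_leave_range(129));
}
```

In this model, `blend_unclamped(255, 0, 129)` already exceeds 255, while any alpha up to 128 keeps the result between Cd and Cs.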

Also update previous optimizations to include Ad cases when RTA is already scaled.

Rationale behind Changes

More speed, less barriers/draw calls/copies.

Suggested Testing Steps

Notable dump mentions:

- `dbz_bt2_oi_imine_depth`: Draw Calls: -26 [991=>965], Render Passes: -25 [69=>44], Barriers: -1 [12=>11], Copies: -25 [27=>2]
- `Power_Rangers_-_Super_Legends_SLUS-21679_20221215160520`: Draw Calls: -20 [109=>89], Render Passes: -26 [56=>30], Barriers: -7 [9=>2], Copies: -20 [21=>1]
- `Tales_of_Destiny_Directors_Cut_SLPS-25842_20221015223409`: Render Passes: -1 [7=>6], Barriers: -29 [29=>0]

Benchmark/test the following dumps on vk/gl/dx:
Dumps.zip

List of affected games/dumps to test:
GameList.txt

@JordanTheToaster

Some benchmarks of the 2 games that showed the most improvement.

image

image


@kamfretoz kamfretoz left a comment


Some benchmarks of the dumps on AMD GPU:
CapFrameX_lGHwupQGfm

CapFrameX_eTFfRE7PQP

CapFrameX_CL0I6Fokc4

@lightningterror lightningterror merged commit e8e0b97 into master Sep 14, 2024
22 checks passed
@lightningterror lightningterror deleted the gs_cclip_opt branch September 14, 2024 23:09
@lightningterror lightningterror added this to the Release 2.2 milestone Sep 22, 2024