Compare commits

31 commits

Author SHA1 Message Date
Wunk
ac0cd66ca3
Merge c1e1b2b93a into 379649dbce 2026-06-05 15:50:07 +00:00
Wunkolo
c1e1b2b93a renderer_gl: Add layer/level specifier to MSAA resolves 2026-06-05 08:49:58 -07:00
Wunkolo
5e9b2aa33a renderer_gl: Fix MakeHandle string-view hazard
`glObjectLabel` expects a null-terminated string, but an `std::string_view` is not necessarily null-terminated
2026-06-05 08:49:58 -07:00
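The hazard described in this commit can be reproduced without OpenGL. The sketch below is illustrative only: `CLabelLength` is a hypothetical stand-in for any C API that reads until the first `'\0'`, and `SafeLabelLength` shows one possible remedy (copying the view into an owning `std::string`), which is not necessarily what the commit does.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <string_view>

// Hypothetical stand-in for a C API that reads until the first '\0'.
// This is not an OpenGL call.
inline std::size_t CLabelLength(const char* label) {
    return std::strlen(label);
}

// Copying the view into a std::string guarantees that c_str() is
// null-terminated at exactly the end of the view's contents.
inline std::size_t SafeLabelLength(std::string_view label) {
    const std::string owned{label};
    return CLabelLength(owned.c_str());
}
```

Passing `view.data()` of a sub-view straight to such an API reads past the end of the view, up to the terminator of the underlying buffer if one exists at all.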
Wunkolo
f01b8fcdcd renderer_gl: Implement TextureRuntime::ClearTexture MSAA clears
When ArbClearTexture is not available, ensure the fallback implementation clears the MSAA texture as well.
2026-06-05 08:49:58 -07:00
Wunkolo
3b3a5e5130 renderer_gl: Resolve framebuffer after each draw
Brute force approach while trying to determine a better heuristic. Framebuffer changes are not enough to determine the end of a "render pass". An 'msaa dirty flag' is likely the better way here.
2026-06-05 08:49:58 -07:00
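A rough sketch of what an "msaa dirty flag" could look like, as opposed to resolving after every draw. Everything here is hypothetical (the class name, its methods, and the resolve counter); it only illustrates the idea of deferring the resolve until the single-sample image is actually needed.

```cpp
#include <cassert>

// Hypothetical surface wrapper; not the project's Surface class.
class MsaaSurfaceSketch {
public:
    // Any draw into the multisampled image makes the resolved copy stale.
    void Draw() {
        dirty_ = true;
    }

    // Call before the single-sample image is read (sampling, blit, download).
    // Returns true when a resolve was actually performed.
    bool ResolveIfDirty() {
        if (!dirty_) {
            return false;
        }
        ++resolve_count_; // a real renderer would issue the resolve here
        dirty_ = false;
        return true;
    }

    int ResolveCount() const {
        return resolve_count_;
    }

private:
    bool dirty_ = false;
    int resolve_count_ = 0;
};
```

With this shape, several consecutive draws cost one resolve instead of one per draw, at the price of having to call `ResolveIfDirty` on every read path.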
Wunkolo
15be844ed8 renderer_gl: Implement multisample Convert{DS24S8ToRGBA8,RGBA4ToRGB5A1} 2026-06-05 08:49:58 -07:00
Wunkolo
9fd44f99d6 renderer_gl: Fix surface ScaleUp implementation
Should assign the new `res_scale` and `sample_count`
2026-06-05 08:49:58 -07:00
Wunkolo
2743ebd0c9 renderer_gl: Initial MSAA implementation
Basically copied some of the paradigms over from Vulkan. Covers most rendering use-cases except for conversions such as `ConvertDS24S8ToRGBA8` and `ConvertRGBA4ToRGB5A1`.
2026-06-05 08:49:58 -07:00
Wunkolo
29954c1392 config: Refactor sample_count to antialiasing setting
Intended to allow for other anti-aliasing methods to be introduced
2026-06-05 08:49:58 -07:00
Wunkolo
0cb1680a20 renderer_vulkan: Fix Multisample ConvertDS24S8ToRGBA8
Migrate `ResolveTexture` to `BlitHelper`.
If the conversion was multi-sample, do a final resolve at the end of the conversion to ensure the non-multisample textures are updated as well.

Fixes Pokemon!
2026-06-05 08:49:58 -07:00
Wunkolo
a0a0cd648e renderer_vulkan: Fix MSAA image debug name 2026-06-05 08:49:58 -07:00
Wunkolo
0f8ffa6e66 renderer_vulkan: Fix multisample framebuffer creation
MSAA renderpasses require the "current" image as the main resolve target, and the MSAA image as the second attachment.
2026-06-05 08:49:58 -07:00
Wunkolo
5960b60d88 renderer_vulkan: Fix multisample ClearTextureWithRenderpass
Ensure that the multisample framebuffer is used rather than the usual one when using a render pass to clear the textures.

That way, an MSAA render and resolve are used to clear the textures at the same time.
2026-06-05 08:49:58 -07:00
Wunkolo
492304e215 renderer_vulkan: Fix ConvertDS24S8ToRGBA8 image targets
Ensure that the Multi-Sample texture is used for the destination color image as well. Should the dest image be MSAA too? Or should all the values be resolved into a minimum depth and some combination of stencil-values here?
2026-06-05 08:49:58 -07:00
Wunkolo
7b908bb4bc renderer_vulkan: Fix Surface::ScaleUp scale/sample increase
This should be checking the _new_ value to possibly cull upscaled texture creation rather than the current value of the surface. Fixes broken up render passes when drawing UI in some games.
2026-06-05 08:49:58 -07:00
Wunkolo
70b393e56d renderer_vulkan: Fix MSAA framebuffer target resolve surface
Use the specified type rather than defaulting to the surface's current one (implicit `ImageView` argument).
2026-06-05 08:49:58 -07:00
Wunkolo
0810828345 renderer_vulkan: Fix narrowing byte conversion
Fixes a compilation error on Unix platforms.
2026-06-05 08:49:58 -07:00
Wunkolo
a0655d5674 renderer_vulkan: Add TextureRuntime::ResolveTexture
Rather than use a big lambda, just rip this out into a proper function for other blit functions to utilize.
2026-06-05 08:49:58 -07:00
Wunkolo
8e839016d4 renderer_vulkan: Fix cleanup and debug-naming for d24s8_to_rgba8_ms_comp 2026-06-05 08:49:58 -07:00
Wunkolo
fd3a0a99a0 renderer_vulkan: Fix initialization of image handles
Try to optimally create the new image handles when a change in res scale or sample-count has actually occurred. MSAA images need to be updated too in the case that the resolution scale has changed.
2026-06-05 08:49:58 -07:00
Wunkolo
02eb015d36 renderer_vulkan: Derive framebuffer sample-count from attachments
Derive the framebuffer sample-count from the input color and depth operands. Similar to how `res_scale` is determined.
2026-06-05 08:49:58 -07:00
Wunkolo
177edd228e renderer_vulkan: Fix dangling surface reference during msaa resolve
These individual parameters need to be copied as the reference to the surface-object only lasts within the scope of this function.
2026-06-05 08:49:58 -07:00
Wunkolo
9d6527b3ae renderer_vulkan: Fix Framebuffer::sample_count move-operator
`sample_count` needs to be move/copied over.

Also reorder the accessor order to match the declaration of variables.
2026-06-05 08:49:58 -07:00
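The class of bug fixed here is easy to reintroduce whenever a member is added to a type with a hand-written move operator. A minimal sketch, assuming a hypothetical `FramebufferSketch` with only two members (the real `Framebuffer` has more state):

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

// Hypothetical stand-in; not the project's Framebuffer class.
class FramebufferSketch {
public:
    FramebufferSketch() = default;
    FramebufferSketch(std::uint32_t res_scale, std::uint8_t sample_count)
        : res_scale_{res_scale}, sample_count_{sample_count} {}

    FramebufferSketch(FramebufferSketch&& other) noexcept {
        *this = std::move(other);
    }

    FramebufferSketch& operator=(FramebufferSketch&& other) noexcept {
        // Every member has to be transferred by hand; forgetting one leaves
        // the destination with its default value.
        res_scale_ = std::exchange(other.res_scale_, 1u);
        sample_count_ = std::exchange(other.sample_count_, std::uint8_t{1});
        return *this;
    }

    std::uint32_t ResScale() const noexcept {
        return res_scale_;
    }
    std::uint8_t SampleCount() const noexcept {
        return sample_count_;
    }

private:
    std::uint32_t res_scale_ = 1;
    std::uint8_t sample_count_ = 1;
};
```

Keeping the accessors in declaration order, as the commit also does, makes a missing member easier to spot in review.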
Wunkolo
fa97aaade4 renderer_vulkan: Fix multisample texture init barrier
Should address the MultiSampled image directly since the multisampled image is just a transient image and not the leading state of the image.
2026-06-05 08:49:58 -07:00
Wunkolo
88ded6e8c2 renderer_vulkan: Implement multisample texture runtime
This seems to be enough for simple programs to render with MSAA enabled!
2026-06-05 08:49:58 -07:00
Wunkolo
ff7b039c5c renderer_vulkan: Implement multisample pipeline/renderpass support
Allows multi-sample render passes and graphics pipelines to be created, using sample-rate shading rather than coverage-based MSAA.
2026-06-05 08:49:58 -07:00
Wunkolo
fdf022753c rasterizer_cache: Initial support for multi-sample surfaces 2026-06-05 08:49:58 -07:00
Wunkolo
f682a89bdc vk_instance: Add detection of MSAA features
Full multi-sample support is when renderpass-2 and depth-stencil-resolve extensions are available and when sample-rate-shading and msaa-storage-images are supported.
2026-06-05 08:49:58 -07:00
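The rule in this commit message is a plain conjunction of four capabilities. The sketch below expresses it with a hypothetical `MsaaCapsSketch` struct; the field names are chosen to echo `VK_KHR_create_renderpass2`, `VK_KHR_depth_stencil_resolve`, `sampleRateShading` and `shaderStorageImageMultisample`, and the mapping of "msaa-storage-images" to the last of those is an assumption, not something the diff confirms.

```cpp
#include <cassert>

// Hypothetical capability summary; not the project's Instance class.
struct MsaaCapsSketch {
    bool create_renderpass2;               // extension available
    bool depth_stencil_resolve;            // extension available
    bool sample_rate_shading;              // device feature supported
    bool shader_storage_image_multisample; // device feature supported (assumed mapping)
};

// "Full" support requires all four; any missing piece disables it.
constexpr bool HasFullMsaaSupport(const MsaaCapsSketch& caps) {
    return caps.create_renderpass2 && caps.depth_stencil_resolve &&
           caps.sample_rate_shading && caps.shader_storage_image_multisample;
}
```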
Wunkolo
fcadfd8e19 vk_blit_helper: Add d24s8_to_rgba8_ms_comp
Helper host-shader for blitting multi-sampled DS24S8 textures to multi-sampled RGBA8
2026-06-05 08:49:58 -07:00
Wunkolo
faf61b898a config: Add sample_count renderer option
Option is only enabled when the renderer is set to Vulkan, for now.
2026-06-05 08:49:58 -07:00
crueter
379649dbce
cmake: Fix MoltenVK fetch order/library conflicts (#2183)
* [cmake] Fix MoltenVK fetch order/library conflicts

Rather than dealing with `find_library` shenanigans, just set the
library path directly (when using bundled MoltenVK). System MoltenVK
solely uses `find_library`.

Avoids cache nonsense that can cause system/bundled versions to get
mixed up, and overall makes the system/bundled mvk handling a lot more
consistent

```
cmake -S . -B build -DUSE_SYSTEM_MOLTENVK=ON
-- Using MoltenVK at /opt/homebrew/lib/libMoltenVK.dylib.
cmake -S . -B build -DUSE_SYSTEM_MOLTENVK=OFF
-- Using MoltenVK at /Users/crueter/code/azahar/build/externals/MoltenVK/MoltenVK/dynamic/dylib/macOS/libMoltenVK.dylib.
cmake -S . -B build -DUSE_SYSTEM_MOLTENVK=ON
-- Using MoltenVK at /opt/homebrew/lib/libMoltenVK.dylib.
```

Signed-off-by: crueter <crueter@eden-emu.dev>

* remove old comment

Signed-off-by: crueter <crueter@eden-emu.dev>

* Cleanup

---------

Signed-off-by: crueter <crueter@eden-emu.dev>
Co-authored-by: OpenSauce04 <opensauce04@gmail.com>
2026-06-05 16:36:48 +01:00
36 changed files with 1024 additions and 252 deletions

View file

@ -411,13 +411,21 @@ if (APPLE)
endif()
find_library(AVFOUNDATION_LIBRARY AVFoundation REQUIRED)
find_library(IOSURFACE_LIBRARY IOSurface REQUIRED)
set(PLATFORM_LIBRARIES ${COCOA_LIBRARY} ${AVFOUNDATION_LIBRARY} ${IOSURFACE_LIBRARY} ${MOLTENVK_LIBRARY})
set(PLATFORM_LIBRARIES ${COCOA_LIBRARY} ${AVFOUNDATION_LIBRARY} ${IOSURFACE_LIBRARY})
if (ENABLE_VULKAN AND NOT ENABLE_LIBRETRO)
if (NOT USE_SYSTEM_MOLTENVK)
download_moltenvk()
endif()
if (USE_SYSTEM_MOLTENVK)
find_library(MOLTENVK_LIBRARY MoltenVK REQUIRED)
else()
download_moltenvk()
if (IOS)
set(MOLTENVK_RELATIVE_LIBPATH "static/MoltenVK.xcframework/ios-arm64/libMoltenVK.a")
else()
set(MOLTENVK_RELATIVE_LIBPATH "dynamic/dylib/macOS/libMoltenVK.dylib")
endif()
set(MOLTENVK_LIBRARY "${CMAKE_BINARY_DIR}/externals/MoltenVK/MoltenVK/${MOLTENVK_RELATIVE_LIBPATH}")
endif()
message(STATUS "Using MoltenVK at ${MOLTENVK_LIBRARY}.")
set(PLATFORM_LIBRARIES ${PLATFORM_LIBRARIES} ${MOLTENVK_LIBRARY})
endif()

View file

@ -171,15 +171,8 @@ function(download_qt target)
endfunction()
function(download_moltenvk)
if (IOS)
set(MOLTENVK_PLATFORM "static/MoltenVK.xcframework/ios-arm64")
else()
set(MOLTENVK_PLATFORM "dynamic/dylib/macOS")
endif()
set(MOLTENVK_DIR "${CMAKE_BINARY_DIR}/externals/MoltenVK")
set(MOLTENVK_TAR "${CMAKE_BINARY_DIR}/externals/MoltenVK.tar")
if (NOT EXISTS ${MOLTENVK_DIR})
if (NOT EXISTS "${CMAKE_BINARY_DIR}/externals/MoltenVK")
if (NOT EXISTS ${MOLTENVK_TAR})
file(DOWNLOAD https://github.com/KhronosGroup/MoltenVK/releases/download/v1.2.9/MoltenVK-all.tar
${MOLTENVK_TAR} SHOW_PROGRESS)
@ -188,10 +181,6 @@ function(download_moltenvk)
execute_process(COMMAND ${CMAKE_COMMAND} -E tar xf "${MOLTENVK_TAR}"
WORKING_DIRECTORY "${CMAKE_BINARY_DIR}/externals")
endif()
# Add the MoltenVK library path to the prefix so find_library can locate it.
list(APPEND CMAKE_PREFIX_PATH "${MOLTENVK_DIR}/MoltenVK/${MOLTENVK_PLATFORM}")
set(CMAKE_PREFIX_PATH ${CMAKE_PREFIX_PATH} PARENT_SCOPE)
endfunction()
function(get_external_prefix lib_name prefix_var)

View file

@ -44,6 +44,7 @@ foreach(KEY IN ITEMS
"use_display_refresh_rate_detection"
"use_shader_jit"
"resolution_factor"
"antialiasing"
"frame_limit"
"turbo_limit"
"texture_filter"

View file

@ -150,6 +150,7 @@ void Config::ReadValues() {
ReadSetting("Renderer", Settings::values.use_hw_shader);
ReadSetting("Renderer", Settings::values.use_shader_jit);
ReadSetting("Renderer", Settings::values.resolution_factor);
ReadSetting("Renderer", Settings::values.antialiasing);
ReadSetting("Renderer", Settings::values.use_disk_shader_cache);
ReadSetting("Renderer", Settings::values.use_vsync);
ReadSetting("Renderer", Settings::values.texture_filter);

View file

@ -716,6 +716,7 @@ void QtConfig::ReadRendererValues() {
ReadGlobalSetting(Settings::values.use_vsync);
ReadGlobalSetting(Settings::values.use_display_refresh_rate_detection);
ReadGlobalSetting(Settings::values.resolution_factor);
ReadGlobalSetting(Settings::values.antialiasing);
ReadGlobalSetting(Settings::values.use_integer_scaling);
ReadGlobalSetting(Settings::values.frame_limit);
ReadGlobalSetting(Settings::values.turbo_limit);
@ -1265,6 +1266,7 @@ void QtConfig::SaveRendererValues() {
WriteGlobalSetting(Settings::values.use_vsync);
WriteGlobalSetting(Settings::values.use_display_refresh_rate_detection);
WriteGlobalSetting(Settings::values.resolution_factor);
WriteGlobalSetting(Settings::values.antialiasing);
WriteGlobalSetting(Settings::values.use_integer_scaling);
WriteGlobalSetting(Settings::values.frame_limit);
WriteGlobalSetting(Settings::values.turbo_limit);

View file

@ -19,8 +19,9 @@ ConfigureEnhancements::ConfigureEnhancements(QWidget* parent)
SetConfiguration();
const auto graphics_api = Settings::values.graphics_api.GetValue();
const bool res_scale_enabled = graphics_api != Settings::GraphicsAPI::Software;
ui->resolution_factor_combobox->setEnabled(res_scale_enabled);
const bool hardware_graphics = graphics_api != Settings::GraphicsAPI::Software;
ui->resolution_factor_combobox->setEnabled(hardware_graphics);
ui->antialiasing_combobox->setEnabled(hardware_graphics);
connect(ui->render_3d_combobox, qOverload<int>(&QComboBox::currentIndexChanged), this,
[this](int currentIndex) {
@ -44,6 +45,8 @@ void ConfigureEnhancements::SetConfiguration() {
if (!Settings::IsConfiguringGlobal()) {
ConfigurationShared::SetPerGameSetting(ui->resolution_factor_combobox,
&Settings::values.resolution_factor);
ConfigurationShared::SetPerGameSetting(ui->antialiasing_combobox,
&Settings::values.antialiasing);
ConfigurationShared::SetPerGameSetting(ui->texture_filter_combobox,
&Settings::values.texture_filter);
ConfigurationShared::SetHighlight(ui->widget_texture_filter,
@ -51,6 +54,8 @@ void ConfigureEnhancements::SetConfiguration() {
} else {
ui->resolution_factor_combobox->setCurrentIndex(
Settings::values.resolution_factor.GetValue());
ui->antialiasing_combobox->setCurrentIndex(
static_cast<int>(Settings::values.antialiasing.GetValue()));
ui->texture_filter_combobox->setCurrentIndex(
static_cast<int>(Settings::values.texture_filter.GetValue()));
}
@ -111,6 +116,8 @@ void ConfigureEnhancements::RetranslateUI() {
void ConfigureEnhancements::ApplyConfiguration() {
ConfigurationShared::ApplyPerGameSetting(&Settings::values.resolution_factor,
ui->resolution_factor_combobox);
ConfigurationShared::ApplyPerGameSetting(&Settings::values.antialiasing,
ui->antialiasing_combobox);
Settings::values.render_3d =
static_cast<Settings::StereoRenderOption>(ui->render_3d_combobox->currentIndex());
Settings::values.swap_eyes_3d = ui->swap_eyes_3d->isChecked();
@ -149,6 +156,7 @@ void ConfigureEnhancements::SetupPerGameUI() {
// Block the global settings if a game is currently running that overrides them
if (Settings::IsConfiguringGlobal()) {
ui->widget_resolution->setEnabled(Settings::values.resolution_factor.UsingGlobal());
ui->widget_antialiasing->setEnabled(Settings::values.antialiasing.UsingGlobal());
ui->widget_texture_filter->setEnabled(Settings::values.texture_filter.UsingGlobal());
ui->toggle_linear_filter->setEnabled(Settings::values.filter_mode.UsingGlobal());
ui->use_integer_scaling->setEnabled(Settings::values.use_integer_scaling.UsingGlobal());
@ -189,6 +197,10 @@ void ConfigureEnhancements::SetupPerGameUI() {
ui->resolution_factor_combobox, ui->widget_resolution,
static_cast<int>(Settings::values.resolution_factor.GetValue(true)));
ConfigurationShared::SetColoredComboBox(
ui->antialiasing_combobox, ui->widget_antialiasing,
static_cast<int>(Settings::values.antialiasing.GetValue(true)));
ConfigurationShared::SetColoredComboBox(
ui->texture_filter_combobox, ui->widget_texture_filter,
static_cast<int>(Settings::values.texture_filter.GetValue(true)));

View file

@ -110,6 +110,55 @@
</layout>
</widget>
</item>
<item>
<widget class="QWidget" name="widget_antialiasing" native="true">
<layout class="QHBoxLayout" name="horizontalLayout">
<property name="leftMargin">
<number>0</number>
</property>
<property name="topMargin">
<number>0</number>
</property>
<property name="rightMargin">
<number>0</number>
</property>
<property name="bottomMargin">
<number>0</number>
</property>
<item>
<widget class="QLabel" name="antialiasing_label">
<property name="text">
<string>Anti-Aliasing</string>
</property>
</widget>
</item>
<item>
<widget class="QComboBox" name="antialiasing_combobox">
<item>
<property name="text">
<string>None</string>
</property>
</item>
<item>
<property name="text">
<string>MSAAx2</string>
</property>
</item>
<item>
<property name="text">
<string>MSAAx4</string>
</property>
</item>
<item>
<property name="text">
<string>MSAAx8</string>
</property>
</item>
</widget>
</item>
</layout>
</widget>
</item>
<item>
<widget class="QCheckBox" name="use_integer_scaling">
<property name="text">

View file

@ -97,6 +97,7 @@ void LogSettings() {
log_setting("Renderer_ShadersAccurateMul", values.shaders_accurate_mul.GetValue());
log_setting("Renderer_UseShaderJit", values.use_shader_jit.GetValue());
log_setting("Renderer_UseResolutionFactor", values.resolution_factor.GetValue());
log_setting("Renderer_SampleCount", values.antialiasing.GetValue());
log_setting("Renderer_UseIntegerScaling", values.use_integer_scaling.GetValue());
log_setting("Renderer_FrameLimit", values.frame_limit.GetValue());
log_setting("Renderer_VSyncNew", values.use_vsync.GetValue());
@ -213,6 +214,7 @@ void RestoreGlobalState(bool is_powered_on) {
values.shaders_accurate_mul.SetGlobal(true);
values.use_vsync.SetGlobal(true);
values.resolution_factor.SetGlobal(true);
values.antialiasing.SetGlobal(true);
values.use_integer_scaling.SetGlobal(true);
values.frame_limit.SetGlobal(true);
values.texture_filter.SetGlobal(true);

View file

@ -103,6 +103,26 @@ enum class AudioEmulation : u32 {
LLEMultithreaded = 2,
};
enum class AntiAliasingMethod : u32 {
None = 0,
MSAAx2 = 1,
MSAAx4 = 2,
MSAAx8 = 3,
};
static inline u8 GetAntiAliasingSampleCount(AntiAliasingMethod antialiasing_method) {
switch (antialiasing_method) {
case AntiAliasingMethod::MSAAx2:
return 2;
case AntiAliasingMethod::MSAAx4:
return 4;
case AntiAliasingMethod::MSAAx8:
return 8;
default:
return 1;
}
}
enum class TextureFilter : u32 {
NoFilter = 0,
Anime4K = 1,
@ -535,6 +555,7 @@ struct Values {
true, Keys::use_display_refresh_rate_detection};
Setting<bool> use_shader_jit{true, Keys::use_shader_jit};
SwitchableSetting<u32, true> resolution_factor{1, 0, 10, Keys::resolution_factor};
SwitchableSetting<AntiAliasingMethod> antialiasing{AntiAliasingMethod::None, "antialiasing"};
SwitchableSetting<bool> use_integer_scaling{false, Keys::use_integer_scaling};
SwitchableSetting<double, true> frame_limit{100, 0, 1000, Keys::frame_limit};
SwitchableSetting<double, true> turbo_limit{200, 0, 1000, Keys::turbo_limit};
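Since the Qt side casts the enum to and from the combo-box index, the enum values and the order of the combo-box items have to stay in sync. The following standalone sketch restates the enum and mapping from this hunk using `<cstdint>` types, and adds a hypothetical `MethodFromComboIndex` helper (not part of the diff) to show the round trip with a defensive fallback.

```cpp
#include <cassert>
#include <cstdint>

// Restated from the diff; the values double as combo-box indices.
enum class AntiAliasingMethod : std::uint32_t {
    None = 0,
    MSAAx2 = 1,
    MSAAx4 = 2,
    MSAAx8 = 3,
};

constexpr std::uint8_t GetAntiAliasingSampleCount(AntiAliasingMethod method) {
    switch (method) {
    case AntiAliasingMethod::MSAAx2:
        return 2;
    case AntiAliasingMethod::MSAAx4:
        return 4;
    case AntiAliasingMethod::MSAAx8:
        return 8;
    default:
        return 1;
    }
}

// Hypothetical helper: map a combo-box index back to the enum and fall back
// to None for anything out of range.
constexpr AntiAliasingMethod MethodFromComboIndex(int index) {
    return (index >= 0 && index <= 3) ? static_cast<AntiAliasingMethod>(index)
                                      : AntiAliasingMethod::None;
}
```

With the current values, the sample count happens to equal `1 << index`, but the explicit `switch` keeps working if a non-MSAA method is appended later.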

View file

@ -1,11 +1,14 @@
# Copyright 2023 Citra Emulator Project
# Copyright Citra Emulator Project / Azahar Emulator Project
# Licensed under GPLv2 or any later version
# Refer to the license.txt file included.
set(SHADER_FILES
format_reinterpreter/d24s8_to_rgba8.frag
format_reinterpreter/d24s8_to_rgba8_ms.frag
format_reinterpreter/rgba4_to_rgb5a1.frag
format_reinterpreter/rgba4_to_rgb5a1_ms.frag
format_reinterpreter/vulkan_d24s8_to_rgba8.comp
format_reinterpreter/vulkan_d24s8_to_rgba8_ms.comp
texture_filtering/bicubic.frag
texture_filtering/refine.frag
texture_filtering/scale_force.frag

View file

@ -0,0 +1,25 @@
// Copyright Citra Emulator Project / Azahar Emulator Project
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
//? #version 430 core
precision highp int;
precision highp float;
layout(location = 0) in mediump vec2 tex_coord;
layout(location = 0) out lowp vec4 frag_color;
layout(binding = 0) uniform highp sampler2DMS depth;
layout(binding = 1) uniform lowp usampler2DMS stencil;
void main() {
mediump vec2 coord = tex_coord * vec2(textureSize(depth));
mediump ivec2 tex_icoord = ivec2(coord);
highp uint depth_val =
uint(texelFetch(depth, tex_icoord, gl_SampleID).x * (exp2(32.0) - 1.0));
lowp uint stencil_val = texelFetch(stencil, tex_icoord, gl_SampleID).x;
highp uvec4 components =
uvec4(stencil_val, (uvec3(depth_val) >> uvec3(24u, 16u, 8u)) & 0x000000FFu);
frag_color = vec4(components) / (exp2(8.0) - 1.0);
}
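For reviewers who want to check the channel packing on the CPU, here is a sketch of the per-sample arithmetic from the shader above. `PackD24S8ToRGBA8` is a hypothetical name; it uses `double` for the depth scale so the multiplication by 2^32 - 1 is exact, which differs from the shader's `float` math.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// CPU mirror of the shader's packing for one sample (a sketch, not project code).
// Output order matches the shader: {stencil, depth >> 24, depth >> 16, depth >> 8}.
inline std::array<std::uint8_t, 4> PackD24S8ToRGBA8(double depth, std::uint8_t stencil) {
    assert(depth >= 0.0 && depth <= 1.0);
    // Scale normalized depth to a 32-bit integer; the cast truncates toward zero.
    const auto depth_val = static_cast<std::uint32_t>(depth * 4294967295.0);
    return {stencil,
            static_cast<std::uint8_t>((depth_val >> 24) & 0xFFu),
            static_cast<std::uint8_t>((depth_val >> 16) & 0xFFu),
            static_cast<std::uint8_t>((depth_val >> 8) & 0xFFu)};
}
```

The multisample variant applies the same packing once per `gl_SampleID`, so each destination sample receives the bits of the matching source sample.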

View file

@ -0,0 +1,22 @@
// Copyright 2023 Citra Emulator Project
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
//? #version 430 core
precision highp int;
precision highp float;
layout(location = 0) in mediump vec2 tex_coord;
layout(location = 0) out lowp vec4 frag_color;
layout(binding = 0) uniform lowp sampler2D source;
void main() {
mediump vec2 coord = tex_coord * vec2(textureSize(source, 0));
mediump ivec2 tex_icoord = ivec2(coord);
lowp ivec4 rgba4 = ivec4(texelFetch(source, tex_icoord, 0) * (exp2(4.0) - 1.0));
lowp ivec3 rgb5 =
((rgba4.rgb << ivec3(1, 2, 3)) | (rgba4.gba >> ivec3(3, 2, 1))) & 0x1F;
frag_color = vec4(vec3(rgb5) / (exp2(5.0) - 1.0), rgba4.a & 0x01);
}
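The shifts in this shader amount to re-slicing the same 16 bits from a 4/4/4/4 layout into a 5/5/5/1 layout. The sketch below checks that on the CPU, assuming both formats are packed with R in the most significant bits; the struct and function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

struct Rgb5A1Sketch {
    std::uint8_t r5, g5, b5, a1;
};

// Same arithmetic as the shader: each 5-bit channel borrows the high bits of
// the next 4-bit channel.
constexpr Rgb5A1Sketch ReinterpretRGBA4(std::uint8_t r4, std::uint8_t g4, std::uint8_t b4,
                                        std::uint8_t a4) {
    return {static_cast<std::uint8_t>(((r4 << 1) | (g4 >> 3)) & 0x1F),
            static_cast<std::uint8_t>(((g4 << 2) | (b4 >> 2)) & 0x1F),
            static_cast<std::uint8_t>(((b4 << 3) | (a4 >> 1)) & 0x1F),
            static_cast<std::uint8_t>(a4 & 0x01)};
}

// Assumed bit layouts, R in the top bits.
constexpr std::uint16_t PackRGBA4(std::uint8_t r4, std::uint8_t g4, std::uint8_t b4,
                                  std::uint8_t a4) {
    return static_cast<std::uint16_t>((r4 << 12) | (g4 << 8) | (b4 << 4) | a4);
}

constexpr std::uint16_t PackRGB5A1(const Rgb5A1Sketch& c) {
    return static_cast<std::uint16_t>((c.r5 << 11) | (c.g5 << 6) | (c.b5 << 1) | c.a1);
}
```

Under that layout assumption, packing the converted channels yields the original 16-bit word for every input, which is what a format reinterpretation should do.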

View file

@ -0,0 +1,31 @@
// Copyright Citra Emulator Project / Azahar Emulator Project
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
#version 450 core
#extension GL_EXT_samplerless_texture_functions : require
layout(local_size_x = 8, local_size_y = 8, local_size_z = 1) in;
layout(set = 0, binding = 0) uniform highp texture2DMS depth;
layout(set = 0, binding = 1) uniform lowp utexture2DMS stencil;
layout(set = 0, binding = 2, rgba8) uniform highp writeonly image2DMS color;
layout(push_constant, std140) uniform ComputeInfo {
mediump ivec2 src_offset;
mediump ivec2 dst_offset;
mediump ivec2 extent;
};
void main() {
ivec2 src_coord = src_offset + ivec2(gl_GlobalInvocationID.xy);
ivec2 dst_coord = dst_offset + ivec2(gl_GlobalInvocationID.xy);
int sample_count = textureSamples(depth);
for(int sample_index = 0; sample_index < sample_count; ++sample_index)
{
highp uint depth_val = uint(texelFetch(depth, src_coord, sample_index).x * (exp2(32.0) - 1.0));
lowp uint stencil_val = texelFetch(stencil, src_coord, sample_index).x;
highp uvec4 components = uvec4(stencil_val, (uvec3(depth_val) >> uvec3(24u, 16u, 8u)) & 0x000000FFu);
imageStore(color, dst_coord, sample_index, vec4(components) / (exp2(8.0) - 1.0));
}
}

View file

@ -38,7 +38,7 @@ RasterizerCache<T>::RasterizerCache(Memory::MemorySystem& memory_,
Pica::RegsInternal& regs_, RendererBase& renderer_)
: memory{memory_}, custom_tex_manager{custom_tex_manager_}, runtime{runtime_}, regs{regs_},
renderer{renderer_}, resolution_scale_factor{renderer.GetResolutionScaleFactor()},
filter{Settings::values.texture_filter.GetValue()},
sample_count{renderer.GetSampleCount()}, filter{Settings::values.texture_filter.GetValue()},
dump_textures{Settings::values.dump_textures.GetValue()},
use_custom_textures{Settings::values.custom_textures.GetValue()} {
using TextureConfig = Pica::TexturingRegs::TextureConfig;
@ -96,12 +96,15 @@ void RasterizerCache<T>::TickFrame() {
}
const u32 scale_factor = renderer.GetResolutionScaleFactor();
const u32 samples = renderer.GetSampleCount();
const bool resolution_scale_changed = resolution_scale_factor != scale_factor;
const bool sample_count_changed = sample_count != samples;
const bool use_custom_texture_changed =
Settings::values.custom_textures.GetValue() != use_custom_textures;
if (resolution_scale_changed || use_custom_texture_changed) {
if (resolution_scale_changed || use_custom_texture_changed || sample_count_changed) {
resolution_scale_factor = scale_factor;
sample_count = renderer.GetSampleCount();
use_custom_textures = Settings::values.custom_textures.GetValue();
if (use_custom_textures) {
custom_tex_manager.FindCustomTextures();
@ -287,6 +290,7 @@ bool RasterizerCache<T>::AccelerateDisplayTransfer(const Pica::DisplayTransferCo
: config.output_height.Value();
dst_params.is_tiled = config.input_linear != config.dont_swizzle;
dst_params.pixel_format = PixelFormatFromGPUPixelFormat(config.output_format);
dst_params.sample_count = sample_count;
dst_params.UpdateParams();
// Using flip_vertically alongside crop_input_lines produces skewed output on hardware.
@ -302,6 +306,7 @@ bool RasterizerCache<T>::AccelerateDisplayTransfer(const Pica::DisplayTransferCo
}
dst_params.res_scale = slot_surfaces[src_surface_id].res_scale;
dst_params.sample_count = slot_surfaces[src_surface_id].sample_count;
const auto [dst_surface_id, dst_rect] =
GetSurfaceSubRect(dst_params, ScaleMatch::Upscale, false);
@ -432,8 +437,10 @@ void RasterizerCache<T>::CopySurface(Surface& src_surface, Surface& dst_surface,
const u32 src_scale = src_surface.res_scale;
const u32 dst_scale = dst_surface.res_scale;
if (src_scale > dst_scale) {
dst_surface.ScaleUp(src_scale);
const u32 src_sample_count = src_surface.sample_count;
const u32 dst_sample_count = dst_surface.sample_count;
if ((src_scale > dst_scale) || (src_sample_count > dst_sample_count)) {
dst_surface.ScaleUp(src_scale, src_sample_count);
}
const auto src_rect = src_surface.GetScaledSubRect(subrect_params);
@ -502,6 +509,7 @@ typename RasterizerCache<T>::SurfaceRect_Tuple RasterizerCache<T>::GetSurfaceSub
if (surface_id) {
SurfaceParams new_params = slot_surfaces[surface_id];
new_params.res_scale = params.res_scale;
new_params.sample_count = params.sample_count;
surface_id = CreateSurface(new_params, create_initial_flags);
RegisterSurface(surface_id);
@ -706,6 +714,7 @@ FramebufferHelper<T> RasterizerCache<T>::GetFramebufferSurfaces(bool using_color
SurfaceParams color_params;
color_params.is_tiled = true;
color_params.res_scale = resolution_scale_factor;
color_params.sample_count = sample_count;
color_params.width = config.GetWidth();
color_params.height = config.GetHeight();
SurfaceParams depth_params = color_params;
@ -861,12 +870,16 @@ SurfaceId RasterizerCache<T>::FindMatch(const SurfaceParams& params, ScaleMatch
SurfaceId match_id{};
bool match_valid = false;
u32 match_scale = 0;
u8 match_sample_count = 0;
SurfaceInterval match_interval{};
ForEachSurfaceInRegion(params.addr, params.size, [&](SurfaceId surface_id, Surface& surface) {
const bool res_scale_matched = match_scale_type == ScaleMatch::Exact
? (params.res_scale == surface.res_scale)
: (params.res_scale <= surface.res_scale);
const bool sample_count_matched = match_scale_type == ScaleMatch::Exact
? (params.sample_count == surface.sample_count)
: (params.sample_count <= surface.sample_count);
const bool is_valid =
True(find_flags & MatchFlags::Copy)
? true
@ -886,11 +899,16 @@ SurfaceId RasterizerCache<T>::FindMatch(const SurfaceParams& params, ScaleMatch
surface.type != SurfaceType::Fill)
return;
if (!sample_count_matched && match_scale_type != ScaleMatch::Ignore &&
surface.type != SurfaceType::Fill)
return;
// Found a match, update only if this is better than the previous one
auto UpdateMatch = [&] {
match_id = surface_id;
match_valid = is_valid;
match_scale = surface.res_scale;
match_sample_count = surface.sample_count;
match_interval = surface_interval;
};
@ -901,6 +919,13 @@ SurfaceId RasterizerCache<T>::FindMatch(const SurfaceParams& params, ScaleMatch
return;
}
if (surface.sample_count > match_sample_count) {
UpdateMatch();
return;
} else if (surface.sample_count < match_sample_count) {
return;
}
if (is_valid && !match_valid) {
UpdateMatch();
return;
@ -1189,8 +1214,9 @@ bool RasterizerCache<T>::ValidateByReinterpretation(Surface& surface, SurfacePar
return false;
}
const u32 res_scale = src_surface.res_scale;
if (res_scale > surface.res_scale) {
surface.ScaleUp(res_scale);
const u8 sample_count = src_surface.sample_count;
if ((res_scale > surface.res_scale) || (sample_count > surface.sample_count)) {
surface.ScaleUp(res_scale, sample_count);
}
const PAddr addr = boost::icl::lower(interval);
const SurfaceParams copy_params = surface.FromInterval(copy_interval);
@ -1357,8 +1383,8 @@ SurfaceId RasterizerCache<T>::CreateSurface(const SurfaceParams& params,
return surface_id;
}();
Surface& surface = slot_surfaces[surface_id];
if (params.res_scale > surface.res_scale) {
surface.ScaleUp(params.res_scale);
if ((params.res_scale > surface.res_scale) || (params.sample_count > surface.sample_count)) {
surface.ScaleUp(params.res_scale, params.sample_count);
}
surface.MarkInvalid(surface.GetInterval());
return surface_id;
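The sample-count handling added to this file follows the existing `res_scale` pattern in two places: the acceptance test in `FindMatch` and the `ScaleUp` trigger. A condensed sketch of both predicates follows. The names carry a `Sketch` suffix or are otherwise hypothetical, and the `SurfaceType::Fill` exemption from the real code is deliberately left out.

```cpp
#include <cassert>
#include <cstdint>

enum class ScaleMatchSketch { Exact, Upscale, Ignore };

// Would a cached surface with `candidate` samples satisfy a request for
// `requested` samples under the given match mode?
constexpr bool SampleCountAccepted(ScaleMatchSketch mode, std::uint8_t requested,
                                   std::uint8_t candidate) {
    switch (mode) {
    case ScaleMatchSketch::Exact:
        return requested == candidate;
    case ScaleMatchSketch::Upscale:
        return requested <= candidate;
    case ScaleMatchSketch::Ignore:
        return true;
    }
    return false;
}

// The destination is scaled up when the source exceeds it in either dimension.
constexpr bool NeedsScaleUp(std::uint32_t src_scale, std::uint8_t src_samples,
                            std::uint32_t dst_scale, std::uint8_t dst_samples) {
    return src_scale > dst_scale || src_samples > dst_samples;
}
```

One consequence worth checking in review: a surface at matching resolution but a lower sample count now triggers `ScaleUp`, so the new two-argument `ScaleUp` has to handle a sample-count-only change.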

View file

@ -227,6 +227,7 @@ private:
SurfaceMap dirty_regions;
PageMap cached_pages;
u32 resolution_scale_factor;
u8 sample_count;
u64 frame_tick{};
FramebufferParams fb_params;
Settings::TextureFilter filter;

View file

@ -1,4 +1,4 @@
// Copyright 2022 Citra Emulator Project
// Copyright Citra Emulator Project / Azahar Emulator Project
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
@ -219,12 +219,13 @@ u32 SurfaceParams::LevelOf(PAddr level_addr) const {
return level;
}
std::string SurfaceParams::DebugName(bool scaled, bool custom) const noexcept {
std::string SurfaceParams::DebugName(bool scaled, bool custom, u8 sample_count) const noexcept {
const u32 scaled_width = scaled ? GetScaledWidth() : width;
const u32 scaled_height = scaled ? GetScaledHeight() : height;
return fmt::format("Surface: {}x{} {} {} levels from {:#x} to {:#x} ({}{})", scaled_width,
scaled_height, PixelFormatAsString(pixel_format), levels, addr, end,
custom ? "custom," : "", scaled ? "scaled" : "unscaled");
return fmt::format("Surface: {}x{} {} samples {} levels from {:#x} to {:#x} ({}{})",
scaled_width, scaled_height, PixelFormatAsString(pixel_format),
static_cast<u32>(sample_count), levels, addr, end, custom ? "custom," : "",
scaled ? "scaled" : "unscaled");
}
bool SurfaceParams::operator==(const SurfaceParams& other) const noexcept {

View file

@ -1,4 +1,4 @@
// Copyright 2022 Citra Emulator Project
// Copyright Citra Emulator Project / Azahar Emulator Project
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
@ -51,7 +51,7 @@ public:
u32 LevelOf(PAddr addr) const;
/// Returns a string identifier of the params object
std::string DebugName(bool scaled, bool custom = false) const noexcept;
std::string DebugName(bool scaled, bool custom = false, u8 sample_count = 1) const noexcept;
bool operator==(const SurfaceParams& other) const noexcept;
@ -71,6 +71,10 @@ public:
return height * res_scale;
}
[[nodiscard]] u8 GetSampleCount() const noexcept {
return sample_count;
}
[[nodiscard]] Common::Rectangle<u32> GetRect(u32 level = 0) const noexcept {
return {0, height >> level, width >> level, 0};
}
@ -104,6 +108,7 @@ public:
u32 stride = 0;
u32 levels = 1;
u32 res_scale = 1;
u8 sample_count = 1;
bool is_tiled = false;
TextureType texture_type = TextureType::Texture2D;

View file

@ -29,6 +29,10 @@ u32 RendererBase::GetResolutionScaleFactor() {
: render_window.GetFramebufferLayout().GetScalingRatio();
}
u8 RendererBase::GetSampleCount() const {
return Settings::GetAntiAliasingSampleCount(Settings::values.antialiasing.GetValue());
}
void RendererBase::UpdateCurrentFramebufferLayout(bool is_portrait_mode) {
const auto update_layout = [is_portrait_mode](Frontend::EmuWindow& window) {
const Layout::FramebufferLayout& layout = window.GetFramebufferLayout();

View file

@ -67,6 +67,9 @@ public:
/// Returns the resolution scale factor relative to the native 3DS screen resolution
u32 GetResolutionScaleFactor();
/// Returns the MSAA sample count
u8 GetSampleCount() const;
/// Updates the framebuffer layout of the contained render window handle.
void UpdateCurrentFramebufferLayout(bool is_portrait_mode = {});

View file

@ -11,7 +11,9 @@
#include "video_core/renderer_opengl/gl_texture_runtime.h"
#include "video_core/host_shaders/format_reinterpreter/d24s8_to_rgba8_frag.h"
#include "video_core/host_shaders/format_reinterpreter/d24s8_to_rgba8_ms_frag.h"
#include "video_core/host_shaders/format_reinterpreter/rgba4_to_rgb5a1_frag.h"
#include "video_core/host_shaders/format_reinterpreter/rgba4_to_rgb5a1_ms_frag.h"
#include "video_core/host_shaders/full_screen_triangle_vert.h"
#include "video_core/host_shaders/texture_filtering/bicubic_frag.h"
#include "video_core/host_shaders/texture_filtering/mmpx_frag.h"
@ -65,8 +67,13 @@ BlitHelper::BlitHelper(const Driver& driver_)
gradient_y_program{CreateProgram(HostShaders::Y_GRADIENT_FRAG, "Y_GRADIENT_FRAG")},
refine_program{CreateProgram(HostShaders::REFINE_FRAG, "REFINE_FRAG")},
d24s8_to_rgba8{CreateProgram(HostShaders::D24S8_TO_RGBA8_FRAG, "D24S8_TO_RGBA8_FRAG")},
rgba4_to_rgb5a1{CreateProgram(HostShaders::RGBA4_TO_RGB5A1_FRAG, "RGBA4_TO_RGB5A1_FRAG")} {
d24s8_to_rgba8_ms{
CreateProgram(HostShaders::D24S8_TO_RGBA8_MS_FRAG, "D24S8_TO_RGBA8_MS_FRAG")},
rgba4_to_rgb5a1{CreateProgram(HostShaders::RGBA4_TO_RGB5A1_FRAG, "RGBA4_TO_RGB5A1_FRAG")},
rgba4_to_rgb5a1_ms{
CreateProgram(HostShaders::RGBA4_TO_RGB5A1_MS_FRAG, "RGBA4_TO_RGB5A1_MS_FRAG")} {
vao.Create();
read_fbo.Create();
draw_fbo.Create();
state.draw.vertex_array = vao.handle;
for (u32 i = 0; i < 3; i++) {
@ -87,46 +94,63 @@ bool BlitHelper::ConvertDS24S8ToRGBA8(Surface& source, Surface& dest,
OpenGLState prev_state = OpenGLState::GetCurState();
SCOPE_EXIT({ prev_state.Apply(); });
state.texture_units[0].texture_2d = source.Handle();
const bool multisample = (source.sample_count > 1) && (dest.sample_count > 1);
const GLuint textarget = multisample ? GL_TEXTURE_2D_MULTISAMPLE : GL_TEXTURE_2D;
state.texture_units[0].texture_2d = source.Handle(multisample ? 3 : 1);
state.texture_units[0].target = textarget;
state.texture_units[0].sampler = 0;
state.texture_units[1].sampler = 0;
if (use_texture_view) {
temp_tex.Create();
glActiveTexture(GL_TEXTURE1);
glTextureView(temp_tex.handle, GL_TEXTURE_2D, source.Handle(), GL_DEPTH24_STENCIL8, 0, 1, 0,
1);
glTextureView(temp_tex.handle, textarget, source.Handle(multisample ? 3 : 1),
GL_DEPTH24_STENCIL8, 0, 1, 0, 1);
if (!multisample) {
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
}
} else if (copy.extent.width > temp_extent.width || copy.extent.height > temp_extent.height) {
temp_extent = copy.extent;
temp_tex.Release();
temp_tex.Create();
state.texture_units[1].texture_2d = temp_tex.handle;
state.texture_units[1].target = textarget;
state.Apply();
glActiveTexture(GL_TEXTURE1);
glTexStorage2D(GL_TEXTURE_2D, 1, GL_DEPTH24_STENCIL8, temp_extent.width,
if (multisample) {
glTexStorage2DMultisample(textarget, source.sample_count, GL_DEPTH24_STENCIL8,
temp_extent.width, temp_extent.height, true);
} else {
glTexStorage2D(textarget, 1, GL_DEPTH24_STENCIL8, temp_extent.width,
temp_extent.height);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
glTexParameteri(textarget, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
glTexParameteri(textarget, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
}
}
state.texture_units[1].texture_2d = temp_tex.handle;
state.texture_units[1].target = textarget;
state.Apply();
glActiveTexture(GL_TEXTURE1);
if (!use_texture_view) {
glCopyImageSubData(source.Handle(), GL_TEXTURE_2D, 0, copy.src_offset.x, copy.src_offset.y,
0, temp_tex.handle, GL_TEXTURE_2D, 0, copy.src_offset.x,
glCopyImageSubData(source.Handle(multisample ? 3 : 1), textarget, 0, copy.src_offset.x,
copy.src_offset.y, 0, temp_tex.handle, textarget, 0, copy.src_offset.x,
copy.src_offset.y, 0, copy.extent.width, copy.extent.height, 1);
}
glTexParameteri(GL_TEXTURE_2D, GL_DEPTH_STENCIL_TEXTURE_MODE, GL_STENCIL_INDEX);
glTexParameteri(textarget, GL_DEPTH_STENCIL_TEXTURE_MODE, GL_STENCIL_INDEX);
const Common::Rectangle src_rect{copy.src_offset.x, copy.src_offset.y + copy.extent.height,
copy.src_offset.x + copy.extent.width, copy.src_offset.y};
const Common::Rectangle dst_rect{copy.dst_offset.x, copy.dst_offset.y + copy.extent.height,
copy.dst_offset.x + copy.extent.width, copy.dst_offset.y};
SetParams(d24s8_to_rgba8, source.RealExtent(), src_rect);
Draw(d24s8_to_rgba8, dest.Handle(), draw_fbo.handle, 0, dst_rect);
OGLProgram& blit_program = multisample ? d24s8_to_rgba8_ms : d24s8_to_rgba8;
SetParams(blit_program, source.RealExtent(), src_rect);
Draw(blit_program, dest.Handle(multisample ? 3 : 1), draw_fbo.handle, 0, dst_rect, multisample);
if (use_texture_view) {
temp_tex.Release();
@ -136,6 +160,11 @@ bool BlitHelper::ConvertDS24S8ToRGBA8(Surface& source, Surface& dest,
state.texture_units[0].sampler = linear_sampler.handle;
state.texture_units[1].sampler = linear_sampler.handle;
if (multisample) {
// Resolve the destination image if needed
ResolveTexture(dest, copy.dst_level, copy.dst_layer);
}
return true;
}
@ -144,18 +173,48 @@ bool BlitHelper::ConvertRGBA4ToRGB5A1(Surface& source, Surface& dest,
OpenGLState prev_state = OpenGLState::GetCurState();
SCOPE_EXIT({ prev_state.Apply(); });
state.texture_units[0].texture_2d = source.Handle();
const bool multisample = (source.sample_count > 1) && (dest.sample_count > 1);
state.texture_units[0].texture_2d = source.Handle(multisample ? 3 : 1);
const Common::Rectangle src_rect{copy.src_offset.x, copy.src_offset.y + copy.extent.height,
copy.src_offset.x + copy.extent.width, copy.src_offset.y};
const Common::Rectangle dst_rect{copy.dst_offset.x, copy.dst_offset.y + copy.extent.height,
copy.dst_offset.x + copy.extent.width, copy.dst_offset.y};
SetParams(rgba4_to_rgb5a1, source.RealExtent(), src_rect);
Draw(rgba4_to_rgb5a1, dest.Handle(), draw_fbo.handle, 0, dst_rect);
OGLProgram& blit_program = multisample ? rgba4_to_rgb5a1_ms : rgba4_to_rgb5a1;
SetParams(blit_program, source.RealExtent(), src_rect);
Draw(blit_program, dest.Handle(multisample ? 3 : 1), draw_fbo.handle, 0, dst_rect, multisample);
if (multisample) {
// Resolve the destination image if needed
ResolveTexture(dest, copy.dst_level, copy.dst_layer);
}
return true;
}
void BlitHelper::ResolveTexture(Surface& surface, u32 level, u32 layer) {
OpenGLState prev_state = OpenGLState::GetCurState();
SCOPE_EXIT({ prev_state.Apply(); });
state.draw.read_framebuffer = read_fbo.handle;
state.draw.draw_framebuffer = draw_fbo.handle;
state.texture_units[0].texture_2d = 0;
state.texture_units[1].texture_2d = 0;
state.texture_units[2].texture_2d = 0;
state.Apply();
surface.Attach(GL_READ_FRAMEBUFFER, level, layer, 3);
surface.Attach(GL_DRAW_FRAMEBUFFER, level, layer, 1);
const GLbitfield buffer_mask = surface.type == SurfaceType::Depth ? GL_DEPTH_BUFFER_BIT
: surface.type == SurfaceType::DepthStencil
? (GL_DEPTH_BUFFER_BIT | GL_STENCIL_BUFFER_BIT)
: GL_COLOR_BUFFER_BIT;
glBlitFramebuffer(0, 0, surface.GetScaledWidth(), surface.GetScaledHeight(), 0, 0,
surface.GetScaledWidth(), surface.GetScaledHeight(), buffer_mask, GL_NEAREST);
}
bool BlitHelper::Filter(Surface& surface, const VideoCore::TextureBlit& blit) {
const auto filter = Settings::values.texture_filter.GetValue();
const bool is_depth =
@ -290,7 +349,7 @@ void BlitHelper::SetParams(OGLProgram& program, const VideoCore::Extent& src_ext
}
void BlitHelper::Draw(OGLProgram& program, GLuint dst_tex, GLuint dst_fbo, u32 dst_level,
Common::Rectangle<u32> dst_rect) {
Common::Rectangle<u32> dst_rect, bool multisample) {
state.draw.draw_framebuffer = dst_fbo;
state.draw.shader_program = program.handle;
state.viewport.x = dst_rect.left;
@ -299,9 +358,11 @@ void BlitHelper::Draw(OGLProgram& program, GLuint dst_tex, GLuint dst_fbo, u32 d
state.viewport.height = dst_rect.GetHeight();
state.Apply();
glFramebufferTexture2D(GL_DRAW_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_TEXTURE_2D, dst_tex,
const GLuint textarget = multisample ? GL_TEXTURE_2D_MULTISAMPLE : GL_TEXTURE_2D;
glFramebufferTexture2D(GL_DRAW_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, textarget, dst_tex,
dst_level);
glFramebufferTexture2D(GL_DRAW_FRAMEBUFFER, GL_DEPTH_STENCIL_ATTACHMENT, GL_TEXTURE_2D, 0, 0);
glFramebufferTexture2D(GL_DRAW_FRAMEBUFFER, GL_DEPTH_STENCIL_ATTACHMENT, textarget, 0, 0);
glDrawArrays(GL_TRIANGLES, 0, 3);
}
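A note on the buffer-mask selection in `BlitHelper::ResolveTexture` above: the mask handed to `glBlitFramebuffer` depends only on the surface type. The sketch below isolates that decision so it can be checked without a GL context. `SurfaceKind` and `ResolveMask` are hypothetical names used for illustration only; the three constants carry the same values as the corresponding OpenGL enums.

```cpp
#include <cassert>
#include <cstdint>

// Same values as the OpenGL headers; redefined so the sketch builds without GL.
constexpr std::uint32_t kDepthBufferBit = 0x00000100;   // GL_DEPTH_BUFFER_BIT
constexpr std::uint32_t kStencilBufferBit = 0x00000400; // GL_STENCIL_BUFFER_BIT
constexpr std::uint32_t kColorBufferBit = 0x00004000;   // GL_COLOR_BUFFER_BIT

// Hypothetical stand-in for the emulator's SurfaceType enum.
enum class SurfaceKind { Color, Texture, Depth, DepthStencil };

// Picks the blit mask for a resolve: depth-only, depth+stencil, or color.
constexpr std::uint32_t ResolveMask(SurfaceKind kind) {
    switch (kind) {
    case SurfaceKind::Depth:
        return kDepthBufferBit;
    case SurfaceKind::DepthStencil:
        return kDepthBufferBit | kStencilBufferBit;
    default:
        return kColorBufferBit;
    }
}
```

Depth and stencil blits must use `GL_NEAREST`, and a multisample resolve requires source and destination rectangles of equal size; the call in the diff satisfies both.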

@ -1,4 +1,4 @@
// Copyright 2023 Citra Emulator Project
// Copyright Citra Emulator Project / Azahar Emulator Project
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
@ -31,6 +31,8 @@ public:
bool ConvertRGBA4ToRGB5A1(Surface& source, Surface& dest, const VideoCore::TextureCopy& copy);
void ResolveTexture(Surface& surface, u32 level = 0, u32 layer = 0);
private:
void FilterAnime4K(Surface& surface, const VideoCore::TextureBlit& blit);
void FilterBicubic(Surface& surface, const VideoCore::TextureBlit& blit);
@ -41,12 +43,13 @@ private:
void SetParams(OGLProgram& program, const VideoCore::Extent& src_extent,
Common::Rectangle<u32> src_rect);
void Draw(OGLProgram& program, GLuint dst_tex, GLuint dst_fbo, u32 dst_level,
Common::Rectangle<u32> dst_rect);
Common::Rectangle<u32> dst_rect, bool multisample = false);
private:
const Driver& driver;
OGLVertexArray vao;
OpenGLState state;
OGLFramebuffer read_fbo;
OGLFramebuffer draw_fbo;
OGLSampler linear_sampler;
OGLSampler nearest_sampler;
@ -59,7 +62,9 @@ private:
OGLProgram gradient_y_program;
OGLProgram refine_program;
OGLProgram d24s8_to_rgba8;
OGLProgram d24s8_to_rgba8_ms;
OGLProgram rgba4_to_rgb5a1;
OGLProgram rgba4_to_rgb5a1_ms;
OGLTexture temp_tex;
VideoCore::Extent temp_extent{};

@ -164,6 +164,10 @@ RasterizerOpenGL::RasterizerOpenGL(Memory::MemorySystem& memory, Pica::PicaCore&
glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, index_buffer.GetHandle());
glEnable(GL_BLEND);
glEnable(GL_MULTISAMPLE);
glEnable(GL_SAMPLE_SHADING);
glMinSampleShading(1.0f);
}
RasterizerOpenGL::~RasterizerOpenGL() = default;
@ -643,6 +647,22 @@ bool RasterizerOpenGL::Draw(bool accelerate, bool is_indexed) {
}
}
// Resolve after every draw. This is a slow initial approach that keeps the MSAA and non-MSAA
// buffers in sync at all times.
if (framebuffer->color_id != VideoCore::SurfaceId{}) {
Surface& color_surface = res_cache.GetSurface(framebuffer->color_id);
if (color_surface.GetSampleCount() > 1) {
runtime.ResolveTexture(color_surface, framebuffer->color_level);
}
}
if (framebuffer->depth_id != VideoCore::SurfaceId{}) {
Surface& depth_surface = res_cache.GetSurface(framebuffer->depth_id);
if (depth_surface.GetSampleCount() > 1) {
runtime.ResolveTexture(depth_surface, framebuffer->depth_level);
}
}
vertex_batch.clear();
if (shadow_rendering) {
@ -1005,7 +1025,8 @@ void RasterizerOpenGL::SyncAndUploadLUTs() {
}
void RasterizerOpenGL::UploadUniforms(bool accelerate_draw) {
// glBindBufferRange also changes the generic buffer binding point, so we sync the state first.
// glBindBufferRange also changes the generic buffer binding point, so we sync the state
// first.
state.draw.uniform_buffer = uniform_buffer.GetHandle();
state.Apply();
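The commit message for the per-draw resolve suggests an "msaa dirty flag" as the likely better heuristic, and the `Todo(wunk)` in `BlitTextures` asks for the same thing. A minimal sketch of that idea follows, assuming a hypothetical `MsaaSurface` type; none of these names exist in the PR, and the real resolve would be the framebuffer blit rather than a counter.

```cpp
#include <cassert>

// Hypothetical surface that tracks whether its MSAA image is newer than
// its resolved (single-sample) image.
struct MsaaSurface {
    unsigned sample_count = 1;
    bool msaa_dirty = false;
    int resolve_calls = 0; // counts resolves, for illustration only

    // Called after a draw that rendered into the MSAA image.
    void MarkDrawn() {
        if (sample_count > 1) {
            msaa_dirty = true;
        }
    }

    // Called before anything reads the single-sample image.
    void ResolveIfDirty() {
        if (!msaa_dirty) {
            return;
        }
        ++resolve_calls; // a real implementation would blit the MSAA image here
        msaa_dirty = false;
    }
};
```

With this shape, any number of consecutive draws costs one resolve at the next read instead of one resolve per draw.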

@ -92,20 +92,27 @@ static constexpr std::array<FormatTuple, 8> CUSTOM_TUPLES = {{
return 0;
}
[[nodiscard]] OGLTexture MakeHandle(GLenum target, u32 width, u32 height, u32 levels,
[[nodiscard]] OGLTexture MakeHandle(GLenum target, u32 width, u32 height, u32 levels, u32 samples,
const FormatTuple& tuple, std::string_view debug_name = "") {
OGLTexture texture{};
texture.Create();
if (samples > 1) {
ASSERT(target == GL_TEXTURE_2D);
glBindTexture(GL_TEXTURE_2D_MULTISAMPLE, texture.handle);
glTexStorage2DMultisample(GL_TEXTURE_2D_MULTISAMPLE, samples, tuple.internal_format, width,
height, true);
} else {
glBindTexture(target, texture.handle);
glTexStorage2D(target, levels, tuple.internal_format, width, height);
}
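if (samples <= 1) {
// Multisample textures have no sampler state, and `target` is not their bound target
glTexParameteri(target, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
glTexParameteri(target, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE);
glTexParameteri(target, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE);
}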
if (!debug_name.empty()) {
glObjectLabel(GL_TEXTURE, texture.handle, -1, debug_name.data());
glObjectLabel(GL_TEXTURE, texture.handle, debug_name.size(), debug_name.data());
}
return texture;
@ -215,6 +222,14 @@ bool TextureRuntime::ClearTextureWithoutFbo(Surface& surface,
glClearTexSubImage(surface.Handle(), clear.texture_level, clear.texture_rect.left,
clear.texture_rect.bottom, 0, clear.texture_rect.GetWidth(),
clear.texture_rect.GetHeight(), 1, format, type, &clear.value);
if (surface.sample_count > 1) {
// Clear MSAA too
glClearTexSubImage(surface.Handle(3), clear.texture_level, clear.texture_rect.left,
clear.texture_rect.bottom, 0, clear.texture_rect.GetWidth(),
clear.texture_rect.GetHeight(), 1, format, type, &clear.value);
}
return true;
}
@ -232,8 +247,7 @@ void TextureRuntime::ClearTexture(Surface& surface, const VideoCore::TextureClea
state.draw.draw_framebuffer = draw_fbos[FboIndex(surface.type)].handle;
state.Apply();
surface.Attach(GL_DRAW_FRAMEBUFFER, clear.texture_level, 0);
const auto ClearBuffer = [&surface, &state, &clear]() {
switch (surface.type) {
case SurfaceType::Color:
case SurfaceType::Texture:
@ -258,6 +272,16 @@ void TextureRuntime::ClearTexture(Surface& surface, const VideoCore::TextureClea
default:
UNREACHABLE_MSG("Unknown surface type {}", surface.type);
}
};
surface.Attach(GL_DRAW_FRAMEBUFFER, clear.texture_level, 0);
ClearBuffer();
if (surface.GetSampleCount() > 1) {
// Clear MSAA too
surface.Attach(GL_DRAW_FRAMEBUFFER, clear.texture_level, 0, 3);
ClearBuffer();
}
}
bool TextureRuntime::CopyTextures(Surface& source, Surface& dest,
@ -279,6 +303,15 @@ bool TextureRuntime::CopyTextures(Surface& source, Surface& dest,
bool TextureRuntime::BlitTextures(Surface& source, Surface& dest,
const VideoCore::TextureBlit& blit) {
// Must resolve images first
// Todo(wunk): Add a "dirty" flag for msaa resolves to avoid redundant image resolves
if (source.sample_count > 1) {
blit_helper.ResolveTexture(source, blit.src_level, blit.src_layer);
}
if (dest.sample_count > 1) {
blit_helper.ResolveTexture(dest, blit.dst_level, blit.dst_layer);
}
OpenGLState state = OpenGLState::GetCurState();
state.scissor.enabled = false;
state.draw.read_framebuffer = read_fbos[FboIndex(source.type)].handle;
@ -329,11 +362,16 @@ Surface::Surface(TextureRuntime& runtime_, const VideoCore::SurfaceParams& param
const GLenum target =
texture_type == VideoCore::TextureType::CubeMap ? GL_TEXTURE_CUBE_MAP : GL_TEXTURE_2D;
textures[0] = MakeHandle(target, width, height, levels, tuple, DebugName(false));
textures[0] = MakeHandle(target, width, height, levels, 1, tuple, DebugName(false));
if (res_scale != 1) {
textures[1] = MakeHandle(target, GetScaledWidth(), GetScaledHeight(), levels, tuple,
textures[1] = MakeHandle(target, GetScaledWidth(), GetScaledHeight(), levels, 1, tuple,
DebugName(true, false));
}
if (sample_count > 1) {
textures[3] = MakeHandle(target, GetScaledWidth(), GetScaledHeight(), levels, sample_count,
tuple, DebugName(true, false, sample_count));
}
}
Surface::Surface(TextureRuntime& runtime_, const VideoCore::SurfaceBase& surface,
@ -351,15 +389,19 @@ Surface::Surface(TextureRuntime& runtime_, const VideoCore::SurfaceBase& surface
custom_format = mat->format;
material = mat;
textures[0] = MakeHandle(target, mat->width, mat->height, levels, tuple, DebugName(false));
textures[0] = MakeHandle(target, mat->width, mat->height, levels, 1, tuple, DebugName(false));
if (res_scale != 1) {
textures[1] = MakeHandle(target, mat->width, mat->height, levels, DEFAULT_TUPLE,
textures[1] = MakeHandle(target, mat->width, mat->height, levels, 1, DEFAULT_TUPLE,
DebugName(true, true));
}
const bool has_normal = mat->Map(MapType::Normal);
if (has_normal) {
textures[2] =
MakeHandle(target, mat->width, mat->height, levels, tuple, DebugName(true, true));
MakeHandle(target, mat->width, mat->height, levels, 1, tuple, DebugName(true, true));
}
if (sample_count > 1) {
textures[3] = MakeHandle(target, mat->width, mat->height, levels, sample_count,
DEFAULT_TUPLE, DebugName(true, true, sample_count));
}
}
@ -374,8 +416,8 @@ GLuint Surface::Handle(u32 index) const noexcept {
GLuint Surface::CopyHandle() noexcept {
if (!copy_texture.handle) {
copy_texture = MakeHandle(GL_TEXTURE_2D, GetScaledWidth(), GetScaledHeight(), levels, tuple,
DebugName(true));
copy_texture = MakeHandle(GL_TEXTURE_2D, GetScaledWidth(), GetScaledHeight(), levels, 1,
tuple, DebugName(true));
}
for (u32 level = 0; level < levels; level++) {
@ -534,37 +576,38 @@ bool Surface::DownloadWithoutFbo(const VideoCore::BufferTextureCopy& download,
return false;
}
void Surface::Attach(GLenum target, u32 level, u32 layer, bool scaled) {
const GLuint handle = Handle(static_cast<u32>(scaled));
const GLenum textarget = texture_type == TextureType::CubeMap
? GL_TEXTURE_CUBE_MAP_POSITIVE_X + layer
void Surface::Attach(GLenum target, u32 level, u32 layer, u32 handle) {
const GLuint gl_handle = Handle(handle);
GLenum textarget = texture_type == TextureType::CubeMap ? GL_TEXTURE_CUBE_MAP_POSITIVE_X + layer
: GL_TEXTURE_2D;
if (handle == 3 && sample_count > 1) {
ASSERT(texture_type == TextureType::Texture2D);
textarget = GL_TEXTURE_2D_MULTISAMPLE;
}
switch (type) {
case SurfaceType::Color:
case SurfaceType::Texture:
glFramebufferTexture2D(target, GL_COLOR_ATTACHMENT0, textarget, handle, level);
glFramebufferTexture2D(target, GL_COLOR_ATTACHMENT0, textarget, gl_handle, level);
break;
case SurfaceType::Depth:
glFramebufferTexture2D(target, GL_DEPTH_ATTACHMENT, textarget, handle, level);
glFramebufferTexture2D(target, GL_DEPTH_ATTACHMENT, textarget, gl_handle, level);
break;
case SurfaceType::DepthStencil:
glFramebufferTexture2D(target, GL_DEPTH_STENCIL_ATTACHMENT, textarget, handle, level);
glFramebufferTexture2D(target, GL_DEPTH_STENCIL_ATTACHMENT, textarget, gl_handle, level);
break;
default:
UNREACHABLE_MSG("Invalid surface type!");
}
}
void Surface::ScaleUp(u32 new_scale) {
if (res_scale == new_scale || new_scale == 1) {
return;
}
void Surface::ScaleUp(u32 new_scale, u8 new_sample_count) {
const bool res_scale_modified = res_scale != new_scale;
if (res_scale_modified && new_scale > 1) {
res_scale = new_scale;
textures[1] = MakeHandle(GL_TEXTURE_2D, GetScaledWidth(), GetScaledHeight(), levels, tuple,
DebugName(true));
textures[1] = MakeHandle(GL_TEXTURE_2D, GetScaledWidth(), GetScaledHeight(), levels, 1,
tuple, DebugName(true));
for (u32 level = 0; level < levels; level++) {
const VideoCore::TextureBlit blit = {
.src_level = level,
@ -574,6 +617,13 @@ void Surface::ScaleUp(u32 new_scale) {
};
BlitScale(blit, true);
}
}
if ((res_scale_modified || sample_count != new_sample_count) && new_sample_count > 1) {
sample_count = new_sample_count;
textures[3] = MakeHandle(GL_TEXTURE_2D, GetScaledWidth(), GetScaledHeight(), levels,
sample_count, tuple, DebugName(true));
}
}
u32 Surface::GetInternalBytesPerPixel() const {
@ -606,7 +656,8 @@ void Surface::BlitScale(const VideoCore::TextureBlit& blit, bool up_scale) {
Framebuffer::Framebuffer(TextureRuntime& runtime, const VideoCore::FramebufferParams& params,
const Surface* color, const Surface* depth)
: VideoCore::FramebufferParams{params},
res_scale{color ? color->res_scale : (depth ? depth->res_scale : 1u)} {
res_scale{color ? color->res_scale : (depth ? depth->res_scale : 1u)},
sample_count{color ? color->sample_count : (depth ? depth->sample_count : 1u)} {
if (shadow_rendering && !color) {
return;
@ -619,6 +670,15 @@ Framebuffer::Framebuffer(TextureRuntime& runtime, const VideoCore::FramebufferPa
attachments[1] = depth->Handle();
}
if (sample_count > 1) {
if (color) {
attachments[2] = color->Handle(3);
}
if (depth) {
attachments[3] = depth->Handle(3);
}
}
framebuffer.Create();
OpenGLState state = OpenGLState::GetCurState();
@ -650,6 +710,27 @@ Framebuffer::Framebuffer(TextureRuntime& runtime, const VideoCore::FramebufferPa
glFramebufferTexture2D(GL_DRAW_FRAMEBUFFER, GL_DEPTH_STENCIL_ATTACHMENT, GL_TEXTURE_2D,
0, 0);
}
if (sample_count > 1) {
if (color) {
glFramebufferTexture2D(GL_DRAW_FRAMEBUFFER, GL_COLOR_ATTACHMENT0,
GL_TEXTURE_2D_MULTISAMPLE, color->Handle(3),
color_level);
}
if (depth) {
if (depth->pixel_format == PixelFormat::D24S8) {
glFramebufferTexture2D(GL_DRAW_FRAMEBUFFER, GL_DEPTH_STENCIL_ATTACHMENT,
GL_TEXTURE_2D_MULTISAMPLE, depth->Handle(3),
depth_level);
} else {
glFramebufferTexture2D(GL_DRAW_FRAMEBUFFER, GL_DEPTH_ATTACHMENT,
GL_TEXTURE_2D_MULTISAMPLE, depth->Handle(3),
depth_level);
glFramebufferTexture2D(GL_DRAW_FRAMEBUFFER, GL_STENCIL_ATTACHMENT,
GL_TEXTURE_2D_MULTISAMPLE, 0, 0);
}
}
}
}
}
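On the `MakeHandle` string-view fix in this file: a `std::string_view` may point into the middle of a larger buffer, so handing its `data()` to an API that reads up to a null terminator can pick up trailing characters. The sketch below reproduces the hazard with a hypothetical `ReceiveLabel` function that mimics the `(length, pointer)` convention of `glObjectLabel`, where a negative length means "read until the terminator". Both function names are assumptions for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical stand-in for a C API taking (length, pointer).
// A negative length means "read until the null terminator".
std::string ReceiveLabel(int length, const char* label) {
    return length < 0 ? std::string(label)
                      : std::string(label, static_cast<std::size_t>(length));
}

// Safe forwarding: pass the view's explicit size instead of relying on
// a terminator that the view does not guarantee.
std::string LabelFromView(std::string_view name) {
    return ReceiveLabel(static_cast<int>(name.size()), name.data());
}
```

Passing `-1` with a view over the first seven characters of a longer string would label the object with the whole string; passing `name.size()` labels it with exactly the view.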

@ -78,6 +78,11 @@ public:
/// Generates mipmaps for all the available levels of the texture
void GenerateMipmaps(Surface& surface);
/// Resolve a surface's MSAA texture into the surface's appropriate non-MSAA texture
void ResolveTexture(Surface& surface, u32 level = 0, u32 layer = 0) {
blit_helper.ResolveTexture(surface, level, layer);
}
private:
/// Returns the OpenGL driver class
const Driver& GetDriver() const {
@ -130,10 +135,10 @@ public:
const VideoCore::StagingData& staging);
/// Attaches a handle of surface to the specified framebuffer target
void Attach(GLenum target, u32 level, u32 layer, bool scaled = true);
void Attach(GLenum target, u32 level, u32 layer, u32 handle = 1);
/// Scales up the surface to match the new resolution scale.
void ScaleUp(u32 new_scale);
/// Scales up the surface to match the new resolution scale and sample-count.
void ScaleUp(u32 new_scale, u8 new_sample_count);
/// Returns the bpp of the internal surface format
u32 GetInternalBytesPerPixel() const;
@ -149,7 +154,7 @@ private:
private:
const Driver* driver;
TextureRuntime* runtime;
std::array<OGLTexture, 3> textures;
std::array<OGLTexture, 4> textures;
OGLTexture copy_texture;
FormatTuple tuple;
};
@ -170,6 +175,10 @@ public:
return res_scale;
}
[[nodiscard]] u32 Samples() const noexcept {
return sample_count;
}
[[nodiscard]] GLuint Handle() const noexcept {
return framebuffer.handle;
}
@ -184,7 +193,8 @@ public:
private:
u32 res_scale{1};
std::array<GLuint, 2> attachments{};
u32 sample_count{1};
std::array<GLuint, 4> attachments{};
OGLFramebuffer framebuffer;
};

@ -14,6 +14,7 @@
#include "video_core/renderer_vulkan/vk_texture_runtime.h"
#include "video_core/host_shaders/format_reinterpreter/vulkan_d24s8_to_rgba8_comp.h"
#include "video_core/host_shaders/format_reinterpreter/vulkan_d24s8_to_rgba8_ms_comp.h"
#include "video_core/host_shaders/full_screen_triangle_vert.h"
#include "video_core/host_shaders/vulkan_blit_depth_stencil_frag.h"
#include "video_core/host_shaders/vulkan_depth_to_buffer_comp.h"
@ -248,6 +249,8 @@ BlitHelper::BlitHelper(const Instance& instance_, Scheduler& scheduler_,
vk::ShaderStageFlagBits::eVertex, device)},
d24s8_to_rgba8_comp{Compile(HostShaders::VULKAN_D24S8_TO_RGBA8_COMP,
vk::ShaderStageFlagBits::eCompute, device)},
d24s8_to_rgba8_ms_comp{Compile(HostShaders::VULKAN_D24S8_TO_RGBA8_MS_COMP,
vk::ShaderStageFlagBits::eCompute, device)},
depth_to_buffer_comp{Compile(HostShaders::VULKAN_DEPTH_TO_BUFFER_COMP,
vk::ShaderStageFlagBits::eCompute, device)},
blit_depth_stencil_frag{VK_NULL_HANDLE},
@ -260,6 +263,8 @@ BlitHelper::BlitHelper(const Instance& instance_, Scheduler& scheduler_,
mmpx_frag{Compile(HostShaders::MMPX_FRAG, vk::ShaderStageFlagBits::eFragment, device)},
refine_frag{Compile(HostShaders::REFINE_FRAG, vk::ShaderStageFlagBits::eFragment, device)},
d24s8_to_rgba8_pipeline{MakeComputePipeline(d24s8_to_rgba8_comp, compute_pipeline_layout)},
d24s8_to_rgba8_ms_pipeline{
MakeComputePipeline(d24s8_to_rgba8_ms_comp, compute_pipeline_layout)},
depth_to_buffer_pipeline{
MakeComputePipeline(depth_to_buffer_comp, compute_buffer_pipeline_layout)},
depth_blit_pipeline{VK_NULL_HANDLE},
@ -284,11 +289,13 @@ BlitHelper::BlitHelper(const Instance& instance_, Scheduler& scheduler_,
"BlitHelper: three_textures_pipeline_layout");
SetObjectName(device, full_screen_vert, "BlitHelper: full_screen_vert");
SetObjectName(device, d24s8_to_rgba8_comp, "BlitHelper: d24s8_to_rgba8_comp");
SetObjectName(device, d24s8_to_rgba8_ms_comp, "BlitHelper: d24s8_to_rgba8_ms_comp");
SetObjectName(device, depth_to_buffer_comp, "BlitHelper: depth_to_buffer_comp");
if (blit_depth_stencil_frag) {
SetObjectName(device, blit_depth_stencil_frag, "BlitHelper: blit_depth_stencil_frag");
}
SetObjectName(device, d24s8_to_rgba8_pipeline, "BlitHelper: d24s8_to_rgba8_pipeline");
SetObjectName(device, d24s8_to_rgba8_ms_pipeline, "BlitHelper: d24s8_to_rgba8_ms_pipeline");
SetObjectName(device, depth_to_buffer_pipeline, "BlitHelper: depth_to_buffer_pipeline");
if (depth_blit_pipeline) {
SetObjectName(device, depth_blit_pipeline, "BlitHelper: depth_blit_pipeline");
@ -310,6 +317,7 @@ BlitHelper::~BlitHelper() {
device.destroyPipelineLayout(three_textures_pipeline_layout);
device.destroyShaderModule(full_screen_vert);
device.destroyShaderModule(d24s8_to_rgba8_comp);
device.destroyShaderModule(d24s8_to_rgba8_ms_comp);
device.destroyShaderModule(depth_to_buffer_comp);
if (blit_depth_stencil_frag) {
device.destroyShaderModule(blit_depth_stencil_frag);
@ -322,6 +330,7 @@ BlitHelper::~BlitHelper() {
device.destroyShaderModule(refine_frag);
device.destroyPipeline(depth_to_buffer_pipeline);
device.destroyPipeline(d24s8_to_rgba8_pipeline);
device.destroyPipeline(d24s8_to_rgba8_ms_pipeline);
device.destroyPipeline(depth_blit_pipeline);
device.destroySampler(linear_sampler);
device.destroySampler(nearest_sampler);
@ -401,16 +410,23 @@ bool BlitHelper::BlitDepthStencil(Surface& source, Surface& dest,
bool BlitHelper::ConvertDS24S8ToRGBA8(Surface& source, Surface& dest,
const VideoCore::TextureCopy& copy) {
const bool multisample = (source.sample_count > 1) && (dest.sample_count > 1);
const Type src_type = multisample ? Type::MultiSampled : Type::Current;
const auto pipeline = multisample ? d24s8_to_rgba8_ms_pipeline : d24s8_to_rgba8_pipeline;
const auto descriptor_set = compute_provider.Commit();
update_queue.AddImageSampler(descriptor_set, 0, 0, source.DepthView(), VK_NULL_HANDLE,
update_queue.AddImageSampler(descriptor_set, 0, 0, source.ImageView(ViewType::Depth, src_type),
VK_NULL_HANDLE, vk::ImageLayout::eDepthStencilReadOnlyOptimal);
update_queue.AddImageSampler(descriptor_set, 1, 0,
source.ImageView(ViewType::Stencil, src_type), VK_NULL_HANDLE,
vk::ImageLayout::eDepthStencilReadOnlyOptimal);
update_queue.AddImageSampler(descriptor_set, 1, 0, source.StencilView(), VK_NULL_HANDLE,
vk::ImageLayout::eDepthStencilReadOnlyOptimal);
update_queue.AddStorageImage(descriptor_set, 2, dest.ImageView());
update_queue.AddStorageImage(descriptor_set, 2, dest.ImageView(ViewType::Sample, src_type));
renderpass_cache.EndRendering();
scheduler.Record([this, descriptor_set, copy, src_image = source.Image(),
dst_image = dest.Image()](vk::CommandBuffer cmdbuf) {
scheduler.Record([this, pipeline, descriptor_set, copy, src_image = source.Image(src_type),
dst_image = dest.Image(src_type)](vk::CommandBuffer cmdbuf) {
const std::array pre_barriers = {
vk::ImageMemoryBarrier{
.srcAccessMask = vk::AccessFlagBits::eDepthStencilAttachmentWrite,
@ -488,7 +504,7 @@ bool BlitHelper::ConvertDS24S8ToRGBA8(Surface& source, Surface& dest,
cmdbuf.bindDescriptorSets(vk::PipelineBindPoint::eCompute, compute_pipeline_layout, 0,
descriptor_set, {});
cmdbuf.bindPipeline(vk::PipelineBindPoint::eCompute, d24s8_to_rgba8_pipeline);
cmdbuf.bindPipeline(vk::PipelineBindPoint::eCompute, pipeline);
const ComputeInfo info = {
.src_offset = Common::Vec2i{static_cast<int>(copy.src_offset.x),
@ -507,6 +523,11 @@ bool BlitHelper::ConvertDS24S8ToRGBA8(Surface& source, Surface& dest,
vk::PipelineStageFlagBits::eTransfer,
vk::DependencyFlagBits::eByRegion, {}, {}, post_barriers);
});
if (multisample) {
// Resolve the destination image if needed
ResolveTexture(dest);
}
return true;
}
@ -585,6 +606,99 @@ bool BlitHelper::DepthToBuffer(Surface& source, vk::Buffer buffer,
return true;
}
void BlitHelper::ResolveTexture(Surface& surface) {
scheduler.Record([width = surface.GetScaledWidth(), height = surface.GetScaledHeight(),
aspect = surface.Aspect(), access_flags = surface.AccessFlags(),
pipeline_state_flags = surface.PipelineStageFlags(),
msaa_image = surface.Image(Type::MultiSampled),
dest_image = surface.Image()](vk::CommandBuffer cmdbuf) {
const vk::ImageResolve resolve_area = {
.srcSubresource{
.aspectMask = aspect,
.mipLevel = 0,
.baseArrayLayer = 0,
.layerCount = 1,
},
.srcOffset = {},
.dstSubresource{
.aspectMask = aspect,
.mipLevel = 0,
.baseArrayLayer = 0,
.layerCount = 1,
},
.dstOffset = {},
.extent{
.width = width,
.height = height,
.depth = 1,
},
};
const vk::ImageSubresourceRange subresource_range = vk::ImageSubresourceRange{
.aspectMask = aspect,
.baseMipLevel = 0,
.levelCount = 1,
.baseArrayLayer = 0,
.layerCount = VK_REMAINING_ARRAY_LAYERS,
};
const std::array read_barriers = {
vk::ImageMemoryBarrier{
.srcAccessMask = access_flags,
.dstAccessMask = vk::AccessFlagBits::eTransferRead,
.oldLayout = vk::ImageLayout::eGeneral,
.newLayout = vk::ImageLayout::eTransferSrcOptimal,
.srcQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED,
.dstQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED,
.image = msaa_image,
.subresourceRange = subresource_range,
},
vk::ImageMemoryBarrier{
.srcAccessMask = access_flags,
.dstAccessMask = vk::AccessFlagBits::eTransferWrite,
.oldLayout = vk::ImageLayout::eGeneral,
.newLayout = vk::ImageLayout::eTransferDstOptimal,
.srcQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED,
.dstQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED,
.image = dest_image,
.subresourceRange = subresource_range,
},
};
const std::array write_barriers = {
vk::ImageMemoryBarrier{
.srcAccessMask = vk::AccessFlagBits::eTransferRead,
.dstAccessMask = access_flags,
.oldLayout = vk::ImageLayout::eTransferSrcOptimal,
.newLayout = vk::ImageLayout::eGeneral,
.srcQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED,
.dstQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED,
.image = msaa_image,
.subresourceRange = subresource_range,
},
vk::ImageMemoryBarrier{
.srcAccessMask = vk::AccessFlagBits::eTransferWrite,
.dstAccessMask = access_flags,
.oldLayout = vk::ImageLayout::eTransferDstOptimal,
.newLayout = vk::ImageLayout::eGeneral,
.srcQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED,
.dstQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED,
.image = dest_image,
.subresourceRange = subresource_range,
},
};
cmdbuf.pipelineBarrier(pipeline_state_flags, vk::PipelineStageFlagBits::eTransfer,
vk::DependencyFlagBits::eByRegion, {}, {}, read_barriers);
cmdbuf.resolveImage(msaa_image, vk::ImageLayout::eTransferSrcOptimal, dest_image,
vk::ImageLayout::eTransferDstOptimal, resolve_area);
cmdbuf.pipelineBarrier(vk::PipelineStageFlagBits::eTransfer, pipeline_state_flags,
vk::DependencyFlagBits::eByRegion, {}, {}, write_barriers);
});
}
vk::Pipeline BlitHelper::MakeComputePipeline(vk::ShaderModule shader, vk::PipelineLayout layout) {
const vk::ComputePipelineCreateInfo compute_info = {
.stage = MakeStages(shader),

@ -39,6 +39,8 @@ public:
bool DepthToBuffer(Surface& source, vk::Buffer buffer,
const VideoCore::BufferTextureCopy& copy);
void ResolveTexture(Surface& surface);
private:
vk::Pipeline MakeComputePipeline(vk::ShaderModule shader, vk::PipelineLayout layout);
vk::Pipeline MakeDepthStencilBlitPipeline();
@ -83,6 +85,7 @@ private:
vk::ShaderModule full_screen_vert;
vk::ShaderModule d24s8_to_rgba8_comp;
vk::ShaderModule d24s8_to_rgba8_ms_comp;
vk::ShaderModule depth_to_buffer_comp;
vk::ShaderModule blit_depth_stencil_frag;
vk::ShaderModule bicubic_frag;
@ -92,6 +95,7 @@ private:
vk::ShaderModule refine_frag;
vk::Pipeline d24s8_to_rgba8_pipeline;
vk::Pipeline d24s8_to_rgba8_ms_pipeline;
vk::Pipeline depth_to_buffer_pipeline;
vk::Pipeline depth_blit_pipeline;
vk::Sampler linear_sampler;

@ -163,8 +163,9 @@ bool GraphicsPipeline::Build(bool fail_on_compile_required) {
};
const vk::PipelineMultisampleStateCreateInfo multisampling = {
.rasterizationSamples = vk::SampleCountFlagBits::e1,
.sampleShadingEnable = false,
.rasterizationSamples = vk::SampleCountFlagBits(info.state.attachments.sample_count),
.sampleShadingEnable = true,
.minSampleShading = 1.0f,
};
const vk::PipelineColorBlendAttachmentState colorblend_attachment = {
@ -275,7 +276,8 @@ bool GraphicsPipeline::Build(bool fail_on_compile_required) {
.pDynamicState = &dynamic_info,
.layout = pipeline_layout,
.renderPass = renderpass_cache.GetRenderpass(info.state.attachments.color,
info.state.attachments.depth, false),
info.state.attachments.depth, false,
info.state.attachments.sample_count),
};
if (fail_on_compile_required) {

@ -214,6 +214,7 @@ static_assert(std::is_trivial_v<VertexLayout>);
struct AttachmentInfo {
VideoCore::PixelFormat color;
VideoCore::PixelFormat depth;
u8 sample_count;
static consteval u64 StructHash() {
constexpr u64 STRUCT_VERSION = 0;
@ -225,7 +226,7 @@ struct AttachmentInfo {
LAYOUT_HASH,
// fields
FIELD_HASH(color), FIELD_HASH(depth));
FIELD_HASH(color), FIELD_HASH(depth), FIELD_HASH(sample_count));
}
};
static_assert(std::is_trivial_v<AttachmentInfo>);
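`AttachmentInfo::sample_count` is a plain `u8` that the pipeline builder casts directly to `vk::SampleCountFlagBits`. That cast is only meaningful because each Vulkan sample-count flag has the same numeric value as the count it names (1, 2, 4, up to 64). A small validation sketch is shown below; `IsValidSampleCount` and `SanitizeSampleCount` are hypothetical helpers, and falling back to 1 is an assumed policy rather than what the PR does.

```cpp
#include <cassert>
#include <cstdint>

// Vulkan sample-count flag bits equal the count they name (1, 2, 4, ... 64),
// so only those powers of two survive a direct cast.
constexpr bool IsValidSampleCount(std::uint8_t samples) {
    return samples >= 1 && samples <= 64 && (samples & (samples - 1)) == 0;
}

// Hypothetical fallback policy: anything invalid becomes single-sampled.
constexpr std::uint8_t SanitizeSampleCount(std::uint8_t samples) {
    return IsValidSampleCount(samples) ? samples : std::uint8_t{1};
}
```

Because the struct is trivial, a default-initialized `sample_count` is indeterminate and a value-initialized one is 0, which maps to no sample-count bit at all; a check like this would catch both before pipeline creation.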

@ -449,6 +449,8 @@ bool Instance::CreateDevice() {
add_extension(VK_KHR_SWAPCHAIN_EXTENSION_NAME);
image_format_list = add_extension(VK_KHR_IMAGE_FORMAT_LIST_EXTENSION_NAME);
create_renderpass2 = add_extension(VK_KHR_CREATE_RENDERPASS_2_EXTENSION_NAME);
depth_stencil_resolve = add_extension(VK_KHR_DEPTH_STENCIL_RESOLVE_EXTENSION_NAME);
shader_stencil_export = add_extension(VK_EXT_SHADER_STENCIL_EXPORT_EXTENSION_NAME);
external_memory_host = add_extension(VK_EXT_EXTERNAL_MEMORY_HOST_EXTENSION_NAME);
tooling_info = add_extension(VK_EXT_TOOLING_INFO_EXTENSION_NAME);
@ -518,9 +520,11 @@ bool Instance::CreateDevice() {
.features{
.robustBufferAccess = features.robustBufferAccess,
.geometryShader = features.geometryShader,
.sampleRateShading = features.sampleRateShading,
.logicOp = features.logicOp,
.samplerAnisotropy = features.samplerAnisotropy,
.fragmentStoresAndAtomics = features.fragmentStoresAndAtomics,
.shaderStorageImageMultisample = features.shaderStorageImageMultisample,
.shaderClipDistance = features.shaderClipDistance,
},
},

@ -251,6 +251,11 @@ public:
return triangle_fan_supported;
}
/// Returns true when sampleRateShading, VK_KHR_create_renderpass2, and
/// VK_KHR_depth_stencil_resolve are all available.
bool IsMultiSampleSupported() const {
return features.sampleRateShading && create_renderpass2 && depth_stencil_resolve;
}
/// Returns true if dynamic indices can be used inside shaders.
bool IsImageArrayDynamicIndexSupported() const {
return features.shaderSampledImageArrayDynamicIndexing;
@ -330,6 +335,8 @@ protected:
bool index_type_uint8{};
bool fragment_shader_interlock{};
bool image_format_list{};
bool create_renderpass2{};
bool depth_stencil_resolve{};
bool pipeline_creation_cache_control{};
bool fragment_shader_barycentric{};
bool shader_stencil_export{};
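The capability check above is a plain conjunction of one device feature and two extensions. A minimal sketch of how a caller might use it to fall back to single sampling is shown below; `DeviceCaps` and `EffectiveSampleCount` are hypothetical names for illustration and are not part of this change.

```cpp
#include <cstdint>

// Hypothetical stand-in for the capability flags tracked by Instance.
struct DeviceCaps {
    bool sample_rate_shading;
    bool create_renderpass2;
    bool depth_stencil_resolve;

    // Mirrors the conjunction used by IsMultiSampleSupported() in the diff.
    constexpr bool IsMultiSampleSupported() const {
        return sample_rate_shading && create_renderpass2 && depth_stencil_resolve;
    }
};

// Hypothetical helper: use the requested sample count only when every
// requirement is met, otherwise fall back to 1 (no MSAA).
constexpr std::uint8_t EffectiveSampleCount(const DeviceCaps& caps, std::uint8_t requested) {
    return caps.IsMultiSampleSupported() ? requested : std::uint8_t{1};
}
```

Whether the real code clamps the setting this way is not visible in this part of the diff; the sketch only shows why all three flags are needed before an MSAA render pass can be built.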

View file

@ -558,6 +558,7 @@ bool RasterizerVulkan::Draw(bool accelerate, bool is_indexed) {
pipeline_info.state.attachments.color = framebuffer->Format(SurfaceType::Color);
pipeline_info.state.attachments.depth = framebuffer->Format(SurfaceType::Depth);
pipeline_info.state.attachments.sample_count = framebuffer->Samples();
// Update scissor uniforms
const auto [scissor_x1, scissor_y2, scissor_x2, scissor_y1] = fb_helper.Scissor();
@ -777,6 +778,8 @@ bool RasterizerVulkan::AccelerateDisplay(const Pica::FramebufferConfig& config,
src_params.stride = pixel_stride;
src_params.is_tiled = false;
src_params.pixel_format = VideoCore::PixelFormatFromGPUPixelFormat(config.color_format);
src_params.sample_count =
Settings::GetAntiAliasingSampleCount(Settings::values.antialiasing.GetValue());
src_params.UpdateParams();
const auto [src_surface_id, src_rect] =

View file

@ -2,6 +2,8 @@
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
#include <limits>
#include <boost/container/static_vector.hpp>
#include "common/assert.h"
#include "video_core/rasterizer_cache/pixel_format.h"
#include "video_core/renderer_vulkan/vk_instance.h"
@ -37,7 +39,7 @@ void RenderManager::BeginRendering(const Framebuffer* framebuffer,
.framebuffer = framebuffer->Handle(),
.render_pass = framebuffer->RenderPass(),
.render_area = render_area,
.clear = {},
.clears = {},
.do_clear = false,
};
images = framebuffer->Images();
@ -58,8 +60,8 @@ void RenderManager::BeginRendering(const RenderPass& new_pass) {
.renderPass = info.render_pass,
.framebuffer = info.framebuffer,
.renderArea = info.render_area,
.clearValueCount = info.do_clear ? 1u : 0u,
.pClearValues = &info.clear,
.clearValueCount = info.do_clear ? 2u : 0u,
.pClearValues = info.clears.data(),
};
cmdbuf.beginRenderPass(renderpass_begin_info, vk::SubpassContents::eInline);
});
@ -77,7 +79,7 @@ void RenderManager::EndRendering() {
u32 num_barriers = 0;
vk::PipelineStageFlags pipeline_flags{};
vk::AccessFlags src_access_flags{};
std::array<vk::ImageMemoryBarrier, 2> barriers;
std::array<vk::ImageMemoryBarrier, 4> barriers;
for (u32 i = 0; i < images.size(); i++) {
if (!images[i]) {
continue;
@ -138,7 +140,8 @@ void RenderManager::EndRendering() {
}
vk::RenderPass RenderManager::GetRenderpass(VideoCore::PixelFormat color,
VideoCore::PixelFormat depth, bool is_clear) {
VideoCore::PixelFormat depth, bool is_clear,
u8 sample_count) {
std::scoped_lock lock{cache_mutex};
const u32 color_index =
@ -151,13 +154,23 @@ vk::RenderPass RenderManager::GetRenderpass(VideoCore::PixelFormat color,
ASSERT_MSG(color_index <= NumColorFormats && depth_index <= NumDepthFormats,
"Invalid color index {} and/or depth_index {}", color_index, depth_index);
vk::UniqueRenderPass& renderpass = cached_renderpasses[color_index][depth_index][is_clear];
ASSERT_MSG(sample_count && std::has_single_bit(sample_count) && sample_count <= MaxSamples,
"Invalid sample count {}", static_cast<u32>(sample_count));
const u32 samples_index = static_cast<u32>(std::bit_width(sample_count) - 1);
vk::UniqueRenderPass& renderpass =
cached_renderpasses[color_index][depth_index][samples_index][is_clear];
if (!renderpass) {
const vk::Format color_format = instance.GetTraits(color).native;
const vk::Format depth_format = instance.GetTraits(depth).native;
const vk::AttachmentLoadOp load_op =
is_clear ? vk::AttachmentLoadOp::eClear : vk::AttachmentLoadOp::eLoad;
renderpass = CreateRenderPass(color_format, depth_format, load_op);
renderpass = (sample_count > 1)
? CreateRenderPassMSAA(color_format, depth_format, load_op,
static_cast<vk::SampleCountFlagBits>(sample_count))
: CreateRenderPass(color_format, depth_format, load_op);
}
return *renderpass;
@ -165,27 +178,27 @@ vk::RenderPass RenderManager::GetRenderpass(VideoCore::PixelFormat color,
vk::UniqueRenderPass RenderManager::CreateRenderPass(vk::Format color, vk::Format depth,
vk::AttachmentLoadOp load_op) const {
u32 attachment_count = 0;
std::array<vk::AttachmentDescription, 2> attachments;
boost::container::static_vector<vk::AttachmentDescription, 2> attachments{};
bool use_color = false;
vk::AttachmentReference color_attachment_ref{};
bool use_depth = false;
vk::AttachmentReference depth_attachment_ref{};
if (color != vk::Format::eUndefined) {
attachments[attachment_count] = vk::AttachmentDescription{
attachments.emplace_back(vk::AttachmentDescription{
.format = color,
.samples = vk::SampleCountFlagBits::e1,
.loadOp = load_op,
.storeOp = vk::AttachmentStoreOp::eStore,
.stencilLoadOp = vk::AttachmentLoadOp::eDontCare,
.stencilStoreOp = vk::AttachmentStoreOp::eDontCare,
.initialLayout = vk::ImageLayout::eGeneral,
.finalLayout = vk::ImageLayout::eGeneral,
};
});
color_attachment_ref = vk::AttachmentReference{
.attachment = attachment_count++,
.attachment = static_cast<u32>(attachments.size() - 1),
.layout = vk::ImageLayout::eGeneral,
};
@ -193,18 +206,19 @@ vk::UniqueRenderPass RenderManager::CreateRenderPass(vk::Format color, vk::Forma
}
if (depth != vk::Format::eUndefined) {
attachments[attachment_count] = vk::AttachmentDescription{
attachments.emplace_back(vk::AttachmentDescription{
.format = depth,
.samples = vk::SampleCountFlagBits::e1,
.loadOp = load_op,
.storeOp = vk::AttachmentStoreOp::eStore,
.stencilLoadOp = load_op,
.stencilStoreOp = vk::AttachmentStoreOp::eStore,
.initialLayout = vk::ImageLayout::eGeneral,
.finalLayout = vk::ImageLayout::eGeneral,
};
});
depth_attachment_ref = vk::AttachmentReference{
.attachment = attachment_count++,
.attachment = static_cast<u32>(attachments.size() - 1),
.layout = vk::ImageLayout::eGeneral,
};
@ -217,12 +231,11 @@ vk::UniqueRenderPass RenderManager::CreateRenderPass(vk::Format color, vk::Forma
.pInputAttachments = nullptr,
.colorAttachmentCount = use_color ? 1u : 0u,
.pColorAttachments = &color_attachment_ref,
.pResolveAttachments = 0,
.pDepthStencilAttachment = use_depth ? &depth_attachment_ref : nullptr,
};
const vk::RenderPassCreateInfo renderpass_info = {
.attachmentCount = attachment_count,
.attachmentCount = static_cast<u32>(attachments.size()),
.pAttachments = attachments.data(),
.subpassCount = 1,
.pSubpasses = &subpass,
@ -233,4 +246,132 @@ vk::UniqueRenderPass RenderManager::CreateRenderPass(vk::Format color, vk::Forma
return instance.GetDevice().createRenderPassUnique(renderpass_info);
}
vk::UniqueRenderPass RenderManager::CreateRenderPassMSAA(
vk::Format color, vk::Format depth, vk::AttachmentLoadOp load_op,
vk::SampleCountFlagBits sample_count) const {
boost::container::static_vector<vk::AttachmentDescription2, 4> attachments{};
vk::AttachmentReference2 color_resolve_attachment = {.attachment = VK_ATTACHMENT_UNUSED};
vk::AttachmentReference2 depth_resolve_attachment = {.attachment = VK_ATTACHMENT_UNUSED};
bool use_color = false;
vk::AttachmentReference2 color_attachment_ref{};
bool use_depth = false;
vk::AttachmentReference2 depth_attachment_ref{};
if (color != vk::Format::eUndefined) {
attachments.emplace_back(vk::AttachmentDescription2{
.format = color,
.samples = vk::SampleCountFlagBits::e1,
.loadOp = load_op,
.storeOp = vk::AttachmentStoreOp::eStore,
.stencilLoadOp = vk::AttachmentLoadOp::eDontCare,
.stencilStoreOp = vk::AttachmentStoreOp::eDontCare,
.initialLayout = vk::ImageLayout::eGeneral,
.finalLayout = vk::ImageLayout::eGeneral,
});
color_attachment_ref = vk::AttachmentReference2{
.attachment = static_cast<u32>(attachments.size() - 1),
.layout = vk::ImageLayout::eGeneral,
.aspectMask = vk::ImageAspectFlagBits::eColor,
};
use_color = true;
}
if (depth != vk::Format::eUndefined) {
attachments.emplace_back(vk::AttachmentDescription2{
.format = depth,
.samples = vk::SampleCountFlagBits::e1,
.loadOp = load_op,
.storeOp = vk::AttachmentStoreOp::eStore,
.stencilLoadOp = load_op,
.stencilStoreOp = vk::AttachmentStoreOp::eStore,
.initialLayout = vk::ImageLayout::eGeneral,
.finalLayout = vk::ImageLayout::eGeneral,
});
depth_attachment_ref = vk::AttachmentReference2{
.attachment = static_cast<u32>(attachments.size() - 1),
.layout = vk::ImageLayout::eGeneral,
.aspectMask = vk::ImageAspectFlagBits::eDepth,
};
use_depth = true;
}
// In the case of MSAA, each attachment gets an additional MSAA attachment that becomes the
// main attachment, and the original attachment becomes its resolve target.
if (sample_count > vk::SampleCountFlagBits::e1) {
if (color != vk::Format::eUndefined) {
attachments.emplace_back(vk::AttachmentDescription2{
.format = color,
.samples = sample_count,
.loadOp = load_op,
.storeOp = vk::AttachmentStoreOp::eStore,
.stencilLoadOp = vk::AttachmentLoadOp::eDontCare,
.stencilStoreOp = vk::AttachmentStoreOp::eDontCare,
.initialLayout = vk::ImageLayout::eGeneral,
.finalLayout = vk::ImageLayout::eGeneral,
});
color_resolve_attachment = color_attachment_ref;
color_attachment_ref = vk::AttachmentReference2{
.attachment = static_cast<u32>(attachments.size() - 1),
.layout = vk::ImageLayout::eGeneral,
};
}
if (depth != vk::Format::eUndefined) {
attachments.emplace_back(vk::AttachmentDescription2{
.format = depth,
.samples = sample_count,
.loadOp = load_op,
.storeOp = vk::AttachmentStoreOp::eStore,
.stencilLoadOp = load_op,
.stencilStoreOp = vk::AttachmentStoreOp::eStore,
.initialLayout = vk::ImageLayout::eGeneral,
.finalLayout = vk::ImageLayout::eGeneral,
});
depth_resolve_attachment = depth_attachment_ref;
depth_attachment_ref = vk::AttachmentReference2{
.attachment = static_cast<u32>(attachments.size() - 1),
.layout = vk::ImageLayout::eGeneral,
};
}
}
const vk::StructureChain<vk::SubpassDescription2, vk::SubpassDescriptionDepthStencilResolve>
subpass = {
vk::SubpassDescription2{
.pipelineBindPoint = vk::PipelineBindPoint::eGraphics,
.inputAttachmentCount = 0,
.pInputAttachments = nullptr,
.colorAttachmentCount = use_color ? 1u : 0u,
.pColorAttachments = &color_attachment_ref,
.pResolveAttachments = &color_resolve_attachment,
.pDepthStencilAttachment = use_depth ? &depth_attachment_ref : nullptr,
},
vk::SubpassDescriptionDepthStencilResolve{
.depthResolveMode = vk::ResolveModeFlagBits::eSampleZero,
.stencilResolveMode = vk::ResolveModeFlagBits::eSampleZero,
.pDepthStencilResolveAttachment = &depth_resolve_attachment},
};
const vk::RenderPassCreateInfo2 renderpass_info = {
.attachmentCount = static_cast<u32>(attachments.size()),
.pAttachments = attachments.data(),
.subpassCount = 1,
.pSubpasses = &subpass.get(),
.dependencyCount = 0,
.pDependencies = nullptr,
};
return instance.GetDevice().createRenderPass2Unique(renderpass_info);
}
} // namespace Vulkan
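The attachment ordering built by `CreateRenderPassMSAA` can be summarised with a small index calculation: the single-sample attachments come first, and when MSAA is enabled each one gains a multisampled twin that takes over as the render target. The sketch below is an illustration with plain integers only; `AttachmentLayout` and `MakeLayout` are hypothetical names, and `Unused` is assumed to match `VK_ATTACHMENT_UNUSED` (`~0U`).

```cpp
#include <cstdint>

constexpr std::uint32_t Unused = 0xFFFFFFFFu; // assumed equal to VK_ATTACHMENT_UNUSED

struct AttachmentLayout {
    std::uint32_t count;         // total number of attachment descriptions
    std::uint32_t color;         // attachment the subpass renders color into
    std::uint32_t depth;         // attachment the subpass renders depth into
    std::uint32_t color_resolve; // single-sample resolve target, or Unused
    std::uint32_t depth_resolve; // single-sample resolve target, or Unused
};

constexpr AttachmentLayout MakeLayout(bool has_color, bool has_depth, bool msaa) {
    AttachmentLayout layout{0, Unused, Unused, Unused, Unused};
    // Single-sample attachments are appended first, color before depth.
    if (has_color) {
        layout.color = layout.count++;
    }
    if (has_depth) {
        layout.depth = layout.count++;
    }
    if (msaa) {
        // The original attachment becomes the resolve target and the newly
        // appended multisampled attachment becomes the render target.
        if (has_color) {
            layout.color_resolve = layout.color;
            layout.color = layout.count++;
        }
        if (has_depth) {
            layout.depth_resolve = layout.depth;
            layout.depth = layout.count++;
        }
    }
    return layout;
}
```

With color and depth both present and MSAA enabled this yields four attachments: indices 0 and 1 are the resolve targets and indices 2 and 3 are rendered to. Framebuffers created against such a render pass have to supply their image views in the same order.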

View file

@ -4,6 +4,7 @@
#pragma once
#include <bit>
#include <mutex>
#include "common/math_util.h"
@ -23,20 +24,22 @@ struct RenderPass {
vk::Framebuffer framebuffer;
vk::RenderPass render_pass;
vk::Rect2D render_area;
vk::ClearValue clear;
std::array<vk::ClearValue, 2> clears;
u32 do_clear;
bool operator==(const RenderPass& other) const noexcept {
return std::tie(framebuffer, render_pass, render_area, do_clear) ==
std::tie(other.framebuffer, other.render_pass, other.render_area,
other.do_clear) &&
std::memcmp(&clear, &other.clear, sizeof(vk::ClearValue)) == 0;
std::memcmp(&clears, &other.clears, sizeof(clears)) == 0;
}
};
class RenderManager {
static constexpr u32 NumColorFormats = static_cast<u32>(VideoCore::PixelFormat::NumColorFormat);
static constexpr u32 NumDepthFormats = static_cast<u32>(VideoCore::PixelFormat::NumDepthFormat);
static constexpr size_t MaxSamples = 8;
static_assert(std::has_single_bit(MaxSamples));
public:
explicit RenderManager(const Instance& instance, Scheduler& scheduler);
@ -53,20 +56,26 @@ public:
/// Returns the renderpass associated with the color-depth format pair and sample count
vk::RenderPass GetRenderpass(VideoCore::PixelFormat color, VideoCore::PixelFormat depth,
bool is_clear);
bool is_clear, u8 sample_count = 1);
private:
/// Creates a renderpass configured appropriately and stores it in cached_renderpasses
vk::UniqueRenderPass CreateRenderPass(vk::Format color, vk::Format depth,
vk::AttachmentLoadOp load_op) const;
/// Creates an MSAA renderpass with resolve attachments; the caller stores it in cached_renderpasses
vk::UniqueRenderPass CreateRenderPassMSAA(vk::Format color, vk::Format depth,
vk::AttachmentLoadOp load_op,
vk::SampleCountFlagBits sample_count) const;
private:
const Instance& instance;
Scheduler& scheduler;
vk::UniqueRenderPass cached_renderpasses[NumColorFormats + 1][NumDepthFormats + 1][2];
vk::UniqueRenderPass cached_renderpasses[NumColorFormats + 1][NumDepthFormats + 1]
[std::bit_width(MaxSamples)][2];
std::mutex cache_mutex;
std::array<vk::Image, 2> images;
std::array<vk::ImageAspectFlags, 2> aspects;
std::array<vk::Image, 4> images;
std::array<vk::ImageAspectFlags, 4> aspects;
bool shadow_rendering{};
RenderPass pass{};
u32 num_draws{};
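The extra `cached_renderpasses` dimension is indexed by the base-2 logarithm of the sample count, so counts 1, 2, 4 and 8 map to slots 0 through 3. The diff does this with `std::has_single_bit` and `std::bit_width` from `<bit>`; the sketch below reimplements the same mapping by hand so it also builds with pre-C++20 compilers. `IsValidSampleCount`, `SampleIndex` and `NumSampleSlots` are hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>

// Same upper bound as RenderManager::MaxSamples in the diff.
constexpr std::size_t MaxSamples = 8;

// A sample count is usable as a cache key when it is a non-zero power of two
// that does not exceed MaxSamples (hand-written std::has_single_bit check).
constexpr bool IsValidSampleCount(std::uint8_t sample_count) {
    return sample_count != 0 && (sample_count & (sample_count - 1)) == 0 &&
           sample_count <= MaxSamples;
}

// Maps 1 -> 0, 2 -> 1, 4 -> 2, 8 -> 3, i.e. std::bit_width(sample_count) - 1.
constexpr std::uint32_t SampleIndex(std::uint8_t sample_count) {
    std::uint32_t index = 0;
    while (sample_count > 1) {
        sample_count >>= 1;
        ++index;
    }
    return index;
}

// Number of slots along the sample-count axis, equal to std::bit_width(MaxSamples).
constexpr std::size_t NumSampleSlots = SampleIndex(static_cast<std::uint8_t>(MaxSamples)) + 1;
```

Validating before indexing matters because a count such as 3 or 16 would otherwise alias another slot or index past the end of the array.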

View file

@ -142,13 +142,13 @@ private:
static_assert(sizeof(GSConfigEntry) == 48);
struct PLConfigEntry {
static constexpr u8 EXPECTED_VERSION = 0;
static constexpr u8 EXPECTED_VERSION = 1;
u64 version; // Surprise tool that can help us later
StaticPipelineInfo pl_info;
};
static_assert(sizeof(PLConfigEntry) == 152);
static_assert(sizeof(PLConfigEntry) == 160);
class CacheFile;
class CacheEntry {
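The `PLConfigEntry` size assertion moves from 152 to 160 bytes and `EXPECTED_VERSION` is bumped alongside it. The diff does not show which member accounts for the extra 8 bytes, but a single added `u8` in an 8-byte-aligned struct is enough to produce that jump because of tail padding. The sketch below uses two hypothetical, much smaller structs (`OldEntry`, `NewEntry`) to illustrate the effect, plus a hypothetical `IsCompatible` check showing why the version bump is needed.

```cpp
#include <cstdint>

// Hypothetical on-disk entry before the change: 8 + 4 + 4 = 16 bytes.
struct alignas(8) OldEntry {
    std::uint64_t version;
    std::uint32_t color;
    std::uint32_t depth;
};

// Adding one byte pushes the size to 17, which is rounded up to the next
// multiple of the 8-byte alignment, i.e. 24 bytes.
struct alignas(8) NewEntry {
    std::uint64_t version;
    std::uint32_t color;
    std::uint32_t depth;
    std::uint8_t sample_count;
};

static_assert(sizeof(OldEntry) == 16, "old layout");
static_assert(sizeof(NewEntry) == 24, "new layout grows by one alignment unit");

// Entries written with the old layout must not be reinterpreted with the new
// one, so the expected version is bumped and older entries are rejected.
constexpr std::uint64_t ExpectedVersion = 1;

constexpr bool IsCompatible(std::uint64_t stored_version) {
    return stored_version == ExpectedVersion;
}
```

The size assertion acts as a reminder: any future field that changes the layout trips it, which prompts another version bump.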

View file

@ -147,6 +147,11 @@ void MakeInitBarriers(vk::ImageAspectFlags aspect, u32 num_images,
}
}
void MakeInitBarriers(vk::ImageAspectFlags aspect, std::span<const vk::Image> images,
std::span<vk::ImageMemoryBarrier> out_barriers) {
MakeInitBarriers(aspect, images.size(), images, out_barriers);
}
vk::ImageSubresourceRange MakeSubresourceRange(vk::ImageAspectFlags aspect, u32 level = 0,
u32 levels = 1, u32 layer = 0) {
return vk::ImageSubresourceRange{
@ -164,8 +169,8 @@ constexpr u64 DOWNLOAD_BUFFER_SIZE = 16_MiB;
} // Anonymous namespace
void Handle::Create(u32 width, u32 height, u32 levels, TextureType type, vk::Format format,
vk::ImageUsageFlags usage, vk::ImageCreateFlags flags,
vk::ImageAspectFlags aspect, bool need_format_list,
vk::SampleCountFlagBits samples, vk::ImageUsageFlags usage,
vk::ImageCreateFlags flags, vk::ImageAspectFlags aspect, bool need_format_list,
std::string_view debug_name) {
const bool is_cube_map = type == TextureType::CubeMap && instance.IsLayeredRenderingSupported();
if (!is_cube_map) {
@ -194,7 +199,7 @@ void Handle::Create(u32 width, u32 height, u32 levels, TextureType type, vk::For
.extent = {width, height, 1},
.mipLevels = levels,
.arrayLayers = layers,
.samples = vk::SampleCountFlagBits::e1,
.samples = samples,
.usage = usage,
};
@ -344,7 +349,10 @@ bool TextureRuntime::ClearTexture(Surface& surface, const VideoCore::TextureClea
.src_image = surface.Image(),
};
if (clear.texture_rect == surface.GetScaledRect()) {
// MSAA images should always use a render-pass to clear both the MSAA texture and the regular
// texture at the same time
if (clear.texture_rect == surface.GetScaledRect() && (surface.GetSampleCount() == 1)) {
scheduler.Record([params, clear](vk::CommandBuffer cmdbuf) {
const vk::ImageSubresourceRange range = {
.aspectMask = params.aspect,
@ -407,7 +415,8 @@ void TextureRuntime::ClearTextureWithRenderpass(Surface& surface,
const auto color_format = is_color ? surface.pixel_format : PixelFormat::Invalid;
const auto depth_format = is_color ? PixelFormat::Invalid : surface.pixel_format;
const auto render_pass = renderpass_cache.GetRenderpass(color_format, depth_format, true);
const auto render_pass =
renderpass_cache.GetRenderpass(color_format, depth_format, true, surface.GetSampleCount());
const RecordParams params = {
.aspect = surface.Aspect(),
@ -416,8 +425,11 @@ void TextureRuntime::ClearTextureWithRenderpass(Surface& surface,
.src_image = surface.Image(),
};
scheduler.Record([params, is_color, clear, render_pass,
framebuffer = surface.Framebuffer()](vk::CommandBuffer cmdbuf) {
// Ensure we get the MSAA framebuffer if we are clearing an MSAA texture
const vk::Framebuffer framebuffer =
surface.Framebuffer((surface.GetSampleCount() > 1) ? Type::MultiSampled : Type::Current);
scheduler.Record([params, is_color, clear, render_pass, framebuffer](vk::CommandBuffer cmdbuf) {
const vk::AccessFlags access_flag =
is_color ? vk::AccessFlagBits::eColorAttachmentRead |
vk::AccessFlagBits::eColorAttachmentWrite
@ -462,13 +474,14 @@ void TextureRuntime::ClearTextureWithRenderpass(Surface& surface,
};
const auto clear_value = MakeClearValue(clear.value);
std::array<vk::ClearValue, 2> clear_values = {clear_value, clear_value};
const vk::RenderPassBeginInfo renderpass_begin_info = {
.renderPass = render_pass,
.framebuffer = framebuffer,
.renderArea = render_area,
.clearValueCount = 1,
.pClearValues = &clear_value,
.clearValueCount = static_cast<u32>(clear_values.size()),
.pClearValues = clear_values.data(),
};
cmdbuf.pipelineBarrier(params.pipeline_flags, pipeline_flags,
@ -594,6 +607,15 @@ bool TextureRuntime::BlitTextures(Surface& source, Surface& dest,
renderpass_cache.EndRendering();
// Must resolve images first
// Todo(wunk): Add a "dirty" flag for msaa resolves to avoid redundant image resolves
if (source.sample_count > 1) {
blit_helper.ResolveTexture(source);
}
if (dest.sample_count > 1) {
blit_helper.ResolveTexture(dest);
}
const RecordParams params = {
.aspect = source.Aspect(),
.filter = MakeFilter(source.pixel_format),
@ -731,8 +753,8 @@ Surface::Surface(TextureRuntime& runtime_, const VideoCore::SurfaceParams& param
const VideoCore::SurfaceFlagBits& initial_flag_bits)
: SurfaceBase{params, initial_flag_bits}, runtime{runtime_}, instance{runtime_.GetInstance()},
scheduler{runtime_.GetScheduler()}, traits{instance.GetTraits(pixel_format)},
handles{Handle(instance), Handle(instance), Handle(instance), Handle(instance)} {
handles{Handle(instance), Handle(instance), Handle(instance), Handle(instance),
Handle(instance)} {
if (pixel_format == VideoCore::PixelFormat::Invalid || !traits.transfer_support) {
return;
}
@ -753,8 +775,7 @@ Surface::Surface(TextureRuntime& runtime_, const VideoCore::SurfaceParams& param
ASSERT_MSG(format != vk::Format::eUndefined && levels >= 1,
"Image allocation parameters are invalid");
u32 num_images{};
std::array<vk::Image, 2> raw_images;
boost::container::static_vector<vk::Image, 3> raw_images;
vk::ImageCreateFlags flags{};
if (texture_type == VideoCore::TextureType::CubeMap) {
@ -773,26 +794,38 @@ Surface::Surface(TextureRuntime& runtime_, const VideoCore::SurfaceParams& param
}
const bool need_format_list = is_mutable && instance.IsImageFormatListSupported();
handles[Type::Base].Create(width, height, levels, texture_type, format, usage, flags,
traits.aspect, need_format_list, DebugName(false));
raw_images[num_images++] = handles[Type::Base].image;
handles[Type::Base].Create(width, height, levels, texture_type, format,
vk::SampleCountFlagBits::e1, usage, flags, traits.aspect,
need_format_list, DebugName(false));
raw_images.emplace_back(handles[Type::Base].image);
// Upscaled image
if (res_scale != 1) {
handles[Type::Scaled].Create(GetScaledWidth(), GetScaledHeight(), levels, texture_type,
format, usage, flags, traits.aspect, need_format_list,
DebugName(true));
raw_images[num_images++] = handles[Type::Scaled].image;
format, vk::SampleCountFlagBits::e1, usage, flags,
traits.aspect, need_format_list, DebugName(true));
raw_images.emplace_back(handles[Type::Scaled].image);
}
// Upscaled+MSAA image
if (sample_count > 1) {
handles[Type::MultiSampled].Create(
GetScaledWidth(), GetScaledHeight(), levels, texture_type, format,
vk::SampleCountFlagBits(sample_count), traits.usage, flags, traits.aspect,
need_format_list, DebugName(true, false, sample_count));
raw_images.emplace_back(handles[Type::MultiSampled].image);
}
current = res_scale != 1 ? Type::Scaled : Type::Base;
runtime.renderpass_cache.EndRendering();
scheduler.Record([raw_images, num_images, aspect = traits.aspect](vk::CommandBuffer cmdbuf) {
std::array<vk::ImageMemoryBarrier, 3> barriers;
MakeInitBarriers(aspect, num_images, raw_images, barriers);
cmdbuf.pipelineBarrier(
vk::PipelineStageFlagBits::eTopOfPipe, vk::PipelineStageFlagBits::eTopOfPipe,
vk::DependencyFlagBits::eByRegion, 0, nullptr, 0, nullptr, num_images, barriers.data());
scheduler.Record([raw_images, aspect = traits.aspect](vk::CommandBuffer cmdbuf) {
std::array<vk::ImageMemoryBarrier, 4> barriers;
MakeInitBarriers(aspect, raw_images, barriers);
cmdbuf.pipelineBarrier(vk::PipelineStageFlagBits::eTopOfPipe,
vk::PipelineStageFlagBits::eTopOfPipe,
vk::DependencyFlagBits::eByRegion, 0, nullptr, 0, nullptr,
raw_images.size(), barriers.data());
});
}
@ -800,7 +833,8 @@ Surface::Surface(TextureRuntime& runtime_, const VideoCore::SurfaceBase& surface
const VideoCore::Material* mat)
: SurfaceBase{surface}, runtime{runtime_}, instance{runtime_.GetInstance()},
scheduler{runtime_.GetScheduler()}, traits{instance.GetTraits(mat->format)},
handles{Handle(instance), Handle(instance), Handle(instance), Handle(instance)} {
handles{Handle(instance), Handle(instance), Handle(instance), Handle(instance),
Handle(instance)} {
if (!traits.transfer_support) {
return;
}
@ -808,8 +842,7 @@ Surface::Surface(TextureRuntime& runtime_, const VideoCore::SurfaceBase& surface
const bool has_normal = mat && mat->Map(MapType::Normal);
const vk::Format format = traits.native;
u32 num_images{};
std::array<vk::Image, 3> raw_images;
boost::container::static_vector<vk::Image, 4> raw_images;
vk::ImageCreateFlags flags{};
if (texture_type == VideoCore::TextureType::CubeMap) {
@ -817,31 +850,41 @@ Surface::Surface(TextureRuntime& runtime_, const VideoCore::SurfaceBase& surface
}
const std::string debug_name = DebugName(false, true);
handles[Type::Base].Create(mat->width, mat->height, levels, texture_type, format, traits.usage,
flags, traits.aspect, false, debug_name);
raw_images[num_images++] = handles[Type::Base].image;
handles[Type::Base].Create(mat->width, mat->height, levels, texture_type, format,
vk::SampleCountFlagBits::e1, traits.usage, flags, traits.aspect,
false, debug_name);
raw_images.emplace_back(handles[Type::Base].image);
if (res_scale != 1) {
handles[Type::Scaled].Create(mat->width, mat->height, levels, texture_type,
vk::Format::eR8G8B8A8Unorm, traits.usage, flags, traits.aspect,
false, debug_name);
raw_images[num_images++] = handles[Type::Scaled].image;
vk::Format::eR8G8B8A8Unorm, vk::SampleCountFlagBits::e1,
traits.usage, flags, traits.aspect, false, debug_name);
raw_images.emplace_back(handles[Type::Scaled].image);
}
if (sample_count > 1) {
handles[Type::MultiSampled].Create(
GetScaledWidth(), GetScaledHeight(), levels, texture_type, format,
vk::SampleCountFlagBits(sample_count), traits.usage, flags, traits.aspect, false,
DebugName(res_scale != 1, true, sample_count));
raw_images.emplace_back(handles[Type::MultiSampled].image);
}
if (has_normal) {
handles[Type::Custom].Create(mat->width, mat->height, levels, texture_type, format,
traits.usage, flags, traits.aspect, false, debug_name);
raw_images[num_images++] = handles[Type::Custom].image;
vk::SampleCountFlagBits::e1, traits.usage, flags,
traits.aspect, false, debug_name);
raw_images.emplace_back(handles[Type::Custom].image);
}
current = res_scale != 1 ? Type::Scaled : Type::Base;
runtime.renderpass_cache.EndRendering();
scheduler.Record([raw_images, num_images, aspect = traits.aspect](vk::CommandBuffer cmdbuf) {
std::array<vk::ImageMemoryBarrier, 3> barriers;
MakeInitBarriers(aspect, num_images, raw_images, barriers);
cmdbuf.pipelineBarrier(
vk::PipelineStageFlagBits::eTopOfPipe, vk::PipelineStageFlagBits::eTopOfPipe,
vk::DependencyFlagBits::eByRegion, 0, nullptr, 0, nullptr, num_images, barriers.data());
scheduler.Record([raw_images, aspect = traits.aspect](vk::CommandBuffer cmdbuf) {
std::array<vk::ImageMemoryBarrier, 4> barriers;
MakeInitBarriers(aspect, raw_images, barriers);
cmdbuf.pipelineBarrier(vk::PipelineStageFlagBits::eTopOfPipe,
vk::PipelineStageFlagBits::eTopOfPipe,
vk::DependencyFlagBits::eByRegion, 0, nullptr, 0, nullptr,
raw_images.size(), barriers.data());
});
custom_format = mat->format;
@ -1096,13 +1139,7 @@ void Surface::Download(const VideoCore::BufferTextureCopy& download,
});
}
void Surface::ScaleUp(u32 new_scale) {
if (res_scale == new_scale || new_scale == 1) {
return;
}
res_scale = new_scale;
void Surface::ScaleUp(u32 new_scale, u8 new_sample_count) {
const bool is_mutable = pixel_format == VideoCore::PixelFormat::RGBA8;
vk::ImageCreateFlags flags{};
@ -1113,9 +1150,13 @@ void Surface::ScaleUp(u32 new_scale) {
flags |= vk::ImageCreateFlagBits::eMutableFormat;
}
const bool res_scale_modified = res_scale != new_scale;
if (res_scale_modified && new_scale > 1) {
res_scale = new_scale;
handles[Type::Scaled].Create(GetScaledWidth(), GetScaledHeight(), levels, texture_type,
traits.native, traits.usage, flags, traits.aspect, false,
DebugName(true));
traits.native, vk::SampleCountFlagBits::e1, traits.usage,
flags, traits.aspect, false, DebugName(true));
current = Type::Scaled;
runtime.renderpass_cache.EndRendering();
@ -1137,6 +1178,30 @@ void Surface::ScaleUp(u32 new_scale) {
};
BlitScale(blit, true);
}
}
if ((res_scale_modified || sample_count != new_sample_count) && new_sample_count > 1) {
sample_count = new_sample_count;
handles[Type::MultiSampled].Create(
GetScaledWidth(), GetScaledHeight(), levels, texture_type, traits.native,
vk::SampleCountFlagBits(sample_count), traits.usage, flags, traits.aspect, false,
DebugName(res_scale != 1, false, sample_count));
// The multi-sampled image is just a transient image that is almost always immediately
// resolved into the current image, and should not be representative of the entire surface!
//
// current = Type::MultiSampled;
runtime.renderpass_cache.EndRendering();
scheduler.Record([raw_images = std::array{Image(Type::MultiSampled)},
aspect = traits.aspect](vk::CommandBuffer cmdbuf) {
std::array<vk::ImageMemoryBarrier, 1> barriers;
MakeInitBarriers(aspect, 1, raw_images, barriers);
cmdbuf.pipelineBarrier(vk::PipelineStageFlagBits::eTopOfPipe,
vk::PipelineStageFlagBits::eTopOfPipe,
vk::DependencyFlagBits::eByRegion, {}, {}, barriers);
});
}
}
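The reworked `ScaleUp` makes two independent decisions: whether to recreate the upscaled image and whether to recreate the multisampled image. A minimal sketch of just that decision logic, with no Vulkan involved, is given below; `ScaleUpPlan` and `PlanScaleUp` are hypothetical names used only for illustration.

```cpp
#include <cstdint>

struct ScaleUpPlan {
    bool recreate_scaled; // rebuild the Type::Scaled image
    bool recreate_msaa;   // rebuild the Type::MultiSampled image
};

constexpr ScaleUpPlan PlanScaleUp(std::uint32_t res_scale, std::uint8_t sample_count,
                                  std::uint32_t new_scale, std::uint8_t new_sample_count) {
    // Captured before any state is updated, as in the diff.
    const bool res_scale_modified = res_scale != new_scale;
    return ScaleUpPlan{
        // The scaled image is only needed when the new scale is above native.
        res_scale_modified && new_scale > 1,
        // The MSAA image matches the scaled dimensions, so it is rebuilt when
        // either the scale or the sample count changes, and only if MSAA is on.
        (res_scale_modified || sample_count != new_sample_count) && new_sample_count > 1,
    };
}
```

For example, going from 1x with no MSAA to 2x with 4 samples rebuilds both images, while calling it again with the same values rebuilds neither.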
u32 Surface::GetInternalBytesPerPixel() const {
@ -1185,7 +1250,7 @@ vk::ImageView Surface::CopyImageView() noexcept {
flags |= vk::ImageCreateFlagBits::eCubeCompatible;
}
copy_handle.Create(GetScaledWidth(), GetScaledHeight(), levels, texture_type, traits.native,
traits.usage, flags, traits.aspect, false);
vk::SampleCountFlagBits::e1, traits.usage, flags, traits.aspect, false);
copy_layout = vk::ImageLayout::eUndefined;
}
@ -1317,7 +1382,8 @@ vk::ImageView Surface::ImageView(ViewType view_type, Type type) noexcept {
}
vk::Framebuffer Surface::Framebuffer(Type type) noexcept {
auto& handle = handles[type == Type::Current ? current : type];
type = (type == Type::Current) ? current : type;
auto& handle = handles[type];
if (handle.framebuffer) {
return handle.framebuffer;
}
@ -1327,11 +1393,20 @@ vk::Framebuffer Surface::Framebuffer(Type type) noexcept {
const auto color_format = is_depth ? PixelFormat::Invalid : pixel_format;
const auto depth_format = is_depth ? pixel_format : PixelFormat::Invalid;
const auto image_view = ImageView(ViewType::Mip0, type);
boost::container::small_vector<vk::ImageView, 2> image_views;
if (sample_count > 1) {
// Main surface + MSAA surface
image_views.emplace_back(ImageView(ViewType::Mip0, current));
image_views.emplace_back(ImageView(ViewType::Mip0, Type::MultiSampled));
} else {
image_views.emplace_back(ImageView(ViewType::Mip0, type));
}
const vk::FramebufferCreateInfo framebuffer_info = {
.renderPass = runtime.renderpass_cache.GetRenderpass(color_format, depth_format, false),
.attachmentCount = 1u,
.pAttachments = &image_view,
.renderPass =
runtime.renderpass_cache.GetRenderpass(color_format, depth_format, false, sample_count),
.attachmentCount = static_cast<u32>(image_views.size()),
.pAttachments = image_views.data(),
.width = handle.width,
.height = handle.height,
.layers = handle.layers,
@ -1448,7 +1523,8 @@ void Surface::BlitScale(const VideoCore::TextureBlit& blit, bool up_scale) {
Framebuffer::Framebuffer(TextureRuntime& runtime, const VideoCore::FramebufferParams& params,
Surface* color, Surface* depth)
: VideoCore::FramebufferParams{params}, instance{runtime.GetInstance()},
res_scale{color ? color->res_scale : (depth ? depth->res_scale : 1u)} {
res_scale{color ? color->res_scale : (depth ? depth->res_scale : 1u)},
sample_count{color ? color->sample_count : (depth ? depth->sample_count : u8(1u))} {
auto& renderpass_cache = runtime.GetRenderpassCache();
if (shadow_rendering && !color) {
return;
@ -1457,7 +1533,7 @@ Framebuffer::Framebuffer(TextureRuntime& runtime, const VideoCore::FramebufferPa
width = height = std::numeric_limits<u32>::max();
u32 num_attachments{};
std::array<vk::ImageView, 2> attachments;
std::array<vk::ImageView, 4> attachments;
const auto prepare = [&](u32 index, Surface* surface) {
const auto extent = surface->RealExtent();
@ -1473,8 +1549,8 @@ Framebuffer::Framebuffer(TextureRuntime& runtime, const VideoCore::FramebufferPa
const auto extent = color->RealExtent();
width = extent.width;
height = extent.height;
render_pass =
renderpass_cache.GetRenderpass(PixelFormat::Invalid, PixelFormat::Invalid, false);
render_pass = renderpass_cache.GetRenderpass(PixelFormat::Invalid, PixelFormat::Invalid,
false, sample_count);
images[0] = color->Image();
image_views[0] = color->StorageView();
aspects[0] = vk::ImageAspectFlagBits::eColor;
@ -1487,7 +1563,23 @@ Framebuffer::Framebuffer(TextureRuntime& runtime, const VideoCore::FramebufferPa
prepare(1, depth);
attachments[num_attachments++] = image_views[1];
}
render_pass = renderpass_cache.GetRenderpass(formats[0], formats[1], false);
// MSAA attachments
if (sample_count > 1) {
if (color) {
images[2] = color->Image(Type::MultiSampled);
image_views[2] = color->ImageView(ViewType::Mip0, Type::MultiSampled);
aspects[2] = color->Aspect();
attachments[num_attachments++] = image_views[2];
}
if (depth) {
images[3] = depth->Image(Type::MultiSampled);
image_views[3] = depth->ImageView(ViewType::Mip0, Type::MultiSampled);
aspects[3] = depth->Aspect();
attachments[num_attachments++] = image_views[3];
}
}
render_pass = renderpass_cache.GetRenderpass(formats[0], formats[1], false, sample_count);
}
const vk::FramebufferCreateInfo framebuffer_info = {

View file

@ -29,6 +29,7 @@ enum Type {
Current = -1,
Base = 0,
Scaled,
MultiSampled,
Custom,
Copy,
Num,
@ -79,8 +80,9 @@ struct Handle {
}
void Create(u32 width, u32 height, u32 levels, VideoCore::TextureType type, vk::Format format,
vk::ImageUsageFlags usage, vk::ImageCreateFlags flags, vk::ImageAspectFlags aspect,
bool need_format_list, std::string_view debug_name = {});
vk::SampleCountFlagBits samples, vk::ImageUsageFlags usage,
vk::ImageCreateFlags flags, vk::ImageAspectFlags aspect, bool need_format_list,
std::string_view debug_name = {});
void Destroy();
@ -252,8 +254,8 @@ public:
void Download(const VideoCore::BufferTextureCopy& download,
const VideoCore::StagingData& staging);
/// Scales up the surface to match the new resolution scale.
void ScaleUp(u32 new_scale);
/// Scales up the surface to match the new resolution scale and sample-count.
void ScaleUp(u32 new_scale, u8 new_sample_count);
/// Returns the bpp of the internal surface format
u32 GetInternalBytesPerPixel() const;
@ -306,7 +308,8 @@ public:
formats(std::exchange(
other.formats, {VideoCore::PixelFormat::Invalid, VideoCore::PixelFormat::Invalid})),
width(std::exchange(other.width, 0)), height(std::exchange(other.height, 0)),
res_scale(std::exchange(other.res_scale, 1)) {}
res_scale(std::exchange(other.res_scale, 1)),
sample_count(std::exchange(other.sample_count, 1)) {}
Framebuffer& operator=(Framebuffer&& other) noexcept {
VideoCore::FramebufferParams::operator=(std::move(other));
@ -321,6 +324,7 @@ public:
width = std::exchange(other.width, 0);
height = std::exchange(other.height, 0);
res_scale = std::exchange(other.res_scale, 1);
sample_count = std::exchange(other.sample_count, 1);
return *this;
}
@ -337,11 +341,11 @@ public:
return framebuffer;
}
[[nodiscard]] std::array<vk::Image, 2> Images() const noexcept {
[[nodiscard]] std::array<vk::Image, 4> Images() const noexcept {
return images;
}
[[nodiscard]] std::array<vk::ImageAspectFlags, 2> Aspects() const noexcept {
[[nodiscard]] std::array<vk::ImageAspectFlags, 4> Aspects() const noexcept {
return aspects;
}
@ -353,19 +357,26 @@ public:
return res_scale;
}
u8 Samples() const noexcept {
return sample_count;
}
private:
const Instance& instance;
std::array<vk::Image, 2> images{};
std::array<vk::ImageView, 2> image_views{};
// Color, Depth, ColorMSAA, DepthMSAA
std::array<vk::Image, 4> images{};
std::array<vk::ImageView, 4> image_views{};
vk::Framebuffer framebuffer{};
vk::RenderPass render_pass{};
std::vector<vk::UniqueImageView> framebuffer_views;
std::array<vk::ImageAspectFlags, 2> aspects{};
std::array<VideoCore::PixelFormat, 2> formats{VideoCore::PixelFormat::Invalid,
VideoCore::PixelFormat::Invalid};
std::array<vk::ImageAspectFlags, 4> aspects{};
std::array<VideoCore::PixelFormat, 4> formats{
VideoCore::PixelFormat::Invalid, VideoCore::PixelFormat::Invalid,
VideoCore::PixelFormat::Invalid, VideoCore::PixelFormat::Invalid};
u32 width{};
u32 height{};
u32 res_scale{1};
u8 sample_count{1};
};
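`Framebuffer` now keeps four fixed slots (Color, Depth, ColorMSAA, DepthMSAA), but the attachment list passed at framebuffer creation is compacted, so a depth-only MSAA framebuffer fills slots 1 and 3 yet supplies just two attachments. The sketch below illustrates that compaction with integer handles; `Slot`, `Handle`, `Packed` and `PackAttachments` are hypothetical stand-ins and not types from the codebase.

```cpp
#include <cstddef>
#include <cstdint>

// Fixed slot order, following the "Color, Depth, ColorMSAA, DepthMSAA" comment.
enum Slot : std::size_t { Color = 0, Depth = 1, ColorMSAA = 2, DepthMSAA = 3, NumSlots = 4 };

using Handle = std::uint32_t; // hypothetical stand-in for vk::ImageView
constexpr Handle NullHandle = 0;

struct Packed {
    Handle views[NumSlots];
    std::uint32_t count;
};

// Copies the non-null slots, in slot order, into a contiguous list.
constexpr Packed PackAttachments(const Handle (&slots)[NumSlots]) {
    Packed packed{{NullHandle, NullHandle, NullHandle, NullHandle}, 0};
    for (std::size_t i = 0; i < NumSlots; ++i) {
        if (slots[i] != NullHandle) {
            packed.views[packed.count++] = slots[i];
        }
    }
    return packed;
}

// Example inputs: a depth-only MSAA framebuffer and a full color+depth one.
constexpr Handle DepthOnlyMsaa[NumSlots] = {NullHandle, 11, NullHandle, 13};
constexpr Handle ColorDepthMsaa[NumSlots] = {10, 11, 12, 13};
```

Code that walks the fixed slots, such as the barrier loop in `EndRendering`, has to skip the empty ones, which is why the null check on each image remains necessary after widening the arrays to four entries.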
class Sampler {