One of the developers of the RPCS3 ‘PlayStation 3’ emulator has published a detailed blog post highlighting the advantages of AVX-512 on modern CPUs. The post explains how AVX-512 works and how it benefits users who want extra performance from the emulator.
RPCS3 ‘PlayStation 3 Emulator’ Dev Highlights The Performance Advantage of AVX-512 Enabled CPUs
The blog post was written by Whatcookie, one of the many developers on the RPCS3 emulator project, and compares AVX-512 against standard AVX2 instructions. The full post is available on Whatcookie's GitHub blog; in short, the main advantages of AVX-512 are:
- Larger register file
- New forms of old instructions
- Mask registers
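To give a rough idea of what mask registers provide, the sketch below models a masked vector add in plain Python. It is only an illustration of per-lane masking, not real SIMD code and not taken from RPCS3: the function name `masked_add` and its parameters are hypothetical. Each bit of the mask decides whether a lane receives the new result, keeps its old value (merge-masking) or is set to zero (zero-masking).

```python
def masked_add(src, a, b, mask, zeroing=False):
    """Model of a masked lane-wise add (hypothetical helper, for illustration).

    Lane i gets a[i] + b[i] when bit i of `mask` is set. Otherwise it keeps
    src[i] (merge-masking) or becomes 0 (zero-masking).
    """
    out = []
    for lane, (old, x, y) in enumerate(zip(src, a, b)):
        if (mask >> lane) & 1:
            out.append(x + y)
        else:
            out.append(0 if zeroing else old)
    return out


src = [9, 9, 9, 9]
a = [1, 2, 3, 4]
b = [10, 20, 30, 40]

print(masked_add(src, a, b, 0b0101))                # lanes 0 and 2 updated
print(masked_add(src, a, b, 0b0101, zeroing=True))  # other lanes zeroed
```

On real hardware the same idea is applied to a whole vector in a single instruction, which avoids separate compare-and-blend steps.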
The developer shows how each of these improves performance in RPCS3, the go-to choice among PS3 emulators. Testing was done on an Intel Core i9-12900K CPU running at 5.2 GHz with AVX-512 enabled. Using only the standard SSE2 instructions, the game delivered just 5 FPS, while moving to SSE4.1 raised that to 160 FPS. The blog attributes this huge gap to the SSE2 target lacking SSSE3 instructions, which are essential for the PlayStation 3 emulator.
Moving to AVX2/FMA gives a further boost of roughly 19% (190 FPS), and switching from AVX2 to the Icelake-tier AVX-512 target gives roughly another 23%, for an average of 235 FPS.
The SSE4.1 target achieves an average of 160 FPS, while the AVX2/FMA target achieves an average of 190 FPS. This is a 18% improvement over the SSE4.1 target. AVX2 doesn’t include many new instructions over SSE4.1, but it does include a new 3 operand form for instructions, which eliminates many register to register mov instructions. Crucially, all CPUs that support AVX2 also support FMA instructions. FMA instructions aren’t just faster than a chain of multiply + add instructions, but can also produce different results due to not rounding to single precision between the multiply and the add. Accurately emulating this without FMA instructions adds some overhead, and so native FMA operations help out quite a bit.
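The rounding difference described above can be reproduced with a small Python model. This is a sketch under stated assumptions rather than how RPCS3 does it: the helpers `f32`, `mul_then_add` and `fused_mul_add` are hypothetical names, single precision is modelled by round-tripping through `struct`, and the "fused" version relies on double-precision arithmetic being exact for these particular inputs, so it is not a general FMA emulation.

```python
import struct


def f32(x):
    """Round a Python float (a double) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def mul_then_add(a, b, c):
    """Separate multiply and add: the product is rounded before the add."""
    product = f32(a * b)
    return f32(product + c)


def fused_mul_add(a, b, c):
    """Model of a fused multiply-add: a single rounding at the very end.

    Assumption: a * b + c is exact in double precision for the inputs used
    below, so only the final f32() rounds.
    """
    return f32(a * b + c)


a = f32(1.0 + 2.0 ** -13)
c = f32(-(1.0 + 2.0 ** -12))

print(mul_then_add(a, a, c))   # 0.0
print(fused_mul_add(a, a, c))  # 1.4901161193847656e-08, i.e. 2**-26
```

The exact product here is 1 + 2^-12 + 2^-26. Rounding it to single precision drops the 2^-26 term, so the separate multiply and add cancel to zero, while the fused version keeps that term.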
The Icelake tier AVX-512 target hits a ludicrous 235 FPS average, 23% faster than the AVX2/FMA target. The sheer number of new instructions added in AVX-512 is so large that quite a number of them end up being useful for RPCS3. Unlike AVX2 which was mostly a straightforward extension of existing SSE instructions to 256 bits, AVX-512 includes a huge number of new features which are very useful for SIMD programming, even at lower bit widths. However, since Intel chose to market AVX-512 with the -512 moniker, people who aren’t familiar with the instruction set usually fixate on the 512-bit vector aspect of the instruction set.
via Whatcookie Github Blog
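The percentages in the quoted passage can be checked against the average frame rates it gives. The snippet below is just that arithmetic; the function name `percent_gain` and the dictionary labels are illustrative choices, and the FPS values are the averages quoted above.

```python
def percent_gain(old_fps, new_fps):
    """Relative improvement of new_fps over old_fps, in percent."""
    return (new_fps / old_fps - 1.0) * 100.0


# Average frame rates as quoted from the blog post.
fps = {"SSE4.1": 160, "AVX2/FMA": 190, "AVX-512 (Icelake tier)": 235}

avx2_gain = percent_gain(fps["SSE4.1"], fps["AVX2/FMA"])
avx512_gain = percent_gain(fps["AVX2/FMA"], fps["AVX-512 (Icelake tier)"])
total_gain = percent_gain(fps["SSE4.1"], fps["AVX-512 (Icelake tier)"])

print(f"AVX2/FMA over SSE4.1: {avx2_gain:.2f}%")
print(f"AVX-512 over AVX2/FMA: {avx512_gain:.2f}%")
print(f"AVX-512 over SSE4.1: {total_gain:.2f}%")
```

The first step works out to 18.75%, which the blog quotes as 18%, and the second to about 23.7%, matching the quoted 23%.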
Although Intel has been removing AVX-512 support from its 12th Generation Alder Lake CPUs, these results show the instruction set's performance potential, and AMD’s recently announced Zen 4 ‘Ryzen 7000’ CPU lineup looks set to take advantage of it. The Zen 4 core architecture will support AVX-512, so if a Steam Deck successor or other handheld consoles adopt these next-gen CPUs, users could take advantage of the instruction set to emulate older games with very good performance.
This would be very beneficial for the RPCS3 ‘PlayStation 3’ emulator and may push the blue team to reconsider removing AVX-512 from its consumer chips.
News Source: RPCS3