Rsi


Author: Admin | 2025-04-28

At a time (64 bytes) as vmovdqa (Move aligned packed integer values), and thus the zeroing can be done in fewer instructions, a count that’s still considered reasonable.

Zeroing also shows up elsewhere, such as when initializing structs. Those have also previously employed SIMD instructions where relevant, e.g. this:

    // dotnet run -c Release -f net8.0 --filter "*" --runtimes net8.0 net9.0

    using BenchmarkDotNet.Attributes;
    using BenchmarkDotNet.Running;

    BenchmarkSwitcher.FromAssembly(typeof(Tests).Assembly).Run(args);

    [DisassemblyDiagnoser]
    [HideColumns("Job", "Error", "StdDev", "Median", "RatioSD")]
    public class Tests
    {
        [Benchmark]
        public MyStruct Init() => new();

        public struct MyStruct
        {
            public Int128 A, B, C, D;
        }
    }

produces this assembly today on .NET 8:

    ; Tests.Init()
           vzeroupper
           vxorps    ymm0,ymm0,ymm0
           vmovdqu32 [rsi],zmm0
           mov       rax,rsi
           ret
    ; Total bytes of code 17

But if we tweak MyStruct to add a field of a reference type anywhere in the struct (e.g. add public string Oops; as the first line of the struct above), it knocks the initialization off this optimized path, and we end up with initialization like this on .NET 8:

    ; Tests.Init()
           xor       eax,eax
           mov       [rsi],rax
           mov       [rsi+8],rax
           mov       [rsi+10],rax
           mov       [rsi+18],rax
           mov       [rsi+20],rax
           mov       [rsi+28],rax
           mov       [rsi+30],rax
           mov       [rsi+38],rax
           mov       [rsi+40],rax
           mov       [rsi+48],rax
           mov       rax,rsi
           ret
    ; Total bytes of code 45

This is due to alignment requirements that exist in order to provide the necessary atomicity guarantees for GC references. But rather than giving up wholesale, dotnet/runtime#102132 allows the SIMD zeroing to be used for the contiguous portions that don’t contain GC references, so now on .NET 9 we get this:

    ; Tests.Init()
           xor       eax,eax
           mov       [rsi],rax
           vxorps    xmm0,xmm0,xmm0
           vmovdqu32 [rsi+8],zmm0
           mov       [rsi+48],rax
           mov       rax,rsi
           ret
    ; Total bytes of code 27

This optimization isn’t specific to AVX512,
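As a rough way to check which of the two struct shapes discussed above a given type falls into, the following minimal sketch reports each struct's size and whether it contains GC references. MyStruct mirrors the benchmark; MyStructWithRef, Report, and the printed wording are hypothetical names introduced here for illustration. It does not show which instructions the JIT emits (that still requires the disassembly), only the property that decides the zeroing path.

```csharp
// Sketch only: MyStructWithRef, Report, and Program are illustrative names,
// not part of the original benchmark. Requires .NET 7 or later for Int128.
using System;
using System.Runtime.CompilerServices;

public struct MyStruct
{
    public Int128 A, B, C, D;   // no GC references: 4 x 16 bytes
}

public struct MyStructWithRef
{
    public string Oops;         // the GC reference that changes how zeroing is emitted
    public Int128 A, B, C, D;
}

public static class Program
{
    public static void Main()
    {
        Report<MyStruct>();
        Report<MyStructWithRef>();

        // Whatever instructions the JIT chooses, new() must yield all-zero fields.
        MyStructWithRef s = new();
        Console.WriteLine(s.Oops is null && s.A == Int128.Zero && s.D == Int128.Zero);
    }

    // Prints the runtime size of T and whether T holds any GC references.
    private static void Report<T>() where T : struct =>
        Console.WriteLine(
            $"{typeof(T).Name}: {Unsafe.SizeOf<T>()} bytes, " +
            $"contains GC refs: {RuntimeHelpers.IsReferenceOrContainsReferences<T>()}");
}
```

The size reported for the struct with the string field depends on the runtime's layout; in the x64 assembly shown above, the stores cover offsets 0x00 through 0x4F, which would correspond to 80 bytes.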
