For me, this result is also not too surprising:
-
If allowing / using Undefined Behavior (UB) would allow for systematically better optimizations, Rust programs would be systematically slower than C or C++ programs, since Rust does not allow UB. But this is not the case. Rather, sometimes Rust programs are faster, and sometimes C/CC++ programs. A good example is the Debian Benchmark Game Comparison.
-
Now, one could argue that in the cases where C/C++ programs turned out to be faster than Rust programs, that at least im these cases exploiting UB gave an advantage. But, if you examine these programs im the Debian benchmark game (or in other places), this is not the case either. They do not rely on clever compiler optimizations based on assumptioms around UB. Instead, they make heavy use of compiler and SIMD intrinsics, manual vectorization, inline assembly, manual loop unrolling, and in general micro-managing what the CPU does. In fact, these implementations are long and complex and not at all idiomatic C/C++ code.
-
Thirdly, one could say that while these benchmark examples are not idiomatic C code, one at times needs that specific capability to fall back to things like inline assembly, and that this is a characteristic capability of C snd C++.
Well, Rust supports inline assembly as well, though it is rarely used.