While we could use zigzag encoding, (i>>31) ^ (i<<1), to convert the SLEB128-encoded type/addend to ULEB128 instead, the generated code is inferior to or on par with SLEB128 for one-byte encodings on x86, AArch64, and RISC-V. I haven't tried wider values, but zigzag encoding is likely slower there as well.
// One-byte case for SLEB128: bit 6 is the sign bit, so 64..127 decode to -64..-1
int64_t from_signext(uint64_t v) {
  return v < 64 ? v : v - 128;
}
// One-byte case for ULEB128 with zig-zag encoding
int64_t from_zigzag(uint64_t z) {
  return (z >> 1) ^ -(z & 1);
}
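To make the comparison concrete, here is a minimal sketch of both one-byte schemes side by side. The names to_zigzag, from_zigzag64, and from_sleb128_byte are hypothetical helpers introduced for illustration, not taken from the original post, and the encoder uses the 64-bit form of the formula (i >> 63 rather than i >> 31). It assumes an arithmetic right shift for negative values, which holds on the compilers commonly used for x86, AArch64, and RISC-V.

```c
#include <stdint.h>

// Zigzag-encode a signed 64-bit value so that 0, -1, 1, -2, ... map to
// 0, 1, 2, 3, ... The left shift is done on the unsigned value to avoid
// signed overflow; i >> 63 is assumed to be an arithmetic shift.
uint64_t to_zigzag(int64_t i) {
  return ((uint64_t)i << 1) ^ (uint64_t)(i >> 63);
}

// Inverse of to_zigzag: the low bit selects whether to flip all bits.
int64_t from_zigzag64(uint64_t z) {
  return (int64_t)((z >> 1) ^ (0 - (z & 1)));
}

// Decode the 7-bit payload of a one-byte SLEB128 value (0..127):
// payloads 0..63 are non-negative, 64..127 represent -64..-1.
int64_t from_sleb128_byte(uint64_t v) {
  return v < 64 ? (int64_t)v : (int64_t)v - 128;
}
```

Both schemes cover the same range, -64 to 63, in a single byte, so the choice between them comes down to the decode cost rather than the encoded size.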
londons_explore 23 minutes ago [-]
This sort of analysis is great.
Now why can't compilers do this sort of thing automatically?
Almost any problem seems to be possible to speed up 1000x with AVX-512 plus days of thought, compared to the naive version written in a Python loop. If we could automate that whole process for big codebases, the performance gains could be huge.
cmovq 9 minutes ago [-]
Compilers can’t really, in a meaningful way, change the layout of your data in memory. And you do need to think about your memory layout to get any benefit from SIMD. You’ll notice that a lot of compiler auto-vectorization inserts many instructions just to shuffle data around into a usable layout, which negates much of the benefit.
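As a rough illustration of the layout point, the sketch below contrasts an array-of-structures with a structure-of-arrays for the same update. The types and function names (PointAoS, PointsSoA, scale_x_aos, scale_x_soa) are hypothetical and chosen only for this example. In the first loop the x fields are 12 bytes apart, so a vectorizing compiler may need strided loads or shuffles; in the second they are contiguous.

```c
#include <stddef.h>

// Array-of-structures: each point's x, y, z sit next to each other,
// so consecutive x values are separated by the y and z fields.
struct PointAoS { float x, y, z; };

void scale_x_aos(struct PointAoS *p, size_t n, float k) {
  for (size_t i = 0; i < n; i++)
    p[i].x *= k;  // strided access: one float used out of every three
}

// Structure-of-arrays: all x values are contiguous in memory.
struct PointsSoA { float *x, *y, *z; };

void scale_x_soa(struct PointsSoA *p, size_t n, float k) {
  for (size_t i = 0; i < n; i++)
    p->x[i] *= k;  // unit-stride access over a plain float array
}
```

Both functions compute the same result; the compiler cannot turn the first layout into the second on its own, because the struct definition is part of the program's observable data layout.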
diamondlovesyou 14 minutes ago [-]
> Now why can't compilers do this sort of thing automatically?
They do - they just can't assume GFNI instructions are present unless you explicitly say so: https://godbolt.org/z/eYasbKsse