The talks that Niles gave at the Utah Zig meetups (linked in the repo) were great; I just wish the AV setup had been a little smoother. It seemed like there were some really neat visualizations that Niles prepared that flopped. Either way, I recommend them. They inspired me to read a lot more machine code these days.
I guess they are too tailored to the actual memory layout and memory access latencies of the particular architecture, but I would be happy to be shown that I am wrong and that it is feasible.
neerajsi 4 hours ago [-]
Very interesting project!
I wonder if there's a way to make this set of techniques less brittle and more applicable to any language. I guess you're looking at a new backend or some enhancements to one of the parser generator tools.
adev_ 3 hours ago [-]
I have applied a subset of these techniques in a C++ tokenizer for a language syntactically similar to Swift: no inline assembly, no intrinsics, no SWAR, but reduced branching, cache optimization, and SIMD parsing with explicit vectorization.
I get:
- ~4 MLOC/sec/core on a laptop
- ~8-9 MLOC/sec/core on a modern AMD server-grade CPU with AVX512.
So yes, it is definitely possible.
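For a concrete flavor of the branch-reduction side of this, here is a minimal C++ sketch of a table-driven identifier scan: a 256-entry lookup table replaces a chain of per-character comparisons, and the tight loop is the kind of thing compilers can vectorize without intrinsics. The names (make_ident_table, kIsIdentChar, scan_identifier) and the toy input are my own illustration, not code from the tokenizer described above.

    #include <array>
    #include <cstddef>
    #include <cstdint>
    #include <cstdio>
    #include <string_view>

    // Build a 256-entry lookup table at compile time: 1 for bytes that can
    // appear in an identifier, 0 otherwise. One table load replaces a chain
    // of range comparisons and branches in the hot loop.
    constexpr std::array<std::uint8_t, 256> make_ident_table() {
        std::array<std::uint8_t, 256> t{};
        for (int c = 'a'; c <= 'z'; ++c) t[c] = 1;
        for (int c = 'A'; c <= 'Z'; ++c) t[c] = 1;
        for (int c = '0'; c <= '9'; ++c) t[c] = 1;
        t['_'] = 1;
        return t;
    }

    constexpr auto kIsIdentChar = make_ident_table();

    // Advance `pos` past identifier bytes. The loop has a single
    // data-dependent exit condition, which keeps branching low and lets the
    // compiler vectorize the scan with no intrinsics or inline assembly.
    std::size_t scan_identifier(std::string_view src, std::size_t pos) {
        while (pos < src.size() && kIsIdentChar[static_cast<std::uint8_t>(src[pos])]) {
            ++pos;
        }
        return pos;
    }

    int main() {
        std::string_view src = "total_count = 42";
        std::size_t end = scan_identifier(src, 0);  // consumes "total_count"
        std::printf("first token: %.*s\n", static_cast<int>(end), src.data());
    }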