Minor nit: in the familiarity section you gloss over the fact that it's character- rather than token-based, which might be worth a shout-out:
"Microgpt's larger cousins using building blocks called tokens representing one or more letters. That's hard to reason about, but essential for building sentences and conversations.
"So we'll just deal with spelling names using the English alphabet. That gives us 26 tokens, one for each letter."
mips_avatar 3 minutes ago
Using ascii characters is a simple form of tokenization with less compression
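For concreteness, here's a minimal sketch of that character-level scheme: encoding and decoding are just table lookups over a 26-entry vocab. The stoi/itos names follow the usual micro-GPT convention and are my guess, not necessarily what the post uses.

    # character-level "tokenization": the vocab is the 26 lowercase
    # letters, so encode and decode are plain dict lookups
    chars = "abcdefghijklmnopqrstuvwxyz"
    stoi = {ch: i for i, ch in enumerate(chars)}  # letter -> token id
    itos = {i: ch for i, ch in enumerate(chars)}  # token id -> letter

    def encode(name):
        return [stoi[ch] for ch in name]

    def decode(ids):
        return "".join(itos[i] for i in ids)

    print(encode("emma"))          # [4, 12, 12, 0]
    print(decode([4, 12, 12, 0]))  # emma

A BPE-style tokenizer would merge frequent character pairs into single ids, trading a bigger vocab for shorter sequences; that's the compression being referred to.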
b44 1 hour ago
hm. the way i see things, characters are the natural/obvious building blocks and tokenization is just an improvement on that. i do mention chatgpt et al. use tokens in the last q&a dropdown, though
msla 51 minutes ago
About how many training steps are required to get good output?
b44 43 minutes ago
not many. diminishing returns start before 1000 and past that you should just add a second/third layer
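To see that falloff concretely, here's a self-contained toy (not the post's actual microgpt): a single linear layer learning next-letter prediction on a handful of names, logging the loss every 100 steps so the shrinking per-checkpoint improvement is visible.

    # toy illustration of diminishing returns over training steps;
    # the model and data are placeholders, only the logging pattern matters
    import torch
    import torch.nn.functional as F

    torch.manual_seed(0)
    stoi = {ch: i for i, ch in enumerate("abcdefghijklmnopqrstuvwxyz")}
    names = ["emma", "olivia", "ava", "isabella", "sophia"]

    # (previous letter, next letter) pairs as token ids
    xs, ys = [], []
    for n in names:
        for a, b in zip(n, n[1:]):
            xs.append(stoi[a])
            ys.append(stoi[b])
    xs, ys = torch.tensor(xs), torch.tensor(ys)

    W = torch.randn(26, 26, requires_grad=True)  # one linear layer
    prev = None
    for step in range(1, 1501):
        logits = F.one_hot(xs, 26).float() @ W   # (N, 26) scores
        loss = F.cross_entropy(logits, ys)
        W.grad = None
        loss.backward()
        with torch.no_grad():
            W -= 0.5 * W.grad                    # plain gradient descent
        if step % 100 == 0:
            gain = "" if prev is None else f"  (gain {prev - loss.item():.4f})"
            print(f"step {step:4d}: loss {loss.item():.4f}{gain}")
            prev = loss.item()

The per-checkpoint gain shrinks toward zero well before step 1500; at that point added capacity (another layer) moves the loss more than added steps, which matches the advice above.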
"Microgpt's larger cousins using building blocks called tokens representing one or more letters. That's hard to reason about, but essential for building sentences and conversations.
"So we'll just deal with spelling names using the English alphabet. That gives us 26 tokens, one for each letter."