In practice
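Subword tokenization is a trade-off between huge vocabularies (one word = one token) and tiny ones (one character = one token), which stay small but turn every text into a very long sequence. It handles unseen words, typos, and many languages without the vocabulary blowing up in size, because anything not in the vocabulary can be split into smaller known pieces. Nearly every modern LLM uses some form of subword tokenization, such as BPE, WordPiece, or a unigram model.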
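To make the trade-off concrete, here is a minimal sketch of greedy longest-match splitting, a simplified version of how WordPiece-style tokenizers apply a vocabulary. The function name `tokenize` and the tiny `TOY_VOCAB` are made up for illustration; real tokenizers learn tens of thousands of pieces from data and handle bytes, whitespace, and special tokens, none of which is shown here.

```python
def tokenize(word, vocab):
    """Greedy longest-match split of `word` into pieces from `vocab`.

    Illustrative only: real subword tokenizers are more involved.
    """
    tokens = []
    i = 0
    while i < len(word):
        # Try the longest candidate first, shrinking until one matches.
        for j in range(len(word), i, -1):
            piece = word[i:j]
            if piece in vocab:
                tokens.append(piece)
                i = j
                break
        else:
            # Nothing matched: fall back to a single character.
            tokens.append(word[i])
            i += 1
    return tokens


# Hypothetical toy vocabulary, not taken from any real model.
TOY_VOCAB = {"token", "iz", "ation", "un", "seen", "s", "e", "n"}

print(tokenize("tokenization", TOY_VOCAB))  # ['token', 'iz', 'ation']
print(tokenize("unseen", TOY_VOCAB))        # ['un', 'seen']
print(tokenize("tokenizaton", TOY_VOCAB))   # ['token', 'iz', 'a', 't', 'o', 'n']
```

The last call shows the typo case: the misspelled word is not in the vocabulary, so it degrades into smaller pieces and single characters instead of becoming an unknown token.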