Crayon — Next-Gen Production Tokenizer
Bottom Line Up Front: Crayon is a high-performance, memory-aligned tokenizer that reaches a peak throughput of 2M+ tokens/second on CPU using a Double-Array Trie (DAT), cutting tokenization latency in production LLM pipelines.
- Peak Throughput: 2M+ tokens/second on CPU
- Vocabulary Capacity: ~500K target vocabulary size
- Compute Efficiency: <0.01¢ per million tokens (goal)
Built on information-theoretic foundations, Crayon uses a longest-match trie with perfect hashing, SIMD acceleration, and zero-copy processing.
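
To make the longest-match idea concrete, here is a minimal Python sketch of greedy longest-match tokenization over a trie. It is an illustration under stated assumptions, not Crayon's implementation: it uses a plain dict-based trie rather than a double-array layout, and it omits perfect hashing, SIMD, and zero-copy handling. The names `TrieNode`, `build_trie`, `tokenize`, and the toy vocabulary are all hypothetical.

```python
from typing import Dict, List, Optional


class TrieNode:
    """One node of a plain dict-based trie (not a double-array layout)."""
    __slots__ = ("children", "token_id")

    def __init__(self) -> None:
        self.children: Dict[str, "TrieNode"] = {}
        self.token_id: Optional[int] = None  # set when a vocab entry ends here


def build_trie(vocab: Dict[str, int]) -> TrieNode:
    """Insert every vocabulary string into the trie, one character per edge."""
    root = TrieNode()
    for token, token_id in vocab.items():
        node = root
        for ch in token:
            node = node.children.setdefault(ch, TrieNode())
        node.token_id = token_id
    return root


def tokenize(text: str, root: TrieNode, unk_id: int = 0) -> List[int]:
    """Greedy longest-match: at each position, emit the longest vocab entry."""
    ids: List[int] = []
    i = 0
    while i < len(text):
        node = root
        best_id: Optional[int] = None
        best_end = i
        j = i
        # Walk the trie as far as the text allows, remembering the last
        # position at which a complete vocabulary entry ended.
        while j < len(text) and text[j] in node.children:
            node = node.children[text[j]]
            j += 1
            if node.token_id is not None:
                best_id, best_end = node.token_id, j
        if best_id is None:
            # No entry starts here: emit the unknown id and skip one character.
            ids.append(unk_id)
            i += 1
        else:
            ids.append(best_id)
            i = best_end
    return ids


# Hypothetical toy vocabulary, for illustration only.
vocab = {"<unk>": 0, "un": 1, "believ": 2, "able": 3, "unbeliev": 4, "a": 5}
root = build_trie(vocab)
print(tokenize("unbelievable", root))  # [4, 3]
```

In this toy run, "unbeliev" wins over the shorter prefix "un", and "able" wins over "a", which is the longest-match behavior. A double-array trie encodes the same transitions in flat integer arrays, which is generally what makes the lookup cache-friendly.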