Model Capabilities
20B-parameter transformer with an 800B-token context window
Supports 100+ languages including Lojban and Toki Pona
Live model updates with differential privacy protection
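To make the differential-privacy claim concrete, here is a minimal sketch of one common approach to privatizing model updates: per-example gradient clipping followed by Gaussian noise (DP-SGD style). The function name, parameter values, and the choice of the Gaussian mechanism are illustrative assumptions, not the product's actual implementation.

```python
import numpy as np

def dp_update(gradients, clip_norm=1.0, noise_multiplier=1.1, rng=None):
    """Privatize a batch of per-example gradients.

    Hypothetical sketch: clip each gradient's L2 norm to clip_norm,
    average, then add Gaussian noise scaled by noise_multiplier.
    """
    rng = np.random.default_rng(rng)
    clipped = []
    for g in gradients:
        norm = np.linalg.norm(g)
        # Scale down any gradient whose L2 norm exceeds clip_norm
        clipped.append(g * min(1.0, clip_norm / max(norm, 1e-12)))
    mean = np.mean(clipped, axis=0)
    # Noise standard deviation follows the clipped-sum sensitivity
    sigma = noise_multiplier * clip_norm / len(gradients)
    noise = rng.normal(0.0, sigma, size=mean.shape)
    return mean + noise
```

With `noise_multiplier=0` the function reduces to plain clipped averaging, which makes the clipping behavior easy to verify in isolation.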
Neural Architecture
Multi-layer transformer with residual connections and attention scaling, optimized for parallelized inference
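The two architectural ingredients named above can be sketched in a single sublayer: scaled dot-product attention (dividing scores by the square root of the key dimension) with the input added back through a residual connection. This is a minimal single-head illustration; the function and weight names are assumptions, not the model's actual code.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention_block(x, wq, wk, wv):
    """One attention sublayer: scaled dot-product attention + residual.

    x: (seq_len, d_model); wq, wk, wv: (d_model, d_k) projection matrices.
    """
    q, k, v = x @ wq, x @ wk, x @ wv
    d_k = q.shape[-1]
    scores = q @ k.T / np.sqrt(d_k)   # attention scaling by sqrt(d_k)
    out = softmax(scores) @ v         # each position attends over all positions
    return x + out                    # residual connection around the sublayer
```

Because every position's scores are computed as one matrix product, the sublayer parallelizes naturally across the sequence, which is the property the inference claim relies on.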