Step 3.5 Flash is StepFun's most capable open-source foundation model, engineered to deliver frontier reasoning and agentic capabilities with exceptional efficiency. Built on a sparse Mixture of Experts (MoE) architecture, it activates only 11B of its 196B parameters per token, roughly 5.6% of the total. This intelligence density, a high level of capability per active parameter, allows it to rival the reasoning depth of top-tier proprietary models while staying fast enough for real-time interaction.
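The sparse activation described above can be illustrated with a minimal top-k routing sketch. This is not StepFun's implementation: the expert count, the value of k, the toy experts, and the function names `top_k_route` and `moe_forward` are all assumptions chosen for illustration. Only the 11B and 196B figures come from the description.

```python
import math

def top_k_route(router_logits, k):
    """Pick the k highest-scoring experts and softmax-normalize their weights."""
    ranked = sorted(range(len(router_logits)),
                    key=lambda i: router_logits[i], reverse=True)
    top = ranked[:k]
    peak = max(router_logits[i] for i in top)  # subtract max for numerical stability
    exps = [math.exp(router_logits[i] - peak) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]

def moe_forward(x, experts, router_logits, k):
    """Run only the selected experts and return their weighted sum."""
    return sum(w * experts[i](x) for i, w in top_k_route(router_logits, k))

# Toy experts: each scales its input (hypothetical stand-ins for FFN blocks).
experts = [lambda x, s=s: s * x for s in (1.0, 2.0, 3.0, 4.0)]
logits = [0.1, 2.0, -1.0, 2.0]  # hypothetical router scores for one token

routes = top_k_route(logits, k=2)
y = moe_forward(1.0, experts, logits, k=2)

# Figures quoted in the description: 11B active out of 196B total parameters.
ACTIVE_B, TOTAL_B = 11, 196
active_fraction = ACTIVE_B / TOTAL_B
print(f"active fraction per token: {active_fraction:.1%}")
```

Only the experts chosen by the router run for a given token, which is why per-token compute tracks the active parameter count rather than the total.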
The model achieves deep reasoning at speed through 3-way Multi-Token Prediction (MTP-3), delivering generation throughput of 100–300 tok/s in typical usage with peaks at 350 tok/s for single-stream coding tasks. It functions as a robust engine for coding and agents with seamless native OpenClaw integration, achieving 74.4% on SWE-bench Verified and 51.0% on Terminal-Bench 2.0. Step 3.5 Flash supports efficient long context with a 256K context window using a 3:1 Sliding Window Attention (SWA) ratio, integrating three SWA layers for every one full-attention layer.
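To make the 3:1 SWA ratio concrete, the sketch below lays out a layer stack with three sliding-window layers per full-attention layer and compares how many key/value positions each layout has to keep. The layer count, the window size, the ordering of layers within each group, and the reading of 256K as 256 × 1024 tokens are assumptions for illustration, not published details of the model.

```python
def layer_pattern(num_layers, swa_per_full=3):
    """Repeating layout: swa_per_full SWA layers followed by one full-attention layer."""
    period = swa_per_full + 1
    return ["full" if (i + 1) % period == 0 else "swa" for i in range(num_layers)]

def cached_positions(pattern, context_len, window):
    """Key/value positions each layer keeps for a single sequence."""
    return [context_len if kind == "full" else min(window, context_len)
            for kind in pattern]

NUM_LAYERS = 8            # hypothetical depth, for illustration only
WINDOW = 512              # hypothetical sliding-window size
CONTEXT = 256 * 1024      # assumes 256K means 262,144 tokens

pattern = layer_pattern(NUM_LAYERS)
hybrid_total = sum(cached_positions(pattern, CONTEXT, WINDOW))
full_total = NUM_LAYERS * CONTEXT   # every layer using full attention

print(pattern)
print(f"hybrid / all-full cached positions: {hybrid_total / full_total:.3f}")
```

Under these assumptions the full-attention layers dominate the cache, so the hybrid layout keeps roughly a quarter of the positions an all-full-attention stack would. That is the intuition behind pairing SWA with a long context window.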
Alongside the sparse MoE backbone, which keeps per-token compute at 11B of the 196B total parameters, the model integrates a scalable RL framework that drives consistent self-improvement. It also employs hybrid attention, mixing sliding-window and full-attention layers, for long-context processing.
Step 3.5 Flash performs strongly across reasoning, coding, and agentic tasks, handling complex multi-step reasoning chains with low latency. It remains stable on sophisticated long-horizon tasks and maintains consistent performance across large datasets and long codebases while significantly reducing computational overhead.
The model is optimized for accessible local deployment, running securely on high-end consumer hardware like Mac Studio M4 Max and NVIDIA DGX Spark. It serves as a frontier open-source MoE model built specifically for OpenClaw agents, making it one of the best open models for running serious agents.
Step 3.5 Flash is designed for developers and organizations that need a frontier open-source model for agentic applications, including high-performance coding assistance, complex reasoning, and efficient local deployment. It specifically targets teams building on OpenClaw agents that want robust agent performance while keeping data private through local execution.