fireworks/models/gpt-oss-120b

Common Name: OpenAI gpt-oss-120b

Fireworks
Released on Oct 16 12:00 AMSupportedTool Invocation
CompareTry in Chat

Welcome to the gpt-oss series, OpenAI's open-weight models designed for powerful reasoning, agentic tasks, and versatile developer use cases. gpt-oss-120b is used for production, general purpose, high reasoning use-cases that fits into a single H100 GPU.

Specifications

Context
128K
Inputtext
Outputtext

Performance (7-day Average)

Collecting…
Collecting…
Collecting…

Pricing

Input$0.17/MTokens
Output$0.66/MTokens

Availability Trend (24h)

Performance Metrics (24h)

Similar Models

$0.24/$0.97/M
ctx256Kmaxavailtps
InOutCap

Latest Qwen3 thinking model, competitive against the best close source models in Jul 2025.

$0.24/$0.97/M
ctx256Kmaxavailtps
InOutCap

Updated FP8 version of Qwen3-235B-A22B non-thinking mode, with better tool use, coding, instruction following, logical reasoning and text comprehension capabilities

$0.24/$0.97/M
ctx128Kmaxavailtps
InOutCap

Latest Qwen3 state of the art model, 235B with 22B active parameter model

$0.22/$0.22/M
ctx128Kmaxavailtps
InOutCap

The Meta Llama 3.1 collection of multilingual large language models (LLMs) is a collection of pretrained and instruction tuned generative models in 8B, 70B and 405B sizes. The Llama 3.1 instruction tuned text only models (8B, 70B, 405B) are optimized for multilingual dialogue use cases and outperform many of the available open source and closed chat models on common industry benchmarks.