openai-oxide
A high-performance, feature-complete OpenAI client for Rust, Node.js, and Python.
openai-oxide implements the full Responses API, Chat Completions, and 20+ other endpoints with performance primitives like persistent WebSockets, hedged requests, and early-parsing for function calls.
Why openai-oxide?
- Zero-Overhead Streaming — Custom zero-copy SSE parser, TTFT ~670ms
- WebSocket Mode — Persistent wss:// connections, 37% faster agent loops
- Stream FC Early Parse — Execute tools ~400ms before the response finishes
- SIMD JSON — Opt-in AVX2/NEON for microsecond parsing
- Hedged Requests — 50-96% P99 tail latency reduction
- WASM First-Class — Full streaming in Cloudflare Workers and browsers
Packages
| Package | Registry | Install |
|---|---|---|
| openai-oxide | crates.io | cargo add openai-oxide |
| openai-oxide | npm | npm install openai-oxide |
| openai-oxide | PyPI | pip install openai-oxide |
OpenAI Compatibility
Parameter names match the official Python SDK exactly. If the OpenAI docs show model="gpt-5.4", use .model("gpt-5.4") in Rust or {model: "gpt-5.4"} in Node.js.
See the OpenAI Docs Mapping for a complete cross-reference.
Installation
openai-oxide is available for three platforms. Pick your language:
All packages share the same Rust core for consistent behavior and performance.
Rust Installation
Add to Cargo.toml
cargo add openai-oxide tokio --features tokio/full
Or manually:
[dependencies]
openai-oxide = "0.9"
tokio = { version = "1", features = ["full"] }
Feature Flags
Every API endpoint is behind a feature flag. All enabled by default.
# Minimal: only Responses API
openai-oxide = { version = "0.9", default-features = false, features = ["responses"] }
Available features: chat, responses, embeddings, images, audio, files, fine-tuning, models, moderations, batches, uploads, beta
Ecosystem features: websocket, websocket-wasm, simd, macros
Configuration
#![allow(unused)]
fn main() {
use openai_oxide::OpenAI;
// From environment variable (recommended)
let client = OpenAI::from_env()?; // Uses OPENAI_API_KEY
// Explicit key
let client = OpenAI::new("sk-...");
// Custom config
use openai_oxide::config::ClientConfig;
let client = OpenAI::with_config(
ClientConfig::new("sk-...").base_url("https://...").timeout_secs(30)
);
// Azure
use openai_oxide::azure::AzureConfig;
let client = OpenAI::azure(
AzureConfig::new().azure_endpoint("https://my.openai.azure.com").api_key("...")
)?;
}
API Reference
Full API docs: docs.rs/openai-oxide
Node.js Installation
Install
npm install openai-oxide
# or
pnpm add openai-oxide
# or
yarn add openai-oxide
Prebuilt native binaries for: macOS (x64, arm64), Linux (x64, arm64, glibc & musl), Windows (x64).
Setup
const { Client } = require("openai-oxide");
// Uses OPENAI_API_KEY from environment
let client = new Client();
// Explicit key
client = new Client("sk-...");
Available Methods
| Method | Description |
|---|---|
| createResponse(params) | Full Responses API call |
| createText(model, input) | Fast path — returns text only |
| createStoredResponseId(model, input) | Fast path — returns response ID |
| createTextFollowup(model, input, prevId) | Multi-turn fast path |
| createStream(model, input) | Streaming responses |
| wsSession() | WebSocket persistent connection |
npm Package
npmjs.com/package/openai-oxide
Python Installation
Install
pip install openai-oxide
# or
uv pip install openai-oxide
# or
uv add openai-oxide
No Rust toolchain required — prebuilt wheels available.
Setup
from openai_oxide import Client
# Uses OPENAI_API_KEY from environment
client = Client()
# Explicit key
client = Client("sk-...")
Available Methods
| Method | Description |
|---|---|
| await client.create(model, input) | Basic request |
| await client.create_stream(model, input) | Streaming |
| await client.create_structured(model, input, name, schema) | Structured output |
| await client.create_with_tools(model, input, tools) | Function calling |
PyPI Package
pypi.org/project/openai-oxide
Quick Start
Set your API key:
export OPENAI_API_KEY="sk-..."
Rust
use openai_oxide::{OpenAI, types::responses::*};
#[tokio::main]
async fn main() -> Result<(), openai_oxide::OpenAIError> {
let client = OpenAI::from_env()?;
let response = client.responses().create(
ResponseCreateRequest::new("gpt-5.4-mini")
.input("Explain quantum computing in one sentence.")
.max_output_tokens(100)
).await?;
println!("{}", response.output_text());
Ok(())
}
Node.js
const { Client } = require("openai-oxide");
const client = new Client();
const text = await client.createText("gpt-5.4-mini", "Hello from Node!");
console.log(text);
Python
import asyncio, json
from openai_oxide import Client
async def main():
client = Client()
res = json.loads(await client.create("gpt-5.4-mini", "Hello from Python!"))
print(res["text"])
asyncio.run(main())
Drop-in Migration
Switch from the official OpenAI SDK by changing one import line; the rest of your code stays the same.
Python
- from openai import AsyncOpenAI
+ from openai_oxide.compat import AsyncOpenAI
Full working example (mirrors official openai examples/parsing.py):
#!/usr/bin/env python3
"""
Drop-in replacement for official openai SDK parsing example.
Change: `from openai import AsyncOpenAI` → `from openai_oxide.compat import AsyncOpenAI`
"""
import asyncio
from typing import List
from pydantic import BaseModel
# ── One-line change from official SDK ──
# from openai import AsyncOpenAI
from openai_oxide.compat import AsyncOpenAI
class Step(BaseModel):
explanation: str
output: str
class MathResponse(BaseModel):
steps: List[Step]
final_answer: str
async def main():
client = AsyncOpenAI()
completion = await client.chat.completions.parse(
model="gpt-5.4-mini",
messages=[
{"role": "system", "content": "You are a helpful math tutor."},
{"role": "user", "content": "solve 8x + 31 = 2"},
],
response_format=MathResponse,
)
message = completion.choices[0].message
if message.parsed:
for step in message.parsed.steps:
print(f" {step.explanation} → {step.output}")
print("answer:", message.parsed.final_answer)
else:
print("refusal:", message.refusal)
asyncio.run(main())
Node.js
- const OpenAI = require('openai');
+ const { OpenAI } = require('openai-oxide/compat');
Full working example (mirrors official openai SDK):
/**
* Drop-in replacement for official openai SDK demo.
* Change: `const OpenAI = require('openai')` → `const { OpenAI } = require('openai-oxide/compat')`
*/
// ── One-line change from official SDK ──
// const OpenAI = require('openai');
const { OpenAI } = require('openai-oxide/compat');
async function main() {
const client = new OpenAI();
// Non-streaming:
console.log("----- standard request -----");
const completion = await client.chat.completions.create({
model: "gpt-5.4-mini",
messages: [{ role: "user", content: "Say this is a test" }],
});
console.log(completion.choices[0].message.content);
// Streaming:
console.log("----- streaming request -----");
const stream = await client.chat.completions.create({
model: "gpt-5.4-mini",
messages: [{ role: "user", content: "How do I list files in a directory using Node.js?" }],
stream: true,
});
for await (const chunk of stream) {
const content = chunk.choices?.[0]?.delta?.content;
if (content) process.stdout.write(content);
}
console.log();
}
main();
Guides
Step-by-step guides for common tasks with openai-oxide.
Each guide shows code in all three languages (Rust, Node.js, Python) and links to the relevant OpenAI documentation.
Chat Completions
Send messages to GPT models and receive completions. This is the most common API for conversational AI.
See the official Chat Completions guide and API reference.
Rust
//! Basic chat completion example.
//!
//! Run with: `OPENAI_API_KEY=sk-... cargo run --example chat`
use openai_oxide::OpenAI;
use openai_oxide::types::chat::*;
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
let client = OpenAI::from_env()?;
let request = ChatCompletionRequest::new(
"gpt-4o-mini",
vec![
ChatCompletionMessageParam::System {
content: "You are a helpful assistant.".into(),
name: None,
},
ChatCompletionMessageParam::User {
content: UserContent::Text("What is the capital of France?".into()),
name: None,
},
],
);
let response = client.chat().completions().create(request).await?;
for choice in &response.choices {
println!(
"[{}] {}",
choice.finish_reason,
choice.message.content.as_deref().unwrap_or("")
);
}
if let Some(usage) = &response.usage {
println!(
"\nTokens: {} prompt + {} completion = {} total",
usage.prompt_tokens.unwrap_or(0),
usage.completion_tokens.unwrap_or(0),
usage.total_tokens.unwrap_or(0),
);
}
Ok(())
}
Run: OPENAI_API_KEY=sk-... cargo run --example chat
Next Steps
- Streaming — Stream chat completion tokens as they arrive
- Function Calling — Let the model call your functions
- Structured Output — Get JSON responses matching a schema
Responses API
The Responses API is OpenAI’s latest endpoint for generating text, replacing Chat Completions for new projects. It supports built-in tools, multi-turn conversations via previous_response_id, and structured output.
See the official Responses API reference.
Rust
//! Responses API example — web search + function tools.
//!
//! Run with: `OPENAI_API_KEY=sk-... cargo run --example responses_api`
use openai_oxide::OpenAI;
use openai_oxide::types::responses::*;
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
let client = OpenAI::from_env()?;
// Simple text input
let request = ResponseCreateRequest::new("gpt-4o")
.input("What are the latest developments in Rust programming?")
.instructions("Be concise and cite sources when possible.")
.tools(vec![ResponseTool::WebSearch {
search_context_size: Some("medium".into()),
user_location: None,
}])
.temperature(0.7)
.max_output_tokens(1024)
.store(true);
let response = client.responses().create(request).await?;
println!("Response ID: {}", response.id);
println!("Status: {:?}", response.status);
println!("\n{}", response.output_text());
if let Some(usage) = &response.usage {
println!(
"\nTokens: {} in + {} out = {} total",
usage.input_tokens.unwrap_or(0),
usage.output_tokens.unwrap_or(0),
usage.total_tokens.unwrap_or(0),
);
}
// Multi-turn with previous_response_id
let follow_up = ResponseCreateRequest::new("gpt-4o")
.input("Can you elaborate on the async ecosystem?")
.previous_response_id(&response.id);
let response2 = client.responses().create(follow_up).await?;
println!("\n--- Follow-up ---\n{}", response2.output_text());
Ok(())
}
Run: OPENAI_API_KEY=sk-... cargo run --example responses_api
Next Steps
- Streaming — Stream response events in real time
- Function Calling — Use tools with the Responses API
- WebSocket Sessions — Persistent connections for agent loops
Streaming
Stream tokens and events as they are generated, reducing time-to-first-token (TTFT) and enabling real-time UI updates.
See the official Streaming documentation for event types and behavior.
Stream Helpers (recommended)
High-level wrapper with typed events and automatic accumulation — no manual chunk stitching.
//! Streaming chat completion example.
//!
//! Run with: `OPENAI_API_KEY=sk-... cargo run --example chat_stream`
use futures_util::StreamExt;
use openai_oxide::OpenAI;
use openai_oxide::types::chat::*;
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
let client = OpenAI::from_env()?;
let request = ChatCompletionRequest::new(
"gpt-4o-mini",
vec![
ChatCompletionMessageParam::System {
content: "You are a helpful assistant.".into(),
name: None,
},
ChatCompletionMessageParam::User {
content: UserContent::Text("Write a haiku about Rust programming.".into()),
name: None,
},
],
);
let mut stream = client.chat().completions().create_stream(request).await?;
while let Some(result) = stream.next().await {
match result {
Ok(chunk) => {
for choice in &chunk.choices {
if let Some(content) = &choice.delta.content {
print!("{content}");
}
if choice.finish_reason.is_some() {
println!();
}
}
}
Err(e) => {
eprintln!("\nStream error: {e}");
break;
}
}
}
Ok(())
}
Event Types
| Event | When | Fields |
|---|---|---|
| Chunk | Every SSE chunk | Raw ChatCompletionChunk |
| ContentDelta | New text fragment | delta, snapshot (accumulated) |
| ContentDone | Text complete | content (full text) |
| ToolCallDelta | Argument fragment | index, name, arguments_delta, arguments_snapshot |
| ToolCallDone | Tool call complete | index, call_id, name, arguments |
| RefusalDelta/Done | Model refuses | delta / refusal |
| Done | Stream finished | finish_reason |
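The delta/snapshot pairing above can be illustrated outside the SDK. This is a plain-Python sketch of what the stream helpers accumulate for you — the ContentDelta shape here mirrors the table, it is not the library's actual type:

```python
# Illustrative sketch (not the SDK): stream helpers fold raw text deltas
# into events that carry both the fragment and a running snapshot.
from dataclasses import dataclass

@dataclass
class ContentDelta:
    delta: str      # the new text fragment from this chunk
    snapshot: str   # everything accumulated so far

def accumulate(deltas):
    """Turn a sequence of raw fragments into ContentDelta events."""
    snapshot = ""
    events = []
    for d in deltas:
        snapshot += d
        events.append(ContentDelta(delta=d, snapshot=snapshot))
    return events

events = accumulate(["Hel", "lo, ", "world"])
assert events[-1].snapshot == "Hello, world"
```

With helpers, your handler reads `snapshot` when it wants the text so far and `delta` when it only needs the increment — no manual chunk stitching.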
Node.js (drop-in replacement)
Same syntax as official openai package — for await over stream:
/**
* Drop-in replacement for official openai SDK demo.
* Change: `const OpenAI = require('openai')` → `const { OpenAI } = require('openai-oxide/compat')`
*/
// ── One-line change from official SDK ──
// const OpenAI = require('openai');
const { OpenAI } = require('openai-oxide/compat');
async function main() {
const client = new OpenAI();
// Non-streaming:
console.log("----- standard request -----");
const completion = await client.chat.completions.create({
model: "gpt-5.4-mini",
messages: [{ role: "user", content: "Say this is a test" }],
});
console.log(completion.choices[0].message.content);
// Streaming:
console.log("----- streaming request -----");
const stream = await client.chat.completions.create({
model: "gpt-5.4-mini",
messages: [{ role: "user", content: "How do I list files in a directory using Node.js?" }],
stream: true,
});
for await (const chunk of stream) {
const content = chunk.choices?.[0]?.delta?.content;
if (content) process.stdout.write(content);
}
console.log();
}
main();
Python (drop-in replacement)
Same syntax as official openai package — async for over stream:
#!/usr/bin/env python3
"""
Drop-in replacement for official openai SDK demo.
Change: `from openai import AsyncOpenAI` → `from openai_oxide.compat import AsyncOpenAI`
"""
import asyncio
# ── One-line change from official SDK ──
# from openai import AsyncOpenAI
from openai_oxide.compat import AsyncOpenAI
async def main():
client = AsyncOpenAI()
# Non-streaming:
print("----- standard request -----")
completion = await client.chat.completions.create(
model="gpt-5.4-mini",
messages=[
{
"role": "user",
"content": "Say this is a test",
},
],
)
print(completion.choices[0].message.content)
# Streaming:
print("----- streaming request -----")
stream = await client.chat.completions.create(
model="gpt-5.4-mini",
messages=[
{
"role": "user",
"content": "How do I output all files in a directory using Python?",
},
],
stream=True,
)
async for event in stream:
if event.get("type") == "OutputTextDelta":
print(event.get("delta", ""), end="")
elif event.get("delta"):
print(event.get("delta", ""), end="")
print()
asyncio.run(main())
Responses API Streaming
Typed events for the Responses API:
#![allow(unused)]
fn main() {
use futures_util::StreamExt;
use openai_oxide::types::responses::{ResponseCreateRequest, ResponseStreamEvent};
let mut stream = client.responses()
.create_stream(ResponseCreateRequest::new("gpt-5.4-mini").input("Hi"))
.await?;
while let Some(Ok(event)) = stream.next().await {
match event {
ResponseStreamEvent::OutputTextDelta { delta, .. } => print!("{delta}"),
ResponseStreamEvent::ResponseCompleted { response } => {
println!("\nDone: {}", response.output_text());
}
_ => {}
}
}
}
Next Steps
- Function Calling — Stream with early tool-call parsing
- WebSocket Sessions — Even lower latency with persistent connections
- Structured Output — Type-safe responses
Function Calling
Let the model invoke your functions by defining tools. openai-oxide supports early-parsing of function call arguments during streaming, allowing you to execute tools ~400ms before the response finishes.
See the official Function Calling guide for tool schema definitions.
Rust
//! Tool calling example — model calls a function, we return the result.
//!
//! Run with: `OPENAI_API_KEY=sk-... cargo run --example tool_calling`
use openai_oxide::OpenAI;
use openai_oxide::types::chat::*;
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
let client = OpenAI::from_env()?;
let tools = vec![Tool {
type_: "function".into(),
function: FunctionDef {
name: "get_weather".into(),
description: Some("Get current weather for a city".into()),
parameters: Some(serde_json::json!({
"type": "object",
"properties": {
"city": {"type": "string", "description": "City name"},
"unit": {"type": "string", "enum": ["celsius", "fahrenheit"]}
},
"required": ["city"]
})),
strict: Some(true),
},
}];
// Step 1: Send message with tools
let request = ChatCompletionRequest::new(
"gpt-4o-mini",
vec![ChatCompletionMessageParam::User {
content: UserContent::Text("What's the weather in Tokyo?".into()),
name: None,
}],
)
.tools(tools.clone())
.tool_choice(ToolChoice::Mode("auto".into()));
let response = client.chat().completions().create(request).await?;
let message = &response.choices[0].message;
if let Some(tool_calls) = &message.tool_calls {
for tc in tool_calls {
println!(
"Tool call: {} ({})",
tc.function.name, tc.function.arguments
);
// Step 2: Simulate function result
let result = r#"{"temperature": 22, "condition": "Sunny", "unit": "celsius"}"#;
// Step 3: Send tool result back
let follow_up = ChatCompletionRequest::new(
"gpt-4o-mini",
vec![
ChatCompletionMessageParam::User {
content: UserContent::Text("What's the weather in Tokyo?".into()),
name: None,
},
ChatCompletionMessageParam::Assistant {
content: None,
name: None,
tool_calls: Some(tool_calls.clone()),
refusal: None,
},
ChatCompletionMessageParam::Tool {
content: result.into(),
tool_call_id: tc.id.clone(),
},
],
)
.tools(tools.clone());
let final_response = client.chat().completions().create(follow_up).await?;
println!(
"\nAssistant: {}",
final_response.choices[0]
.message
.content
.as_deref()
.unwrap_or("")
);
}
} else {
println!("Assistant: {}", message.content.as_deref().unwrap_or(""));
}
Ok(())
}
Run: OPENAI_API_KEY=sk-... cargo run --example tool_calling
Next Steps
- Streaming — Stream function call arguments as they arrive
- Structured Output — Combine tools with structured responses
WebSocket Sessions
Persistent WebSocket connections eliminate per-request TLS handshakes and HTTP overhead, achieving 37% faster round-trip times for agent loops and multi-turn conversations.
See the official Realtime API guide for session configuration.
Rust
//! WebSocket Responses API example — persistent connection for multi-turn.
//!
//! Run with: `OPENAI_API_KEY=sk-... cargo run --example websocket --features websocket`
use futures_util::StreamExt;
use openai_oxide::OpenAI;
use openai_oxide::types::responses::*;
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
let client = OpenAI::from_env()?;
// --- Example 1: Simple send/receive ---
println!("=== Example 1: Simple request ===\n");
let mut session = client.ws_session().await?;
let request = ResponseCreateRequest::new("gpt-4o-mini")
.input("What is the capital of France?")
.max_output_tokens(256);
let response = session.send(request).await?;
println!("Response: {}", response.output_text());
if let Some(usage) = &response.usage {
println!(
"Tokens: {} in + {} out",
usage.input_tokens.unwrap_or(0),
usage.output_tokens.unwrap_or(0),
);
}
// --- Example 2: Multi-turn via same session ---
println!("\n=== Example 2: Multi-turn ===\n");
let follow_up = ResponseCreateRequest::new("gpt-4o-mini")
.input("What about Germany?")
.previous_response_id(&response.id);
let response2 = session.send(follow_up).await?;
println!("Follow-up: {}", response2.output_text());
// --- Example 3: Streaming events ---
println!("\n=== Example 3: Streaming ===\n");
let stream_request = ResponseCreateRequest::new("gpt-4o-mini")
.input("Count from 1 to 5, one number per line.")
.max_output_tokens(128);
let mut stream = session.send_stream(stream_request).await?;
while let Some(event) = stream.next().await {
let event = event?;
use openai_oxide::types::responses::ResponseStreamEvent::*;
match event {
ResponseOutputTextDelta(evt) => print!("{}", evt.delta),
ResponseCompleted(_) => println!("\n\n[completed]"),
_ => {} // Other events: created, output_item.added, etc.
}
}
// --- Clean up ---
session.close().await?;
println!("\nSession closed.");
Ok(())
}
Run: OPENAI_API_KEY=sk-... cargo run --example websocket --features websocket
When to Use WebSockets
- Agent loops with 3+ sequential LLM calls
- Real-time conversational UIs
- High-throughput batch processing where latency matters
Known Issues
Decimal temperature causes silent close (code=1000)
Status: OpenAI bug as of March 2026
Sending temperature as a decimal (e.g. 0.7, 1.2) over WebSocket causes the server to immediately close the connection with code=1000 and an empty reason — no error event is returned. Integer values (0, 1, 2) work fine. The same decimal values work normally over HTTP.
Workaround: Omit temperature from WebSocket requests (the API uses model default ~1.0), or round to integer.
Tracking: OpenAI Community #1375536
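Until the bug is fixed, the workaround can be applied as a small pre-send step. This is a hypothetical helper in plain Python — not part of the SDK — that keeps integer temperatures and omits decimal ones:

```python
# Hypothetical pre-send helper (not part of the SDK) implementing the
# workaround: integer temperatures pass through, decimals are dropped.
def sanitize_ws_request(params: dict) -> dict:
    out = dict(params)
    t = out.get("temperature")
    if isinstance(t, float):
        if t.is_integer():
            out["temperature"] = int(t)  # 1.0 -> 1 is safe to send
        else:
            # Decimal values trigger the silent close; omit the field and
            # let the API fall back to the model default (~1.0).
            out.pop("temperature")
    return out

print(sanitize_ws_request({"model": "gpt-5.4-mini", "temperature": 0.7}))
# {'model': 'gpt-5.4-mini'}
```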
Structured Output
Force the model to return JSON matching a specific schema. Guarantees valid, parseable output without prompt engineering tricks.
See the official Structured Outputs guide for schema format and limitations.
Rust — parse::<T>() (recommended)
Derive JsonSchema on your struct and call parse::<T>(). The SDK auto-generates the schema and deserializes the response.
Requires feature structured: cargo add openai-oxide --features structured
#![allow(unused)]
fn main() {
// Live test for all new features — requires OPENAI_API_KEY
//
// cargo run --example live_features_test --features structured
use futures_util::StreamExt;
use openai_oxide::OpenAI;
use openai_oxide::stream_helpers::ChatStreamEvent;
use openai_oxide::types::chat::{ChatCompletionMessageParam, ChatCompletionRequest, UserContent};
#[derive(Debug, serde::Deserialize, schemars::JsonSchema)]
struct MathAnswer {
steps: Vec<Step>,
final_answer: String,
}
#[derive(Debug, serde::Deserialize, schemars::JsonSchema)]
struct Step {
explanation: String,
output: String,
}
#[derive(Debug, serde::Deserialize, schemars::JsonSchema)]
struct Sentiment {
sentiment: String,
confidence: f64,
}
fn msg(text: &str) -> Vec<ChatCompletionMessageParam> {
vec![ChatCompletionMessageParam::User {
content: UserContent::Text(text.into()),
name: None,
}]
}
}
Rust — Manual Schema
For full control, construct the schema yourself:
//! Structured output with JSON Schema — model returns a validated JSON object.
//!
//! Run with: `OPENAI_API_KEY=sk-... cargo run --example structured_output`
use openai_oxide::OpenAI;
use openai_oxide::types::chat::*;
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
let client = OpenAI::from_env()?;
let request = ChatCompletionRequest::new(
"gpt-4o-mini",
vec![
ChatCompletionMessageParam::System {
content: "Extract structured data from user messages.".into(),
name: None,
},
ChatCompletionMessageParam::User {
content: UserContent::Text(
"My name is Alice, I'm 30, and I work as a software engineer at Acme Corp."
.into(),
),
name: None,
},
],
)
.response_format(ResponseFormat::JsonSchema {
json_schema: JsonSchema {
name: "person_info".into(),
description: Some("Extracted person information".into()),
schema: Some(serde_json::json!({
"type": "object",
"properties": {
"name": {"type": "string"},
"age": {"type": "integer"},
"occupation": {"type": "string"},
"company": {"type": "string"}
},
"required": ["name", "age", "occupation", "company"],
"additionalProperties": false
})),
strict: Some(true),
},
});
let response = client.chat().completions().create(request).await?;
let content = response.choices[0]
.message
.content
.as_deref()
.unwrap_or("{}");
let parsed: serde_json::Value = serde_json::from_str(content)?;
println!("Extracted data:");
println!(" Name: {}", parsed["name"]);
println!(" Age: {}", parsed["age"]);
println!(" Occupation: {}", parsed["occupation"]);
println!(" Company: {}", parsed["company"]);
Ok(())
}
Node.js (drop-in replacement)
Same syntax as official openai package — change one import:
/**
* Drop-in replacement for official openai SDK structured output example.
* Change: `const OpenAI = require('openai')` → `const { OpenAI } = require('openai-oxide/compat')`
*
* For Zod support: npm install zod zod-to-json-schema
*/
// ── One-line change from official SDK ──
// const OpenAI = require('openai');
const { OpenAI } = require('openai-oxide/compat');
async function main() {
const client = new OpenAI();
// JSON Schema (works without Zod)
const MathResponseSchema = {
type: "object",
properties: {
steps: {
type: "array",
items: {
type: "object",
properties: {
explanation: { type: "string" },
output: { type: "string" },
},
required: ["explanation", "output"],
additionalProperties: false,
},
},
final_answer: { type: "string" },
},
required: ["steps", "final_answer"],
additionalProperties: false,
};
const result = await client.chat.completions.parse({
model: "gpt-5.4-mini",
messages: [
{ role: "system", content: "You are a helpful math tutor." },
{ role: "user", content: "solve 8x + 31 = 2" },
],
response_format: {
type: "json_schema",
json_schema: {
name: "MathResponse",
schema: MathResponseSchema,
strict: true,
},
},
});
const message = result.choices[0].message;
const parsed = JSON.parse(message.content);
for (const step of parsed.steps) {
console.log(` ${step.explanation} → ${step.output}`);
}
console.log("answer:", parsed.final_answer);
}
main();
Python (drop-in replacement)
Same syntax as official openai package — change one import:
#!/usr/bin/env python3
"""
Drop-in replacement for official openai SDK parsing example.
Change: `from openai import AsyncOpenAI` → `from openai_oxide.compat import AsyncOpenAI`
"""
import asyncio
from typing import List
from pydantic import BaseModel
# ── One-line change from official SDK ──
# from openai import AsyncOpenAI
from openai_oxide.compat import AsyncOpenAI
class Step(BaseModel):
explanation: str
output: str
class MathResponse(BaseModel):
steps: List[Step]
final_answer: str
async def main():
client = AsyncOpenAI()
completion = await client.chat.completions.parse(
model="gpt-5.4-mini",
messages=[
{"role": "system", "content": "You are a helpful math tutor."},
{"role": "user", "content": "solve 8x + 31 = 2"},
],
response_format=MathResponse,
)
message = completion.choices[0].message
if message.parsed:
for step in message.parsed.steps:
print(f" {step.explanation} → {step.output}")
print("answer:", message.parsed.final_answer)
else:
print("refusal:", message.refusal)
asyncio.run(main())
Next Steps
- Function Calling — Combine structured output with tool use
- Streaming — Stream with typed events
- Responses API — Full parameter reference
Hedged Requests
Hedged requests race multiple identical API calls and return the first successful response. This technique reduces P99 tail latency by 50-96% at the cost of additional API usage.
This is an openai-oxide exclusive feature not available in the official SDKs.
Rust
#![allow(unused)]
fn main() {
use openai_oxide::{OpenAI, types::responses::*};
let client = OpenAI::from_env()?;
// Race 2 identical requests, return whichever finishes first
let response = client.responses().hedged_request(
ResponseCreateRequest::new("gpt-5.4-mini")
.input("Quick question: what is 2+2?"),
2, // number of concurrent requests
).await?;
}
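The underlying racing pattern is plain concurrency, independent of the SDK. A minimal asyncio sketch of the technique — `make_call` is any zero-argument coroutine factory; this is illustrative, not the crate's implementation:

```python
# Illustrative hedging sketch: race n identical calls, return the first
# success, cancel the rest.
import asyncio
import random

async def hedged(make_call, n=2):
    """Start n copies of the call; return the first successful result."""
    pending = {asyncio.ensure_future(make_call()) for _ in range(n)}
    try:
        while pending:
            done, pending = await asyncio.wait(
                pending, return_when=asyncio.FIRST_COMPLETED
            )
            for task in done:
                if task.exception() is None:
                    return task.result()  # winner; losers cancelled below
        raise RuntimeError("all hedged requests failed")
    finally:
        for task in pending:
            task.cancel()  # stop the slower copies

async def fake_api():
    # Stand-in for an API call with variable latency.
    await asyncio.sleep(random.uniform(0.01, 0.05))
    return "ok"

print(asyncio.run(hedged(fake_api, n=2)))
# ok
```

The P99 win comes from the minimum of n latency samples being much tighter than any single sample; the cost is up to n times the tokens, since each copy may be billed.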
When to Use
- Latency-sensitive applications (real-time UIs, voice assistants)
- Short, deterministic prompts where cost of duplicates is low
- Production systems with strict P99 SLA requirements
Trade-offs
- Uses N times the tokens (one request per hedge)
- Best for short prompts where the latency gain outweighs cost
- Not recommended for long-running completions with max_output_tokens > 1000
Webhook Verification
Verify OpenAI webhook signatures to ensure payloads are authentic and not replayed.
Requires feature webhooks: cargo add openai-oxide --features webhooks
See the official Webhooks documentation for setup.
Usage
#![allow(unused)]
fn main() {
use openai_oxide::resources::webhooks::Webhooks;
// Initialize with your webhook secret (from OpenAI dashboard)
let wh = Webhooks::new("whsec_YOUR_WEBHOOK_SECRET")?;
// In your HTTP handler — extract headers and body
let signature = headers.get("webhook-signature").unwrap();
let timestamp = headers.get("webhook-timestamp").unwrap();
// Verify and parse in one call
let event: serde_json::Value = wh.unwrap(body_bytes, signature, timestamp)?;
// Or verify only (without parsing)
wh.verify(body_bytes, signature, timestamp)?;
}
Security
- HMAC-SHA256 signature validation
- Timestamp replay protection (5-minute tolerance)
- Supports multiple signature versions in header
- Base64-encoded whsec_ secrets (prefix auto-stripped)
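The verification steps above can be sketched in plain Python. The exact signed-content layout is an assumption here (timestamp-dot-body, `v1,<base64>` signature entries) — consult the official Webhooks docs for the real wire format:

```python
# Illustrative sketch of the checks listed above (not the crate's exact
# wire format): HMAC-SHA256, base64 whsec_ secret, 5-minute replay window.
import base64
import hashlib
import hmac
import time

TOLERANCE_SECS = 5 * 60  # replay-protection window

def verify(secret: str, body: bytes, signature_header: str, timestamp: str) -> bool:
    # whsec_ prefix wraps a base64-encoded key; strip and decode it.
    key = base64.b64decode(secret.removeprefix("whsec_"))
    # Reject stale timestamps to prevent replay attacks.
    if abs(time.time() - int(timestamp)) > TOLERANCE_SECS:
        return False
    # Sign "<timestamp>.<body>" (assumed layout) and compare.
    signed = f"{timestamp}.".encode() + body
    expected = base64.b64encode(
        hmac.new(key, signed, hashlib.sha256).digest()
    ).decode()
    # The header may carry several space-separated "v1,<sig>" entries.
    for part in signature_header.split():
        version, _, sig = part.partition(",")
        if version == "v1" and hmac.compare_digest(sig, expected):
            return True
    return False
```

Note the constant-time comparison (`hmac.compare_digest`) — comparing signatures with `==` leaks timing information.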
Next Steps
- API Reference — Full method signatures
WASM / Cloudflare Workers
openai-oxide compiles to WebAssembly for use in Cloudflare Workers, Deno Deploy, and browser environments. Full streaming support is included.
Setup
[dependencies]
openai-oxide = { version = "0.9", default-features = false, features = ["responses", "websocket-wasm"] }
Disable default features to exclude tokio and native TLS, which are not available in WASM.
Cloudflare Worker Example
use openai_oxide::OpenAI;
use openai_oxide::types::responses::ResponseCreateRequest;
use worker::*;
#[event(fetch)]
async fn main(req: Request, env: Env, _ctx: Context) -> Result<Response> {
let client = OpenAI::new(env.secret("OPENAI_API_KEY")?.to_string());
let response = client.responses().create(
ResponseCreateRequest::new("gpt-5.4-mini")
.input("Hello from the edge!")
).await.map_err(|e| worker::Error::from(e.to_string()))?;
Response::ok(response.output_text())
}
Limitations
- No filesystem access (audio file uploads require bytes, not paths)
- WebSocket mode uses the websocket-wasm feature instead of websocket
- The SIMD feature is not available on WASM targets
Benchmarks
All benchmarks: median of 3 runs × 5 iterations each. Model: gpt-5.4. Environment: macOS (M-series), release mode, warm HTTP/2 connections.
Rust Ecosystem (openai-oxide vs async-openai vs genai)
| Test | openai-oxide | async-openai | genai |
|---|---|---|---|
| Plain text | 1011ms | 960ms | 835ms |
| Structured output | 1331ms | N/A | 1197ms |
| Function calling | 1192ms | 1748ms | 1030ms |
| Multi-turn (2 reqs) | 2362ms | 3275ms | 1641ms |
| Streaming TTFT | 645ms | 685ms | 670ms |
| Parallel 3x | 1165ms | 1053ms | 866ms |
WebSocket mode (openai-oxide only)
| Test | WebSocket | HTTP | Improvement |
|---|---|---|---|
| Plain text | 710ms | 1011ms | -29% |
| Multi-turn (2 reqs) | 1425ms | 2362ms | -40% |
| Rapid-fire (5 calls) | 3227ms | 5807ms | -44% |
median of medians, 3×5 iterations. Model: gpt-5.4. macOS (M-series), release mode, warm HTTP/2 connections.
Reproduce: cargo run --example benchmark --features responses --release
Python Ecosystem (openai-oxide-python vs openai)
openai-oxide wins 10/12 tests. Native PyO3 bindings vs the official openai package (v2.29.0).
| Test | openai-oxide | openai | Winner |
|---|---|---|---|
| Plain text | 845ms | 997ms | OXIDE (+15%) |
| Structured output | 1367ms | 1379ms | OXIDE (+1%) |
| Function calling | 1195ms | 1230ms | OXIDE (+3%) |
| Multi-turn (2 reqs) | 2260ms | 3089ms | OXIDE (+27%) |
| Web search | 3157ms | 3499ms | OXIDE (+10%) |
| Nested structured | 5377ms | 5339ms | python (+1%) |
| Agent loop (2-step) | 4570ms | 5144ms | OXIDE (+11%) |
| Rapid-fire (5 calls) | 5667ms | 6136ms | OXIDE (+8%) |
| Prompt-cached | 4425ms | 5564ms | OXIDE (+20%) |
| Streaming TTFT | 626ms | 638ms | OXIDE (+2%) |
| Parallel 3x | 1184ms | 1090ms | python (+9%) |
| Hedged (2x race) | 893ms | 995ms | OXIDE (+10%) |
median of medians, 3×5 iterations. Model: gpt-5.4.
Reproduce: cd openai-oxide-python && uv run python ../examples/bench_python.py
Node.js Ecosystem (openai-oxide vs openai)
openai-oxide wins 8/8 tests. Native napi-rs bindings vs official openai npm.
| Test | openai-oxide | openai | Winner |
|---|---|---|---|
| Plain text | 1075ms | 1311ms | OXIDE (+18%) |
| Structured output | 1370ms | 1765ms | OXIDE (+22%) |
| Function calling | 1725ms | 1832ms | OXIDE (+6%) |
| Multi-turn (2 reqs) | 2283ms | 2859ms | OXIDE (+20%) |
| Rapid-fire (5 calls) | 6246ms | 6936ms | OXIDE (+10%) |
| Streaming TTFT | 534ms | 580ms | OXIDE (+8%) |
| Parallel 3x | 1937ms | 1991ms | OXIDE (+3%) |
| WebSocket hot pair | 2181ms | N/A | OXIDE |
Median of medians, 3×5 iterations. Model: gpt-5.4.
Reproduce: cd openai-oxide-node && BENCH_ITERATIONS=5 node examples/bench_node.js
How to run
Rust
cargo run --example benchmark --features responses --release
Python
cd openai-oxide-python && uv run python ../examples/bench_python.py
Node.js
cd openai-oxide-node && BENCH_ITERATIONS=5 node examples/bench_node.js
Methodology
- Warm connections: First request is a warmup (not measured). All subsequent requests reuse HTTP/2 connections with keep-alive.
- Median of medians: Each test runs 5 iterations per run, 3 runs total. We report the median of the 3 median values.
- Same prompts: Both clients send identical requests to the same model.
- Release mode: Rust benchmarks compiled with --release. Python and Node use prebuilt native extensions.
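The median-of-medians scoring described above can be sketched in a few lines of std-only Rust. median_of_medians is an illustrative helper, not part of the benchmark harness:

```rust
// Each test: 5 iterations per run, 3 runs; report the median of the
// 3 per-run medians. Assumes odd-length runs so the middle element
// is the median.
fn median(samples: &mut [f64]) -> f64 {
    samples.sort_by(|a, b| a.partial_cmp(b).unwrap());
    samples[samples.len() / 2]
}

fn median_of_medians(runs: &[Vec<f64>]) -> f64 {
    let mut medians: Vec<f64> = runs
        .iter()
        .map(|run| {
            let mut r = run.clone();
            median(&mut r)
        })
        .collect();
    median(&mut medians)
}

fn main() {
    // Three runs of five latency samples each (milliseconds).
    let runs = vec![
        vec![650.0, 645.0, 700.0, 640.0, 660.0],
        vec![655.0, 648.0, 645.0, 690.0, 642.0],
        vec![700.0, 645.0, 644.0, 646.0, 651.0],
    ];
    println!("reported: {} ms", median_of_medians(&runs)); // prints "reported: 648 ms"
}
```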
Updating benchmarks
- Edit benchmarks/results.json with new numbers
- Run python3 benchmarks/generate.py to regenerate tables
- Docs and README include the tables from the generated files
API Reference
Full API documentation for each platform:
| Platform | Documentation |
|---|---|
| Rust | docs.rs/openai-oxide |
| Node.js | npmjs.com/package/openai-oxide |
| Python | pypi.org/project/openai-oxide |
Rust API
The Rust crate provides the most complete API surface. All endpoints are accessed through the OpenAI client via resource methods:
| Resource | Access | Docs |
|---|---|---|
| Chat Completions | client.chat().completions() | docs.rs |
| Responses | client.responses() | docs.rs |
| Embeddings | client.embeddings() | docs.rs |
| Images | client.images() | docs.rs |
| Audio | client.audio() | docs.rs |
| Files | client.files() | docs.rs |
| Fine-tuning | client.fine_tuning() | docs.rs |
| Models | client.models() | docs.rs |
| Moderations | client.moderations() | docs.rs |
| Batches | client.batches() | docs.rs |
| Uploads | client.uploads() | docs.rs |
| Assistants (beta) | client.beta().assistants() | docs.rs |
| Threads (beta) | client.beta().threads() | docs.rs |
| Runs (beta) | client.beta().runs() | docs.rs |
| Vector Stores (beta) | client.beta().vector_stores() | docs.rs |
OpenAI Docs → openai-oxide
openai-oxide has 1:1 parity with the official Python SDK. Use OpenAI’s documentation as your primary reference — the same concepts, parameter names, and patterns apply.
Endpoint Mapping
| OpenAI Guide | Rust | Node.js | Python |
|---|---|---|---|
| Chat Completions | client.chat().completions().create() | client.createResponse({model, input}) | await client.create(model, input) |
| Responses API | client.responses().create() | client.createText(model, input) | await client.create(model, input) |
| Streaming | client.responses().create_stream() | client.createStream(model, input) | await client.create_stream(model, input) |
| Function Calling | client.responses().create_stream_fc() | client.createResponse({model, input, tools}) | await client.create_with_tools(model, input, tools) |
| Structured Output | ResponseCreateRequest::new(model).text_format(schema) | client.createResponse({model, input, text}) | await client.create_structured(model, input, name, schema) |
| Embeddings | client.embeddings().create() | via createResponse() raw | via create_raw() |
| Image Generation | client.images().generate() | via createResponse() raw | via create_raw() |
| Text-to-Speech | client.audio().speech().create() | via createResponse() raw | via create_raw() |
| Speech-to-Text | client.audio().transcriptions().create() | via createResponse() raw | via create_raw() |
| Fine-tuning | client.fine_tuning().jobs().create() | via createResponse() raw | via create_raw() |
| Realtime API | client.ws_session() | client.wsSession() | — |
| Assistants | client.beta().assistants() | via createResponse() raw | via create_raw() |
Node.js and Python have typed helpers for the top 5 endpoints. All other endpoints work via raw JSON methods.
Parameter Names
Parameter names match the Python SDK exactly:
| OpenAI Python | Rust | Node.js |
|---|---|---|
model="gpt-5.4" | .model("gpt-5.4") | { model: "gpt-5.4" } |
max_output_tokens=100 | .max_output_tokens(100) | { maxOutputTokens: 100 } |
temperature=0.7 | .temperature(0.7) | { temperature: 0.7 } |
stream=True | create_stream() | createStream() |
store=True | .store(true) | { store: true } |
openai-oxide Exclusive Features
These features are not available in the official SDKs:
| Feature | API | Description |
|---|---|---|
| WebSocket Sessions | client.ws_session() | Persistent connection, 37% faster agent loops |
| Hedged Requests | hedged_request() | Race redundant requests, cut P99 latency |
| Stream FC Early Parse | create_stream_fc() | Execute tools 400ms before response finishes |
| SIMD JSON | features = ["simd"] | AVX2/NEON accelerated parsing |
| WASM | default-features = false | Full streaming in Cloudflare Workers |
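As a concept sketch, a hedged request is just a race between redundant attempts where the first response wins and the loser is discarded. The simulation below uses plain threads and sleeps in place of real HTTP calls; the actual hedged_request() API also handles cancellation and delay before firing the second attempt, which this sketch omits:

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

// Fire two identical "requests" and return whichever answers first.
fn hedged<F>(make_request: F) -> String
where
    F: Fn(u64) -> String + Send + Copy + 'static,
{
    let (tx, rx) = mpsc::channel();
    for id in 0..2u64 {
        let tx = tx.clone();
        thread::spawn(move || {
            let _ = tx.send(make_request(id)); // loser's send is ignored
        });
    }
    rx.recv().unwrap() // first response wins
}

fn main() {
    let answer = hedged(|id| {
        // Simulate one slow and one fast upstream.
        let delay = if id == 0 { 300 } else { 30 };
        thread::sleep(Duration::from_millis(delay));
        format!("response from request {id}")
    });
    println!("{answer}");
}
```

The tail-latency win comes from the fact that the race's latency is the minimum of two samples, which pulls in the P99.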
Migration Guides
Switching to openai-oxide from another OpenAI client? These guides cover the key differences and provide side-by-side code comparisons.
- From openai-python — Migrate from the official Python SDK to Rust
- From async-openai — Migrate from the async-openai Rust crate
Migrating from openai-python
openai-oxide uses the same parameter names and resource structure as the official openai-python SDK. If you know the Python API, you already know openai-oxide.
Key Differences
| Python | Rust (openai-oxide) |
|---|---|
client = OpenAI() | let client = OpenAI::from_env()?; |
client.chat.completions.create(...) | client.chat().completions().create(...).await? |
client.responses.create(...) | client.responses().create(...).await? |
stream=True parameter | Separate create_stream() method |
| Dict / Pydantic models | Typed request/response structs |
None for optional fields | Option<T> with builder methods |
| Exception handling | Result<T, OpenAIError> |
Pattern: Python to Rust
# Python
response = client.responses.create(
model="gpt-5.4",
input="Hello",
max_output_tokens=100,
temperature=0.7,
)
#![allow(unused)]
fn main() {
// Rust
let response = client.responses().create(
ResponseCreateRequest::new("gpt-5.4")
.input("Hello")
.max_output_tokens(100)
.temperature(0.7)
).await?;
}
See the OpenAI Docs Mapping for a complete endpoint cross-reference.
Migrating from async-openai
openai-oxide and async-openai are both Rust OpenAI clients, but differ in API design, feature set, and architecture.
Key Differences
| async-openai | openai-oxide |
|---|---|
Client::new() | OpenAI::from_env()? |
CreateChatCompletionRequestArgs::default()...build()? | ChatCompletionRequest::new("model")... |
| Derive-macro builders | Manual builder methods (no proc macros) |
backoff crate for retries | Built-in configurable retry policy |
| No WebSocket support | Native WebSocket sessions |
| No WASM support | First-class WASM target |
| No hedged requests | Built-in hedged request support |
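The built-in retry row above can be illustrated with a minimal sketch of the general shape: bounded attempts with an exponentially growing delay. The policy surface shown here (max attempts, fixed base delay) is an assumption for illustration, not openai-oxide's actual configuration API:

```rust
use std::time::Duration;

// Retry a fallible operation up to `max_attempts` times, sleeping
// with exponential backoff between failures.
fn retry_with_backoff<T, E>(
    max_attempts: u32,
    mut op: impl FnMut() -> Result<T, E>,
) -> Result<T, E> {
    let mut delay = Duration::from_millis(100);
    let mut attempt = 0;
    loop {
        match op() {
            Ok(v) => return Ok(v),
            Err(e) => {
                attempt += 1;
                if attempt >= max_attempts {
                    return Err(e); // out of attempts: surface the last error
                }
                std::thread::sleep(delay);
                delay *= 2; // 100ms, 200ms, 400ms, ...
            }
        }
    }
}

fn main() {
    let mut calls = 0;
    let result: Result<&str, &str> = retry_with_backoff(3, || {
        calls += 1;
        if calls < 3 { Err("rate limited") } else { Ok("ok") }
    });
    println!("{result:?} after {calls} calls");
}
```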
Pattern: async-openai to openai-oxide
#![allow(unused)]
fn main() {
// async-openai
let client = Client::new();
let request = CreateChatCompletionRequestArgs::default()
.model("gpt-5.4")
.messages(vec![ChatCompletionRequestUserMessageArgs::default()
.content("Hello")
.build()?
.into()])
.build()?;
let response = client.chat().create(request).await?;
}
#![allow(unused)]
fn main() {
// openai-oxide
let client = OpenAI::from_env()?;
let response = client.chat().completions().create(
ChatCompletionRequest::new("gpt-5.4")
.messages(vec![ChatMessage::user("Hello")])
).await?;
}
The main wins from switching: simpler builder API (no .build()? calls), WebSocket support, WASM compatibility, hedged requests, and feature-flag granularity.