
openai-oxide

A high-performance, feature-complete OpenAI client for Rust, Node.js, and Python.

openai-oxide implements the full Responses API, Chat Completions, and 20+ other endpoints with performance primitives like persistent WebSockets, hedged requests, and early-parsing for function calls.

Why openai-oxide?

  • Zero-Overhead Streaming — Custom zero-copy SSE parser, TTFT ~670ms
  • WebSocket Mode — Persistent wss:// connections, 37% faster agent loops
  • Stream FC Early Parse — Execute tools ~400ms before response finishes
  • SIMD JSON — Opt-in AVX2/NEON for microsecond parsing
  • Hedged Requests — 50-96% P99 tail latency reduction
  • WASM First-Class — Full streaming in Cloudflare Workers and browsers

Packages

Package        Registry    Install
openai-oxide   crates.io   cargo add openai-oxide
openai-oxide   npm         npm install openai-oxide
openai-oxide   PyPI        pip install openai-oxide

OpenAI Compatibility

Parameter names match the official Python SDK exactly. If the OpenAI docs show model="gpt-5.4", use .model("gpt-5.4") in Rust or {model: "gpt-5.4"} in Node.js.

See the OpenAI Docs Mapping for a complete cross-reference.

Installation

openai-oxide is available for three platforms. Pick your language:

  • Rust — Full-featured async client
  • Node.js — Native NAPI-RS bindings
  • Python — PyO3 bindings

All packages share the same Rust core for consistent behavior and performance.

Rust Installation

Add to Cargo.toml

cargo add openai-oxide tokio --features tokio/full

Or manually:

[dependencies]
openai-oxide = "0.9"
tokio = { version = "1", features = ["full"] }

Feature Flags

Every API endpoint is behind a feature flag; all are enabled by default.

# Minimal: only Responses API
openai-oxide = { version = "0.9", default-features = false, features = ["responses"] }

Available features: chat, responses, embeddings, images, audio, files, fine-tuning, models, moderations, batches, uploads, beta

Ecosystem features: websocket, websocket-wasm, simd, macros

Configuration

#![allow(unused)]
fn main() {
use openai_oxide::OpenAI;

// From environment variable (recommended)
let client = OpenAI::from_env()?; // Uses OPENAI_API_KEY

// Explicit key
let client = OpenAI::new("sk-...");

// Custom config
use openai_oxide::config::ClientConfig;
let client = OpenAI::with_config(
    ClientConfig::new("sk-...").base_url("https://...").timeout_secs(30)
);

// Azure
use openai_oxide::azure::AzureConfig;
let client = OpenAI::azure(
    AzureConfig::new().azure_endpoint("https://my.openai.azure.com").api_key("...")
)?;
}

API Reference

Full API docs: docs.rs/openai-oxide

Node.js Installation

Install

npm install openai-oxide
# or
pnpm add openai-oxide
# or
yarn add openai-oxide

Prebuilt native binaries for: macOS (x64, arm64), Linux (x64, arm64, glibc & musl), Windows (x64).

Setup

const { Client } = require("openai-oxide");

// Uses OPENAI_API_KEY from environment
const client = new Client();

// Or pass an explicit key:
// const client = new Client("sk-...");

Available Methods

Method                                     Description
createResponse(params)                     Full Responses API call
createText(model, input)                   Fast path — returns text only
createStoredResponseId(model, input)       Fast path — returns response ID
createTextFollowup(model, input, prevId)   Multi-turn fast path
createStream(model, input)                 Streaming responses
wsSession()                                WebSocket persistent connection

npm Package

npmjs.com/package/openai-oxide

Python Installation

Install

pip install openai-oxide
# or
uv pip install openai-oxide
# or
uv add openai-oxide

No Rust toolchain required — prebuilt wheels available.

Setup

from openai_oxide import Client

# Uses OPENAI_API_KEY from environment
client = Client()

# Explicit key
client = Client("sk-...")

Available Methods

Method                                                       Description
await client.create(model, input)                            Basic request
await client.create_stream(model, input)                     Streaming
await client.create_structured(model, input, name, schema)   Structured output
await client.create_with_tools(model, input, tools)          Function calling

PyPI Package

pypi.org/project/openai-oxide

Quick Start

Set your API key:

export OPENAI_API_KEY="sk-..."

Rust

use openai_oxide::{OpenAI, types::responses::*};

#[tokio::main]
async fn main() -> Result<(), openai_oxide::OpenAIError> {
    let client = OpenAI::from_env()?;

    let response = client.responses().create(
        ResponseCreateRequest::new("gpt-5.4-mini")
            .input("Explain quantum computing in one sentence.")
            .max_output_tokens(100)
    ).await?;

    println!("{}", response.output_text());
    Ok(())
}

Node.js

const { Client } = require("openai-oxide");

async function main() {
    const client = new Client();
    const text = await client.createText("gpt-5.4-mini", "Hello from Node!");
    console.log(text);
}

main();

Python

import asyncio, json
from openai_oxide import Client

async def main():
    client = Client()
    res = json.loads(await client.create("gpt-5.4-mini", "Hello from Python!"))
    print(res["text"])

asyncio.run(main())

Drop-in Migration

Switch from the official OpenAI SDK by changing one import line; the rest of your code stays the same.

Python

- from openai import AsyncOpenAI
+ from openai_oxide.compat import AsyncOpenAI

Full working example (mirrors official openai examples/parsing.py):

#!/usr/bin/env python3
"""
Drop-in replacement for official openai SDK parsing example.
Change: `from openai import AsyncOpenAI` → `from openai_oxide.compat import AsyncOpenAI`
"""

import asyncio
from typing import List

from pydantic import BaseModel

# ── One-line change from official SDK ──
# from openai import AsyncOpenAI
from openai_oxide.compat import AsyncOpenAI


class Step(BaseModel):
    explanation: str
    output: str


class MathResponse(BaseModel):
    steps: List[Step]
    final_answer: str


async def main():
    client = AsyncOpenAI()

    completion = await client.chat.completions.parse(
        model="gpt-5.4-mini",
        messages=[
            {"role": "system", "content": "You are a helpful math tutor."},
            {"role": "user", "content": "solve 8x + 31 = 2"},
        ],
        response_format=MathResponse,
    )

    message = completion.choices[0].message
    if message.parsed:
        for step in message.parsed.steps:
            print(f"  {step.explanation} → {step.output}")
        print("answer:", message.parsed.final_answer)
    else:
        print("refusal:", message.refusal)


asyncio.run(main())

Node.js

- const OpenAI = require('openai');
+ const { OpenAI } = require('openai-oxide/compat');

Full working example (mirrors official openai SDK):

/**
 * Drop-in replacement for official openai SDK demo.
 * Change: `const OpenAI = require('openai')` → `const { OpenAI } = require('openai-oxide/compat')`
 */

// ── One-line change from official SDK ──
// const OpenAI = require('openai');
const { OpenAI } = require('openai-oxide/compat');

async function main() {
    const client = new OpenAI();

    // Non-streaming:
    console.log("----- standard request -----");
    const completion = await client.chat.completions.create({
        model: "gpt-5.4-mini",
        messages: [{ role: "user", content: "Say this is a test" }],
    });
    console.log(completion.choices[0].message.content);

    // Streaming:
    console.log("----- streaming request -----");
    const stream = await client.chat.completions.create({
        model: "gpt-5.4-mini",
        messages: [{ role: "user", content: "How do I list files in a directory using Node.js?" }],
        stream: true,
    });
    for await (const chunk of stream) {
        const content = chunk.choices?.[0]?.delta?.content;
        if (content) process.stdout.write(content);
    }
    console.log();
}

main();

Guides

Step-by-step guides for common tasks with openai-oxide.

Each guide shows code in all three languages (Rust, Node.js, Python) and links to the relevant OpenAI documentation.

Chat Completions

Send messages to GPT models and receive completions. This is the most common API for conversational AI.

See the official Chat Completions guide and API reference.

Rust

//! Basic chat completion example.
//!
//! Run with: `OPENAI_API_KEY=sk-... cargo run --example chat`

use openai_oxide::OpenAI;
use openai_oxide::types::chat::*;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = OpenAI::from_env()?;

    let request = ChatCompletionRequest::new(
        "gpt-4o-mini",
        vec![
            ChatCompletionMessageParam::System {
                content: "You are a helpful assistant.".into(),
                name: None,
            },
            ChatCompletionMessageParam::User {
                content: UserContent::Text("What is the capital of France?".into()),
                name: None,
            },
        ],
    );

    let response = client.chat().completions().create(request).await?;

    for choice in &response.choices {
        println!(
            "[{}] {}",
            choice.finish_reason,
            choice.message.content.as_deref().unwrap_or("")
        );
    }

    if let Some(usage) = &response.usage {
        println!(
            "\nTokens: {} prompt + {} completion = {} total",
            usage.prompt_tokens.unwrap_or(0),
            usage.completion_tokens.unwrap_or(0),
            usage.total_tokens.unwrap_or(0),
        );
    }

    Ok(())
}

Run: OPENAI_API_KEY=sk-... cargo run --example chat

Next Steps

Responses API

The Responses API is OpenAI’s latest endpoint for generating text, replacing Chat Completions for new projects. It supports built-in tools, multi-turn conversations via previous_response_id, and structured output.

See the official Responses API reference.
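Multi-turn chaining boils down to carrying previous_response_id forward in each follow-up request. A minimal sketch of the request shape — plain JSON fields as documented for the public Responses API, independent of any particular client (follow_up is a hypothetical helper, not part of openai-oxide):

```python
def follow_up(prev_response: dict, new_input: str) -> dict:
    """Build a follow-up Responses API request body that continues the
    conversation identified by the previous response's ID."""
    return {
        "model": prev_response["model"],
        "input": new_input,
        # The server reconstructs prior turns from this ID, so the client
        # does not need to resend the whole message history.
        "previous_response_id": prev_response["id"],
    }
```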

Rust

//! Responses API example — web search + function tools.
//!
//! Run with: `OPENAI_API_KEY=sk-... cargo run --example responses_api`

use openai_oxide::OpenAI;
use openai_oxide::types::responses::*;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = OpenAI::from_env()?;

    // Simple text input
    let request = ResponseCreateRequest::new("gpt-4o")
        .input("What are the latest developments in Rust programming?")
        .instructions("Be concise and cite sources when possible.")
        .tools(vec![ResponseTool::WebSearch {
            search_context_size: Some("medium".into()),
            user_location: None,
        }])
        .temperature(0.7)
        .max_output_tokens(1024)
        .store(true);

    let response = client.responses().create(request).await?;

    println!("Response ID: {}", response.id);
    println!("Status: {:?}", response.status);
    println!("\n{}", response.output_text());

    if let Some(usage) = &response.usage {
        println!(
            "\nTokens: {} in + {} out = {} total",
            usage.input_tokens.unwrap_or(0),
            usage.output_tokens.unwrap_or(0),
            usage.total_tokens.unwrap_or(0),
        );
    }

    // Multi-turn with previous_response_id
    let follow_up = ResponseCreateRequest::new("gpt-4o")
        .input("Can you elaborate on the async ecosystem?")
        .previous_response_id(&response.id);

    let response2 = client.responses().create(follow_up).await?;
    println!("\n--- Follow-up ---\n{}", response2.output_text());

    Ok(())
}

Run: OPENAI_API_KEY=sk-... cargo run --example responses_api

Next Steps

Streaming

Stream tokens and events as they are generated, reducing time-to-first-token (TTFT) and enabling real-time UI updates.

See the official Streaming documentation for event types and behavior.
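Under the hood, a streamed response arrives as Server-Sent Events: `data:` lines each carrying one JSON chunk, terminated by a `[DONE]` sentinel. A minimal sketch of that framing — a real parser also handles multi-line data fields, comments, and reconnects:

```python
import json

def iter_sse_data(lines):
    """Yield the JSON payload of each `data:` line, stopping at [DONE]."""
    for line in lines:
        if not line.startswith("data:"):
            continue  # skip comments, event/id fields, and keep-alives
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            return  # end-of-stream sentinel
        yield json.loads(data)
```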

The Rust client provides a high-level wrapper with typed events and automatic accumulation — no manual chunk stitching.

//! Streaming chat completion example.
//!
//! Run with: `OPENAI_API_KEY=sk-... cargo run --example chat_stream`

use futures_util::StreamExt;
use openai_oxide::OpenAI;
use openai_oxide::types::chat::*;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = OpenAI::from_env()?;

    let request = ChatCompletionRequest::new(
        "gpt-4o-mini",
        vec![
            ChatCompletionMessageParam::System {
                content: "You are a helpful assistant.".into(),
                name: None,
            },
            ChatCompletionMessageParam::User {
                content: UserContent::Text("Write a haiku about Rust programming.".into()),
                name: None,
            },
        ],
    );

    let mut stream = client.chat().completions().create_stream(request).await?;

    while let Some(result) = stream.next().await {
        match result {
            Ok(chunk) => {
                for choice in &chunk.choices {
                    if let Some(content) = &choice.delta.content {
                        print!("{content}");
                    }
                    if choice.finish_reason.is_some() {
                        println!();
                    }
                }
            }
            Err(e) => {
                eprintln!("\nStream error: {e}");
                break;
            }
        }
    }

    Ok(())
}

Event Types

Event               When                 Fields
Chunk               Every SSE chunk      Raw ChatCompletionChunk
ContentDelta        New text fragment    delta, snapshot (accumulated)
ContentDone         Text complete        content (full text)
ToolCallDelta       Argument fragment    index, name, arguments_delta, arguments_snapshot
ToolCallDone        Tool call complete   index, call_id, name, arguments
RefusalDelta/Done   Model refuses        delta/refusal
Done                Stream finished      finish_reason

Node.js (drop-in replacement)

Same syntax as official openai package — for await over stream:

/**
 * Drop-in replacement for official openai SDK demo.
 * Change: `const OpenAI = require('openai')` → `const { OpenAI } = require('openai-oxide/compat')`
 */

// ── One-line change from official SDK ──
// const OpenAI = require('openai');
const { OpenAI } = require('openai-oxide/compat');

async function main() {
    const client = new OpenAI();

    // Non-streaming:
    console.log("----- standard request -----");
    const completion = await client.chat.completions.create({
        model: "gpt-5.4-mini",
        messages: [{ role: "user", content: "Say this is a test" }],
    });
    console.log(completion.choices[0].message.content);

    // Streaming:
    console.log("----- streaming request -----");
    const stream = await client.chat.completions.create({
        model: "gpt-5.4-mini",
        messages: [{ role: "user", content: "How do I list files in a directory using Node.js?" }],
        stream: true,
    });
    for await (const chunk of stream) {
        const content = chunk.choices?.[0]?.delta?.content;
        if (content) process.stdout.write(content);
    }
    console.log();
}

main();

Python (drop-in replacement)

Same syntax as official openai package — async for over stream:

#!/usr/bin/env python3
"""
Drop-in replacement for official openai SDK demo.
Change: `from openai import AsyncOpenAI` → `from openai_oxide.compat import AsyncOpenAI`
"""

import asyncio

# ── One-line change from official SDK ──
# from openai import AsyncOpenAI
from openai_oxide.compat import AsyncOpenAI


async def main():
    client = AsyncOpenAI()

    # Non-streaming:
    print("----- standard request -----")
    completion = await client.chat.completions.create(
        model="gpt-5.4-mini",
        messages=[
            {
                "role": "user",
                "content": "Say this is a test",
            },
        ],
    )
    print(completion.choices[0].message.content)

    # Streaming:
    print("----- streaming request -----")
    stream = await client.chat.completions.create(
        model="gpt-5.4-mini",
        messages=[
            {
                "role": "user",
                "content": "How do I output all files in a directory using Python?",
            },
        ],
        stream=True,
    )
    async for event in stream:
        if event.get("delta"):
            print(event["delta"], end="")
    print()


asyncio.run(main())

Responses API Streaming

Typed events for the Responses API:

#![allow(unused)]
fn main() {
use futures_util::StreamExt;
use openai_oxide::types::responses::{ResponseCreateRequest, ResponseStreamEvent};

let mut stream = client.responses()
    .create_stream(ResponseCreateRequest::new("gpt-5.4-mini").input("Hi"))
    .await?;

while let Some(Ok(event)) = stream.next().await {
    match event {
        ResponseStreamEvent::OutputTextDelta { delta, .. } => print!("{delta}"),
        ResponseStreamEvent::ResponseCompleted { response } => {
            println!("\nDone: {}", response.output_text());
        }
        _ => {}
    }
}
}

Next Steps

Function Calling

Let the model invoke your functions by defining tools. openai-oxide supports early-parsing of function call arguments during streaming, allowing you to execute tools ~400ms before the response finishes.

See the official Function Calling guide for tool schema definitions.
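The early-parse idea: tool-call arguments stream in as JSON fragments, and the client can speculatively close the buffer to check whether every required field has already arrived, starting the tool before the stream finishes. A toy sketch of the concept — try_early_args is a hypothetical helper for illustration; openai-oxide's actual implementation uses an incremental parser rather than suffix guessing:

```python
import json

def try_early_args(buffer: str, required: list[str]):
    """Best-effort parse of a *partial* tool-arguments buffer.

    Tries to complete the JSON object with a few plausible closing
    suffixes; returns the parsed dict once all required keys are present,
    else None. Caveat: a key whose string value is still streaming may
    parse with a truncated value.
    """
    for suffix in ("", "}", '"}', '"}}'):
        try:
            obj = json.loads(buffer + suffix)
        except json.JSONDecodeError:
            continue
        if isinstance(obj, dict) and all(k in obj for k in required):
            return obj
    return None
```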

Rust

//! Tool calling example — model calls a function, we return the result.
//!
//! Run with: `OPENAI_API_KEY=sk-... cargo run --example tool_calling`

use openai_oxide::OpenAI;
use openai_oxide::types::chat::*;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = OpenAI::from_env()?;

    let tools = vec![Tool {
        type_: "function".into(),
        function: FunctionDef {
            name: "get_weather".into(),
            description: Some("Get current weather for a city".into()),
            parameters: Some(serde_json::json!({
                "type": "object",
                "properties": {
                    "city": {"type": "string", "description": "City name"},
                    "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]}
                },
                "required": ["city"]
            })),
            strict: Some(true),
        },
    }];

    // Step 1: Send message with tools
    let request = ChatCompletionRequest::new(
        "gpt-4o-mini",
        vec![ChatCompletionMessageParam::User {
            content: UserContent::Text("What's the weather in Tokyo?".into()),
            name: None,
        }],
    )
    .tools(tools.clone())
    .tool_choice(ToolChoice::Mode("auto".into()));

    let response = client.chat().completions().create(request).await?;
    let message = &response.choices[0].message;

    if let Some(tool_calls) = &message.tool_calls {
        for tc in tool_calls {
            println!(
                "Tool call: {} ({})",
                tc.function.name, tc.function.arguments
            );

            // Step 2: Simulate function result
            let result = r#"{"temperature": 22, "condition": "Sunny", "unit": "celsius"}"#;

            // Step 3: Send tool result back
            let follow_up = ChatCompletionRequest::new(
                "gpt-4o-mini",
                vec![
                    ChatCompletionMessageParam::User {
                        content: UserContent::Text("What's the weather in Tokyo?".into()),
                        name: None,
                    },
                    ChatCompletionMessageParam::Assistant {
                        content: None,
                        name: None,
                        tool_calls: Some(tool_calls.clone()),
                        refusal: None,
                    },
                    ChatCompletionMessageParam::Tool {
                        content: result.into(),
                        tool_call_id: tc.id.clone(),
                    },
                ],
            )
            .tools(tools.clone());

            let final_response = client.chat().completions().create(follow_up).await?;
            println!(
                "\nAssistant: {}",
                final_response.choices[0]
                    .message
                    .content
                    .as_deref()
                    .unwrap_or("")
            );
        }
    } else {
        println!("Assistant: {}", message.content.as_deref().unwrap_or(""));
    }

    Ok(())
}

Run: OPENAI_API_KEY=sk-... cargo run --example tool_calling

Next Steps

WebSocket Sessions

Persistent WebSocket connections eliminate per-request TLS handshakes and HTTP overhead, achieving 37% faster round-trip times for agent loops and multi-turn conversations.

See the official Realtime API guide for session configuration.

Rust

//! WebSocket Responses API example — persistent connection for multi-turn.
//!
//! Run with: `OPENAI_API_KEY=sk-... cargo run --example websocket --features websocket`

use futures_util::StreamExt;
use openai_oxide::OpenAI;
use openai_oxide::types::responses::*;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = OpenAI::from_env()?;

    // --- Example 1: Simple send/receive ---
    println!("=== Example 1: Simple request ===\n");

    let mut session = client.ws_session().await?;

    let request = ResponseCreateRequest::new("gpt-4o-mini")
        .input("What is the capital of France?")
        .max_output_tokens(256);

    let response = session.send(request).await?;
    println!("Response: {}", response.output_text());
    if let Some(usage) = &response.usage {
        println!(
            "Tokens: {} in + {} out",
            usage.input_tokens.unwrap_or(0),
            usage.output_tokens.unwrap_or(0),
        );
    }

    // --- Example 2: Multi-turn via same session ---
    println!("\n=== Example 2: Multi-turn ===\n");

    let follow_up = ResponseCreateRequest::new("gpt-4o-mini")
        .input("What about Germany?")
        .previous_response_id(&response.id);

    let response2 = session.send(follow_up).await?;
    println!("Follow-up: {}", response2.output_text());

    // --- Example 3: Streaming events ---
    println!("\n=== Example 3: Streaming ===\n");

    let stream_request = ResponseCreateRequest::new("gpt-4o-mini")
        .input("Count from 1 to 5, one number per line.")
        .max_output_tokens(128);

    let mut stream = session.send_stream(stream_request).await?;
    while let Some(event) = stream.next().await {
        let event = event?;
        use openai_oxide::types::responses::ResponseStreamEvent::*;
        match event {
            ResponseOutputTextDelta(evt) => print!("{}", evt.delta),
            ResponseCompleted(_) => println!("\n\n[completed]"),
            _ => {} // Other events: created, output_item.added, etc.
        }
    }

    // --- Clean up ---
    session.close().await?;
    println!("\nSession closed.");

    Ok(())
}

Run: OPENAI_API_KEY=sk-... cargo run --example websocket --features websocket

When to Use WebSockets

  • Agent loops with 3+ sequential LLM calls
  • Real-time conversational UIs
  • High-throughput batch processing where latency matters

Known Issues

Decimal temperature causes silent close (code=1000)

Status: OpenAI bug as of March 2026

Sending temperature as a decimal (e.g. 0.7, 1.2) over WebSocket causes the server to immediately close the connection with code=1000 and an empty reason — no error event is returned. Integer values (0, 1, 2) work fine. The same decimal values work normally over HTTP.

Workaround: Omit temperature from WebSocket requests (the API uses model default ~1.0), or round to integer.

Tracking: OpenAI Community #1375536
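The workaround can be applied client-side before a request goes over the WebSocket. A sketch under the assumptions above (sanitize_ws_request is a hypothetical helper, not part of the library API): drop a non-integer temperature so the API falls back to its default, and pass integer values through unchanged.

```python
def sanitize_ws_request(params: dict) -> dict:
    """Remove a decimal temperature from a WebSocket request body to
    avoid the silent close(code=1000) described above."""
    out = dict(params)
    t = out.get("temperature")
    if isinstance(t, float) and not t.is_integer():
        out.pop("temperature")  # API then uses the model default (~1.0)
    return out
```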

Structured Output

Force the model to return JSON matching a specific schema. Guarantees valid, parseable output without prompt engineering tricks.

See the official Structured Outputs guide for schema format and limitations.

Derive JsonSchema on your struct and call parse::<T>(). The SDK auto-generates the schema and deserializes the response.

Requires feature structured: cargo add openai-oxide --features structured

#![allow(unused)]
fn main() {
// Live test for all new features — requires OPENAI_API_KEY
//
// cargo run --example live_features_test --features structured

use futures_util::StreamExt;
use openai_oxide::OpenAI;
use openai_oxide::stream_helpers::ChatStreamEvent;
use openai_oxide::types::chat::{ChatCompletionMessageParam, ChatCompletionRequest, UserContent};

#[derive(Debug, serde::Deserialize, schemars::JsonSchema)]
struct MathAnswer {
    steps: Vec<Step>,
    final_answer: String,
}

#[derive(Debug, serde::Deserialize, schemars::JsonSchema)]
struct Step {
    explanation: String,
    output: String,
}

#[derive(Debug, serde::Deserialize, schemars::JsonSchema)]
struct Sentiment {
    sentiment: String,
    confidence: f64,
}

fn msg(text: &str) -> Vec<ChatCompletionMessageParam> {
    vec![ChatCompletionMessageParam::User {
        content: UserContent::Text(text.into()),
        name: None,
    }]
}
}

Rust — Manual Schema

For full control, construct the schema yourself:

//! Structured output with JSON Schema — model returns a validated JSON object.
//!
//! Run with: `OPENAI_API_KEY=sk-... cargo run --example structured_output`

use openai_oxide::OpenAI;
use openai_oxide::types::chat::*;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = OpenAI::from_env()?;

    let request = ChatCompletionRequest::new(
        "gpt-4o-mini",
        vec![
            ChatCompletionMessageParam::System {
                content: "Extract structured data from user messages.".into(),
                name: None,
            },
            ChatCompletionMessageParam::User {
                content: UserContent::Text(
                    "My name is Alice, I'm 30, and I work as a software engineer at Acme Corp."
                        .into(),
                ),
                name: None,
            },
        ],
    )
    .response_format(ResponseFormat::JsonSchema {
        json_schema: JsonSchema {
            name: "person_info".into(),
            description: Some("Extracted person information".into()),
            schema: Some(serde_json::json!({
                "type": "object",
                "properties": {
                    "name": {"type": "string"},
                    "age": {"type": "integer"},
                    "occupation": {"type": "string"},
                    "company": {"type": "string"}
                },
                "required": ["name", "age", "occupation", "company"],
                "additionalProperties": false
            })),
            strict: Some(true),
        },
    });

    let response = client.chat().completions().create(request).await?;
    let content = response.choices[0]
        .message
        .content
        .as_deref()
        .unwrap_or("{}");

    let parsed: serde_json::Value = serde_json::from_str(content)?;
    println!("Extracted data:");
    println!("  Name: {}", parsed["name"]);
    println!("  Age: {}", parsed["age"]);
    println!("  Occupation: {}", parsed["occupation"]);
    println!("  Company: {}", parsed["company"]);

    Ok(())
}

Node.js (drop-in replacement)

Same syntax as official openai package — change one import:

/**
 * Drop-in replacement for official openai SDK structured output example.
 * Change: `const OpenAI = require('openai')` → `const { OpenAI } = require('openai-oxide/compat')`
 *
 * For Zod support: npm install zod zod-to-json-schema
 */

// ── One-line change from official SDK ──
// const OpenAI = require('openai');
const { OpenAI } = require('openai-oxide/compat');

async function main() {
    const client = new OpenAI();

    // JSON Schema (works without Zod)
    const MathResponseSchema = {
        type: "object",
        properties: {
            steps: {
                type: "array",
                items: {
                    type: "object",
                    properties: {
                        explanation: { type: "string" },
                        output: { type: "string" },
                    },
                    required: ["explanation", "output"],
                    additionalProperties: false,
                },
            },
            final_answer: { type: "string" },
        },
        required: ["steps", "final_answer"],
        additionalProperties: false,
    };

    const result = await client.chat.completions.parse({
        model: "gpt-5.4-mini",
        messages: [
            { role: "system", content: "You are a helpful math tutor." },
            { role: "user", content: "solve 8x + 31 = 2" },
        ],
        response_format: {
            type: "json_schema",
            json_schema: {
                name: "MathResponse",
                schema: MathResponseSchema,
                strict: true,
            },
        },
    });

    const message = result.choices[0].message;
    const parsed = JSON.parse(message.content);
    for (const step of parsed.steps) {
        console.log(`  ${step.explanation} → ${step.output}`);
    }
    console.log("answer:", parsed.final_answer);
}

main();

Python (drop-in replacement)

Same syntax as official openai package — change one import:

#!/usr/bin/env python3
"""
Drop-in replacement for official openai SDK parsing example.
Change: `from openai import AsyncOpenAI` → `from openai_oxide.compat import AsyncOpenAI`
"""

import asyncio
from typing import List

from pydantic import BaseModel

# ── One-line change from official SDK ──
# from openai import AsyncOpenAI
from openai_oxide.compat import AsyncOpenAI


class Step(BaseModel):
    explanation: str
    output: str


class MathResponse(BaseModel):
    steps: List[Step]
    final_answer: str


async def main():
    client = AsyncOpenAI()

    completion = await client.chat.completions.parse(
        model="gpt-5.4-mini",
        messages=[
            {"role": "system", "content": "You are a helpful math tutor."},
            {"role": "user", "content": "solve 8x + 31 = 2"},
        ],
        response_format=MathResponse,
    )

    message = completion.choices[0].message
    if message.parsed:
        for step in message.parsed.steps:
            print(f"  {step.explanation} → {step.output}")
        print("answer:", message.parsed.final_answer)
    else:
        print("refusal:", message.refusal)


asyncio.run(main())

Next Steps

Hedged Requests

Hedged requests race multiple identical API calls and return the first successful response. This technique reduces P99 tail latency by 50-96% at the cost of additional API usage.

This is an openai-oxide exclusive feature not available in the official SDKs.
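The racing strategy itself is easy to sketch with asyncio — a conceptual model of the technique, not the library's implementation:

```python
import asyncio

async def hedged(make_request, n: int = 2):
    """Launch n identical requests; return the first that succeeds."""
    tasks = [asyncio.create_task(make_request()) for _ in range(n)]
    try:
        for fut in asyncio.as_completed(tasks):
            try:
                return await fut
            except Exception:
                continue  # this hedge failed; wait for the others
        raise RuntimeError("all hedged requests failed")
    finally:
        for t in tasks:
            t.cancel()  # stop the losers; cancelling a done task is a no-op
```

P99 improves because the slow tail of one request is masked by a faster duplicate, at the cost of up to n times the token usage.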

Rust

#![allow(unused)]
fn main() {
use openai_oxide::{OpenAI, types::responses::*};

let client = OpenAI::from_env()?;

// Race 2 identical requests, return whichever finishes first
let response = client.responses().hedged_request(
    ResponseCreateRequest::new("gpt-5.4-mini")
        .input("Quick question: what is 2+2?"),
    2, // number of concurrent requests
).await?;
}

When to Use

  • Latency-sensitive applications (real-time UIs, voice assistants)
  • Short, deterministic prompts where cost of duplicates is low
  • Production systems with strict P99 SLA requirements

Trade-offs

  • Uses N times the tokens (one request per hedge)
  • Best for short prompts where the latency gain outweighs cost
  • Not recommended for long-running completions with max_output_tokens > 1000

Webhook Verification

Verify OpenAI webhook signatures to ensure payloads are authentic and not replayed.

Requires feature webhooks: cargo add openai-oxide --features webhooks

See the official Webhooks documentation for setup.

Usage

#![allow(unused)]
fn main() {
use openai_oxide::resources::webhooks::Webhooks;

// Initialize with your webhook secret (from OpenAI dashboard)
let wh = Webhooks::new("whsec_YOUR_WEBHOOK_SECRET")?;

// In your HTTP handler — extract headers and body
let signature = headers.get("webhook-signature").unwrap();
let timestamp = headers.get("webhook-timestamp").unwrap();

// Verify and parse in one call
let event: serde_json::Value = wh.unwrap(body_bytes, signature, timestamp)?;

// Or verify only (without parsing)
wh.verify(body_bytes, signature, timestamp)?;
}

Security

  • HMAC-SHA256 signature validation
  • Timestamp replay protection (5-minute tolerance)
  • Supports multiple signature versions in header
  • Accepts base64-encoded secrets (the whsec_ prefix is stripped automatically)
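The checks above follow the common Standard-Webhooks-style HMAC scheme. A sketch of that verification logic in Python; the signed-content layout (`{id}.{timestamp}.{body}`), the `v1,` version prefix, and the space-separated signature header are assumptions based on that convention, not a description of openai-oxide's internals:

```python
import base64, hashlib, hmac, time

def verify_webhook(secret, msg_id, timestamp, body, signature, tolerance=300):
    # Reject stale timestamps to prevent replay (5-minute tolerance)
    if abs(time.time() - int(timestamp)) > tolerance:
        return False
    # Strip the whsec_ prefix; the remainder is the base64-encoded key
    key = base64.b64decode(secret.removeprefix("whsec_"))
    # Signed content is "{id}.{timestamp}.{body}"
    signed = f"{msg_id}.{timestamp}.{body}".encode()
    expected = base64.b64encode(hmac.new(key, signed, hashlib.sha256).digest()).decode()
    # The header may carry several space-separated, version-prefixed signatures
    for part in signature.split():
        version, _, sig = part.partition(",")
        if version == "v1" and hmac.compare_digest(sig, expected):
            return True
    return False
```

`hmac.compare_digest` matters here: a plain `==` comparison leaks timing information about how many leading bytes matched.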

WASM / Cloudflare Workers

openai-oxide compiles to WebAssembly for use in Cloudflare Workers, Deno Deploy, and browser environments. Full streaming support is included.

Setup

```toml
[dependencies]
openai-oxide = { version = "0.9", default-features = false, features = ["responses", "websocket-wasm"] }
```

Disable default features to exclude tokio and native TLS, which are not available in WASM.

Cloudflare Worker Example

```rust
use openai_oxide::{OpenAI, types::responses::*};
use worker::*;

#[event(fetch)]
async fn main(_req: Request, env: Env, _ctx: Context) -> Result<Response> {
    let client = OpenAI::new(env.secret("OPENAI_API_KEY")?.to_string());

    let response = client.responses().create(
        ResponseCreateRequest::new("gpt-5.4-mini")
            .input("Hello from the edge!")
    ).await.map_err(|e| worker::Error::from(e.to_string()))?;

    Response::ok(response.output_text())
}
```

Limitations

  • No filesystem access (audio file uploads require bytes, not paths)
  • WebSocket mode uses websocket-wasm feature instead of websocket
  • SIMD feature is not available in WASM targets

Benchmarks

All benchmarks: median of 3 runs × 5 iterations each. Model: gpt-5.4. Environment: macOS (M-series), release mode, warm HTTP/2 connections.

Rust Ecosystem (openai-oxide vs async-openai vs genai)

| Test | openai-oxide | async-openai | genai |
|---|---|---|---|
| Plain text | 1011ms | 960ms | 835ms |
| Structured output | 1331ms | N/A | 1197ms |
| Function calling | 1192ms | 1748ms | 1030ms |
| Multi-turn (2 reqs) | 2362ms | 3275ms | 1641ms |
| Streaming TTFT | 645ms | 685ms | 670ms |
| Parallel 3x | 1165ms | 1053ms | 866ms |

WebSocket mode (openai-oxide only)

| Test | WebSocket | HTTP | Improvement |
|---|---|---|---|
| Plain text | 710ms | 1011ms | -29% |
| Multi-turn (2 reqs) | 1425ms | 2362ms | -40% |
| Rapid-fire (5 calls) | 3227ms | 5807ms | -44% |


Reproduce: cargo run --example benchmark --features responses --release


Python Ecosystem (openai-oxide-python vs openai)

openai-oxide wins 10/12 tests. Native PyO3 bindings vs the official openai package (v2.29.0).

| Test | openai-oxide | openai | Winner |
|---|---|---|---|
| Plain text | 845ms | 997ms | OXIDE (+15%) |
| Structured output | 1367ms | 1379ms | OXIDE (+1%) |
| Function calling | 1195ms | 1230ms | OXIDE (+3%) |
| Multi-turn (2 reqs) | 2260ms | 3089ms | OXIDE (+27%) |
| Web search | 3157ms | 3499ms | OXIDE (+10%) |
| Nested structured | 5377ms | 5339ms | python (+1%) |
| Agent loop (2-step) | 4570ms | 5144ms | OXIDE (+11%) |
| Rapid-fire (5 calls) | 5667ms | 6136ms | OXIDE (+8%) |
| Prompt-cached | 4425ms | 5564ms | OXIDE (+20%) |
| Streaming TTFT | 626ms | 638ms | OXIDE (+2%) |
| Parallel 3x | 1184ms | 1090ms | python (+9%) |
| Hedged (2x race) | 893ms | 995ms | OXIDE (+10%) |


Reproduce: cd openai-oxide-python && uv run python ../examples/bench_python.py


Node.js Ecosystem (openai-oxide vs openai)

openai-oxide wins 8/8 tests. Native napi-rs bindings vs official openai npm.

| Test | openai-oxide | openai | Winner |
|---|---|---|---|
| Plain text | 1075ms | 1311ms | OXIDE (+18%) |
| Structured output | 1370ms | 1765ms | OXIDE (+22%) |
| Function calling | 1725ms | 1832ms | OXIDE (+6%) |
| Multi-turn (2 reqs) | 2283ms | 2859ms | OXIDE (+20%) |
| Rapid-fire (5 calls) | 6246ms | 6936ms | OXIDE (+10%) |
| Streaming TTFT | 534ms | 580ms | OXIDE (+8%) |
| Parallel 3x | 1937ms | 1991ms | OXIDE (+3%) |
| WebSocket hot pair | 2181ms | N/A | OXIDE |


Reproduce: cd openai-oxide-node && BENCH_ITERATIONS=5 node examples/bench_node.js

How to run

Rust

cargo run --example benchmark --features responses --release

Python

cd openai-oxide-python && uv run python ../examples/bench_python.py

Node.js

cd openai-oxide-node && BENCH_ITERATIONS=5 node examples/bench_node.js

Methodology

  • Warm connections: First request is a warmup (not measured). All subsequent requests reuse HTTP/2 connections with keep-alive.
  • Median of medians: Each test runs 5 iterations per run, 3 runs total. We report the median of the 3 median values.
  • Same prompts: Both clients send identical requests to the same model.
  • Release mode: Rust benchmarks compiled with --release. Python and Node use prebuilt native extensions.
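The reported statistic is easy to reproduce from raw timings. A short sketch (the latency numbers here are made up for illustration, not taken from the tables above):

```python
from statistics import median

def median_of_medians(runs):
    # runs: one list per benchmark run, each holding per-iteration latencies (ms)
    return median(median(run) for run in runs)

# 3 runs × 5 iterations, as in the methodology above
runs = [
    [640, 652, 645, 700, 641],   # run 1 → median 645
    [660, 648, 655, 647, 650],   # run 2 → median 650
    [630, 690, 644, 642, 643],   # run 3 → median 643
]
mm = median_of_medians(runs)     # → 645
```

Taking the median twice discards outlier iterations within a run and outlier runs across the session, which is why a single network hiccup does not skew the published number.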

Updating benchmarks

  1. Edit benchmarks/results.json with new numbers
  2. Run python3 benchmarks/generate.py to regenerate tables
  3. Docs and README include content from the generated files, so no manual table edits are needed
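A generator along these lines can turn the JSON into a markdown table. The shape of `results.json` shown here is hypothetical; the real schema is whatever `benchmarks/generate.py` expects:

```python
import json

# Hypothetical results.json contents (the actual schema may differ)
raw = '''{
  "Plain text":     {"oxide": 1011, "async_openai": 960},
  "Streaming TTFT": {"oxide": 645,  "async_openai": 685}
}'''

def to_table(results):
    # Emit a markdown pipe table, one row per benchmark test
    lines = ["| Test | openai-oxide | async-openai |", "|---|---|---|"]
    for test, r in results.items():
        lines.append(f"| {test} | {r['oxide']}ms | {r['async_openai']}ms |")
    return "\n".join(lines)

table = to_table(json.loads(raw))
```

Keeping the numbers in one JSON file and generating every table from it is what keeps the docs, README, and per-language pages from drifting apart.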

API Reference

Full API documentation for each platform:

Rust API

The Rust crate provides the most complete API surface. All endpoints are accessed through the OpenAI client via resource methods:

| Resource | Access | Docs |
|---|---|---|
| Chat Completions | client.chat().completions() | docs.rs |
| Responses | client.responses() | docs.rs |
| Embeddings | client.embeddings() | docs.rs |
| Images | client.images() | docs.rs |
| Audio | client.audio() | docs.rs |
| Files | client.files() | docs.rs |
| Fine-tuning | client.fine_tuning() | docs.rs |
| Models | client.models() | docs.rs |
| Moderations | client.moderations() | docs.rs |
| Batches | client.batches() | docs.rs |
| Uploads | client.uploads() | docs.rs |
| Assistants (beta) | client.beta().assistants() | docs.rs |
| Threads (beta) | client.beta().threads() | docs.rs |
| Runs (beta) | client.beta().runs() | docs.rs |
| Vector Stores (beta) | client.beta().vector_stores() | docs.rs |

OpenAI Docs → openai-oxide

openai-oxide has 1:1 parity with the official Python SDK. Use OpenAI’s documentation as your primary reference — the same concepts, parameter names, and patterns apply.

Endpoint Mapping

| OpenAI Guide | Rust | Node.js | Python |
|---|---|---|---|
| Chat Completions | client.chat().completions().create() | client.createResponse({model, input}) | await client.create(model, input) |
| Responses API | client.responses().create() | client.createText(model, input) | await client.create(model, input) |
| Streaming | client.responses().create_stream() | client.createStream(model, input) | await client.create_stream(model, input) |
| Function Calling | client.responses().create_stream_fc() | client.createResponse({model, input, tools}) | await client.create_with_tools(model, input, tools) |
| Structured Output | ResponseCreateRequest::new(model).text_format(schema) | client.createResponse({model, input, text}) | await client.create_structured(model, input, name, schema) |
| Embeddings | client.embeddings().create() | via createResponse() raw | via create_raw() |
| Image Generation | client.images().generate() | via createResponse() raw | via create_raw() |
| Text-to-Speech | client.audio().speech().create() | via createResponse() raw | via create_raw() |
| Speech-to-Text | client.audio().transcriptions().create() | via createResponse() raw | via create_raw() |
| Fine-tuning | client.fine_tuning().jobs().create() | via createResponse() raw | via create_raw() |
| Realtime API | client.ws_session() | client.wsSession() | |
| Assistants | client.beta().assistants() | via createResponse() raw | via create_raw() |

Node.js and Python have typed helpers for the top 5 endpoints. All other endpoints work via raw JSON methods.

Parameter Names

Parameter names match the Python SDK exactly:

| OpenAI Python | Rust | Node.js |
|---|---|---|
| model="gpt-5.4" | .model("gpt-5.4") | { model: "gpt-5.4" } |
| max_output_tokens=100 | .max_output_tokens(100) | { maxOutputTokens: 100 } |
| temperature=0.7 | .temperature(0.7) | { temperature: 0.7 } |
| stream=True | create_stream() | createStream() |
| store=True | .store(true) | { store: true } |

openai-oxide Exclusive Features

These features are not available in the official SDKs:

| Feature | API | Description |
|---|---|---|
| WebSocket Sessions | client.ws_session() | Persistent connection, 37% faster agent loops |
| Hedged Requests | hedged_request() | Race redundant requests, cut P99 latency |
| Stream FC Early Parse | create_stream_fc() | Execute tools 400ms before response finishes |
| SIMD JSON | features = ["simd"] | AVX2/NEON accelerated parsing |
| WASM | default-features = false | Full streaming in Cloudflare Workers |

Migration Guides

Switching to openai-oxide from another OpenAI client? These guides cover the key differences and provide side-by-side code comparisons.

Migrating from openai-python

openai-oxide uses the same parameter names and resource structure as the official openai-python SDK. If you know the Python API, you already know openai-oxide.

Key Differences

| Python | Rust (openai-oxide) |
|---|---|
| client = OpenAI() | let client = OpenAI::from_env()?; |
| client.chat.completions.create(...) | client.chat().completions().create(...).await? |
| client.responses.create(...) | client.responses().create(...).await? |
| stream=True parameter | Separate create_stream() method |
| Dict / Pydantic models | Typed request/response structs |
| None for optional fields | Option<T> with builder methods |
| Exception handling | Result<T, OpenAIError> |

Pattern: Python to Rust

```python
# Python
response = client.responses.create(
    model="gpt-5.4",
    input="Hello",
    max_output_tokens=100,
    temperature=0.7,
)
```

```rust
// Rust
let response = client.responses().create(
    ResponseCreateRequest::new("gpt-5.4")
        .input("Hello")
        .max_output_tokens(100)
        .temperature(0.7)
).await?;
```

See the OpenAI Docs Mapping for a complete endpoint cross-reference.

Migrating from async-openai

openai-oxide and async-openai are both Rust OpenAI clients, but differ in API design, feature set, and architecture.

Key Differences

| async-openai | openai-oxide |
|---|---|
| Client::new() | OpenAI::from_env()? |
| CreateChatCompletionRequestArgs::default()...build()? | ChatCompletionRequest::new("model")... |
| Derive-macro builders | Manual builder methods (no proc macros) |
| backoff crate for retries | Built-in configurable retry policy |
| No WebSocket support | Native WebSocket sessions |
| No WASM support | First-class WASM target |
| No hedged requests | Built-in hedged request support |

Pattern: async-openai to openai-oxide

```rust
// async-openai
let client = Client::new();
let request = CreateChatCompletionRequestArgs::default()
    .model("gpt-5.4")
    .messages(vec![ChatCompletionRequestUserMessageArgs::default()
        .content("Hello")
        .build()?
        .into()])
    .build()?;
let response = client.chat().create(request).await?;
```

```rust
// openai-oxide
let client = OpenAI::from_env()?;
let response = client.chat().completions().create(
    ChatCompletionRequest::new("gpt-5.4")
        .messages(vec![ChatMessage::user("Hello")])
).await?;
```

The main wins from switching: simpler builder API (no .build()? calls), WebSocket support, WASM compatibility, hedged requests, and feature-flag granularity.