WebSocket Sessions

Persistent WebSocket connections eliminate per-request TLS handshakes and HTTP overhead, cutting round-trip times by roughly 37% for agent loops and multi-turn conversations.

See the official Realtime API guide for session configuration.

Rust

//! WebSocket Responses API example — persistent connection for multi-turn.
//!
//! Run with: `OPENAI_API_KEY=sk-... cargo run --example websocket --features websocket`

use futures_util::StreamExt;
use openai_oxide::OpenAI;
use openai_oxide::types::responses::*;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = OpenAI::from_env()?;

    // --- Example 1: Simple send/receive ---
    println!("=== Example 1: Simple request ===\n");

    let mut session = client.ws_session().await?;

    let request = ResponseCreateRequest::new("gpt-4o-mini")
        .input("What is the capital of France?")
        .max_output_tokens(256);

    let response = session.send(request).await?;
    println!("Response: {}", response.output_text());
    if let Some(usage) = &response.usage {
        println!(
            "Tokens: {} in + {} out",
            usage.input_tokens.unwrap_or(0),
            usage.output_tokens.unwrap_or(0),
        );
    }

    // --- Example 2: Multi-turn via same session ---
    println!("\n=== Example 2: Multi-turn ===\n");

    let follow_up = ResponseCreateRequest::new("gpt-4o-mini")
        .input("What about Germany?")
        .previous_response_id(&response.id);

    let response2 = session.send(follow_up).await?;
    println!("Follow-up: {}", response2.output_text());

    // --- Example 3: Streaming events ---
    println!("\n=== Example 3: Streaming ===\n");

    let stream_request = ResponseCreateRequest::new("gpt-4o-mini")
        .input("Count from 1 to 5, one number per line.")
        .max_output_tokens(128);

    let mut stream = session.send_stream(stream_request).await?;
    while let Some(event) = stream.next().await {
        let event = event?;
        use openai_oxide::types::responses::ResponseStreamEvent::*;
        match event {
            ResponseOutputTextDelta(evt) => print!("{}", evt.delta),
            ResponseCompleted(_) => println!("\n\n[completed]"),
            _ => {} // Other events: created, output_item.added, etc.
        }
    }

    // --- Clean up ---
    session.close().await?;
    println!("\nSession closed.");

    Ok(())
}

Run: OPENAI_API_KEY=sk-... cargo run --example websocket --features websocket

When to Use WebSockets

  • Agent loops with 3+ sequential LLM calls
  • Real-time conversational UIs
  • High-throughput batch processing where latency matters
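The first bullet is the case the session API above is built for: each turn reuses the open connection and chains to the previous response. A minimal sketch of such a loop, using the same client API as the example above (the step prompts are illustrative, not part of the crate):

```rust
use openai_oxide::OpenAI;
use openai_oxide::types::responses::*;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = OpenAI::from_env()?;
    let mut session = client.ws_session().await?;

    // Illustrative multi-step "agent" plan; in practice these would come
    // from tool results or planner output.
    let steps = [
        "List three candidate approaches.",
        "Pick the best one and justify it.",
        "Write a one-paragraph summary.",
    ];

    // Chain each turn to the previous response so the model keeps context,
    // all over the single persistent connection.
    let mut prev_id: Option<String> = None;
    for step in steps {
        let mut request = ResponseCreateRequest::new("gpt-4o-mini").input(step);
        if let Some(id) = &prev_id {
            request = request.previous_response_id(id);
        }
        let response = session.send(request).await?;
        println!("{}\n---", response.output_text());
        prev_id = Some(response.id.clone());
    }

    session.close().await?;
    Ok(())
}
```

With three or more sequential calls like this, the per-request handshake savings compound, which is where the latency advantage over plain HTTP shows up.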

Known Issues

Decimal temperature causes silent close (code=1000)

Status: OpenAI bug as of March 2026

Sending temperature as a decimal (e.g. 0.7, 1.2) over WebSocket causes the server to immediately close the connection with code=1000 and an empty reason — no error event is returned. Integer values (0, 1, 2) work fine. The same decimal values work normally over HTTP.

Workaround: Omit temperature from WebSocket requests (the API uses model default ~1.0), or round to integer.
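If you do need a non-default temperature, the rounding workaround can be wrapped in a small guard before building the request. `ws_safe_temperature` below is a hypothetical helper for illustration, not part of openai-oxide:

```rust
/// Round a desired temperature to the nearest integer in [0, 2], since the
/// WebSocket endpoint currently closes the connection on decimal values
/// (see the known issue above). Hypothetical helper, not part of openai-oxide.
fn ws_safe_temperature(t: f64) -> u32 {
    // Round first, then clamp into the API's valid 0..=2 range.
    t.round().clamp(0.0, 2.0) as u32
}

fn main() {
    assert_eq!(ws_safe_temperature(0.7), 1);
    assert_eq!(ws_safe_temperature(1.2), 1);
    assert_eq!(ws_safe_temperature(2.0), 2);
    println!("ok");
}
```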

Tracking: OpenAI Community #1375536