Streaming LLM provider support for Gestura
This module provides streaming capabilities for LLM responses, enabling real-time token-by-token delivery to the frontend with cancellation support.
Modules§
- pricing - Pricing per 1M tokens (input/output) for various providers. Prices are in USD and updated as of January 2026.
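Per-1M-token pricing reduces to a simple linear cost formula. A minimal sketch, assuming a hypothetical `ModelPricing` struct (the real `pricing` module's layout is not shown here):

```rust
/// Hypothetical per-1M-token price pair; illustrative only, the real
/// `pricing` module may use a different representation.
pub struct ModelPricing {
    pub input_per_mtok: f64,  // USD per 1M input tokens
    pub output_per_mtok: f64, // USD per 1M output tokens
}

/// Estimate request cost in USD from raw token counts.
pub fn estimate_cost(p: &ModelPricing, input_tokens: u64, output_tokens: u64) -> f64 {
    (input_tokens as f64 / 1_000_000.0) * p.input_per_mtok
        + (output_tokens as f64 / 1_000_000.0) * p.output_per_mtok
}
```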
Structs§
- AnthropicStreamRequest - Stream a response from Anthropic Claude API
- CancellationToken - Cancellation token for streaming requests
- PublicNarration - Structured public narration content rendered between major loop events.
- TaskRuntimeSnapshot - Runtime-authored task scheduler snapshot streamed to UI surfaces.
- TaskRuntimeTaskView - Compact task view for runtime-authored task-state updates.
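A cancellation token for streaming is typically a cloneable flag that the streaming loop polls between chunks. A minimal sketch under that assumption (the real `CancellationToken` API may differ):

```rust
use std::sync::{
    atomic::{AtomicBool, Ordering},
    Arc,
};

/// Minimal cancellation-token sketch: clones share one atomic flag, so
/// any holder can cancel and the streaming loop can poll cheaply.
#[derive(Clone, Default)]
pub struct SimpleCancellationToken {
    cancelled: Arc<AtomicBool>,
}

impl SimpleCancellationToken {
    /// Request cancellation; visible to every clone of this token.
    pub fn cancel(&self) {
        self.cancelled.store(true, Ordering::SeqCst);
    }

    /// Check between chunks whether the stream should stop.
    pub fn is_cancelled(&self) -> bool {
        self.cancelled.load(Ordering::SeqCst)
    }
}
```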
Enums§
- CancellationDisposition - Requested interruption disposition for a streaming request.
- NarrationStage - Public-facing narration stage for brief between-tool updates.
- ShellOutputStream - Which output stream a shell chunk originated from.
- ShellProcessState - Lifecycle state of a shell process.
- ShellSessionState - Lifecycle state of a long-lived interactive shell session.
- StreamChunk - A chunk of streaming response
- TokenUsageStatus - Token usage status indicator for visual feedback
Functions§
- split_think_blocks - Split a complete assistant message into (user-facing text, optional thinking) based on `<think>...</think>` blocks.
- start_streaming - Start a streaming LLM request based on config.
- start_streaming_with_fallback - Start streaming with fallback to secondary provider on failure. Implements jittered exponential backoff with rate-limit-aware delay selection before falling back.
- stream_anthropic
- stream_gemini - Stream a response from Google Gemini API (Generative Language API).
- stream_ollama - Stream a response from Ollama local API
- stream_openai
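The `split_think_blocks` behavior described above can be sketched as a pure string scan. The real function's signature is not shown; this hypothetical version returns the visible text plus the concatenated contents of all `<think>` blocks:

```rust
/// Sketch of splitting `<think>...</think>` content out of an assistant
/// message. Hypothetical signature; the real `split_think_blocks` may
/// differ in name, return type, and edge-case handling.
pub fn split_think_blocks_sketch(message: &str) -> (String, Option<String>) {
    let mut visible = String::new();
    let mut thinking = String::new();
    let mut rest = message;
    while let Some(open) = rest.find("<think>") {
        visible.push_str(&rest[..open]);
        let after_open = &rest[open + "<think>".len()..];
        match after_open.find("</think>") {
            Some(close) => {
                thinking.push_str(&after_open[..close]);
                rest = &after_open[close + "</think>".len()..];
            }
            None => {
                // Unterminated block: treat the remainder as thinking.
                thinking.push_str(after_open);
                rest = "";
            }
        }
    }
    visible.push_str(rest);
    let visible = visible.trim().to_string();
    let thinking = thinking.trim().to_string();
    let thinking = if thinking.is_empty() { None } else { Some(thinking) };
    (visible, thinking)
}
```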
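The jittered exponential backoff with rate-limit-aware delay selection used by `start_streaming_with_fallback` can be sketched as a pure delay calculator. The constants and the exact policy here are assumptions, not taken from the real implementation; the jitter factor is passed in so a caller can supply a random value in `[0, 1]`:

```rust
/// Backoff-delay sketch: base * 2^attempt with full jitter and a cap,
/// where a server-provided Retry-After hint wins when present.
/// All constants are illustrative assumptions.
pub fn backoff_delay_ms(attempt: u32, retry_after_ms: Option<u64>, jitter: f64) -> u64 {
    const BASE_MS: u64 = 500;
    const CAP_MS: u64 = 30_000;
    if let Some(hint) = retry_after_ms {
        // Rate-limit-aware: honor the server's hint, but cap it.
        return hint.min(CAP_MS);
    }
    // Exponential growth, saturating and capped to avoid overflow.
    let exp = BASE_MS.saturating_mul(1u64 << attempt.min(16)).min(CAP_MS);
    // Full jitter: scale by a caller-supplied factor in [0, 1].
    (exp as f64 * jitter.clamp(0.0, 1.0)) as u64
}
```

Passing jitter as a parameter keeps the function deterministic and testable; production code would draw it from a RNG per retry.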