Module streaming


Streaming LLM provider support for Gestura

This module provides streaming capabilities for LLM responses, enabling real-time token-by-token delivery to the frontend with cancellation support.
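The cancellation model can be sketched as a shared atomic flag checked between token deliveries. Everything below (`CancelFlag`, `drain_tokens`) is illustrative only, not this module's actual `CancellationToken` API, which operates on asynchronous provider streams:

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

/// Illustrative cancellation flag (hypothetical; the module's real
/// `CancellationToken` may carry a `CancellationDisposition` as well).
#[derive(Clone)]
pub struct CancelFlag(Arc<AtomicBool>);

impl CancelFlag {
    pub fn new() -> Self {
        CancelFlag(Arc::new(AtomicBool::new(false)))
    }
    pub fn cancel(&self) {
        self.0.store(true, Ordering::SeqCst);
    }
    pub fn is_cancelled(&self) -> bool {
        self.0.load(Ordering::SeqCst)
    }
}

/// Deliver tokens one at a time until the source is exhausted or the
/// request is cancelled. Returns `true` if the stream ran to completion.
pub fn drain_tokens<I: IntoIterator<Item = String>>(
    tokens: I,
    cancel: &CancelFlag,
    mut sink: impl FnMut(&str),
) -> bool {
    for tok in tokens {
        if cancel.is_cancelled() {
            return false; // stop mid-stream without emitting further tokens
        }
        sink(&tok);
    }
    true
}
```

Because the flag is an `Arc`, a UI thread can clone it and cancel an in-flight request without holding any lock on the stream itself.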

Modules§

pricing
Pricing per 1M tokens (input/output) for various providers. Prices are in USD, current as of January 2026.
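Per-1M-token pricing turns into a request cost by scaling each token count down to millions. The struct and function below are a minimal sketch, assuming a simple (input, output) price pair; the real `pricing` module may expose a different shape:

```rust
/// Hypothetical per-1M-token price pair in USD; illustrative only.
pub struct ModelPricing {
    pub input_per_mtok: f64,
    pub output_per_mtok: f64,
}

/// Cost in USD for one request, given raw token counts.
pub fn request_cost(p: &ModelPricing, input_tokens: u64, output_tokens: u64) -> f64 {
    const MTOK: f64 = 1_000_000.0;
    (input_tokens as f64 / MTOK) * p.input_per_mtok
        + (output_tokens as f64 / MTOK) * p.output_per_mtok
}
```

For example, at $3.00/$15.00 per 1M tokens, a request with 200k input and 100k output tokens costs 0.2 × 3.00 + 0.1 × 15.00 = $2.10.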

Structs§

AnthropicStreamRequest
Request for streaming a response from the Anthropic Claude API.
CancellationToken
Cancellation token for streaming requests.
PublicNarration
Structured public narration content rendered between major loop events.
TaskRuntimeSnapshot
Runtime-authored task scheduler snapshot streamed to UI surfaces.
TaskRuntimeTaskView
Compact task view for runtime-authored task-state updates.

Enums§

CancellationDisposition
Requested interruption disposition for a streaming request.
NarrationStage
Public-facing narration stage for brief between-tool updates.
ShellOutputStream
Which output stream a shell chunk originated from.
ShellProcessState
Lifecycle state of a shell process.
ShellSessionState
Lifecycle state of a long-lived interactive shell session.
StreamChunk
A chunk of a streaming response.
TokenUsageStatus
Token usage status indicator for visual feedback.
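A consumer of `StreamChunk` values typically folds token chunks into a growing message and stops at a terminal marker. The two-variant enum below is a hypothetical reduction; the real `StreamChunk` almost certainly carries more variants (usage, errors, tool events):

```rust
/// Hypothetical chunk kinds; illustrative stand-in for `StreamChunk`.
pub enum Chunk {
    Token(String),
    Done,
}

/// Fold a chunk stream into the final assistant message, stopping at `Done`.
pub fn collect_message(chunks: impl IntoIterator<Item = Chunk>) -> String {
    let mut out = String::new();
    for c in chunks {
        match c {
            Chunk::Token(t) => out.push_str(&t),
            Chunk::Done => break, // terminal marker: ignore anything after it
        }
    }
    out
}
```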

Functions§

split_think_blocks
Split a complete assistant message into (user-facing text, optional thinking) based on <think>...</think> blocks.
start_streaming
Start a streaming LLM request based on config.
start_streaming_with_fallback
Start streaming with fallback to a secondary provider on failure. Implements jittered exponential backoff with rate-limit-aware delay selection before falling back.
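A jittered exponential backoff with a rate-limit override can be sketched as below. The function name and parameters are illustrative, not this module's signature; `jitter` stands in for a random draw in [0, 1] so the sketch stays deterministic, and a provider `Retry-After` hint, when present, overrides the computed delay:

```rust
/// Delay in milliseconds before retry `attempt` (0-based). Sketch only.
pub fn backoff_delay_ms(
    attempt: u32,
    base_ms: u64,
    cap_ms: u64,
    jitter: f64,               // stand-in for a uniform random draw in [0, 1]
    retry_after_ms: Option<u64>, // rate-limit hint from the provider, if any
) -> u64 {
    if let Some(hint) = retry_after_ms {
        // Rate-limit-aware path: honor the provider's hint, capped.
        return hint.min(cap_ms);
    }
    // Exponential growth: base * 2^attempt, capped to avoid overflow and
    // unbounded waits.
    let exp = base_ms.saturating_mul(1u64 << attempt.min(16)).min(cap_ms);
    // "Equal jitter": keep half the delay, randomize the other half so
    // concurrent retries do not stampede in lockstep.
    let half = exp as f64 / 2.0;
    (half + half * jitter.clamp(0.0, 1.0)) as u64
}
```

With `base_ms = 500`, attempt 3 and zero jitter yields 2 000 ms (half of the capped 4 000 ms exponential delay).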
stream_anthropic
Stream a response from the Anthropic Claude API.
stream_gemini
Stream a response from the Google Gemini API (Generative Language API).
stream_ollama
Stream a response from the local Ollama API.
stream_openai
Stream a response from the OpenAI API.
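The `split_think_blocks` behavior can be approximated with plain string scanning. The sketch below is illustrative and its signature is assumed, not taken from the module; it collects every `<think>...</think>` span into the optional thinking part, treats an unterminated block as thinking to the end of the message, and returns the trimmed user-facing remainder:

```rust
/// Split an assistant message into (user-facing text, optional thinking)
/// based on `<think>...</think>` blocks. Illustrative sketch only.
pub fn split_think(message: &str) -> (String, Option<String>) {
    let mut visible = String::new();
    let mut thinking = String::new();
    let mut rest = message;
    while let Some(start) = rest.find("<think>") {
        visible.push_str(&rest[..start]);
        let after = &rest[start + "<think>".len()..];
        match after.find("</think>") {
            Some(end) => {
                thinking.push_str(&after[..end]);
                rest = &after[end + "</think>".len()..];
            }
            None => {
                // Unterminated block: treat the remainder as thinking.
                thinking.push_str(after);
                rest = "";
            }
        }
    }
    visible.push_str(rest);
    let thinking = if thinking.is_empty() { None } else { Some(thinking) };
    (visible.trim().to_string(), thinking)
}
```

A message with no `<think>` block passes through unchanged with `None` as the thinking part.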