Dynamic model capabilities discovery and caching.
This module provides runtime discovery of model capabilities (context length, max output tokens, feature support) through multiple strategies:
- API Discovery - Query provider model endpoints for metadata (see the Ollama sketch after this list):
  - Anthropic: /v1/models/{id} → max_input_tokens
  - Gemini: /v1beta/models/{id} → inputTokenLimit
  - Grok: /v1/language-models → context window per model
  - Ollama: /api/show → model_info.{arch}.context_length
- Error-Driven Learning - Parse limits from context_length_exceeded errors (see the parsing sketch after this list)
- Cached Knowledge - Remember discovered limits across requests
- Conservative Fallback - Safe defaults for unknown models
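As a rough illustration of the API discovery strategy, the sketch below queries Ollama's local /api/show endpoint and pulls the architecture-prefixed context length out of model_info. This is not the crate's implementation: the helper name, blocking reqwest client, base URL, and minimal error handling are illustrative assumptions.

use serde_json::{json, Value};

// Illustrative only - not gestura_core_llm's code. Assumes a local Ollama
// instance and reqwest with the "blocking" and "json" features enabled.
fn discover_ollama_context_length(model: &str) -> Option<u64> {
    let resp: Value = reqwest::blocking::Client::new()
        .post("http://localhost:11434/api/show")
        .json(&json!({ "model": model }))
        .send()
        .ok()?
        .json()
        .ok()?;
    // model_info holds architecture-prefixed keys, e.g. "llama.context_length".
    resp.get("model_info")?
        .as_object()?
        .iter()
        .find(|(key, _)| key.ends_with(".context_length"))
        .and_then(|(_, value)| value.as_u64())
}

fn main() {
    // "llama3.2" is just an example model name.
    if let Some(n) = discover_ollama_context_length("llama3.2") {
        println!("context length: {n}");
    }
}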
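The error-driven path amounts to extracting a number from the provider's error message. The function below is a minimal sketch assuming the text contains a phrase like "maximum context length is N tokens" (as in the Usage example further down); the crate may recognise other phrasings as well.

use regex::Regex;

// Illustrative only - a minimal take on learning a limit from an error string.
fn parse_context_limit(error_message: &str) -> Option<u32> {
    // Case-insensitive match, tolerating thousands separators in the number.
    let re = Regex::new(r"(?i)maximum context length is ([\d,]+) tokens").ok()?;
    let digits = re.captures(error_message)?.get(1)?.as_str().replace(',', "");
    digits.parse().ok()
}

fn main() {
    let msg = "This model's maximum context length is 16385 tokens.";
    assert_eq!(parse_context_limit(msg), Some(16385));
}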
§Design Goals
- Dynamic over static - Learn limits at runtime, not hardcoded
- Graceful degradation - Work even when APIs are unavailable
- Error recovery - Extract actual limits from error messages
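Taken together, these goals imply a simple precedence: prefer a value discovered from a provider API, then one learned from an error, then a conservative heuristic default. The sketch below only illustrates that ordering; the type and field names are assumptions, not the crate's actual API.

// Illustrative precedence only - not the crate's real types.
struct KnownLimits {
    api_discovered: Option<u32>, // from a provider model-metadata endpoint
    error_learned: Option<u32>,  // parsed from a context_length_exceeded error
}

fn effective_context_length(known: &KnownLimits, heuristic_default: u32) -> u32 {
    // Prefer the discovered value, then the learned one, then the safe default.
    known.api_discovered
        .or(known.error_learned)
        .unwrap_or(heuristic_default)
}

fn main() {
    let known = KnownLimits { api_discovered: None, error_learned: Some(16_385) };
    assert_eq!(effective_context_length(&known, 8_192), 16_385);
}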
§Usage
use gestura_core_llm::model_capabilities::{ModelCapabilities, ModelCapabilitiesCache};
let cache = ModelCapabilitiesCache::new();
// Discover from API (async)
cache.discover_from_api("anthropic", "claude-sonnet-4-20250514", Some(api_key)).await;
// Learn from an error (sync)
cache.learn_from_error("openai", "gpt-3.5-turbo",
    "maximum context length is 16385 tokens");
// Get capabilities (uses discovered/learned value, falls back to heuristic)
let caps = cache.get("openai", "gpt-3.5-turbo");
Structs§
- ModelCapabilities - Model capabilities describing limits and supported features.
- ModelCapabilitiesCache - Thread-safe cache for learned model capabilities.
Enums§
- CapabilitySource - How the capability information was obtained
Functions§
- get_model_capabilities - Convenience function: get capabilities without a cache (uses heuristics only)
- get_model_capabilities_heuristic - Get capabilities using heuristics (static fallback).