Module model_capabilities

Dynamic model capabilities discovery and caching.

This module provides runtime discovery of model capabilities (context length, max output tokens, feature support) through multiple strategies:

  1. API Discovery - Query provider model endpoints for metadata
    • Anthropic: /v1/models/{id} → max_input_tokens
    • Gemini: /v1beta/models/{id} → inputTokenLimit
    • Grok: /v1/language-models → context window per model
    • Ollama: /api/show → model_info.{arch}.context_length
  2. Error-Driven Learning - Parse limits from context_length_exceeded errors (see the sketch after this list)
  3. Cached Knowledge - Remember discovered limits across requests
  4. Conservative Fallback - Safe defaults for unknown models
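
Error-driven learning (strategy 2) amounts to scanning the provider's error text for the limit it reports. A minimal sketch of that idea (not this module's actual parser; the trigger phrase and function name are only illustrative):

/// Illustrative only: pull a token limit out of a context_length_exceeded
/// message such as "This model's maximum context length is 16385 tokens".
fn parse_context_limit(message: &str) -> Option<u32> {
    // Find the phrase that typically introduces the limit, then read the
    // first run of digits after it.
    let idx = message.find("maximum context length is")?;
    let digits: String = message[idx..]
        .chars()
        .skip_while(|c| !c.is_ascii_digit())
        .take_while(|c| c.is_ascii_digit())
        .collect();
    digits.parse().ok()
}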

§Design Goals

  • Dynamic over static - Learn limits at runtime, not hardcoded
  • Graceful degradation - Work even when APIs are unavailable
  • Error recovery - Extract actual limits from error messages

§Usage

use gestura_core_llm::model_capabilities::{ModelCapabilities, ModelCapabilitiesCache};

let cache = ModelCapabilitiesCache::new();

// Discover from API (async)
cache.discover_from_api("anthropic", "claude-sonnet-4-20250514", Some(api_key)).await;

// Learn from an error (sync)
cache.learn_from_error("openai", "gpt-3.5-turbo",
    "maximum context length is 16385 tokens");

// Get capabilities (uses discovered/learned value, falls back to heuristic)
let caps = cache.get("openai", "gpt-3.5-turbo");
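
// Budget the next request from the discovered window. NOTE: `context_length`
// is an assumed field name used here for illustration; see ModelCapabilities
// for the actual fields.
let reserved_output_tokens: u32 = 1_024;
let prompt_budget = caps.context_length.saturating_sub(reserved_output_tokens);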

Structs§

ModelCapabilities
Model capabilities describing limits and supported features.
ModelCapabilitiesCache
Thread-safe cache for learned model capabilities.
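
As a rough mental model only (the real fields are documented on the ModelCapabilities page), the struct can be pictured as something like:

// Hypothetical shape for illustration; not the crate's actual definition.
struct ModelCapabilities {
    context_length: u32,       // total tokens the model can attend to
    max_output_tokens: u32,    // cap on generated tokens per request
    supports_tools: bool,      // example feature flag
    source: CapabilitySource,  // how these values were obtained
}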

Enums§

CapabilitySource
How the capability information was obtained.
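
The variants below are a hypothetical illustration mirroring the four strategies above; consult the CapabilitySource page for the actual definition.

// Hypothetical variants for illustration only.
enum CapabilitySource {
    ApiDiscovery,     // queried from the provider's model endpoint
    LearnedFromError, // parsed out of a context_length_exceeded error
    Heuristic,        // conservative static fallback
}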

Functions§

get_model_capabilities
Convenience function - get capabilities without a cache (uses heuristics only).
get_model_capabilities_heuristic
Get capabilities using heuristics (static fallback).
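
Without a shared cache, the convenience function can be called directly; the provider/model argument order below mirrors the cached get call and is an assumption:

// Heuristic-only lookup; no API discovery or error learning involved.
let caps = get_model_capabilities("openai", "gpt-3.5-turbo");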