Caches the system prompt/tools and growing conversation history via
cache_control breakpoints, cutting cost and latency on repeated turns.
Covers both the regular chat path and the tool-calling loop
(chatWithToolMessages), which has its own request-building code and was
initially missed. Cost calculation now accounts for cache write/read
pricing instead of treating all input tokens as full price. Verified
live: cache reads grow turn-over-turn in oAI.log.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- New AppleFoundationProvider using FoundationModels framework (macOS 27+)
- Streaming via streamResponse(to:) → ResponseStream<String> snapshot deltas
- Session built with system prompt + conversation history injected as instructions text
- Full error mapping: context exceeded, guardrail violation, rate limit, availability states
- Settings.Provider.appleOnDevice case wired through ProviderRegistry, Color+Extensions, CreditsView
- inferProvider() detects "apple-" prefix model IDs
- Settings → General: Apple Intelligence section with live availability badge and deep link to System Settings
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Jarvis integration: manage oAI-Web agents and usage from inside the app (/jarvis command, Settings tab 11)
- Model category filter: keyword-based categorisation with popover picker in model selector
- Categories shown in ModelInfoView with coloured chips; dot indicators on model rows
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>