AI Providers
Connect to 10 different AI providers — cloud-hosted models or fully local.
Aura Work supports 10 AI providers: seven cloud services and three local options (Ollama, LM Studio, and Custom Endpoint). Each provider is reached through an adapter that handles authentication, model discovery, and chat completions. Local providers keep your data on your machine, as long as a Custom Endpoint points at a server you run locally.
What are Providers?
Providers are the bridge between Aura Work and AI models. Each provider implements a standard interface that handles:
- Authentication — API keys, OAuth tokens, or local connections
- Model discovery — automatically detecting available models
- Chat completions — sending prompts and receiving responses
- Usage tracking — counting tokens and estimating costs
☁️ Cloud Providers
Cloud providers host models on their infrastructure. Most require an API key; Aura Cloud requires an Aura Cloud sign-in instead:
| Provider | Best For | Pricing |
|---|---|---|
| OpenAI | General purpose, code generation | Pay per token |
| Anthropic | Complex reasoning, long context | Pay per token |
| Google Gemini | Multimodal, large context windows | Pay per token |
| DeepSeek | Code-focused, cost-effective | Pay per token |
| Minimax | Chinese language tasks | Pay per token |
| Qwen | Chinese language, reasoning | Pay per token |
| Aura Cloud | Managed service with E2EE sync | Subscription |
🏠 Local Providers
Local providers run models on your own hardware. No API keys, no internet required, complete privacy:
| Provider | Setup | Best For |
|---|---|---|
| Ollama | ollama pull llama3 | Easy local model management |
| LM Studio | Download from lmstudio.ai | GUI for model management |
| Custom Endpoint | Any OpenAI-compatible API | Self-hosted models, proxies |
Local providers are ideal for privacy-sensitive work, offline environments, and cost savings (no per-token charges).
🔧 Setting Up Providers
To add a provider:
1. Go to Settings → Providers
2. Click on the provider you want to add
3. Enter your API key (for cloud providers)
4. Click "Validate" to test the connection
5. Select which models you want to use
6. Configure optional settings (base URL, max tokens, etc.)
For local providers, just install the software (Ollama/LM Studio) and Aura Work will auto-discover available models.
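As a rough illustration of what auto-discovery for local providers could involve: Ollama lists installed models at GET /api/tags, and LM Studio's OpenAI-compatible server lists them at GET /v1/models. The helper names below (parseOllamaModels, parseOpenAICompatibleModels) are hypothetical and are not Aura Work's actual code; they only show how the two response shapes might be normalized into a plain list of model IDs.

```typescript
// Hypothetical helpers for illustration; not part of Aura Work's published API.

// Shape of Ollama's GET http://localhost:11434/api/tags response
// (only the field used here).
interface OllamaTagsResponse {
  models: { name: string }[];
}

// Shape of an OpenAI-compatible GET /v1/models response, as served by
// LM Studio at http://127.0.0.1:1234/v1/models.
interface OpenAIModelsResponse {
  data: { id: string }[];
}

function parseOllamaModels(body: OllamaTagsResponse): string[] {
  return body.models.map((m) => m.name);
}

function parseOpenAICompatibleModels(body: OpenAIModelsResponse): string[] {
  return body.data.map((m) => m.id);
}

// Sample payloads standing in for real HTTP responses.
const ollamaModels = parseOllamaModels({
  models: [{ name: "llama3.2:latest" }, { name: "mistral:latest" }],
});
const lmStudioModels = parseOpenAICompatibleModels({
  data: [{ id: "example-gguf-model" }],
});
```

Keeping the parsing separate from the HTTP call makes it easy to check each response shape without a running server.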
Aura Cloud Models
Hosted models via aura.work API. Includes Aura Fast (text+tools), Aura Coder (reasoning), and Aura Premium (vision+reasoning). Requires Aura Cloud sign-in.
Anthropic
Claude Sonnet 4 and Claude 3.5 Haiku. Best-in-class reasoning, code generation, and vision capabilities. API key required.
OpenAI
GPT-4o and GPT-4o mini. Industry-standard language models with broad tool-calling support. Also supports GitHub Copilot Codex accounts.
Google Gemini
Gemini 2.0 Flash. Google's fast, multimodal model with native vision capabilities. Free tier available via API key.
DeepSeek
DeepSeek V3. Cost-effective open-weight model with strong reasoning and code generation. API key required.
Ollama
Fully local model runner. Run Llama 3.2, Mistral, CodeLlama, and hundreds of other models on your own hardware. Zero cloud dependency.
Custom Endpoint
Any OpenAI-compatible API endpoint. Connect to local inference servers, self-hosted proxies, or any provider with an OpenAI-compatible chat completions API.
Minimax
Minimax abab6.5s. Chinese AI provider with competitive language model performance. API key required.
Qwen
Qwen Plus (DashScope). Alibaba's flagship LLM with strong multilingual capabilities including Chinese and English.
LM Studio
Local model server (default address http://127.0.0.1:1234). Run any GGUF model from Hugging Face through an OpenAI-compatible API. Zero configuration needed.
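Several of the providers above (Custom Endpoint, LM Studio) are described as exposing an OpenAI-compatible chat completions API. The sketch below shows what a request to such an endpoint generally looks like: a POST to the chat/completions path under the base URL, with a JSON body containing model and messages. The function name buildChatRequest and the HttpRequest type are assumptions made for this example, not Aura Work identifiers.

```typescript
// Hypothetical helper; the body fields follow the OpenAI chat completions format.
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface HttpRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

function buildChatRequest(
  baseUrl: string,
  model: string,
  messages: ChatMessage[],
  apiKey?: string,
): HttpRequest {
  const headers: Record<string, string> = { "Content-Type": "application/json" };
  // Local servers often need no key, so the header is sent only when one is set.
  if (apiKey) {
    headers["Authorization"] = `Bearer ${apiKey}`;
  }
  return {
    // Strip trailing slashes so the path is joined exactly once.
    url: `${baseUrl.replace(/\/+$/, "")}/chat/completions`,
    method: "POST",
    headers,
    body: JSON.stringify({ model, messages }),
  };
}

// "local-model" is a placeholder model ID.
const exampleRequest = buildChatRequest("http://127.0.0.1:1234/v1/", "local-model", [
  { role: "user", content: "Hello" },
]);
```

Because only the base URL and the optional key differ between servers, one request builder can serve every OpenAI-compatible provider.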
How providers work
Each provider implements a ProviderAdapter interface with listModels(), validateCredentials(), and chat() methods. The system auto-discovers available models on connection and caches them. OpenAI-compatible providers share a single adapter implementation.
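A minimal sketch of what such an interface might look like in TypeScript. Only the three method names come from the description above; the parameter and return types, the CachedModelList wrapper, and the EchoAdapter stand-in are all assumptions made for illustration.

```typescript
// Sketch only: method names come from the text above; every type and
// parameter shape here is an assumption, not Aura Work's actual interface.
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface ChatResult {
  text: string;
  inputTokens: number;
  outputTokens: number;
}

interface ProviderAdapter {
  listModels(): Promise<string[]>;
  validateCredentials(): Promise<boolean>;
  chat(model: string, messages: ChatMessage[]): Promise<ChatResult>;
}

// Wraps an adapter so discovery runs once and later calls reuse the result.
class CachedModelList {
  private adapter: ProviderAdapter;
  private cached: Promise<string[]> | undefined;

  constructor(adapter: ProviderAdapter) {
    this.adapter = adapter;
  }

  models(): Promise<string[]> {
    if (this.cached === undefined) {
      // Drop the cache entry on failure so the next call retries discovery.
      this.cached = this.adapter.listModels().catch((err: unknown) => {
        this.cached = undefined;
        throw err;
      });
    }
    return this.cached;
  }
}

// In-memory stand-in that exercises the interface without a network call.
class EchoAdapter implements ProviderAdapter {
  listCalls = 0;

  async listModels(): Promise<string[]> {
    this.listCalls += 1;
    return ["echo-1"];
  }

  async validateCredentials(): Promise<boolean> {
    return true;
  }

  async chat(model: string, messages: ChatMessage[]): Promise<ChatResult> {
    const last = messages[messages.length - 1];
    const text = `[${model}] ${last ? last.content : ""}`;
    return { text, inputTokens: 0, outputTokens: 0 };
  }
}
```

Any adapter that satisfies ProviderAdapter can be wrapped by the same cache, which is one way a single implementation could be shared across OpenAI-compatible providers.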
🔑 Credential Security
API keys are encrypted using the device-bound vault before storage. The vault uses:
- Windows — DPAPI (Data Protection API)
- macOS — Keychain
- Linux — Secret Service (GNOME Keyring / KWallet)
Credentials are never logged, never exposed to the agent, and never included in audit entries. The vault supports biometric unlock on supported platforms.
📊 Usage Tracking
Every task records detailed usage information:
- Input tokens — tokens sent to the model
- Output tokens — tokens received from the model
- Estimated cost — calculated from the pricing cache
- Model used — which provider and model handled the task
View usage statistics in the Dashboard or export them for billing. The audit log maintains a permanent record of all provider interactions.
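To make the estimated-cost figure concrete, here is a small sketch of how a cost could be derived from token counts and a per-model price entry. The type names, the function estimateCostUsd, and the prices are hypothetical; actual rates come from each provider and from whatever the pricing cache holds.

```typescript
// Hypothetical types and prices for illustration only.
interface ModelPricing {
  inputPerMillion: number; // USD per 1,000,000 input tokens
  outputPerMillion: number; // USD per 1,000,000 output tokens
}

interface TaskUsage {
  provider: string;
  model: string;
  inputTokens: number;
  outputTokens: number;
}

function estimateCostUsd(usage: TaskUsage, pricing: ModelPricing): number {
  const inputCost = (usage.inputTokens / 1_000_000) * pricing.inputPerMillion;
  const outputCost = (usage.outputTokens / 1_000_000) * pricing.outputPerMillion;
  return inputCost + outputCost;
}

const exampleUsage: TaskUsage = {
  provider: "example-provider",
  model: "example-model",
  inputTokens: 12_000,
  outputTokens: 3_000,
};
const examplePricing: ModelPricing = { inputPerMillion: 2, outputPerMillion: 8 };

// 12,000 input tokens at 2 USD per million cost 0.024 USD, and 3,000 output
// tokens at 8 USD per million cost 0.024 USD, so the total is about 0.048 USD.
const exampleCost = estimateCostUsd(exampleUsage, examplePricing);
```

Input and output tokens are priced separately here because pay-per-token providers commonly charge different rates for each.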
🔄 Fallback & Retry
If a provider fails, Aura Work can automatically retry with an alternative:
1. The primary provider fails (rate limit, timeout, or other error)
2. The system checks your configuration for fallback providers
3. If a fallback exists, the request is retried with that provider
4. If no fallback exists, Aura Work notifies you and asks for manual intervention
Configure fallback providers in Settings → Providers → Fallback Chain.
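The fallback sequence described above can be sketched as a loop over an ordered chain. This is a minimal sketch under stated assumptions: runWithFallback, ChainEntry, and ChainResult are names invented for this example, and every error is treated the same way, whereas a real implementation might distinguish rate limits from timeouts.

```typescript
// Hypothetical sketch of a fallback chain; names are illustrative only.
interface ChainEntry<T> {
  provider: string;
  call: () => Promise<T>;
}

interface ChainResult<T> {
  provider: string; // provider that finally succeeded
  value: T;
  failures: string[]; // providers that failed before it, in order
}

async function runWithFallback<T>(chain: ChainEntry<T>[]): Promise<ChainResult<T>> {
  const failures: string[] = [];
  for (const entry of chain) {
    try {
      const value = await entry.call();
      return { provider: entry.provider, value, failures };
    } catch {
      // Rate limit, timeout, or other error: record it and try the next entry.
      failures.push(entry.provider);
    }
  }
  // Nothing left to try: surface the failure so the user can step in.
  throw new Error(`All providers failed: ${failures.join(", ")}`);
}
```

Returning the list of failed providers alongside the result lets the caller report which fallback was actually used.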
💡 Cost Optimization Tips
- Use the cost-first routing policy for routine tasks
- Set up Ollama for development and testing (free)
- Use DeepSeek for code tasks (cheaper than OpenAI/Anthropic)
- Monitor usage in the Dashboard to identify expensive patterns
- Set token limits per task to prevent runaway costs
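As one possible reading of a cost-first policy, the sketch below picks the cheapest model from a list of candidates, optionally restricted to local providers. The Candidate type, the pickCheapest function, and all prices are hypothetical; the real routing policy may weigh other factors.

```typescript
// Hypothetical sketch: the candidate list and prices are made up for illustration.
interface Candidate {
  provider: string;
  model: string;
  outputPerMillion: number; // USD per 1,000,000 output tokens; 0 for local models
  local: boolean;
}

// Picks the cheapest candidate; on a tie the earlier entry wins.
function pickCheapest(candidates: Candidate[], requireLocal = false): Candidate | undefined {
  const pool = requireLocal ? candidates.filter((c) => c.local) : candidates;
  return pool.reduce<Candidate | undefined>(
    (best, c) => (best === undefined || c.outputPerMillion < best.outputPerMillion ? c : best),
    undefined,
  );
}

const exampleCandidates: Candidate[] = [
  { provider: "cloud-a", model: "large", outputPerMillion: 10, local: false },
  { provider: "cloud-b", model: "small", outputPerMillion: 1, local: false },
  { provider: "ollama", model: "llama3.2", outputPerMillion: 0, local: true },
];

const cheapestOverall = pickCheapest(exampleCandidates);
const cheapestCloud = pickCheapest(exampleCandidates.filter((c) => !c.local));
```

With these sample values the local model wins overall, and cloud-b wins when only cloud candidates are considered.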
Local vs. Cloud
| Criteria | Local Providers | Cloud Providers |
|---|---|---|
| Privacy | Complete — data never leaves your machine | Requests are sent to the provider's servers |
| Cost | No per-token charges after setup | Pay per token used |
| Quality | Good, depends on your hardware | Usually best-in-class (large models) |
| Speed | Depends on your GPU/CPU | Usually fast with strong infrastructure |
| Internet required | No | Yes |
| Best for | Sensitive data, offline work, cost savings | Complex tasks, best possible quality |
Choosing the Right Model
🎯 Model Selection Guide
- For complex coding — Claude Sonnet 4, GPT-4o, or DeepSeek R1 (advanced reasoning)
- For fast/simple tasks — Claude 3.5 Haiku, GPT-4o mini, or Gemini 2.0 Flash
- For very long context — Gemini 2.5 Pro (massive context window)
- For maximum privacy — any model via Ollama or LM Studio
- For lowest possible cost — DeepSeek V3 or a local model via Ollama
Supported Providers
Each provider has its own strengths. Representative models include:
Cloud
- OpenAI — GPT-4o, GPT-4.5, o1, o3. Excellent at coding and analysis
- Anthropic — Claude Opus, Sonnet, Haiku. Long context and computer vision
- Google Gemini — Gemini 2.5 Pro, Flash. Multimodal and lightweight models
- DeepSeek — DeepSeek-V3, R1. Strong performance at low cost
- xAI Grok — Grok-3. Cutting-edge models from xAI
Local
- Ollama — Llama, Mistral, Gemma, Qwen. Complete privacy
- LM Studio — a graphical interface for running models locally
- LocalAI — a local OpenAI-compatible API
How to Configure Providers
Use the following commands to set API keys for cloud providers:
aura config set providers.openai.apiKey sk-...
aura config set providers.anthropic.apiKey sk-ant-...
For local providers:
aura config set providers.ollama.baseUrl http://localhost:11434
aura config set providers.ollama.model llama3.2
All keys are stored encrypted using OS-level credential storage (Keychain on macOS, DPAPI on Windows, libsecret on Linux).