Elementum supports a wide range of AI models across multiple providers. Models are accessed through configured AI providers — you’ll select your provider first, then choose from the models available through that provider. To use a model in your workflows, create an AI Service that references the provider and model you want.Documentation Index
Fetch the complete documentation index at: https://docs.elementum.io/llms.txt
Use this file to discover all available pages before exploring further.
Available Providers
- Snowflake — Claude, OpenAI (Cortex), Llama, Mistral, DeepSeek, embedding models
- OpenAI — GPT models
- Anthropic — Direct access to Claude models. Primary provider supported on Studio Agents. See the Anthropic setup guide for configuration details.
- Gemini — Multimodal Gemini models
- Bedrock — Claude models via AWS Bedrock. See the AWS Bedrock Setup guide to configure the provider, or AWS Bedrock Agents Setup to connect a Bedrock Agent built in AWS.
- Custom — Any OpenAI-compatible endpoint, including LLM gateways, proxies, and self-hosted models. See Configure a Custom Provider for setup details.
Quick Reference: Model Capabilities
This table shows which capabilities each model category supports:| Model | Provider | Multimodal | Structured Output | Reasoning | Agents | Best Use Case |
|---|---|---|---|---|---|---|
| GPT-5 Series | OpenAI | No | Yes | Yes | Yes | Complex reasoning, demanding applications |
| GPT-4.1 Series | OpenAI | No | Yes | No | Yes | Production deployments, reliable automation |
| GPT-4o Series | OpenAI | Yes | Yes | No | Yes | General-purpose, document analysis |
| o3-mini / o1-mini | OpenAI | No | No | Yes (req.) | No | Mathematical reasoning, logic problems |
| GPT-4 / GPT-3.5 | OpenAI | No | Partial | No | No | Legacy applications |
| Claude 4.5 Sonnet | Anthropic, Snowflake, Bedrock | No | No | No | Yes | Advanced reasoning, detailed analysis, Studio Agents |
| Claude 4 Opus | Anthropic, Snowflake, Bedrock | No | No | No | Yes | Highest-capability reasoning |
| Claude 4 Sonnet | Anthropic, Snowflake, Bedrock | No | No | No | Yes | Balanced performance |
| Claude Haiku 4.5 | Anthropic, Snowflake, Bedrock | No | No | No | Yes | Fast, efficient processing |
| Claude 3.7 / 3.5 | Anthropic, Snowflake, Bedrock | No | No | No | Yes | Cost-effective reasoning |
| Cortex GPT-5 | Snowflake | No | Yes | Yes | Yes | OpenAI through Snowflake |
| Cortex GPT-4.1 | Snowflake | No | Yes | No | Yes | Production OpenAI via Snowflake |
| Cortex o4-mini | Snowflake | No | No | Yes (req.) | Yes | Reasoning through Snowflake |
| Gemini 3 Pro | Gemini | Yes | No | No | Yes | Latest multimodal |
| Gemini 2.5 Pro/Flash | Gemini | Yes | No | No | Yes | Production multimodal |
| Gemini 2.0 Flash | Gemini | Yes | No | No | No | Cost-effective multimodal |
| Gemini 1.5 Pro | Gemini | Yes | No | No | No | Established multimodal |
| DeepSeek R1 | Snowflake | No | No | No | No | Open-source reasoning |
| Llama 3.3 70B | Snowflake | No | No | No | No | Open-source, balanced |
| Llama 3.1 Series | Snowflake | No | Partial | No | No | Open-source, structured output |
| Llama 3 Series | Snowflake | No | No | No | No | Open-source, function calling |
| Mistral Large 2 | Snowflake | No | No | No | No | Multilingual, European focus |
| Mistral 7B / Mixtral | Snowflake | No | No | No | No | Efficient small models |
| Snowflake Arctic | Snowflake | No | No | No | No | Data cloud native |
| Arctic Embeddings | Snowflake | N/A | N/A | N/A | N/A | Semantic search |
- Multimodal: Processes text and images together
- Structured Output: Guaranteed JSON/XML format responses
- Reasoning: Advanced reasoning mode (req. = required, always on)
- Agents: Supports Elementum agent workflows
Models by Provider
Snowflake Cortex
Snowflake Cortex provides access to multiple AI model families through your Snowflake data cloud.Anthropic Claude (via Snowflake)
Claude 4.5 Sonnet
Claude 4.5 Sonnet
- Best for: Complex reasoning and analysis tasks
- Capabilities: Strong reasoning, nuanced understanding, extensive context windows
- Use cases: Complex research, detailed analysis, advanced automation, agent workflows
- Temperature range: 0.0 - 1.0 (default: 0.7)
- When to use: Demanding applications requiring deep understanding and analysis
Claude 4 Series
Claude 4 Series
- Best for: Highest-capability reasoning tasks
- Use cases: Strategic decisions, complex research, mission-critical analysis
- When to use: Tasks where quality matters more than cost
- Best for: Balanced performance and cost for demanding tasks
- Use cases: Business automation, production workflows, detailed analysis
- When to use: Production workloads needing strong reasoning
- Best for: Fast, efficient processing
- Use cases: High-volume operations, real-time interactions, simple automation
- When to use: Speed and cost-efficiency are priorities
Claude 3.7 & 3.5 Sonnet
Claude 3.7 & 3.5 Sonnet
- Best for: Daily tasks requiring strong reasoning at lower cost
- Use cases: Standard business automation, customer support, content generation
- When to use: Cost-effective production deployments
- Best for: Reliable production performance
- Use cases: Established workflows, production automation
- When to use: Stability and consistent performance matter
OpenAI via Cortex
Access OpenAI models through your Snowflake environment:Cortex GPT-5 Series
Cortex GPT-5 Series
- Capabilities: Structured output, reasoning mode, through Snowflake
- Use cases: Complex analysis, conversations, classification, intelligence features, agents
- When to use: Need OpenAI capabilities with Snowflake data residency
Cortex GPT-4.1 & o4-mini
Cortex GPT-4.1 & o4-mini
- Production-ready OpenAI through Snowflake
- Structured output, reliable reasoning
- Reasoning model through Snowflake
- Required temperature: 1.0 (not adjustable)
- System role not supported
Open Source Models (via Snowflake)
DeepSeek R1
DeepSeek R1
- Best for: Advanced reasoning with open-source flexibility
- Capabilities: Strong reasoning, open-source architecture
- Use cases: Research, academic applications, cost-conscious deployments
- When to use: Open-source requirements or research projects
Meta Llama Models
Meta Llama Models
- Llama 3.3 70B (llama3.3-70b): Latest generation, balanced performance
- Llama 3.2 3B (llama3.2-3b): Efficient, compact
- Llama 3.2 1B (llama3.2-1b): Maximum efficiency for simple tasks
- Llama 3.1 405B (llama3.1-405b): Largest, most capable
- Llama 3.1 70B (llama3.1-70b): Production-ready, structured output support
- Llama 3.1 8B (llama3.1-8b): Cost-effective, structured output support
- Llama 3 70B (llama3-70b): Function calling, reliable
- Llama 3 8B (llama3-8b): Efficient operation
- Llama 2 70B Chat (llama2-70b-chat): Conversational focus
Mistral Models
Mistral Models
- Advanced capabilities, multilingual support, European markets
- Previous generation, reliable performance
- Compact, efficient, cost-effective
- Mixture-of-experts architecture, balanced performance
Other Snowflake Models
Other Snowflake Models
- Data cloud native processing, integrated with Snowflake infrastructure
- When to use: Data-intensive workflows within Snowflake
- Lightweight Google-developed model
- When to use: Efficient processing on smaller tasks
- Instruction-following optimization
- Note: Not recommended for JSON/YAML parsing
- Advanced processing or fast operation
Snowflake Embedding Models
- Arctic L V2.0 (
snowflake-arctic-embed-l-v2.0) — High-quality embeddings for semantic search. Use for new AI search implementations. - Arctic M V1.5 (
snowflake-arctic-embed-m-v1.5) — Balanced performance and quality. Use for production search systems.
Anthropic Direct
Access Claude models directly through the Anthropic API. The Anthropic provider is the primary provider supported on Studio Agents and gives organizations a direct path to Claude without routing requests through Snowflake or AWS Bedrock.Claude 4.5 Sonnet
Claude 4.5 Sonnet
- Best for: Complex reasoning, detailed analysis, and Studio Agents
- Capabilities: Strong reasoning, nuanced understanding, extensive context windows, agent support
- Use cases: Studio Agents, advanced automation, conversational agents, complex research
- When to use: Demanding applications that benefit from direct Anthropic access
Claude 4 Series
Claude 4 Series
- Best for: Highest-capability reasoning tasks
- Use cases: Strategic decisions, complex research, mission-critical analysis
- When to use: Tasks where quality matters more than cost
- Best for: Balanced performance and cost for demanding tasks
- Use cases: Business automation, production workflows, detailed analysis
- When to use: Production workloads needing strong reasoning
- Best for: Fast, efficient processing
- Use cases: High-volume operations, real-time interactions, simple automation
- When to use: Speed and cost-efficiency are priorities
Claude 3.7 & 3.5 Sonnet
Claude 3.7 & 3.5 Sonnet
- Best for: Daily tasks requiring strong reasoning at lower cost
- Use cases: Standard business automation, customer support, content generation
- When to use: Cost-effective production deployments
- Best for: Reliable production performance
- Use cases: Established workflows, production automation
- When to use: Stability and consistent performance matter
OpenAI Direct
Access OpenAI models directly through OpenAI API.GPT-5 Series - Latest Generation
GPT-5 Series - Latest Generation
- Best for: Complex reasoning and demanding applications
- Capabilities: Structured output, reasoning mode, advanced problem-solving
- Use cases: Complex analysis, strategic planning, research tasks
- Best for: Daily reasoning at lower cost
- Capabilities: Structured output, reasoning mode, balanced performance
- Use cases: Standard business logic, moderate analysis, automation
- Best for: Simple reasoning requiring efficiency
- Capabilities: Structured output, reasoning mode, cost-effective
- Use cases: Basic classification, simple analysis, high-volume operations
- Best for: Enhanced reasoning with improved accuracy
- Use cases: Business intelligence, detailed analysis, critical decisions
- Best for: Highest-tier reasoning
- Use cases: Complex problem-solving, research, mission-critical applications
GPT-4.1 Series - Production Ready
GPT-4.1 Series - Production Ready
- Best for: Production applications requiring consistent performance
- Capabilities: Structured output, reliable reasoning
- Use cases: Customer-facing applications, production workflows
- Best for: Cost-effective production deployments
- Use cases: High-volume automation, chatbots, content generation
- Best for: Maximum efficiency for simple tasks
- Use cases: Real-time interactions, simple classification, quick responses
GPT-4o Series - Optimized
GPT-4o Series - Optimized
- Best for: Balanced performance and capability
- Capabilities: Structured output, multimodal support (text + images)
- Use cases: General-purpose applications, document analysis, versatile automation
- Supports: Prompts, conversations, email analysis, classification, intelligence, agents
- Best for: Cost-effective general-purpose tasks
- Capabilities: Structured output, multimodal support, efficient operation
- Use cases: Standard automation, customer support, content processing
- Supports: Prompts, conversations, email analysis, translation, classification, intelligence, agents
Reasoning Models - o3-mini & o1-mini
Reasoning Models - o3-mini & o1-mini
- Best for: Latest reasoning-focused tasks requiring deep analysis
- Capabilities: Advanced reasoning mode (required temperature: 1.0)
- Use cases: Mathematical problems, logical analysis, complex problem-solving
- Note: System role not supported; fixed temperature requirement
- Best for: Previous-generation reasoning tasks
- Capabilities: Reasoning mode (required temperature: 1.0)
- Use cases: Logic puzzles, analytical tasks, structured problem-solving
- Note: System role not supported; fixed temperature requirement
Legacy Models
Legacy Models
- Function calling, extended context
- Recommendation: Consider upgrading to GPT-4.1 or GPT-5 series
- Structured output, reliable performance
- Supports: Prompts, conversations, summarization, email analysis, classification
- Function calling, basic capabilities
- Recommendation: Upgrade to GPT-4.1 Mini for better performance
Google Gemini
Access Google’s multimodal Gemini models directly.Gemini 3 Series - Latest
Gemini 3 Series - Latest
- Best for: Latest multimodal capabilities
- Capabilities: Multimodal processing (text, images, audio), advanced reasoning
- Use cases: Document analysis with images, multimedia processing, complex automation
- Temperature range: 0.0 - 1.0 (default: 0.7)
- Supports: Prompts, translation, classification, file analysis, agents
Gemini 2.5 Series - Production Advanced
Gemini 2.5 Series - Production Advanced
- Best for: Complex multimodal tasks requiring high performance
- Capabilities: Multimodal, large context windows, detailed analysis
- Use cases: Document understanding, comprehensive analysis, advanced automation
- Best for: Fast multimodal processing
- Capabilities: Multimodal, efficient operation, quick responses
- Use cases: Real-time document analysis, responsive automation
Gemini 2.0 Series - Efficient
Gemini 2.0 Series - Efficient
- Best for: Cost-effective multimodal processing
- Use cases: Standard document processing, general automation
- Best for: Lightweight multimodal tasks
- Use cases: Simple document analysis, high-volume operations
Gemini 1.5 Pro - Established
Gemini 1.5 Pro - Established
- Best for: Established multimodal performance
- Capabilities: Multimodal processing, reliable operation
- Use cases: Production workloads, established workflows
- Supports: Prompts, translation, classification, file analysis
AWS Bedrock
Access Claude models through your own AWS Bedrock account. Bedrock-hosted models run within your AWS infrastructure, keeping AI workloads inside your cloud compliance boundaries.Claude Models (via Bedrock)
Claude Models (via Bedrock)
- Best for: Organizations that need Claude capabilities within their own AWS environment
- Use cases: Automations, agents, content generation, classification, reasoning
- When to use: AWS compliance requirements, existing AWS infrastructure, data residency needs
Custom Providers
Connect any OpenAI-compatible endpoint as an AI provider in Elementum. This includes LLM gateways, proxies, internally hosted inference servers, and self-hosted models. Once configured, custom-provider models appear in the model dropdown alongside built-in providers and can be used in the same agents and automations.- Best for: Standardizing on internal infrastructure, routing through an enterprise LLM gateway, or adopting models that aren’t yet supported natively
- Use cases: Self-hosted open-source models, proxied access to multiple upstream providers, region- or compliance-specific deployments
- When to use: You need a model or routing path that isn’t covered by the built-in providers, or your organization requires all model traffic to flow through a managed endpoint
Model Selection Guide
By Use Case
Conversational Agents
Conversational Agents
- GPT-4o Mini - Best balance of cost and performance
- Claude 3.7 Sonnet - Strong reasoning at reasonable cost
- Gemini 2.5 Flash - Fast multimodal conversations
- GPT-5 Mini - Advanced reasoning for complex interactions
- Support structured output for reliable responses
- Handle context well for conversation continuity
- Cost-effective for high-volume interactions
- Consistent reliability in production
Document Analysis with Images
Document Analysis with Images
- Gemini 2.5 Pro - Complex multimodal analysis
- Gemini 3 Pro Preview - Latest document understanding
- GPT-4o - Strong multimodal processing
- Gemini 2.5 Flash - Fast multimodal analysis
- Multimodal support for images and text together
- Large context windows for lengthy documents
- Strong reasoning for extracting insights
- Handle charts, diagrams, and visual elements
Data Classification & Extraction
Data Classification & Extraction
- GPT-4.1 Nano - Fast, cost-effective
- GPT-4o Mini - Structured output for consistency
- Claude Haiku 4.5 - Quick, efficient
- GPT-4.1 Mini - Production-ready reliability
- Structured output ensures consistent categorization
- Cost-effective for high-volume operations
- Fast response times for real-time classification
- Reliable accuracy for business logic
Complex Reasoning & Analysis
Complex Reasoning & Analysis
- o3-mini - Specialized reasoning mode for logic
- Claude 4 Opus - Highest-capability reasoning
- GPT-5 - Advanced problem-solving
- Claude Sonnet 4.5 - Detailed analysis
- Advanced reasoning capabilities
- Handle multi-step logic effectively
- Understand complex relationships
- Provide detailed explanations
Content Generation
Content Generation
- Claude Sonnet 4.5 - High-quality writing
- GPT-5 - Creative and coherent content
- Gemini 2.5 Pro - Long-form content
- Claude 4 Sonnet - High-quality balanced output
- Natural, fluent writing style
- Good creativity control via temperature
- Handle various content types well
- Consistent quality and tone
Semantic Search
Semantic Search
- Snowflake Arctic L V2.0 - Latest, highest quality
- Snowflake Arctic M V1.5 - Reliable production
- Optimized for semantic similarity
- Consistent vector representations
- Efficient processing at scale
- Note: Must use Snowflake provider for embedding models
By Budget
- Cost-Conscious
- Balanced
- Performance
- GPT-4.1 Nano: Minimal cost, simple tasks
- Claude Haiku 4.5: Fast and efficient
- Gemini 2.0 Flash Lite: Lightweight multimodal
- Llama 3.2 1B/3B: Maximum efficiency
- Mistral 7B: Small but capable
By Provider Strengths
- Snowflake
- OpenAI Direct
- Anthropic
- Gemini
- Bedrock
- Custom
- Data residency in your cloud
- Wide model selection
- Claude and OpenAI access
- Native data processing
Key Model Capabilities
Multimodal Processing
What it is: Process text and images together in the same request Supported Models:- All Gemini models (2.0+)
- GPT-4o, GPT-4o Mini
- Document analysis with charts/diagrams
- Image-based data extraction
- Visual content understanding
- OCR and form processing
Structured Output
What it is: Guaranteed JSON/XML format responses for reliable automation Supported Models:- All GPT-4o, GPT-4.1, GPT-5 series
- Cortex GPT models
- GPT-4 (partial)
- Llama 3.1 8B, 70B
- Data extraction to databases
- Automated classification
- API integrations
- Workflow automation
Reasoning Mode
What it is: Extended thinking for complex problems with step-by-step reasoning Supported Models:- o1-mini, o3-mini (dedicated reasoning, always on)
- GPT-5 series (configurable)
- Cortex o4-mini (dedicated reasoning)
- Mathematical problems
- Logic puzzles
- Complex analysis
- Multi-step problem-solving
Agent Support
What it is: Optimized for Elementum agent workflows and multi-step tasks Supported Models:- GPT-4o, GPT-4.1, GPT-5 series
- All Claude models (via Anthropic, Snowflake, or Bedrock)
- Cortex OpenAI models
- Gemini 2.5+, Gemini 3 Pro
- Conversational agents
- Multi-turn interactions
- Complex workflows
- Autonomous task execution
Temperature Settings
All models except dedicated reasoning models support customizable temperature. Temperature is configured at the LLM Service level in Organization Settings.- 0.0 - 0.3: Deterministic, consistent (classification, data extraction)
- 0.4 - 0.7: Balanced creativity (conversation, general tasks)
- 0.8 - 1.0: Creative, diverse (content generation, brainstorming)
Best Practices
Model Selection
- Identify your use case — Determine if you need conversation, classification, analysis, generation, or search.
- Check required capabilities — Verify if you need multimodal, structured output, or reasoning capabilities.
- Consider your provider — Choose based on data residency, integration, and model access requirements.
- Balance cost and performance — Select the smallest model that meets your quality requirements.
- Test before committing — Compare 2-3 models with your actual use cases.
- Monitor and optimize — Track quality, cost, and speed metrics to refine your selection.
Cost Optimization
Choose right-sized models:- Use Nano/Mini for simple tasks
- Reserve Pro/Opus for complex analysis
- Test if smaller models meet needs
- Write concise, clear instructions
- Remove unnecessary context
- Set appropriate max tokens
- Use structured output formats
- Snowflake Cortex models cost ~4.5x base rate (includes infrastructure and data residency)
- Direct provider access may be more cost-effective for high-volume, simple tasks
- Snowflake provides value through data residency and unified platform
Performance Optimization
For speed:- Use Mini/Nano/Haiku models
- Lower max tokens
- Choose geographically close providers
- Use Pro/Opus/Sonnet tier models
- Provide detailed context
- Test with real examples
- Use low temperature (0.0-0.2)
- Enable structured output
- Choose models with structured output support