Available Providers
- Snowflake — Claude, OpenAI (Cortex), Llama, Mistral, DeepSeek, embedding models
- OpenAI — GPT models
- Gemini — Multimodal Gemini models
- Bedrock — AWS Bedrock agents (agent orchestration only). See the Bedrock setup guide for configuration details.
Important: Models are accessed through providers you configure in Organization Settings. For example, Claude models are accessed through the Snowflake provider, not directly from Anthropic.
Model availability varies by provider configuration, account tier, and region. Not all models listed below may be available in your environment. Check your provider’s service creation screen for the models currently accessible to your organization.
Quick Reference: Model Capabilities
This table shows which capabilities each model category supports:

| Model | Provider | Multimodal | Structured Output | Reasoning | Agents | Best Use Case |
|---|---|---|---|---|---|---|
| GPT-5 Series | OpenAI | No | Yes | Yes | Yes | Complex reasoning, demanding applications |
| GPT-4.1 Series | OpenAI | No | Yes | No | Yes | Production deployments, reliable automation |
| GPT-4o Series | OpenAI | Yes | Yes | No | Yes | General-purpose, document analysis |
| o3-mini / o1-mini | OpenAI | No | No | Yes (req.) | No | Mathematical reasoning, logic problems |
| GPT-4 / GPT-3.5 | OpenAI | No | Partial | No | No | Legacy applications |
| Claude 4.5 Sonnet | Snowflake | No | No | No | Yes | Advanced reasoning, detailed analysis |
| Claude 4 Opus | Snowflake | No | No | No | Yes | Highest-capability reasoning |
| Claude 4 Sonnet | Snowflake | No | No | No | Yes | Balanced performance |
| Claude Haiku 4.5 | Snowflake | No | No | No | Yes | Fast, efficient processing |
| Claude 3.7 / 3.5 | Snowflake | No | No | No | Yes | Cost-effective reasoning |
| Cortex GPT-5 | Snowflake | No | Yes | Yes | Yes | OpenAI through Snowflake |
| Cortex GPT-4.1 | Snowflake | No | Yes | No | Yes | Production OpenAI via Snowflake |
| Cortex o4-mini | Snowflake | No | No | Yes (req.) | Yes | Reasoning through Snowflake |
| Gemini 3 Pro | Gemini | Yes | No | No | Yes | Latest multimodal |
| Gemini 2.5 Pro/Flash | Gemini | Yes | No | No | Yes | Production multimodal |
| Gemini 2.0 Flash | Gemini | Yes | No | No | No | Cost-effective multimodal |
| Gemini 1.5 Pro | Gemini | Yes | No | No | No | Established multimodal |
| DeepSeek R1 | Snowflake | No | No | No | No | Open-source reasoning |
| Llama 3.3 70B | Snowflake | No | No | No | No | Open-source, balanced |
| Llama 3.1 Series | Snowflake | No | Partial | No | No | Open-source, structured output |
| Llama 3 Series | Snowflake | No | No | No | No | Open-source, function calling |
| Mistral Large 2 | Snowflake | No | No | No | No | Multilingual, European focus |
| Mistral 7B / Mixtral | Snowflake | No | No | No | No | Efficient small models |
| Snowflake Arctic | Snowflake | No | No | No | No | Data cloud native |
| Arctic Embeddings | Snowflake | N/A | N/A | N/A | N/A | Semantic search |
- Multimodal: Processes text and images together
- Structured Output: Guaranteed JSON/XML format responses
- Reasoning: Advanced reasoning mode (req. = required, always on)
- Agents: Supports Elementum agent workflows
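When wiring model choice into automation, the quick-reference table above can be mirrored as a small lookup. A minimal sketch in Python: the capability flags are transcribed from a few rows of this table, and the `CAPABILITIES` dictionary and `models_supporting` helper are illustrative only, not part of any Elementum or provider API.

```python
# Capability flags transcribed from a few rows of the quick-reference table.
# This dictionary and helper are illustrative, not an Elementum API.
CAPABILITIES = {
    "gpt-5":             {"multimodal": False, "structured": True,  "reasoning": True,  "agents": True},
    "gpt-4o":            {"multimodal": True,  "structured": True,  "reasoning": False, "agents": True},
    "o3-mini":           {"multimodal": False, "structured": False, "reasoning": True,  "agents": False},
    "claude-sonnet-4-5": {"multimodal": False, "structured": False, "reasoning": False, "agents": True},
    "gemini-2.5-pro":    {"multimodal": True,  "structured": False, "reasoning": False, "agents": True},
}

def models_supporting(capability: str) -> list[str]:
    """Return the model IDs whose table row marks the given capability as Yes."""
    return sorted(m for m, caps in CAPABILITIES.items() if caps.get(capability))

print(models_supporting("multimodal"))  # ['gemini-2.5-pro', 'gpt-4o']
```

A helper like this keeps capability checks in one place when a workflow needs to reject, say, an image-analysis task routed to a text-only model.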
Models by Provider
Snowflake Cortex
Snowflake Cortex provides access to multiple AI model families through your Snowflake data cloud.

Data Residency: All Snowflake Cortex models run within your Snowflake environment, keeping data in your cloud.
Anthropic Claude (via Snowflake)
Claude 4.5 Sonnet (claude-sonnet-4-5)
- Best for: Complex reasoning and analysis tasks
- Capabilities: Strong reasoning, nuanced understanding, extensive context windows
- Use cases: Complex research, detailed analysis, advanced automation, agent workflows
- Temperature range: 0.0 - 1.0 (default: 0.7)
- When to use: Demanding applications requiring deep understanding and analysis
Claude 4 Series
Claude 4 Opus (claude-4-opus)
- Best for: Highest-capability reasoning tasks
- Use cases: Strategic decisions, complex research, mission-critical analysis
- When to use: Tasks where quality matters more than cost
Claude 4 Sonnet (claude-4-sonnet)
- Best for: Balanced performance and cost for demanding tasks
- Use cases: Business automation, production workflows, detailed analysis
- When to use: Production workloads needing strong reasoning
Claude Haiku 4.5
- Best for: Fast, efficient processing
- Use cases: High-volume operations, real-time interactions, simple automation
- When to use: Speed and cost-efficiency are priorities
Claude 3.7 & 3.5 Sonnet
Claude 3.7 Sonnet (claude-3-7-sonnet)
- Best for: Daily tasks requiring strong reasoning at lower cost
- Use cases: Standard business automation, customer support, content generation
- When to use: Cost-effective production deployments
Claude 3.5 Sonnet (claude-3-5-sonnet)
- Best for: Reliable production performance
- Use cases: Established workflows, production automation
- When to use: Stability and consistent performance matter
OpenAI via Cortex
Access OpenAI models through your Snowflake environment:

Cortex GPT-5 Series
openai-gpt-5 - Advanced reasoning
openai-gpt-5-mini - Efficient reasoning
openai-gpt-5-nano - Maximum efficiency
openai-gpt-5-chat - Optimized for conversations
- Capabilities: Structured output, reasoning mode, through Snowflake
- Use cases: Complex analysis, conversations, classification, intelligence features, agents
- When to use: Need OpenAI capabilities with Snowflake data residency
Cortex GPT-4.1 & o4-mini
openai-gpt-4.1
- Production-ready OpenAI through Snowflake
- Structured output, reliable reasoning
Cortex o4-mini
- Reasoning model through Snowflake
- Required temperature: 1.0 (not adjustable)
- System role not supported
Open Source Models (via Snowflake)
DeepSeek R1 (deepseek-r1)
- Best for: Advanced reasoning with open-source flexibility
- Capabilities: Strong reasoning, open-source architecture
- Use cases: Research, academic applications, cost-conscious deployments
- When to use: Open-source requirements or research projects
Meta Llama Models
Llama 3.3 & 3.2 Series
- Llama 3.3 70B (llama3.3-70b): Latest generation, balanced performance
- Llama 3.2 3B (llama3.2-3b): Efficient, compact
- Llama 3.2 1B (llama3.2-1b): Maximum efficiency for simple tasks
Llama 3.1 Series
- Llama 3.1 405B (llama3.1-405b): Largest, most capable
- Llama 3.1 70B (llama3.1-70b): Production-ready, structured output support
- Llama 3.1 8B (llama3.1-8b): Cost-effective, structured output support
Llama 3 & 2 Series
- Llama 3 70B (llama3-70b): Function calling, reliable
- Llama 3 8B (llama3-8b): Efficient operation
- Llama 2 70B Chat (llama2-70b-chat): Conversational focus
Mistral Models
Mistral Large 2 (mistral-large2)
- Advanced capabilities, multilingual support, European markets
Mistral Large (mistral-large)
- Previous generation, reliable performance
Mistral 7B (mistral-7b)
- Compact, efficient, cost-effective
Mixtral 8x7B (mixtral-8x7b)
- Mixture-of-experts architecture, balanced performance
Other Snowflake Models
Snowflake Arctic (snowflake-arctic)
- Data cloud native processing, integrated with Snowflake infrastructure
- When to use: Data-intensive workflows within Snowflake
Gemma 7B (gemma-7b)
- Lightweight Google-developed model
- When to use: Efficient processing on smaller tasks
Jamba Instruct (jamba-instruct)
- Instruction-following optimization
- Note: Not recommended for JSON/YAML parsing
Reka Core & Flash (reka-core, reka-flash)
- Advanced processing or fast operation
Snowflake Embedding Models
- Arctic L V2.0 (snowflake-arctic-embed-l-v2.0) — High-quality embeddings for semantic search. Use for new AI search implementations.
- Arctic M V1.5 (snowflake-arctic-embed-m-v1.5) — Balanced performance and quality. Use for production search systems.
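Embedding models return vectors rather than text; semantic search works by ranking documents by vector similarity to the query. A minimal pure-Python sketch of that ranking step, using toy 3-dimensional vectors in place of real Arctic embeddings (which have far higher dimensionality):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rank(query_vec, doc_vecs):
    """Return (index, score) pairs sorted by similarity to the query, best first."""
    scores = [(i, cosine(query_vec, v)) for i, v in enumerate(doc_vecs)]
    return sorted(scores, key=lambda s: s[1], reverse=True)

# Toy vectors standing in for real embedding output.
docs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.9, 0.1, 0.0]]
print(rank([1.0, 0.0, 0.0], docs))  # doc 0 ranks first, doc 2 second
```

In practice the vectors come from the embedding model, and production systems use a vector index rather than a linear scan, but the similarity math is the same.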
OpenAI Direct
Access OpenAI models directly through the OpenAI API.

GPT-5 Series - Latest Generation
GPT-5 (gpt-5)
- Best for: Complex reasoning and demanding applications
- Capabilities: Structured output, reasoning mode, advanced problem-solving
- Use cases: Complex analysis, strategic planning, research tasks
GPT-5 Mini (gpt-5-mini)
- Best for: Daily reasoning at lower cost
- Capabilities: Structured output, reasoning mode, balanced performance
- Use cases: Standard business logic, moderate analysis, automation
GPT-5 Nano (gpt-5-nano)
- Best for: Simple reasoning requiring efficiency
- Capabilities: Structured output, reasoning mode, cost-effective
- Use cases: Basic classification, simple analysis, high-volume operations
- Best for: Enhanced reasoning with improved accuracy
- Use cases: Business intelligence, detailed analysis, critical decisions
- Best for: Highest-tier reasoning
- Use cases: Complex problem-solving, research, mission-critical applications
GPT-4.1 Series - Production Ready
GPT-4.1 (gpt-4.1)
- Best for: Production applications requiring consistent performance
- Capabilities: Structured output, reliable reasoning
- Use cases: Customer-facing applications, production workflows
GPT-4.1 Mini (gpt-4.1-mini)
- Best for: Cost-effective production deployments
- Use cases: High-volume automation, chatbots, content generation
GPT-4.1 Nano (gpt-4.1-nano)
- Best for: Maximum efficiency for simple tasks
- Use cases: Real-time interactions, simple classification, quick responses
GPT-4o Series - Optimized
GPT-4o (gpt-4o)
- Best for: Balanced performance and capability
- Capabilities: Structured output, multimodal support (text + images)
- Use cases: General-purpose applications, document analysis, versatile automation
- Supports: Prompts, conversations, email analysis, classification, intelligence, agents
GPT-4o Mini (gpt-4o-mini)
- Best for: Cost-effective general-purpose tasks
- Capabilities: Structured output, multimodal support, efficient operation
- Use cases: Standard automation, customer support, content processing
- Supports: Prompts, conversations, email analysis, translation, classification, intelligence, agents
Reasoning Models - o3-mini & o1-mini
o3-mini (o3-mini)
- Best for: Latest reasoning-focused tasks requiring deep analysis
- Capabilities: Advanced reasoning mode (required temperature: 1.0)
- Use cases: Mathematical problems, logical analysis, complex problem-solving
- Note: System role not supported; fixed temperature requirement
o1-mini (o1-mini)
- Best for: Previous-generation reasoning tasks
- Capabilities: Reasoning mode (required temperature: 1.0)
- Use cases: Logic puzzles, analytical tasks, structured problem-solving
- Note: System role not supported; fixed temperature requirement
Legacy Models
GPT-4 Turbo Preview (gpt-4-turbo-preview)
- Function calling, extended context
- Recommendation: Consider upgrading to GPT-4.1 or GPT-5 series
GPT-4 (gpt-4)
- Structured output (partial), reliable performance
- Supports: Prompts, conversations, summarization, email analysis, classification
GPT-3.5 Turbo (gpt-3.5-turbo)
- Function calling, basic capabilities
- Recommendation: Upgrade to GPT-4.1 Mini for better performance
Google Gemini
Access Google’s multimodal Gemini models directly.

Gemini 3 Series - Latest
Gemini 3 Pro Preview (gemini-3-pro-preview)
- Best for: Latest multimodal capabilities
- Capabilities: Multimodal processing (text, images, audio), advanced reasoning
- Use cases: Document analysis with images, multimedia processing, complex automation
- Temperature range: 0.0 - 1.0 (default: 0.7)
- Supports: Prompts, translation, classification, file analysis, agents
Gemini 2.5 Series - Production Advanced
Gemini 2.5 Pro (gemini-2.5-pro)
- Best for: Complex multimodal tasks requiring high performance
- Capabilities: Multimodal, large context windows, detailed analysis
- Use cases: Document understanding, comprehensive analysis, advanced automation
Gemini 2.5 Flash (gemini-2.5-flash)
- Best for: Fast multimodal processing
- Capabilities: Multimodal, efficient operation, quick responses
- Use cases: Real-time document analysis, responsive automation
Gemini 2.0 Series - Efficient
Gemini 2.0 Flash (gemini-2.0-flash)
- Best for: Cost-effective multimodal processing
- Use cases: Standard document processing, general automation
Gemini 2.0 Flash Lite (gemini-2.0-flash-lite)
- Best for: Lightweight multimodal tasks
- Use cases: Simple document analysis, high-volume operations
Gemini 1.5 Pro - Established
Gemini 1.5 Pro (gemini-1.5-pro)
- Best for: Established multimodal performance
- Capabilities: Multimodal processing, reliable operation
- Use cases: Production workloads, established workflows
- Supports: Prompts, translation, classification, file analysis
Model Selection Guide
By Use Case
Conversational Agents
Recommended Models:
- GPT-4o Mini - Best balance of cost and performance
- Claude 3.7 Sonnet - Strong reasoning at reasonable cost
- Gemini 2.5 Flash - Fast multimodal conversations
- GPT-5 Mini - Advanced reasoning for complex interactions
Why these models:
- Support structured output for reliable responses
- Handle context well for conversation continuity
- Cost-effective for high-volume interactions
- Consistent reliability in production
Document Analysis with Images
Recommended Models:
- Gemini 2.5 Pro - Complex multimodal analysis
- Gemini 3 Pro Preview - Latest document understanding
- GPT-4o - Strong multimodal processing
- Gemini 2.5 Flash - Fast multimodal analysis
Why these models:
- Multimodal support for images and text together
- Large context windows for lengthy documents
- Strong reasoning for extracting insights
- Handle charts, diagrams, and visual elements
Data Classification & Extraction
Recommended Models:
- GPT-4.1 Nano - Fast, cost-effective
- GPT-4o Mini - Structured output for consistency
- Claude Haiku 4.5 - Quick, efficient
- GPT-4.1 Mini - Production-ready reliability
Why these models:
- Structured output ensures consistent categorization
- Cost-effective for high-volume operations
- Fast response times for real-time classification
- Reliable accuracy for business logic
Complex Reasoning & Analysis
Recommended Models:
- o3-mini - Specialized reasoning mode for logic
- Claude 4 Opus - Highest-capability reasoning
- GPT-5 - Advanced problem-solving
- Claude Sonnet 4.5 - Detailed analysis
Why these models:
- Advanced reasoning capabilities
- Handle multi-step logic effectively
- Understand complex relationships
- Provide detailed explanations
Content Generation
Recommended Models:
- Claude Sonnet 4.5 - High-quality writing
- GPT-5 - Creative and coherent content
- Gemini 2.5 Pro - Long-form content
- Claude 4 Sonnet - High-quality balanced output
Why these models:
- Natural, fluent writing style
- Good creativity control via temperature
- Handle various content types well
- Consistent quality and tone
Semantic Search
Required Models:
- Snowflake Arctic L V2.0 - Latest, highest quality
- Snowflake Arctic M V1.5 - Reliable production
Why these models:
- Optimized for semantic similarity
- Consistent vector representations
- Efficient processing at scale
- Note: Must use Snowflake provider for embedding models
By Budget
Lowest Cost Models:
- GPT-4.1 Nano: Minimal cost, simple tasks
- Claude Haiku 4.5: Fast and efficient
- Gemini 2.0 Flash Lite: Lightweight multimodal
- Llama 3.2 1B/3B: Maximum efficiency
- Mistral 7B: Small but capable
By Provider Strengths
Snowflake strengths:
- Data residency in your cloud
- Wide model selection
- Claude and OpenAI access
- Native data processing
Key Model Capabilities
Multimodal Processing
What it is: Process text and images together in the same request.

Supported Models:
- All Gemini models (2.0+)
- GPT-4o, GPT-4o Mini
Common uses:
- Document analysis with charts/diagrams
- Image-based data extraction
- Visual content understanding
- OCR and form processing
Structured Output
What it is: Guaranteed JSON/XML format responses for reliable automation.

Supported Models:
- All GPT-4o, GPT-4.1, GPT-5 series
- Cortex GPT models
- GPT-4 (partial)
- Llama 3.1 8B, 70B
Common uses:
- Data extraction to databases
- Automated classification
- API integrations
- Workflow automation
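Even with guaranteed-format responses, downstream automation should still parse and check the payload before acting on it. A minimal sketch: the JSON string stands in for a hypothetical classification response, and the field names are invented for illustration.

```python
import json

# Stand-in for a structured-output response from a classification service.
# The fields are hypothetical, invented for this example.
raw = '{"category": "invoice", "confidence": 0.92, "vendor": "Acme Corp"}'

REQUIRED = {"category": str, "confidence": float, "vendor": str}

def parse_classification(payload: str) -> dict:
    """Parse a JSON classification result and verify required fields and types."""
    data = json.loads(payload)
    for field, ftype in REQUIRED.items():
        if not isinstance(data.get(field), ftype):
            raise ValueError(f"missing or mistyped field: {field}")
    return data

result = parse_classification(raw)
print(result["category"])  # invoice
```

Validating at the boundary keeps a malformed or truncated response from silently corrupting a database write or workflow step.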
Reasoning Mode
What it is: Extended thinking for complex problems with step-by-step reasoning.

Supported Models:
- o1-mini, o3-mini (dedicated reasoning, always on)
- GPT-5 series (configurable)
- Cortex o4-mini (dedicated reasoning)
Common uses:
- Mathematical problems
- Logic puzzles
- Complex analysis
- Multi-step problem-solving
Agent Support
What it is: Optimized for Elementum agent workflows and multi-step tasks.

Supported Models:
- GPT-4o, GPT-4.1, GPT-5 series
- All Claude models (via Snowflake)
- Cortex OpenAI models
- Gemini 2.5+, Gemini 3 Pro
Common uses:
- Conversational agents
- Multi-turn interactions
- Complex workflows
- Autonomous task execution
Temperature Settings
All models except dedicated reasoning models support customizable temperature. Temperature is configured at the LLM Service level in Organization Settings.

- 0.0 - 0.3: Deterministic, consistent (classification, data extraction)
- 0.4 - 0.7: Balanced creativity (conversation, general tasks)
- 0.8 - 1.0: Creative, diverse (content generation, brainstorming)
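The effect of temperature is visible directly in the sampling math: token scores are divided by the temperature before being converted to probabilities, so low values sharpen the distribution toward the top choice while high values flatten it. A small self-contained illustration (toy logits, not real model output):

```python
import math

def softmax(logits, temperature):
    """Convert raw scores to probabilities; lower temperature sharpens the result."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)                           # subtract max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]                      # toy scores for three candidate tokens
low = softmax(logits, 0.2)    # near-deterministic: top token dominates
high = softmax(logits, 1.0)   # flatter: more diverse sampling
print(round(low[0], 3), round(high[0], 3))
```

This is why 0.0-0.3 suits classification (the top answer wins almost every time) while 0.8-1.0 suits brainstorming (lower-ranked continuations get sampled more often).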
Best Practices
Model Selection
- Identify your use case — Determine if you need conversation, classification, analysis, generation, or search.
- Check required capabilities — Verify if you need multimodal, structured output, or reasoning capabilities.
- Consider your provider — Choose based on data residency, integration, and model access requirements.
- Balance cost and performance — Select the smallest model that meets your quality requirements.
- Test before committing — Compare 2-3 models with your actual use cases.
- Monitor and optimize — Track quality, cost, and speed metrics to refine your selection.
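The selection steps above can be sketched as a simple rule-based chooser: capabilities first, then cost. The `choose_model` function and its routing rules are illustrative only; the model picks follow this page's recommendations, not a prescribed Elementum behavior.

```python
def choose_model(use_case: str, needs_images: bool = False, needs_json: bool = False) -> str:
    """Toy router following the selection steps: required capabilities, then cost."""
    if needs_images:
        # Multimodal work goes to Gemini; heavier analysis gets the Pro tier.
        return "gemini-2.5-pro" if use_case == "analysis" else "gemini-2.5-flash"
    if use_case == "reasoning":
        return "o3-mini"          # dedicated reasoning mode
    if needs_json:
        return "gpt-4o-mini"      # structured output at low cost
    return "gpt-4.1-mini"         # balanced default for production text tasks

print(choose_model("classification", needs_json=True))  # gpt-4o-mini
print(choose_model("analysis", needs_images=True))      # gemini-2.5-pro
```

In a real deployment the choices would come from testing 2-3 candidates against actual workloads, per step 5 above; the point is to make the routing criteria explicit and reviewable.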
Cost Optimization
Choose right-sized models:
- Use Nano/Mini for simple tasks
- Reserve Pro/Opus for complex analysis
- Test if smaller models meet needs
Optimize prompts:
- Write concise, clear instructions
- Remove unnecessary context
- Set appropriate max tokens
- Use structured output formats
Understand provider pricing:
- Snowflake Cortex models cost ~4.5x base rate (includes infrastructure and data residency)
- Direct provider access may be more cost-effective for high-volume, simple tasks
- Snowflake provides value through data residency and unified platform
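The ~4.5x multiplier makes the direct-versus-Cortex trade-off easy to estimate. A sketch of the arithmetic: the per-token base rate below is made up for illustration; only the multiplier comes from the guidance above, and actual prices vary by model and provider.

```python
# Illustrative cost comparison. The base rate is hypothetical; only the
# ~4.5x Snowflake Cortex multiplier comes from the pricing note above.
BASE_RATE_PER_1K_TOKENS = 0.002   # hypothetical direct-provider rate, USD
CORTEX_MULTIPLIER = 4.5

def monthly_cost(tokens_per_month: int, via_cortex: bool) -> float:
    """Estimated monthly spend for a given token volume."""
    rate = BASE_RATE_PER_1K_TOKENS * (CORTEX_MULTIPLIER if via_cortex else 1.0)
    return tokens_per_month / 1000 * rate

direct = monthly_cost(10_000_000, via_cortex=False)   # 10M tokens, direct
cortex = monthly_cost(10_000_000, via_cortex=True)    # same volume via Cortex
print(round(direct, 2), round(cortex, 2))
```

For high-volume simple tasks the gap compounds quickly; whether data residency and the unified platform justify it is a per-workload decision.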
Performance Optimization
For speed:
- Use Mini/Nano/Haiku models
- Lower max tokens
- Choose geographically close providers
For quality:
- Use Pro/Opus/Sonnet tier models
- Provide detailed context
- Test with real examples
For consistency:
- Use low temperature (0.0-0.2)
- Enable structured output
- Choose models with structured output support
Next Steps
AI Providers
Set up your AI provider connections
Create AI Services
Configure specific model instances for your workflows
Build Agents
Create conversational AI assistants using these models
AI Automations
Use AI models in automation workflows