Available Providers
- Snowflake — Claude, OpenAI (Cortex), Llama, Mistral, DeepSeek, embedding models
- OpenAI — GPT models
- Gemini — Multimodal Gemini models
- Bedrock — AWS Bedrock agents (agent orchestration only). See the Bedrock setup guide for configuration details.
Important: Models are accessed through providers you configure in Organization Settings. For example, Claude models are accessed through the Snowflake provider, not directly from Anthropic.
Model availability varies by provider configuration, account tier, and region. Not all models listed below may be available in your environment. Check your provider’s service creation screen for the models currently accessible to your organization.
Quick Reference: Model Capabilities
This table shows which capabilities each model category supports:

| Model | Provider | Multimodal | Structured Output | Reasoning | Agents | Best Use Case |
|---|---|---|---|---|---|---|
| GPT-5 Series | OpenAI | No | Yes | Yes | Yes | Complex reasoning, demanding applications |
| GPT-4.1 Series | OpenAI | No | Yes | No | Yes | Production deployments, reliable automation |
| GPT-4o Series | OpenAI | Yes | Yes | No | Yes | General-purpose, document analysis |
| o3-mini / o1-mini | OpenAI | No | No | Yes (req.) | No | Mathematical reasoning, logic problems |
| GPT-4 / GPT-3.5 | OpenAI | No | Partial | No | No | Legacy applications |
| Claude 4.5 Sonnet | Snowflake | No | No | No | Yes | Advanced reasoning, detailed analysis |
| Claude 4 Opus | Snowflake | No | No | No | Yes | Highest-capability reasoning |
| Claude 4 Sonnet | Snowflake | No | No | No | Yes | Balanced performance |
| Claude Haiku 4.5 | Snowflake | No | No | No | Yes | Fast, efficient processing |
| Claude 3.7 / 3.5 | Snowflake | No | No | No | Yes | Cost-effective reasoning |
| Cortex GPT-5 | Snowflake | No | Yes | Yes | Yes | OpenAI through Snowflake |
| Cortex GPT-4.1 | Snowflake | No | Yes | No | Yes | Production OpenAI via Snowflake |
| Cortex o4-mini | Snowflake | No | No | Yes (req.) | Yes | Reasoning through Snowflake |
| Gemini 3 Pro | Gemini | Yes | No | No | Yes | Latest multimodal |
| Gemini 2.5 Pro/Flash | Gemini | Yes | No | No | Yes | Production multimodal |
| Gemini 2.0 Flash | Gemini | Yes | No | No | No | Cost-effective multimodal |
| Gemini 1.5 Pro | Gemini | Yes | No | No | No | Established multimodal |
| DeepSeek R1 | Snowflake | No | No | No | No | Open-source reasoning |
| Llama 3.3 70B | Snowflake | No | No | No | No | Open-source, balanced |
| Llama 3.1 Series | Snowflake | No | Partial | No | No | Open-source, structured output |
| Llama 3 Series | Snowflake | No | No | No | No | Open-source, function calling |
| Mistral Large 2 | Snowflake | No | No | No | No | Multilingual, European focus |
| Mistral 7B / Mixtral | Snowflake | No | No | No | No | Efficient small models |
| Snowflake Arctic | Snowflake | No | No | No | No | Data cloud native |
| Arctic Embeddings | Snowflake | N/A | N/A | N/A | N/A | Semantic search |
- Multimodal: Processes text and images together
- Structured Output: Guaranteed JSON/XML format responses
- Reasoning: Advanced reasoning mode (req. = required, always on)
- Agents: Supports Elementum agent workflows
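When wiring model choice into automation, the quick-reference table above can be mirrored as a small lookup. A minimal sketch in Python: the capability flags are transcribed from a few rows of this table, and the `CAPABILITIES` dictionary and `models_supporting` helper are illustrative only, not part of any Elementum or provider API.

```python
# Capability flags transcribed from a few rows of the quick-reference table.
# This dictionary and helper are illustrative, not an Elementum API.
CAPABILITIES = {
    "gpt-5":             {"multimodal": False, "structured": True,  "reasoning": True,  "agents": True},
    "gpt-4o":            {"multimodal": True,  "structured": True,  "reasoning": False, "agents": True},
    "o3-mini":           {"multimodal": False, "structured": False, "reasoning": True,  "agents": False},
    "claude-sonnet-4-5": {"multimodal": False, "structured": False, "reasoning": False, "agents": True},
    "gemini-2.5-pro":    {"multimodal": True,  "structured": False, "reasoning": False, "agents": True},
}

def models_supporting(capability: str) -> list[str]:
    """Return the model IDs whose table row marks the given capability as Yes."""
    return sorted(m for m, caps in CAPABILITIES.items() if caps.get(capability))

print(models_supporting("multimodal"))  # ['gemini-2.5-pro', 'gpt-4o']
```

A helper like this keeps capability checks in one place when a workflow needs to reject, say, an image-analysis task routed to a text-only model.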
Models by Provider
Snowflake Cortex
Snowflake Cortex provides access to multiple AI model families through your Snowflake data cloud.

Data Residency: All Snowflake Cortex models run within your Snowflake environment, keeping data in your cloud.
Anthropic Claude (via Snowflake)
Claude 4.5 Sonnet (claude-sonnet-4-5)
- Best for: Complex reasoning and analysis tasks
- Capabilities: Strong reasoning, nuanced understanding, extensive context windows
- Use cases: Complex research, detailed analysis, advanced automation, agent workflows
- Temperature range: 0.0 - 1.0 (default: 0.7)
- When to use: Demanding applications requiring deep understanding and analysis
Claude 4 Series
Claude 4 Opus (claude-4-opus)
- Best for: Highest-capability reasoning tasks
- Use cases: Strategic decisions, complex research, mission-critical analysis
- When to use: Tasks where quality matters more than cost
Claude 4 Sonnet (claude-4-sonnet)
- Best for: Balanced performance and cost for demanding tasks
- Use cases: Business automation, production workflows, detailed analysis
- When to use: Production workloads needing strong reasoning
Claude Haiku 4.5
- Best for: Fast, efficient processing
- Use cases: High-volume operations, real-time interactions, simple automation
- When to use: Speed and cost-efficiency are priorities
Claude 3.7 & 3.5 Sonnet
Claude 3.7 Sonnet (claude-3-7-sonnet)
- Best for: Daily tasks requiring strong reasoning at lower cost
- Use cases: Standard business automation, customer support, content generation
- When to use: Cost-effective production deployments
Claude 3.5 Sonnet (claude-3-5-sonnet)
- Best for: Reliable production performance
- Use cases: Established workflows, production automation
- When to use: Stability and consistent performance matter
OpenAI via Cortex
Access OpenAI models through your Snowflake environment:

Cortex GPT-5 Series
openai-gpt-5 - Advanced reasoning
openai-gpt-5-mini - Efficient reasoning
openai-gpt-5-nano - Maximum efficiency
openai-gpt-5-chat - Optimized for conversations
- Capabilities: Structured output, reasoning mode, through Snowflake
- Use cases: Complex analysis, conversations, classification, intelligence features, agents
- When to use: Need OpenAI capabilities with Snowflake data residency
Cortex GPT-4.1 & o4-mini
openai-gpt-4.1
- Production-ready OpenAI through Snowflake
- Structured output, reliable reasoning
Cortex o4-mini
- Reasoning model through Snowflake
- Required temperature: 1.0 (not adjustable)
- System role not supported
Open Source Models (via Snowflake)
DeepSeek R1 (deepseek-r1)
- Best for: Advanced reasoning with open-source flexibility
- Capabilities: Strong reasoning, open-source architecture
- Use cases: Research, academic applications, cost-conscious deployments
- When to use: Open-source requirements or research projects
Meta Llama Models
Llama 3.3 & 3.2 Series
- Llama 3.3 70B (llama3.3-70b): Latest generation, balanced performance
- Llama 3.2 3B (llama3.2-3b): Efficient, compact
- Llama 3.2 1B (llama3.2-1b): Maximum efficiency for simple tasks
Llama 3.1 Series
- Llama 3.1 405B (llama3.1-405b): Largest, most capable
- Llama 3.1 70B (llama3.1-70b): Production-ready, structured output support
- Llama 3.1 8B (llama3.1-8b): Cost-effective, structured output support
Llama 3 & 2 Series
- Llama 3 70B (llama3-70b): Function calling, reliable
- Llama 3 8B (llama3-8b): Efficient operation
- Llama 2 70B Chat (llama2-70b-chat): Conversational focus
Mistral Models
Mistral Large 2 (mistral-large2)
- Advanced capabilities, multilingual support, European markets
Mistral Large (mistral-large)
- Previous generation, reliable performance
Mistral 7B (mistral-7b)
- Compact, efficient, cost-effective
Mixtral 8x7B (mixtral-8x7b)
- Mixture-of-experts architecture, balanced performance
Other Snowflake Models
Snowflake Arctic (snowflake-arctic)
- Data cloud native processing, integrated with Snowflake infrastructure
- When to use: Data-intensive workflows within Snowflake
Gemma 7B (gemma-7b)
- Lightweight Google-developed model
- When to use: Efficient processing on smaller tasks
Jamba Instruct (jamba-instruct)
- Instruction-following optimization
- Note: Not recommended for JSON/YAML parsing
Reka Core & Flash (reka-core, reka-flash)
- Advanced processing or fast operation
Snowflake Embedding Models
- Arctic L V2.0 (snowflake-arctic-embed-l-v2.0) — High-quality embeddings for semantic search. Use for new AI search implementations.
- Arctic M V1.5 (snowflake-arctic-embed-m-v1.5) — Balanced performance and quality. Use for production search systems.
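Embedding models return vectors rather than text; semantic search works by ranking documents by vector similarity to the query. A minimal pure-Python sketch of that ranking step, using toy 3-dimensional vectors in place of real Arctic embeddings (which have far higher dimensionality):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rank(query_vec, doc_vecs):
    """Return (index, score) pairs sorted by similarity to the query, best first."""
    scores = [(i, cosine(query_vec, v)) for i, v in enumerate(doc_vecs)]
    return sorted(scores, key=lambda s: s[1], reverse=True)

# Toy vectors standing in for real embedding output.
docs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.9, 0.1, 0.0]]
print(rank([1.0, 0.0, 0.0], docs))  # doc 0 ranks first, doc 2 second
```

In practice the vectors come from the embedding model, and production systems use a vector index rather than a linear scan, but the similarity math is the same.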
OpenAI Direct
Access OpenAI models directly through the OpenAI API.

GPT-5 Series - Latest Generation
GPT-5 (gpt-5)
- Best for: Complex reasoning and demanding applications
- Capabilities: Structured output, reasoning mode, advanced problem-solving
- Use cases: Complex analysis, strategic planning, research tasks
GPT-5 Mini (gpt-5-mini)
- Best for: Daily reasoning at lower cost
- Capabilities: Structured output, reasoning mode, balanced performance
- Use cases: Standard business logic, moderate analysis, automation
GPT-5 Nano (gpt-5-nano)
- Best for: Simple reasoning requiring efficiency
- Capabilities: Structured output, reasoning mode, cost-effective
- Use cases: Basic classification, simple analysis, high-volume operations
- Best for: Enhanced reasoning with improved accuracy
- Use cases: Business intelligence, detailed analysis, critical decisions
- Best for: Highest-tier reasoning
- Use cases: Complex problem-solving, research, mission-critical applications
GPT-4.1 Series - Production Ready
GPT-4.1 (gpt-4.1)
- Best for: Production applications requiring consistent performance
- Capabilities: Structured output, reliable reasoning
- Use cases: Customer-facing applications, production workflows
GPT-4.1 Mini (gpt-4.1-mini)
- Best for: Cost-effective production deployments
- Use cases: High-volume automation, chatbots, content generation
GPT-4.1 Nano (gpt-4.1-nano)
- Best for: Maximum efficiency for simple tasks
- Use cases: Real-time interactions, simple classification, quick responses
GPT-4o Series - Optimized
GPT-4o (gpt-4o)
- Best for: Balanced performance and capability
- Capabilities: Structured output, multimodal support (text + images)
- Use cases: General-purpose applications, document analysis, versatile automation
- Supports: Prompts, conversations, email analysis, classification, intelligence, agents
GPT-4o Mini (gpt-4o-mini)
- Best for: Cost-effective general-purpose tasks
- Capabilities: Structured output, multimodal support, efficient operation
- Use cases: Standard automation, customer support, content processing
- Supports: Prompts, conversations, email analysis, translation, classification, intelligence, agents
Reasoning Models - o3-mini & o1-mini
o3-mini (o3-mini)
- Best for: Latest reasoning-focused tasks requiring deep analysis
- Capabilities: Advanced reasoning mode (required temperature: 1.0)
- Use cases: Mathematical problems, logical analysis, complex problem-solving
- Note: System role not supported; fixed temperature requirement
o1-mini (o1-mini)
- Best for: Previous-generation reasoning tasks
- Capabilities: Reasoning mode (required temperature: 1.0)
- Use cases: Logic puzzles, analytical tasks, structured problem-solving
- Note: System role not supported; fixed temperature requirement
Legacy Models
GPT-4 Turbo Preview (gpt-4-turbo-preview)
- Function calling, extended context
- Recommendation: Consider upgrading to GPT-4.1 or GPT-5 series
GPT-4 (gpt-4)
- Structured output (partial), reliable performance
- Supports: Prompts, conversations, summarization, email analysis, classification
GPT-3.5 Turbo (gpt-3.5-turbo)
- Function calling, basic capabilities
- Recommendation: Upgrade to GPT-4.1 Mini for better performance
Google Gemini
Access Google’s multimodal Gemini models directly.

Gemini 3 Series - Latest
Gemini 3 Pro Preview (gemini-3-pro-preview)
- Best for: Latest multimodal capabilities
- Capabilities: Multimodal processing (text, images, audio), advanced reasoning
- Use cases: Document analysis with images, multimedia processing, complex automation
- Temperature range: 0.0 - 1.0 (default: 0.7)
- Supports: Prompts, translation, classification, file analysis, agents
Gemini 2.5 Series - Production Advanced
Gemini 2.5 Pro (gemini-2.5-pro)
- Best for: Complex multimodal tasks requiring high performance
- Capabilities: Multimodal, large context windows, detailed analysis
- Use cases: Document understanding, comprehensive analysis, advanced automation
Gemini 2.5 Flash (gemini-2.5-flash)
- Best for: Fast multimodal processing
- Capabilities: Multimodal, efficient operation, quick responses
- Use cases: Real-time document analysis, responsive automation
Gemini 2.0 Series - Efficient
Gemini 2.0 Flash (gemini-2.0-flash)
- Best for: Cost-effective multimodal processing
- Use cases: Standard document processing, general automation
Gemini 2.0 Flash Lite (gemini-2.0-flash-lite)
- Best for: Lightweight multimodal tasks
- Use cases: Simple document analysis, high-volume operations
Gemini 1.5 Pro - Established
Gemini 1.5 Pro (gemini-1.5-pro)
- Best for: Established multimodal performance
- Capabilities: Multimodal processing, reliable operation
- Use cases: Production workloads, established workflows
- Supports: Prompts, translation, classification, file analysis
Model Selection Guide
By Use Case
Conversational Agents
Recommended Models:
- GPT-4o Mini - Best balance of cost and performance
- Claude 3.7 Sonnet - Strong reasoning at reasonable cost
- Gemini 2.5 Flash - Fast multimodal conversations
- GPT-5 Mini - Advanced reasoning for complex interactions
Why these models:
- Support structured output for reliable responses
- Handle context well for conversation continuity
- Cost-effective for high-volume interactions
- Consistent reliability in production
Document Analysis with Images
Recommended Models:
- Gemini 2.5 Pro - Complex multimodal analysis
- Gemini 3 Pro Preview - Latest document understanding
- GPT-4o - Strong multimodal processing
- Gemini 2.5 Flash - Fast multimodal analysis
Why these models:
- Multimodal support for images and text together
- Large context windows for lengthy documents
- Strong reasoning for extracting insights
- Handle charts, diagrams, and visual elements
Data Classification & Extraction
Recommended Models:
- GPT-4.1 Nano - Fast, cost-effective
- GPT-4o Mini - Structured output for consistency
- Claude Haiku 4.5 - Quick, efficient
- GPT-4.1 Mini - Production-ready reliability
Why these models:
- Structured output ensures consistent categorization
- Cost-effective for high-volume operations
- Fast response times for real-time classification
- Reliable accuracy for business logic
Complex Reasoning & Analysis
Recommended Models:
- o3-mini - Specialized reasoning mode for logic
- Claude 4 Opus - Highest-capability reasoning
- GPT-5 - Advanced problem-solving
- Claude Sonnet 4.5 - Detailed analysis
Why these models:
- Advanced reasoning capabilities
- Handle multi-step logic effectively
- Understand complex relationships
- Provide detailed explanations
Content Generation
Recommended Models:
- Claude Sonnet 4.5 - High-quality writing
- GPT-5 - Creative and coherent content
- Gemini 2.5 Pro - Long-form content
- Claude 4 Sonnet - High-quality balanced output
Why these models:
- Natural, fluent writing style
- Good creativity control via temperature
- Handle various content types well
- Consistent quality and tone
Semantic Search
Required Models:
- Snowflake Arctic L V2.0 - Latest, highest quality
- Snowflake Arctic M V1.5 - Reliable production
Why these models:
- Optimized for semantic similarity
- Consistent vector representations
- Efficient processing at scale
- Note: Must use Snowflake provider for embedding models
By Budget
Lowest Cost Models:
- GPT-4.1 Nano: Minimal cost, simple tasks
- Claude Haiku 4.5: Fast and efficient
- Gemini 2.0 Flash Lite: Lightweight multimodal
- Llama 3.2 1B/3B: Maximum efficiency
- Mistral 7B: Small but capable
By Provider Strengths
Snowflake strengths:
- Data residency in your cloud
- Wide model selection
- Claude and OpenAI access
- Native data processing
Key Model Capabilities
Multimodal Processing
What it is: Process text and images together in the same request.

Supported Models:
- All Gemini models (2.0+)
- GPT-4o, GPT-4o Mini
Common uses:
- Document analysis with charts/diagrams
- Image-based data extraction
- Visual content understanding
- OCR and form processing
Structured Output
What it is: Guaranteed JSON/XML format responses for reliable automation.

Supported Models:
- All GPT-4o, GPT-4.1, GPT-5 series
- Cortex GPT models
- GPT-4 (partial)
- Llama 3.1 8B, 70B
Common uses:
- Data extraction to databases
- Automated classification
- API integrations
- Workflow automation
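Even with guaranteed-format responses, downstream automation should still parse and check the payload before acting on it. A minimal sketch: the JSON string stands in for a hypothetical classification response, and the field names are invented for illustration.

```python
import json

# Stand-in for a structured-output response from a classification service.
# The fields are hypothetical, invented for this example.
raw = '{"category": "invoice", "confidence": 0.92, "vendor": "Acme Corp"}'

REQUIRED = {"category": str, "confidence": float, "vendor": str}

def parse_classification(payload: str) -> dict:
    """Parse a JSON classification result and verify required fields and types."""
    data = json.loads(payload)
    for field, ftype in REQUIRED.items():
        if not isinstance(data.get(field), ftype):
            raise ValueError(f"missing or mistyped field: {field}")
    return data

result = parse_classification(raw)
print(result["category"])  # invoice
```

Validating at the boundary keeps a malformed or truncated response from silently corrupting a database write or workflow step.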
Reasoning Mode
What it is: Extended thinking for complex problems with step-by-step reasoning.

Supported Models:
- o1-mini, o3-mini (dedicated reasoning, always on)
- GPT-5 series (configurable)
- Cortex o4-mini (dedicated reasoning)
Common uses:
- Mathematical problems
- Logic puzzles
- Complex analysis
- Multi-step problem-solving
Agent Support
What it is: Optimized for Elementum agent workflows and multi-step tasks.

Supported Models:
- GPT-4o, GPT-4.1, GPT-5 series
- All Claude models (via Snowflake)
- Cortex OpenAI models
- Gemini 2.5+, Gemini 3 Pro
Common uses:
- Conversational agents
- Multi-turn interactions
- Complex workflows
- Autonomous task execution
Temperature Settings
All models except dedicated reasoning models support customizable temperature. Temperature is configured at the LLM Service level in Organization Settings.

- 0.0 - 0.3: Deterministic, consistent (classification, data extraction)
- 0.4 - 0.7: Balanced creativity (conversation, general tasks)
- 0.8 - 1.0: Creative, diverse (content generation, brainstorming)
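The effect of temperature is visible directly in the sampling math: token scores are divided by the temperature before being converted to probabilities, so low values sharpen the distribution toward the top choice while high values flatten it. A small self-contained illustration (toy logits, not real model output):

```python
import math

def softmax(logits, temperature):
    """Convert raw scores to probabilities; lower temperature sharpens the result."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)                           # subtract max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]                      # toy scores for three candidate tokens
low = softmax(logits, 0.2)    # near-deterministic: top token dominates
high = softmax(logits, 1.0)   # flatter: more diverse sampling
print(round(low[0], 3), round(high[0], 3))
```

This is why 0.0-0.3 suits classification (the top answer wins almost every time) while 0.8-1.0 suits brainstorming (lower-ranked continuations get sampled more often).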
Best Practices
Model Selection
- Identify your use case — Determine if you need conversation, classification, analysis, generation, or search.
- Check required capabilities — Verify if you need multimodal, structured output, or reasoning capabilities.
- Consider your provider — Choose based on data residency, integration, and model access requirements.
- Balance cost and performance — Select the smallest model that meets your quality requirements.
- Test before committing — Compare 2-3 models with your actual use cases.
- Monitor and optimize — Track quality, cost, and speed metrics to refine your selection.
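The selection steps above can be sketched as a simple rule-based chooser: capabilities first, then cost. The `choose_model` function and its routing rules are illustrative only; the model picks follow this page's recommendations, not a prescribed Elementum behavior.

```python
def choose_model(use_case: str, needs_images: bool = False, needs_json: bool = False) -> str:
    """Toy router following the selection steps: required capabilities, then cost."""
    if needs_images:
        # Multimodal work goes to Gemini; heavier analysis gets the Pro tier.
        return "gemini-2.5-pro" if use_case == "analysis" else "gemini-2.5-flash"
    if use_case == "reasoning":
        return "o3-mini"          # dedicated reasoning mode
    if needs_json:
        return "gpt-4o-mini"      # structured output at low cost
    return "gpt-4.1-mini"         # balanced default for production text tasks

print(choose_model("classification", needs_json=True))  # gpt-4o-mini
print(choose_model("analysis", needs_images=True))      # gemini-2.5-pro
```

In a real deployment the choices would come from testing 2-3 candidates against actual workloads, per step 5 above; the point is to make the routing criteria explicit and reviewable.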
Cost Optimization
Choose right-sized models:
- Use Nano/Mini for simple tasks
- Reserve Pro/Opus for complex analysis
- Test if smaller models meet needs
Optimize prompts:
- Write concise, clear instructions
- Remove unnecessary context
- Set appropriate max tokens
- Use structured output formats
Understand provider pricing:
- Snowflake Cortex models cost ~4.5x base rate (includes infrastructure and data residency)
- Direct provider access may be more cost-effective for high-volume, simple tasks
- Snowflake provides value through data residency and unified platform
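The ~4.5x multiplier makes the direct-versus-Cortex trade-off easy to estimate. A sketch of the arithmetic: the per-token base rate below is made up for illustration; only the multiplier comes from the guidance above, and actual prices vary by model and provider.

```python
# Illustrative cost comparison. The base rate is hypothetical; only the
# ~4.5x Snowflake Cortex multiplier comes from the pricing note above.
BASE_RATE_PER_1K_TOKENS = 0.002   # hypothetical direct-provider rate, USD
CORTEX_MULTIPLIER = 4.5

def monthly_cost(tokens_per_month: int, via_cortex: bool) -> float:
    """Estimated monthly spend for a given token volume."""
    rate = BASE_RATE_PER_1K_TOKENS * (CORTEX_MULTIPLIER if via_cortex else 1.0)
    return tokens_per_month / 1000 * rate

direct = monthly_cost(10_000_000, via_cortex=False)   # 10M tokens, direct
cortex = monthly_cost(10_000_000, via_cortex=True)    # same volume via Cortex
print(round(direct, 2), round(cortex, 2))
```

For high-volume simple tasks the gap compounds quickly; whether data residency and the unified platform justify it is a per-workload decision.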
Performance Optimization
For speed:
- Use Mini/Nano/Haiku models
- Lower max tokens
- Choose geographically close providers
For quality:
- Use Pro/Opus/Sonnet tier models
- Provide detailed context
- Test with real examples
For consistency:
- Use low temperature (0.0-0.2)
- Enable structured output
- Choose models with structured output support
Next Steps
AI Providers
Set up your AI provider connections
Create AI Services
Configure specific model instances for your workflows
Build Agents
Create conversational AI assistants using these models
AI Automations
Use AI models in automation workflows