AI Models
Use any LLM. Run it anywhere. Power agents with the best models for your enterprise.
Kubeark gives you full flexibility to use the AI models that work for your business — from enterprise SaaS LLMs like Azure OpenAI, Gemini, and AWS Bedrock, to open-source models hosted on-premises or in air-gapped environments.
Whether you need control, speed, compliance, or scale — Kubeark makes it easy to plug in the right model for the job.

Your Agents. Your Models. Total Flexibility.
Kubeark is model-agnostic and enterprise-ready. Choose your LLM based on context, cost, and compliance — and switch effortlessly as your needs evolve.
• Use popular SaaS models out of the box
• Host open-source models securely in your own environment
• Mix and match models per agent, per task, or per business line
Kubeark is not tied to a single model provider — you’re in control.
Supported Models & Platforms
Kubeark supports the full spectrum of commercial and open-source models:
• Cloud-Hosted SaaS LLMs
- Azure OpenAI Service — GPT-4, GPT-3.5, fine-tuned variants
- Google Gemini — Gemini Pro, Gemini 1.5 via Vertex AI
- AWS Bedrock — Anthropic Claude, Cohere, Mistral, Amazon Titan
- OpenAI API — Direct access to ChatGPT, GPT-4 Turbo, and more
Use them directly through secure connectors with integrated cost and usage controls.
• On-Prem & Private Models
- Open Source Models — LLaMA 2/3, Mistral, Mixtral, Falcon, Gemma, Command-R, and others
- Fine-tuned Models — Host your own domain-tuned LLMs and serve them through Kubeark
- Multi-model GPU Sharing — Run multiple LLMs on shared GPUs for efficiency and cost savings
Only Kubeark lets you run open models alongside SaaS models — in the same agent ecosystem.

Use the Right Model for the Right Job
Kubeark lets you choose different models for different contexts:
| Use Case | Ideal Model Type |
| --- | --- |
| Customer-facing agent | SaaS-hosted GPT-4 or Gemini Pro |
| Sensitive data classification | On-prem LLaMA 2 or Mistral |
| Contract analysis agent | Fine-tuned Command-R hosted on local GPUs |
| Executive Q&A from internal docs | Hybrid RAG agent with Gemini and internal vector DB |
| Supplier escalation bot | Cost-efficient GPT-3.5 Turbo or Claude Instant |
Mix-and-match logic is built in — assign models at the agent or step level.
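The per-task assignments in the table above can be pictured as a simple routing map. This is an illustrative sketch only: the function and model identifiers are hypothetical and do not reflect Kubeark's actual API or naming.

```python
# Hypothetical sketch: route each agent task to a model identifier,
# mirroring the use-case table above. Not Kubeark's actual API.
ROUTING_TABLE = {
    "customer_support": "azure-openai/gpt-4",       # SaaS-hosted, customer-facing
    "data_classification": "on-prem/llama-2-13b",   # sensitive data stays in-house
    "contract_analysis": "local-gpu/command-r-ft",  # fine-tuned, local GPUs
    "supplier_escalation": "openai/gpt-3.5-turbo",  # cost-efficient tier
}

def select_model(task: str, default: str = "openai/gpt-3.5-turbo") -> str:
    """Return the model assigned to a task, falling back to a default."""
    return ROUTING_TABLE.get(task, default)
```

Because routing is plain configuration, reassigning a task to a different model is a one-line change rather than an agent rewrite.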
Zero Lock-In, Full Control
With Kubeark, you get:
- Model Switching Without Rebuilding Agents: Swap or upgrade models without changing business logic.
- GPU-Efficient Hosting: Run multiple models concurrently on shared GPU infrastructure.
- Fine-Tuning Ready: Import or connect your own fine-tuned weights and versions.
- Model Usage Governance: Track consumption, manage access, and control prompt visibility.
- Data Privacy by Design: Keep your data in-house, or choose where it flows, at every step.
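"Model switching without rebuilding agents" typically relies on a provider-agnostic interface between business logic and the underlying LLM. The sketch below is a minimal, hypothetical illustration of that pattern; the class and method names are assumptions, not Kubeark's actual SDK.

```python
# Hypothetical sketch of a provider-agnostic model interface.
# Business logic depends only on ChatModel, so providers can be
# swapped without touching agent code. Not Kubeark's actual API.
from abc import ABC, abstractmethod

class ChatModel(ABC):
    @abstractmethod
    def complete(self, prompt: str) -> str: ...

class AzureOpenAIModel(ChatModel):
    def complete(self, prompt: str) -> str:
        # A real implementation would call the Azure OpenAI SDK here.
        return f"[azure] {prompt}"

class OnPremLlamaModel(ChatModel):
    def complete(self, prompt: str) -> str:
        # A real implementation would call a local inference server here.
        return f"[on-prem] {prompt}"

def run_agent_step(model: ChatModel, prompt: str) -> str:
    """Agent logic sees only the abstract interface, never a vendor."""
    return model.complete(prompt)
```

Swapping `AzureOpenAIModel` for `OnPremLlamaModel` changes nothing in `run_agent_step`, which is the property the bullet list describes.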

Built for the Realities of Enterprise AI
Kubeark’s AI Model layer is designed to work within your security, procurement, and IT architecture:
• SaaS, on-prem, and air-gapped support
• Connectors for RAG, embeddings, vector DBs
• Enterprise logging, RBAC, and observability
• Multi-tenant and workspace-aware deployments
Whether you’re in finance, manufacturing, government, or healthcare — Kubeark adapts to your compliance posture.
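The RAG connectors mentioned above follow a simple pattern: retrieve relevant internal content, then ground the model's prompt in it. The toy sketch below illustrates that flow with a word-overlap retriever; a real deployment would use embeddings and a vector database, and none of these names come from Kubeark's actual connectors.

```python
# Toy RAG sketch (hypothetical, not a specific Kubeark connector):
# retrieve the most relevant document, then ground the prompt in it.

DOCS = {
    "pto_policy": "Employees accrue 1.5 days of paid leave per month.",
    "expense_policy": "Expenses over $500 require manager approval.",
}

def retrieve(query: str) -> str:
    """Pick the document sharing the most words with the query.
    A real retriever would use embeddings and a vector database."""
    q = set(query.lower().split())
    return max(DOCS.values(), key=lambda d: len(q & set(d.lower().split())))

def build_grounded_prompt(question: str) -> str:
    """Prepend retrieved context so the model answers from internal docs."""
    context = retrieve(question)
    return f"Context: {context}\nQuestion: {question}\nAnswer using only the context."
```

The grounded prompt can then be routed to any model, SaaS or on-prem, keeping sensitive documents inside the environment where retrieval runs.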
Supercharge Agents with Smarter Models
Kubeark gives your AI agents access to models that are:
- Accurate for reasoning
- Fast for response times
- Private for sensitive use cases
- Customizable for business language and tone
- Composable for multi-model, multi-step workflows

Start Building with the Best of AI
- Choose from the best commercial and open-source LLMs
- Build agents that learn, reason, and act intelligently
- Run everything securely — wherever your business demands
→ Kubeark AI Models: Any model. Anywhere. One orchestration layer.