AI Models

Use any LLM. Run it anywhere. Power agents with the best models for your enterprise.

Kubeark gives you full flexibility to use the AI models that work for your business — from enterprise SaaS LLMs like Azure OpenAI, Gemini, and AWS Bedrock, to open-source models hosted on-premises or in air-gapped environments.

Whether you need control, speed, compliance, or scale — Kubeark makes it easy to plug in the right model for the job.

Your Agents. Your Models. Total Flexibility.

Kubeark is model-agnostic and enterprise-ready. Choose your LLM based on context, cost, and compliance — and switch effortlessly as your needs evolve.

    • Use popular SaaS models out of the box
    • Host open-source models securely in your own environment
    • Mix and match models per agent, per task, or per business line

Kubeark is not tied to a single model provider — you’re in control.

Supported Models & Platforms

Kubeark supports the full spectrum of commercial and open-source models:

Cloud-Hosted SaaS LLMs

    • Azure OpenAI Service — GPT-4, GPT-3.5, fine-tuned variants
    • Google Gemini — Gemini Pro, Gemini 1.5 via Vertex AI
    • AWS Bedrock — Anthropic Claude, Cohere, Mistral, Amazon Titan
    • OpenAI API — Direct access to ChatGPT, GPT-4 Turbo, and more

Use them directly through secure connectors with integrated cost and usage controls.

On-Prem & Private Models

    • Open Source Models — LLaMA 2/3, Mistral, Mixtral, Falcon, Gemma, Command-R, and others
    • Fine-tuned Models — Host your own domain-tuned LLMs and serve them through Kubeark
    • Multi-model GPU Sharing — Run multiple LLMs on shared GPUs for efficiency and cost savings

Kubeark runs open-source and SaaS models side by side, in the same agent ecosystem.

Use the Right Model for the Right Job

Kubeark lets you choose different models for different contexts:

    • Customer-facing agent: SaaS-hosted GPT-4 or Gemini Pro
    • Sensitive data classification: On-prem LLaMA 2 or Mistral
    • Contract analysis agent: Fine-tuned Command-R hosted on local GPUs
    • Executive Q&A from internal docs: Hybrid RAG agent with Gemini and an internal vector DB
    • Supplier escalation bot: Cost-efficient GPT-3.5 Turbo or Claude Instant

Mix-and-match logic is built in — assign models at the agent or step level.
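The per-task assignment described above can be pictured as a routing table that maps each use case to a provider and model. The sketch below is purely illustrative: the names, structure, and fallback behavior are assumptions for explanation, not Kubeark's actual API or configuration format.

```python
# Hypothetical per-task model routing table. Providers and model names
# are illustrative, not Kubeark's actual configuration schema.
ROUTING = {
    "customer_chat":        {"provider": "azure_openai", "model": "gpt-4"},
    "data_classification":  {"provider": "on_prem",      "model": "llama-2-13b"},
    "contract_analysis":    {"provider": "on_prem",      "model": "command-r-ft"},
    "supplier_escalation":  {"provider": "openai",       "model": "gpt-3.5-turbo"},
}

def resolve_model(task: str,
                  default: tuple[str, str] = ("openai", "gpt-3.5-turbo")) -> tuple[str, str]:
    """Return (provider, model) for a task, falling back to a cost-efficient default."""
    entry = ROUTING.get(task)
    if entry is None:
        return default
    return entry["provider"], entry["model"]
```

Keeping the mapping in data rather than code is what makes "assign models at the agent or step level" cheap: changing a route touches configuration, not agent logic.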

Zero Lock-In, Full Control

With Kubeark, you get:

  • Model Switching Without Rebuilding Agents
    Swap or upgrade models without changing business logic.
  • GPU-Efficient Hosting
    Run multiple models concurrently on shared GPU infrastructure.
  • Fine-Tuning Ready
    Import or connect your own fine-tuned weights and versions.
  • Model Usage Governance
    Track consumption, manage access, and control prompt visibility.
  • Data Privacy by Design
    Keep your data in-house, or choose where it flows — at every step.
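Swapping models without rebuilding agents typically relies on a stable interface between the agent and the model behind it. The sketch below shows the general dependency-injection pattern under assumed names; the classes and methods are not Kubeark's actual SDK.

```python
# Illustrative sketch: agent logic depends on an abstract ChatModel,
# so the concrete model can be swapped without changing the agent.
# All class and method names here are assumptions, not Kubeark's SDK.
from abc import ABC, abstractmethod


class ChatModel(ABC):
    @abstractmethod
    def complete(self, prompt: str) -> str: ...


class SaaSModel(ChatModel):
    def __init__(self, name: str):
        self.name = name

    def complete(self, prompt: str) -> str:
        # A real connector would call the provider's API here.
        return f"[{self.name}] reply to: {prompt}"


class Agent:
    def __init__(self, model: ChatModel):
        self.model = model  # injected, so it can be replaced freely

    def answer(self, question: str) -> str:
        return self.model.complete(question)


agent = Agent(SaaSModel("gpt-4"))
agent.model = SaaSModel("gemini-pro")  # swap the model; Agent code is untouched
```

Because the agent only sees the `ChatModel` interface, upgrading from one provider's model to another is a one-line change rather than a rebuild.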

Built for the Realities of Enterprise AI

Kubeark’s AI Model layer is designed to work within your security, procurement, and IT architecture:

    • SaaS, on-prem, and air-gapped support
    • Connectors for RAG, embeddings, vector DBs
    • Enterprise logging, RBAC, and observability
    • Multi-tenant and workspace-aware deployments
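The RAG connectors mentioned above follow a common shape: retrieve relevant documents, then assemble them into the model's prompt. The toy sketch below shows only that shape; a real deployment would score with embeddings and a vector DB rather than the word-overlap ranking used here, and none of these function names come from Kubeark.

```python
# Minimal sketch of the retrieval step in a RAG pipeline. The word-overlap
# scoring is a stand-in for embedding similarity against a vector DB.
def retrieve(query: str, docs: list[str], k: int = 1) -> list[str]:
    """Return the k docs sharing the most words with the query."""
    q = set(query.lower().split())
    ranked = sorted(docs,
                    key=lambda d: len(q & set(d.lower().split())),
                    reverse=True)
    return ranked[:k]


def build_prompt(query: str, docs: list[str]) -> str:
    """Ground the model by placing retrieved context ahead of the question."""
    context = "\n".join(retrieve(query, docs))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


docs = ["Q3 revenue grew 12 percent", "The office cafeteria menu changed"]
prompt = build_prompt("What was revenue growth in Q3?", docs)
```

The same two-step structure holds whether the context lives in an internal vector DB and the generator is Gemini, or both run on-prem.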

Whether you’re in finance, manufacturing, government, or healthcare — Kubeark adapts to your compliance posture.

Supercharge Agents with Smarter Models

Kubeark gives your AI agents access to models that are:

  • Accurate for reasoning
  • Fast for response times
  • Private for sensitive use cases
  • Customizable for business language and tone
  • Composable for multi-model, multi-step workflows

Start Building with the Best of AI

  • Choose from the best commercial and open-source LLMs
  • Build agents that learn, reason, and act intelligently
  • Run everything securely — wherever your business demands

→ Kubeark AI Models: Any model. Anywhere. One orchestration layer.