Use Cases
Each use case is a self-contained AI application deployed on top of the Red Hat OpenShift AI (RHOAI) platform.
Structure
The repository separates models (individual model deployments) from services (applications that consume models):
usecases/
├── models/                      # One directory per model
│   └── <model-name>/
│       ├── manifests/           # ServingRuntime, InferenceService, PVC, download Job
│       └── profiles/
│           └── tier1-minimal/   # Kustomize overlay (auto-discovered by cluster-models AppSet)
└── services/                    # Application services
    └── <service-name>/
        ├── manifests/
        │   ├── base/            # Namespace, RBAC, config, network
        │   ├── services/        # Deployments, Routes
        │   └── training/        # Training infrastructure + workloads
        └── profiles/
            └── tier1-minimal/   # Kustomize overlay (auto-discovered by cluster-services AppSet)
Current Models
| Model | Description | Deployed by Default |
|---|---|---|
| gpt-oss-120b | OpenAI GPT-OSS 120B MoE (MXFP4, 4x L40S tensor-parallel, Red Hat AI validated ModelCar) | Yes |
| orchestrator-8b | NVIDIA Nemotron-Orchestrator-8B for multi-tool coordination | No (excluded) |
| qwen-math-7b | Qwen2.5-Math-7B-Instruct math specialist | No (excluded) |
Re-enabling excluded models
Models marked "excluded" have their manifests in Git but are excluded from ArgoCD discovery via exclude entries in cluster-models-appset.yaml. To re-enable a model, remove its exclude entry and push to Git.
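As a rough illustration of what such exclude entries can look like, here is a minimal sketch of an ApplicationSet that uses the Argo CD Git directory generator. It is not the repository's actual cluster-models-appset.yaml: the repository URL, ArgoCD namespace, project, and destination are placeholders assumed for the example. Only the discovery path, the excluded model names, and the model-<name> naming come from this page.

```yaml
# Hypothetical sketch, not the real cluster-models-appset.yaml.
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: cluster-models
  namespace: openshift-gitops                      # assumed ArgoCD namespace
spec:
  generators:
    - git:
        repoURL: https://git.example.com/org/repo.git   # placeholder URL
        revision: HEAD
        directories:
          # Discover every model overlay...
          - path: usecases/models/*/profiles/tier1-minimal
          # ...except the ones listed with exclude: true.
          # Deleting one of these entries re-enables that model.
          - path: usecases/models/orchestrator-8b/profiles/tier1-minimal
            exclude: true
          - path: usecases/models/qwen-math-7b/profiles/tier1-minimal
            exclude: true
  template:
    metadata:
      # path[2] is the third path segment, i.e. the model directory name.
      name: 'model-{{path[2]}}'
    spec:
      project: default                              # assumed project
      source:
        repoURL: https://git.example.com/org/repo.git   # placeholder URL
        targetRevision: HEAD
        path: '{{path}}'
      destination:
        server: https://kubernetes.default.svc
```

With a layout like this, removing the two lines for a model's exclude entry and pushing the change is enough for the generator to produce an Application for it on the next reconcile.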
Current Services
| Service | Description | Model Dependencies | Deployed by Default | Guide |
|---|---|---|---|---|
| llamastack | Meta's LlamaStack Distribution with agents, RAG, and tool use | gpt-oss-120b (remote by default) | Yes | LlamaStack |
| genai-toolbox | GenAI Toolbox MCP Server for database tools | None (uses llamastack's PostgreSQL) | Yes | GenAI Toolbox |
| rhokp | Red Hat OKP MCP Server for RHEL documentation, CVEs, errata | None (self-contained with OKP Solr) | Yes | Red Hat OKP |
| toolorchestra-app | NVIDIA ToolOrchestra UI for multi-model orchestration | orchestrator-8b, qwen-math-7b | No (excluded) | ToolOrchestra |
Re-enabling excluded services
Services marked "excluded" have their manifests in Git but are excluded from ArgoCD discovery via exclude entries in cluster-services-appset.yaml. To re-enable a service, remove its exclude entry (and re-enable its model dependencies) and push to Git.
Deploy models before services
Services depend on model endpoints being reachable. When deploying manually, deploy all required models and wait for them to become Ready before deploying services. In GitOps mode, the cluster-models and cluster-services ApplicationSets deploy in parallel, and models typically become ready before services finish initializing.
Adding a New Model
- Create a directory under usecases/models/ that follows the layout shown in Structure.
- The cluster-models ApplicationSet auto-discovers usecases/models/*/profiles/tier1-minimal directories. Push to Git and a new model-<name> Application is created automatically.
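For orientation, a model's tier1-minimal overlay could be as small as the sketch below. This is an assumption-laden example rather than a copy of an existing overlay: it presumes that the model's manifests/ directory contains its own kustomization.yaml, which is what allows a directory to be listed under resources.

```yaml
# usecases/models/<model-name>/profiles/tier1-minimal/kustomization.yaml
# Hypothetical minimal overlay; adjust to the manifests the model actually ships.
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  # Assumes ../../manifests has a kustomization.yaml listing the
  # ServingRuntime, InferenceService, PVC, and download Job.
  - ../../manifests
```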
Adding a New Service
- Create a directory under usecases/services/ that follows the layout shown in Structure.
- The cluster-services ApplicationSet auto-discovers usecases/services/*/profiles/tier1-minimal directories. Push to Git and a new service-<name> Application is created automatically.
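A service overlay differs from a model overlay mainly in that it can pick which manifest groups to include. The sketch below assumes that each of manifests/base and manifests/services has its own kustomization.yaml, and that training resources are optional for a minimal profile; neither assumption is confirmed by this page.

```yaml
# usecases/services/<service-name>/profiles/tier1-minimal/kustomization.yaml
# Hypothetical minimal overlay for a service.
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - ../../manifests/base        # Namespace, RBAC, config, network
  - ../../manifests/services    # Deployments, Routes
  # - ../../manifests/training  # assumed optional; uncomment if the profile needs it
```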
Model download jobs
For model download jobs, always:
- Add argocd.argoproj.io/sync-wave: "-1" to PVCs so they bind before download Jobs
- Add argocd.argoproj.io/sync-wave: "0" to download Jobs so downloads run before the InferenceService (wave 1)
- Omit ttlSecondsAfterFinished so completed jobs persist and ArgoCD doesn't recreate them
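The three rules above can be sketched as a PVC and Job pair. Everything except the two sync-wave annotations and the absence of ttlSecondsAfterFinished is a placeholder assumed for illustration: the resource names, storage size, image, and command are hypothetical and stand in for whatever the model's real download job uses.

```yaml
# Hypothetical PVC: wave -1, so it is synced before the download Job.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: example-model-storage              # hypothetical name
  annotations:
    argocd.argoproj.io/sync-wave: "-1"
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 100Gi                       # assumed size
---
# Hypothetical download Job: wave 0, ahead of the InferenceService in wave 1.
apiVersion: batch/v1
kind: Job
metadata:
  name: example-model-download             # hypothetical name
  annotations:
    argocd.argoproj.io/sync-wave: "0"
spec:
  # ttlSecondsAfterFinished is deliberately not set, so the completed Job
  # stays in the cluster and ArgoCD does not recreate it.
  backoffLimit: 3
  template:
    spec:
      restartPolicy: OnFailure
      containers:
        - name: download
          image: registry.example.com/tools/model-downloader:latest   # placeholder image
          command: ["/bin/sh", "-c", "echo 'download model files into /models here'"]
          volumeMounts:
            - name: model-storage
              mountPath: /models
      volumes:
        - name: model-storage
          persistentVolumeClaim:
            claimName: example-model-storage
```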