
Unifying Platforms, AI, and Operators
We've written about three foundational CNCF whitepapers (and why we build on CNCF and open source in the first place):
- Platforms, the why and how of Internal Developer Platforms
- Cloud Native AI, running AI/ML workloads on Kubernetes
- Operator Pattern, encoding operational expertise into software
Each whitepaper is excellent in isolation. But the real power emerges when you read them together, because they describe three layers of the same system.
At Scaletific, we've spent the last year building GoldenPath IDP with exactly this unified model in mind. This article explains how the three pillars fit together, where they overlap, and what the resulting architecture looks like.
The Three Layers
Think of modern cloud native infrastructure as three interconnected layers:
┌─────────────────────────────────────────────────┐
│                 PLATFORM LAYER                  │
│ Portal · Golden Paths · Self-Service · Govern.  │
├─────────────────────────────────────────────────┤
│               INTELLIGENCE LAYER                │
│ RAG Pipelines · LLM Agents · Vector DBs · Graph │
├─────────────────────────────────────────────────┤
│                 OPERATOR LAYER                  │
│ Controllers · Reconciliation · Domain Knowledge │
├─────────────────────────────────────────────────┤
│               KUBERNETES + CLOUD                │
└─────────────────────────────────────────────────┘
The Platform Layer (CNCF Platforms whitepaper) provides the human interface: the portals, golden paths, documentation, and governance that developers interact with daily.
The Intelligence Layer (CNCF Cloud Native AI whitepaper) provides cognitive capabilities: RAG pipelines that answer questions, LLM agents that generate code, vector databases that power semantic search, and ML models that detect anomalies.
The Operator Layer (CNCF Operator whitepaper) provides the automation backbone: controllers that continuously reconcile desired state with reality, encoding operational expertise into software that runs 24/7.
Where the Layers Intersect
The real value isn't in any single layer. It's in the connections between them.
Platforms + Operators: Self-Healing Infrastructure
The Platforms whitepaper calls for self-service delivery and security by default. The Operator whitepaper explains how to implement that: through custom controllers that encode provisioning, upgrading, backup, and auto-remediation logic.
When a developer requests a new database through the platform portal, an operator handles the actual provisioning, configuration tuning, backup scheduling, and ongoing health management. The platform provides the interface; the operator provides the automation.
In GoldenPath IDP, this manifests as:
- Backstage service catalog (platform portal) backed by Terraform modules (infrastructure operators)
- CI/CD pipelines that validate governance policies (continuous reconciliation)
- Certified scripts that encode operational procedures with automated verification
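The split between a platform that accepts the request and an operator that fulfils it can be sketched as a small reconcile loop. This is a minimal illustration, not GoldenPath IDP's implementation: `DatabaseSpec`, `DatabaseStatus`, `reconcile`, and `run_until_converged` are hypothetical names, and the cluster is simulated with an in-memory object rather than real Kubernetes API calls.

```python
from dataclasses import dataclass, field


@dataclass
class DatabaseSpec:
    """Desired state, as a developer might request it through the portal."""
    name: str
    replicas: int = 1
    backups_enabled: bool = True


@dataclass
class DatabaseStatus:
    """Observed state of the database (simulated in memory for this sketch)."""
    exists: bool = False
    replicas: int = 0
    backups_enabled: bool = False
    events: list = field(default_factory=list)


def reconcile(spec: DatabaseSpec, status: DatabaseStatus) -> bool:
    """Take at most one corrective step. Return True when nothing is left to do."""
    if not status.exists:
        status.exists = True
        status.events.append(f"provisioned {spec.name}")
        return False
    if status.replicas != spec.replicas:
        status.replicas = spec.replicas
        status.events.append(f"scaled to {spec.replicas} replicas")
        return False
    if status.backups_enabled != spec.backups_enabled:
        status.backups_enabled = spec.backups_enabled
        status.events.append("updated backup schedule")
        return False
    return True


def run_until_converged(spec, status, max_steps=10):
    """Call reconcile repeatedly until observed state matches desired state."""
    for _ in range(max_steps):
        if reconcile(spec, status):
            return status
    raise RuntimeError("did not converge")


# The portal's job ends at producing the spec; the loop does the rest.
status = run_until_converged(DatabaseSpec("orders-db", replicas=2), DatabaseStatus())
print(status.events)
```

Each pass corrects one difference and then re-checks, which is why the same function can handle first-time provisioning and later drift without separate code paths.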
Platforms + AI: Intelligent Developer Experience
The Platforms whitepaper emphasises cognitive load reduction. The Cloud Native AI whitepaper provides the toolkit to make that reduction intelligent.
Instead of static golden path templates, imagine:
- A RAG pipeline that answers "how do I deploy to staging?" by searching your ADRs, runbooks, and past incident reports
- An LLM agent that generates Terraform modules based on natural language descriptions, pre-validated against your governance policies
- Anomaly detection that identifies when a deployment pattern deviates from your golden path
In GoldenPath IDP, this manifests as:
- RAG-powered documentation search across 183+ ADRs and 678+ docs
- Hybrid retrieval (vector + graph) for context-aware answers
- AI-assisted code generation with governance guardrails
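To make "hybrid retrieval" concrete, here is one possible way to combine vector similarity with graph relationships. It is a toy sketch under stated assumptions: the two-dimensional embeddings, the document names, the `links` map, and the `graph_weight` parameter are all invented for illustration, and plain dictionaries stand in for a vector database and a graph database. It does not describe how GoldenPath IDP actually scores results.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def hybrid_search(query_vec, embeddings, links, top_k=2, graph_weight=0.5):
    """Rank documents by vector similarity plus a boost from their best-matching linked document."""
    vector_scores = {doc: cosine(query_vec, vec) for doc, vec in embeddings.items()}
    scores = {}
    for doc, score in vector_scores.items():
        neighbours = links.get(doc, [])
        boost = max((vector_scores.get(n, 0.0) for n in neighbours), default=0.0)
        scores[doc] = score + graph_weight * boost
    return sorted(scores, key=scores.get, reverse=True)[:top_k]


# Hypothetical corpus: the rollback runbook is only loosely similar to the query
# but is explicitly linked to the ADR that matches it well.
embeddings = {
    "adr-staging-deploys": [1.0, 0.0],
    "adr-billing-schema": [1.0, 1.7],
    "runbook-staging-rollback": [1.0, 2.3],
}
links = {
    "adr-staging-deploys": ["runbook-staging-rollback"],
    "runbook-staging-rollback": ["adr-staging-deploys"],
}
query = [1.0, 0.0]

vector_only = hybrid_search(query, embeddings, links, graph_weight=0.0)
hybrid = hybrid_search(query, embeddings, links)
print(vector_only)
print(hybrid)
```

With `graph_weight=0.0` the unrelated billing ADR takes second place on similarity alone; with the graph boost, the linked runbook overtakes it. That is the sense in which relationships add context that embeddings miss.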
AI + Operators: Autonomous Operations
The Cloud Native AI whitepaper notes that AI can enhance cloud native operations through pattern analysis, anomaly detection, and natural language interfaces. The Operator whitepaper provides the control loop that acts on those insights.
Combine them and you get autonomous operations:
- ML models detect that query latency is increasing
- The operator's control loop receives this signal
- Domain knowledge encoded in the operator determines the correct remediation (add a read replica, not just scale pods)
- The action is executed, verified, and logged
This isn't science fiction; it's the logical extension of both whitepapers' recommendations.
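The four steps above can be sketched as a single handler. This is a simplified illustration, not a production design: a z-score check stands in for the ML model, the workload is a plain dictionary, execution is simulated, and every name (`is_anomalous`, `choose_remediation`, `handle_latency_signal`, the `read_ratio` field and its 0.8 cut-off) is hypothetical.

```python
from statistics import mean, pstdev


def is_anomalous(history, latest, threshold=3.0):
    """Stand-in for an ML detector: flag samples far above the recent mean."""
    sigma = pstdev(history)
    if sigma == 0:
        return latest > mean(history)
    return (latest - mean(history)) / sigma > threshold


def choose_remediation(workload):
    """Encoded domain knowledge: a read-heavy database needs a replica, not more pods."""
    if workload["kind"] == "database" and workload["read_ratio"] >= 0.8:
        return "add_read_replica"
    return "scale_pods"


def execute(workload, action):
    """Simulated execution; a real operator would call the cluster or cloud API here."""
    key = "read_replicas" if action == "add_read_replica" else "pods"
    workload[key] += 1
    return key


def handle_latency_signal(workload, history, latest_ms, audit_log):
    """Detect, decide, act, verify, and log. Returns the action taken, or None."""
    if not is_anomalous(history, latest_ms):
        return None
    action = choose_remediation(workload)
    before = dict(workload)
    key = execute(workload, action)
    verified = workload[key] == before[key] + 1
    audit_log.append({"workload": workload["name"], "action": action, "verified": verified})
    return action


orders_db = {"name": "orders-db", "kind": "database", "read_ratio": 0.9,
             "read_replicas": 0, "pods": 3}
audit_log = []
recent_latency_ms = [20, 22, 21, 19, 23]

quiet = handle_latency_signal(orders_db, recent_latency_ms, 22, audit_log)
spike = handle_latency_signal(orders_db, recent_latency_ms, 45, audit_log)
print(quiet, spike, audit_log)
```

The detector and the remediation rule are deliberately separate functions: the first can be swapped for a trained model without touching the operational knowledge held in the second.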
The Unified Architecture
Here's how we think about the unified model at Scaletific.
Layer 1: Foundation
Kubernetes as the orchestration layer. Cloud provider services for managed databases, object storage, and networking. GPU scheduling for AI workloads via Volcano, Kueue, and Dynamic Resource Allocation.
Layer 2: Operators
Infrastructure operators (Terraform, Helm-based controllers). Application operators for database, cache, and message queue lifecycle. Governance operators for policy enforcement and compliance checking. AI/ML operators including Kubeflow Training Operator and KServe.
Layer 3: Intelligence
Vector databases (ChromaDB, Milvus) for semantic search. Graph databases (Neo4j) for relationship-aware retrieval. RAG pipelines for context-aware question answering. LLM integration with guardrails and observability. ML models for anomaly detection and pattern recognition.
Layer 4: Platform
Backstage portal for service discovery and provisioning. Golden path templates with governance validation. Self-service workflows backed by operators. AI-powered documentation and assistance. Observability dashboards via OpenTelemetry, Prometheus, and Grafana.
Cross-Cutting: Governance
- 30+ automated policies enforced at every layer
- Architecture Decision Records capturing rationale
- Certified scripts with CI-validated compliance
- RBAC and security scoping per the Operator whitepaper's recommendations
- Cost tracking and sustainability reporting per the CNAI whitepaper
Why a Unified Model Matters
Most organisations build these layers in silos. The platform team builds a portal. The ML team builds AI pipelines. The SRE team builds operators. The teams rarely talk to each other. This is the same problem that plagues platform engineer hiring: treating each domain as isolated rather than integrated.
The result is fragmentation: three separate systems with three separate interfaces, three separate governance models, and three separate failure modes.
A unified model means:
- One governance framework: policies apply consistently across platform actions, AI pipeline outputs, and operator reconciliation
- One observability stack: OpenTelemetry traces flow from the portal through the AI pipeline into the operator and back
- One golden path: developers don't need to know which layer handles their request
- One security model: RBAC, network policies, and audit logging applied uniformly
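As an illustration of the first point, a single policy set can be evaluated against changes regardless of which layer produced them. The sketch below is an assumption about how that might look, not a description of GoldenPath IDP's policy engine: the `Change` and `Policy` types, the two policy names, and the example resources are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Change:
    """A proposed change, tagged with the layer that produced it."""
    source: str      # "platform", "ai", or "operator"
    resource: str
    attributes: dict


@dataclass
class Policy:
    name: str
    check: Callable[[Change], bool]   # returns True when the change complies


# One shared policy list; nothing in it depends on Change.source.
POLICIES = [
    Policy("require-owner-label", lambda c: "owner" in c.attributes.get("labels", {})),
    Policy("no-public-buckets", lambda c: not c.attributes.get("public", False)),
]


def evaluate(change: Change) -> list:
    """Return the names of the policies the change violates."""
    return [p.name for p in POLICIES if not p.check(change)]


changes = [
    Change("platform", "s3/reports", {"labels": {"owner": "team-data"}, "public": False}),
    Change("ai", "s3/scratch", {"labels": {}, "public": True}),
    Change("operator", "db/orders", {"labels": {"owner": "team-orders"}}),
]

results = {c.resource: evaluate(c) for c in changes}
print(results)
```

Because the checks never inspect `source`, an AI-generated change is held to exactly the same rules as one submitted through the portal or applied by an operator.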
What We're Building With GoldenPath IDP
This isn't a whiteboard exercise. GoldenPath IDP is our production implementation of the unified model, and each layer is already delivering value.
The Platform Layer (Live)
Developers interact with a Backstage service catalog that lets them discover services, provision infrastructure, and follow golden path templates, all without filing tickets. Behind the scenes, 30+ governance policies run automatically in CI, validating every change against our architecture standards. Our 183 Architecture Decision Records capture not just what we built, but why, which makes onboarding faster and decisions auditable.
This is the CNCF Platforms whitepaper in practice: self-service, documentation-first, governance by default.
The Automation Layer (Live)
Every infrastructure change flows through Terraform modules that act as our operators, declaring desired state and continuously reconciling it. Our 89 certified scripts encode operational procedures that used to live in people's heads: deployment sequences, migration steps, rollback procedures. CI-driven policy reconciliation ensures that what's deployed always matches what's declared.
This is the CNCF Operator whitepaper in practice: domain knowledge codified into software that runs 24/7.
The Intelligence Layer (In Development)
We're building a RAG pipeline that combines vector search (ChromaDB) with graph-based retrieval (Neo4j) to answer questions across our entire documentation corpus. Ask "how do I deploy to staging?" and it searches ADRs, runbooks, and past incident reports to give you a context-aware answer, not a generic wiki link.
This is the CNCF Cloud Native AI whitepaper in practice: AI workloads running on cloud native infrastructure, governed by the same policies as everything else.
What Comes Next
The real unlock is when these layers talk to each other. Imagine an AI agent that detects a governance violation in a pull request, explains why it violates the policy by citing the relevant ADR, and suggests a compliant alternative, all before a human reviewer even looks at it.
That's the unified model. That's where we're headed.
The Invitation
We believe the future of platform engineering is intelligent, automated, and continuously reconciled.
The three CNCF whitepapers provide the theoretical foundation. GoldenPath IDP is our proof that the unified model works in practice.
If you're building an Internal Developer Platform and wondering how AI fits in, or if you're building AI infrastructure and wondering how governance scales, we've been down both roads and found they converge.
Read the whitepapers that inform this model: the CNCF Platforms, Cloud Native AI, and Operator Pattern whitepapers.
Want to explore the unified model for your organisation? Start a conversation and we'll show you what's possible.