The Evolution of Platform Engineering: Scaling with CNCF and User-Centric Design

Introduction

Platform engineering has emerged as a critical discipline in modern software development, enabling organizations to scale efficiently while maintaining governance and developer productivity. This article explores the evolution of platform engineering at Kasan, a large-scale organization with over 5,000 engineers and thousands of users. By leveraging CNCF technologies like Kubernetes and adopting a user-centric approach, Kasan cut infrastructure provisioning time from hours to minutes while addressing tool duplication, lack of standardization, and low developer engagement.

Core Concepts and Technical Foundations

Platform Engineering Defined

Platform engineering focuses on abstracting infrastructure complexity to allow developers to focus on delivering business value. It involves creating reusable, standardized tools and processes that reduce friction in development workflows. The core principle is to lower the barrier to entry for developers while ensuring scalability, security, and observability.

Key Technologies and CNCF Integration

Kasan’s platform engineering initiative is deeply rooted in the Cloud Native Computing Foundation (CNCF) ecosystem. By adopting Kubernetes as the foundation, the organization established a unified infrastructure layer that supports containerized applications. This integration enabled the creation of a technical committee and special interest groups (SIGs) to drive standardization and governance across the organization.

The Evolution of Infrastructure Provisioning

From 3S Engine to SRP/MRP

The initial architecture relied on the 3S engine (the name is not expanded in this account), built on Terraform, which let developers define infrastructure in YAML. This reduced the need for infrastructure expertise, but the engine supported only internet-facing connectivity and had no internal network capabilities.
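To make the YAML-over-Terraform idea concrete, here is a minimal Python sketch. It is not Kasan's implementation. It assumes the developer's YAML has already been parsed into a dictionary and renders Terraform's JSON configuration syntax (the kind of content stored in a `.tf.json` file). The spec fields (`name`, `size`), the size table, and the resource type `example_service` are all hypothetical.

```python
import json

# Hypothetical developer-facing spec, shown as the dict a YAML parser would return.
SPEC = {"name": "checkout", "size": "small"}

# Hypothetical mapping from a friendly size label to concrete settings.
SIZES = {
    "small": {"cpu": 1, "memory_gb": 2},
    "large": {"cpu": 4, "memory_gb": 16},
}


def render_terraform_json(spec: dict) -> str:
    """Translate a developer spec into Terraform JSON configuration text."""
    if spec["size"] not in SIZES:
        raise ValueError(f"unknown size: {spec['size']!r}")
    # The developer only picks a label; the platform fills in the details.
    resource_args = {"name": spec["name"], **SIZES[spec["size"]]}
    # "example_service" is a made-up resource type used for illustration only.
    config = {"resource": {"example_service": {spec["name"]: resource_args}}}
    return json.dumps(config, indent=2, sort_keys=True)


if __name__ == "__main__":
    print(render_terraform_json(SPEC))
```

The point of the design is that the developer never writes Terraform directly: the platform owns the translation, so it can change the generated configuration without changing what developers author.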

To address these gaps, Kasan introduced SRP (Single Region Provisioning) and MRP (Multi-Region Provisioning) products. These solutions extended the 3S engine by adding internal network connectivity and introducing Platform Score (PFC API), enabling users to configure networking and security resources autonomously. This shift reduced reliance on manual support requests and streamlined provisioning workflows.
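The relationship between single-region and multi-region provisioning, and the added connectivity choice, can be sketched as follows. This is an illustrative model only: the option names `internet` and `internal`, the `Deployment` type, and the region names are assumptions, not part of Kasan's products.

```python
from dataclasses import dataclass

# Assumed option names for the two connectivity modes described above.
ALLOWED_CONNECTIVITY = {"internet", "internal"}


@dataclass(frozen=True)
class Deployment:
    service: str
    region: str
    connectivity: str


def plan_deployments(service: str, regions: list[str], connectivity: str) -> list[Deployment]:
    """Expand one request into one deployment per region.

    A single-region request is simply a list containing one region.
    """
    if connectivity not in ALLOWED_CONNECTIVITY:
        raise ValueError(f"unsupported connectivity: {connectivity!r}")
    if not regions:
        raise ValueError("at least one region is required")
    if len(set(regions)) != len(regions):
        raise ValueError("regions must be unique")
    return [Deployment(service, region, connectivity) for region in regions]


if __name__ == "__main__":
    # Hypothetical region names.
    for deployment in plan_deployments("checkout", ["region-a", "region-b"], "internal"):
        print(deployment)
```

Validating the request up front is what lets users configure networking themselves: invalid combinations are rejected immediately instead of turning into a support request.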

Developer Gateway and User Experience

A dedicated developer gateway was created to centralize access to platform tools and provide feedback mechanisms. This improved the user experience by offering a unified interface for infrastructure management, reducing cognitive load for developers.

Technical Challenges and Solutions

Migration Complexity and User Adoption

The transition from the 3S engine to SRP/MRP faced significant challenges. Existing users encountered migration costs due to the need for re-deployment, while new users struggled to perceive the value of SRP/MRP, as its features overlapped with the 3S engine.

To resolve this, Kasan reverted to the 3S engine and added the SRP/MRP features it had been missing, such as internal network connectivity. This approach minimized development overhead and preserved backward compatibility for existing users. From there, the team deepened the 3S engine’s technical capabilities through backward-compatible updates.
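One common way to add features without forcing re-deployment is to make every new field optional and default it to the old behavior. The sketch below illustrates that general technique under stated assumptions; the field names and the `normalize_spec` helper are hypothetical and not taken from the 3S engine.

```python
# Fields the (hypothetical) spec format understands.
KNOWN_FIELDS = {"name", "connectivity"}

# New optional fields default to the legacy behavior: internet-only connectivity.
DEFAULTS = {"connectivity": "internet"}


def normalize_spec(spec: dict) -> dict:
    """Return a copy of the spec with new optional fields set to legacy defaults."""
    unknown = set(spec) - KNOWN_FIELDS
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")
    if "name" not in spec:
        raise ValueError("'name' is required")
    # Explicit values in the spec win over defaults; the input is not modified.
    return {**DEFAULTS, **spec}


if __name__ == "__main__":
    legacy = {"name": "checkout"}  # written before the new field existed
    updated = {"name": "checkout", "connectivity": "internal"}
    print(normalize_spec(legacy))
    print(normalize_spec(updated))
```

Because a legacy spec normalizes to exactly what it meant before, existing users see no change, while new users can opt in to the added capability.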

Platform Orchestrator and Integration

A platform orchestrator was developed to unify infrastructure provisioning for users at different maturity levels. By integrating the orchestrator with the developer gateway, Kasan improved adoption rates and ensured alignment with organizational goals.
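The source does not describe how the orchestrator works internally. As one possible reading, the sketch below shows a single entry point that routes requests to different provisioning paths by maturity level. The level names (`guided`, `advanced`), the request fields, and every function name are hypothetical.

```python
from typing import Callable

Handler = Callable[[dict], str]

# Registry of provisioning paths, keyed by (hypothetical) maturity level.
_HANDLERS: dict[str, Handler] = {}


def handler(level: str) -> Callable[[Handler], Handler]:
    """Register a provisioning path for one maturity level."""
    def register(func: Handler) -> Handler:
        _HANDLERS[level] = func
        return func
    return register


@handler("guided")
def provision_with_defaults(request: dict) -> str:
    # Newer teams supply only a service name and get opinionated defaults.
    return f"{request['service']}: standard template"


@handler("advanced")
def provision_custom(request: dict) -> str:
    # Experienced teams may pass their own overrides.
    overrides = ", ".join(sorted(request.get("overrides", {}))) or "none"
    return f"{request['service']}: custom overrides ({overrides})"


def orchestrate(level: str, request: dict) -> str:
    """Single entry point that routes a request to the matching path."""
    if level not in _HANDLERS:
        raise ValueError(f"no provisioning path for level {level!r}")
    return _HANDLERS[level](request)


if __name__ == "__main__":
    print(orchestrate("guided", {"service": "checkout"}))
    print(orchestrate("advanced", {"service": "checkout", "overrides": {"cpu": 4}}))
```

A gateway that calls one function such as `orchestrate` does not need to know how many paths exist, which keeps the interface stable as new paths are added.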

Key Lessons and Future Directions

Prioritizing User-Centric Design

The primary takeaway is that platform engineering must reduce user complexity rather than transferring technical burdens to developers. Success hinges on aligning platform capabilities with user maturity levels, avoiding over-engineering or under-delivery.

Embracing Flexibility and Iteration

Adaptability is critical in platform engineering. Kasan’s experience highlights the importance of iterative development, accepting failures as part of the process, and continuously refining tools based on user feedback.

Governance and Change Management

Establishing a technical committee and SIGs ensured standardized practices, while proactive change management strategies addressed resistance to adoption. Quarterly planning and dedicated delivery managers optimized cross-team collaboration and resource allocation.

Conclusion

Platform engineering at Kasan demonstrates how leveraging CNCF technologies, such as Kubernetes, and adopting a user-centric design can turn infrastructure provisioning from a time-consuming task into an efficient, scalable process. By focusing on reducing complexity, aligning with organizational maturity, and fostering developer engagement, Kasan achieved significant improvements in efficiency and governance. The journey underscores the importance of flexibility, user feedback, and continuous iteration in building a robust platform engineering practice.