Platform engineers need to be empowered in an organization’s security program. Their work has huge leverage over a product's security posture, with an impact arguably as great as (some would say greater than) that of application vulnerabilities. Despite this, their role in security programs remains ill-defined. While cloud vendors publish well-defined shared responsibility models for their customers, organizations rarely spell out how security ownership is divided between application developers and platform engineers.
In this blog post, we’ll explore what a platform engineer is and highlight some of the challenges they face that product security teams should be aware of.
What is a platform engineer?
Platform engineers work to build tooling and systems for developer self-service in their organizations. They build “paved-paths” that allow developers to trade off optionality for lower cognitive load, better ergonomics and a simpler software development lifecycle. No longer encumbered by the paradox of choice and steep learning curves, application developers are better able to focus on delighting their customers.
Platform engineers consume a large set of complex and fragmented infrastructure building blocks and provide a smaller set of simpler, less feature-rich components. They essentially combine Lego blocks to build something that looks like Duplo.
When this is done correctly, application developers get all the customization they need for 80% of their work without having to become experts on infrastructure.
What does platform engineering have to do with security?
In the process of abstracting detail away from the infrastructure itself, platform engineers make decisions about how the infrastructure should be configured in the first place. These decisions have a huge impact on the security posture of the organization. One could argue that design flaws and misconfigurations have a higher impact on that posture than an application vulnerability would. In fact, in Snyk's 2021 State of Cloud Native Application Security report, we found that 45% of respondents had misconfiguration-driven vulnerabilities.
Consider all the errors that could take place in the design and implementation of each layer of a typical internal development platform:
- Application management: logging agents, CD, container orchestration, secret management, image replication, data backup and life cycles, artifact build and storage
- Application compute: container clusters, VMs, load balancing, certificates
- Network: DNS, isolated networks, internet connectivity, ingress/egress rules
- Access: identity, privileges
And that list will only get bigger if you talk to a platform engineer. Now let's take a look at some of the problems platform engineers can face.
Hard to secure a growing box of increasingly complicated Legos
The set of components is large and growing, and each component is becoming increasingly sophisticated. Microservices are also transforming into SaaS, as customers offload components of their architecture to external software vendors.
Platform engineers are often generalists, with some areas of deeper knowledge. They face a similar knowledge gap around security vulnerabilities as application developers do. However, their gaps are arguably more varied and harder to close, as closing them often relies on deep knowledge of a given vendor’s service and its configuration.
Complex change management, exacerbated by immature testing practices
For application developers, testing is well understood and broadly practiced. Developers are expected to write tests for their code and understand the advantages and disadvantages of different testing strategies throughout the testing pyramid. As a result, the cost of merging application security fixes is relatively low. An organization can rely on its testing maturity as a protection against regressions, so a successful build is often all that is needed to merge a security patch.
The situation for platform engineering is different. The codification of infrastructure is still slowly progressing across industries and within companies. Even where infrastructure is codified and under version control, a basic prerequisite for any form of testing, the testing practices for infrastructure changes are ill-defined, and as a result the risk associated with any individual change is high. To make matters worse, not every attribute in a configuration change is created equal. Some attribute updates, omissions, and additions actually trigger the replacement of resources, which can have serious implications for dependent resources, the applications they serve, and their users.
Vulnerabilities in infrastructure are therefore more costly to patch today. Platform engineers nervously eyeball each change individually as their best bet for avoiding calamity when deploying. This practice will not scale, but the nascent platform engineering community has not yet defined best practices.
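One way to reduce the reliance on eyeballing is to flag resource replacements automatically before a change is applied. The sketch below assumes the infrastructure is managed with Terraform and that a plan has been exported with `terraform show -json`, whose output lists each resource under `resource_changes` with a `change.actions` array. The function name `find_replacements` and the sample plan are hypothetical and only illustrate the idea.

```python
def find_replacements(plan: dict) -> list[str]:
    """Return the addresses of resources the plan would destroy and recreate."""
    replaced = []
    for resource in plan.get("resource_changes", []):
        actions = resource.get("change", {}).get("actions", [])
        # A replacement shows up as both "delete" and "create", in either order.
        if "delete" in actions and "create" in actions:
            replaced.append(resource["address"])
    return replaced


# Hypothetical, heavily trimmed plan used only for illustration.
# In practice this would be loaded with json.load() from the exported plan file.
sample_plan = {
    "resource_changes": [
        {"address": "aws_security_group.web", "change": {"actions": ["update"]}},
        {"address": "aws_db_instance.main", "change": {"actions": ["delete", "create"]}},
        {"address": "aws_s3_bucket.logs", "change": {"actions": ["no-op"]}},
    ]
}

for address in find_replacements(sample_plan):
    print(f"WARNING: {address} will be replaced")
```

A check like this could run in CI and require an explicit approval whenever the list is non-empty, so that reviewers spend their attention on the changes most likely to affect dependent resources.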
Self-serve leads to roll-out problems
Platform engineering, or platform as a product as it is often known, is about providing a paved path for the user and enabling developer self-service. Some changes to the platform are rolled out by the platform engineers themselves, including changes to the configuration of the infrastructure across the company's instances. However, other changes depend on application development teams adopting and installing them, such as new Helm charts, Terraform modules, deployment orbs/plugins, instrumentation libraries and many more.
The security posture of the organization changes across these versions, and update lags in just a few teams could create problems for the whole security program that remain unknown to Application Security.
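Making this lag visible can start with something very simple. The sketch below assumes an inventory that maps each team to the version of a platform component it has pinned, and reports the teams below a minimum version that contains a security fix. The team names, version numbers, and the `lagging_teams` helper are all hypothetical, and the parser only handles plain dotted numeric versions.

```python
def parse_version(version: str) -> tuple[int, ...]:
    """Turn '2.1.3' or 'v2.1.3' into (2, 1, 3) so versions compare numerically."""
    return tuple(int(part) for part in version.lstrip("v").split("."))


def lagging_teams(pins: dict[str, str], minimum: str) -> list[str]:
    """Return the teams whose pinned version is older than the minimum."""
    floor = parse_version(minimum)
    return sorted(team for team, pinned in pins.items() if parse_version(pinned) < floor)


# Hypothetical inventory: the version of a shared module each team has pinned.
module_pins = {
    "payments": "1.4.0",
    "search": "2.1.3",
    "checkout": "2.0.0",
}

# Assume, for illustration, that 2.0.0 is the first release with the fix.
print(lagging_teams(module_pins, minimum="2.0.0"))
```

How the inventory is gathered depends on the platform. It might come from scanning repositories for pinned chart or module versions, which is outside the scope of this sketch.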
Summary
As a result of these challenges, platform engineers need tooling to find misconfigurations early in their SDLC, as well as assistance in going beyond testing to improve the design of their paved paths.
Platform engineers will have immense leverage over the security posture of their organizations and more generally over the multi-vendor, SaaS-driven world of software. Emerging best practices and investments in platform maturity will eliminate and mitigate security risks, while also enabling application developers to fix application vulnerabilities, at a lower cost and higher speed than ever before.