The service mesh era: Securing your environment with Istio
Welcome to the third installment of our series on the Istio service mesh. So far, we’ve discussed the benefits of using a service mesh like Istio and also demonstrated how to deploy applications and manage traffic. In this post we’ll look at something that keeps IT professionals up at night: security.
As we’ve discussed in our previous posts, we’re seeing accelerated adoption of containers, Kubernetes, and microservices, driven by the desire to increase developer productivity, deployment velocity, and operational scalability. The adoption of these technologies and paradigms results in a dynamic production environment where containerized workloads are deployed to a pool of hosts (or VMs) and are typically ephemeral. Further, there is a significant increase in the network surface area within the perimeter as microservices are exposed as network endpoints. Lastly, IP-based network flow logs are no longer sufficient to demonstrate compliance to internal and external stakeholders. Thus, there is a need to reconsider the traditional approach to network security for such environments.
One of Istio’s more important value propositions, then, is how it can effectively secure your modern production environments, without sacrificing developer productivity.
Given the proliferation of threats within the production network and the increased points of privileged access, it is increasingly necessary to adopt a zero-trust network security approach for microservices architectures. This approach requires that all accesses are strongly authenticated, authorized based on context, logged, and monitored, and that the controls are optimized for dynamic production environments.
Istio on Google Kubernetes Engine (GKE) helps with these security goals in a few ways.
It provides defense in depth; it layers on top of your existing layer 3 network security controls to provide an independent layer of network security. It provides the foundation for implementing zero-trust network security, where trust and access are determined by strongly authenticated peer identities and additional context of the request rather than by presence inside the same network perimeter. It enables you to demonstrate compliance using access logs that capture service identities and layer 7 attributes rather than just 5-tuple information. Finally, it lets you configure this security by default: you don’t need to change your application code or infrastructure to turn it on.
The best way to demonstrate the value of the Istio security layer is to show it in action. Specifically, let’s look at how Istio on GKE can help you adopt a zero-trust security approach through authentication—who a service or user is, and whether we can trust that they are who they say they are—and authorization—what specific permissions this user or service has. Together, these protect your environment from security threats like access using stolen credentials and replay attacks, and keep your sensitive data safe. As you read, you can follow along with this hands-on demo.
Authentication with mutual TLS
One of the anti-patterns with microservices authentication is to rely on a bearer token, e.g. a JWT, to authenticate a peer. Bearer tokens can be stolen (from the source, from the destination, or through man-in-the-middle attacks) and replayed, enabling lateral movement of threats and privilege escalation.
An approach to mitigate this risk is to ensure that peers are only authenticated using non-portable identities. Mutual TLS authentication (mTLS) ensures that peer identities are bound to the TLS channel and cannot be replayed. It also ensures that all communication is encrypted in transit, and mitigates the risk of man-in-the-middle attacks and replay attacks by the destination service. While mutual TLS helps strongly identify the network peer, end user identities (or identity of origin) can still be propagated using bearer tokens like JWT.
While mTLS is an important security tool, it’s often difficult and time-consuming to manage. To start, you have to create, distribute, and rotate keys and certificates to a large number of services. You then need to ensure you are properly implementing mTLS on all of your clients and servers. And when you adopt a microservices architecture, a manual approach is hard to scale to an increasing number of services. Finally, traditional X.509 certificates used for TLS identify domains and are not optimized for authenticating workloads.
Istio on GKE supports mTLS and can help ease many of these challenges. It automates key and certificate management, including generation, distribution, and rotation, and its certificates identify the workload using a Service Identity (vs. the host or domain). Istio uses the Envoy sidecar proxy to enforce mTLS and requires no code changes to implement. The approach of using Service Identities enables workload portability across clusters and clouds with no changes to the access control policies. You can easily enable Istio mTLS on GKE today, by choosing an mTLS option from a simple dropdown menu.
Permissive mode is the default. It allows services in your mesh to accept both mTLS-authenticated and non-mTLS traffic. In this mode, existing clients that are not enabled for mTLS can continue accessing the service while mTLS is incrementally rolled out across your environment. Istio clients can be configured to enable mTLS by changing the destination rule. The objective should be to lock down a port to only mTLS-enabled clients over time using the strict mTLS mode.
When you select strict mTLS mode, Istio on GKE enforces mTLS for all accesses to services; all calls are encrypted and authenticated based on the certificate-based identity. While this is an ideal end state, you need to ensure that all clients to the service are mTLS-enabled; otherwise, you may break your existing application.
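To make the difference between the two modes concrete, here is a minimal sketch in Python. It is a toy model of the accept/reject behavior described above, not Istio's implementation; the names `MtlsMode`, `accepts`, and `clients_blocked_by`, and the sample client names, are hypothetical.

```python
from enum import Enum
from typing import Dict, List


class MtlsMode(Enum):
    PERMISSIVE = "PERMISSIVE"
    STRICT = "STRICT"


def accepts(mode: MtlsMode, client_uses_mtls: bool) -> bool:
    """Return True if a service in `mode` would accept the connection."""
    if client_uses_mtls:
        # mTLS traffic is accepted in both modes.
        return True
    # Plaintext traffic is only accepted while the service is permissive.
    return mode is MtlsMode.PERMISSIVE


def clients_blocked_by(mode: MtlsMode, clients: Dict[str, bool]) -> List[str]:
    """List the clients (name -> uses mTLS?) that `mode` would reject."""
    return sorted(
        name for name, uses_mtls in clients.items() if not accepts(mode, uses_mtls)
    )


# Hypothetical inventory of clients calling one service.
clients = {"frontend": True, "legacy-batch": False, "checkout": True}
print(clients_blocked_by(MtlsMode.PERMISSIVE, clients))  # []
print(clients_blocked_by(MtlsMode.STRICT, clients))      # ['legacy-batch']
```

Running a check like this against your own client inventory before switching a service to strict mode is one way to spot the callers that would break.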
Many organizations choose to first enable permissive mTLS for the entire namespace, and then transition to strict mode on a service-by-service or even port-by-port basis. You can also override client-side defaults with destination-specific rules. This is one of the major benefits of Istio—it lets you incrementally adopt mTLS, or turn it on and off for your whole mesh. This incremental adoption model lets you implement the security features of mTLS without breaking existing applications.
To enable mTLS incrementally, you first need a Policy for inbound traffic and a DestinationRule for outbound traffic. The YAML and instructions you need to do it are here. Enabling mTLS for all services in a namespace is a very similar process. Just set up another Policy and DestinationRule, this time for the full namespace, then apply them.
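As a rough illustration of what the namespace-wide pair might look like, the Python sketch below builds both resources as plain dictionaries and prints them as JSON, which kubectl accepts alongside YAML. It assumes the `authentication.istio.io/v1alpha1` Policy and `networking.istio.io/v1alpha3` DestinationRule APIs from the Istio releases this post targets (newer Istio releases use different resource kinds), the default `cluster.local` cluster domain, and a hypothetical namespace called `demo`. The function names are ours, and the demo's own YAML may differ.

```python
import json
from typing import Any, Dict

Manifest = Dict[str, Any]


def namespace_mtls_policy(namespace: str, strict: bool = False) -> Manifest:
    """Inbound side: accept mTLS for every service in the namespace."""
    # An empty mtls block means strict; PERMISSIVE also allows plaintext.
    mtls = {} if strict else {"mode": "PERMISSIVE"}
    return {
        "apiVersion": "authentication.istio.io/v1alpha1",
        "kind": "Policy",
        # A namespace-wide Policy must be named "default".
        "metadata": {"name": "default", "namespace": namespace},
        "spec": {"peers": [{"mtls": mtls}]},
    }


def namespace_destination_rule(namespace: str) -> Manifest:
    """Outbound side: clients use Istio-issued certs for hosts in the namespace."""
    return {
        "apiVersion": "networking.istio.io/v1alpha3",
        "kind": "DestinationRule",
        "metadata": {"name": "default", "namespace": namespace},
        "spec": {
            # Assumes the default cluster domain, cluster.local.
            "host": f"*.{namespace}.svc.cluster.local",
            "trafficPolicy": {"tls": {"mode": "ISTIO_MUTUAL"}},
        },
    }


def as_list(*items: Manifest) -> Manifest:
    """Wrap several manifests in a v1 List so they can be applied together."""
    return {"apiVersion": "v1", "kind": "List", "items": list(items)}


# "demo" is a hypothetical namespace name.
bundle = as_list(namespace_mtls_policy("demo"), namespace_destination_rule("demo"))
print(json.dumps(bundle, indent=2))
```

The output could be piped to `kubectl apply -f -`. Starting with `strict=False` and flipping it later mirrors the permissive-then-strict rollout described above.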
With mTLS enabled, you now have a strongly authenticated peer identity that can be used for access control (authorization). You can also rely on additional context such as the end user (also known as origin) identity for granting access. Istio can validate JSON Web Tokens (JWTs) so that you can safely build authorization policies that rely on authenticated claims in the token, such as the end user identity, in addition to authenticated channel attributes. You can also see this capability in the demo.
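A per-service policy that combines peer (mTLS) and origin (JWT) authentication might be sketched as follows. This again assumes the `authentication.istio.io/v1alpha1` Policy API. The service name, namespace, issuer, and JWKS URL are hypothetical placeholders rather than values from the demo.

```python
import json
from typing import Any, Dict


def jwt_policy(service: str, namespace: str, issuer: str, jwks_uri: str) -> Dict[str, Any]:
    """Require mTLS from the peer and a valid JWT from the end user."""
    return {
        "apiVersion": "authentication.istio.io/v1alpha1",
        "kind": "Policy",
        "metadata": {"name": f"{service}-jwt", "namespace": namespace},
        "spec": {
            # Apply only to this service rather than the whole namespace.
            "targets": [{"name": service}],
            # Peer authentication: strict mTLS.
            "peers": [{"mtls": {}}],
            # Origin authentication: tokens are verified against the
            # issuer's published keys.
            "origins": [{"jwt": {"issuer": issuer, "jwksUri": jwks_uri}}],
            # Use the end user identity from the token as the request principal.
            "principalBinding": "USE_ORIGIN",
        },
    }


# All four arguments are hypothetical example values.
policy = jwt_policy(
    service="orders",
    namespace="demo",
    issuer="https://issuer.example.com",
    jwks_uri="https://issuer.example.com/.well-known/jwks.json",
)
print(json.dumps(policy, indent=2))
```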
Authorization tools to protect your data
Another key component to building a zero-trust network security posture is to ensure that access to sensitive data is only granted to authorized clients and users.
Istio Authorization, which is built on Kubernetes role-based access control (RBAC), provides access control for the services in your mesh based on multiple attributes in the request context. With Istio authorization, you can constrain who can access a service endpoint based on the certificate-based identity of the peer, as well as claims in a JWT. Further, Istio authorization is a layer 7 policy and can be used to grant specific permissions based on the URL.
At its most basic, Istio RBAC maps subjects to roles. An Istio authorization policy involves defining groups of permissions for accessing services (the ServiceRole specification), and then determining which users, groups, and services get those access permissions (the ServiceRoleBinding specification). A ServiceRole contains a list of permissions, while a ServiceRoleBinding assigns a specific ServiceRole to a list of subjects.
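The sketch below shows one way the ServiceRole and ServiceRoleBinding pair could fit together, assuming the `rbac.istio.io/v1alpha1` API. The role name, service host, path, and service account principal are hypothetical examples, and the helper functions are ours.

```python
import json
from typing import Any, Dict, List

Manifest = Dict[str, Any]


def service_role(name: str, namespace: str, service_host: str,
                 methods: List[str], paths: List[str]) -> Manifest:
    """A named group of permissions: which methods and paths on which service."""
    return {
        "apiVersion": "rbac.istio.io/v1alpha1",
        "kind": "ServiceRole",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "rules": [
                {"services": [service_host], "methods": methods, "paths": paths}
            ]
        },
    }


def service_role_binding(name: str, namespace: str, role_name: str,
                         principals: List[str]) -> Manifest:
    """Grant the named ServiceRole to a list of mTLS-authenticated identities."""
    return {
        "apiVersion": "rbac.istio.io/v1alpha1",
        "kind": "ServiceRoleBinding",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "subjects": [{"user": principal} for principal in principals],
            "roleRef": {"kind": "ServiceRole", "name": role_name},
        },
    }


# Hypothetical example: only the frontend's service account may read orders.
role = service_role(
    "orders-viewer", "demo", "orders.demo.svc.cluster.local", ["GET"], ["/orders/*"]
)
binding = service_role_binding(
    "bind-orders-viewer", "demo", "orders-viewer",
    ["cluster.local/ns/demo/sa/frontend"],
)
print(json.dumps([role, binding], indent=2))
```

Because the subject is the peer's certificate-based identity and the rule names methods and paths, the same pair covers both the "who" and the layer 7 "what" described above.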
When you’re configuring Istio authorization policies, you can specify a wide range of different groups of permissions, and grant access to them at the level that makes sense, down to the user level. The demo shows how this structure makes it easy to enable authorization on an entire namespace by applying the authorization policies to the cluster.
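Turning authorization on for a namespace could look roughly like the following. The sketch assumes the cluster-scoped `ClusterRbacConfig` resource from `rbac.istio.io/v1alpha1`; the demo may use a different resource kind depending on its Istio release, and `demo` is a hypothetical namespace name.

```python
import json
from typing import Any, Dict, Iterable


def cluster_rbac_config(namespaces: Iterable[str]) -> Dict[str, Any]:
    """Enable Istio authorization only for the listed namespaces."""
    return {
        "apiVersion": "rbac.istio.io/v1alpha1",
        "kind": "ClusterRbacConfig",
        # The resource is a cluster-wide singleton named "default".
        "metadata": {"name": "default"},
        "spec": {
            "mode": "ON_WITH_INCLUSION",
            "inclusion": {"namespaces": list(namespaces)},
        },
    }


config = cluster_rbac_config(["demo"])
print(json.dumps(config, indent=2))
```

Once authorization is enabled for a namespace, requests to its services are denied unless a ServiceRole and ServiceRoleBinding grant access, so it is worth creating those policies first.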
Network access logging with service level information
With these features set up, you can also address an increasingly important aspect of security: demonstrating to both internal and external stakeholders that all services and accesses are in compliance with required network security policies. Istio on GKE’s robust logging and metrics collection features can help provide this.
We hope this tour of Istio's security features demonstrated how Istio makes it easier for you to implement and manage a comprehensive microservices security strategy that makes sense for your organization.
To try out the Istio security features we discussed here, head over to the demo. In our next post, we’ll take a deep dive into observability, tracing, and SLOs using Istio and Stackdriver.