KubeCon Barcelona 2019 was a blast, with lots of interesting talks, presentations and panel discussions. The Replex team was there in force to soak it all up, attending a number of talks and networking with community members and stakeholders from across the wider Kubernetes ecosystem. In this blog post, we review some of the more interesting talks that the Replex team attended.
The conference officially kicked off with the keynote address from Dan Kohn, Executive Director of CNCF. Dan Kohn tackled the question of why Kubernetes has emerged as the go-to container orchestration tool from the plethora of options available.
The presentation starts out by defining cloud-native as the convergence of three trends: Orchestration, Containerization and Microservices.
Dan points to a number of instances from recent history where roughly similar technologies emerged at the same time but only some won out. The rise of Kubernetes mirrors this pattern: it has pulled ahead of a large number of alternative container orchestration solutions already on the market.
Dan identifies three reasons behind this. The first is that it simply works better, having incorporated ideas and brainpower from such titans of technology as Google, Huawei and Red Hat. The second is that Kubernetes is vendor neutral and open source; he credits Google with opening up the Kubernetes project to developers from other companies and giving them a stake in its continuous improvement. The third is the developer community that has sprung up around Kubernetes, which has contributed crucial improvements to the codebase.
The talk looks at Kubernetes access control as it exists now and identifies instances where access control tools are not sufficient or where access control is not properly set up. It also identifies issues with the namespace permission model, points out possible alternatives to it and outlines how pod network policies, the Open Policy Agent (OPA) and custom controllers get around some access control issues.
The talk identifies three broad categories of Kubernetes access control. The first is runtime characteristics which define what can run in Kubernetes and in what way. Next is network access which controls what can communicate with what. The third category deals with what can access the Kubernetes control and data planes.
The tools that are currently available to manage access control in Kubernetes include RBAC, network policies, custom Kubernetes API gateways, admission webhooks and OPA (a third-party tool). The talk takes a deep dive into each of these and identifies some limitations of each.
The talk then identifies two approaches to Kubernetes access control: managing access control within Namespaces through granular sub-namespace permissions, or splitting Kubernetes clusters into many Namespaces with smaller footprints and using RBAC to manage access.
The talk ends by presenting a list of recommendations to handle access control in Kubernetes. These include mapping namespaces to individual services, restricting pod traffic between namespaces with a network policy and using multiple controllers or service accounts to operate on objects across Namespaces. Using a well-thought-out RBAC policy and giving out access on an as-needed basis is another recommendation. It is also recommended to keep the RBAC permission footprint broad rather than too narrow. For more granular access control beyond RBAC, tools like OPA and admission controllers should be preferred.
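The recommendation to restrict pod traffic between namespaces can be sketched as a NetworkPolicy manifest. The Python snippet below is a minimal illustration and not something shown in the talk: the function name, the policy name `allow-same-namespace` and the `payments` namespace are made up for the example. It builds the manifest as a plain dictionary and prints it as JSON, which `kubectl apply -f` accepts alongside YAML.

```python
import json


def same_namespace_only_policy(namespace: str) -> dict:
    """Build a NetworkPolicy that admits ingress only from pods in the same namespace."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        # The policy name is a hypothetical choice for this example.
        "metadata": {"name": "allow-same-namespace", "namespace": namespace},
        "spec": {
            # An empty podSelector selects every pod in the namespace.
            "podSelector": {},
            "policyTypes": ["Ingress"],
            # A podSelector without a namespaceSelector only matches pods
            # in the policy's own namespace, so other namespaces are shut out.
            "ingress": [{"from": [{"podSelector": {}}]}],
        },
    }


# "payments" is a placeholder namespace.
print(json.dumps(same_namespace_only_policy("payments"), indent=2))
```

Note that a policy like this only takes effect on clusters whose network plugin enforces NetworkPolicy objects.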
This talk, presented by Tasha Drew from VMware, was meant as an introduction to the Kubernetes multi-tenancy working group. The multi-tenancy working group was formed under the Kubernetes Auth SIG at KubeCon North America in 2017.
The talk starts off by defining the concept of multi-tenancy in Kubernetes and laying out some of the reasons why companies want multi-tenant architectures. In the context of Kubernetes, multi-tenant clusters are ones that share cluster resources among many different users, teams, applications or customers.
Organizations tend to prefer multi-tenant clusters to reduce sprawl and improve resource utilization. Multi-tenant clusters also reduce the operational overhead and cost of managing multiple control planes and software stacks. The talk lays out the concept of a tenant in Kubernetes as “...an entity representing a group of resources with defined accessibility and quota.”
It also identifies two multi-tenancy models in Kubernetes: soft multi-tenancy and hard multi-tenancy. Soft multi-tenant configurations assume a higher degree of trust between tenants, who are trusted to behave as good actors. Hard multi-tenant configurations assume zero trust between users of a shared cluster. It is possible to achieve soft multi-tenancy in Kubernetes today, although some feature gaps still need to be filled. There are also no standard soft or hard multi-tenancy configurations that are agreed upon by the Kubernetes community.
The working group is looking to identify new user stories and use-cases. Current and future work includes developing consensus around the definitions for a secure single tenant cluster, a secure soft multi-tenant cluster and a secure cluster that uses existing features among others.
This talk was meant as an introduction to the OPA Gatekeeper project, a joint project of Microsoft, Google, and Styra. OPA Gatekeeper is a customizable Kubernetes admission webhook that helps enforce policies and strengthen governance for Kubernetes.
The Gatekeeper project is aimed at allowing organizations running multiple Kubernetes clusters to control what end-users can do on those clusters. OPA Gatekeeper also helps organizations implement and integrate policies that are based on best practices or regulatory requirements. It does all this while still giving users the flexibility and agility that are a hallmark of Kubernetes.
The Gatekeeper project has its genesis in kube-mgmt, first open sourced by Styra. The latest iteration, v3, was released during the current conference (KubeCon Barcelona) and introduces a new concept called constraints. Constraints are snippets of policy that can be declared on the API server and enforced, and they can be easily configured and parameterized. Constraint templates provide the source code for constraints and can be developed in-house or sourced from the community. Templates also make it easier for teams to cooperate on, share, test and configure constraints.
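The split between templates and constraints can be pictured with a small Python analogy. This is only a sketch of the idea and not Gatekeeper's API: real constraint templates carry Rego policy and are declared as Kubernetes resources, and every name below (`ConstraintTemplate`, `Constraint`, `required-labels`, the `owner` label) is hypothetical. The template holds the reusable policy logic, and a constraint is one parameterized use of it.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ConstraintTemplate:
    """Reusable policy logic; parameters are supplied later by a constraint."""
    name: str
    check: Callable[[dict, dict], list]  # (object, parameters) -> violation messages


@dataclass
class Constraint:
    """One parameterized instance of a template."""
    template: ConstraintTemplate
    parameters: dict

    def violations(self, obj: dict) -> list:
        return self.template.check(obj, self.parameters)


def _missing_labels(obj: dict, params: dict) -> list:
    # Report every required label that the object does not carry.
    labels = obj.get("metadata", {}).get("labels", {})
    return [f"missing required label: {label}"
            for label in params["labels"] if label not in labels]


# The template is written once; different teams can reuse it with their own parameters.
required_labels = ConstraintTemplate(name="required-labels", check=_missing_labels)
must_have_owner = Constraint(template=required_labels, parameters={"labels": ["owner"]})

namespace = {"metadata": {"name": "team-a", "labels": {}}}
print(must_have_owner.violations(namespace))
```

Keeping the logic in the template and the parameters in the constraint is what makes the policy shareable: the same template can back many constraints with different label lists.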
The Gatekeeper project is already in Alpha and is currently on the lookout for issues, feedback and use-cases from end users. Future work on the project aims to extend the packaging, reusability and declaration to mutating webhooks, incorporate external data into the system, expand the audit functionality, generate metrics and expand developer tooling.
This talk was presented by Karo and Beata, both of whom work on VPA and are involved with the Kubernetes autoscaling SIG. The talk focuses on the current limitation in Kubernetes where running pods cannot be resized without first being restarted. This means that pod resources, both CPU and memory, cannot be updated in-place and pods have to be restarted for the new resource values to take effect.
In Kubernetes, resource requests are static and stay fixed for the entire lifetime of the workload. This does not match the real world, where resource demands vary over time. Many factors can drive this variation, from daily, weekly or seasonal patterns to popularity or lifecycle changes.
The VPA allows pods to scale vertically. It can be run in three modes: Off, where it only recommends resource requests and limits; Initial, where it updates resource requests for newly created pods; and Auto, where it updates resource requests for running pods.
Pods have to be restarted for resource requests to be updated because the pod spec, where resource requests and limits are specified, is immutable. Updating pod resources without a restart requires code changes to a number of key Kubernetes components. These include the Scheduler, Kubelet and Quota.
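The three modes described above can be summarized in a few lines of Python. This is a hedged sketch of the behavior as the talk describes it, not VPA code: the `UpdateMode` enum and the `applies_recommendation` function are names invented for the example.

```python
from enum import Enum


class UpdateMode(Enum):
    OFF = "Off"
    INITIAL = "Initial"
    AUTO = "Auto"


def applies_recommendation(mode: UpdateMode, pod_is_running: bool) -> bool:
    """Return True if the recommended resource requests are applied to the pod."""
    if mode is UpdateMode.OFF:
        # Off: recommendations are computed but never applied.
        return False
    if mode is UpdateMode.INITIAL:
        # Initial: requests are set only when a pod is created.
        return not pod_is_running
    # Auto: new pods and running pods are both updated. As the talk notes,
    # updating a running pod currently means restarting it.
    return True


for mode in UpdateMode:
    print(mode.value, applies_recommendation(mode, pod_is_running=True))
```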
The presenters point out that implementing these changes would require the involvement of multiple SIGs. A KEP (Kubernetes Enhancement Proposal) has already been set up to coordinate these changes and is pending approval.
The changes required would involve making the resource requirement field mutable, as well as updating the resource allocations section and the resources allocated to refer to the vetted state and the actual state. Two new pod conditions, PodResizing and PodResizeSuccess, would also be introduced, as well as two new knobs, PodSpec.Container.ResizePolicy and PodSpec.RetryPolicy. The VPA will also be complemented by two new update modes: in-place only and in-place preferred.
This talk from Rob Scott identifies a number of challenges with developing and implementing effective RBAC policies in Kubernetes. Ineffective RBAC policies can result in a much broader permission footprint than necessary and can compromise security. The talk also outlines best practices and open source tools geared towards making RBAC simpler to manage for organizations.
Authorization systems often tend to be either way too simple or way too complex, providing either too little granularity or too much of it. The way organizations handle authorization also plays a part in ineffective authorization policies. Most start off with highly granular authorization regimes. However, these regimes slacken over time as multiple hacks are introduced to get the system to run properly.
The talk also introduces a new open source tool from ReactiveOps called RBAC Manager. RBAC Manager does for roles and role bindings what Deployments do for pods. It makes Kubernetes RBAC definitions more concise, automates their management and makes RBAC easier to handle in ephemeral environments.
The talk then goes on to outline some best practices when working with Kubernetes RBAC. It recommends following the principle of least privilege, making effective use of Kubernetes namespaces by making them granular enough to support the RBAC policy, centralizing configuration by grouping RBAC configuration in one place and using default roles as much as possible.
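One way to act on the principle of least privilege is to audit roles for wildcard rules. The Python sketch below is an illustration only and is not taken from the talk or from RBAC Manager: the function name and both example roles are made up, though the `apiGroups`, `resources` and `verbs` fields follow the shape of a Kubernetes Role.

```python
def overly_broad_rules(role: dict) -> list:
    """Return the rules of a Role or ClusterRole that use the '*' wildcard."""
    broad = []
    for rule in role.get("rules", []):
        # A wildcard in any of these fields grants far more than a named entry.
        fields = (rule.get("apiGroups", [])
                  + rule.get("resources", [])
                  + rule.get("verbs", []))
        if "*" in fields:
            broad.append(rule)
    return broad


# Hypothetical roles for illustration.
pod_reader = {
    "kind": "Role",
    "metadata": {"name": "pod-reader", "namespace": "team-a"},
    "rules": [{"apiGroups": [""], "resources": ["pods"],
               "verbs": ["get", "list", "watch"]}],
}
do_anything = {
    "kind": "Role",
    "metadata": {"name": "do-anything", "namespace": "team-a"},
    "rules": [{"apiGroups": ["*"], "resources": ["*"], "verbs": ["*"]}],
}

print(len(overly_broad_rules(pod_reader)), len(overly_broad_rules(do_anything)))
```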
This talk from Eddie Zaneski tackles day 2 issues on Kubernetes. It outlines the process of setting up monitoring and logging for a Kubernetes cluster from scratch. These pipelines leverage open source tooling to gain visibility into Kubernetes clusters.
The presentation is split up into two parts: monitoring and logging. In the monitoring section, the talk first identifies what needs to be monitored. It outlines metrics on the Node, Pod and application layer as well as ones for the core control plane components running in the kube-system namespace.
Prometheus and Grafana are identified as the main open source tools that can be used to set up this monitoring pipeline. Prometheus is a systems and service monitoring system that works on a pull model: it reaches out to the systems it monitors and scrapes metrics from them. Grafana visualizes these metrics in user-friendly dashboards with information-rich graphs.
In the second section of the talk, tools that can be used for logging Kubernetes clusters are identified. Logging tools can be split up into two broad categories: shippers and ingesters. The talk recommends generating and capturing logs for every event and writing them to standard output (stdout). An open source logging stack for Kubernetes would include Elasticsearch, Fluentd and Kibana. This stack can be complemented with Loki, a new logging tool from Grafana that is optimized for Prometheus and Kubernetes.
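To make the scraping step concrete, the Python sketch below parses a hand-written sample in the Prometheus text exposition format, which is what a scrape of a target's metrics endpoint returns. It is a simplified illustration under stated assumptions: no network call is made, the sample metrics are invented, the function name is hypothetical, and the parser assumes each sample line has no trailing timestamp.

```python
def parse_samples(exposition: str) -> dict:
    """Map each series (metric name plus labels) to its sample value."""
    samples = {}
    for line in exposition.splitlines():
        line = line.strip()
        # Skip blank lines and the "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        # The value is the last space-separated field on the line
        # (assumes no optional timestamp follows it).
        series, value = line.rsplit(" ", 1)
        samples[series] = float(value)
    return samples


# Invented sample standing in for a scraped response body.
SCRAPED = """\
# HELP http_requests_total Total HTTP requests.
# TYPE http_requests_total counter
http_requests_total{method="get",code="200"} 1027
http_requests_total{method="post",code="400"} 3
process_cpu_seconds_total 12.5
"""

for series, value in parse_samples(SCRAPED).items():
    print(series, value)
```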
This talk, given by Amy Chen and Eryn Muetzel, takes a deep dive into Kubernetes Namespaces. It outlines the common ways organizations use namespaces today and walks through the process of handling identity, resources and security in the context of Namespaces.
Kubernetes Namespaces are virtual partitions of Kubernetes clusters. The way organizations make use of Namespaces has implications for identity, efficiency and security posture. Namespaces abstract workloads away from the physical cluster, which increases developer productivity since developers no longer have to worry about managing the cluster lifecycle.
The talk identifies Namespaces as a useful way to organize and group workloads. One common way of doing this is to create a separate Namespace for each microservice that makes up an application. Organizations also split up clusters into separate Namespaces for individual teams. Another common way to split up clusters is by environment, e.g. having separate namespaces for the development and production environments. Organizations also use more granular and complex Namespace groupings which combine teams, applications and environments.
The talk also outlines key policy types that should be considered when working with Namespaces:
It also identifies the Kubernetes resources for which quotas can be enforced on a Namespace level. These include CPU, memory, storage and objects.
In addition to these, admission controllers can also be used to enforce individual policies across clusters. OPA is a general purpose policy engine that can help make policy definition and enforcement consistent across clusters. OPA makes policy enforcement easier and quicker by avoiding manual tracing and editing of policy decision trees.
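The arithmetic behind a namespace-level quota is simple: a new workload is admitted only if current usage plus its request stays within the hard limit for every quota-controlled resource. The Python sketch below illustrates that check under simplifying assumptions: the function name is hypothetical, the numbers are invented, and quantities are plain integers (CPU in millicores, memory in MiB) rather than Kubernetes quantity strings.

```python
def fits_quota(hard: dict, used: dict, requested: dict) -> bool:
    """True if `requested` can be admitted without any resource exceeding `hard`."""
    return all(
        used.get(name, 0) + amount <= hard[name]
        for name, amount in requested.items()
        if name in hard  # resources without a quota are not constrained
    )


# Invented numbers: CPU in millicores, memory in MiB, pods as an object count.
hard = {"requests.cpu": 4000, "requests.memory": 8192, "pods": 10}
used = {"requests.cpu": 3500, "requests.memory": 2048, "pods": 4}

small_pod = {"requests.cpu": 250, "requests.memory": 512, "pods": 1}
large_pod = {"requests.cpu": 750, "requests.memory": 512, "pods": 1}

print(fits_quota(hard, used, small_pod))  # 3500 + 250 stays within 4000
print(fits_quota(hard, used, large_pod))  # 3500 + 750 exceeds 4000
```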
This panel discussion brings together experts and members of the Kubernetes multi-tenancy working group for a deep dive into the concepts surrounding multi-tenant Kubernetes. It covers already existing solutions for multi-tenancy, dives into the concepts of soft and hard multi-tenant models and looks at new features being developed by the community.
The talk kicks off by conceding that Kubernetes does not formally support the concept of multi-tenancy as of now. There are a number of custom vendor solutions for multi-tenancy in Kubernetes, but these do not translate well across clusters from different vendors.
The panel identified the degree of trust between tenants as the main differentiating factor between soft and hard multi-tenant models. Environments with a higher degree of trust between tenants can use a soft multi-tenant configuration, while those with a lower degree of trust require a hard multi-tenant configuration.
In terms of real-world multi-tenant configurations, Paul (cloud engineer at Singular) shared that they decided against a hard multi-tenant configuration with separate clusters for each team. This decision was driven by considerations of resource usage optimization and ease of operations by having a single control plane.
Elaborating further on the design, he shared that they provisioned separate Namespaces for each tenant. They also put in place access controls so that tenants would not be aware of other tenants using the same cluster.
Erica (Software Engineer at Red Hat) outlined the multiple approaches to multi-tenancy they employ at OpenShift. Soft multi-tenancy is accomplished by having multiple namespaces or security and access control mechanisms. Hard multi-tenancy involves spreading tenants across separate clusters.
One common use case that Tasha (Product Manager at VMware) sees with VMware customers is the ability to share Kubernetes clusters between teams and departments. This is further complicated by the demand for tools to restrict access and isolate the tenants as well as provide a similar toolset to all tenants.
Sanjeev Rampal (Principal Engineer at Cisco) also outlined four potential approaches to multi-tenancy that have been a part of the working group discussions;
The working group aims to define and standardize both soft and hard multi-tenant configurations. These working configurations would then be shared with the community via the bug bounty program, followed by a process of integration and iteration on feedback received from security researchers.
The panel also identified eBPF, Cilium and OPA as some of the more interesting upcoming solutions that could potentially tackle some of the issues around multi-tenant Kubernetes setups.
Replex is the central platform for Kubernetes governance and cost management for the cloud-native enterprise. It works across multiple abstraction layers and technologies to help enterprises control and optimize costs, allocate and showback costs for individual teams or environments and enforce granular policies.