
Datica CKS Documentation

Getting started

What is CKS?

Everyone is excited about Kubernetes, including us! Datica is proud to offer a secure, compliant Kubernetes managed service we’re calling CKS.

CKS stands for “Compliant Kubernetes Service”. CKS is a managed Kubernetes service — a service that can be defined in a number of different ways. At a foundational level, a managed Kubernetes service assumes management of the control plane, a group of controllers that take care of routine tasks to ensure the desired state of the cluster matches the observed state. Because Kubernetes can be configured and deployed in a number of different ways, users are often confused about how to set up and manage a Kubernetes cluster.

It is because of this problem that managed Kubernetes service providers exist. The biggest players in this space are the three large cloud providers: Google Cloud Platform, Amazon Web Services, and Microsoft Azure. As Kubernetes adoption increases, so too does the need for a secure, compliant version of Kubernetes.

Although many of the existing managed Kubernetes services offer HIPAA eligibility, that doesn’t necessarily make them compliant, at least not out of the box. In order to ensure a cluster is defensibly compliant, a user would have to develop a clear definition of everything the product does to ensure compliance (in a signed legal agreement) as well as have a BAA in place with the service provider. The only other option is for the user to be responsible for the configuration of the underlying cluster, the operating system that runs the cluster, and the instances (servers) the operating system lives on.

The big cloud providers will provide sufficient descriptions of their product and possibly even sign a BAA, but gaps remain. Configuring logging, monitoring, intrusion detection, networking, an operating system, as well as a whole slew of infrastructure level configurations defeats the purpose of a managed service — the primary benefit of which is that a customer doesn’t have to worry about the aforementioned components or their intended configurations.

If you’re working within a regulated industry like healthcare you have very few options when it comes to using Kubernetes — either you manage the Kubernetes control plane, the operating system and the underlying infrastructure to maintain the flexibility required for compliance, or you risk falling out of compliance by using a managed service (because they don’t provide the configuration flexibility to achieve compliance). Datica is changing this.

This is where CKS comes in. In addition to managing the control plane, Datica also configures the underlying operating system (and manages it in perpetuity), configures the underlying instances — and installs, configures and manages a set of deployments required for compliance. These deployments include logging, monitoring, intrusion detection, vulnerability scanning and antivirus software.

In addition to the technical configuration, Datica assumes the liability for the complete stack in our BAA, something not available or possible with any other managed Kubernetes offering. CKS delivers a compliant managed Kubernetes service that functions like any other Kubernetes cluster, but one that comes pre-configured for compliance out of the box.

CKS architecture

CKS Architecture Overview

In this section we’ll review the CKS architecture and the various components outside of a user’s cluster that help Datica maintain compliance and security. Below we cover each component from the diagram above.

Cloud accounts


The two largest boxes in the diagram include both Datica’s Cloud Account and Customer’s Cloud Account labels. These correspond to each party’s existing cloud account and are intended to represent separate entities (Datica and a third party).


Inside the cloud account boxes there are two darker boxes that correspond to Kubernetes clusters. The one in Datica’s Cloud Account is called the Lighthouse Cluster and the one in the customer’s cloud account is called CKS. We name these clusters differently so it’s easier to discuss throughout this guide, but in reality the Lighthouse Cluster is a CKS cluster itself.

The Lighthouse Cluster is a centrally managed CKS cluster that provides Datica with a number of different tools to help manage compliance. Those include:

  • Core API: Datica’s authentication system that manages organizations, users, groups and ACLs;
  • Syndication: The command center for managing all CKS clusters including making software updates and receiving compliance state information;
  • Vault: Datica’s public key infrastructure responsible for managing encryption across all CKS clusters;
  • Compliance engine: Responsible for serving the Cloud Compliance Management System, including continuous compliance checks against the cluster’s running state;

Secure connection


In between the two clusters is a box representing the secure TLS connection between the Lighthouse and every customer cluster managed by Datica. This connection is required to receive real-time compliance information from each CKS cluster. The Lighthouse processes this information and feeds that back into our compliance model to determine if a customer’s cluster is compliant or not.

Inside a CKS Cluster


A standard CKS cluster comprises three controllers and three workers. Datica configures the cluster for high availability to avoid a single point of failure. The compliance deployments (logging, monitoring, intrusion detection, networking, and vulnerability scanning) consume roughly 8GB of memory on a single worker. However, some of these pods only need to live on a single worker.

Customers will have roughly 40GB of additional memory to allocate to their workloads. Of course, a CKS cluster can scale to an almost unlimited number of workers.

More information

Below is a brief slide deck on CKS. These slides go over why we built a Kubernetes offering and the support and services associated with this new product. In addition, they give an overview of the Datica managed deployments and the shared responsibility model (what you do vs. what we do). This slide deck is not intended to replace the rest of the getting started guide, but rather to reinforce the concepts we’ll discuss later on.

If you have questions while viewing the slides please read on throughout the rest of the guide as they will likely be answered in later sections.

To better understand CKS, it will help to briefly review Datica’s Legacy Platform product. In the world of cloud computing there are a number of different paths that lead to running and managing workloads. Whether you’re building a complex data processing application that performs sentiment analysis, or you’re a startup with a basic monolithic application that helps doctor offices schedule better — you will at some point make a decision around how you’re going to manage these pieces of software (and likely several times over throughout the course of their life cycles).

The path of least resistance to the cloud has traditionally involved utilizing a Platform as a Service (PaaS). A PaaS allows users to focus on application development, rather than the repetitive, lower-level tasks of managing workloads on a server. We’ve written extensively about PaaS offerings and encourage users to read through these articles.

In 2014 when Datica built its first product we did so with the assumption that PaaS was the future of application development and management in the cloud. While that turned out to be true to a certain extent, concepts like containerization and projects like Kubernetes have disrupted that thinking. Software has become too complex for PaaS. Organizations are growing and their use cases require more flexibility. As a result of this shift, Datica embarked on creating a compliant Kubernetes offering in late 2017.

Our goals for CKS were straightforward:

  • (For application developers) ensure CKS functions just as any other Kubernetes cluster.
  • Avoid vendor lock-in by offering CKS as a cross-cloud solution.
  • Map Kubernetes controls to HITRUST and fill in the gaps with managed deployments.
  • Ensure all CKS clusters are continuously compliant through real-time configuration monitoring.

From December 2017 to May 2018 this is what we focused on building. CKS officially became generally available on June 1. Since then we’ve been accepting new customers as well as helping current Legacy PaaS customers migrate.

In our list of goals, we’ve achieved the following so far:

Functionality: CKS functions just as any other Kubernetes cluster. 95% of what Datica adds in terms of compliance and security happens behind the scenes.

Architecture: We’ve architected CKS to avoid vendor lock-in. We currently work with AWS and we look forward to supporting other cloud service providers in the future.

HITRUST: We’ve mapped HITRUST controls to Kubernetes functionality and configuration while providing additional tooling for logging, monitoring, intrusion detection, vulnerability scanning and more as Datica managed deployments.

Continuous Compliance: We’ve also completed writing a set of verification checks that continuously monitor the running state of each cluster. This work will provide us with a base to build better visibility into the security and compliance of the system. This visibility will be released as a new product — Datica’s Cloud Compliance Management System (CCMS). The CCMS will function as a compliance dashboard for CKS clusters. We currently do not have a timeline for the CCMS dashboard. However, major backend functionality is in progress.

CKS vs. others

As you explore the CKS product, you may be wondering how it is differentiated from other Kubernetes managed services like Amazon Elastic Container Service for Kubernetes (EKS) or Google’s Kubernetes Engine (GKE). One of the common points of confusion we see when talking to users is compliance eligibility vs. compliance provability. Eligibility simply means that the underlying CSP will sign a BAA — a document outlining how PHI will be stored, transmitted and accessed. The end user is still responsible for making the service compliant through proper logging, monitoring, configuration and more.

CKS on the other hand comes pre-configured with logging, monitoring, intrusion detection, volume backups, network encryption and a managed OS, among other things. With CKS, users no longer have to configure or manage their cluster for compliance.

Where CKS fits

If you’re new to Datica, CKS will be home to all applications that store or transmit sensitive information, like PII or PHI. Because Datica is aligned with HITRUST, customers can take advantage of global deployments and rest easy knowing they’re covered for localized compliance regimes like GDPR, IRAP and more. We’re also constantly adding support for new compliance frameworks, for more information see our compliance roadmap here.

We’re hopeful about the future of Kubernetes and cloud native applications. Our efforts to build CKS align with demand from our customers to allow for more flexibility within Datica’s product offering. As an organization we are committed to helping the cloud native community grow through adoption and a commitment to open source software.

If you have any questions, comments or feedback about our shift to CKS, please email us directly at


Throughout the rest of the getting started guide we’ll walk you through:

  • How to learn Kubernetes with a curated list of resources
  • How to request a CKS cluster
  • How to access your new cluster
  • How to access Datica managed deployments
  • Understanding the shared responsibility model

CKS versioning policy

Datica addresses vulnerabilities and security issues as they become publicly available. Kubernetes upgrades and deployment tooling updates happen on a quarterly basis (unless a security vulnerability is present, in which case we would update as soon as possible). Below is a list of the current versions of software we run:

Cloud Resources

Currently, CKS is only supported in AWS. CKS clusters make use of the following resources:

  • IAM Policies
  • IAM Profiles
  • IAM Roles
  • Key Pairs
  • VPC Network
  • VPC NAT Gateways
  • Route 53 Zone
  • Route 53 Entries
  • VPC Subnets
  • VPC Internet Gateway
  • VPC Route Table and Routes
  • VPC Network ACLs
  • Security Groups
  • S3 Buckets
  • ELB
  • NLB
  • EC2 Instances
  • Elastic IP Addresses (one per NAT Gateway)
  • EBS Volumes (and snapshots)

For cost estimation, a standard CKS cluster will use the following minimum resources:

  • 6 m5.xlarge EC2 Instances
  • Various EBS volumes, using around 500 GiB of GP2 storage, and a small amount of standard storage.
  • 4 EBS snapshots per volume (1 per day, stored for two days and region-replicated)
  • 1 Elastic IP per Availability Zone (varies per region)
  • 1 ELB
  • 1 NLB
  • 3 S3 buckets
  • 1 Private Hosted Zone in Route53
  • 1 VPC

The costs of these resources may vary based on network usage, customer deployments, and any customizations that are made to the cluster.

AWS Instance Retirement

Occasionally, AWS may schedule an EC2 instance used by your cluster for retirement. When this happens, you should stop the EC2 instance, then start it again. Starting the stopped instance migrates it to new hardware, resolving the issue. CKS nodes are resilient to reboots and will rejoin the cluster when the instance is started again.

Learning resources

Already familiar with Kubernetes? Feel free to skip this section and jump right to Basic Prerequisites.

Preparing for Kubernetes can seem daunting on the surface. Luckily, Datica takes care of the hard parts for you. The biggest effort for end users is figuring out what their deployments will look like on Kubernetes. Tactically, that means you’ll need to figure out logical groupings of containers, services, and other components of your application.

Because Kubernetes can work essentially however you’d like, and because you can containerize pretty much anything, the lift from a non-Kubernetes deployment model to a Kubernetes-ready deployment model can be fairly light. However, if you’re re-architecting and moving toward a cloud native model (containerized workloads with microservices automatically orchestrated by Kubernetes), it can be somewhat involved.

The good news is that Kubernetes and containerization have drastically increased in popularity over the last year. The number of resources online is growing daily. This article is intended to give you quality, recommended resources as a starting point for your Kubernetes journey. Each set of resources is broken down by category below.


Article: Principles of Container-based Application Design - in this article, Bilgin Ibryam discusses the seven principles of container-based application design and how you can integrate those into your development process.

White-paper: Principles of Container-based Application Design - this is the accompanying white-paper to the article above. This paper goes into more detail on the seven principles.

Article: Developing on Kubernetes - in this article, Michael Hausenblas and Ilya Dmitrichenko discuss approaching Kubernetes development as a software developer. The authors walk you through various CI/CD tools and approaches, with several hands on examples that you can try. This is a great article for those looking for a better understanding of how cloud native applications are deployed and kept up to date.

Working with Kubernetes

Once you’ve containerized your application and/or made the necessary architectural changes, the next step is to start working with Kubernetes directly. The very first step in that process is getting a cluster stood up:

  • The simple way: By far the easiest way to get a working cluster up is to install minikube locally on your machine. Minikube is a single-node Kubernetes cluster that requires almost no configuration and is the path of least resistance to start using Kubernetes.
  • The hard way: If you’re looking to really learn how Kubernetes works, this guide is one of the best resources available. It goes into intricate detail on getting a cluster stood up and configured.

Note: Both options above are for getting a handle on Kubernetes, not getting a compliant Kubernetes cluster up and running. Once you’re ready for production, you will work with Datica to get a compliant cluster stood up. The above recommendations are for local testing purposes only.

After you’ve got a cluster stood up using one of the two options above, the next step is to start deploying and playing around with the Kubernetes internals. By far the most important step in this process is figuring out how to map your architecture to the various concepts within Kubernetes. Those important concepts include:


Deployments help users manage replicated pods by allowing for easy modification of configuration. Having control over this configuration allows Kubernetes to manage updates between application versions and maintain a history of revisions.

More information on Deployments »


Pods are the smallest unit of scheduling in the world of Kubernetes. Pods typically contain one or more tightly coupled containers. These containers share a common network and often a specific configuration. Pods are almost always created via Deployments in Kubernetes.

More information on Pods »


StatefulSets are specialized pod controllers that allow for better control over stable and unique network identifiers, persistent storage, graceful deployments and scaling. The primary use case for StatefulSets is in managing databases and persistent storage objects within Kubernetes.

More information on StatefulSets »


Like StatefulSets, DaemonSets are specialized pod controllers that allow for an instance of a pod to run on a set of specified nodes. Datica uses DaemonSets for managing logging, monitoring and other compliance related tooling.

More information on DaemonSets »

Other Resources

Basic prerequisites

Once you’ve established contact with Datica, the first step is to work with our sales department on understanding your organization’s needs as they pertain to compliance and security in the cloud. This will likely require a few in depth discussions as well as a technical overview of CKS.

After you’ve purchased CKS, Datica will provide you with a set of prerequisite activities and detailed instructions for provisioning your CKS clusters. Clusters are typically provisioned 3-5 business days following the completion of your provisioning request.


Once we’ve provisioned your new cluster, we’ll grant you access to that cluster. That process is as follows:

We’ll create an organization on your behalf using the legal business name collected above. This organization lives within Datica’s centralized authentication system. This system is responsible for managing users and cluster access. After we’ve created the organization, you’ll be sent an invite to your email on file (as well as any other administrators). Use this email to activate your account.

Once you’ve activated your account, you’ll need to download and install the Datica datikube CLI utility. You can download the package and view instructions for installation here.

Once you’ve installed datikube, you’ll need three pieces of information:

  • NAME - This is the name you’d like to use for your cluster (ex: “prod”, “staging”, etc.). Datica will configure this for you.
  • CLUSTER-URL- This is a URL at which this cluster’s kube-apiserver is accessible. Datica will provide this to you.
  • CA-FILE - This should be the relative path to the CA cert for this cluster. Datica will provide you with this file.

After you’ve gathered your cluster’s name, cluster-url, and the ca-file, you can run the following command:

datikube set-context <NAME> <CLUSTER-URL> <CA-FILE>

Ex: datikube set-context prod-cluster https://prod-cluster.example.com ~/.example/ca.crt

After successfully running the datikube set-context command with the parameters above, you can begin using your new compliant cluster!

Before deploying your workloads onto your new Kubernetes cluster, you’ll want to ensure you can access the various deployments Datica provides. Those include:

  • Logging access: kubectl port-forward -n logging service/kibana 8001:5601 - In your browser, the Kibana dashboard can be accessed at the following URL: http://localhost:8001
  • Monitoring access: kubectl port-forward -n monitoring service/grafana 8002:3000 - In your browser, the Grafana dashboard can be accessed at the following URL: http://localhost:8002
  • Metrics access: kubectl port-forward -n monitoring service/prometheus-k8s 8003:9090 - In your browser, the Prometheus dashboard can be accessed at the following URL: http://localhost:8003
  • Alerting access: kubectl port-forward -n monitoring service/alertmanager-main 8004:9093 - In your browser, the Alertmanager dashboard can be accessed at the following URL: http://localhost:8004

IMPORTANT You should always make an effort to use kubectl authenticated with your Datica account credentials. However, sometimes you may want or need to use tools that require system roles. These types of tools cannot make use of Datica’s webhook authentication/authorization. You must make use of Kubernetes RBAC functionality. Since this is in your application space, you will be responsible for proving and ensuring the security and compliance of the roles you set up in accordance with your own company policies.

Groups and ACLs

Kubernetes ACLs can be constructed using the following sections, separated by colons (:):

  • product - The first part of the ACL string should always be the exact string “product”.
  • cluster - The second part of the ACL string should always be the exact string “cluster”.
  • cluster name - The name of the cluster you want the ACL to apply to.
  • action - This part of the ACL string should always be the exact string “action” OR “*”.
  • group - A group is a kubernetes-specific concept that overlaps with Datica’s groups. With Datica CKS, this should always be *.
  • namespace - A namespace is a kubernetes-specific concept. You can learn more about namespaces here.
  • resource - A resource is a Kubernetes-specific concept and is essentially any object that is set up on Kubernetes. You can see the full list of Kubernetes resource types here.
  • verb - The last part of the ACL string is the request verb. The list of possible verbs can be viewed here. Make sure to use the request verbs, not the HTTP verbs. Case matters, so get will work as a verb, but GET will not. A final note on verbs: the kubectl port-forward command requires the ability to create pods.

When completely assembled, the string should look something like product:cluster:[cluster-name]:action:*:[namespace]:[resource]:[verb]

ACL String Examples: To give a group access to retrieve the pods from a specific namespace, use the following ACL string: product:cluster:mycluster:action:*:examplenamespace:pods:list. This ACL string will provide users in this group access to list pods in the “examplenamespace” namespace using kubectl like kubectl -n examplenamespace get pods.
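The assembly of that example can also be sketched as a small shell helper. This is purely illustrative; the cluster, namespace, resource, and verb names below are the same hypothetical ones used in the example above:

```shell
#!/bin/sh
# Build a CKS ACL string from its colon-separated sections.
# All names here are illustrative, matching the example above.
cluster="mycluster"
namespace="examplenamespace"
resource="pods"
verb="list"

acl="product:cluster:${cluster}:action:*:${namespace}:${resource}:${verb}"
echo "$acl"
# → product:cluster:mycluster:action:*:examplenamespace:pods:list
```

Swapping in a different namespace, resource, or verb produces the corresponding ACL string.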

Some resources in Kubernetes are not “namespaced”, meaning they are general cluster resources rather than belonging to a single namespace. In order to grant access to non-namespaced resources, use * for the namespace section in the ACL. For instance, to grant access to listing all namespaces within a cluster, use the following ACL string: product:cluster:mycluster:action:*:*:namespaces:list. This ACL string will allow users in this group to run kubectl get namespaces.

To give a group access to view monitoring, use the following ACL string: product:cluster:mycluster:action:*:monitoring:*:*. This ACL string will provide users in this group access to retrieve all resources that are in the “monitoring” namespace.

To give a group access to view logging, use the following ACL string: product:cluster:mycluster:action:*:logging:*:*. This ACL string will provide users in this group access to retrieve all resources that are in the “logging” namespace.

To give a group full access to a specific namespace, use an ACL string like this: product:cluster:mycluster:action:*:examplenamespace:*:*. This ACL string will provide users in the group complete access to the “examplenamespace” namespace.

Limiting Application Access

Pod Security Policies

CKS allows customers to optionally make use of Kubernetes’ PodSecurityPolicy admission controller to manage the security context within which pods are allowed to run. CKS provides a default PodSecurityPolicy with extremely limited permissions that can be used to run your workloads. If you have pods which require greater permissions than those defined in the default PodSecurityPolicy, you will need to define your own PodSecurityPolicy and authorize the ServiceAccount that creates the pod to make use of it. The Kubernetes documentation linked above has instructions for doing so.

Please contact Datica Support if you wish to enable this feature.

Cross-Application Communication

Developers may come across use cases in which an application will require talking to another component within CKS. For example, a CI/CD pipeline using Jenkins or Gitlab that deploys directly into a cluster will require interacting with the API server. In these cases, you should create a dedicated serviceaccount with permissions limited to only what your application requires.

Note: The Jenkins tutorial linked above makes use of minikube. Be aware that some commands will differ from CKS.

The general steps (taken from the article linked above) for using a serviceaccount to provide limited permissions to an application are as follows:

  1. Create namespace: kubectl create ns myapp
  2. Create a serviceaccount: kubectl -n myapp create sa thesa
  3. Create a rolebinding: kubectl -n myapp create rolebinding samplerolebinding --clusterrole=edit --serviceaccount=myapp:thesa
  4. Deploy the application: kubectl -n myapp run theapp --serviceaccount=thesa
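If you prefer declarative manifests, the rolebinding from step 3 could equivalently be written as YAML and applied with kubectl apply -f. This is a sketch using the illustrative names from the steps above, not Datica-prescribed configuration:

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: samplerolebinding
  namespace: myapp
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: edit          # built-in user-facing role granting read/write in the namespace
subjects:
  - kind: ServiceAccount
    name: thesa
    namespace: myapp
```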

For more information about default and user-facing roles available for use with RBAC, see the Kubernetes documentation.



Ingress

In order to allow your users and the public to reach your site, you should use the ingress (a collection of rules that allow inbound connections to reach your services) that Datica provides. This is important to help ensure that all traffic is encrypted.

The ingress maps a hostname and path (which could just be / to route all traffic from the hostname) to a Kubernetes service. For each cluster there is a single ingress-controller which has a public NLB. As an end-user, you can create a CNAME record that maps whatever domain name you’d like to the NLB’s DNS name. You would then use that CNAME as the hostname in the ingress resource for your application. What this does is tell the ingress-controller that any traffic sent to the CNAME should be directed to the service that is specified in the ingress.
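As a sketch, a minimal ingress resource for this pattern might look like the following. The hostname, resource name, and service name here are hypothetical, and the apiVersion may differ on newer Kubernetes versions:

```yaml
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: myapp-ingress
  namespace: myapp
spec:
  rules:
    - host: myapp.example.com   # the CNAME pointing at the NLB's DNS name
      http:
        paths:
          - path: /             # route all traffic for this hostname
            backend:
              serviceName: myapp
              servicePort: 80
```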

Managing TLS Certificates

The most common way of serving HTTPS on CKS is to set up a TLS definition in your ingress resources. This will tell the ingress-controller to load the specified TLS secret and serve HTTPS for any requests routed to the associated ingress resource. TLS will terminate at the ingress-controller, and from there the request will be routed to the appropriate service over the encrypted cluster network. The ingress YAML from our k8s-example project has an example of how this works. The secret specified in secretName must be an existing TLS secret object in the same namespace as the ingress resource. The certificates in the secret may be sourced from any public or private CA you wish to use.
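A TLS definition in an ingress resource takes roughly the following shape. The hostname and secret name are hypothetical; the k8s-example project referenced above has the authoritative example:

```yaml
spec:
  tls:
    - hosts:
        - myapp.example.com
      secretName: myapp-tls   # an existing TLS secret in the same namespace
```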

Global Configurations

To set custom nginx configurations that apply to all ingress resources, you can edit the nginx-configuration configmap in the ingress-nginx namespace. While most resources created by Datica cannot be edited without losing your changes the next time Datica applies an update, any changes made to this configmap will be left untouched.

As an example, here is how you might add custom headers to be returned to the client on any request proxied through ingress-nginx:

Create a file called custom-headers.yaml with the headers to be applied:

apiVersion: v1
kind: ConfigMap
metadata:
  name: custom-headers
  namespace: ingress-nginx
data:
  X-Custom-Header: "this is a header"
  X-Time: ${msec}

Apply the new configmap to the cluster:

kubectl apply -f custom-headers.yaml

Now apply a patch to the nginx-configuration configmap that instructs the ingress-controller to retrieve headers from the custom-headers configmap:

kubectl -n ingress-nginx patch configmap nginx-configuration \
  --patch '{"data": {"add-headers": "ingress-nginx/custom-headers"}}'

The ingress-controller will automatically detect that changes have been made, and reload nginx. Updates to the configmaps referenced by either the add-headers or proxy-set-headers configurations are not monitored, however. In order to see the effect of new changes made to the custom-headers configmap, you will need to manually restart the ingress-controller pods, like so:

kubectl -n ingress-nginx get po
# Select first nginx-ingress-controller pod

kubectl -n ingress-nginx delete po <FIRST_POD_NAME>

kubectl -n ingress-nginx get po -w
# Wait for first pod to be replaced and become ready

kubectl -n ingress-nginx delete po <SECOND_POD_NAME>

For a full list of available configurations, reference the NGINX Ingress Controller configmap documentation

Ingress Configurations

Individual ingress resources can also be configured via annotations, which affect only the ingress on which they are applied. For instance, if you wanted to expose multiple services over the same hostname, you might assign one ingress resource the path /app1 and the second ingress /app2. But unless both apps are expecting their respective path prefixes, they will not know how to handle the request. To get around this, you can use the nginx.ingress.kubernetes.io/rewrite-target annotation to tell ingress-nginx which parts of the path to keep.

rewrite-target uses regex capture groups defined on the ingress path to determine what the final path should be. In the simplest case, where the /app1 prefix is stripped off before the request is forwarded to the appropriate service, you would define the path in the ingress resource as /app1(/|$)(.*) to create two capture groups. The first is either / or the end of the path, and the second is anything that follows (defined by .*). Since we only want the contents of the second capture group, you would set the annotation’s value to /$2 to instruct ingress-nginx to rewrite the request path with anything that follows /app1 in the original request URI. More information on the rewrite-target annotation can be found here
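Putting that together, a sketch of an ingress using rewrite-target might look like this. The hostname and service name are hypothetical, capture-group rewrites require a sufficiently recent ingress-nginx, and the apiVersion may differ on newer Kubernetes versions:

```yaml
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: app1
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /$2   # keep only the second capture group
spec:
  rules:
    - host: apps.example.com
      http:
        paths:
          - path: /app1(/|$)(.*)   # strips the /app1 prefix before forwarding
            backend:
              serviceName: app1
              servicePort: 80
```

A request to apps.example.com/app1/status would be forwarded to the app1 service as /status.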

For a full list of available annotations, reference the NGINX Ingress Controller annotation documentation


When creating resources on Kubernetes, it is imperative to always use Datica-provided networking solutions. In particular, do not use the host network for resources (such as setting hostNetwork: true), as this can result in unencrypted traffic. ALWAYS use Datica’s provided ingress.

Persistent Storage


There are many ways to configure persistent storage in Kubernetes. Any resource that creates a Pod can have a PersistentVolumeClaim to attach a PersistentVolume to the pod. By referencing a StorageClass in the PVC, a PersistentVolume can be dynamically provisioned according to the PVC’s request. All pods provisioned by Datica that require a volume use the persistent-storage StorageClass, which is configured to create encrypted SSD EBS volumes, allow volume expansion, and retain the PersistentVolume if the attached PersistentVolumeClaim is deleted. This enables a pod to reattach to the same PersistentVolume when it is rescheduled. For your own pods, you may use the existing StorageClass created by Datica, or use it as a template to create your own that better suits the needs of your application.


Any volumes provisioned using the persistent-storage StorageClass will be encrypted by default. To ensure that your own PersistentVolumes are encrypted, reference this StorageClass in the PersistentVolumeClaim, or create your own StorageClass with the parameter encrypted: "true". Data in a container’s ephemeral storage lives on the host filesystem, and does not persist between pod redeploys. However, since all host level volumes are encrypted using Amazon KMS keys, anything written to your container’s filesystem is encrypted at rest.
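As a sketch of the second option (the StorageClass and claim names below are placeholders; the parameters mirror what this document describes for persistent-storage), an encrypted, expandable StorageClass and a claim against it might look like:

```yaml
# A StorageClass modeled on Datica's persistent-storage class: encrypted gp2
# EBS volumes, expandable, and retained after the claim is deleted.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: my-encrypted-storage
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp2
  encrypted: "true"
reclaimPolicy: Retain
allowVolumeExpansion: true
---
# A claim that dynamically provisions a 10Gi encrypted volume from that class.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-app-data
spec:
  accessModes:
  - ReadWriteOnce
  storageClassName: my-encrypted-storage
  resources:
    requests:
      storage: 10Gi
```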



Any logs written to STDOUT or STDERR by containers on CKS are picked up by Kubernetes, and stored on the host filesystem. Fluentd retrieves the logs from there and forwards them to Elasticsearch and S3. Elasticsearch indices are pruned after five days, but the S3 log archives are retained for the life of the cluster. All indexed logs are viewable in the Kibana dashboard.


Each node has an encrypted logging volume attached to securely store logs before they have been processed. Once picked up by Fluentd, they are sent over the encrypted cluster network to an Elasticsearch client, which stores them on its own encrypted Persistent Volume. Archived logs are sent to S3 using the Amazon S3 API over HTTPS, and are stored in the encrypted cluster bucket.


The primary bottleneck in the CKS logging stack is Elasticsearch volume IOPS. By default, CKS deploys 2 es-client pods, each with a 100GiB GP2 EBS volume. GP2 volumes deliver 3 IOPS/GB with a burst balance of 3,000 IOPS, meaning each Elasticsearch client has approximately 300 baseline IOPS. With the default setup, the CKS logging stack can handle around 1 million logs per hour without expending burst balance.

If an es-client expends the available burst balance on its volume, Elasticsearch will be unable to index logs as fast as they are ingested. When this happens, the bulk ingestion queue fills up and new requests are rejected. This causes the log buffers on CKS nodes to fill up, and over time may result in pods failing to deploy on nodes where the log volume has filled completely. As a temporary measure, the volume can be expanded incrementally, since each time an EBS volume is resized it starts with a full burst balance. Note, however, that an AWS EBS volume can only be expanded once every six hours.

Volume expansion is enabled on the persistent-storage StorageClass in CKS, so you can expand these volumes yourself by following these steps:

kubectl -n kube-system patch pvc elasticsearch-data-es-client-0 --patch='{"spec": {"resources": {"requests": {"storage": "<NEW VOLUME SIZE>Gi"}}}}'
kubectl -n kube-system patch pvc elasticsearch-data-es-client-1 --patch='{"spec": {"resources": {"requests": {"storage": "<NEW VOLUME SIZE>Gi"}}}}'

As of Kubernetes 1.15, the ExpandInUsePersistentVolumes feature is enabled by default, so the volumes will expand their filesystems without any further action.

If the available burst balance is continually drained due to high logging throughput, then the Elasticsearch volumes will need to be overprovisioned such that burst balance is no longer a factor. EBS volumes with a capacity of 1TB or more do not have a burst balance, since they generate enough IOPS to replenish burst before it can be used. Expanding your elasticsearch-data-es-client PVCs to 1000Gi will eliminate the IOPS bottleneck from the system, and significantly increase the number of logs that can be handled by the logging stack.


Any cluster can also have CKS host kernel-level audit logging enabled. These logs are treated the same as application logs (sent to S3 for the life of the cluster and indexed in Elasticsearch for five days). Please submit a support ticket if you would like this feature enabled, and make sure to review the caveats below prior to creating your request. These audit logs include, but are not limited to:

  • Kernel parameter/module modifications
  • Mount operations
  • Cron scheduling
  • Sudoers, passwd, user, group, and password database changes
  • Network environment changes
  • Systemd changes/operations

By default this feature is disabled. Before requesting that it be enabled, please review the following caveats:

  • High likelihood of needing to increase logging volume storage across all nodes. If those volumes are not actively monitored and they fill up, they can cause service interruption by preventing unrelated pods from starting properly.
  • Moderate increase in the load on Elasticsearch and in index sizes, which could mean slight performance degradation of Elasticsearch if not addressed.
  • Moderate increase in the volume of logs stored in S3, which will likely cost more depending on your AWS S3 pricing.

Control Plane Configuration

Image Garbage Collection

Image garbage collection is handled by the kubelet agent running on each node. The kubelet watches the image filesystem and checks whether any of its eviction thresholds have been met. For image garbage collection, the kubelet is configured to start deleting old images when less than 20% of the total storage is free or when fewer than 15% of the total inodes on the images volume are free.
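These thresholds correspond to kubelet eviction settings along the following lines (a sketch of the relevant KubeletConfiguration fields, not necessarily the exact CKS configuration):

```yaml
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
evictionHard:
  imagefs.available: "20%"     # reclaim images when free space on the image filesystem drops below 20%
  imagefs.inodesFree: "15%"    # or when free inodes drop below 15%
```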

Pod Security Policies

CKS supports enabling the PodSecurityPolicies admission controller. PodSecurityPolicies provide a way for cluster administrators to define a set of security criteria that a pod must meet in order to be accepted by the system. These include things like restricting the users a container can run as, allowing different types of volume mounts (secrets, configmaps, hostPath, etc.), and using the host network. A full list of available controls can be found here.

CKS creates a default PodSecurityPolicy that is bound to all ServiceAccounts, meaning any Pod can use it out of the box. The default PSP requires that all containers run as non-root, and does not allow privileged containers or any special capabilities, like CAP_NET_ADMIN. Any pod that meets these criteria will be admitted automatically without additional configuration. Pods that do not match the default criteria will need to be explicitly granted permission to use a PSP that allows the required privileges. To use a PSP, a pod’s ServiceAccount needs to be granted the use verb on the PSP resource via RBAC.
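For example (the role, binding, ServiceAccount, and PSP names here are placeholders), granting a ServiceAccount the use verb on a PSP via RBAC looks like this:

```yaml
# ClusterRole that allows "use" of one named PodSecurityPolicy
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: use-privileged-psp
rules:
- apiGroups: ["policy"]
  resources: ["podsecuritypolicies"]
  resourceNames: ["privileged"]
  verbs: ["use"]
---
# Bind the role to the application's ServiceAccount in its namespace
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: my-app-privileged-psp
  namespace: default
subjects:
- kind: ServiceAccount
  name: my-app
  namespace: default
roleRef:
  kind: ClusterRole
  name: use-privileged-psp
  apiGroup: rbac.authorization.k8s.io
```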

A pod that is allowed to use multiple PSPs will follow Kubernetes’ policy ordering rules to choose between them. Once a pod is created, its PSP will not change. The pod does not get permissions from all PSPs that it is allowed to use; it selects one PSP and must conform to its rules in order to be admitted to the cluster. This can cause some unexpected behavior in certain situations when a pod has implicit requirements not specified in the initial pod YAML.

For instance, if a pod is allowed to use the default and privileged policies and the image runs as root, then it should be assigned the privileged policy. However, if the pod does not explicitly define that it wants to run as root by setting runAsUser: 0, and has no other requirements that disqualify default, then it will be admitted under the default policy. When this happens the pod will be created, but the container will fail with the error CreateContainerConfigError because it has violated the rules of its PSP. In order to use the correct PSP, the pod would need to either explicitly define the user it will run as, or request some other elevated permission that would force the pod to be admitted under the privileged policy (such as mounting a host path, or running a privileged container).
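To make that intent explicit, a pod that must run as root can declare it in its securityContext (a minimal sketch; the pod name and image are placeholders):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: root-app
spec:
  containers:
  - name: app
    image: <YOUR_IMAGE>
    securityContext:
      runAsUser: 0    # explicitly request root, so the pod is admitted under a PSP that allows it
```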

PodSecurityPolicies are an advanced security feature of Kubernetes, and are not enabled by default. In order to enable PSPs for your CKS cluster please reach out to Datica support by submitting a ticket through the Platform dashboard. We will enable the feature on your staging cluster first, and enable it in production when you indicate that you are ready.


Application Ingress


Secure ingress is one of the most important parts of a compliant cluster. In CKS, we use the ingress-nginx project to provide a single access-logged point of ingress into the cluster network. Using routing rules and TLS definitions configured by ingress resources, multiple applications can be exposed to the public internet over the Network Load Balancer provisioned for your cluster. The ingress-controller determines which ingress best matches the incoming URL, and routes the request to the appropriate service backend. If none of the ingress resources match, the request is routed to the default-http-backend, which responds with default backend - 404.

This tutorial will step through a basic deployment based on the k8s-example project to expose an application on the public internet over HTTPS.


Step 1 Clone the k8s-example project

$ git clone

This project contains a set of templates that demonstrate a simple app deployment, along with a bash script that uses them to generate valid Kubernetes YAML files. The script assumes a Unix-like environment with the command line utility sed installed.

Step 2 Pick a host name for your app.

This host name will be a DNS CNAME entry that points to the load balancer for your cluster’s ingress-controller. For this example, we will refer to this as <YOUR_DOMAIN_NAME>.

To find the load balancer address for the cluster, run the following command:

$ kubectl -n ingress-nginx get svc ingress-nginx -o wide

NAME            TYPE           CLUSTER-IP     EXTERNAL-IP                PORT(S)                      AGE     SELECTOR
ingress-nginx   LoadBalancer   <CLUSTER_IP>   <LOAD_BALANCER_Address>    80:30236/TCP,443:31494/TCP   18d     app=ingress-nginx

Create a CNAME record, with the name you selected, that points to the EXTERNAL-IP address listed by the command above. At Datica, we use AWS Route53 to manage our DNS records, but any DNS provider will work.

Note: If you do not own a domain, and do not wish to set one up at this time, you can use the ingress load balancer address to expose your app for development.

Step 3 Generate Kubernetes YAML for the app

Now that we have chosen a host name for the application, we can generate the YAML to describe the deployment, service, and ingress resources. The script in k8s-example will take care of this for us.

$ ./ --deployment example --namespace default --image nginxdemos/hello --port 1234 --hostname <YOUR_DOMAIN_NAME>

There should now be three YAML files under the example directory, one for each resource.

Step 4 Create TLS certificate

Next, we will generate a self-signed certificate for serving the application over HTTPS. While this would not be appropriate for a production app, it is sufficient for a test deployment. Any certificate used for serving HTTPS must have either a Common Name (CN) or a Subject Alternative Name (SAN) that matches the hostname in the request sent by the client.

For this example, we will use the script in k8s-example to create the cert pair.

$ ./ --deployment example --hostname <YOUR_DOMAIN_NAME>
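Under the hood, generating a self-signed pair like this typically amounts to an openssl invocation along the following lines (a sketch, not necessarily what the script does; the filenames are placeholders, example.test stands in for <YOUR_DOMAIN_NAME>, and -addext requires OpenSSL 1.1.1 or newer):

```shell
# Generate a 2048-bit key and a self-signed cert valid for one year,
# with both the CN and a SAN set to the hostname the ingress will serve.
openssl req -x509 -nodes -days 365 -newkey rsa:2048 \
  -keyout key.pem -out cert.pem \
  -subj "/CN=example.test" \
  -addext "subjectAltName=DNS:example.test"
```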

Once you have generated your key and certificate, upload them to your cluster as a TLS Secret to allow the ingress-controller to make use of them for your ingress resource. The secret must be created in the same namespace that your application will be deployed in, default in this case.

$ kubectl --namespace default create secret tls example-tls --cert=./example/cert.pem --key=./example/key.pem

Step 5 Deploy the application

Now that the YAML files and TLS Secret have been created, we are ready to deploy the application.

$ kubectl apply -f ./example/deployment.yaml
$ kubectl apply -f ./example/service.yaml
$ kubectl apply -f ./example/ingress.yaml

If you look at the logs for your cluster’s ingress-controller deployment you will see that it finds the new ingress resource as soon as it is created, and reloads nginx with the new configuration.

These logs will look something like this:

$ kubectl -n ingress-nginx logs -l app=ingress-nginx

I0507 02:48:24.548467       8 event.go:218] Event(v1.ObjectReference{Kind:"Ingress", Namespace:"default", Name:"example", UID:"9554489c-7072-11e9-be3e-02d978fa86a2", APIVersion:"extensions", ResourceVersion:"3362903", FieldPath:""}): type: 'Normal' reason: 'CREATE' Ingress default/example
I0507 02:48:24.549097       8 backend_ssl.go:67] adding secret default/example-tls to the local store
I0507 02:48:24.549214       8 controller.go:168] backend reload required
I0507 02:48:24.631578       8 controller.go:177] ingress backend successfully reloaded...

Step 6 View the app

If everything is wired up correctly, your application should now be up and running at the CNAME you configured. For this example, we would go to https://<YOUR_DOMAIN_NAME>.

Since the certificate we created is self-signed, any modern browser will tell you that the connection is not secure. If you inspect the details for the HTTPS connection you will see that it is receiving your certificate, but the connection is considered insecure because the cert is self-signed. This is expected, since there is no way for a browser to check the validity of a self-signed certificate. It is safe to proceed (so long as you trust yourself!). In production, you should always use a certificate signed by a public CA. At Datica, we use Let’s Encrypt for this purpose.

If you have deployed the nginxdemos/hello image, it will display a page with some information about the server it is running on. Since the container is running in a Kubernetes pod the Server Address and Server Name will be the IP and name of the pod, rather than the external IP and name of the load balancer or the host you have configured for ingress. This is because Kubernetes and ingress-nginx abstract networking away from the container. From the container’s point of view, the pod is the host that it lives on.

Default SSL Certificate Configuration

By default, the ingress-controller uses a Kubernetes Ingress Controller Fake Certificate which is a self-signed certificate used to serve HTTPS for any routes that do not have a valid TLS configuration. This makes it much harder for clients to verify the origin or validate the authenticity of the server. This could lead to clients ignoring certificate validation when interacting with the server and thus making them more vulnerable to an attacker impersonating the server to perform a man-in-the-middle (MiTM) attack.

For a better security posture, we recommend configuring a default-ssl-certificate. This can be configured by editing the auto-created TLS secret called default-ssl-certificate in the ingress-nginx namespace, which the ingress controller will automatically pick up and use. Base64 encode your certificate and key, and edit the existing secret with the encoded strings:

$ kubectl -n ingress-nginx edit secret default-ssl-certificate

++  tls.crt: "<ENCODED_CERT_STRING>"
++  tls.key: "<ENCODED_KEY_STRING>"
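To produce the encoded strings, base64-encode each file as a single line (the filenames are placeholders; -w 0 disables line wrapping on GNU base64, while the macOS base64 does not wrap by default):

```shell
# Encode the certificate and key as single-line base64 strings
base64 -w 0 cert.pem
base64 -w 0 key.pem
```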

Then roll out your ingress deployment so the new default certificate is used:

$ kubectl -n ingress-nginx rollout restart deploy/nginx-ingress-controller

Common Problems

Ingress-controller is serving “Kubernetes Ingress Controller Fake Certificate” instead of the configured certificate

If your ingress has TLS and/or a default-ssl-certificate defined but you are still seeing the wrong certificate when visiting the site, it is likely that something is misconfigured in your ingress resource, or in your certificate.

For the ingress resource, check closely to make sure there are no typos in the TLS config, and that the CNAME is set as the host for the route and is in the list of hosts for the TLS config. A small error, such as:

  tls:
  - hosts:
    - <YOUR_DOMAIN_NAME>
  - secretName: example-tls

instead of the correct:

  tls:
  - hosts:
    - <YOUR_DOMAIN_NAME>
    secretName: example-tls

can cause Kubernetes to accept the YAML as valid, while also causing the ingress-controller to ignore the TLS secret for your list of hosts.

For the certificate, verify the secretName specified in the ingress resource matches the name of a valid TLS Secret in the same namespace. Also make sure that it has a CN or SAN that matches the CNAME DNS record you are using for the application. If it does not, then the ingress-controller will consider it to be an invalid certificate, and will not use it to serve HTTPS.

When debugging ingress, it is always useful to check the logs for the ingress-controller deployment. If a TLS secret is being rejected, the ingress-controller will often log information about the problem. As an example, if a certificate is created with the CN <YOUR_DOMAIN_NAME>, but the ingress uses the load balancer address as the host, then the ingress controller will reject the certificate on the grounds that it does not have a CN or SAN that matches the route configured for the ingress:

ssl certificate default/example-tls does not contain a Common Name or Subject Alternative Name for host <LOAD_BALANCER_Address>. Reason: x509: certificate is valid for <YOUR_DOMAIN_NAME>, not <LOAD_BALANCER_Address>

After updating the ingress resource to use <YOUR_DOMAIN_NAME> for the host (and list of hosts under the TLS config) the ingress-controller will detect that an ingress resource has changed, and reload nginx with the new configuration.

Certificate Chaining

In production, it is common for a certificate to be part of a cert chain, each one validating the trust of the one before it until you reach the Root Certificate Authority. In the simplest example, a Root CA would sign an Intermediate CA, which could then be used to sign your leaf certificate(s). This allows the Root CA to remain locked away, while still being able to sign new leaf certificates using the shorter-lived Intermediate CA. It is important that the TLS Secret contains the full chain of certificates up to (but not including) the public trusted root in order for an end-user to verify the chain of trust.

A chain of certificates would look like this:

<Leaf Cert>
<Intermediate Cert>

If you choose to use an automated certificate controller, such as cert-manager, then it will handle creating this chain for you in your Secret. Otherwise, just make sure that the file you upload as the cert for your TLS Secret contains a valid chain.
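Assembling the chain by hand is just concatenation, leaf first (the filenames are placeholders):

```shell
# The file uploaded as the cert for the TLS Secret must contain the
# leaf certificate followed by any intermediates, in order.
cat leaf.crt intermediate.crt > chain.crt
```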

VPC Peering with the CKS VPC

Some applications require additional AWS resources beyond what is provisioned with CKS. For example, your application might store data in RDS. To maintain logical separation between Datica-managed resources and resources you manage, it is best to provision these resources in a separate VPC, then create a peering connection between the two VPCs.

VPC peering with CKS works just like any other VPC peering. You can follow the instructions in Amazon’s guide for creating a VPC peering connection. You may add routes to the route tables created for CKS or attach additional security groups and rules to instances as needed. As noted in the AWS documentation, you cannot peer two VPCs with overlapping IPv4 CIDR blocks. The host network uses a CIDR of, the cluster network CIDR is, the Kubernetes service CIDR is, and there are network bridges using CIDRs and

Kubernetes Dashboard

Note: In order to install an application that requires system roles or service accounts on Datica CKS, it’s necessary to first install the application using the Datica RBAC Admin user and then configure the roles for use with datikube authentication. In the case of kubernetes-dashboard, the project recommends the creation of a system role and associated policies to limit the privileges granted to the application. Refer to the Kubernetes Dashboard documentation for more information about what privileges are set up.

Installing the Kubernetes Dashboard using the RBAC Admin certificate

To install the Kubernetes Dashboard application in your cluster, execute the following commands (Before beginning this step, set a kubectl context that uses the Admin certificate):

kubectl apply -f

Accessing the Dashboard

To help ensure that all access to the Dashboard and resources running on Kubernetes is handled appropriately, authenticate against the Kubernetes Dashboard using your Datica account. This prevents you from having to manage system roles directly and from accidentally exposing your cluster to would-be attackers due to misconfiguration. All access control is handled via Datica Groups and ACLs.

Step 1

Make sure you have an up-to-date Datica session token for use later on. Run $ datikube refresh and enter your account credentials when prompted. Then retrieve your updated token by running the following command (save this token for use in step 4):

$ echo `kubectl config view -o jsonpath='{.users[?(@.name == "datica")].user.token}'`

Step 2

Next we’re going to get the running pod name for the dashboard by running:

$ kubectl -n kube-system get pods | grep kubernetes-dashboard

This will output something similar to the following line:

kubernetes-dashboard-5bd6f767c7-44446                                   1/1       Running            0          25m

The first part is the pod name that is needed for the next step.

Step 3

In this step we’re going to set up port-forward to the dashboard using the kubernetes-dashboard pod name. This allows you to securely access the dashboard through https://localhost:8001/. Run:

$ kubectl -n kube-system port-forward <pod-name> 8001:8443

(replace <pod-name> with the name of the pod running your dashboard, e.g. kubernetes-dashboard-5bd6f767c7-44446)

Now you can navigate to https://localhost:8001

Note: Unless you replaced it during installation, your browser will report the certificate as invalid because the kubernetes-dashboard serves self-signed certificates by default, which is specified in the kubernetes-dashboard.yaml deployment file. The kubernetes-dashboard documentation provides instructions for replacing these certificates with valid CA-signed certificates. IMPORTANT: Whether you use self-signed or CA-signed certificates, Datica strongly recommends that you do not expose your kubernetes dashboard to the public Internet.

Step 4

The last step is to authenticate using your Datica session token. At the login screen, select “Token” and enter the token you saved in step 1. At this point, you should be able to see and use the Kubernetes dashboard per its documentation. Remember, your permissions in the dashboard will be limited by the ACLs you have access to via Datica’s Product Dashboard.

Private VPN Connections

In this document we will demonstrate how to connect a remote application to your application running in your Kubernetes cluster. For the demonstration we will be using the AWS VPC VPN, an internal-only Ingress service, and the Kubernetes Guestbook example application. This document makes no assumptions about the VPN device being connected, other than familiarity with the device and support for an IPsec VPN using IKEv1. One of the reasons we are performing this demonstration using the AWS VPC VPN is that at the end of the VPN setup process you can download a configuration for a number of common VPN devices.

To get started we will need to gather some information about the VPN device to which we are connecting. The following information will be needed:

  • Internet IP address of the VPN device
  • If you plan on using Border Gateway Protocol for dynamic routing you will need the BGP ASN of the VPN device
  • If you plan on using static routing you will need CIDR ranges associated with the remote network(s)

Once you have this information you can proceed with the creation of the VPN following the instructions from AWS. One important thing to note about the AWS VPN is that it does not act as a connection initiator. If you plan on initiating connections from your application running on Kubernetes, you will need to set up some sort of ping or keepalive on the remote side to keep the VPN up.

Now that you have a working VPN connection you can expose the application to the VPC. This is accomplished using a Kubernetes Ingress service with a load balancer set to internal-only mode. To get an application working for demonstration purposes I used the Guestbook application provided by Kubernetes. I followed this example pretty closely. The only changes I made were to install the application into its own namespace so that it would be easier to completely remove, along with the changes needed for it to utilize an internal-only Ingress service. In the frontend-service.yaml file I changed type: NodePort to type: ClusterIP. I then created an ingress configuration for the service, which is as follows:

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: frontend-ingress
  namespace: guestbook
  annotations:
    kubernetes.io/ingress.class: internal-nginx
spec:
  tls:
    - hosts:
      - <YOUR_DOMAIN_NAME>
      secretName: guestbook-certificate
  rules:
    - host: <YOUR_DOMAIN_NAME>
      http:
        paths:
        - path: /
          backend:
            serviceName: frontend
            servicePort: 80

Of particular importance is the ingress class annotation. This is what assigns this ingress to the internal ingress service. Since the ingress service is configured to only allow HTTPS, I also needed to create a certificate and put it in a secret. You can see that the above ingress config specifies some TLS settings. This tells the internal ingress service that this ingress needs to use the certificate located in the guestbook-certificate secret. To create the guestbook-certificate secret I ran the following commands:

I used a self-signed certificate for testing. In any environment that interacts with ePHI you will need to use a certificate signed by a certificate authority.

openssl req -x509 -nodes -days 365 -newkey rsa:2048 -keyout -out -subj "/"

Create the secret from the contents of the certificate and key.

kubectl --namespace=guestbook create secret tls guestbook-certificate --key --cert

We will also need a DNS record for the internal service. For simplicity I am using a public DNS zone hosted by AWS Route53 for the CNAME to the internal ingress service’s load balancer. This can be done with private DNS services, but explaining how to do that is well beyond the scope of this document.

With the guestbook application modified and deployed according to its documentation along with the additional frontend-ingress and the route53 CNAME in place we can test connectivity. To do this I ran a curl request from the remote VPN device.

myvpn:~# curl --insecure
<html ng-app="redis">
    <link rel="stylesheet" href="//">
    <script src=""></script>
    <script src="controllers.js"></script>
    <script src=""></script>
  <body ng-controller="RedisCtrl">
    <div style="width: 50%; margin-left: 20px">
    <input ng-model="msg" placeholder="Messages" class="form-control" type="text" name="input"><br>
    <button type="button" class="btn btn-primary" ng-click="controller.onRedis()">Submit</button>
      <div ng-repeat="msg in messages track by $index">

Here we can see that we are able to connect to our internal service over the VPN. This connection is fully secured using encryption in transit. An HTTPS connection is created from the remote VPN device (or whatever devices route through it). This connection traverses the VPN tunnel, which is a second layer of encryption. The HTTPS connection exits the tunnel and travels through the load balancer. No SSL offload is performed on the load balancer; doing so would send unencrypted traffic to the Kubernetes cluster. Instead, the HTTPS connection is terminated at the ingress service. The ingress service is directly connected to the Kubernetes cluster’s encrypted overlay network. Thus you have a fully encrypted connection between applications running in separate networks over a private network link.

Custom AlertManager routing

Datica’s monitoring and alerting stack provides you with a tremendous amount of flexibility for defining your own alerts and how to receive them. In this section, we’ll show you how to configure custom routing for alerts.

The configuration for alertmanager is stored as a base64 encoded value within a Kubernetes secret called alertmanager-main in the monitoring namespace. To make changes to alertmanager, we will need to encode the new configuration as a base64 encoded value, then embed that value within the yaml file for the alertmanager secret, and finally apply that yaml so that the modified secret is deployed.

  1. Retrieve the yaml for the alertmanager secret by running kubectl get secret alertmanager-main -n monitoring -o yaml. Save the yaml into a file (e.g. alertmanager-custom.yaml) so that it can be modified and applied back to the cluster later.
  2. Record the value from the alertmanager.yaml field in that file. This value is the base64 encoded version of the alertmanager configuration. Run echo <value from field> | base64 --decode to see the configuration yaml that is currently applied.
  3. Modify the configuration yaml from step 2 as desired and save it as a new file. For more information on how to configure alertmanager, see
  4. Run cat <filename from step 3> | base64 and insert the output in the alertmanager-custom.yaml file in the alertmanager.yaml field. This is the base64 encoded version of the modified alertmanager configuration file.
  5. Apply the new configuration by running kubectl apply -f alertmanager-custom.yaml. The changes to alertmanager are applied immediately, and can be viewed in the alertmanager dashboard. To view the dashboard, run kubectl port-forward -n monitoring service/alertmanager-main 9093:9093 and navigate to localhost:9093 in a browser. The current configuration can be found on the status page on the dashboard.
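As a minimal sketch of what a custom routing configuration might look like (the receiver names and webhook URL below are placeholders, not Datica defaults):

```yaml
route:
  receiver: default
  group_by: ['alertname', 'namespace']
  routes:
  - match:
      severity: critical
    receiver: pagers           # send critical alerts to a dedicated receiver
receivers:
- name: default
- name: pagers
  webhook_configs:
  - url: https://example.test/alert-hook
```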

Security Group Rule Changes

There are certain scenarios where services you deploy require changes to Security Group rules in order to function appropriately. By default, Datica includes a set of Security Groups with an initial set of rules to enable the cluster and resources to function. For managing changes to the overall Security Group rules, we suggest creating a new Security Group with the specific rules you need. Having a separate group will make it easier for you to distinguish between manual changes you have made and the rules that are set up and managed by Datica.

While we suggest creating your own Security Group for managing new rules, if changes to Datica’s Security Groups are made, the following situations apply: any rules you add to Datica’s Security Groups will persist, but any changes to rules that Datica sets up by default will NOT persist.

Capacity Planning

As the resource requirements of your applications grow, you may need to consider expanding your cluster. A standard CKS deployment comes with three m5.xlarge worker nodes, but CKS supports increasing the size of these worker nodes as well as increasing the number of worker nodes in the cluster.

Within the cluster itself, Kubernetes comes with several standard tools for resource management. Resource requests are used mainly for scheduling decisions, while resource limits are thresholds that apply during execution. Containers exceeding their resource limits may be terminated. Quality of Service classes are assigned to pods based on their configured resource requests and limits, and pods that are assigned the Burstable QoS class will be given a higher priority than pods that are assigned the BestEffort class.
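For reference, requests and limits are set per container in the pod spec (the pod name, image, and values here are arbitrary examples):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: capacity-example
spec:
  containers:
  - name: app
    image: <YOUR_IMAGE>
    resources:
      requests:           # used by the scheduler to place the pod
        cpu: 250m
        memory: 256Mi
      limits:             # enforced at runtime; exceeding the memory limit can terminate the container
        cpu: 500m
        memory: 512Mi
```

Because requests are set and are lower than the limits, this pod would be assigned the Burstable QoS class.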

Grafana Dashboards

CKS provides several Grafana dashboards to assist with capacity planning. To access these dashboards, you can use the command kubectl port-forward -n monitoring svc/grafana 3000:3000 and navigate in a browser to http://localhost:3000/.

The Kubernetes / Compute Resources / Cluster dashboard shows overall usage statistics for the cluster. It includes details such as memory, CPU, disk, and network utilization. The USE Method / Cluster and Node dashboards show similar metrics but break them down by individual node. There are individually configured dashboards outlining the health and performance of each component of the Kubernetes control plane (kubelets, the controller managers, scheduler, API server, etc.). The Kubernetes / Compute Resources / Pod dashboard displays the resource usage for every pod in your cluster, allowing you to compare each pod’s requested resources against the actual usage statistics. The Kubernetes / Compute Resources / Node (Pods) dashboard gives you similar information, but lets you filter by the specific node on which the pod is running.

The Grafana deployment includes several other dashboards that you may find useful for monitoring the health and usage of your cluster. We highly recommend taking a few minutes to see which dashboards are available and considering how they might help you make the best use of your cluster.

Custom Grafana Dashboards

Note: The latest Grafana version that has been deployed to CKS clusters has authentication enabled. The username and password to log in are both “admin”. Grafana will ask you to change this password, but changes will not persist. You can simply skip the password changing screen. This does not affect the security of your Grafana dashboards, as access to the dashboards still requires the appropriate cluster access to view and edit resources in the monitoring namespace of your cluster.

Dashboard Installation

The steps below explain how to add a custom Grafana dashboard to your cluster. Before starting, create your Grafana dashboard and export it as JSON (refer to the Grafana documentation for more information).

Step 1

Edit the custom-grafana-dashboards configmap

$ kubectl -n monitoring edit configmap custom-grafana-dashboards

Step 2

Add dashboards to the configmap under the data: key. You can add as many dashboards as you like; each is a unique JSON filename under the data: key whose value is a multiline string containing the JSON dashboard definition.

If the configmap does not have a data: key, add one as shown below:

# Please edit the object below. Lines beginning with a '#' will be ignored,
# and an empty file will abort the edit. If an error occurs while saving this file will be
# reopened with the relevant failures.
apiVersion: v1
data:
  my-dashboard.json: |+
  another-great-dashboard.json: |+

Step 3

Save and exit

Step 4

Redeploy Grafana to pick up the custom dashboard changes

$ kubectl -n monitoring delete pods -l app=grafana
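If you keep exported dashboard files in version control, the configmap in the steps above can also be generated programmatically rather than edited by hand. As a hedged sketch (the helper and filenames below are hypothetical, not part of CKS tooling), this builds a ConfigMap manifest as JSON, which kubectl apply -f accepts just like YAML:

```python
import json

def dashboards_configmap(files):
    """Build a ConfigMap manifest embedding each exported dashboard.

    `files` maps a filename (e.g. "my-dashboard.json") to the exported
    dashboard JSON as a string.
    """
    return {
        "apiVersion": "v1",
        "kind": "ConfigMap",
        "metadata": {
            "name": "custom-grafana-dashboards",
            "namespace": "monitoring",
        },
        "data": dict(files),
    }

if __name__ == "__main__":
    # Hypothetical exported dashboard; real exports come from Grafana's
    # dashboard export feature.
    manifest = dashboards_configmap(
        {"my-dashboard.json": json.dumps({"title": "My Dashboard"})}
    )
    print(json.dumps(manifest, indent=2))
```

Piping the printed manifest to kubectl apply -f - would replace the configmap in one step; redeploying Grafana afterwards is still required for the changes to take effect.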

Release Notes

July 14th, 2020


This release includes several versioning updates and new features:

  • Updated Prometheus Components
  • Updated ClamAV Daemonset
  • Image Garbage Collection threshold increased from 15% to 20% to help clean up image volumes sooner
  • An hourly etcd snapshot cronjob in the kube-system namespace creates and uploads a snapshot to S3, with a default 7-day retention period, for disaster recovery
  • Several operational bugfixes
  • Several security-related bugfixes

April 28, 2020


This release includes several versioning updates and new features:

  • All components of Kubernetes have been upgraded to version 1.17.3
  • cri-o has been upgraded to version 1.17.0
  • etcd has been upgraded to version 3.4.3
  • CoreDNS has been upgraded to version 1.6.6
  • Kernel-level audit logging has been enabled and logs can now be found in the standard CKS logging stack
  • Several minor security enhancements have been implemented

January 7, 2020


This release contains several small bugfixes and features.

  • The version of Alertmanager deployed to CKS clusters has been upgraded to v0.20.0.
  • CoreOS (host-level) events are now logged to the CKS logging stack.
  • Ingress controller metrics are now being collected and are available within Prometheus.
  • A bug caused by a timeout during the creation of large snapshots has been fixed.

October 29, 2019


This release updates all components of Kubernetes to 1.15.5, and the underlying operating system to the latest stable version. This release also enables support for the Kubernetes security feature PodSecurityPolicies. For more information on how PodSecurityPolicies work, see the official Kubernetes documentation. Please contact Datica support if you would like to turn on this feature in your cluster.

August 21, 2019


  • Two major vulnerabilities have been disclosed that affect CKS clusters - CVE-2019-9512 and CVE-2019-9514. Kubernetes has released new builds to address these issues. In light of their severity, Datica will be updating all clusters to Kubernetes v1.13.10 (changelog here) and ingress-nginx v0.25.1 (changelog here).

July 30, 2019


  • The cronjob responsible for managing backups is moving to the kube-system namespace.
  • It is easier to add custom Grafana dashboards for monitoring. See our Grafana Dashboards Tutorial for more information.
  • A bug that prevented default Prometheus alert rules from appearing in the alerts dashboard has been fixed.
  • A bug caused by a timeout during the pruning of old snapshots that prevented the snapshots from being removed has been fixed.

April 23, 2019


This release updates all components of Kubernetes to 1.13.5 and CRI-O to 1.13.3. All Datica-deployed components will now make use of pod priority policies to ensure core cluster and compliance pods will not be evicted when a node runs out of resources. Also included in this release is an update to the blockstoragebackup cronjob to ensure that it only snapshots volumes attached to nodes of the cluster, rather than snapshotting all volumes in the region.

March 19, 2019


This release updates Fluentd to version 1.3.3 and adjusts its configuration to more effectively flush log buffers to S3. It also updates the image garbage collection configuration to execute based on both inode and disk usage.

February 20, 2019


A major vulnerability that affects CKS has been discovered. You can find details of the vulnerability here. This release specifically addresses CVE-2019-5736, which is handled by updating the version of CoreOS running on all nodes to 1967.6.0.

Most clusters will be entirely unaffected by this rollout. A notable exception is that pods with stringent memory limits may need those limits increased to work with the runc patch. Specifically, we suggest setting any memory limits on pods running on CKS 10MB higher than what the pod itself needs, to accommodate the runc patch.

January 16, 2019

The latest CKS release will be hitting staging clusters January 16 starting at 9am CST. In this release, we’ve made significant improvements to our vulnerability management tooling as well as core Kubernetes components. See the full release notes below:


  • We will be installing Wazuh on all customer CKS clusters. Wazuh improves our ability to scan the cluster for vulnerabilities — similar to Nessus, alerts from Wazuh will be sent directly to Datica’s security team for evaluation and handling, including direct customer notification as necessary.
  • The deployments for ingress, coredns, elasticsearch, and prometheus-operator are now configured to land on controller nodes, leaving more capacity on worker nodes for customer workloads.
  • System volumes and snapshots will have tags for faster recovery in the event of a disaster.

Bug Fixes

  • Fixed internal Prometheus communication issues, enabling Grafana to pull in more default data.

Customer Support

To ensure your support issue is handled in a timely manner, please submit your ticket through the Platform dashboard by clicking the “Contact Support” button located in the footer of the Environment UI. This provides valuable metadata to the support staff, which allows them to triage the issue much more quickly.