Deploying the OpenTelemetry Collector to AKS

While investigating some issues users raised around the OpenTelemetry Collector running in AKS, I found a few nuances that are worth noting.

What is the OpenTelemetry Collector?

The collector is the focal point for telemetry inside your cluster. Instead of your containerised applications sending their telemetry directly to your OpenTelemetry-capable backend (the place that lets you ask questions of your telemetry), they send it to an internal location first, and that forwards the data on.
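
To make that flow concrete, a minimal collector configuration looks something like the sketch below: applications send OTLP to the collector, and the collector forwards everything to the backend. The backend endpoint here is a placeholder, not a real address.

receivers:
  otlp:
    protocols:
      grpc:
      http:

exporters:
  otlphttp:
    endpoint: https://telemetry-backend.example.com   # placeholder; point this at your backend

service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [otlphttp]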

Why is the Collector useful in AKS?

When you’re running applications in a Kubernetes cluster, you’re running them in a more “portable” way. The applications themselves don’t realise that they’re running in K8s and everything they need is injected in an agnostic way.

However, from an Observability perspective, knowledge of the surrounding environment is important to get a full understanding of what is going on. That could be the metrics associated with the pod that's hosting the code, or it could be the information about the node that the pod is running on. This surrounding information can help answer questions like:

  • Was the CPU high on the Pod that served this request?
  • Was the Node that served this request under heavy network load?
  • Are the requests being evenly distributed over multiple pods?

These questions, and the answers we get, help us truly understand how our application works inside the wider context of production.

How to deploy the Collector in AKS?

The OpenTelemetry Collector team provide helm charts that make it incredibly easy to deploy the services.

The default, recommended approach to deploying collectors in K8s doesn't work as-is in Azure though, so we need to make some small amendments to the process. The base guide is located in the OpenTelemetry Docs and will work fine for most installations of the collector. For AKS, once you've followed that guide, there are a few other additions you'll need to make to get things working optimally.
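
For reference, the values.yaml that guide leaves you with amounts to roughly the sketch below. The preset names come from the opentelemetry-collector Helm chart; check them against the chart version you're installing.

mode: daemonset          # one collector instance per node
presets:
  kubeletMetrics:
    enabled: true        # wires up the kubeletstats receiver for node and pod metrics
  kubernetesAttributes:
    enabled: true        # wires up the k8sattributes processor for metadata enrichment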

Kubeletstats and certificates

The first nuance with the default collector deployment on AKS is that the kubeletstatsreceiver won't work and will throw an error regarding certificates. This is because in AKS, the Kubelet API server uses self-signed certificates instead of having them signed by the kube-root CA. We can get around this by adding a property to our values.yaml that allows the receiver to skip validation of the certificate authority while still using the TLS endpoint and the security tokens.

Add this to your values file for the collector that is receiving the Kubelet stats.

config:
  receivers:
    kubeletstats:
      insecure_skip_verify: true
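
For context, the effective receiver configuration the collector ends up running is along these lines; the endpoint and service-account auth shown here are a sketch of what the chart's kubeletMetrics preset wires up for you, with our override added.

receivers:
  kubeletstats:
    auth_type: serviceAccount                       # authenticate to the kubelet with the pod's service account token
    endpoint: https://${env:K8S_NODE_NAME}:10250    # the kubelet's secure port on the local node
    insecure_skip_verify: true                      # accept the AKS kubelet's self-signed certificate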

Kubernetes Attributes Processor and NAT

The second nuance is that the k8s attributes processor will fail to look up any of the pods if you're using kubenet networking for your AKS cluster, which is the default networking mode for AKS.

This means that your telemetry does not get enriched by the collector with information like the Deployment name, the Node name, etc. That enrichment is what provides the context to link our infrastructure metrics to our application telemetry, so without it we're a bit blind.
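
To see why it fails, it helps to look at how the processor is typically configured: it associates incoming telemetry with a pod by the source IP of the connection, then pulls that pod's metadata from the K8s API. A representative configuration (a sketch, not the exact one the Helm chart generates) looks like this:

processors:
  k8sattributes:
    pod_association:
      - sources:
          - from: connection          # match telemetry to a pod by the connection's source IP
    extract:
      metadata:
        - k8s.pod.name
        - k8s.deployment.name
        - k8s.node.name

If the connection appears to come from somewhere other than the pod's own IP, that association finds no match and nothing gets added.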

The reason for this is that the default setup for the Collector is to deploy it as a DaemonSet and then have all your applications send their telemetry to the Node IP, using the K8s Downward API to expose that IP as an environment variable. For example:

  env:
    - name: "OTEL_COLLECTOR_NAME"
      valueFrom:
        fieldRef:
          fieldPath: status.hostIP

However, in AKS with kubenet networking mode, the calls to that Host/Node IP are proxied through a NAT gateway, and therefore the Collector only sees a connection from an IP like 10.244.2.1 instead of the Pod IP. We can, however, fix this using a K8s Service.

The reason the Node IP approach is preferred in the first place is that it avoids the software-defined networking layer by routing locally. It also means that telemetry isn't flowing between different nodes, which can incur significant costs if that traffic ends up crossing VNETs or Regions. However, K8s introduced something called internalTrafficPolicy, which aims to solve the same problem for Services by keeping traffic local to a node where possible.

Add this to your values.yaml for the collector that is deployed as a DaemonSet.

service:
  enabled: true

Then in your applications, instead of using the downward API, add a reference to the namespace and the service name. The service name is generated from the name of the Helm release plus the string opentelemetry-collector. For example:

Helm Release Name: otelcol
Namespace: observability
Service name: otelcol-opentelemetry-collector

So your configuration for the applications would be:

  env:
    - name: "OTEL_COLLECTOR_NAME"
      value: otelcol-opentelemetry-collector.observability
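
OTEL_COLLECTOR_NAME is the application-specific variable used throughout this post; if your application instead reads the standard OTEL_EXPORTER_OTLP_ENDPOINT variable, the equivalent (assuming the default OTLP/gRPC port of 4317) would be:

  env:
    - name: "OTEL_EXPORTER_OTLP_ENDPOINT"
      value: "http://otelcol-opentelemetry-collector.observability:4317"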

This means that your applications are now sending to the Service instead of directly to the pod; however, because the default policy from the Helm chart for the collector is internalTrafficPolicy: Local, they will still resolve to the collector pod on the same node.
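
In other words, the chart renders a Service along the lines of the sketch below (trimmed to the relevant parts; the real one carries more ports and labels generated by the chart), and it's the internalTrafficPolicy field that keeps routing node-local:

apiVersion: v1
kind: Service
metadata:
  name: otelcol-opentelemetry-collector
  namespace: observability
spec:
  internalTrafficPolicy: Local     # only route to collector pods on the caller's node
  ports:
    - name: otlp
      port: 4317
      targetPort: 4317
  selector:
    app.kubernetes.io/name: opentelemetry-collector   # actual labels come from the chart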

Conclusion

Setting up the OpenTelemetry Collector in the Azure Kubernetes Service (AKS) is really easy, but there are some pretty significant nuances that make it feel like it can't work out of the box. That said, they're pretty easy to get around without having to create your own Helm charts.
