Building a secure OpenTelemetry Collector

The OpenTelemetry Collector is a core part of your telemetry pipelines, which makes it one of the parts of your infrastructure you want to be as secure as possible. The general advice from the OpenTelemetry teams is to build a custom collector executable for production instead of using the supplied ones. That isn’t an easy task, which prompted me to build something to make it easy.

In this post, we’ll go through how to build a custom collector in various ways, including the new way I’ve created using just your standard OpenTelemetry Collector configuration.

If you just want to see the new stuff, take a look at the repository, or read the last section.

What does OpenTelemetry provide?

The OpenTelemetry team provides Docker images you can use:

  • OpenTelemetry (Core)
    This is a limited image and includes components that are maintained by the core OpenTelemetry Collector team. The manifest includes all the components from the base OpenTelemetry Collector repo, but also includes some of the most commonly used components from the Contrib repo, like the filter and attributes processors, and some common exporters like Jaeger and Zipkin.
  • OpenTelemetry Contrib
    This is the kitchen sink version. The manifest includes almost everything from the core and contrib repos, with some omissions where a component is still in development.

Why aren’t these enough?

They expose too much attack surface; it’s as simple as that in my eyes. Remember, code is a liability: aim for less, not more.

These images, even the core one, have more components than any single user will require. As an example, I’ve not yet come across a user (and I talk to a lot of you) that uses the OTLP, Zipkin AND Jaeger exporters in the same collector. Shipping all three is bad security practice, because it includes potential attack vectors you don’t need.

As another example, there’s been a long-standing vulnerability with the Jaeger receiver that couldn’t be fixed until a Go upgrade was done. Removing the Jaeger receiver wasn’t an option, as it would break the environments of people expecting it, so it had to be shipped.

You could use the argument that unless a component is in a pipeline, it’s not executed, so it’s not a vulnerability. I’d love to see you convince security teams of that. Regardless, unless you do a code review, and understand the vulnerability, you can’t definitively say that’s the case, and that’s not something the OpenTelemetry collector teams can do at scale.
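
A minimal sketch of what that argument looks like in practice, assuming the usual collector behaviour that only components referenced under service are started. The choice of the jaeger receiver here is just an example:

```yaml
receivers:
  otlp:
    protocols:
      grpc:
  jaeger:            # configured, but not referenced by any pipeline below
    protocols:
      grpc:

exporters:
  debug:

service:
  pipelines:
    traces:
      receivers: [otlp]   # jaeger is absent here, so it is never started
      exporters: [debug]
```

Even so, the jaeger receiver’s code is still compiled into the image, which is the part a security scanner (and a security team) will see.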

What is the OpenTelemetry Collector Builder?

The OpenTelemetry Collector team (specifically Juraci, I believe) decided that creating OpenTelemetry Collector images should be easier, and people should be able to choose their components. So they created a tool that takes a manifest.yaml file specifying the Go modules to include, and uses that to build a targeted distribution with a limited set of components.

A manifest looks like this:

dist:
  name: otelcol-custom
  description: OpenTelemetry Collector
  version: 0.91.0
  otelcol_version: 0.91.0

receivers:
  - gomod: go.opentelemetry.io/collector/receiver/otlpreceiver v0.91.0

exporters:
  - gomod: go.opentelemetry.io/collector/exporter/debugexporter v0.91.0
  - gomod: go.opentelemetry.io/collector/exporter/loggingexporter v0.91.0
  - gomod: go.opentelemetry.io/collector/exporter/otlpexporter v0.91.0
  - gomod: go.opentelemetry.io/collector/exporter/otlphttpexporter v0.91.0

extensions:
  - gomod: github.com/open-telemetry/opentelemetry-collector-contrib/extension/healthcheckextension v0.91.0

processors:
  - gomod: go.opentelemetry.io/collector/processor/batchprocessor v0.91.0
  - gomod: go.opentelemetry.io/collector/processor/memorylimiterprocessor v0.91.0
  - gomod: github.com/open-telemetry/opentelemetry-collector-contrib/processor/attributesprocessor v0.91.0
  - gomod: github.com/open-telemetry/opentelemetry-collector-contrib/processor/filterprocessor v0.91.0

This will generate a very minimal collector which, honestly, is what most people would use. However, you can see that the module paths differ depending on whether a component lives in the core or contrib repo, so you need to know where each one comes from. You also need to know the manifest syntax, and there’s some assumed Go knowledge in knowing what gomod means.
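
To make the two module families concrete, here is an annotated fragment using two lines from the manifest above. The comments are a reading of the format as it appears in this post: each gomod value is a Go module path followed by a version.

```yaml
processors:
  # Core component: the module path starts with go.opentelemetry.io/collector/
  - gomod: go.opentelemetry.io/collector/processor/batchprocessor v0.91.0
  # Contrib component: the module path starts with
  # github.com/open-telemetry/opentelemetry-collector-contrib/
  - gomod: github.com/open-telemetry/opentelemetry-collector-contrib/processor/filterprocessor v0.91.0
```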

The builder itself is solid, and I use it all the time to build custom images. However, I’ve been striving for a way to make this easier to build in pipelines and more accessible to people.

Building a custom collector with a two-stage build

One of the ways to make using the collector builder easier is to use a two-stage build and run everything inside the first container.

FROM golang:1.21 AS build
# Keep the builder version in line with the component versions in manifest.yaml
ARG OTEL_VERSION=0.91.0
WORKDIR /app
RUN go install go.opentelemetry.io/collector/cmd/builder@v${OTEL_VERSION}
COPY . .
RUN CGO_ENABLED=0 builder --config=manifest.yaml --output-path=/app

FROM cgr.dev/chainguard/static:latest
COPY --from=build /app/otelcol-custom /
COPY config.yaml /
EXPOSE 4317/tcp 4318/tcp 13133/tcp

CMD ["/otelcol-custom", "--config=/config.yaml"]

With this, you’ll still need to write the manifest.yaml by hand, but you don’t need to install the builder or the Go SDK locally. This is really useful if you’re not interested in writing Go, or you want to do things in a release pipeline.
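
For completeness, here is a sketch of a config.yaml that would fit the image above. It only uses components from the manifest shown earlier and listens on the three ports the Dockerfile exposes. The backend endpoint and the memory limit values are placeholders, so swap in your own.

```yaml
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317   # matches EXPOSE 4317/tcp
      http:
        endpoint: 0.0.0.0:4318   # matches EXPOSE 4318/tcp

processors:
  memory_limiter:
    check_interval: 1s
    limit_mib: 400               # placeholder value
  batch:

exporters:
  debug:
  otlp:
    endpoint: backend.example.com:4317   # hypothetical backend

extensions:
  health_check:
    endpoint: 0.0.0.0:13133      # matches EXPOSE 13133/tcp

service:
  extensions: [health_check]
  pipelines:
    traces:
      receivers: [otlp]
      processors: [memory_limiter, batch]
      exporters: [debug, otlp]
```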

But we can do better, and you don’t need to learn Go!

Building a custom collector with ocb-config-builder

I got to thinking about how we could build the manifest file automatically, how we could remove the gomod problems, and how we could let developers pick the components they wanted instead of the Go modules they wanted.

I played with some custom YAML formats, but it all felt a little weird. Too many abstractions? Then, while chatting with Tyler Helmuth about some issues with using the collector builder, there was an epiphany… “What if we just use the collector config itself, and map it?”

This clicked with me. It’s DevEx 101 really: don’t leak your internal decisions to the user, speak their language, and make it simple for them. So the “OpenTelemetry Collector Builder Config Builder” was born (the name is bad, naming is hard).

We still need to do a two-stage Docker build, but we’ve skipped the step of needing the manifest.yaml. This has the side benefit that the collector will never include unused components.

What does it look like?

FROM ghcr.io/martinjt/ocb-config-builder:latest AS build
COPY config.yaml /config/config.yaml
RUN /builder/build-collector.sh /config/config.yaml

FROM cgr.dev/chainguard/static:latest
COPY --from=build /app/otelcol-custom /
COPY config.yaml /
EXPOSE 4317/tcp 4318/tcp 13133/tcp

CMD ["/otelcol-custom", "--config=/config.yaml"]

Just make sure that your collector config file is called config.yaml (or change the name in the Dockerfile), and you’ll have a collector executable tightly coupled to your configuration, in a secure container using the Chainguard base image for running it in production.
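
To illustrate the mapping idea, the config below is annotated with the module each component key would need to resolve to. The module paths are copied from the manifest earlier in this post. The annotations are an illustration only, not output from the tool itself, and the exporter endpoint is hypothetical.

```yaml
receivers:
  otlp:          # go.opentelemetry.io/collector/receiver/otlpreceiver
    protocols:
      grpc:

processors:
  batch:         # go.opentelemetry.io/collector/processor/batchprocessor

exporters:
  otlphttp:      # go.opentelemetry.io/collector/exporter/otlphttpexporter
    endpoint: https://backend.example.com:4318   # hypothetical endpoint

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlphttp]
```

Three component keys in the config mean three modules in the resulting binary, and nothing else.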

Conclusion

It doesn’t have to be complicated to have a custom OpenTelemetry Collector build anymore. You don’t need to understand Go, or write builder manifest files. You can include this in your pipelines as a drop-in replacement for the collector-contrib image you’re likely using in production right now.
