OpenTelemetry

CLARA can use OpenTelemetry traces as a data source for discovering components and, to some extent, component types, as well as the communications between components. For this feature to work correctly, the applications must be instrumented and an OpenTelemetry Collector must be running in the cluster, as described below.

Concept

"OpenTelemetry is an Observability framework and toolkit designed to create and manage telemetry data such as traces, metrics, and logs."
Traces and metrics are generated by each component individually and forwarded to an OpenTelemetry collector, which processes the telemetry data and distributes it to a backend that consumes it.
CLARA acts as such a backend: it offers a gRPC endpoint to which the OpenTelemetry collector forwards the traces. CLARA then iterates over the traces and extracts information about components and their communications from them.

Setup

When using OpenTelemetry with CLARA, first ensure that your software components are instrumented with OpenTelemetry traces. If they are not, consider using OpenTelemetry auto-instrumentation.

OpenTelemetry Semantic Conventions

Because the attribute names used in OpenTelemetry traces are not strictly standardized, it is recommended to use traces that follow the OpenTelemetry semantic conventions with CLARA. If your services do not provide them, you can try to set up the OpenTelemetry auto-instrumentation on top of your system.
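To illustrate why the semantic conventions matter, the following minimal sketch compares two sets of span attributes that carry the same facts, once under semantic-convention keys and once under ad-hoc keys. The key selection and the helper `semconv_coverage` are hypothetical and only serve the illustration; they are not part of CLARA.

```python
# Attribute keys from the OpenTelemetry semantic conventions that a
# client/server analysis can look for (an illustrative selection only).
SEMCONV_KEYS = {
    "server.address",
    "server.port",
    "client.address",
    "url.path",
    "http.request.method",
}


def semconv_coverage(attributes: dict) -> float:
    """Return the share of SEMCONV_KEYS present in a span's attributes."""
    present = SEMCONV_KEYS.intersection(attributes)
    return len(present) / len(SEMCONV_KEYS)


# A span that follows the semantic conventions ...
conventional = {
    "http.request.method": "GET",
    "server.address": "orders.default.svc",
    "server.port": 8080,
    "url.path": "/api/orders",
}
# ... and one that carries the same facts under ad-hoc key names.
ad_hoc = {
    "method": "GET",
    "target_host": "orders.default.svc",
    "uri": "/api/orders",
}
```

A key-based analysis can read the first span directly, while the second one yields nothing without custom key lists.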

Second, ensure that an OpenTelemetry collector with a matching configuration is running in your cluster. Third, if you run CLARA on a local machine rather than deploying it in the cluster, you need to forward the traces from the OpenTelemetry collector to your local machine. The open-source tool ktunnel can be used to achieve this.

OpenTelemetry Collector

The OpenTelemetry collector is a standard component provided by OpenTelemetry itself. CLARA uses only traces, so a minimal configuration is sufficient. The collector image can be used to deploy a container with a suitable configuration as shown below. Example service and deployment configurations can be found in clara/deployment/open-telemetry-collector/deployment.yml.

An example ConfigMap for the oTel-collector deployment
apiVersion: v1
kind: ConfigMap
metadata:
  name: otel-collector-conf
  labels:
    app: otel-collector-conf
    component: otel-collector-conf
data:
  otel-collector-conf: |
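    receivers:
      # Accept OTLP data from the instrumented services over gRPC and HTTP.
      otlp:
        protocols:
          grpc:
          http:

    processors:

    exporters:
      # Send the traces on to CLARA; port 7878 matches the ktunnel command.
      otlp:
        endpoint: "localhost:7878"
        tls:
          insecure: true

    service:
      pipelines:
        # CLARA only uses traces, so only a traces pipeline is defined.
        traces:
          receivers: [otlp]
          processors: []
          exporters: [otlp]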

ktunnel

ktunnel is an open-source tool that enables reverse port-forwarding to get data out of Kubernetes clusters. To use CLARA on a local machine, a ktunnel sidecar can be attached to the OpenTelemetry collector deployment using
ktunnel inject deployment otel-collector-deployment 7878 -n <namespace>
For further information see the ktunnel docs.

OpenTelemetry Auto-instrumentation

OpenTelemetry auto-instrumentation can be used to generate OpenTelemetry traces for software components in Kubernetes clusters that are not instrumented themselves. It works by attaching sidecar containers to each service that is not yet instrumented; these capture the network traffic and generate traces from it.
For installation instructions, please see the official OpenTelemetry documentation.

Aggregation Algorithms

After the OpenTelemetry aggregator has finished collecting the traces, the algorithm iterates over all traces and extracts architectural information from them.
An OpenTelemetry span can be one of five kinds: Producer, Consumer, Client, Server, and Internal.
Internal spans are ignored. Client and Server spans are analyzed separately from Producer and Consumer spans, as described below.
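The split by span kind can be sketched as follows. This is a minimal illustration, assuming spans are available as plain dictionaries with a `kind` field; the function name `partition_spans` and the dictionary layout are hypothetical and do not reflect CLARA's internal types.

```python
CLIENT_SERVER_KINDS = {"CLIENT", "SERVER"}
MESSAGING_KINDS = {"PRODUCER", "CONSUMER"}


def partition_spans(spans):
    """Split spans into the two groups that are analyzed separately."""
    client_server, messaging = [], []
    for span in spans:
        kind = str(span.get("kind", "INTERNAL")).upper()
        if kind in CLIENT_SERVER_KINDS:
            client_server.append(span)
        elif kind in MESSAGING_KINDS:
            messaging.append(span)
        # INTERNAL spans (and any unknown kind) are skipped.
    return client_server, messaging


spans = [
    {"name": "GET /orders", "kind": "CLIENT"},
    {"name": "GET /orders", "kind": "SERVER"},
    {"name": "load-config", "kind": "INTERNAL"},
    {"name": "orders publish", "kind": "PRODUCER"},
]
client_server, messaging = partition_spans(spans)
```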

Client-Server

Spans of Client and Server kind are analyzed in the following way:

A client span as well as a server span can disclose information about the component that emitted it and about the other component in the communication. Therefore, two candidate components are derived from each span, and each is given all the information available about it.

As there is no definitive standard for the naming and the values of span-attributes, the following logic is applied to extract information from span attributes:

  • For each piece of information sought (hostname, port, IP address, and path) for both the server and the client, a list of commonly used key names is provided (e.g. server.address, url.path, etc.).
  • The span's attributes are then filtered by those key names, and regular expressions are applied to the matching values to find the specific attribute.
  • Two component objects are created with all available information and added to a list of found components.
  • A relation (communication) object is added that contains the client name and, if available, the server name; otherwise the server hostname or IP address is used.
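The steps above can be sketched for a single Client span as follows. This is a simplified illustration under stated assumptions: the key lists, the regular expressions, and the names `first_match` and `analyze_client_span` are hypothetical and are not CLARA's actual lists or functions. The attribute keys themselves are taken from current and older OpenTelemetry semantic conventions.

```python
import re

# Commonly used attribute keys per sought value (illustrative selection).
SERVER_ADDRESS_KEYS = ["server.address", "net.peer.name", "net.peer.ip"]
SERVER_PORT_KEYS = ["server.port", "net.peer.port"]
PATH_KEYS = ["url.path", "http.target"]

HOSTNAME_RE = re.compile(r"^(?=.*[A-Za-z])[A-Za-z0-9.-]+$")  # needs a letter
IPV4_RE = re.compile(r"^\d{1,3}(?:\.\d{1,3}){3}$")
PORT_RE = re.compile(r"^\d{1,5}$")
PATH_RE = re.compile(r"^/\S*$")


def first_match(attributes, keys, pattern):
    """Return the first value under one of `keys` that matches `pattern`."""
    for key in keys:
        value = attributes.get(key)
        if value is not None and pattern.match(str(value)):
            return str(value)
    return None


def analyze_client_span(service_name, attributes):
    """Derive two component objects and one relation from a Client span."""
    client = {"name": service_name, "hostname": None, "ip": None,
              "port": None, "path": None}
    server = {
        "name": None,  # usually unknown from the client's point of view
        "hostname": first_match(attributes, SERVER_ADDRESS_KEYS, HOSTNAME_RE),
        "ip": first_match(attributes, SERVER_ADDRESS_KEYS, IPV4_RE),
        "port": first_match(attributes, SERVER_PORT_KEYS, PORT_RE),
        "path": first_match(attributes, PATH_KEYS, PATH_RE),
    }
    # Prefer the server name, then fall back to hostname or IP address.
    target = server["name"] or server["hostname"] or server["ip"]
    relation = {"source": client["name"], "target": target}
    return [client, server], relation
```

A Server span would be handled symmetrically, with the emitting service as the server and the client derived from the attributes.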

Producer-Consumer

Producers and Consumers of message-oriented communications are specifically tagged, because there is no directly observable communication between the source and the target component.
Based on the semantic conventions, however, a producer-span contains the messaging-destination which can be used to obtain the communication. Therefore, the recovery looks as follows:

  • Three components are created: the source, the target, and the message broker.
  • A messaging relation (communication) is created that contains the source, the target, and the messaging system.
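A minimal sketch of this recovery for a single Producer span is shown below. It assumes that the target is identified by the messaging destination (for example a topic or queue name) and that the broker is named after the `messaging.system` attribute; both choices, as well as the function name `analyze_producer_span`, are assumptions made for illustration and may differ from what CLARA does.

```python
def analyze_producer_span(service_name, attributes):
    """Derive three components and one messaging relation from a Producer span."""
    system = attributes.get("messaging.system")
    # Newer semantic conventions use messaging.destination.name,
    # older ones used messaging.destination.
    destination = (attributes.get("messaging.destination.name")
                   or attributes.get("messaging.destination"))
    if not destination:
        # Without a destination no communication can be derived.
        return [], None

    source = {"name": service_name, "role": "source"}
    broker = {"name": system or "unknown-broker", "role": "message-broker"}
    # Assumption: the target is identified by the destination name.
    target = {"name": destination, "role": "target"}
    relation = {
        "source": source["name"],
        "target": target["name"],
        "messaging_system": broker["name"],
    }
    return [source, target, broker], relation
```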

Merging

As each analyzed span creates at least two component objects, these need to be merged into a consistent set. The merging is done as follows:

  • Each component object without a service name is matched, where possible, to a component that has a service name via the hostname, the IP address, or the endpoint list.
  • Components that cannot be matched are dealt with afterward.
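The merging step can be sketched as follows. This is a simplified illustration, assuming component objects are dictionaries with optional `name`, `hostname`, `ip`, and `endpoints` fields; the field names and the functions `_identity` and `merge_components` are hypothetical.

```python
def _identity(component):
    """Collect the values by which a component can be recognized."""
    keys = set(component.get("endpoints", []))
    for field in ("hostname", "ip"):
        if component.get(field):
            keys.add(component[field])
    return keys


def merge_components(components):
    """Merge per-span component objects; return (merged, unmatched)."""
    merged = {}   # service name -> merged component
    unnamed = []
    for comp in components:
        if comp.get("name"):
            entry = merged.setdefault(
                comp["name"], {"name": comp["name"], "identity": set()})
            entry["identity"] |= _identity(comp)
        else:
            unnamed.append(comp)

    unmatched = []
    for comp in unnamed:
        keys = _identity(comp)
        # A shared hostname, IP address, or endpoint counts as a match.
        match = next(
            (m for m in merged.values() if m["identity"] & keys), None)
        if match is not None:
            match["identity"] |= keys
        else:
            unmatched.append(comp)  # handled in a later step
    return list(merged.values()), unmatched
```

Matching on any shared value is a deliberately loose choice for this sketch; a stricter rule could, for example, require the hostname to agree before endpoints are compared.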

Mapping

Finally, the component objects need to be mapped to the CLARA-wide internal component and communication representation. Component objects containing a service name are mapped to an "internal" component object; components without a service name are mapped to an "external" one.
A communication is mapped only if matching components for both its source and its target can be found via service name or hostname.
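A minimal sketch of this mapping, assuming dictionary-based component and relation objects, is shown below. The field names and the functions `map_components` and `map_communications` are hypothetical and do not reflect CLARA's internal model classes.

```python
def map_components(components):
    """Map component objects to 'internal' or 'external' components."""
    mapped = []
    for comp in components:
        hostname = comp.get("hostname")
        if comp.get("name"):
            mapped.append(
                {"name": comp["name"], "type": "internal", "hostname": hostname})
        else:
            # Without a service name, fall back to hostname or IP as label.
            label = hostname or comp.get("ip")
            mapped.append(
                {"name": label, "type": "external", "hostname": hostname})
    return mapped


def map_communications(relations, mapped):
    """Keep only relations whose source and target both resolve to a component."""
    index = {}
    for comp in mapped:
        for key in (comp["name"], comp["hostname"]):
            if key:
                index.setdefault(key, comp)
    result = []
    for rel in relations:
        source = index.get(rel["source"])
        target = index.get(rel["target"])
        if source and target:
            result.append({"source": source["name"], "target": target["name"]})
    return result
```

Relations whose source or target cannot be resolved are dropped in this sketch, which mirrors the condition stated above.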