Grafana LGTM Stack Setup Example

This example demonstrates how to set up and configure the Grafana LGTM Stack (Loki, Grafana, Tempo, Mimir) with OpenTelemetry Collector for a cost-effective, comprehensive observability solution.

Overview

The Grafana LGTM Stack provides a complete observability platform:

  • Loki: Log aggregation system (similar to Prometheus but for logs)
  • Grafana: Visualization and dashboarding platform
  • Tempo: Distributed tracing backend
  • Mimir: Long-term metrics storage (Prometheus-compatible)

This stack is designed to be:

  • Cost-effective: Open-source, efficient storage
  • Scalable: Handles high-volume telemetry
  • Integrated: All components work seamlessly together
  • OpenTelemetry-native: Ingests OTLP traces, metrics, and logs through the OpenTelemetry Collector

Prerequisites

  • Docker Desktop installed and running
  • At least 8GB of available RAM (the LGTM stack is resource-intensive)
  • Ports available: 3000 (Grafana), 3100 (Loki), 3200 (Tempo), 9009 (Mimir), 4317/4318 (OTLP, published by the collector)
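
You can verify that these ports are free before starting the stack; a quick sketch for a Linux/macOS shell:

# Check that nothing is already listening on the required ports
for port in 3000 3100 3200 9009 4317 4318; do
  if lsof -i ":$port" >/dev/null 2>&1; then
    echo "Port $port is already in use"
  else
    echo "Port $port is free"
  fi
done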

Step 1: Enable LGTM Stack Configuration

Update your application configuration (e.g., appsettings.json):

{
  "ObservabilityStack": {
    "Mode": "OtelCollector",
    "CollectorEndpoint": "http://otel-collector:4317"
  },
  "ObservabilityBackend": {
    "EnabledBackends": ["GrafanaLGTM"],
    "GrafanaLGTM": {
      "LokiEndpoint": "http://loki:3100",
      "TempoEndpoint": "http://tempo:3200",
      "MimirEndpoint": "http://mimir:9009",
      "GrafanaEndpoint": "http://grafana:3000"
    }
  }
}
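
If your application relies on the standard OpenTelemetry SDK environment variables instead of (or in addition to) an application-specific configuration section, the equivalent settings are shown below; the service name MyApplication is just a placeholder:

# Standard OpenTelemetry SDK environment variables (alternative to app-specific config)
OTEL_EXPORTER_OTLP_ENDPOINT=http://otel-collector:4317
OTEL_EXPORTER_OTLP_PROTOCOL=grpc
OTEL_SERVICE_NAME=MyApplication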

Step 2: Start the LGTM Stack

# Navigate to Docker Compose directory
cd <your-docker-compose-directory>

# Start LGTM stack and collector
docker-compose up -d loki grafana tempo mimir otel-collector
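
Once the containers are up, a quick sanity check:

# List container status and tail the collector logs for export errors
docker-compose ps
docker-compose logs -f otel-collector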

Docker Compose Configuration

Example docker-compose.yml for the LGTM stack (it mounts the collector configuration from Step 3 and the backend configuration files created in Steps 5-7):

version: '3.8'

services:
  # Loki - Log aggregation
  loki:
    image: grafana/loki:latest
    ports:
      - "3100:3100"
    command: -config.file=/etc/loki/loki-config.yaml
    volumes:
      - ./loki-config.yaml:/etc/loki/loki-config.yaml
      - loki-data:/loki

  # Grafana - Visualization
  grafana:
    image: grafana/grafana:latest
    ports:
      - "3000:3000"
    environment:
      - GF_AUTH_ANONYMOUS_ENABLED=true
      - GF_AUTH_ANONYMOUS_ORG_ROLE=Admin
    volumes:
      - grafana-data:/var/lib/grafana
    depends_on:
      - loki
      - tempo
      - mimir

  # Tempo - Distributed tracing
  tempo:
    image: grafana/tempo:latest
    ports:
      - "3200:3200"
      # OTLP ports 4317/4318 are not published on the host: the collector reaches
      # Tempo at tempo:4317 over the Compose network, and publishing them here
      # would clash with the collector's host ports.
    command: ["-config.file=/etc/tempo/tempo-config.yaml"]
    volumes:
      - ./tempo-config.yaml:/etc/tempo/tempo-config.yaml
      - tempo-data:/var/tempo

  # Mimir - Metrics storage
  mimir:
    image: grafana/mimir:latest
    ports:
      - "9009:9009"
    command: ["-config.file=/etc/mimir/mimir-config.yaml"]
    volumes:
      - ./mimir-config.yaml:/etc/mimir/mimir-config.yaml
      - mimir-data:/data

  # OpenTelemetry Collector
  otel-collector:
    image: otel/opentelemetry-collector-contrib:latest
    ports:
      - "4317:4317"  # OTLP gRPC
      - "4318:4318"  # OTLP HTTP
    volumes:
      - ./otel-collector-config-lgtm.yaml:/etc/otelcol-contrib/config.yaml
    depends_on:
      - loki
      - tempo
      - mimir

volumes:
  loki-data:
  grafana-data:
  tempo-data:
  mimir-data:

Step 3: Configure OpenTelemetry Collector

Create otel-collector-config-lgtm.yaml:

receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318

processors:
  batch:
    timeout: 1s
    send_batch_size: 1024
  memory_limiter:
    limit_mib: 400
    spike_limit_mib: 100
    check_interval: 5s
  resource:
    attributes:
      - key: service.name
        value: MyApplication
        action: upsert
      - key: deployment.environment
        value: ${env:ENVIRONMENT:-production}
        action: upsert

exporters:
  # Loki for logs (the endpoint must point at Loki's push API)
  loki:
    endpoint: http://loki:3100/loki/api/v1/push
    # Note: the labels block below applies to older collector-contrib releases;
    # newer releases derive labels from loki.resource.labels / loki.attribute.labels hints instead.
    labels:
      resource:
        service.name: "service_name"
        deployment.environment: "deployment_environment"
      attributes:
        http.method: "http_method"
        http.status_code: "http_status_code"

  # Tempo for traces (OTLP gRPC receiver, not Tempo's HTTP port)
  otlp/tempo:
    endpoint: tempo:4317
    tls:
      insecure: true

  # Prometheus Remote Write for Mimir (metrics)
  prometheusremotewrite:
    endpoint: http://mimir:9009/api/v1/push
    external_labels:
      cluster: production
      environment: ${env:ENVIRONMENT:-production}

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [memory_limiter, resource, batch]
      exporters: [otlp/tempo]

    metrics:
      receivers: [otlp]
      processors: [memory_limiter, resource, batch]
      exporters: [prometheusremotewrite]

    logs:
      receivers: [otlp]
      processors: [memory_limiter, resource, batch]
      exporters: [loki]
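
Recent collector releases include a validate subcommand, so you can check the configuration before (re)starting the stack; this assumes the file sits in the current directory:

docker run --rm \
  -v "$(pwd)/otel-collector-config-lgtm.yaml:/etc/otelcol-contrib/config.yaml" \
  otel/opentelemetry-collector-contrib:latest \
  validate --config=/etc/otelcol-contrib/config.yaml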

Step 4: Configure Grafana Data Sources

Add Loki Data Source

  1. Open http://localhost:3000 in your browser
  2. Go to Configuration > Data Sources
  3. Click Add data source
  4. Select Loki
  5. Set URL to http://loki:3100
  6. Click Save & Test

Add Tempo Data Source

  1. Click Add data source
  2. Select Tempo
  3. Set URL to http://tempo:3200
  4. Under Trace to logs, select the Loki data source
  5. Click Save & Test

Add Mimir Data Source

  1. Click Add data source
  2. Select Prometheus
  3. Set URL to http://mimir:9009/prometheus
  4. Click Save & Test
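
Instead of clicking through the UI, the same three data sources can be provisioned from a file mounted into the Grafana container under /etc/grafana/provisioning/datasources/ (the file name lgtm-datasources.yaml is arbitrary):

# lgtm-datasources.yaml
apiVersion: 1
datasources:
  - name: Loki
    type: loki
    access: proxy
    url: http://loki:3100
  - name: Tempo
    type: tempo
    access: proxy
    url: http://tempo:3200
  - name: Mimir
    type: prometheus
    access: proxy
    url: http://mimir:9009/prometheus
    isDefault: true

To use it, add a volume entry such as ./lgtm-datasources.yaml:/etc/grafana/provisioning/datasources/lgtm-datasources.yaml to the grafana service in docker-compose.yml.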

Step 5: Configure Loki

Create loki-config.yaml:

auth_enabled: false

server:
  http_listen_port: 3100
  grpc_listen_port: 9096

common:
  path_prefix: /loki
  storage:
    filesystem:
      chunks_directory: /loki/chunks
      rules_directory: /loki/rules
  replication_factor: 1
  ring:
    instance_addr: 127.0.0.1
    kvstore:
      store: inmemory

schema_config:
  configs:
    - from: 2020-10-24
      store: tsdb
      object_store: filesystem
      schema: v13
      index:
        prefix: index_
        period: 24h

ruler:
  alertmanager_url: http://localhost:9093

# By default, Loki will send anonymous, but uniquely-identifiable usage and configuration
# analytics to Grafana Labs. These statistics are sent to https://stats.grafana.org/
#
# Statistics help us better understand how Loki is used, and they show us performance
# levels for most users. This helps us prioritize features and documentation.
# For more information on what's sent, look at
# https://github.com/grafana/loki/blob/main/pkg/analytics/stats.go
# Refer to the buildReport method to see what goes into a report.
#
# If you would like to disable reporting, uncomment the following lines:
#analytics:
#  reporting_enabled: false

Step 6: Configure Tempo

Create tempo-config.yaml:

server:
  http_listen_port: 3200

distributor:
  receivers:
    otlp:
      protocols:
        grpc:
          endpoint: 0.0.0.0:4317
        http:
          endpoint: 0.0.0.0:4318

ingester:
  max_block_duration: 5m

compactor:
  compaction:
    block_retention: 1h
    compacted_block_retention: 1h

storage:
  trace:
    backend: local
    local:
      path: /var/tempo/traces
    pool:
      max_workers: 100
      queue_depth: 10000

overrides:
  defaults:
    ingestion:
      burst_size_bytes: 16000000   # ~16 MB
      rate_limit_bytes: 16000000

Step 7: Configure Mimir

Create mimir-config.yaml:

# Monolithic Mimir with local filesystem storage (suitable for demos, not production)
target: all
multitenancy_enabled: false

server:
  http_listen_port: 9009
  grpc_listen_port: 9095

blocks_storage:
  backend: filesystem
  filesystem:
    dir: /data/blocks
  tsdb:
    dir: /data/tsdb
  bucket_store:
    sync_dir: /data/tsdb-sync

ruler_storage:
  backend: filesystem
  filesystem:
    dir: /data/rules

distributor:
  ring:
    instance_addr: 127.0.0.1
    kvstore:
      store: memberlist

ingester:
  ring:
    instance_addr: 127.0.0.1
    replication_factor: 1
    kvstore:
      store: memberlist

compactor:
  data_dir: /data/compactor
  sharding_ring:
    kvstore:
      store: memberlist

store_gateway:
  sharding_ring:
    replication_factor: 1

limits:
  ingestion_rate: 10000
  ingestion_burst_size: 20000
  max_global_series_per_user: 100000
  max_global_series_per_metric: 20000

Step 8: Verify Setup

Check Loki

curl http://localhost:3100/ready

Query logs:

# start/end must be Unix epoch or RFC3339 timestamps (GNU date shown; on macOS use: date -v-1H +%s)
curl -G -s "http://localhost:3100/loki/api/v1/query_range" \
  --data-urlencode 'query={service_name="MyApplication"}' \
  --data-urlencode "start=$(date -d '1 hour ago' +%s)000000000" \
  --data-urlencode "end=$(date +%s)000000000"

Check Tempo

curl http://localhost:3200/ready

Check Mimir

curl http://localhost:9009/ready

Query metrics:

curl "http://localhost:9009/prometheus/api/v1/query?query=up"

Check Grafana

  1. Open http://localhost:3000
  2. Verify all data sources are connected
  3. Navigate to Explore to query data

Step 9: Create Grafana Dashboards

Logs Dashboard

  1. Go to Dashboards > New Dashboard
  2. Add panel with Loki query: {service_name="MyApplication"}
  3. Visualize logs over time
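
A few example LogQL queries for this panel; the label names assume the Loki exporter label mapping shown in Step 3:

# All logs for the service containing "error"
{service_name="MyApplication"} |= "error"

# Log volume per HTTP status code over 5-minute windows
sum by (http_status_code) (count_over_time({service_name="MyApplication"}[5m]))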

Traces Dashboard

  1. Add panel with Tempo query
  2. View trace timeline and spans
  3. Link traces to logs using trace ID

Metrics Dashboard

  1. Add panel with Prometheus query (Mimir): rate(http_server_request_duration_seconds_count[5m])
  2. Create graphs for key metrics
  3. Set up alerts
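
Example PromQL queries for this dashboard; the metric names follow the OpenTelemetry HTTP semantic conventions and may differ depending on your instrumentation:

# Request rate per route
sum by (http_route) (rate(http_server_request_duration_seconds_count[5m]))

# Approximate 95th percentile request latency
histogram_quantile(0.95, sum by (le) (rate(http_server_request_duration_seconds_bucket[5m])))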

Correlated Observability

Grafana's Explore view allows you to:

  • Start from a trace, jump to related logs
  • Start from a log, jump to related traces
  • View metrics for the same time period
  • Correlate issues across all three pillars
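
Trace-to-logs and logs-to-traces links can also be provisioned by extending the data source file from Step 4. A sketch of the relevant jsonData sections, assuming the data source UIDs loki and tempo and JSON-formatted log lines that contain a trace_id field:

datasources:
  - name: Tempo
    type: tempo
    uid: tempo
    url: http://tempo:3200
    jsonData:
      tracesToLogsV2:
        datasourceUid: loki      # jump from a span to matching Loki logs
        filterByTraceID: true
  - name: Loki
    type: loki
    uid: loki
    url: http://loki:3100
    jsonData:
      derivedFields:
        - name: TraceID
          matcherRegex: '"trace_id":"(\w+)"'   # assumes JSON log lines with a trace_id field
          datasourceUid: tempo                 # jump from a log line to the trace in Tempo
          url: '$${__value.raw}'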

Step 10: Generate Test Data

Start your application and generate telemetry:

# Make API calls to generate traces, metrics, and logs
for i in {1..20}; do
  curl http://localhost:8081/api/health
  sleep 0.5
done

Configuration Examples

Resource Attributes

Add custom resource attributes for better filtering:

processors:
  resource:
    attributes:
      - key: service.name
        value: MyApplication
        action: upsert
      - key: deployment.environment
        value: production
        action: upsert
      - key: service.version
        value: 1.0.0
        action: upsert
      - key: team.name
        value: platform-team
        action: upsert

Log Labeling

Configure Loki labels for efficient querying:

exporters:
  loki:
    endpoint: http://loki:3100/loki/api/v1/push
    labels:
      resource:
        service.name: "service_name"
        deployment.environment: "deployment_environment"
        service.version: "service_version"
      attributes:
        http.method: "http_method"
        http.status_code: "http_status_code"
        http.route: "http_route"

Trace Sampling

Reduce trace volume with sampling:

processors:
  probabilistic_sampler:
    sampling_percentage: 10.0
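
The probabilistic sampler drops traces uniformly. If you would rather keep all error traces and only sample the rest, the tail_sampling processor in the contrib collector is a common alternative; a minimal sketch:

processors:
  tail_sampling:
    decision_wait: 10s          # wait for late spans before deciding
    policies:
      - name: keep-errors
        type: status_code
        status_code:
          status_codes: [ERROR]
      - name: sample-the-rest
        type: probabilistic
        probabilistic:
          sampling_percentage: 10

Whichever sampler you choose, remember to add it to the traces pipeline's processors list.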

Metrics Aggregation

Configure Mimir for long-term storage:

exporters:
  prometheusremotewrite:
    endpoint: http://mimir:9009/api/v1/push
    external_labels:
      cluster: production
      environment: production
      region: us-east-1

Performance Optimization

Loki Optimization

  • Use label-based indexing (not full-text search)
  • Limit label cardinality
  • Configure retention policies
  • Use chunk compression

Tempo Optimization

  • Enable trace compression
  • Configure block retention
  • Use object storage for long-term retention
  • Enable trace sampling

Mimir Optimization

  • Configure ingestion limits
  • Use sharding for scale
  • Enable compression
  • Configure retention policies

Cost Considerations

The LGTM stack is designed to be cost-effective:

  • Efficient Storage: Compressed storage formats
  • No Vendor Lock-in: Open-source, self-hosted
  • Scalable: Handles high volume efficiently
  • Resource Efficient: Lower resource requirements than the ELK stack

Storage Optimization

  • Configure retention policies
  • Use compression
  • Archive old data to object storage
  • Enable downsampling for long-term metrics
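
As a concrete example, Loki retention is driven by the compactor plus a limit; a sketch (the exact options vary slightly between Loki versions):

# loki-config.yaml (excerpt)
compactor:
  retention_enabled: true
  delete_request_store: filesystem   # required by newer Loki versions when retention is enabled

limits_config:
  retention_period: 744h             # keep logs for 31 days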

Troubleshooting

No Logs in Loki

Problem: Logs not appearing in Loki

Solution:

  1. Verify collector is exporting to Loki: Check collector logs
  2. Check Loki is running: curl http://localhost:3100/ready
  3. Verify label configuration matches query
  4. Check time range in Grafana
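
To confirm whether logs are reaching the collector at all, you can temporarily add the debug exporter (called logging in older collector releases) to the pipeline:

exporters:
  debug:
    verbosity: detailed

service:
  pipelines:
    logs:
      receivers: [otlp]
      processors: [memory_limiter, resource, batch]
      exporters: [loki, debug]   # exported batches are also printed to the collector's stdout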

No Traces in Tempo

Problem: Traces not appearing in Tempo

Solution:

  1. Verify collector is exporting to Tempo
  2. Check Tempo is running: curl http://localhost:3200/ready
  3. Verify OTLP endpoint configuration
  4. Check trace sampling percentage

No Metrics in Mimir

Problem: Metrics not appearing in Mimir

Solution:

  1. Verify Prometheus Remote Write endpoint
  2. Check Mimir is running: curl http://localhost:9009/ready
  3. Verify metric format compatibility
  4. Check ingestion limits

High Resource Usage

Problem: Stack using too much memory/CPU

Solution:

  • Reduce retention periods
  • Enable sampling
  • Optimize label cardinality
  • Increase resource limits
  • Use object storage for long-term data
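
Memory caps can be set per service in docker-compose.yml; depending on your Compose version either deploy.resources.limits (shown below) or the older mem_limit key applies, and the values here are purely illustrative:

# docker-compose.yml (excerpt) - illustrative memory caps
services:
  loki:
    deploy:
      resources:
        limits:
          memory: 1G
  mimir:
    deploy:
      resources:
        limits:
          memory: 2G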

Use Cases

Comprehensive Observability

  • Single platform for metrics, logs, and traces
  • Correlated observability across all pillars
  • Unified dashboards and alerts

Cost-Effective Monitoring

  • Open-source solution
  • Efficient storage
  • No per-GB pricing
  • Self-hosted control

High-Volume Systems

  • Scalable architecture
  • Handles millions of metrics
  • Efficient log aggregation
  • Distributed trace storage

Multi-Tenant Environments

  • Tenant isolation
  • Resource quotas
  • Per-tenant dashboards
  • Access control
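
When multi-tenancy is enabled in Loki and Mimir (it is disabled in the single-tenant configs above), the collector must send a tenant ID with each request via the X-Scope-OrgID header; a sketch with a hypothetical tenant name:

exporters:
  loki:
    endpoint: http://loki:3100/loki/api/v1/push
    headers:
      X-Scope-OrgID: team-a        # hypothetical tenant ID
  prometheusremotewrite:
    endpoint: http://mimir:9009/api/v1/push
    headers:
      X-Scope-OrgID: team-a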

Further Reading