Observability

The Observability Service provides visibility into every action an Orbit Agent takes.
It captures logs, metrics, and traces across the platform using OpenTelemetry, giving teams a unified way to monitor performance, debug issues, and ensure compliance.

Why Observability Matters

- End-to-End Tracing – Follow a request from user input, through the lead agent, to sub-agents and integrations.
- Performance Insights – Identify slowdowns in agents, tools, or external integrations.
- Error Visibility – Surface failures, retries, and anomalies automatically.
- Governance & Compliance – Create audit trails of agent behavior and external API calls.
- Scalability – Rely on standardized telemetry that works across all business units (BUs) and hosting environments.

What We Collect

Traces
- Agent invocation start and end times
- Tool calls (with inputs/outputs redacted or masked for PII)
- Integration Hub calls (external API requests)
- Latency per span

Metrics
- Request counts per agent, tool, or integration
- Latency distributions
- Error rate (per agent, per org, per endpoint)
- Resource usage (CPU, memory, concurrency)

Logs
- Structured JSON logs for agent events
- Warnings, errors, retries
- Compliance or security flags
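
To make "structured JSON logs" concrete, here is a minimal sketch using only Python's standard logging module. The logger name orbit.agent, the JsonFormatter class, and the extra={"fields": ...} convention are assumptions made for illustration; they are not part of Orbit's documented API.

```python
import json
import logging
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object (illustrative only)."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "timestamp": datetime.fromtimestamp(record.created, tz=timezone.utc).isoformat(),
            "level": record.levelname,
            "message": record.getMessage(),
        }
        # Assumed convention: callers attach event fields via extra={"fields": {...}}.
        payload.update(getattr(record, "fields", {}))
        return json.dumps(payload)


def build_logger(name: str = "orbit.agent") -> logging.Logger:
    """Return a logger that writes one JSON object per line to stderr."""
    logger = logging.getLogger(name)
    if not logger.handlers:  # avoid attaching duplicate handlers on repeated calls
        handler = logging.StreamHandler()
        handler.setFormatter(JsonFormatter())
        logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    return logger


logger = build_logger()
logger.warning(
    "tool call retried",
    extra={"fields": {"agent.name": "billing-agent", "tool.name": "invoices.list", "retry": 1}},
)
```

Keeping one JSON object per line lets a log backend index fields such as agent.name without custom parsing.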

trace_id: "abc123"
spans:
  - name: "Agent Invocation"
    start_time: 2025-08-20T18:12:00Z
    end_time: 2025-08-20T18:12:02Z
    attributes:
      agent.name: "billing-agent"
      org.id: "clubwise"
  - name: "Tool Call - invoices.list"
    start_time: 2025-08-20T18:12:01Z
    end_time: 2025-08-20T18:12:01.500Z
    attributes:
      tool.name: "invoices.list"
      status: "success"
  - name: "IntegrationHub - Stripe API"
    start_time: 2025-08-20T18:12:01.600Z
    end_time: 2025-08-20T18:12:01.800Z
    attributes:
      integration: "stripe"
      http.status_code: 200

This example shows a single user request traced across:

- The billing-agent
- Its invoices.list tool
- The Stripe integration
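
As a rough illustration of "latency per span", the sketch below derives each span's duration from the start and end timestamps in the example trace. The helper names parse_utc and span_latencies_ms are hypothetical, and a real backend would normally compute these values for you.

```python
from datetime import datetime, timedelta

# Start/end timestamps copied from the example trace in this section.
SPANS = [
    ("Agent Invocation", "2025-08-20T18:12:00Z", "2025-08-20T18:12:02Z"),
    ("Tool Call - invoices.list", "2025-08-20T18:12:01Z", "2025-08-20T18:12:01.500Z"),
    ("IntegrationHub - Stripe API", "2025-08-20T18:12:01.600Z", "2025-08-20T18:12:01.800Z"),
]


def parse_utc(ts: str) -> datetime:
    # Older Python versions reject a trailing "Z", so swap it for an explicit offset.
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def span_latencies_ms(spans):
    """Map each span name to its duration in whole milliseconds."""
    return {
        name: (parse_utc(end) - parse_utc(start)) // timedelta(milliseconds=1)
        for name, start, end in spans
    }


latencies = span_latencies_ms(SPANS)
for name, ms in latencies.items():
    print(f"{name}: {ms} ms")
# Agent Invocation: 2000 ms
# Tool Call - invoices.list: 500 ms
# IntegrationHub - Stripe API: 200 ms
```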

Where the Data Goes

Orbit emits OpenTelemetry data that can be shipped to the backend of your choice.

Your team configures the destination in your own infrastructure. For development and testing, use Orbit Console's Observability Settings. For production, configure the destination directly in your hosted environment.

Security & Compliance

- Data Redaction – Sensitive fields (e.g., PII, payment data) are masked before export.
- Access Controls – Only authorized users can view telemetry streams.
- Audit Trails – All traces can be retained to meet Orbit Protocol compliance requirements.
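
The following sketch shows one way masking before export could work: a recursive pass that replaces the values of sensitive keys. It is not Orbit's actual redaction logic. The SENSITIVE_KEYS set, the MASK value, and the redact function are assumptions chosen for illustration.

```python
from typing import Any

# Hypothetical set of keys to mask; a real deployment would define its own list.
SENSITIVE_KEYS = {"email", "card_number", "ssn"}
MASK = "***"


def redact(value: Any) -> Any:
    """Return a copy of value with sensitive keys masked at any nesting depth."""
    if isinstance(value, dict):
        return {
            key: MASK if str(key).lower() in SENSITIVE_KEYS else redact(item)
            for key, item in value.items()
        }
    if isinstance(value, list):
        return [redact(item) for item in value]
    return value  # scalars pass through unchanged


span_attributes = {
    "tool.name": "invoices.list",
    "input": {"customer": {"email": "jane@example.com", "id": "cus_123"}},
}
print(redact(span_attributes))
```

Because redact builds a new structure instead of mutating its input, the unmasked data stays available to the agent while only the masked copy is handed to the exporter.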

Best Practices

- Trace Everything: Always instrument new agents and tools with OpenTelemetry (OTel) hooks.
- Use Attributes: Add meaningful metadata (agent.name, tool.name, org.id) for easier debugging.
- Centralize Storage: Ship telemetry to a single backend for cross-org visibility.
- Set Alerts: Monitor error rates and latency spikes in Grafana, Datadog, or CloudWatch.
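
In practice, alert rules live in your monitoring backend. The sketch below only illustrates the kind of check such a rule performs over one time window. The thresholds, the evaluate_alerts function, and the (latency_ms, succeeded) tuple shape are all assumptions for illustration.

```python
from statistics import quantiles

# Hypothetical thresholds; tune them to your own service-level objectives.
ERROR_RATE_THRESHOLD = 0.05       # alert above 5% failed requests
P95_LATENCY_THRESHOLD_MS = 1500   # alert when p95 latency exceeds 1.5 s


def evaluate_alerts(requests):
    """Return the names of alerts triggered by a window of (latency_ms, succeeded) tuples."""
    if not requests:
        return []
    alerts = []

    failures = sum(1 for _, succeeded in requests if not succeeded)
    if failures / len(requests) > ERROR_RATE_THRESHOLD:
        alerts.append("error_rate")

    latencies = [latency_ms for latency_ms, _ in requests]
    # quantiles(n=20) yields 19 cut points; the last one is the 95th percentile.
    p95 = quantiles(latencies, n=20)[-1] if len(latencies) > 1 else latencies[0]
    if p95 > P95_LATENCY_THRESHOLD_MS:
        alerts.append("p95_latency")

    return alerts


window = [(100, True)] * 18 + [(3000, False)] * 2
print(evaluate_alerts(window))
```

Alerting on a percentile instead of the mean keeps a handful of very slow requests visible even when most traffic is fast.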

Tip: Observability is not just for debugging; it also builds trust. Transparent traces and metrics reassure teams that Orbit Agents are behaving securely, efficiently, and in compliance with organizational policies.