Evaluations

Evaluation is essential for measuring the performance and reliability of your agents. Orbit provides tools for automated and manual evaluation.

Evaluation Methods

  • Automated tests: Use test suites to validate agent outputs and behaviors.
  • Human-in-the-loop: Collect user feedback and ratings for continuous improvement.
  • Metrics and logging: Track accuracy, latency, and usage statistics.
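As a rough illustration of the metrics bullet, the sketch below computes accuracy and mean latency over a small set of cases. It is not Orbit's API: `EvalCase`, `evaluate`, and `echo_agent` are hypothetical names, the agent is assumed to be any callable that takes a prompt string and returns a string, and "accuracy" here simply means the share of replies containing an expected keyword.

```python
import time
from dataclasses import dataclass
from statistics import mean
from typing import Callable


@dataclass
class EvalCase:
    prompt: str
    expected_keyword: str  # assumed pass criterion: keyword appears in the reply


def evaluate(agent: Callable[[str], str], cases: list[EvalCase]) -> dict[str, float]:
    """Run every case through the agent and report accuracy and mean latency."""
    if not cases:
        raise ValueError("at least one evaluation case is required")
    passed = 0
    latencies = []
    for case in cases:
        start = time.perf_counter()
        reply = agent(case.prompt)
        latencies.append(time.perf_counter() - start)
        if case.expected_keyword.lower() in reply.lower():
            passed += 1
    return {
        "accuracy": passed / len(cases),
        "mean_latency_s": mean(latencies),
    }


def echo_agent(prompt: str) -> str:
    # Stand-in for a real agent call; it only echoes the prompt back.
    return f"You asked: {prompt}"


cases = [
    EvalCase("What is the weather today?", "weather"),
    EvalCase("Book a flight to Paris", "hotel"),
]
report = evaluate(echo_agent, cases)
print(report)
```

With the echo stand-in, the first case passes and the second fails, so the reported accuracy is 0.5. A real run would pass the agent's own call in place of `echo_agent`.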

Example: Automated Test

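def test_agent_response():
    # my_agent is assumed to be an agent instance created elsewhere in the test module.
    response = my_agent.run("What is the weather today?")
    # Loose keyword check: the reply should at least mention the topic asked about.
    assert "weather" in response.lower()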
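A single keyword check is easy to satisfy by accident: a reply such as "I cannot help with the weather" would still pass. One possible extension, sketched here under assumptions, checks several prompts for phrases that must appear and phrases that must not. `CASES`, `find_failures`, and `fake_run` are hypothetical names, and `fake_run` stands in for the agent's `run` method so the sketch works offline.

```python
from typing import Callable

# (prompt, phrases that must appear, phrases that must not appear); illustrative only
CASES = [
    ("What is the weather today?", ["weather"], ["i cannot help"]),
    ("Summarize this ticket", ["summary"], ["error"]),
]


def find_failures(run: Callable[[str], str]) -> list[str]:
    """Return a readable message for every case whose reply breaks a rule."""
    failures = []
    for prompt, required, forbidden in CASES:
        reply = run(prompt).lower()
        missing = [p for p in required if p not in reply]
        present = [p for p in forbidden if p in reply]
        if missing or present:
            failures.append(f"{prompt!r}: missing={missing}, forbidden={present}")
    return failures


def fake_run(prompt: str) -> str:
    # Placeholder for the agent's run method; returns fixed text per prompt.
    replies = {
        "What is the weather today?": "The weather is sunny.",
        "Summarize this ticket": "Error: no ticket found.",
    }
    return replies[prompt]


failures = find_failures(fake_run)
for message in failures:
    print(message)
```

Inside a test function, `assert not failures, "\n".join(failures)` would report every failing prompt in one run instead of stopping at the first.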

Best Practices

  • Integrate evaluation into your CI/CD pipeline.
  • Use Orbit's observability tools to monitor agent quality.
  • Regularly review and update evaluation criteria.
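One way to wire evaluation into a pipeline is a quality gate: a script that turns evaluation outcomes into an exit code so the CI job fails when quality drops. The sketch below is an assumption about how such a gate might look, not an Orbit feature; `gate`, `min_pass_rate`, and the sample outcomes are all hypothetical.

```python
def gate(results: list[bool], min_pass_rate: float = 0.9) -> int:
    """Return a process exit code: 0 if the pass rate meets the threshold, else 1."""
    if not results:
        return 1  # treat an empty run as a failure rather than a silent pass
    pass_rate = sum(results) / len(results)
    print(f"pass rate: {pass_rate:.0%} (threshold {min_pass_rate:.0%})")
    return 0 if pass_rate >= min_pass_rate else 1


# Hypothetical outcomes; in practice these would come from the evaluation run.
outcomes = [True, True, True, False]
exit_code = gate(outcomes)
# A CI script would finish with sys.exit(exit_code) so a non-zero code fails the job.
```

Returning the code instead of exiting inside `gate` keeps the function easy to test; the threshold itself is a judgment call and belongs with the evaluation criteria that get reviewed over time.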

See the Orbit documentation for evaluation templates and advanced techniques.
