Evaluations
Evaluation is essential for measuring the performance and reliability of your agents. Orbit provides tools for both automated and manual evaluation.
Evaluation Methods
- Automated tests: Use test suites to validate agent outputs and behaviors.
- Human-in-the-loop: Collect user feedback and ratings for continuous improvement.
- Metrics and logging: Track accuracy, latency, and usage statistics.
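The human-in-the-loop bullet above can be sketched as a minimal feedback store. This is a hypothetical illustration, not an Orbit API: `FeedbackLog` and its methods are names invented here to show one way to collect and aggregate user ratings.

```python
from dataclasses import dataclass, field
from statistics import mean

@dataclass
class FeedbackLog:
    """Minimal in-memory store for human ratings of agent responses.

    Hypothetical helper for illustration; a real deployment would
    persist ratings to a database or Orbit's own tooling.
    """
    ratings: list = field(default_factory=list)

    def record(self, response: str, rating: int) -> None:
        # Clamp input to a 1-5 scale before storing.
        if not 1 <= rating <= 5:
            raise ValueError("rating must be between 1 and 5")
        self.ratings.append({"response": response, "rating": rating})

    def average_rating(self) -> float:
        # Aggregate score used to track quality over time.
        return mean(r["rating"] for r in self.ratings)

log = FeedbackLog()
log.record("It is sunny today.", 5)
log.record("I don't know.", 2)
print(log.average_rating())  # 3.5
```

Tracking an average like this over releases gives a simple continuous-improvement signal to pair with automated tests.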
Example: Automated Test
def test_agent_response():
    # `my_agent` is assumed to be an Orbit agent defined elsewhere.
    response = my_agent.run("What is the weather today?")
    assert "weather" in response.lower()
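The metrics-and-logging method can be sketched as a thin wrapper that times each agent call. This is a sketch under assumptions: `EchoAgent` is a stand-in for a real agent, and the only assumed interface is a `.run(prompt)` method as used in the test above.

```python
import logging
import time

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("agent-eval")

def timed_run(agent, prompt: str):
    """Run the agent and log latency alongside the prompt."""
    start = time.perf_counter()
    response = agent.run(prompt)
    latency_ms = (time.perf_counter() - start) * 1000
    # Structured log line; feed these into your metrics backend.
    logger.info("prompt=%r latency_ms=%.1f", prompt, latency_ms)
    return response, latency_ms

class EchoAgent:
    # Stand-in for a real agent so the sketch is runnable.
    def run(self, prompt: str) -> str:
        return f"Stub answer to: {prompt}"

response, latency_ms = timed_run(EchoAgent(), "What is the weather today?")
```

The same wrapper can accumulate accuracy and usage counts; the key design choice is measuring latency at the call boundary so every evaluation path is covered.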
Best Practices
- Integrate evaluation into your CI/CD pipeline.
- Use Orbit's observability tools to monitor agent quality.
- Regularly review and update evaluation criteria.
See the Orbit documentation for evaluation templates and advanced techniques.