Overview
You can configure your LLM applications on the Settings page to optimize your application’s performance and security.
- Evaluations: Enables Datadog to assess your LLM application on dimensions like Quality, Security, and Safety. By enabling evaluations, you can assess the effectiveness of your application’s responses and maintain high standards for both performance and user safety. For more information about evaluations, see Terms and Concepts.
- Topics: Helps identify irrelevant input for the topic relevancy out-of-the-box evaluation, ensuring your LLM application stays focused on its intended purpose.
Connect your account
Connect your OpenAI account to LLM Observability with your OpenAI API key. LLM Observability uses the GPT-4o mini model for Evaluations.
- In Datadog, navigate to LLM Observability > Settings > Integrations.
- Select Connect on the OpenAI tile.
- Follow the instructions on the tile.
- Provide your OpenAI API key. Ensure that this key has write permission for model capabilities.
- Enable Use this API key to evaluate your LLM applications.
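Before you provide the key, you can optionally confirm that it can call the GPT-4o mini model that evaluations use. The following is a minimal sketch, assuming the openai Python package (v1 or later) and a placeholder key; a successful completion suggests the key has the model-capabilities access that evaluations require.

```python
from openai import OpenAI

# Placeholder key: use the same key you plan to provide to Datadog.
client = OpenAI(api_key="sk-...")

# A minimal completion against the model used for evaluations.
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "ping"}],
    max_tokens=1,
)
print(response.choices[0].message.content)
```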
Connect your Azure OpenAI account to LLM Observability with your Azure OpenAI API key. We strongly recommend using the GPT-4o mini model for Evaluations.
- In Datadog, navigate to LLM Observability > Settings > Integrations.
- Select Connect on the Azure OpenAI tile.
- Follow the instructions on the tile.
- Provide your Azure OpenAI API key. Ensure that this key has write permission for model capabilities.
- Provide the Resource Name, Deployment ID, and API version to complete integration.
Connect your Anthropic account to LLM Observability with your Anthropic API key. LLM Observability uses the Haiku model for Evaluations.
- In Datadog, navigate to LLM Observability > Settings > Integrations.
- Select Connect on the Anthropic tile.
- Follow the instructions on the tile.
- Provide your Anthropic API key. Ensure that this key has write permission for model capabilities.
Connect your Amazon Bedrock account to LLM Observability with your AWS account. LLM Observability uses the Haiku model for Evaluations.
- In Datadog, navigate to LLM Observability > Settings > Integrations.
- Select Connect on the Amazon Bedrock tile.
- Follow the instructions on the tile.
Select and enable evaluations
- Navigate to LLM Observability > Settings > Evaluations.
- Click on the evaluation you want to enable.
- Configure an evaluation for all of your LLM applications by selecting Configure Evaluation, or select the edit icon to configure the evaluation for an individual LLM application.
- Evaluations can be disabled by selecting the disable icon for an individual LLM application.
- If you select Configure Evaluation, select the LLM application(s) you want to configure your evaluation for.
- Select OpenAI, Azure OpenAI, Anthropic, or Amazon Bedrock as your LLM provider.
- Select the account you want to run the evaluation on.
- Choose whether you want the evaluation to run on traces (the root span of each trace) or spans (which include LLM, Workflow, and Agent spans).
- If you choose to run the evaluation on spans, you must select at least one span name to save your configured evaluation.
- Select the span names you would like your evaluation to run on (optional if you selected traces).
- Optionally, specify the tags you want this evaluation to run on and choose whether to apply the evaluation to spans that match any of the selected tags (Any of), or all of the selected tags (All of).
- Select what percentage of spans you would like this evaluation to run on by configuring the sampling percentage. This number must be greater than 0 and less than or equal to 100. A Sampling Percentage of 100% means that the evaluation runs on all valid spans, whereas a sampling percentage of 50% means that the evaluation runs on 50% of valid spans.
After you click Save, LLM Observability uses the LLM account you connected to power the evaluation you enabled.
For more information about evaluations, see Terms and Concepts.
Estimated Token Usage
LLM Observability provides metrics to help you monitor and manage the token usage associated with evaluations. The following metrics track the LLM tokens consumed to run these evaluations:
- ml_obs.estimated_usage.llm.input.tokens
- ml_obs.estimated_usage.llm.output.tokens
- ml_obs.estimated_usage.llm.total.tokens
Each of these metrics has ml_app, model_server, model_provider, model_name, and evaluation_name tags, allowing you to pinpoint specific applications, models, and evaluations contributing to your usage.
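To track this usage outside the Datadog UI, you can query the same metrics through the Datadog API. The following is a minimal sketch, assuming the datadog-api-client Python package and DD_API_KEY/DD_APP_KEY set in the environment; it sums estimated evaluation token usage over the last 24 hours, broken down by evaluation and LLM application.

```python
import time

from datadog_api_client import ApiClient, Configuration
from datadog_api_client.v1.api.metrics_api import MetricsApi

# Assumes DD_API_KEY and DD_APP_KEY are set in the environment.
configuration = Configuration()

with ApiClient(configuration) as api_client:
    metrics_api = MetricsApi(api_client)

    now = int(time.time())
    # Sum estimated evaluation token usage over the last 24 hours,
    # broken down by evaluation and LLM application.
    response = metrics_api.query_metrics(
        _from=now - 86400,
        to=now,
        query="sum:ml_obs.estimated_usage.llm.total.tokens{*} by {evaluation_name,ml_app}",
    )

    for series in response.to_dict().get("series", []):
        last_point = series["pointlist"][-1] if series["pointlist"] else None
        print(series["scope"], last_point)
```

You can use the same query in a dashboard widget or monitor to alert on unexpected evaluation token usage.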
Provide topics for topic relevancy
Providing topics allows you to use the topic relevancy evaluation.
- Go to LLM Observability > Applications.
- Select the application you want to add topics for.
- At the bottom of the left sidebar, select Configuration.
- Add topics in the pop-up modal.
Topics can contain multiple words and should be as specific and descriptive as possible. For example, for an LLM application designed for incident management, add topics like “observability”, “software engineering”, or “incident resolution”. If your application handles customer inquiries for an e-commerce store, you could use “Customer questions about purchasing furniture on an e-commerce store”.
Further Reading