This tutorial walks you through the steps for enabling tracing on a sample Go application installed in a cluster on AWS Elastic Container Service (ECS) with Fargate. In this scenario, the Datadog Agent is also installed in the cluster.
For other scenarios, including the application and Agent on a host, the application in a container and Agent on a host, the application and Agent on cloud infrastructure, and applications written in other languages, see the other Enabling Tracing tutorials. Some of those tutorials, for example the ones using containers or EKS, step through the differences seen in Datadog between automatic and custom instrumentation. This tutorial skips right to a fully custom instrumented example.
This tutorial also uses intermediate-level AWS topics, so it requires that you have some familiarity with AWS networking and applications. If you’re not as familiar with AWS, and you are trying to learn the basics of Datadog APM setup, use one of the host or container tutorials instead.
This tutorial requires an AWS IAM user with the AdministratorAccess permission. Add the user’s profile to your local credentials file using its access key and secret access key. For more information, read Configuring the AWS SDK for Go V2.
Install the sample Go application
Next, install a sample application to trace. The code sample for this tutorial can be found at github.com/DataDog/apm-tutorial-golang.git. Clone the git repository by running:
git clone https://github.com/DataDog/apm-tutorial-golang.git
The repository contains a multi-service Go application pre-configured to run inside Docker containers. The docker-compose YAML files that build the containers are located in the docker directory. This tutorial uses the service-docker-compose-ECS.yaml file, which builds containers for the notes and calendar services that make up the sample application.
In addition, this tutorial uses several configuration files in the terraform/Fargate directory to create the environment to deploy the sample application to ECS with Fargate.
Initial ECS setup
The application requires some initial configuration, including adding your AWS profile (already configured with the correct permissions to create an ECS cluster and read from ECR), AWS region, and Amazon ECR repository.
Open terraform/Fargate/global_constants/variables.tf. Replace the variable values below with your correct AWS account information:
Your application (without tracing enabled) is containerized and available for ECS to pull.
Deploy the application
Start the application and send some requests without tracing. After you’ve seen how the application works, you’ll instrument it using the tracing library and Datadog Agent.
To start, use a Terraform script to deploy to Amazon ECS:
From the terraform/Fargate/deployment directory, run the following commands:
terraform init
terraform apply
terraform state show 'aws_alb.application_load_balancer'
Note: If the terraform apply command returns a CIDR block message, the script that obtains your IP address did not work on your local machine. To fix this, set the value manually in the terraform/Fargate/deployment/security.tf file. Inside the ingress block of the load_balancer_security_group, switch which cidr_blocks line is commented out and update the now-uncommented example line with your machine’s IPv4 address.
Make note of the DNS name of the load balancer. You’ll use that base domain in API calls to the sample app. Wait a few minutes for the instances to start up.
Open up another terminal and send API requests to exercise the app. The notes application is a REST API that stores data in an in-memory H2 database running on the same container. Send it a few commands:
curl -X GET 'BASE_DOMAIN:8080/notes'
[]
curl -X POST 'BASE_DOMAIN:8080/notes?desc=hello'
{"id":1,"description":"hello"}
curl -X GET 'BASE_DOMAIN:8080/notes/1'
{"id":1,"description":"hello"}
curl -X GET 'BASE_DOMAIN:8080/notes'
[{"id":1,"description":"hello"}]
curl -X PUT 'BASE_DOMAIN:8080/notes/1?desc=UpdatedNote'
{"id":1,"description":"UpdatedNote"}
curl -X GET 'BASE_DOMAIN:8080/notes'
[{"id":1,"description":"UpdatedNote"}]
curl -X POST 'BASE_DOMAIN:8080/notes?desc=NewestNote&add_date=y'
{"id":2,"description":"NewestNote with date 12/02/2022."}
This command calls both the notes and calendar services.
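The same sequence of requests can be scripted instead of typed by hand. The following is a minimal sketch in Go, assuming the load balancer DNS name is supplied through an environment variable named NOTES_BASE_DOMAIN (a hypothetical name chosen for this example; it is not part of the sample repository). The helper names notesURL and call are also illustrative.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/url"
	"os"
)

// notesURL builds a request URL for the sample notes API on port 8080.
// baseDomain is the load balancer DNS name; params become the query string.
func notesURL(baseDomain, path string, params map[string]string) string {
	query := url.Values{}
	for k, v := range params {
		query.Set(k, v)
	}
	u := url.URL{Scheme: "http", Host: baseDomain + ":8080", Path: path, RawQuery: query.Encode()}
	return u.String()
}

// call sends one request with an empty body and returns the response body.
func call(method, target string) (string, error) {
	req, err := http.NewRequest(method, target, nil)
	if err != nil {
		return "", err
	}
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	return string(body), err
}

func main() {
	// NOTES_BASE_DOMAIN is an assumption of this sketch, not a variable the tutorial defines.
	base := os.Getenv("NOTES_BASE_DOMAIN")
	if base == "" {
		fmt.Println("set NOTES_BASE_DOMAIN to the load balancer DNS name")
		return
	}
	steps := []struct{ method, target string }{
		{http.MethodGet, notesURL(base, "/notes", nil)},
		{http.MethodPost, notesURL(base, "/notes", map[string]string{"desc": "hello"})},
		{http.MethodPut, notesURL(base, "/notes/1", map[string]string{"desc": "UpdatedNote"})},
		{http.MethodPost, notesURL(base, "/notes", map[string]string{"desc": "NewestNote", "add_date": "y"})},
	}
	for _, s := range steps {
		body, err := call(s.method, s.target)
		if err != nil {
			fmt.Println(s.method, s.target, "failed:", err)
			continue
		}
		fmt.Println(s.method, s.target, "->", body)
	}
}
```

Query parameters are emitted in sorted key order, so the last request is sent as add_date=y&desc=NewestNote rather than in the order shown in the curl command; the service receives the same values either way.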
After you’ve seen the application running, run the following command to stop it and clean up the AWS resources so that you can enable tracing:
terraform destroy
Enable tracing
Next, configure the Go application to enable tracing.
To enable tracing support:
Uncomment the following imports in apm-tutorial-golang/cmd/notes/main.go:
The steps above enabled automatic tracing with fully supported libraries. In cases where code doesn’t fall under a supported library, you can create spans manually.
Open notes/notesController.go. This example already contains commented-out code that demonstrates the different ways to set up custom tracing on the code.
The makeSpanMiddleware function in notes/notesController.go generates middleware that wraps a request in a span with the supplied name. To use it, comment out the following lines:
notes/notesController.go
r.Get("/notes", nr.GetAllNotes) // GET /notes
r.Post("/notes", nr.CreateNote) // POST /notes
r.Get("/notes/{noteID}", nr.GetNoteByID) // GET /notes/123
r.Put("/notes/{noteID}", nr.UpdateNoteByID) // PUT /notes/123
r.Delete("/notes/{noteID}", nr.DeleteNoteByID) // DELETE /notes/123
Then uncomment the following lines:
notes/notesController.go
r.Get("/notes", makeSpanMiddleware("GetAllNotes", nr.GetAllNotes)) // GET /notes
r.Post("/notes", makeSpanMiddleware("CreateNote", nr.CreateNote)) // POST /notes
r.Get("/notes/{noteID}", makeSpanMiddleware("GetNote", nr.GetNoteByID)) // GET /notes/123
r.Put("/notes/{noteID}", makeSpanMiddleware("UpdateNote", nr.UpdateNoteByID)) // PUT /notes/123
r.Delete("/notes/{noteID}", makeSpanMiddleware("DeleteNote", nr.DeleteNoteByID)) // DELETE /notes/123
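To illustrate the wrapping pattern, here is a minimal, standard-library-only sketch of a handler middleware that records a named span around each request. It is not the source of makeSpanMiddleware and does not use dd-trace-go: the finishedSpan type, the recorded slice, and the spanMiddleware, listNotes, and serveOnce functions are all stand-ins invented for this example.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

// finishedSpan is a stand-in for a tracer span; it is NOT the dd-trace-go type.
type finishedSpan struct {
	Name     string
	Duration time.Duration
}

// recorded collects finished spans so the sketch can inspect them.
// It is not goroutine-safe, which is acceptable for a single-request demo.
var recorded []finishedSpan

// spanMiddleware wraps h so that every request is timed under the given name.
// It is a hypothetical analogue of makeSpanMiddleware, not its actual code.
func spanMiddleware(name string, h http.HandlerFunc) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		start := time.Now()
		// The deferred call records the span when the handler returns.
		defer func() {
			recorded = append(recorded, finishedSpan{Name: name, Duration: time.Since(start)})
		}()
		h(w, r)
	}
}

// listNotes is a placeholder handler that returns an empty JSON list.
func listNotes(w http.ResponseWriter, r *http.Request) {
	fmt.Fprint(w, "[]")
}

// serveOnce sends one in-memory request through the wrapped handler
// and returns the HTTP status code.
func serveOnce(name string, h http.HandlerFunc) int {
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodGet, "/notes", nil)
	spanMiddleware(name, h)(rec, req)
	return rec.Code
}

func main() {
	status := serveOnce("GetAllNotes", listNotes)
	fmt.Println("status:", status, "spans:", len(recorded), "first:", recorded[0].Name)
}
```

Because the wrapper has the same signature as the handler it wraps, it can be passed to the router in place of the original handler, which is the substitution the route lines above perform.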
Also remove the comment around the following import:
notes/notesController.go
"gopkg.in/DataDog/dd-trace-go.v1/ddtrace/tracer"
The doLongRunningProcess function creates child spans from a parent context. Remove the comments to enable it:
notes/notesHelper.go
func doLongRunningProcess(ctx context.Context) {
	childSpan, ctx := tracer.StartSpanFromContext(ctx, "traceMethod1")
	childSpan.SetTag(ext.ResourceName, "NotesHelper.doLongRunningProcess")
	defer childSpan.Finish()

	time.Sleep(300 * time.Millisecond)
	log.Println("Hello from the long running process in Notes")
	privateMethod1(ctx)
}
The privateMethod1 function demonstrates creating a completely separate service from a context. Remove the comments to enable it:
notes/notesHelper.go
func privateMethod1(ctx context.Context) {
	childSpan, _ := tracer.StartSpanFromContext(ctx, "manualSpan1",
		tracer.SpanType("web"),
		tracer.ServiceName("noteshelper"),
	)
	childSpan.SetTag(ext.ResourceName, "privateMethod1")
	defer childSpan.Finish()

	time.Sleep(30 * time.Millisecond)
	log.Println("Hello from the custom privateMethod1 in Notes")
}
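The parent-child relationship between these spans comes from passing the context along the call chain. The sketch below models that idea with the standard library only, assuming a simplified span type that is not the dd-trace-go implementation: the span struct, spanKey, startSpanFromContext, lineage, and demo are all hypothetical, and the root service name "notes" is an assumption.

```go
package main

import (
	"context"
	"fmt"
)

// span is a stand-in for a tracer span; it is NOT the dd-trace-go type.
type span struct {
	Name    string
	Service string
	Parent  *span
}

// spanKey is the private context key under which the active span is stored.
type spanKey struct{}

// startSpanFromContext creates a span whose parent is the span stored in ctx, if any.
// The child inherits the parent's service unless a non-empty service is given.
func startSpanFromContext(ctx context.Context, name, service string) (*span, context.Context) {
	s := &span{Name: name, Service: service}
	if parent, ok := ctx.Value(spanKey{}).(*span); ok {
		s.Parent = parent
		if s.Service == "" {
			s.Service = parent.Service
		}
	}
	return s, context.WithValue(ctx, spanKey{}, s)
}

// lineage lists "service/name" entries from the root span down to s.
func lineage(s *span) string {
	if s.Parent == nil {
		return s.Service + "/" + s.Name
	}
	return lineage(s.Parent) + " > " + s.Service + "/" + s.Name
}

// demo mirrors the call chain: request span, then doLongRunningProcess,
// then privateMethod1 with its own service name.
func demo() string {
	_, ctx := startSpanFromContext(context.Background(), "CreateNote", "notes")
	_, ctx = startSpanFromContext(ctx, "traceMethod1", "")
	leaf, _ := startSpanFromContext(ctx, "manualSpan1", "noteshelper")
	return lineage(leaf)
}

func main() {
	fmt.Println(demo())
}
```

In this model, discarding the returned context (as privateMethod1 does with `_`) means no further spans can be parented under manualSpan1, which is fine for a leaf function.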
Open terraform/Fargate/deployment/main.tf. The sample app already has the base configurations necessary to run the Datadog Agent on ECS Fargate and collect traces: the API key (which you configure in the next step), enabling ECS Fargate, and enabling APM. The definition is provided in both the notes task and the calendar task.
Provide the API key variable with a value. Open terraform/Fargate/global_constants/variables.tf, uncomment the output "datadog_api_key" section, and provide your organization’s Datadog API key.
Unified Service Tags identify traced services across different versions and deployment environments so that they can be correlated within Datadog, and so you can use them to search and filter. The three environment variables used for Unified Service Tagging are DD_SERVICE, DD_ENV, and DD_VERSION. For applications deployed on ECS, these environment variables are set within the task definition for the containers.
For this tutorial, the /terraform/Fargate/deployment/main.tf file already has these environment variables defined for the notes and calendar applications. For example, for notes:
You can also see that Docker labels for the same Unified Service Tags service, env, and version values are set. This also lets you collect Docker metrics once your application is running.
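To check which tag values a container actually received, a process can read the three variables from its environment. The following is a small sketch of that check; the serviceTags type and the helper functions are hypothetical, and the values "dev" and "0.0.1" are placeholders rather than the values used in main.tf.

```go
package main

import (
	"fmt"
	"os"
)

// serviceTags holds the three Unified Service Tagging values.
type serviceTags struct {
	Service string
	Env     string
	Version string
}

// tagsFromEnv reads DD_SERVICE, DD_ENV, and DD_VERSION from the process environment.
// Variables that are not set are left as empty strings.
func tagsFromEnv() serviceTags {
	return serviceTags{
		Service: os.Getenv("DD_SERVICE"),
		Env:     os.Getenv("DD_ENV"),
		Version: os.Getenv("DD_VERSION"),
	}
}

// String renders the tags in key:value form.
func (t serviceTags) String() string {
	return fmt.Sprintf("service:%s env:%s version:%s", t.Service, t.Env, t.Version)
}

// setDemoEnv sets illustrative values so the sketch runs outside ECS.
// "dev" and "0.0.1" are assumptions, not the tutorial's values.
func setDemoEnv() {
	os.Setenv("DD_SERVICE", "notes")
	os.Setenv("DD_ENV", "dev")
	os.Setenv("DD_VERSION", "0.0.1")
}

func main() {
	setDemoEnv()
	fmt.Println(tagsFromEnv())
}
```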
Your multi-service application with tracing enabled is containerized and available for ECS to pull.
Launch the app to see traces
Redeploy the application and exercise the API:
Redeploy the application to Amazon ECS using the same terraform commands as before, but with the instrumented version of the configuration files. From the terraform/Fargate/deployment directory, run the following commands:
terraform init
terraform apply
terraform state show 'aws_alb.application_load_balancer'
Make note of the DNS name of the load balancer. You’ll use that base domain in API calls to the sample app.
Wait a few minutes for the instances to start up and for the application containers to be ready. Run some curl commands to exercise the instrumented app:
curl -X GET 'BASE_DOMAIN:8080/notes'
[]
curl -X POST 'BASE_DOMAIN:8080/notes?desc=hello'
{"id":1,"description":"hello"}
curl -X GET 'BASE_DOMAIN:8080/notes/1'
{"id":1,"description":"hello"}
curl -X GET 'BASE_DOMAIN:8080/notes'
[{"id":1,"description":"hello"}]
curl -X PUT 'BASE_DOMAIN:8080/notes/1?desc=UpdatedNote'
{"id":1,"description":"UpdatedNote"}
curl -X GET 'BASE_DOMAIN:8080/notes'
[{"id":1,"description":"UpdatedNote"}]
curl -X POST 'BASE_DOMAIN:8080/notes?desc=NewestNote&add_date=y'
{"id":2,"description":"NewestNote with date 12/02/2022."}
This command calls both the notes and calendar services.
Wait a few moments, and take a look at your Datadog UI. Navigate to APM > Traces. The Traces list shows something like this:
There are entries for the database (db) and the notes app. The traces list shows all the spans, when they started, what resource was tracked with the span, and how long it took.
If you don’t see traces, clear any filter in the Traces Search field (sometimes it filters on an environment variable such as ENV that you aren’t using).
Examine a trace
On the Traces page, click a POST /notes trace to see a flame graph that shows how long each span took and what other spans occurred before a span completed. The bar at the top of the graph is the span you selected on the previous screen (in this case, the initial entry point into the notes application).
The width of a bar indicates how long it took to complete. A bar at a lower depth represents a span that completes during the lifetime of a bar at a higher depth.
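The relationship between width, depth, and duration can be made concrete with a text rendering. The sketch below assumes a simplified span record with made-up timings; the span type, depth, render, and sampleTrace are illustrative and do not reflect how Datadog stores or draws traces.

```go
package main

import (
	"fmt"
	"strings"
)

// span is a simplified record; times are milliseconds from the start of the trace.
type span struct {
	Name    string
	Parent  int // index of the parent span, or -1 for the root
	StartMs int
	EndMs   int
}

// depth counts how many ancestors a span has (the root has depth 0).
func depth(spans []span, i int) int {
	d := 0
	for p := spans[i].Parent; p >= 0; p = spans[p].Parent {
		d++
	}
	return d
}

// render draws one row per span: the depth, then a bar that is offset by the
// span's start time and as wide as its duration, at one column per msPerCol ms.
func render(spans []span, msPerCol int) string {
	var b strings.Builder
	for i, s := range spans {
		offset := strings.Repeat(" ", s.StartMs/msPerCol)
		bar := strings.Repeat("#", (s.EndMs-s.StartMs)/msPerCol)
		fmt.Fprintf(&b, "%d %s%s %s\n", depth(spans, i), offset, bar, s.Name)
	}
	return b.String()
}

// sampleTrace returns spans with invented timings, loosely modeled on the
// span names used in this tutorial.
func sampleTrace() []span {
	return []span{
		{Name: "POST /notes", Parent: -1, StartMs: 0, EndMs: 400},
		{Name: "CreateNote", Parent: 0, StartMs: 20, EndMs: 380},
		{Name: "traceMethod1", Parent: 1, StartMs: 40, EndMs: 380},
		{Name: "manualSpan1", Parent: 2, StartMs: 340, EndMs: 380},
	}
}

func main() {
	fmt.Print(render(sampleTrace(), 20))
}
```

Each child bar starts no earlier and ends no later than its parent's bar, which is the containment the flame graph shows visually.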
The flame graph for a POST trace looks something like this:
Tracing a single application is a great start, but the real value in tracing is seeing how requests flow through your services. This is called distributed tracing. Click the trace for the last API call, the one that added a date to the note, to see a distributed trace between the two services:
This flame graph combines interactions from multiple applications:
The first span is a POST request sent by the user and handled by the chi router through the supported go-chi library.
The second span is the CreateNote function, which was manually traced by the makeSpanMiddleware function. The function created a span from the context of the HTTP request.
The next span is the request sent by the notes application using the supported http library and the client initialized in the main.go file. This GET request is sent to the calendar application. The calendar application spans appear in blue because they belong to a separate service.
Inside the calendar application, a go-chi router handles the GET request and the GetDate function is manually traced with its own span under the GET request.
Finally, the purple db call is its own service from the supported sql library. It appears at the same level as the GET /Calendar request because they are both called by the parent span CreateNote.
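Spans from two services end up in one trace because the calling service passes trace context along with the HTTP request. The sketch below shows the idea in a heavily simplified form using only the standard library: it carries a trace ID in the x-datadog-trace-id header, which is one of the headers Datadog tracers use, and ignores the other propagation headers. The handler, the /calendar path, and the helper names are assumptions for illustration; the instrumented http client in the sample app does this work automatically.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// traceHeader carries the trace ID between services in this simplified model.
const traceHeader = "x-datadog-trace-id"

// inject stands in for the notes service's instrumented client:
// it copies the current trace ID onto the outbound request.
func inject(req *http.Request, traceID string) {
	req.Header.Set(traceHeader, traceID)
}

// calendarHandler stands in for the calendar service: it extracts the trace ID
// from the inbound request so its spans can join the caller's trace.
func calendarHandler(w http.ResponseWriter, r *http.Request) {
	fmt.Fprintf(w, "calendar saw trace %s", r.Header.Get(traceHeader))
}

// roundTrip builds an in-memory request, injects the trace ID, and passes the
// request directly to the calendar handler without opening a network socket.
func roundTrip(traceID string) string {
	req := httptest.NewRequest(http.MethodGet, "/calendar", nil)
	inject(req, traceID)
	rec := httptest.NewRecorder()
	calendarHandler(rec, req)
	return rec.Body.String()
}

func main() {
	fmt.Println(roundTrip("1234567890"))
}
```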
When you’re done exploring, clean up all resources and delete the deployments:
terraform destroy
Troubleshooting
If you’re not receiving traces as expected, set up debug mode for the Go tracer. Read Enable debug mode to find out more.