Log Collection Troubleshooting Guide

Several common issues can prevent new logs from reaching Datadog through the log collector in the dd-agent. If you have trouble sending new logs to Datadog, work through this list to troubleshoot. If the problem persists, contact Datadog support for further assistance.

Restart the Agent

Changes to the datadog-agent configuration take effect only after you restart the Agent.

Outbound traffic on port 10516 is blocked

The Datadog Agent sends its logs to Datadog over TCP using port 10516. If that connection is unavailable, logs are not sent and a corresponding error is recorded in the agent.log file.

You can manually test your connection using OpenSSL, GnuTLS, or another SSL/TLS client. For OpenSSL, run the following command:

openssl s_client -connect intake.logs.datadoghq.com:10516

For GnuTLS, run the following command:

gnutls-cli intake.logs.datadoghq.com:10516

Once the connection is open, send a log like the following:

<API_KEY> this is a test message

If opening port 10516 is not an option, you can configure the Datadog Agent to send logs over HTTPS by adding the following to datadog.yaml:

logs_config:
  force_use_http: true

See the HTTPS log forwarding section for more information.
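
If OpenSSL and GnuTLS are not available on the host, a short script can give a first indication of whether the port is blocked. The following Python sketch is an illustration only and is not part of the Agent. The function name port_is_reachable is hypothetical, and the check only confirms that a TCP connection can be opened; it does not perform the TLS handshake or send a log.

```python
import socket

def port_is_reachable(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the host name and connects,
        # raising OSError on DNS failure, refusal, or timeout.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Example, using the endpoint from the commands above:
# port_is_reachable("intake.logs.datadoghq.com", 10516)
```

A False result suggests that a firewall or proxy is blocking outbound traffic on the port. A True result only shows that the TCP connection opens, so the OpenSSL or GnuTLS test remains the more complete check.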

Check the status of the Agent

The output of the Agent status command often shows what is going wrong, so check it early when troubleshooting.

No new logs have been written

The Datadog Agent only collects logs that are written after it has started trying to collect them (whether by tailing or by listening for them). To confirm that log collection has been set up successfully, make sure that new logs have been written.
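
One way to check this for a tailed file is to compare its last-modified time with the time the Agent started. The Python sketch below is a minimal illustration under stated assumptions: the function name written_since is hypothetical, you supply the Agent start time yourself as seconds since the epoch, and the file's modification time is only a proxy for "a new log line was written".

```python
import os

def written_since(path, since_epoch):
    """Return True if the file was last modified after since_epoch (seconds)."""
    # getmtime returns the last modification time as seconds since the epoch.
    return os.path.getmtime(path) > since_epoch

# Example (hypothetical path and start time):
# written_since("/var/log/application/error.log", 1700000000)
```

If the function returns False, the application has not written to the file since the time you passed in, so the Agent has had nothing new to collect from it.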

Permission issues tailing log files

The Datadog Agent does not run as root (and running as root is not recommended, as a general best practice). When you configure your Agent to tail log files for custom logs or for integrations, you need to take special care to ensure the Agent user has the correct access to the log files.

The default Agent user per operating system:

Operating system    Default Agent user
Linux               datadog-agent
MacOS               datadog-agent
Windows             ddagentuser

If the Agent does not have the correct permissions, you might see one of the following error messages when checking the Agent status:

  • The file does not exist.
  • Access is denied.
  • Could not find any file matching pattern <path/to/filename>, check that all its subdirectories are executable.

To fix the error, give the Datadog Agent user read and execute permissions to the log file and subdirectories.

On Linux:

  1. Run the namei command to obtain more information about the file permissions:

    > namei -m /path/to/log/file
    

    In the following example, the Agent user does not have execute permissions on the application directory or read permissions on the error.log file.

    > namei -m /var/log/application/error.log
    f: /var/log/application/error.log
    drwxr-xr-x /
    drwxr-xr-x var
    drwxrwxr-x log
    drw-r--r-- application
    -rw-r----- error.log
    
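  2. Make the logs folder and its children readable. The -R flag applies the change recursively, and the capital X adds execute permission only to directories and to files that are already executable:

    sudo chmod -R o+rX /path/to/logs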
    

Note: Make sure that these permissions are also set in your log rotation configuration. Otherwise, the Datadog Agent might lose its read permissions on the next log rotation. Set permissions to 644 in the log rotation configuration so that the Agent keeps read access to the files.
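
The namei check above can also be scripted. The following Python sketch is an illustration only, not a Datadog tool, and the function name find_permission_problems is hypothetical. It assumes a Linux host, and it assumes the Agent user is neither the owner nor in the group of any path component, so only the "other" permission bits are checked. ACLs are ignored, and the script must run as a user that can stat every component.

```python
import os
import stat

def find_permission_problems(path):
    """Return (component, missing_permission) pairs along the path to a log file.

    Assumption: only the "other" permission bits apply to the Agent user.
    """
    path = os.path.abspath(path)
    # Collect the path and every parent directory up to the filesystem root.
    components = [path]
    while os.path.dirname(components[-1]) != components[-1]:
        components.append(os.path.dirname(components[-1]))

    problems = []
    for component in reversed(components):  # root first, like namei
        mode = os.stat(component).st_mode
        # Every directory on the way must be traversable (execute bit).
        if stat.S_ISDIR(mode) and not mode & stat.S_IXOTH:
            problems.append((component, "execute"))
        # The target itself must be readable.
        if component == path and not mode & stat.S_IROTH:
            problems.append((component, "read"))
    return problems

# Example: find_permission_problems("/var/log/application/error.log")
```

For the example directory listing shown earlier, this would report the application directory as missing execute permission and error.log as missing read permission.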

On Windows, using icacls:

  1. Use the icacls command on the log folder to obtain more information about the file permissions:

    icacls path/to/logs/file /t
    

    The /t flag runs the command recursively on files and sub-folders.

    In the following example, the test directory and its children are not accessible to ddagentuser:

    PS C:\Users\Administrator> icacls C:\test\ /t
    C:\test\ NT AUTHORITY\SYSTEM:(OI)(CI)(F)
           BUILTIN\Administrators:(OI)(CI)(F)
           CREATOR OWNER:(OI)(CI)(IO)(F)
    
    C:\test\file.log NT AUTHORITY\SYSTEM:(F)
           BUILTIN\Administrators:(F)
    
    C:\test\file2.log NT AUTHORITY\SYSTEM:(F)
           BUILTIN\Administrators:(F)
    
  2. Use the icacls command to grant ddagentuser the required permissions (include the quotes):

    icacls "path\to\folder" /grant "ddagentuser:(OI)(CI)(RX)" /t
    

    If the application uses log rotation, the (OI) and (CI) inheritance rights ensure that any future log files created in the directory inherit the parent folder permissions.

  3. Run icacls again to check that ddagentuser has the correct permissions:

    icacls path/to/logs/file /t
    

    In the following example, ddagentuser is listed in the file permissions:

    PS C:\Users\Administrator> icacls C:\test\ /t
    C:\test\ EC2-ABCD\ddagentuser:(OI)(CI)(RX)
           NT AUTHORITY\SYSTEM:(OI)(CI)(F)
           BUILTIN\Administrators:(OI)(CI)(F)
           CREATOR OWNER:(OI)(CI)(IO)(F)
    
    C:\test\file.log NT AUTHORITY\SYSTEM:(F)
                   BUILTIN\Administrators:(F)
                   EC2-ABCD\ddagentuser:(RX)
    
    C:\test\file2.log NT AUTHORITY\SYSTEM:(F)
                   BUILTIN\Administrators:(F)
                   EC2-ABCD\ddagentuser:(RX)
    Successfully processed 3 files; Failed processing 0 files
    
  4. Restart the Agent service and check the status to see if the problem is resolved:

    & "$env:ProgramFiles\Datadog\Datadog Agent\bin\agent.exe" restart-service
    & "$env:ProgramFiles\Datadog\Datadog Agent\bin\agent.exe" status
    

On Windows, using PowerShell:

  1. Retrieve the ACL permissions for the file:

    PS C:\Users\Administrator> get-acl C:\app\logs | fl
    
    Path   : Microsoft.PowerShell.Core\FileSystem::C:\app\logs
    Owner  : BUILTIN\Administrators
    Group  : EC2-ABCD\None
    Access : NT AUTHORITY\SYSTEM Allow  FullControl
             BUILTIN\Administrators Allow  FullControl
    ...
    

    In this example, the application directory is not executable by the Agent.

  2. Run this PowerShell script to give read and execute privileges to ddagentuser:

    $acl = Get-Acl <path\to\logs\folder>
    $AccessRule = New-Object System.Security.AccessControl.FileSystemAccessRule("ddagentuser","ReadAndExecute","Allow")
    $acl.SetAccessRule($AccessRule)
    $acl | Set-Acl <path\to\logs\folder>
    
  3. Retrieve the ACL permissions for the file again to check if ddagentuser has the correct permissions:

    PS C:\Users\Administrator> get-acl C:\app\logs | fl
    Path   : Microsoft.PowerShell.Core\FileSystem::C:\app\logs
    Owner  : BUILTIN\Administrators
    Group  : EC2-ABCD\None
    Access : EC2-ABCD\ddagentuser Allow  ReadAndExecute, Synchronize
             NT AUTHORITY\SYSTEM Allow  FullControl
             BUILTIN\Administrators Allow  FullControl
    ...
    
  4. Restart the Agent service and check the status to see if the problem is resolved:

    & "$env:ProgramFiles\Datadog\Datadog Agent\bin\agent.exe" restart-service
    & "$env:ProgramFiles\Datadog\Datadog Agent\bin\agent.exe" status
    

Permission issue and Journald

When collecting logs from Journald, make sure that the Datadog Agent user has been added to the systemd group, as shown in the Journald integration.

Note: Journald sends an empty payload if the file permissions are incorrect. Accordingly, it is not possible to raise or send an explicit error message in this case.

Configuration issues

These are a few of the common configuration issues that are worth triple-checking in your datadog-agent setup:

  1. Check if the api_key is defined in datadog.yaml.

  2. Check if you have logs_enabled: true in your datadog.yaml.

  3. By default, the Agent does not collect any logs. Make sure there is at least one .yaml file in the Agent’s conf.d/ directory that includes a logs section and the appropriate values.

  4. You may have some .yaml parsing errors in your configuration files. YAML can be finicky, so when in doubt rely on a YAML validator.
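
The first two checks in this list can be approximated with a small script. The Python sketch below is a rough illustration, and the function name check_datadog_yaml is hypothetical. It is a line-based scan and not a YAML parser: it only looks for uncommented, top-level api_key and logs_enabled keys, and it assumes the value is written as lowercase true as shown above. It does not replace a YAML validator.

```python
import re

def check_datadog_yaml(text):
    """Return a list of likely problems found in the text of datadog.yaml."""
    problems = []
    # api_key must be uncommented, at the top level, and have a value.
    if not re.search(r"^api_key:[ \t]*[^\s#]", text, flags=re.MULTILINE):
        problems.append("api_key is not defined")
    # logs_enabled must be set to true (a trailing comment is allowed).
    if not re.search(r"^logs_enabled:[ \t]*true[ \t]*(#.*)?$", text, flags=re.MULTILINE):
        problems.append("logs_enabled is not set to true")
    return problems

# Example (the path to datadog.yaml depends on your installation):
# with open("datadog.yaml") as f:
#     print(check_datadog_yaml(f.read()))
```

An empty list means both keys were found; any other result names the setting to review.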

Check for errors in the Agent logs

There might be an error in the logs that would explain the issue. Run the following command to check for errors:

sudo grep -i error /var/log/datadog/agent.log

Docker environment

See the Docker Log Collection Troubleshooting Guide.

Serverless environment

See the Lambda Log Collection Troubleshooting Guide.

Unexpectedly dropping logs

Check if logs appear in the Datadog Live Tail.

If they appear in the Live Tail, check the Indexes configuration page for any exclusion filters that could match your logs. If they do not appear in the Live Tail, they might have been dropped because their timestamp was more than 18 hours in the past. To see which service and source are affected, check the datadog.estimated_usage.logs.drop_count metric.
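
To see whether a given log would fall outside that window, you can compare its timestamp with the current time. The Python sketch below is a minimal illustration, with the hypothetical names MAX_AGE and is_older_than_limit. It assumes timezone-aware datetime values and takes the 18-hour limit from the paragraph above.

```python
from datetime import datetime, timedelta, timezone

MAX_AGE = timedelta(hours=18)  # limit stated in the text above

def is_older_than_limit(log_time, now=None):
    """Return True if log_time is more than 18 hours before now.

    Both values must be timezone-aware datetimes.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    return now - log_time > MAX_AGE

# Example:
# is_older_than_limit(datetime(2024, 1, 1, 8, 0, tzinfo=timezone.utc))
```

A common cause of old timestamps is a timestamp parsed without its time zone, so it is worth checking how the timestamp is produced before assuming the log itself is stale.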

Truncated logs

Logs larger than 1MB are truncated. To see which service and source are affected, check the datadog.estimated_usage.logs.truncated_count and datadog.estimated_usage.logs.truncated_bytes metrics.
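
To check on the sending side whether a message is likely to be truncated, measure its encoded size rather than its character count. The Python sketch below is an illustration only. The function name exceeds_size_limit is hypothetical, and both the default limit of 1,000,000 bytes and the use of UTF-8 encoding are assumptions, because the text above states only "1MB".

```python
def exceeds_size_limit(message, limit_bytes=1_000_000):
    """Return True if the UTF-8 encoded message is larger than limit_bytes.

    Assumption: 1MB is treated as 1,000,000 bytes; adjust limit_bytes if needed.
    """
    # Non-ASCII characters take more than one byte each in UTF-8.
    return len(message.encode("utf-8")) > limit_bytes

# Example:
# exceeds_size_limit("x" * 2_000_000)
```

Measuring bytes matters because a message with many non-ASCII characters can exceed the limit even when its character count is below it.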
