Getting Started with Notebooks Analysis Features
Join the Preview!
Advanced Analysis is in Preview. To enable, reach out to your Customer Success Manager.
Overview
This example analysis notebook has:
- Three data sources:
  - `trade_start_logs`
  - `trade_execution_logs`
  - `trading_platform_users`
- Three derived datasets, which are the results of transforming the data sources through filtering, grouping, or SQL queries:
  - `parsed_execution_logs`
  - `transaction_record`
  - `transaction_record_with_names`
- One treemap visualization.
This diagram shows the different transformation and analysis cells the data sources go through.
Example walkthrough
The example starts with two log data sources:
- `trade_start_logs`
- `trade_execution_logs`
The next cell in the analysis notebook is the transform cell `parsed_execution_logs`. It uses the following [grok parsing syntax][3] to extract the transaction ID from the `message` column of the `trade_execution_logs` dataset and adds it to a new column called `transaction_id`.
transaction %{notSpace:transaction_id}
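As a point of comparison, the same extraction can be sketched as a plain SQL query. The query below is illustrative only: it assumes a PostgreSQL-style `substring(... from ...)` regular-expression function rather than the notebook's grok cell.

```sql
-- Illustrative only: a regular-expression equivalent of the grok rule above.
-- %{notSpace:transaction_id} captures the run of non-whitespace characters
-- that follows the literal word "transaction" in the message column.
SELECT
  timestamp,
  host,
  message,
  substring(message from 'transaction (\S+)') AS transaction_id
FROM trade_execution_logs;
```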
An example of the resulting `parsed_execution_logs` dataset:
timestamp | host | message | transaction_id |
---|---|---|---|
May 29 11:09:28.000 | shopist.internal | Executing trade for transaction 56519 | 56519 |
May 29 10:59:29.000 | shopist.internal | Executing trade for transaction 23269 | 23269 |
May 29 10:58:54.000 | shopist.internal | Executing trade for transaction 96870 | 96870 |
May 31 12:20:01.152 | shopist.internal | Executing trade for transaction 80207 | 80207 |
The analysis cell `transaction_record` uses the following SQL query to select specific columns from the `trade_start_logs` and `trade_execution_logs` datasets, rename the `INFO` status to `OK`, and join the two datasets.
SELECT
start_logs.timestamp,
start_logs.customer_id,
start_logs.transaction_id,
start_logs.dollar_value,
CASE
WHEN executed_logs.status = 'INFO' THEN 'OK'
ELSE executed_logs.status
END AS status
FROM
trade_start_logs AS start_logs
JOIN
trade_execution_logs AS executed_logs
ON
start_logs.transaction_id = executed_logs.transaction_id;
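Because this is an inner join, only transactions that appear in both `trade_start_logs` and `trade_execution_logs` end up in `transaction_record`.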
An example of the resulting `transaction_record` dataset:
timestamp | customer_id | transaction_id | dollar_value | status |
---|---|---|---|---|
May 29 11:09:28.000 | 92446 | 085cc56c-a54f | 838.32 | OK |
May 29 10:59:29.000 | 78037 | b1fad476-fd4f | 479.96 | OK |
May 29 10:58:54.000 | 47694 | cb23d1a7-c0cb | 703.71 | OK |
May 31 12:20:01.152 | 80207 | 2c75b835-4194 | 386.21 | ERROR |
Then the reference table `trading_platform_users` is added as a data source:
customer_name | customer_id | account_status |
---|---|---|
Meghan Key | 92446 | verified |
Anthony Gill | 78037 | verified |
Tanya Mejia | 47694 | verified |
Michael Kaiser | 80207 | fraudulent |
The analysis cell `transaction_record_with_names` runs the following SQL query to join the `transaction_record` dataset with `trading_platform_users`, appending the customer name and account status as columns:
SELECT tr.timestamp, tr.customer_id, tpu.customer_name, tpu.account_status, tr.transaction_id, tr.dollar_value, tr.status
FROM transaction_record AS tr
LEFT JOIN trading_platform_users AS tpu ON tr.customer_id = tpu.customer_id;
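The `LEFT JOIN` keeps every row of `transaction_record`, even when a `customer_id` has no match in the reference table; in that case, `customer_name` and `account_status` are `NULL`.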
An example of the resulting `transaction_record_with_names` dataset:
timestamp | customer_id | customer_name | account_status | transaction_id | dollar_value | status |
---|---|---|---|---|---|---|
May 29 11:09:28.000 | 92446 | Meghan Key | verified | 085cc56c-a54f | 838.32 | OK |
May 29 10:59:29.000 | 78037 | Anthony Gill | verified | b1fad476-fd4f | 479.96 | OK |
May 29 10:58:54.000 | 47694 | Tanya Mejia | verified | cb23d1a7-c0cb | 703.71 | OK |
May 31 12:20:01.152 | 80207 | Michael Kaiser | fraudulent | 2c75b835-4194 | 386.21 | ERROR |
Finally, a treemap visualization cell is created with the `transaction_record_with_names` dataset, filtered for `status:error` logs and grouped by `dollar_value`, `account_status`, and `customer_name`.
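For reference, the selection behind that treemap could be sketched in SQL as follows. This is one plausible reading of the grouping (summing the traded dollar value per account status and customer), not the visualization cell's actual configuration, and it assumes a PostgreSQL-style dialect.

```sql
-- Illustrative only: the rows and grouping the treemap is built from.
-- Filter to failed trades, then aggregate the traded dollar value
-- per account status and customer.
SELECT
  account_status,
  customer_name,
  SUM(dollar_value) AS dollar_value
FROM transaction_record_with_names
WHERE status = 'ERROR'
GROUP BY account_status, customer_name;
```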
Further reading