2017-10-15

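Promoting properties from a record to the top level of a Stream Analytics query result

I have a series of dynamic data being processed by a Stream Analytics job. A few uniform properties can be queried explicitly, but most of the payload is of unknown type at query time. My goal is to take this unknown data (a record) and promote all of its properties to top-level fields in the query result that gets written to an Azure Table.

I am able to flatten the properties of the record, but the result is always added to the query as a child object. GetRecordProperties() doesn't help, because I don't want a separate record returned for each property.

My query looks like this: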

WITH 
[custom_events_temp] AS 
(
    SELECT 
     [magellan].[context].[data].[eventTime] as [event_time], 
     [flat_event].ArrayValue.name as [event_name], 
     udf.FlattenCustomDimensions([magellan].[context].[custom].[dimensions]) as [flat_custom_dim] 
    FROM [Magellan--AI-CustomEvents] magellan 
    TIMESTAMP BY [magellan].[context].[data].[eventTime] 
    CROSS APPLY GetElements([magellan].[event]) as [flat_event] 
), 
-- create table with extracted webhook data 
[all_webhooks] AS 
(
    SELECT 
     [flat_custom_dim].[hook_event_source] as PartitionKey, 
     udf.CreateGuid('') as RowKey, 
     -- event data 
     [custom_events_temp].[event_time], 
     [custom_events_temp].[flat_custom_dim].[hook_event_name] as [event_name], 
     -- webhook payload data   
     udf.FlattenWebhookPayload(udf.ExtractJsonWebhookPayload([custom_events_temp].[flat_custom_dim].[webhook_payload])) AS [payload] 
    FROM [custom_events_temp] 
) 
SELECT * INTO [TrashTableOut] FROM [all_webhooks] 

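The resulting record I get looks like this. The idea is to un-nest everything in the nested object so that each property gets its own column in the Azure Table.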

{ 
    "partitionkey": "zzzzzzzzz", 
    "rowkey": "8beeb783-b07f-8a98-ef56-71c43378a5fc", 
    "event_time": "2017-10-15T05:37:06.3240000Z", 
    "event_name": "subscriber.updated_lead_score", 
    "payload": {
        "event": "subscriber.updated_custom_field",
        "data.subscriber.id": "...",
        "occurred_at": "2017-10-15T05:36:57.000Z",
        "data.account_id": "11111",
        "data.subscriber.status": "active",
        "data.subscriber.custom_fields.coupon": "xxxxxxx",
        "data.subscriber.custom_fields.coupon_discounted_price": "11111",
        "data.subscriber.custom_fields.coupon_pre_discount_price": "11111",
        "data.subscriber.custom_fields.name": "John Doe",
        "data.subscriber.custom_fields.first_name": "John",
        "data.subscriber.custom_fields.ip_address": "0.0.0.0",
        "data.subscriber.tags": "tag1,tag2,tag3",
        "data.subscriber.time_zone": "Europe/Berlin",
        "data.subscriber.utc_offset": 120,
        "data.subscriber.created_at": "2017-03-27T18:19:35.000Z"
    }
} 

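Is this possible? The UDF FlattenCustomDimensions takes an array of items and exposes them as properties. The UDF ExtractJsonWebhookPayload takes a string and converts it to JSON, and the UDF FlattenWebhookPayload takes a complex JSON object and creates the dot-syntax keys seen in the [payload] object of the result.

My end goal is to get a result set that looks like: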
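For reference, here is a minimal sketch of what a flattening UDF along the lines of FlattenWebhookPayload might look like. The actual UDF body is not shown in this post, so the function name `flattenToDotKeys`, its parameters, and the choice to join arrays into a comma-separated string (to match the "tag1,tag2,tag3" value in the sample output) are all assumptions made for illustration.

```javascript
// Hypothetical stand-in for a UDF like FlattenWebhookPayload (the real body is not shown).
// Walks a nested object and writes every leaf value under a dot-separated key.
function flattenToDotKeys(value, prefix, out) {
    out = out || {};
    if (Array.isArray(value)) {
        // Assumption: arrays are joined into one comma-separated string.
        out[prefix] = value.join(",");
    } else if (value !== null && typeof value === "object") {
        Object.keys(value).forEach(function (key) {
            // Build the path: "data" + "." + "account_id" -> "data.account_id"
            var path = prefix ? prefix + "." + key : key;
            flattenToDotKeys(value[key], path, out);
        });
    } else {
        // Leaf value (string, number, boolean, null): store it as-is.
        out[prefix] = value;
    }
    return out;
}

// Usage with a trimmed-down, made-up payload:
var sample = {
    event: "subscriber.updated_custom_field",
    data: { account_id: "11111", subscriber: { status: "active", tags: ["tag1", "tag2"] } }
};
var flat = flattenToDotKeys(sample, "");
// flat has the keys "event", "data.account_id",
// "data.subscriber.status" and "data.subscriber.tags"
```

The output is still a single object, which is why it shows up under [payload] as a child record rather than as separate columns.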

{ 
    "partitionkey": "zzzzzzzzz", 
    "rowkey": "8beeb783-b07f-8a98-ef56-71c43378a5fc", 
    "event_time": "2017-10-15T05:37:06.3240000Z", 
    "event_name": "subscriber.updated_lead_score", 
    "payload.event": "subscriber.updated_custom_field", 
    "payload.data.subscriber.id": "...", 
    "payload.occurred_at": "2017-10-15T05:36:57.000Z", 
    "payload.data.account_id": "11111", 
    "payload.data.subscriber.status": "active", 
    "payload.data.subscriber.custom_fields.coupon": "xxxxxxx", 
    "payload.data.subscriber.custom_fields.coupon_discounted_price": "11111", 
    "payload.data.subscriber.custom_fields.coupon_pre_discount_price": "11111", 
    "payload.data.subscriber.custom_fields.name": "John Doe", 
    "payload.data.subscriber.custom_fields.first_name": "John", 
    "payload.data.subscriber.custom_fields.ip_address": "0.0.0.0", 
    "payload.data.subscriber.tags": "tag1,tag2,tag3", 
    "payload.data.subscriber.time_zone": "Europe/Berlin", 
    "payload.data.subscriber.utc_offset": 120, 
    "payload.data.subscriber.created_at": "2017-03-27T18:19:35.000Z" 
} 

Unless someone has a better idea or option.
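To make the target shape concrete, the reshaping I am after can be sketched in plain JavaScript as below. `promoteField` is a hypothetical helper, not an existing Stream Analytics function, and the sample record is trimmed down. It only illustrates the transformation itself: a UDF returns a single value, so calling something like this from the query would still yield one nested column, which is exactly the limitation described above.

```javascript
// Hypothetical helper: copy every key of record[field] up into the record
// itself as "<field>.<key>", dropping the nested object.
function promoteField(record, field) {
    var result = {};
    var nested = record[field];
    var isRecord = nested !== null && typeof nested === "object" && !Array.isArray(nested);

    // Keep all existing top-level fields except the one being promoted.
    Object.keys(record).forEach(function (key) {
        if (key === field && isRecord) { return; }
        result[key] = record[key];
    });

    // Lift the nested keys to the top level with the field name as prefix.
    if (isRecord) {
        Object.keys(nested).forEach(function (key) {
            result[field + "." + key] = nested[key];
        });
    }
    return result;
}

// Usage with a trimmed-down record shaped like the current output:
var current = {
    partitionkey: "zzzzzzzzz",
    event_name: "subscriber.updated_lead_score",
    payload: { "event": "subscriber.updated_custom_field", "data.account_id": "11111" }
};
var wanted = promoteField(current, "payload");
// wanted has the keys "partitionkey", "event_name",
// "payload.event" and "payload.data.account_id"
```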


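If you know all the column names, it may be possible to write a query that promotes all the nested fields, but that query would be large and hard to change. Have you considered using a [JavaScript UDF](https://docs.microsoft.com/en-us/azure/stream-analytics/stream-analytics-javascript-user-defined-functions)? It would be cleaner, although you would have to pass the whole payload to the UDF in a single field.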


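Well, yes, the point is that I don't know the fields. If the field names were known, this would be easy to solve with a direct query using dot syntax.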

Answers


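If you add the dot syntax to the SELECT * INTO query on the last line, you can map the columns you expanded in the temporary step to specific columns in the main table.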
