2017-09-30

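Jolt referencing first element

I have been looking at this for a few weeks (in the background) and am stumped on how to convert JSON data that approximates a CSV into a tagged set using the NiFi JoltTransformJson processor. By that I mean using the data from the first row of the input array as the JSON object names in the output.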

As an example, I have this input data:

[ 
    [ 
    "Company", 
    "Retail Cost", 
    "Percentage" 
    ], 
    [ 
    "ABC", 
    "5,368.11", 
    "17.09%" 
    ], 
    [ 
    "DEF", 
    "101.47", 
    "0.32%" 
    ], 
    [ 
    "GHI", 
    "83.79", 
    "0.27%" 
    ] 
] 

and the output I am trying to get is:

[ 
    { 
    "Company": "ABC", 
    "Retail Cost": "5,368.11", 
    "Percentage": "17.09%" 
    }, 
    { 
    "Company": "DEF", 
    "Retail Cost": "101.47", 
    "Percentage": "0.32%" 
    }, 
    { 
    "Company": "GHI", 
    "Retail Cost": "83.79", 
    "Percentage": "0.27%" 
    } 
] 

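I think this is mainly two problems: getting access to the contents of the first array, and then making sure the output data does not include that first array.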
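To pin down the target mapping independently of Jolt, here is a minimal Python sketch of the same transformation. It is not Jolt and not part of any NiFi processor; the function name `rows_to_records` is made up for illustration. It only shows the two steps named above: read the keys from the first row, then leave that row out of the output.

```python
import json


def rows_to_records(table):
    """Use the first row as keys for every later row (hypothetical helper)."""
    # Step 1: split off the header row so its values can serve as keys.
    header, *rows = table
    # Step 2: build one object per remaining row; the header row is not emitted.
    return [dict(zip(header, row)) for row in rows]


table = [
    ["Company", "Retail Cost", "Percentage"],
    ["ABC", "5,368.11", "17.09%"],
    ["DEF", "101.47", "0.32%"],
    ["GHI", "83.79", "0.27%"],
]

records = rows_to_records(table)
print(json.dumps(records, indent=2))
```

A Jolt spec has to express the same two steps declaratively, which is where the difficulty lies.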

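I would love to post a Jolt spec that shows me getting somewhat close, but my closest attempt gives output with the right shape and the wrong content. It looks like this: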

[ 
    { 
    "operation": "shift", 
    "spec": { 
     "*": { 
     "*": "[&1].&0" 
     } 
    } 
    } 
] 

But it results in output like this:

[ { 
    "0" : "Company", 
    "1" : "Retail Cost", 
    "2" : "Percentage" 
}, { 
    "0" : "ABC", 
    "1" : "5,368.11", 
    "2" : "17.09%" 
}, { 
    "0" : "DEF", 
    "1" : "101.47", 
    "2" : "0.32%" 
}, { 
    "0" : "GHI", 
    "1" : "83.79", 
    "2" : "0.27%" 
} ] 

which clearly has the wrong object names, and the output has one element too many.

1 Answer

It can be done, but wow, it is hard to read; it looks like a terrible regex.

Spec

[ 
    { 
    // this does most of the work, but produces an output 
    // array with a null in the zeroth slot. 
    "operation": "shift", 
    "spec": { 
     // match the first item in the outer array and do 
     // nothing with it, because it is just "header" data 
     // e.g. "Company", "Retail Cost", "Percentage". 
     // we need to reference it, but not pass it thru 
     "0": null, 
     // 
     // loop over all the rest of the items in the outer array 
     "*": { 
     // this is rather confusing 
     // "*" means match the array indices of the inner array 
     // and we will write the value at that index ("ABC" etc.) 
     // to "[&1].@(2,[0].[&])" 
     // "[&1]" means make the output be an array, and at index 
     // &1, which is the index of the outer array we are 
     // currently in. 
     // Then "look up the key" (Company, Retail Cost) using 
     // @(2,[0].[&]) 
     // which is: go back up the tree to the root, then 
     // come back down into the first item of the outer array 
     // and index it by the array index of the current 
     // inner array that we are at. 
     "*": "[&1].@(2,[0].[&])" 
     } 
    } 
    }, 
    { 
    // We know the first item in the array will be null/junk, 
    // because the first item in the input array was "header" info. 
    // So we match the first item, and then accumulate everything 
    // into a new array 
    "operation": "shift", 
    "spec": { 
     "0": null, 
     "*": "[]" 
    } 
    } 
] 
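As a rough mental model of why two shifts are used, the sketch below mimics them in plain Python. This is an assumption-laden illustration, not Jolt's implementation: `first_shift` and `second_shift` are hypothetical names, and the model only reflects what the spec comments describe (the first pass writes each row at its original outer index, so slot 0 stays null; the second pass skips slot 0 and accumulates the rest into a new array).

```python
def first_shift(table):
    """Model of the first shift (hypothetical, not Jolt itself)."""
    # Output positions mirror the input's outer indices, like "[&1]".
    out = [None] * len(table)
    for i, row in enumerate(table):
        if i == 0:
            # Models "0": null. The header row is matched but nothing is
            # written, so out[0] is left as None (null).
            continue
        # Models the key lookup "@(2,[0].[&])": the key for column j is
        # the j-th entry of the first row.
        out[i] = {table[0][j]: value for j, value in enumerate(row)}
    return out


def second_shift(items):
    """Model of the second shift: skip index 0, append everything else."""
    # Models "0": null plus "*": "[]", which collects into a fresh array.
    return [item for i, item in enumerate(items) if i != 0]


table = [
    ["Company", "Retail Cost", "Percentage"],
    ["ABC", "5,368.11", "17.09%"],
    ["DEF", "101.47", "0.32%"],
]

intermediate = first_shift(table)
final = second_shift(intermediate)
print(intermediate)
print(final)
```

In this model the first pass cannot drop the leading slot on its own, because it reuses the input indices for its output; only the second pass, which appends rather than indexes, produces a compact array.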

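Thanks Milo... I can confirm that this works, and I learned a lot from it. Thank you very much for commenting the spec. I am confused, though: if the first shift operation emits a null, why do we need the second shift? I tried removing it and saw the result. But I find it confusing because both shift clauses appear to have exactly the same first line. So if the first one did not remove it... why does the second one succeed? – Mark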


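Updated the spec with more comments. Hopefully that makes it clear. That said, just try removing the first "0": null and running only the first shift to see what happens. ;) –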