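I have a Hive query that uses XPath to return a set of arrays from XML. I want to insert the elements of these arrays into a Hive table. How do I insert data into a Hive table from the arrays returned by XPath?
The XML content in the hivexml table is: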
<tag><row Id="1" TagName=".net" Count="244006" ExcerptPostId="3624959" WikiPostId="3607476" /><row Id="2" TagName="html" Count="602809" ExcerptPostId="3673183" WikiPostId="3673182" /><row Id="3" TagName="javascript" Count="1274350" ExcerptPostId="3624960" WikiPostId="3607052" /><row Id="4" TagName="css" Count="434937" ExcerptPostId="3644670" WikiPostId="3644669" /><row Id="5" TagName="php" Count="1009113" ExcerptPostId="3624936" WikiPostId="3607050" /><row Id="8" TagName="c" Count="236386" ExcerptPostId="3624961" WikiPostId="3607013" /></tag>
This query returns a set of arrays:
select xpath(str,'/tag/row/@Id'), xpath(str,'/tag/row/@TagName'), xpath(str,'/tag/row/@Count'), xpath(str,'/tag/row/@ExcerptPostId'), xpath(str,'/tag/row/@WikiPostId') from hivexml;
The output of the above query (a set of arrays) is:
["1","2","3","4","5"] [".net","html","css","php","c"] ["244006","602809","434937","1009113","236386"] ["3624959","3673183","3644670","3624936","3624961"] ["3607476","3673182","3644669","3607050","3607013"]
I want to insert these values into a Hive table in this format:
1 .net 244006 3624959 3607476
2 html 602809 3673183 3673182
3 css 434937 3644670 3644669
4 php 1009113 3624936 3607050
5 c 236386 3624961 3607013
If I run an insert with the above select query:
insert into newhivexml select xpath(str,'/tag/row/@Id'), xpath(str,'/tag/row/@TagName'), xpath(str,'/tag/row/@Count'), xpath(str,'/tag/row/@ExcerptPostId'), xpath(str,'/tag/row/@WikiPostId') from hivexml;
then I get an error:
NoMatchingMethodException No matching method for class org.apache.hadoop.hive.ql.udf.UDFToInteger with (array). Possible choices: FUNC(bigint) FUNC(boolean) FUNC(decimal(38,18)) FUNC(double) FUNC(float) FUNC(smallint) FUNC(string) FUNC(struct) FUNC(timestamp) FUNC(tinyint) FUNC(void)
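I think we can't insert directly like this, and I'm missing something here. Can anyone tell me how to do this, that is, how to insert the values from these arrays into the table as separate rows?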
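The error above arises because each xpath() call returns a whole array, which Hive cannot cast to a scalar column such as an integer. One possible approach, offered only as a sketch, is to turn each array into rows with posexplode and line the rows up by position. The column aliases (pos_i, id, tagname, cnt, excerpt_id, wiki_id) are made up for illustration, and the sketch assumes that hivexml has a single string column str, that every row element carries all five attributes so the array positions correspond, and that newhivexml has five columns in this order with integer types for everything except the tag name.

```sql
-- Sketch only: aliases and target column types are assumptions, not from the original post.
INSERT INTO TABLE newhivexml
SELECT
  CAST(i.id AS INT),           -- Id
  t.tagname,                   -- TagName
  CAST(c.cnt AS INT),          -- Count
  CAST(e.excerpt_id AS INT),   -- ExcerptPostId
  CAST(w.wiki_id AS INT)       -- WikiPostId
FROM hivexml
-- posexplode emits one row per array element together with its position
LATERAL VIEW posexplode(xpath(str, '/tag/row/@Id'))            i AS pos_i, id
LATERAL VIEW posexplode(xpath(str, '/tag/row/@TagName'))       t AS pos_t, tagname
LATERAL VIEW posexplode(xpath(str, '/tag/row/@Count'))         c AS pos_c, cnt
LATERAL VIEW posexplode(xpath(str, '/tag/row/@ExcerptPostId')) e AS pos_e, excerpt_id
LATERAL VIEW posexplode(xpath(str, '/tag/row/@WikiPostId'))    w AS pos_w, wiki_id
-- keep only the combinations where all five values come from the same position
WHERE pos_i = pos_t
  AND pos_i = pos_c
  AND pos_i = pos_e
  AND pos_i = pos_w;
```

Because this pairs positions across five exploded arrays, it may be expensive for very large XML documents, and it would misalign values if some row elements lack an attribute.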
Just to make sure: the XML is stored in a single column of the table, rather than being the entire data set, right?