2016-02-13

Error while parsing XML with Pig

I am trying to parse XML with Pig (version 0.12), but I get the following error:

Failed to parse: Pig script failed to parse: Failed to generate logical plan. Nested exception: org.apache.pig.backend.executionengine.ExecException: ERROR 1070: Could not resolve org.apache.pig.piggybank.evaluation.xml.XPath using imports: [, java.lang., org.apache.pig.builtin., org.apache.pig.impl.builtin.]

My XML file is as follows:

<CATALOG> 
<BOOK> 
<TITLE>Hadoop Defnitive Guide</TITLE> 
<AUTHOR>Tom White</AUTHOR> 
<COUNTRY>US</COUNTRY> 
<COMPANY>CLOUDERA</COMPANY> 
<PRICE>24.90</PRICE> 
<YEAR>2012</YEAR> 
</BOOK> 
<BOOK> 
<TITLE>Programming Pig</TITLE> 
<AUTHOR>Alan Gates</AUTHOR> 
<COUNTRY>USA</COUNTRY> 
<COMPANY>Horton Works</COMPANY> 
<PRICE>30.90</PRICE> 
<YEAR>2013</YEAR> 
</BOOK> 
</CATALOG> 

I am practicing from this tutorial: http://hadoopgeek.com/apache-pig-xml-parsing-xpath/

Here is the script:

REGISTER piggybank.jar 

DEFINE XPath org.apache.pig.piggybank.evaluation.xml.XPath(); 

A = LOAD '/hadoop_books.xml' using org.apache.pig.piggybank.storage.XMLLoader('BOOK') as (x:chararray); 

B = FOREACH A GENERATE XPath(x, 'BOOK/AUTHOR'), XPath(x, 'BOOK/PRICE'); 


dump B; 

Please help.

I have kept the .xml file in the Hadoop root directory.

You need to create a directory named xmls, put the 'hadoop_books.xml' file in it, and then try running the script again.

Answer


I don't think you want the parentheses in your DEFINE statement:

DEFINE XPath org.apache.pig.piggybank.evaluation.xml.XPath; 

You can also debug this by removing the DEFINE and referencing the UDF directly:

B = FOREACH A GENERATE 
     org.apache.pig.piggybank.evaluation.xml.XPath(x, 'BOOK/AUTHOR'), 
     org.apache.pig.piggybank.evaluation.xml.XPath(x, 'BOOK/PRICE'); 

If that doesn't work, then piggybank.jar is not being found on your classpath, and you may need to provide the full path to the jar.
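
As a rough sketch of that last suggestion, the script below registers the jar by absolute path and calls the UDF by its fully qualified name. The path /usr/lib/pig/piggybank.jar is only an assumption for illustration; substitute the actual location of piggybank.jar on your machine. Everything else is taken from the script in the question.

```pig
-- Hypothetical absolute path (an assumption): point this at the real
-- location of piggybank.jar on your system.
REGISTER /usr/lib/pig/piggybank.jar;

-- Load each <BOOK> element as a single chararray, as in the question.
A = LOAD '/hadoop_books.xml'
    USING org.apache.pig.piggybank.storage.XMLLoader('BOOK') AS (x:chararray);

-- Call the UDF by its fully qualified name, so no DEFINE is involved.
B = FOREACH A GENERATE
    org.apache.pig.piggybank.evaluation.xml.XPath(x, 'BOOK/AUTHOR'),
    org.apache.pig.piggybank.evaluation.xml.XPath(x, 'BOOK/PRICE');

DUMP B;
```

If ERROR 1070 still appears with the full path in place, it may be worth checking that the registered jar actually contains the org.apache.pig.piggybank.evaluation.xml.XPath class.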