
I created a new Java project and added the Sqoop and Hadoop libraries (the jars are hadoop-core-1.1.1.jar, sqoop-1.4.2.jar, and so on). How can I execute Sqoop in Java?

Then I tried the following code:

import org.apache.sqoop.Sqoop;

public class MySqoopDriver {
    public static void main(String[] args) {
        // The same arguments that work on the command line ("sqoop export ...").
        String[] str = { "export", "--connect", "jdbc:mysql://localhost/mytestdb", "--hadoop-home",
                "/home/yoonhok/development/hadoop-1.1.1", "--table", "tbl_1", "--export-dir",
                "hdfs://localhost:9000/user/hive/warehouse/tbl_1",
                "--username", "yoonhok", "--password", "1234" };

        Sqoop.runTool(str);
    }
}

The parameters are correct, because when I run the same command in the terminal, it works fine.

But it didn't work. The error message is:

13/02/17 16:23:07 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead. 
13/02/17 16:23:07 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset. 
13/02/17 16:23:07 INFO tool.CodeGenTool: Beginning code generation 
13/02/17 16:23:07 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `tbl_1` AS t LIMIT 1 
13/02/17 16:23:07 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `tbl_1` AS t LIMIT 1 
13/02/17 16:23:07 INFO orm.CompilationManager: HADOOP_HOME is /home/yoonhok/development/hadoop-1.1.1 
Note: /tmp/sqoop-yoonhok/compile/86a3cab62184ad50a3ae11e7cb0e4f4d/tbl_1.java uses or overrides a deprecated API. 
Note: Recompile with -Xlint:deprecation for details. 
13/02/17 16:23:08 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-yoonhok/compile/86a3cab62184ad50a3ae11e7cb0e4f4d/tbl_1.jar 
13/02/17 16:23:08 INFO mapreduce.ExportJobBase: Beginning export of tbl_1 
13/02/17 16:23:09 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable 
13/02/17 16:23:09 INFO input.FileInputFormat: Total input paths to process : 1 
13/02/17 16:23:09 INFO input.FileInputFormat: Total input paths to process : 1 
13/02/17 16:23:09 INFO mapred.JobClient: Cleaning up the staging area file:/tmp/hadoop-yoonhok/mapred/staging/yoonhok1526809600/.staging/job_local_0001 
13/02/17 16:23:09 ERROR security.UserGroupInformation: PriviledgedActionException as:yoonhok cause:java.io.FileNotFoundException: File /user/hive/warehouse/tbl_1/000000_0 does not exist. 
13/02/17 16:23:09 ERROR tool.ExportTool: Encountered IOException running export job: java.io.FileNotFoundException: File /user/hive/warehouse/tbl_1/000000_0 does not exist. 

When I check HDFS, the file exists:

hadoop fs -ls /user/hive/warehouse/tbl_1 
Found 1 items 
-rw-r--r-- 1 yoonhok supergroup  240 2013-02-16 18:56 /user/hive/warehouse/tbl_1/000000_0 

How can I execute Sqoop in a Java program?

I tried ProcessBuilder and Process, but I don't want to use them.

I really want to use the Sqoop API, but I've heard it doesn't exist yet.

I read this question, but it doesn't work for me.


Possible duplicate of [How to use Sqoop in Java Program?](http://stackoverflow.com/questions/9229611/how-to-use-sqoop-in-java-program) – 2014-08-13 05:03:59

Answers


First, let me mention that Sqoop 1 has no official client API. Even so, invoking Sqoop the way you are doing it is quite common and known to work.

Based on the log, I would guess that the Java application executing Sqoop does not have the Hadoop configuration on its classpath. Therefore Sqoop gets no information about your cluster and works in "local" mode. You need to put the Hadoop configuration on your classpath in order to run Sqoop against a remote cluster. Please check out this entry on Stack Overflow for more details.
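As a minimal sketch of that idea (assuming Sqoop 1.4.x, where org.apache.sqoop.Sqoop also offers a runTool(String[], Configuration) overload; the conf file paths below are placeholders for your installation), you can load the cluster configuration explicitly and hand it to Sqoop instead of relying on the classpath:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.sqoop.Sqoop;

public class MySqoopDriverWithConf {
    public static void main(String[] args) {
        // Load the cluster configuration explicitly; adjust these paths
        // to wherever your Hadoop conf directory lives (placeholders).
        Configuration conf = new Configuration();
        conf.addResource(new Path("/home/yoonhok/development/hadoop-1.1.1/conf/core-site.xml"));
        conf.addResource(new Path("/home/yoonhok/development/hadoop-1.1.1/conf/hdfs-site.xml"));
        conf.addResource(new Path("/home/yoonhok/development/hadoop-1.1.1/conf/mapred-site.xml"));

        String[] str = { "export", "--connect", "jdbc:mysql://localhost/mytestdb", "--hadoop-home",
                "/home/yoonhok/development/hadoop-1.1.1", "--table", "tbl_1", "--export-dir",
                "hdfs://localhost:9000/user/hive/warehouse/tbl_1",
                "--username", "yoonhok", "--password", "1234" };

        // With the configuration passed in, the export runs against the
        // cluster described in the XML files instead of local mode.
        int ret = Sqoop.runTool(str, conf);
        System.out.println("Sqoop finished with exit code " + ret);
    }
}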


You can use "SqoopOptions" to execute Sqoop from within your Java program.

Here is sample code for importing a table from MySQL into HDFS.

// Imports needed: com.cloudera.sqoop.SqoopOptions, com.cloudera.sqoop.tool.ImportTool,
// org.apache.hadoop.conf.Configuration, org.apache.hadoop.fs.Path,
// java.io.FileInputStream, java.util.Properties
public static void importSQLToHDFS() throws Exception {
    // Load the MySQL JDBC driver.
    String driver = "com.mysql.jdbc.Driver";
    Class.forName(driver).newInstance();

    Configuration config = new Configuration();
    config.addResource(new Path("/.../conf/core-site.xml"));
    config.addResource(new Path("/.../conf/hdfs-site.xml"));

    // Connection settings come from an external properties file.
    Properties properties = new Properties();
    properties.load(new FileInputStream("/.../sqoopimport.properties"));

    SqoopOptions options = new SqoopOptions();
    options.setConf(config); // hand the cluster configuration on to Sqoop
    options.setDriverClassName(driver);
    options.setHadoopHome("/.../hadoop-0.20.2-cdh3u2");
    options.setConnectString(properties.getProperty("db_connection_string"));
    options.setTableName(properties.getProperty("db_mysql_table_name"));
    options.setUsername(properties.getProperty("db_usr_id"));
    options.setPassword(properties.getProperty("db_passwd"));
    options.setNumMappers(1);
    options.setTargetDir(properties.getProperty("path_export_file"));
    options.setFileLayout(SqoopOptions.FileLayout.TextFile);

    new ImportTool().run(options);
}

For export, see the sample code below. Note: no properties file is used here. Make sure you have already created the table into which the data will be exported.

public static boolean exportHDFSToSQL() throws InstantiationException, IllegalAccessException, ClassNotFoundException {
    try {
        SqoopOptions options = new SqoopOptions();
        options.setConnectString("jdbc:mysql://localhost:3306/dbName");
        options.setUsername("user_name");
        options.setPassword("pwd");
        options.setExportDir("path of file to be exported from hdfs");
        options.setTableName("table_name");
        options.setInputFieldsTerminatedBy(','); // field delimiter used in the HDFS files
        options.setNumMappers(1);
        new ExportTool().run(options);
    } catch (Exception e) {
        e.printStackTrace(); // don't swallow the failure silently
        return false;
    }
    return true;
}

Thanks for your help. But what is "properties"? Do I create it myself? I don't know about this "/.../sqoopimport.properties"... – yoonhok 2013-02-19 02:07:51


For the above code you need to make a properties file. The code above is for importing a table from MySQL into HDFS. For export, use "new ExportTool().run(options);" – 2013-02-20 11:45:48
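For illustration, such a properties file might look like the following; the keys are exactly the ones the answer's code reads via getProperty, while the values here are made-up placeholders to adjust for your environment:

# sqoopimport.properties -- example values only
db_connection_string=jdbc:mysql://localhost:3306/mytestdb
db_mysql_table_name=tbl_1
db_usr_id=yoonhok
db_passwd=1234
# HDFS target directory for the import (passed to setTargetDir above)
path_export_file=/user/yoonhok/tbl_1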


There is a trick which I found quite easy: via ssh, you can execute the Sqoop command directly. All you have to use is an SSH Java library.

You have to follow these steps.

Download the sshxcute Java library from https://code.google.com/p/sshxcute/ and add it to the build path of your Java project, which contains the following Java code:

import net.neoremind.sshxcute.core.SSHExec;
import net.neoremind.sshxcute.core.ConnBean;
import net.neoremind.sshxcute.task.CustomTask;
import net.neoremind.sshxcute.task.impl.ExecCommand;

public class TestSSH {

    public static void main(String[] args) throws Exception {

        // Initialize a ConnBean object; the parameters are ip, username, password
        ConnBean cb = new ConnBean("192.168.56.102", "root", "hadoop");

        // Pass the ConnBean instance to the static method getInstance(ConnBean)
        // to retrieve a singleton SSHExec instance
        SSHExec ssh = SSHExec.getInstance(cb);

        // Connect to the server
        ssh.connect();

        // Print the client IP by which you connected to the ssh server on the Horton Sandbox
        CustomTask sampleTask1 = new ExecCommand("echo $SSH_CLIENT");
        System.out.println(ssh.exec(sampleTask1));

        // Run the Sqoop command remotely, exactly as you would type it in a shell
        CustomTask sampleTask2 = new ExecCommand("sqoop import --connect jdbc:mysql://192.168.56.101:3316/mysql_db_name --username=mysql_user --password=mysql_pwd --table mysql_table_name --hive-import -m 1 -- --schema default");
        ssh.exec(sampleTask2);

        ssh.disconnect();
    }
}