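Sqoop is a tool for transferring data between Hadoop and relational databases. It can import data from a relational database (for example MySQL, Oracle, or Postgres) into Hadoop's HDFS, and it can also export data from HDFS back into a relational database.
Sqoop User Guide URL: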
1: Extract the archive: tar zxvf sqoop-1.1.0.tar.gz
2: Edit the configuration file /home/hadoopuser/sqoop-1.1.0/conf/sqoop-site.xml
Usually only the following properties need to be changed:
sqoop.metastore.client.enable.autoconnect
sqoop.metastore.client.autoconnect.url
sqoop.metastore.client.autoconnect.username
sqoop.metastore.client.autoconnect.password
sqoop.metastore.server.location
sqoop.metastore.server.port
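As a rough illustration, the properties above could be set in sqoop-site.xml along the following lines. This is only a sketch: the values shown are placeholders modeled on the commented-out template that ships with Sqoop, not values taken from this installation, so adjust the paths, credentials, and port to your own environment.

```xml
<configuration>
  <!-- Placeholder values; adapt them to your own setup. -->
  <property>
    <name>sqoop.metastore.client.enable.autoconnect</name>
    <value>false</value>
  </property>
  <property>
    <name>sqoop.metastore.client.autoconnect.url</name>
    <value>jdbc:hsqldb:file:/tmp/sqoop-meta/meta.db;shutdown=true</value>
  </property>
  <property>
    <name>sqoop.metastore.client.autoconnect.username</name>
    <value>SA</value>
  </property>
  <property>
    <name>sqoop.metastore.client.autoconnect.password</name>
    <value></value>
  </property>
  <property>
    <name>sqoop.metastore.server.location</name>
    <value>/tmp/sqoop-metastore/shared.db</value>
  </property>
  <property>
    <name>sqoop.metastore.server.port</name>
    <value>16000</value>
  </property>
</configuration>
```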
3: Check that Sqoop starts by viewing the built-in help:
bin/sqoop help
bin/sqoop help import
4: Run a test import. On this setup it fails with the following error:
[hadoopuser@master sqoop-1.1.0]$ bin/sqoop import --connect jdbc:mysql://localhost/ppc --table data_ip --username kwps -P
Enter password:
11/02/18 10:51:58 ERROR sqoop.Sqoop: Got exception running Sqoop: java.lang.RuntimeException: Could not find appropriate Hadoop shim for 0.20.2
java.lang.RuntimeException: Could not find appropriate Hadoop shim for 0.20.2
at com.cloudera.sqoop.shims.ShimLoader.loadShim(ShimLoader.java:190)
at com.cloudera.sqoop.shims.ShimLoader.getHadoopShim(ShimLoader.java:109)
at com.cloudera.sqoop.tool.BaseSqoopTool.init(BaseSqoopTool.java:173)
at com.cloudera.sqoop.tool.ImportTool.init(ImportTool.java:81)
at com.cloudera.sqoop.tool.ImportTool.run(ImportTool.java:411)
at com.cloudera.sqoop.Sqoop.run(Sqoop.java:134)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
at com.cloudera.sqoop.Sqoop.runSqoop(Sqoop.java:170)
at com.cloudera.sqoop.Sqoop.runTool(Sqoop.java:196)
at com.cloudera.sqoop.Sqoop.main(Sqoop.java:205)
Solution:
By default, the file contains:
./hadoop-0.20.2/conf/hadoop-env.sh
# Extra Java runtime options. Empty by default.
# export HADOOP_OPTS=-server
Change it to:
export HADOOP_OPTS="-Djava.net.preferIPv4Stack=true -Dsqoop.shim.jar.dir=/home/hadoopuser/sqoop-1.1.0/shims"
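Before relying on the setting above, it may help to confirm that the shims directory it points to actually exists. The following is a minimal sketch, assuming Sqoop is unpacked under /home/hadoopuser/sqoop-1.1.0 as in this post; the helper function sqoop_hadoop_opts and the SQOOP_HOME variable are hypothetical names introduced here for illustration and are not part of Sqoop or Hadoop.

```shell
#!/bin/sh
# Hypothetical helper: build the HADOOP_OPTS value for a given Sqoop directory.
sqoop_hadoop_opts() {
    printf '%s' "-Djava.net.preferIPv4Stack=true -Dsqoop.shim.jar.dir=$1/shims"
}

# Assumed install location; override SQOOP_HOME if yours differs.
SQOOP_HOME="${SQOOP_HOME:-/home/hadoopuser/sqoop-1.1.0}"

# Warn if the shim directory is missing, since the option would then point nowhere.
if [ -d "$SQOOP_HOME/shims" ]; then
    echo "shim directory found: $SQOOP_HOME/shims"
else
    echo "warning: $SQOOP_HOME/shims does not exist" >&2
fi

HADOOP_OPTS="$(sqoop_hadoop_opts "$SQOOP_HOME")"
export HADOOP_OPTS
echo "HADOOP_OPTS=$HADOOP_OPTS"
```

The printed value is what would go into hadoop-env.sh; the script itself does not modify any Hadoop configuration file.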
One important caveat:
Sqoop currently does not work on the Apache release of Hadoop 0.20.2.
At present only CDH 3 beta 2 is supported, so to use Sqoop you have to upgrade to CDH 3 beta 2.
“Sqoop does not run with Apache Hadoop 0.20.2. The only supported platform is CDH 3 beta 2. It requires features of MapReduce not available in the Apache 0.20.2 release of Hadoop. You should upgrade to CDH 3 beta 2 if you want to run Sqoop 1.0.0.”
Cloudera has marked this issue as a Major bug; hopefully it will be fixed soon.