hadoop sqoop
Apache Sqoop
陳威宇
Sqoop: The Bridge Between RDBs and Hadoop
• Apache Sqoop is a "tool" designed to transfer data between Hadoop and structured datastores.
• Reads data from
– RDBMS
– Data warehouses
– NoSQL
• Writes data to
– Hive
– HBase
• Uses the MapReduce framework to transfer data in parallel
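To make the "in parallel" point concrete, here is a minimal sketch of an import that is split across several map tasks. It only prints the command so it can be reviewed before running. The split column `id`, the mapper count of 4, and the target directory are assumptions for illustration; the connection string and table name are borrowed from the exercises later in these slides.

```shell
# Sketch: build a Sqoop import command that splits one table across N mappers.
# Assumptions (not from the slides): split column "id" and the target directory.
build_parallel_import() {
  local table="$1" split_col="$2" mappers="$3"
  # --split-by names the column used to partition rows among mappers;
  # --num-mappers (same as -m) sets how many map tasks run in parallel.
  echo sqoop import \
    --connect jdbc:mysql://localhost/books \
    --username root -P \
    --table "$table" \
    --split-by "$split_col" \
    --num-mappers "$mappers" \
    --target-dir "/user/hadoop/${table}_parallel"
}

# Print the command instead of executing it.
build_parallel_import authors id 4
```

With `-m 1`, as used in the exercises, the whole table goes through a single map task and no split column is needed.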
Figure source: http://bigdataanalyticsnews.com/data-transfer-mysql-cassandra-using-sqoop/
How to Use Sqoop
Figure source: http://hive.3du.me/slide.html
Connecting Sqoop to the Elephant (Setup)
• Extract http://archive.cloudera.com/cdh5/cdh/5/sqoop-1.4.5-cdh5.3.2.tar.gz
• Edit ~/.bashrc
export JAVA_HOME=/usr/lib/jvm/java-7-oracle
export HADOOP_HOME=/home/hadoop/hadoop
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export HIVE_HOME=/home/hadoop/hive
export SQOOP_HOME=/home/hadoop/sqoop
export HCAT_HOME=${HIVE_HOME}/hcatalog/
export PATH=$PATH:$SQOOP_HOME/bin
• Edit conf/sqoop-env.sh
export HADOOP_COMMON_HOME=/home/hadoop/hadoop
export HBASE_HOME=/home/hadoop/hbase
export HIVE_HOME=/home/hadoop/hive
• Start sqoop
$ sqoop
Try 'sqoop help' for usage.
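Before starting Sqoop, it can help to confirm that the variables exported above point to real directories. The following is a rough sketch, not part of the original setup; the helper name `check_home` is hypothetical.

```shell
# Sketch: report whether an environment variable is set and points to a directory.
# check_home is a hypothetical helper name, not part of Sqoop.
check_home() {
  local name="$1" value
  # Look up the variable whose name is stored in $name (portable indirect expansion).
  eval "value=\${$name:-}"
  if [ -z "$value" ]; then
    echo "MISSING $name"
    return 1
  elif [ ! -d "$value" ]; then
    echo "NOT_A_DIR $name=$value"
    return 1
  fi
  echo "OK $name=$value"
}

# Check the variables used in the setup; keep going even if one fails.
for var in JAVA_HOME HADOOP_HOME HIVE_HOME SQOOP_HOME; do
  check_home "$var" || true
done
```

Run it in a new shell after editing ~/.bashrc so the exports have taken effect.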
Exercise 1: Import into Hive
cd ~
git clone https://github.com/waue0920/hadoop_example.git
cd ~/hadoop_example/sqoop/ex1
mysql -u root -phadoop < ./exc1.sql
hadoop fs -rm -r /user/hadoop/authors
sqoop import --connect jdbc:mysql://localhost/books --username root --table authors --password hadoop --hive-import -m 1
Exercise: Use a Hive query to check that the data was imported
hive> select * from authors;
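The same check can be scripted by comparing row counts on both sides. This is only a sketch: it assumes the `mysql` and `hive` command-line clients are on the PATH and reuses the credentials from the exercise; `compare_counts` is a hypothetical helper name.

```shell
# compare_counts SRC DST: report whether two row counts match.
compare_counts() {
  if [ "$1" -eq "$2" ]; then
    echo "OK: $1 rows in both"
  else
    echo "MISMATCH: source=$1 hive=$2"
    return 1
  fi
}

# Only query when both clients are installed.
if command -v mysql >/dev/null 2>&1 && command -v hive >/dev/null 2>&1; then
  # -N suppresses the column header so only the number is printed.
  src=$(mysql -u root -phadoop -N -e 'select count(*) from authors' books)
  # -S runs Hive in silent mode; -e executes the given query string.
  dst=$(hive -S -e 'select count(*) from authors')
  compare_counts "$src" "$dst"
fi
```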
Exercise 1: Create a Job
hadoop fs -rm -r /user/hadoop/authors
sqoop job --create myjob -- import --connect jdbc:mysql://localhost/books --username root --table authors -P --hive-import -m 1
sqoop job --list
sqoop job --show myjob
sqoop job --exec myjob
Exercise: Use a Hive query to check that the data was imported
hive> select * from authors;
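Creating a job under a name that already exists fails, so a script may want to check the saved-job list first. The sketch below assumes that `sqoop job --list` prints one job name per line, possibly indented, after a header line; `job_exists` is a hypothetical helper name.

```shell
# job_exists NAME: read `sqoop job --list` output on stdin; succeed if NAME is listed.
job_exists() {
  # Trim leading and trailing whitespace, then look for an exact whole-line match.
  sed 's/^[[:space:]]*//; s/[[:space:]]*$//' | grep -qxF -- "$1"
}

# Only call sqoop when it is installed.
if command -v sqoop >/dev/null 2>&1; then
  if sqoop job --list | job_exists myjob; then
    echo "myjob already exists; run it with: sqoop job --exec myjob"
  else
    echo "myjob not found; create it with: sqoop job --create myjob -- import ..."
  fi
fi
```

An existing job can be removed with `sqoop job --delete myjob` before it is recreated.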
Exercise 2: Export to MySQL
cd ~/hadoop_example/sqoop/ex2
mysql -u root -phadoop < ./create.sql
./update_hdfs_data.sh
sqoop export --connect jdbc:mysql://localhost/db --username root --password hadoop --table employee --export-dir /user/hadoop/sqoop_input/emp_data
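An export fails on records that do not match the target table's columns, so it can be worth counting malformed rows in the input beforehand. This sketch assumes comma-delimited text, which is Sqoop's default field delimiter; the sample rows and the column count of 3 are made up for illustration, and `count_bad_rows` is a hypothetical helper name.

```shell
# count_bad_rows EXPECTED_FIELDS [DELIM]: read records on stdin and print
# how many of them do not have exactly EXPECTED_FIELDS fields.
count_bad_rows() {
  awk -F "${2:-,}" -v n="$1" 'NF != n { bad++ } END { print bad + 0 }'
}

# Hypothetical sample: three columns expected, the second row is short.
printf '1,alice,dev\n2,bob\n' | count_bad_rows 3
```

Against the real input, the records could be piped in with `hadoop fs -cat /user/hadoop/sqoop_input/emp_data/* | count_bad_rows N`, where N is the number of columns in the `employee` table.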
References
• Sqoop examples and walkthrough
– http://www.tutorialspoint.com/sqoop/sqoop_quick_guide.htm
• Official Sqoop User Guide
– https://sqoop.apache.org/docs/1.4.5/SqoopUserGuide.html
• Sqoop exercises
– http://hive.3du.me/