Start the Sqoop server

sqoop2-server start

Start the Sqoop client

sqoop2-shell or sqoop.sh client

You may run into errors while working in the shell. Setting this option prints the full error details, much like a debug mode.

set option --name verbose --value true

Set the server connection details

set server --host localhost --port 12000 --webapp sqoop

Verify the connection

show version --all    # if a "server version:" line is displayed, the connection works
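The same check can be made from outside the shell. The sketch below is only an illustration: it assumes the server exposes its REST version resource at /version under the webapp path, and the helper names version_url and fetch_server_version are made up for this post. The host, port and webapp defaults mirror the set server values above.

```python
import json
from urllib.request import urlopen


def version_url(host="localhost", port=12000, webapp="sqoop"):
    """Build the version URL from the same values passed to `set server`."""
    return f"http://{host}:{port}/{webapp}/version"


def fetch_server_version(url, timeout=5):
    """Return the parsed JSON reply, or None if the server cannot be reached.

    Assumption: the endpoint answers with a JSON document.
    """
    try:
        with urlopen(url, timeout=timeout) as resp:
            return json.load(resp)
    except (OSError, ValueError):
        # OSError covers refused connections and timeouts,
        # ValueError covers a reply that is not valid JSON.
        return None
```

Calling fetch_server_version(version_url()) and getting something other than None suggests the server is reachable with those settings.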

List the connectors the Sqoop server supports

show connector

Create the source link

sqoop:000> create link -connector generic-jdbc-connector
Creating link for connector with name generic-jdbc-connector
Please fill following values to create new link object
Name: mysql-jdbc-connector

Database connection

Driver class: com.mysql.jdbc.Driver
Connection String: jdbc:mysql://39.105.120.8/www_wuzhixiang_c
Username: www_wuzhixiang_c
Password: ****************
Fetch Size:
Connection Properties:
There are currently 0 values in the map:
entry# protocol=tcp
There are currently 1 values in the map:
protocol = tcp
entry#

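SQL Dialect

Identifier enclose: (type a space, then press Enter)

Notes on the prompts above:

Fetch Size: press Enter to keep the default.
entry#: type protocol=tcp and press Enter, then press Enter again at the next empty entry# prompt to finish the map.
Identifier enclose: this sets the delimiter placed around identifiers in the generated SQL. Some databases quote identifiers with double quotes, as in select * from "table_name", but MySQL in its default SQL mode rejects that. The default value of this property is the double quote, so you cannot just press Enter here; you have to override it. I overrode it with a single space. The official documentation is misleading on this point!

Once you see "New link was successfully created with validation status OK and name mysql-jdbc-connector", the link has been created.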
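To see why the delimiter matters, here is a small illustration of how an identifier delimiter ends up in a generated statement. It is not Sqoop's actual code; enclose and select_all are hypothetical helpers written only for this post.

```python
def enclose(identifier, delimiter='"'):
    """Wrap an identifier in the given delimiter, as a SQL generator might."""
    return f"{delimiter}{identifier}{delimiter}"


def select_all(table, delimiter='"'):
    """Build a SELECT over the whole table using the chosen delimiter."""
    return f"SELECT * FROM {enclose(table, delimiter)}"


# Default delimiter: MySQL in its default SQL mode does not accept this form.
print(select_all("wp_posts"))
# A single space as the delimiter leaves the name effectively unquoted.
print(select_all("wp_posts", " "))
# The backtick is MySQL's own identifier quote, shown here for comparison only.
print(select_all("wp_posts", "`"))
```

With the space override the table name reaches MySQL as a bare identifier, which is why the workaround above goes through.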

 

Create the target link

sqoop:000> create link -connector hdfs-connector
Creating link for connector with name hdfs-connector
Please fill following values to create new link object
Name: hdfs-link

HDFS cluster

URI: hdfs://localhost:9000
Conf directory: /usr/local/apps/hadoop/etc/hadoop/
Additional configs::
There are currently 0 values in the map:
entry#
New link was successfully created with validation status OK and name hdfs-link
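The URI entered here comes back later in the job's output directory, so it can help to check that the two point at the same NameNode. The following is a rough sketch using only the Python standard library; the function names are invented for this post, and it checks nothing beyond the scheme, host and port.

```python
from urllib.parse import urlparse


def namenode_address(uri):
    """Return (scheme, host, port) for an HDFS URI such as hdfs://localhost:9000."""
    parsed = urlparse(uri)
    return parsed.scheme, parsed.hostname, parsed.port


def same_cluster(link_uri, output_dir):
    """True when the output directory lives on the NameNode named in the link URI."""
    return namenode_address(link_uri) == namenode_address(output_dir)


# Values taken from the link above and the job created further down.
print(same_cluster("hdfs://localhost:9000", "hdfs://localhost:9000/job/job-first"))
```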

Create the job

sqoop:000> create job -f "mysql-jdbc-connector" -t "hdfs-link"
Creating job for links with from name mysql-jdbc-connector and to name hdfs-link
Please fill following values to create new job object
Name: job-first

Database source

Schema name: www_wuzhixiang_c
Table name: wp_posts
SQL statement:
Column names:
There are currently 0 values in the list:
element# ID
There are currently 1 values in the list:
ID
element# post_date
There are currently 2 values in the list:
ID
post_date
element# post_content
There are currently 3 values in the list:
ID
post_date
post_content
element# post_title
There are currently 4 values in the list:
ID
post_date
post_content
post_title
element# post_status
There are currently 5 values in the list:
ID
post_date
post_content
post_title
post_status
element# guid
There are currently 6 values in the list:
ID
post_date
post_content
post_title
post_status
guid
element#
Partition column: ID
Partition column nullable:
Boundary query:

Incremental read

Check column:
Last value:

Target configuration

Override null value:
Null value:
File format:
0 : TEXT_FILE
1 : SEQUENCE_FILE
2 : PARQUET_FILE
Choose: 0
Compression codec:
0 : NONE
1 : DEFAULT
2 : DEFLATE
3 : GZIP
4 : BZIP2
5 : LZO
6 : LZ4
7 : SNAPPY
8 : CUSTOM
Choose: 0
Custom codec:
Output directory: hdfs://localhost:9000/job/job-first
Append mode:

Throttling resources

Extractors: 2
Loaders: 2

Classpath configuration

Extra mapper jars:
There are currently 0 values in the list:
element#
New job was successfully created with validation status OK and name job-first
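The Partition column and Extractors settings work together: the range of the partition column (ID here) is divided so that each extractor reads its own slice of rows. The sketch below shows the idea in a simplified form. It is not Sqoop's real partitioner; split_range is a hypothetical helper, and the ID range 1 to 100 is made up for the example.

```python
def split_range(min_id, max_id, extractors):
    """Split the inclusive range [min_id, max_id] into contiguous chunks,
    one per extractor (fewer chunks if there are fewer values than extractors)."""
    if extractors < 1:
        raise ValueError("extractors must be at least 1")
    if max_id < min_id:
        return []
    total = max_id - min_id + 1
    parts = min(extractors, total)
    base, extra = divmod(total, parts)
    ranges = []
    start = min_id
    for i in range(parts):
        # The first `extra` chunks take one additional value each.
        size = base + (1 if i < extra else 0)
        end = start + size - 1
        ranges.append((start, end))
        start = end + 1
    return ranges


# Two extractors over a made-up ID range of 1..100.
print(split_range(1, 100, 2))
```

Each pair can be read as a WHERE ID >= start AND ID <= end condition for one extractor, which is why an evenly distributed column makes a good partition column.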

Start the job and print its execution details
sqoop:000> start job -n job-first -s
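Once the links and the job exist, the non-interactive commands from this post can be collected into a script file and handed to the shell in batch mode (sqoop2-shell followed by the path of the script). The generator below is only a sketch: build_script and write_script are hypothetical helpers, run-job.sqoop is a made-up file name, and job-first is the job name entered when the job was created above.

```python
from pathlib import Path


def build_script(job_name, host="localhost", port=12000, webapp="sqoop"):
    """Return the shell commands from this walkthrough as one script body."""
    lines = [
        "set option --name verbose --value true",
        f"set server --host {host} --port {port} --webapp {webapp}",
        "show version --all",
        f"start job -n {job_name} -s",
    ]
    return "\n".join(lines) + "\n"


def write_script(path, job_name):
    """Write the script to disk and return the path it was written to."""
    Path(path).write_text(build_script(job_name), encoding="utf-8")
    return path


# Print the script that would be written for the job created above.
print(build_script("job-first"), end="")
```

Writing the result with write_script("run-job.sqoop", "job-first") and passing that file to sqoop2-shell should replay the same steps without typing them again.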
