SlideShare a Scribd company logo
1 of 24
Presentation  slide  for  Sqoop  User  Meetup  (Strata  +  Hadoop  World  NYC  2013)  

Complex  stories  about
Sqooping  PostgreSQL  data
10/28/2013
NTT  DATA  Corporation
Masatake  Iwasaki

Copyright © 2013 NTT DATA Corporation
Introduction

Copyright © 2013 NTT DATA Corporation

2
About  Me
Masatake  Iwasaki:
Software  Engineer  @  NTT  DATA:

NTT(Nippon  Telegraph  and  Telephone  Corporation):  Telecommunication
NTT  DATA:  Systems  Integrator

Developed:
Ludia:  Fulltext  search  index  for  PostgreSQL  using  Senna
Authored:
“A  Complete  Primer  for  Hadoop”  
(no  official  English  title)
Patches  for  Sqoop:
SQOOP-‐‑‒390:  PostgreSQL  connector  for  direct  export  with  pg_̲bulkload
SQOOP-‐‑‒999:  Support  bulk  load  from  HDFS  to  PostgreSQL  using  COPY  ...  FROM  
SQOOP-‐‑‒1155:  Sqoop  2  documentation  for  connector  development  

Copyright © 2013 NTT DATA Corporation

3
Why  PostgreSQL?

Enterprisy
from  earlier  version
comparing  to  MySQL
Active  community  in  Japan
NTT  DATA  commits  itself  to  development

Copyright © 2013 NTT DATA Corporation

4
Sqooping  PostgreSQL  data

Copyright © 2013 NTT DATA Corporation

5
Working  around  PostgreSQL
Direct  connector  for  PostgreSQL  loader:
    SQOOP-‐‑‒390:  PostgreSQL  connector  for  direct  export  with  pg_̲bulkload
Yet  another  direct  connector  for  PostgreSQL  JDBC:
    SQOOP-‐‑‒999:  Support  bulk  load  from  HDFS  to  PostgreSQL
                                            using  COPY  ...  FROM
Supporting  complex  data  types:
    SQOOP-‐‑‒1149:  Support  Custom  Postgres  Types
  

Copyright © 2013 NTT DATA Corporation

6
Direct  connector  for  PostgreSQL  loader

SQOOP-‐‑‒390:  
    PostgreSQL  connector  for  direct  export  with  pg_̲bulkload

pg_̲bulkload:
Data  loader  for  PostgreSQL
Server  side  plug-‐‑‒in  library  and  client  side  command
Providing  filtering  and  transformation  of  data
http://pgbulkload.projects.pgfoundry.org/

Copyright © 2013 NTT DATA Corporation

7
SQOOP-‐‑‒390:  
    PostgreSQL  connector  for  direct  export  with  pg_̲bulkload	
HDFS

File  Split

File  Split

File  Split
CREATE  TABLE
    tmp3(LIKE  dest  INCLUDING  CONSTRAINTS)

Mapper

Mapper

Mapper

pg_̲bulkload

pg_̲bulkoad

pg_̲bulkload

tmp1

PostgreSQL

tmp2

Reducer

Destination  Table

Copyright © 2013 NTT DATA Corporation

tmp3

external  process
staging  table  per  mapper  is  must
due  to  table  level  locks
BEGIN
INSERT  INTO  dest  (  SELECT  *  FROM  tmp1  )
DROP  TABLE  tmp1
INSERT  INTO  dest  (  SELECT  *  FROM  tmp2  )
DROP  TABLE  tmp2
INSERT  INTO  dest  (  SELECT  *  FROM  tmp3  )
DROP  TABLE  tmp3
COMMIT
8
Direct  connector  for  PostgreSQL  loader

Pros:
Fast

by  short-‐‑‒circuitting  server  functionality

Flexible

filtering  error  records

Cons:
Not  so  fast

Bottleneck  is  not  in  client  side  but  in  DB  side
Built-‐‑‒in  COPY  functionality  is  fast  enough

Not  General

pg_̲bulkload  supports  only  export

Requiring  setup  on  all  slave  nodes  and  client  node
Possible  to  Require  recovery  on  failure
Copyright © 2013 NTT DATA Corporation

9
Yet  another  direct  connector  for  PostgreSQL  JDBC
PostgreSQL  provides  custom  SQL  command  for  data  import/export
COPY table_name [ ( column_name [, ...] ) ]
FROM { 'filename' | STDIN }
[ [ WITH ] ( option [, ...] ) ]
COPY { table_name [ ( column_name [, ...] ) ] | ( query ) }
TO { 'filename' | STDOUT }
[ [ WITH ] ( option [, ...] ) ]
where option can be one of:
FORMAT format_name
OIDS [ boolean ]
DELIMITER 'delimiter_character'
NULL 'null_string'
HEADER [ boolean ]
QUOTE 'quote_character'
ESCAPE 'escape_character'
FORCE_QUOTE { ( column_name [, ...] ) | * }
FORCE_NOT_NULL ( column_name [, ...] )
ENCODING 'encoding_name‘

AND  JDBC  API
org.postgresql.copy.*
Copyright © 2013 NTT DATA Corporation

10
SQOOP-‐‑‒999:
    Support  bulk  load  from  HDFS  to  PostgreSQL  using  COPY  ...  FROM	

File  Split

File  Split

File  Split

Mapper

Mapper

Mapper

PostgreSQL
JDBC

HDFS

PostgreSQL
JDBC

PostgreSQL
JDBC

Using  custom  SQL  command  
via  JDBC  API
only  available  in  PostgreSQL
COPY  FROM  STDIN  WITH  ...

staging  tagle
PostgreSQL
Destination  Table

Copyright © 2013 NTT DATA Corporation

11
SQOOP-‐‑‒999:
    Support  bulk  load  from  HDFS  to  PostgreSQL  using  COPY  ...  FROM  
import org.postgresql.copy.CopyManager;
import org.postgresql.copy.CopyIn;
...

Requiring  PostgreSQL  
specific  interface.

protected void setup(Context context)
...
dbConf = new DBConfiguration(conf);
CopyManager cm = null;
...
public void map(LongWritable key, Writable value, Context context)
...
if (value instanceof Text) {
line.append(System.getProperty("line.separator"));
}
try {
byte[]data = line.toString().getBytes("UTF-8"); Just  feeding  lines  of  text
copyin.writeToCopy(data, 0, data.length);
Copyright © 2013 NTT DATA Corporation

12
Yet  another  direct  connector  for  PostgreSQL  JDBC

Pros:
Fast  enough
Ease  of  use

JDBC  driver  jar  is  distributed  automatically  by  MR  framework

Cons:
Dependency  on  not  general  JDBC

possible  licensing  issue  (PostgreSQL  is  OK,  itʼ’s  BSD  Lisence)
build  time  requirement  (PostgreSQL  JDBC  is  available  in  Maven  repo.)
<dependency org="org.postgresql" name="postgresql"
rev="${postgresql.version}" conf="common->default" />

Error  record  causes  rollback  of  whole  transaction
Still  difficult  to  implement  custom  connector  for  IMPORT
because  of  code  generation  part

Copyright © 2013 NTT DATA Corporation

13
Supporting  complex  data  types
PostgreSQL  supports  lot  of  complex  data  types  
Geometric  Types
Points
Line  Segments
Boxes
Paths
Polygons
Circles

Network  Address  Types
inet
cidr
macaddr

XML  Type
JSON  Type

Supporting  complex  data  types:
    SQOOP-‐‑‒1149:  Support  Custom  Postgres  Types
Copyright © 2013 NTT DATA Corporation

not  me
14
Constraints  on  JDBC  data  types  in  Sqoop  framework
protected Map<String, Integer> getColumnTypesForRawQuery(String stmt) {
...
results = execute(stmt);
...
ResultSetMetaData metadata = results.getMetaData();
for (int i = 1; i < cols + 1; i++) {
int typeId = metadata.getColumnType(i);

returns  java.sql.Types.OTHER  for  types  
{ not  mappable  to  basic    Java  data  types
=>  Losing  type  imformation    

public String toJavaType(int sqlType)
// Mappings taken from:
// http://java.sun.com/j2se/1.3/docs/guide/jdbc/getstart/mapping.html
if (sqlType == Types.INTEGER) {
return "Integer";
} else if (sqlType == Types.VARCHAR) {
return "String";
....
reaches  here
} else {
// TODO(aaron): Support DISTINCT, ARRAY, STRUCT, REF, JAVA_OBJECT.
// Return null indicating database-specific manager should return a
// java data type if it can find one for any nonstandard type.
return null;
Copyright © 2013 NTT DATA Corporation

15
Sqoop  1  Summary

Pros:
Simple  Standalone  MapReduce  Driver
Easy  to  understand  for  MR  application  developpers

                                                                            except  for  ORM  (SqoopRecord)  code  generation  part.

Variety  of  connectors
Lot  of  information
Cons:
Complex  command  line  and  inconsistent  options
meaning  of  options  is  according  to  connectors

Not  enough  modular
Dependency  on  JDBC  data  model
Security
Copyright © 2013 NTT DATA Corporation

16
Sqooping  PostgreSQL  Data  2

Copyright © 2013 NTT DATA Corporation

17
Sqoop  2
Everything  are  rewritten
Working  on  server  side
More  modular
Not  compatible  with  Sqoop  1  at  all
(Almost)  Only  generic  connector
Black  box  comparing  to  Sqoop  1
Needs  more  documentation

SQOOP-‐‑‒1155:  Sqoop  2  documentation  for  connector  development
Internal of Sqoop2 MapReduce Job
++++++++++++++++++++++++++++++++
...
- OutputFormat invokes Loader's load method (via SqoopOutputFor
.. todo: sequence diagram like figure.
Copyright © 2013 NTT DATA Corporation

18
Sqoop2:  Initialization  phase  of  IMPORT  job
	
 
Implement  this
	
 	
 	
 	
 	
 	
 	
 ,----------------.	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 ,-----------.	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 
	
 	
 	
 	
 	
 	
 	
 |SqoopInputFormat|	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |Partitioner|	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 
	
 	
 	
 	
 	
 	
 	
 `-------+--------'	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 `-----+-----'	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 
	
 	
 	
 	
 getSplits	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 
	
 	
 	
 ----------->|	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 
	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 getPartitions	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 
	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |------------------------>|	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 
	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 ,---------.	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 
	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |------->	
 |Partition|	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 
	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 `----+----'	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 
	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |<-	
 -	
 -	
 -	
 -	
 -	
 -	
 -	
 -	
 -	
 -	
 -	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 
	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 ,----------.	
 
	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |-------------------------------------------------->|SqoopSplit|	
 
	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 `----+-----'	
 

Copyright © 2013 NTT DATA Corporation

19
Sqoop2:  Map  phase  of  IMPORT  job
	
 
	
 	
 	
 	
 	
 	
 	
 ,-----------.	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 
	
 	
 	
 	
 	
 	
 	
 |SqoopMapper|	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 
	
 	
 	
 	
 	
 	
 	
 `-----+-----'	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 
	
 	
 	
 	
 	
 	
 run	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 
	
 	
 	
 --------->|	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 ,-------------.	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 
	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |---------------------------------->|MapDataWriter|	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 
	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 `------+------'	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 
Implement  this
	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 ,---------.	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 
	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |-------------->	
 |Extractor|	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 
	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 `----+----'	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 
	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 extract	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 
	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |-------------------->|	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 
Conversion  to  Sqoop  
	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 
internal  data  format
	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 read	
 from	
 DB	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 
	
 	
 	
 <-------------------------------|	
 	
 	
 	
 	
 	
 write*	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 
	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |------------------->|	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 
	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 ,----.	
 	
 	
 	
 	
 
	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |---------->|Data|	
 	
 	
 	
 	
 
	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 `-+--'	
 	
 	
 	
 	
 
	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 
	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 context.write	
 	
 	
 	
 	
 	
 	
 	
 	
 
	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |-------------------------->	
 
	
 
Copyright © 2013 NTT DATA Corporation

20
Sqoop2:  Reduce  phase  of  EXPORT  job

	
 	
 ,-------.	
 	
 ,---------------------.	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 
	
 	
 |Reducer|	
 	
 |SqoopNullOutputFormat|	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 
	
 	
 `---+---'	
 	
 `----------+----------'	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 
	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 ,-----------------------------.	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 
Implement  this
	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |->	
 |SqoopOutputFormatLoadExecutor|	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 
	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 `--------------+--------------'	
 	
 	
 	
 	
 	
 	
 	
 ,----.	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 
	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |--------------------->	
 |Data|	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 
	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 `-+--'	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 
	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 ,-----------------.	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 
	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |->	
 |SqoopRecordWriter|	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 
	
 	
 	
 	
 getRecordWriter	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 `--------+--------'	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 
----------------------->|	
 getRecordWriter	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 
	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |----------------->|	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 ,--------------.	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 
	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |----------------------------->	
 |ConsumerThread|	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 
	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 `------+-------'	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 
	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |<-	
 -	
 -	
 -	
 -	
 -	
 -	
 -	
 -|	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 ,------.	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 
<-	
 -	
 -	
 -	
 -	
 -	
 -	
 -	
 -	
 -	
 -	
 -|	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |--->|Loader|	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 
	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 `--+---'	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 
	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 
	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 load	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 
	
 run	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |------>|	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 
----->|	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 write	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 
	
 	
 	
 	
 	
 	
 |------------------------------------------------>|	
 setContent	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 read*	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 
	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |----------->|	
 getContent	
 |<------|	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 
	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |<-----------|	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 
	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 -	
 -	
 ->|	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 
	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 |	
 write	
 into	
 DB	
 	
 
	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 |-------------->	
 

Copyright © 2013 NTT DATA Corporation

21
Summary

Copyright © 2013 NTT DATA Corporation

22
My  interest  for  popularizing  Sqoop  2

Complex  data  type  support  in  Sqoop  2
Bridge  to  use  Sqoop  1  connectors  on  Sqoop  2
Bridge  to  use  Sqoop  2  connectors  from  Sqoop  1  CLI

Copyright © 2013 NTT DATA Corporation

23
Copyright © 2011 NTT DATA Corporation

Copyright © 2013 NTT DATA Corporation

More Related Content

What's hot

SQL+GPU+SSD=∞ (English)
SQL+GPU+SSD=∞ (English)SQL+GPU+SSD=∞ (English)
SQL+GPU+SSD=∞ (English)Kohei KaiGai
 
제 8회 엑셈 수요 세미나 자료 연구컨텐츠팀
제 8회 엑셈 수요 세미나 자료 연구컨텐츠팀제 8회 엑셈 수요 세미나 자료 연구컨텐츠팀
제 8회 엑셈 수요 세미나 자료 연구컨텐츠팀EXEM
 
A git workflow for Drupal Core development
A git workflow for Drupal Core developmentA git workflow for Drupal Core development
A git workflow for Drupal Core developmentCameron Tod
 
In-Network Acceleration with FPGA (MEMO)
In-Network Acceleration with FPGA (MEMO)In-Network Acceleration with FPGA (MEMO)
In-Network Acceleration with FPGA (MEMO)Naoto MATSUMOTO
 
Advanced RAC troubleshooting: Network
Advanced RAC troubleshooting: NetworkAdvanced RAC troubleshooting: Network
Advanced RAC troubleshooting: NetworkRiyaj Shamsudeen
 
A kind and gentle introducton to rac
A kind and gentle introducton to racA kind and gentle introducton to rac
A kind and gentle introducton to racRiyaj Shamsudeen
 
Athenticated smaba server config with open vpn
Athenticated smaba server  config with open vpnAthenticated smaba server  config with open vpn
Athenticated smaba server config with open vpnChanaka Lasantha
 
Mvcc in postgreSQL 권건우
Mvcc in postgreSQL 권건우Mvcc in postgreSQL 권건우
Mvcc in postgreSQL 권건우PgDay.Seoul
 
Introduction to Parallel Execution
Introduction to Parallel ExecutionIntroduction to Parallel Execution
Introduction to Parallel ExecutionDoug Burns
 
PG-Strom - GPGPU meets PostgreSQL, PGcon2015
PG-Strom - GPGPU meets PostgreSQL, PGcon2015PG-Strom - GPGPU meets PostgreSQL, PGcon2015
PG-Strom - GPGPU meets PostgreSQL, PGcon2015Kohei KaiGai
 
Cisco vs. huawei CLI Commands
Cisco vs. huawei CLI CommandsCisco vs. huawei CLI Commands
Cisco vs. huawei CLI CommandsBootcamp SCL
 
Ims06 change you can believe in
Ims06 change you can believe inIms06 change you can believe in
Ims06 change you can believe inRobert Hain
 
Replicate Oracle to Oracle, Oracle to MySQL, and Oracle to Analytics
Replicate Oracle to Oracle, Oracle to MySQL, and Oracle to AnalyticsReplicate Oracle to Oracle, Oracle to MySQL, and Oracle to Analytics
Replicate Oracle to Oracle, Oracle to MySQL, and Oracle to AnalyticsLinas Virbalas
 
OpenWorld Sep14 12c for_developers
OpenWorld Sep14 12c for_developersOpenWorld Sep14 12c for_developers
OpenWorld Sep14 12c for_developersConnor McDonald
 
Amazon aurora 1
Amazon aurora 1Amazon aurora 1
Amazon aurora 1EXEM
 
OakTable World Sep14 clonedb
OakTable World Sep14 clonedb OakTable World Sep14 clonedb
OakTable World Sep14 clonedb Connor McDonald
 

What's hot (20)

SQL+GPU+SSD=∞ (English)
SQL+GPU+SSD=∞ (English)SQL+GPU+SSD=∞ (English)
SQL+GPU+SSD=∞ (English)
 
제 8회 엑셈 수요 세미나 자료 연구컨텐츠팀
제 8회 엑셈 수요 세미나 자료 연구컨텐츠팀제 8회 엑셈 수요 세미나 자료 연구컨텐츠팀
제 8회 엑셈 수요 세미나 자료 연구컨텐츠팀
 
A git workflow for Drupal Core development
A git workflow for Drupal Core developmentA git workflow for Drupal Core development
A git workflow for Drupal Core development
 
Apend. networking linux
Apend. networking linuxApend. networking linux
Apend. networking linux
 
In-Network Acceleration with FPGA (MEMO)
In-Network Acceleration with FPGA (MEMO)In-Network Acceleration with FPGA (MEMO)
In-Network Acceleration with FPGA (MEMO)
 
Advanced RAC troubleshooting: Network
Advanced RAC troubleshooting: NetworkAdvanced RAC troubleshooting: Network
Advanced RAC troubleshooting: Network
 
A kind and gentle introducton to rac
A kind and gentle introducton to racA kind and gentle introducton to rac
A kind and gentle introducton to rac
 
Athenticated smaba server config with open vpn
Athenticated smaba server  config with open vpnAthenticated smaba server  config with open vpn
Athenticated smaba server config with open vpn
 
Mvcc in postgreSQL 권건우
Mvcc in postgreSQL 권건우Mvcc in postgreSQL 권건우
Mvcc in postgreSQL 권건우
 
Introduction to Parallel Execution
Introduction to Parallel ExecutionIntroduction to Parallel Execution
Introduction to Parallel Execution
 
PG-Strom - GPGPU meets PostgreSQL, PGcon2015
PG-Strom - GPGPU meets PostgreSQL, PGcon2015PG-Strom - GPGPU meets PostgreSQL, PGcon2015
PG-Strom - GPGPU meets PostgreSQL, PGcon2015
 
Cisco vs. huawei CLI Commands
Cisco vs. huawei CLI CommandsCisco vs. huawei CLI Commands
Cisco vs. huawei CLI Commands
 
Ims06 change you can believe in
Ims06 change you can believe inIms06 change you can believe in
Ims06 change you can believe in
 
Replicate Oracle to Oracle, Oracle to MySQL, and Oracle to Analytics
Replicate Oracle to Oracle, Oracle to MySQL, and Oracle to AnalyticsReplicate Oracle to Oracle, Oracle to MySQL, and Oracle to Analytics
Replicate Oracle to Oracle, Oracle to MySQL, and Oracle to Analytics
 
OpenWorld Sep14 12c for_developers
OpenWorld Sep14 12c for_developersOpenWorld Sep14 12c for_developers
OpenWorld Sep14 12c for_developers
 
Ccnp3 lab 3_3_en
Ccnp3 lab 3_3_enCcnp3 lab 3_3_en
Ccnp3 lab 3_3_en
 
Amazon aurora 1
Amazon aurora 1Amazon aurora 1
Amazon aurora 1
 
Memcache as udp traffic reflector
Memcache as udp traffic reflectorMemcache as udp traffic reflector
Memcache as udp traffic reflector
 
Ccnp3 lab 3_4_en
Ccnp3 lab 3_4_enCcnp3 lab 3_4_en
Ccnp3 lab 3_4_en
 
OakTable World Sep14 clonedb
OakTable World Sep14 clonedb OakTable World Sep14 clonedb
OakTable World Sep14 clonedb
 

Viewers also liked

Intellectual property in social media and online resources. Dmytro Gadomsky
Intellectual property in social media and online resources. Dmytro GadomskyIntellectual property in social media and online resources. Dmytro Gadomsky
Intellectual property in social media and online resources. Dmytro GadomskyConstantine Zerov
 
14 albrieu&amp;baruzzi 2016 comparación normasdnv1967 2010
14 albrieu&amp;baruzzi 2016 comparación normasdnv1967 201014 albrieu&amp;baruzzi 2016 comparación normasdnv1967 2010
14 albrieu&amp;baruzzi 2016 comparación normasdnv1967 2010Sierra Francisco Justo
 
Fine motor
Fine motorFine motor
Fine motorjeh20717
 
Fine Motor 2
Fine Motor 2Fine Motor 2
Fine Motor 2jeh20717
 
Israel dop curvas horizontales eua&amp;europa r
Israel dop curvas horizontales eua&amp;europa rIsrael dop curvas horizontales eua&amp;europa r
Israel dop curvas horizontales eua&amp;europa rSierra Francisco Justo
 
Cryptocurrencies for Everyone (Dmytro Pershyn Technology Stream)
Cryptocurrencies for Everyone (Dmytro Pershyn Technology Stream)Cryptocurrencies for Everyone (Dmytro Pershyn Technology Stream)
Cryptocurrencies for Everyone (Dmytro Pershyn Technology Stream)IT Arena
 
D drops Vitamin D za stare i mlade
D drops Vitamin D za stare i mladeD drops Vitamin D za stare i mlade
D drops Vitamin D za stare i mladeZoran Stojcevski
 
What is it to be a senior engineer?
What is it to be a senior engineer?What is it to be a senior engineer?
What is it to be a senior engineer?Vladimir Miguro
 
До Дня Незалежності України
До Дня Незалежності УкраїниДо Дня Незалежності України
До Дня Незалежності УкраїниMargoshenka
 
Захист прав інтелектуальної власності. Носік Ю.В.
Захист прав інтелектуальної власності. Носік Ю.В.Захист прав інтелектуальної власності. Носік Ю.В.
Захист прав інтелектуальної власності. Носік Ю.В.Constantine Zerov
 
How to Create Digital Disruption (Alejandro Danylyszyn Business Stream)
How to Create Digital Disruption (Alejandro Danylyszyn Business Stream)How to Create Digital Disruption (Alejandro Danylyszyn Business Stream)
How to Create Digital Disruption (Alejandro Danylyszyn Business Stream)IT Arena
 
Строим собственную BI в MS Excel на данных из Яндекс.Метрики и Google Analitycs
Строим собственную BI в MS Excel на данных из Яндекс.Метрики и Google AnalitycsСтроим собственную BI в MS Excel на данных из Яндекс.Метрики и Google Analitycs
Строим собственную BI в MS Excel на данных из Яндекс.Метрики и Google AnalitycsMaxim Uvarov
 
Reworker предпочтовая подготовка
Reworker предпочтовая подготовкаReworker предпочтовая подготовка
Reworker предпочтовая подготовкаNikita Florinskiy
 
PwC Sports Outlook 2016
PwC Sports Outlook 2016PwC Sports Outlook 2016
PwC Sports Outlook 2016Jonesy033
 
Failure Analysis of Gas Turbine Blade
Failure Analysis of Gas Turbine BladeFailure Analysis of Gas Turbine Blade
Failure Analysis of Gas Turbine BladeOluwaseyi Adeniyan
 
EBD CPAD Lições bíblicas 1° trimestre 2016 lição 3 Esperando a volta de Jesus.
EBD CPAD Lições bíblicas 1° trimestre 2016 lição 3 Esperando a volta de Jesus.EBD CPAD Lições bíblicas 1° trimestre 2016 lição 3 Esperando a volta de Jesus.
EBD CPAD Lições bíblicas 1° trimestre 2016 lição 3 Esperando a volta de Jesus.GIDEONE Moura Santos Ferreira
 
Сучасні форми фінансування комерціалізації інтелектуальної власності. Володим...
Сучасні форми фінансування комерціалізації інтелектуальної власності. Володим...Сучасні форми фінансування комерціалізації інтелектуальної власності. Володим...
Сучасні форми фінансування комерціалізації інтелектуальної власності. Володим...Constantine Zerov
 

Viewers also liked (20)

Retail crm
Retail crmRetail crm
Retail crm
 
Intellectual property in social media and online resources. Dmytro Gadomsky
Intellectual property in social media and online resources. Dmytro GadomskyIntellectual property in social media and online resources. Dmytro Gadomsky
Intellectual property in social media and online resources. Dmytro Gadomsky
 
14 albrieu&amp;baruzzi 2016 comparación normasdnv1967 2010
14 albrieu&amp;baruzzi 2016 comparación normasdnv1967 201014 albrieu&amp;baruzzi 2016 comparación normasdnv1967 2010
14 albrieu&amp;baruzzi 2016 comparación normasdnv1967 2010
 
VidaEconomica_21NOV
VidaEconomica_21NOVVidaEconomica_21NOV
VidaEconomica_21NOV
 
Fine motor
Fine motorFine motor
Fine motor
 
Fine Motor 2
Fine Motor 2Fine Motor 2
Fine Motor 2
 
Israel dop curvas horizontales eua&amp;europa r
Israel dop curvas horizontales eua&amp;europa rIsrael dop curvas horizontales eua&amp;europa r
Israel dop curvas horizontales eua&amp;europa r
 
Cryptocurrencies for Everyone (Dmytro Pershyn Technology Stream)
Cryptocurrencies for Everyone (Dmytro Pershyn Technology Stream)Cryptocurrencies for Everyone (Dmytro Pershyn Technology Stream)
Cryptocurrencies for Everyone (Dmytro Pershyn Technology Stream)
 
D drops Vitamin D za stare i mlade
D drops Vitamin D za stare i mladeD drops Vitamin D za stare i mlade
D drops Vitamin D za stare i mlade
 
What is it to be a senior engineer?
What is it to be a senior engineer?What is it to be a senior engineer?
What is it to be a senior engineer?
 
До Дня Незалежності України
До Дня Незалежності УкраїниДо Дня Незалежності України
До Дня Незалежності України
 
Захист прав інтелектуальної власності. Носік Ю.В.
Захист прав інтелектуальної власності. Носік Ю.В.Захист прав інтелектуальної власності. Носік Ю.В.
Захист прав інтелектуальної власності. Носік Ю.В.
 
E05912226
E05912226E05912226
E05912226
 
How to Create Digital Disruption (Alejandro Danylyszyn Business Stream)
How to Create Digital Disruption (Alejandro Danylyszyn Business Stream)How to Create Digital Disruption (Alejandro Danylyszyn Business Stream)
How to Create Digital Disruption (Alejandro Danylyszyn Business Stream)
 
Строим собственную BI в MS Excel на данных из Яндекс.Метрики и Google Analitycs
Строим собственную BI в MS Excel на данных из Яндекс.Метрики и Google AnalitycsСтроим собственную BI в MS Excel на данных из Яндекс.Метрики и Google Analitycs
Строим собственную BI в MS Excel на данных из Яндекс.Метрики и Google Analitycs
 
Reworker предпочтовая подготовка
Reworker предпочтовая подготовкаReworker предпочтовая подготовка
Reworker предпочтовая подготовка
 
PwC Sports Outlook 2016
PwC Sports Outlook 2016PwC Sports Outlook 2016
PwC Sports Outlook 2016
 
Failure Analysis of Gas Turbine Blade
Failure Analysis of Gas Turbine BladeFailure Analysis of Gas Turbine Blade
Failure Analysis of Gas Turbine Blade
 
EBD CPAD Lições bíblicas 1° trimestre 2016 lição 3 Esperando a volta de Jesus.
EBD CPAD Lições bíblicas 1° trimestre 2016 lição 3 Esperando a volta de Jesus.EBD CPAD Lições bíblicas 1° trimestre 2016 lição 3 Esperando a volta de Jesus.
EBD CPAD Lições bíblicas 1° trimestre 2016 lição 3 Esperando a volta de Jesus.
 
Сучасні форми фінансування комерціалізації інтелектуальної власності. Володим...
Сучасні форми фінансування комерціалізації інтелектуальної власності. Володим...Сучасні форми фінансування комерціалізації інтелектуальної власності. Володим...
Сучасні форми фінансування комерціалізації інтелектуальної власності. Володим...
 

Similar to Complex stories about Sqooping PostgreSQL data

PostgreSQL 13 New Features
PostgreSQL 13 New FeaturesPostgreSQL 13 New Features
PostgreSQL 13 New FeaturesJosé Lin
 
Compute 101 - OpenStack Summit Vancouver 2015
Compute 101 - OpenStack Summit Vancouver 2015Compute 101 - OpenStack Summit Vancouver 2015
Compute 101 - OpenStack Summit Vancouver 2015Stephen Gordon
 
Couch to OpenStack: Glance - July, 23, 2013
Couch to OpenStack: Glance - July, 23, 2013Couch to OpenStack: Glance - July, 23, 2013
Couch to OpenStack: Glance - July, 23, 2013Trevor Roberts Jr.
 
GPGPU Accelerates PostgreSQL ~Unlock the power of multi-thousand cores~
GPGPU Accelerates PostgreSQL ~Unlock the power of multi-thousand cores~GPGPU Accelerates PostgreSQL ~Unlock the power of multi-thousand cores~
GPGPU Accelerates PostgreSQL ~Unlock the power of multi-thousand cores~Kohei KaiGai
 
20160407_GTC2016_PgSQL_In_Place
20160407_GTC2016_PgSQL_In_Place20160407_GTC2016_PgSQL_In_Place
20160407_GTC2016_PgSQL_In_PlaceKohei KaiGai
 
pgconfasia2016 plcuda en
pgconfasia2016 plcuda enpgconfasia2016 plcuda en
pgconfasia2016 plcuda enKohei KaiGai
 
TrinityCore server install guide
TrinityCore server install guideTrinityCore server install guide
TrinityCore server install guideSeungmin Shin
 
Common schema my sql uc 2012
Common schema   my sql uc 2012Common schema   my sql uc 2012
Common schema my sql uc 2012Roland Bouman
 
Common schema my sql uc 2012
Common schema   my sql uc 2012Common schema   my sql uc 2012
Common schema my sql uc 2012Roland Bouman
 
How to Avoid Pitfalls in Schema Upgrade with Galera
How to Avoid Pitfalls in Schema Upgrade with GaleraHow to Avoid Pitfalls in Schema Upgrade with Galera
How to Avoid Pitfalls in Schema Upgrade with GaleraSveta Smirnova
 
Oracle Basics and Architecture
Oracle Basics and ArchitectureOracle Basics and Architecture
Oracle Basics and ArchitectureSidney Chen
 
[Pgday.Seoul 2019] Citus를 이용한 분산 데이터베이스
[Pgday.Seoul 2019] Citus를 이용한 분산 데이터베이스[Pgday.Seoul 2019] Citus를 이용한 분산 데이터베이스
[Pgday.Seoul 2019] Citus를 이용한 분산 데이터베이스PgDay.Seoul
 
NetDevOps 202: Life After Configuration
NetDevOps 202: Life After ConfigurationNetDevOps 202: Life After Configuration
NetDevOps 202: Life After ConfigurationCumulus Networks
 
20181016_pgconfeu_ssd2gpu_multi
20181016_pgconfeu_ssd2gpu_multi20181016_pgconfeu_ssd2gpu_multi
20181016_pgconfeu_ssd2gpu_multiKohei KaiGai
 
PLNOG20 - Paweł Małachowski - Stress your DUT–wykorzystanie narzędzi open sou...
PLNOG20 - Paweł Małachowski - Stress your DUT–wykorzystanie narzędzi open sou...PLNOG20 - Paweł Małachowski - Stress your DUT–wykorzystanie narzędzi open sou...
PLNOG20 - Paweł Małachowski - Stress your DUT–wykorzystanie narzędzi open sou...PROIDEA
 
Training Slides: Advanced 304: Upgrading From Native MySQL Replication To Tun...
Training Slides: Advanced 304: Upgrading From Native MySQL Replication To Tun...Training Slides: Advanced 304: Upgrading From Native MySQL Replication To Tun...
Training Slides: Advanced 304: Upgrading From Native MySQL Replication To Tun...Continuent
 

Similar to Complex stories about Sqooping PostgreSQL data (20)

PostgreSQL 13 New Features
PostgreSQL 13 New FeaturesPostgreSQL 13 New Features
PostgreSQL 13 New Features
 
Compute 101 - OpenStack Summit Vancouver 2015
Compute 101 - OpenStack Summit Vancouver 2015Compute 101 - OpenStack Summit Vancouver 2015
Compute 101 - OpenStack Summit Vancouver 2015
 
MySQLinsanity
MySQLinsanityMySQLinsanity
MySQLinsanity
 
Couch to OpenStack: Glance - July, 23, 2013
Couch to OpenStack: Glance - July, 23, 2013Couch to OpenStack: Glance - July, 23, 2013
Couch to OpenStack: Glance - July, 23, 2013
 
GPGPU Accelerates PostgreSQL ~Unlock the power of multi-thousand cores~
GPGPU Accelerates PostgreSQL ~Unlock the power of multi-thousand cores~GPGPU Accelerates PostgreSQL ~Unlock the power of multi-thousand cores~
GPGPU Accelerates PostgreSQL ~Unlock the power of multi-thousand cores~
 
20160407_GTC2016_PgSQL_In_Place
20160407_GTC2016_PgSQL_In_Place20160407_GTC2016_PgSQL_In_Place
20160407_GTC2016_PgSQL_In_Place
 
pgconfasia2016 plcuda en
pgconfasia2016 plcuda enpgconfasia2016 plcuda en
pgconfasia2016 plcuda en
 
TrinityCore server install guide
TrinityCore server install guideTrinityCore server install guide
TrinityCore server install guide
 
Common schema my sql uc 2012
Common schema   my sql uc 2012Common schema   my sql uc 2012
Common schema my sql uc 2012
 
Common schema my sql uc 2012
Common schema   my sql uc 2012Common schema   my sql uc 2012
Common schema my sql uc 2012
 
How to Avoid Pitfalls in Schema Upgrade with Galera
How to Avoid Pitfalls in Schema Upgrade with GaleraHow to Avoid Pitfalls in Schema Upgrade with Galera
How to Avoid Pitfalls in Schema Upgrade with Galera
 
Pro PostgreSQL
Pro PostgreSQLPro PostgreSQL
Pro PostgreSQL
 
Oracle Basics and Architecture
Oracle Basics and ArchitectureOracle Basics and Architecture
Oracle Basics and Architecture
 
Curso de MySQL 5.7
Curso de MySQL 5.7Curso de MySQL 5.7
Curso de MySQL 5.7
 
[Pgday.Seoul 2019] Citus를 이용한 분산 데이터베이스
[Pgday.Seoul 2019] Citus를 이용한 분산 데이터베이스[Pgday.Seoul 2019] Citus를 이용한 분산 데이터베이스
[Pgday.Seoul 2019] Citus를 이용한 분산 데이터베이스
 
NetDevOps 202: Life After Configuration
NetDevOps 202: Life After ConfigurationNetDevOps 202: Life After Configuration
NetDevOps 202: Life After Configuration
 
20181016_pgconfeu_ssd2gpu_multi
20181016_pgconfeu_ssd2gpu_multi20181016_pgconfeu_ssd2gpu_multi
20181016_pgconfeu_ssd2gpu_multi
 
Stress your DUT
Stress your DUTStress your DUT
Stress your DUT
 
PLNOG20 - Paweł Małachowski - Stress your DUT–wykorzystanie narzędzi open sou...
PLNOG20 - Paweł Małachowski - Stress your DUT–wykorzystanie narzędzi open sou...PLNOG20 - Paweł Małachowski - Stress your DUT–wykorzystanie narzędzi open sou...
PLNOG20 - Paweł Małachowski - Stress your DUT–wykorzystanie narzędzi open sou...
 
Training Slides: Advanced 304: Upgrading From Native MySQL Replication To Tun...
Training Slides: Advanced 304: Upgrading From Native MySQL Replication To Tun...Training Slides: Advanced 304: Upgrading From Native MySQL Replication To Tun...
Training Slides: Advanced 304: Upgrading From Native MySQL Replication To Tun...
 

More from NTT DATA OSS Professional Services

Global Top 5 を目指す NTT DATA の確かで意外な技術力
Global Top 5 を目指す NTT DATA の確かで意外な技術力Global Top 5 を目指す NTT DATA の確かで意外な技術力
Global Top 5 を目指す NTT DATA の確かで意外な技術力NTT DATA OSS Professional Services
 
Apache Kafkaって本当に大丈夫?~故障検証のオーバービューと興味深い挙動の紹介~
Apache Kafkaって本当に大丈夫?~故障検証のオーバービューと興味深い挙動の紹介~Apache Kafkaって本当に大丈夫?~故障検証のオーバービューと興味深い挙動の紹介~
Apache Kafkaって本当に大丈夫?~故障検証のオーバービューと興味深い挙動の紹介~NTT DATA OSS Professional Services
 
PostgreSQL10を導入!大規模データ分析事例からみるDWHとしてのPostgreSQL活用のポイント
PostgreSQL10を導入!大規模データ分析事例からみるDWHとしてのPostgreSQL活用のポイントPostgreSQL10を導入!大規模データ分析事例からみるDWHとしてのPostgreSQL活用のポイント
PostgreSQL10を導入!大規模データ分析事例からみるDWHとしてのPostgreSQL活用のポイントNTT DATA OSS Professional Services
 
SIerとオープンソースの美味しい関係 ~コミュニティの力を活かして世界を目指そう~
SIerとオープンソースの美味しい関係 ~コミュニティの力を活かして世界を目指そう~SIerとオープンソースの美味しい関係 ~コミュニティの力を活かして世界を目指そう~
SIerとオープンソースの美味しい関係 ~コミュニティの力を活かして世界を目指そう~NTT DATA OSS Professional Services
 
データ活用をもっともっと円滑に! ~データ処理・分析基盤編を少しだけ~
データ活用をもっともっと円滑に!~データ処理・分析基盤編を少しだけ~データ活用をもっともっと円滑に!~データ処理・分析基盤編を少しだけ~
データ活用をもっともっと円滑に! ~データ処理・分析基盤編を少しだけ~NTT DATA OSS Professional Services
 
商用ミドルウェアのPuppet化で気を付けたい5つのこと
商用ミドルウェアのPuppet化で気を付けたい5つのこと商用ミドルウェアのPuppet化で気を付けたい5つのこと
商用ミドルウェアのPuppet化で気を付けたい5つのことNTT DATA OSS Professional Services
 

More from NTT DATA OSS Professional Services (20)

Global Top 5 を目指す NTT DATA の確かで意外な技術力
Global Top 5 を目指す NTT DATA の確かで意外な技術力Global Top 5 を目指す NTT DATA の確かで意外な技術力
Global Top 5 を目指す NTT DATA の確かで意外な技術力
 
Spark SQL - The internal -
Spark SQL - The internal -Spark SQL - The internal -
Spark SQL - The internal -
 
Apache Kafkaって本当に大丈夫?~故障検証のオーバービューと興味深い挙動の紹介~
Apache Kafkaって本当に大丈夫?~故障検証のオーバービューと興味深い挙動の紹介~Apache Kafkaって本当に大丈夫?~故障検証のオーバービューと興味深い挙動の紹介~
Apache Kafkaって本当に大丈夫?~故障検証のオーバービューと興味深い挙動の紹介~
 
Hadoopエコシステムのデータストア振り返り
Hadoopエコシステムのデータストア振り返りHadoopエコシステムのデータストア振り返り
Hadoopエコシステムのデータストア振り返り
 
HDFS Router-based federation
HDFS Router-based federationHDFS Router-based federation
HDFS Router-based federation
 
PostgreSQL10を導入!大規模データ分析事例からみるDWHとしてのPostgreSQL活用のポイント
PostgreSQL10を導入!大規模データ分析事例からみるDWHとしてのPostgreSQL活用のポイントPostgreSQL10を導入!大規模データ分析事例からみるDWHとしてのPostgreSQL活用のポイント
PostgreSQL10を導入!大規模データ分析事例からみるDWHとしてのPostgreSQL活用のポイント
 
Apache Hadoopの新機能Ozoneの現状
Apache Hadoopの新機能Ozoneの現状Apache Hadoopの新機能Ozoneの現状
Apache Hadoopの新機能Ozoneの現状
 
Distributed data stores in Hadoop ecosystem
Distributed data stores in Hadoop ecosystemDistributed data stores in Hadoop ecosystem
Distributed data stores in Hadoop ecosystem
 
Structured Streaming - The Internal -
Structured Streaming - The Internal -Structured Streaming - The Internal -
Structured Streaming - The Internal -
 
Apache Hadoopの未来 3系になって何が変わるのか?
Apache Hadoopの未来 3系になって何が変わるのか?Apache Hadoopの未来 3系になって何が変わるのか?
Apache Hadoopの未来 3系になって何が変わるのか?
 
Apache Hadoop and YARN, current development status
Apache Hadoop and YARN, current development statusApache Hadoop and YARN, current development status
Apache Hadoop and YARN, current development status
 
HDFS basics from API perspective
HDFS basics from API perspectiveHDFS basics from API perspective
HDFS basics from API perspective
 
SIerとオープンソースの美味しい関係 ~コミュニティの力を活かして世界を目指そう~
SIerとオープンソースの美味しい関係 ~コミュニティの力を活かして世界を目指そう~SIerとオープンソースの美味しい関係 ~コミュニティの力を活かして世界を目指そう~
SIerとオープンソースの美味しい関係 ~コミュニティの力を活かして世界を目指そう~
 
20170303 java9 hadoop
20170303 java9 hadoop20170303 java9 hadoop
20170303 java9 hadoop
 
ブロックチェーンの仕組みと動向(入門編)
ブロックチェーンの仕組みと動向(入門編)ブロックチェーンの仕組みと動向(入門編)
ブロックチェーンの仕組みと動向(入門編)
 
Application of postgre sql to large social infrastructure jp
Application of postgre sql to large social infrastructure jpApplication of postgre sql to large social infrastructure jp
Application of postgre sql to large social infrastructure jp
 
Application of postgre sql to large social infrastructure
Application of postgre sql to large social infrastructureApplication of postgre sql to large social infrastructure
Application of postgre sql to large social infrastructure
 
Apache Hadoop 2.8.0 の新機能 (抜粋)
Apache Hadoop 2.8.0 の新機能 (抜粋)Apache Hadoop 2.8.0 の新機能 (抜粋)
Apache Hadoop 2.8.0 の新機能 (抜粋)
 
データ活用をもっともっと円滑に! ~データ処理・分析基盤編を少しだけ~
データ活用をもっともっと円滑に!~データ処理・分析基盤編を少しだけ~データ活用をもっともっと円滑に!~データ処理・分析基盤編を少しだけ~
データ活用をもっともっと円滑に! ~データ処理・分析基盤編を少しだけ~
 
商用ミドルウェアのPuppet化で気を付けたい5つのこと
商用ミドルウェアのPuppet化で気を付けたい5つのこと商用ミドルウェアのPuppet化で気を付けたい5つのこと
商用ミドルウェアのPuppet化で気を付けたい5つのこと
 

Recently uploaded

Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businesspanagenda
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsNanddeep Nachan
 
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfRising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfOrbitshub
 
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...apidays
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoffsammart93
 
[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdfSandro Moreira
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MIND CTI
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...DianaGray10
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWERMadyBayot
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FMESafe Software
 
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamDEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamUiPathCommunity
 
Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)Zilliz
 
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Angeliki Cooney
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesrafiqahmad00786416
 
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...apidays
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyKhushali Kathiriya
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDropbox
 
Vector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxVector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxRemote DBA Services
 

Recently uploaded (20)

Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectors
 
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfRising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
 
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamDEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
 
Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)
 
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challenges
 
Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..
 
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
Vector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxVector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptx
 

Complex stories about Sqooping PostgreSQL data

  • 1. Presentation  slide  for  Sqoop  User  Meetup  (Strata  +  Hadoop  World  NYC  2013)   Complex  stories  about Sqooping  PostgreSQL  data 10/28/2013 NTT  DATA  Corporation Masatake  Iwasaki Copyright © 2013 NTT DATA Corporation
  • 2. Introduction Copyright © 2013 NTT DATA Corporation 2
  • 3. About  Me Masatake  Iwasaki: Software  Engineer  @  NTT  DATA: NTT(Nippon  Telegraph  and  Telephone  Corporation):  Telecommunication NTT  DATA:  Systems  Integrator Developed: Ludia:  Fulltext  search  index  for  PostgreSQL  using  Senna Authored: “A  Complete  Primer  for  Hadoop”   (no  official  English  title) Patches  for  Sqoop: SQOOP-‐‑‒390:  PostgreSQL  connector  for  direct  export  with  pg_̲bulkload SQOOP-‐‑‒999:  Support  bulk  load  from  HDFS  to  PostgreSQL  using  COPY  ...  FROM   SQOOP-‐‑‒1155:  Sqoop  2  documentation  for  connector  development   Copyright © 2013 NTT DATA Corporation 3
  • 4. Why  PostgreSQL? Enterprisy from  earlier  version comparing  to  MySQL Active  community  in  Japan NTT  DATA  commits  itself  to  development Copyright © 2013 NTT DATA Corporation 4
  • 5. Sqooping  PostgreSQL  data Copyright © 2013 NTT DATA Corporation 5
  • 6. Working  around  PostgreSQL Direct  connector  for  PostgreSQL  loader:    SQOOP-‐‑‒390:  PostgreSQL  connector  for  direct  export  with  pg_̲bulkload Yet  another  direct  connector  for  PostgreSQL  JDBC:    SQOOP-‐‑‒999:  Support  bulk  load  from  HDFS  to  PostgreSQL                                            using  COPY  ...  FROM Supporting  complex  data  types:    SQOOP-‐‑‒1149:  Support  Custom  Postgres  Types   Copyright © 2013 NTT DATA Corporation 6
  • 7. Direct  connector  for  PostgreSQL  loader SQOOP-‐‑‒390:      PostgreSQL  connector  for  direct  export  with  pg_̲bulkload pg_̲bulkload: Data  loader  for  PostgreSQL Server  side  plug-‐‑‒in  library  and  client  side  command Providing  filtering  and  transformation  of  data http://pgbulkload.projects.pgfoundry.org/ Copyright © 2013 NTT DATA Corporation 7
  • 8. SQOOP-‐‑‒390:      PostgreSQL  connector  for  direct  export  with  pg_̲bulkload HDFS File  Split File  Split File  Split CREATE  TABLE    tmp3(LIKE  dest  INCLUDING  CONSTRAINTS) Mapper Mapper Mapper pg_̲bulkload pg_̲bulkoad pg_̲bulkload tmp1 PostgreSQL tmp2 Reducer Destination  Table Copyright © 2013 NTT DATA Corporation tmp3 external  process staging  table  per  mapper  is  must due  to  table  level  locks BEGIN INSERT  INTO  dest  (  SELECT  *  FROM  tmp1  ) DROP  TABLE  tmp1 INSERT  INTO  dest  (  SELECT  *  FROM  tmp2  ) DROP  TABLE  tmp2 INSERT  INTO  dest  (  SELECT  *  FROM  tmp3  ) DROP  TABLE  tmp3 COMMIT 8
  • 9. Direct  connector  for  PostgreSQL  loader Pros: Fast by  short-‐‑‒circuitting  server  functionality Flexible filtering  error  records Cons: Not  so  fast Bottleneck  is  not  in  client  side  but  in  DB  side Built-‐‑‒in  COPY  functionality  is  fast  enough Not  General pg_̲bulkload  supports  only  export Requiring  setup  on  all  slave  nodes  and  client  node Possible  to  Require  recovery  on  failure Copyright © 2013 NTT DATA Corporation 9
  • 10. Yet  another  direct  connector  for  PostgreSQL  JDBC PostgreSQL  provides  custom  SQL  command  for  data  import/export COPY table_name [ ( column_name [, ...] ) ] FROM { 'filename' | STDIN } [ [ WITH ] ( option [, ...] ) ] COPY { table_name [ ( column_name [, ...] ) ] | ( query ) } TO { 'filename' | STDOUT } [ [ WITH ] ( option [, ...] ) ] where option can be one of: FORMAT format_name OIDS [ boolean ] DELIMITER 'delimiter_character' NULL 'null_string' HEADER [ boolean ] QUOTE 'quote_character' ESCAPE 'escape_character' FORCE_QUOTE { ( column_name [, ...] ) | * } FORCE_NOT_NULL ( column_name [, ...] ) ENCODING 'encoding_name‘ AND  JDBC  API org.postgresql.copy.* Copyright © 2013 NTT DATA Corporation 10
  • 11. SQOOP-‐‑‒999:    Support  bulk  load  from  HDFS  to  PostgreSQL  using  COPY  ...  FROM File  Split File  Split File  Split Mapper Mapper Mapper PostgreSQL JDBC HDFS PostgreSQL JDBC PostgreSQL JDBC Using  custom  SQL  command   via  JDBC  API only  available  in  PostgreSQL COPY  FROM  STDIN  WITH  ... staging  tagle PostgreSQL Destination  Table Copyright © 2013 NTT DATA Corporation 11
  • 12. SQOOP-‐‑‒999:    Support  bulk  load  from  HDFS  to  PostgreSQL  using  COPY  ...  FROM   import org.postgresql.copy.CopyManager; import org.postgresql.copy.CopyIn; ... Requiring  PostgreSQL   specific  interface. protected void setup(Context context) ... dbConf = new DBConfiguration(conf); CopyManager cm = null; ... public void map(LongWritable key, Writable value, Context context) ... if (value instanceof Text) { line.append(System.getProperty("line.separator")); } try { byte[]data = line.toString().getBytes("UTF-8"); Just  feeding  lines  of  text copyin.writeToCopy(data, 0, data.length); Copyright © 2013 NTT DATA Corporation 12
  • 13. Yet  another  direct  connector  for  PostgreSQL  JDBC Pros: Fast  enough Ease  of  use JDBC  driver  jar  is  distributed  automatically  by  MR  framework Cons: Dependency  on  not  general  JDBC possible  licensing  issue  (PostgreSQL  is  OK,  itʼ’s  BSD  Lisence) build  time  requirement  (PostgreSQL  JDBC  is  available  in  Maven  repo.) <dependency org="org.postgresql" name="postgresql" rev="${postgresql.version}" conf="common->default" /> Error  record  causes  rollback  of  whole  transaction Still  difficult  to  implement  custom  connector  for  IMPORT because  of  code  generation  part Copyright © 2013 NTT DATA Corporation 13
  • 14. Supporting  complex  data  types PostgreSQL  supports  lot  of  complex  data  types   Geometric  Types Points Line  Segments Boxes Paths Polygons Circles Network  Address  Types inet cidr macaddr XML  Type JSON  Type Supporting  complex  data  types:    SQOOP-‐‑‒1149:  Support  Custom  Postgres  Types Copyright © 2013 NTT DATA Corporation not  me 14
  • 15. Constraints  on  JDBC  data  types  in  Sqoop  framework protected Map<String, Integer> getColumnTypesForRawQuery(String stmt) { ... results = execute(stmt); ... ResultSetMetaData metadata = results.getMetaData(); for (int i = 1; i < cols + 1; i++) { int typeId = metadata.getColumnType(i); returns  java.sql.Types.OTHER  for  types   { not  mappable  to  basic    Java  data  types =>  Losing  type  imformation     public String toJavaType(int sqlType) // Mappings taken from: // http://java.sun.com/j2se/1.3/docs/guide/jdbc/getstart/mapping.html if (sqlType == Types.INTEGER) { return "Integer"; } else if (sqlType == Types.VARCHAR) { return "String"; .... reaches  here } else { // TODO(aaron): Support DISTINCT, ARRAY, STRUCT, REF, JAVA_OBJECT. // Return null indicating database-specific manager should return a // java data type if it can find one for any nonstandard type. return null; Copyright © 2013 NTT DATA Corporation 15
  • 16. Sqoop  1  Summary Pros: Simple  Standalone  MapReduce  Driver Easy  to  understand  for  MR  application  developpers                                                                            except  for  ORM  (SqoopRecord)  code  generation  part. Variety  of  connectors Lot  of  information Cons: Complex  command  line  and  inconsistent  options meaning  of  options  is  according  to  connectors Not  enough  modular Dependency  on  JDBC  data  model Security Copyright © 2013 NTT DATA Corporation 16
  • 17. Sqooping  PostgreSQL  Data  2 Copyright © 2013 NTT DATA Corporation 17
  • 18. Sqoop  2 Everything  are  rewritten Working  on  server  side More  modular Not  compatible  with  Sqoop  1  at  all (Almost)  Only  generic  connector Black  box  comparing  to  Sqoop  1 Needs  more  documentation SQOOP-‐‑‒1155:  Sqoop  2  documentation  for  connector  development Internal of Sqoop2 MapReduce Job ++++++++++++++++++++++++++++++++ ... - OutputFormat invokes Loader's load method (via SqoopOutputFor .. todo: sequence diagram like figure. Copyright © 2013 NTT DATA Corporation 18
  • 19. Sqoop2:  Initialization  phase  of  IMPORT  job Implement  this ,----------------. ,-----------. |SqoopInputFormat| |Partitioner| `-------+--------' `-----+-----' getSplits | | ----------->| | | getPartitions | |------------------------>| | | ,---------. | |-------> |Partition| | | `----+----' |<- - - - - - - - - - - - | | | | | ,----------. |-------------------------------------------------->|SqoopSplit| | | | `----+-----' Copyright © 2013 NTT DATA Corporation 19
  • 20. Sqoop2:  Map  phase  of  IMPORT  job ,-----------. |SqoopMapper| `-----+-----' run | --------->| ,-------------. |---------------------------------->|MapDataWriter| | `------+------' Implement  this | ,---------. | |--------------> |Extractor| | | `----+----' | | extract | | |-------------------->| | Conversion  to  Sqoop   | | | internal  data  format read from DB | | <-------------------------------| write* | | |------------------->| | | | ,----. | | |---------->|Data| | | | `-+--' | | | | | | context.write | | |--------------------------> Copyright © 2013 NTT DATA Corporation 20
  • 21. Sqoop2:  Reduce  phase  of  EXPORT  job ,-------. ,---------------------. |Reducer| |SqoopNullOutputFormat| `---+---' `----------+----------' | | ,-----------------------------. Implement  this | |-> |SqoopOutputFormatLoadExecutor| | | `--------------+--------------' ,----. | | |---------------------> |Data| | | | `-+--' | | | ,-----------------. | | | |-> |SqoopRecordWriter| | getRecordWriter | | `--------+--------' | ----------------------->| getRecordWriter | | | | |----------------->| | | ,--------------. | | |-----------------------------> |ConsumerThread| | | | | | `------+-------' | |<- - - - - - - - -| | | | ,------. <- - - - - - - - - - - -| | | | |--->|Loader| | | | | | | `--+---' | | | | | | | | | | | | | load | run | | | | | |------>| ----->| | write | | | | | |------------------------------------------------>| setContent | | read* | | | | |----------->| getContent |<------| | | | | |<-----------| | | | | | | | - - ->| | | | | | | | write into DB | | | | | | |--------------> Copyright © 2013 NTT DATA Corporation 21
  • 22. Summary Copyright © 2013 NTT DATA Corporation 22
  • 23. My  interest  for  popularizing  Sqoop  2 Complex  data  type  support  in  Sqoop  2 Bridge  to  use  Sqoop  1  connectors  on  Sqoop  2 Bridge  to  use  Sqoop  2  connectors  from  Sqoop  1  CLI Copyright © 2013 NTT DATA Corporation 23
  • 24. Copyright © 2011 NTT DATA Corporation Copyright © 2013 NTT DATA Corporation