Our scenario is a below:
- DWH: Netezza (IBM Puredata)
- Hadoop : Cloudera (CDH5.4) and we are using Cloudera Manager.
- Hadoop Cluster configuration consideration: All the worker nodes are behind a private network. Can not be accessed with our edge/management nodes.
- Objective: Import data from Netezza to a Hive table.
I followed below steps to meet the objective.
As pr our current stats, we are now able to migrate 700 GB data in 1 hour (using 4 channels or mappers).
1. Install netezza
connectors:
1.1
Install netezza connector for sqoop following steps in link:
http://cloudera.com/content/cloudera/en/documentation/connectors/latest/Netezza/Cloudera-Connector-for-Netezza/ccnt_topic_3.html
1.2
(for Oozie)download "sqoop-nz-connector-1.2c5.tar" from below link
and upload sqoop-nz-connector-1.2c5.jar to HDFS:
/user/oozie/share/lib/lib_/sqoop/
http://www.cloudera.com/content/cloudera/en/downloads/connectors/sqoop/netezza/1-2c5-for-cdh-5-x.html
1.3 sudo -su oozie
hadoop fs -put /tmp/sqoop-nz-connector-1.2c5.jar /user/oozie/share/lib/lib_/sqoop/
2. In addition to
above,
2.1
Netezza JDBC driver directory /var/lib/sqoop should have below permissions:
This directory should have set permission as
(chmod 755 /var/lib/sqoop)
full path up to the driver file
/var/lib/sqoop/nzjdbc3.jar should be accessible by others (chmod 444
/var/lib/sqoop/nzjdbc3.jar)
2.2
(for Oozie)upload netezza jdbc driver to
HDFS: /user/oozie/share/lib/lib_/sqoop/
sudo
-su oozie hadoop fs -put /var/lib/sqoop/nzjdbc3.jar
/user/oozie/share/lib/lib_/sqoop/
2.3
restart oozie
3. Configure proxy
on edge node/s to connect to netezza for data nodes. (in case your datanodes
are in private network).
[sqoop will launches
several map jobs on data nodes and data nodes connect directly to netezza]
3.1
install haproxy on the edge node/s
yum install haproxy
3.2
Configure the proxy for netezza.
Append
below in '/etc/haproxy/haproxy.cfg'. Here the proxy will listen to 25501 on the
edge node/s and bypass the incoming connection to netezza host:port.
#
Proxy For Netezza
listen
netezza : 25501
mode tcp
option tcplog
balance leastconn
server netezza :
3.3
service haproxy restart
4. To import use
below
*** the user should
have create external table privilege on the target database on netezza
- use username in
upper case
- use direct ( Each
map task of Netezza connector’s import job works on a subset of the Netezza
partitions and transparently creates and uses an external table to transport
data)
- use
"--P" for password prompt
- we can pass
mapreduce params like "-D mapreduce.task.timeout=0" just after
import/export command
Here:
4.1 to import data on a HDFS directory use below:
sqoop import \
--direct \
--connect
jdbc:netezza://:/ \
--table D_ACCOUNT \
--fields-terminated-by
'|' \
--target-dir
/user/etl/temp/ \
--username
\
--password
'' \
-- --nz-logdir /tmp/
4.2 to import data on Hive table use below (it
will prompt for password as we used “--P”):
sqoop import \
--connect
jdbc:netezza://:25501/ \
--table . \
--where \
--target-dir "" \
--username \
--P \
--hive-import \
--hive-table "" \
--hive-overwrite \
--delete-target-dir \
--num-mappers 4 \
--verbose \
--direct \
-- --nz-logdir
--nz-uploaddir
4.3 Then use hive to copy data from temp table(text
or serial format) to final table(parquet + snappy) (select ..insert). Sqoop has
some limitation to load data directly to a parquet table.
Hey, Great job here on NETEZZA. You know what! I read a lot of blog posts and I never come across such a helpful post. I like this post you made with great effort. Very ingenious.
ReplyDeleteGood blog,thanks for sharing the valuable information...thank you.
ReplyDeleteBest software Training institute in Bangalore
It was really a nice post and i was really impressed by reading this Big Data Hadoop Online Course Bnagalore
ReplyDeleteThank you for sharing your wonderful information. for professional graphics,
ReplyDeletework contacts me. I am a freelance designer in gurgaon.
Freelance Graphic Designing
Freelance Catalogue Designing in delhi
Freelance Catalogue Designing in gurgaon
Freelance Brochure Designing
Freelance Label Designing
Freelance Banner Designer
Freelance Poster Designer
graphic design services in delhi
graphic design services in gurgaon
Freelance Catalogue Designing in delhi
Freelance Catalogue Designing in gurgaon
Freelance Brochure Designing
Freelance Label Designing
Freelance Banner Designer
Freelance Poster Designer
graphic design services in delhi
graphic design services in gurgaon
freelance website designer in gurgaon
freelance designer in gurgaon
freelance website designer in gurgaon
freelance web designer in gurgaon
freelance graphic designer services in gurgaon
freelancer graphic designer services in gurgaon
freelancer graphic designer services in gurgaon
freelancer graphic services in gurgaon
freelancer logo services in gurgaon
freelancer logo services in gurgaon
freelancer web designer services in gurgaon
freelancer web designer services in gurgaon
freelance web designer services in gurgaon
freelance website designer services in gurgaon
freelance website designer services in gurgaon
freelance logo designer service in gurgaon
freelance logo designer service in gurgaon
logo designer in gurgaon
brochure design in gurgaon
logo design in gurgaon
freelance logo design in gurgaon
freelance logo designer in gurgaon
freelance logo designer in gurgaon
great very informative....
ReplyDeleteInterview Question for CTS Placement
Cognizant Interview Questions For Fresher
Cognizant Interview Questions
Resume Coustomer Service Executive
Resume For Bank Job
Resume Cyber security Engineer
Resume Data Base Developer
Resume DeputyManager
Resume Design Engineer
Resume Desktop Support Engineer
I am impressed. I don't think Ive met anyone who knows as much about this subject as you do. You are truly well informed and very intelligent. You wrote something that people could understand and made the subject intriguing for everyone. Really, great blog you have got here
ReplyDeleteBCOM 1st Year Exam TimeTable 2020
BCOM 2nd Year Exam Date Sheet/TimeTable 2020
BCOM 3rd Year
Exam Time Schedule 2020
It is really wonderful and awesome thus it is very much useful for me to understand many concepts and helped me a lot
ReplyDeleteSalesforce Training in Chennai
Salesforce Online Training in Chennai
Salesforce Training in Bangalore
Salesforce Training in Hyderabad
Salesforce training in ameerpet
Salesforce Training in Pune
Salesforce Online Training
Salesforce Training
kayseriescortu.com - alacam.org - xescortun.com
ReplyDeleteSMM PANEL
ReplyDeletesmm panel
İs İlanlari Blog
instagram takipçi satın al
HİRDAVATCİ
HTTPS://WWW.BEYAZESYATEKNİKSERVİSİ.COM.TR
servis
tiktok jeton hilesi
beykoz toshiba klima servisi
ReplyDeletemaltepe beko klima servisi
kadıköy beko klima servisi
üsküdar toshiba klima servisi
kartal bosch klima servisi
ümraniye bosch klima servisi
kartal arçelik klima servisi
ümraniye arçelik klima servisi
beykoz samsung klima servisi
Good content. You write beautiful things.
ReplyDeletesportsbet
mrbahis
hacklink
mrbahis
korsan taksi
taksi
vbet
sportsbet
hacklink
Good text Write good content success. Thank you
ReplyDeletetipobet
kibris bahis siteleri
kralbet
bonus veren siteler
poker siteleri
mobil ödeme bahis
slot siteleri
betmatik
dijital kartvizit
ReplyDeletereferans kimliği nedir
binance referans kodu
referans kimliği nedir
bitcoin nasıl alınır
resimli magnet
Z12XLX
hatay
ReplyDeletekars
mardin
samsun
urfa
7DL
https://saglamproxy.com
ReplyDeletemetin2 proxy
proxy satın al
knight online proxy
mobil proxy satın al
4Z5X
Sridevi Satta Penal Chart, sridevi satta result penal chart, sridevi satta matka penal jodi patti record chart, sridevi day night satta chart, day night sridevi satta patti penal chart
ReplyDeleteThank You and I have a nifty proposal: What Do House Renovations Cost custom home additions
ReplyDeleteصيانة افران الغاز بمكه 9s1jkbU8ni
ReplyDeleteشركة مكافحة بق الفراش بالاحساء uVe2GPFJJp
ReplyDeleteصيانة افران مكة fkfrWoCyBe
ReplyDelete