Category Archives: Hadoop tutorial

Parse XML File using HIVE

Posted by Sumit Kumar

In this post, we will learn how to parse an XML file using Hive. I am using the XML file below for this example.

jmdbks@hadoop:~$ cat test.xml
<test><name>Sumit Kumar</name><properties><age>29</age><sex>male</sex></properties></test>
<test><name>Amit Kumar</name><properties><age>30</age><sex>male</sex></properties></test>
<test><name>Aditya Kumar</name><properties><age>23</age><sex>male</sex></properties></test>
<test><name>Priya Kumar</name><properties><age>24</age><sex>Female</sex></properties></test>
<test><name>Rohan Kumar</name><properties><age>20</age><sex>male</sex></properties></test>
<test><name>Nitish Kumar</name><properties><age>29</age><sex>male</sex></properties></test>
jmdbks@hadoop:~$

Below is the step-by-step procedure to parse an XML file using Hive. Step 1:- […]
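The Hive steps themselves are cut off in this excerpt, so the following is not the author's procedure. It is only a minimal Python sketch, using the standard library, of which fields a parse of this file would pull out of each one-record-per-line `<test>` element. The names `SAMPLE_LINES` and `parse_record` are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Two records copied from test.xml above; each line is one complete XML document.
SAMPLE_LINES = [
    "<test><name>Sumit Kumar</name><properties><age>29</age><sex>male</sex></properties></test>",
    "<test><name>Priya Kumar</name><properties><age>24</age><sex>Female</sex></properties></test>",
]


def parse_record(line):
    """Parse one <test> element and return its name, age and sex."""
    root = ET.fromstring(line)
    return {
        "name": root.findtext("name"),
        # age and sex are nested one level down, inside <properties>
        "age": int(root.findtext("properties/age")),
        "sex": root.findtext("properties/sex"),
    }


records = [parse_record(line) for line in SAMPLE_LINES]
for record in records:
    print(record["name"], record["age"], record["sex"])
```

The paths `name`, `properties/age` and `properties/sex` mirror the nesting of the sample file. Any Hive-based approach would need to address the same three locations.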

Important built-in functions in Hive

Posted by Sumit Kumar

(I) explode() and posexplode():- explode() takes an array (or a map) as input and outputs the elements of the array (map) as separate rows. The example below will help to understand explode() better.

1) Create an example data set that has only one column, of type Array<int>.

beauty2955@hadoop:~$ cat array_exm1
100,200,300,500
400,200,201
300,45
101

2) Create a table and load array_exm1 into […]
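To make the row-per-element behaviour concrete, here is a small Python sketch that imitates explode() on the array_exm1 data above. It runs outside Hive and is only an illustration of the semantics, not Hive code. The posexplode() part relies on Hive's documented behaviour of also returning each element's zero-based position, which this excerpt does not spell out. The function names simply mirror the Hive ones.

```python
def explode(arrays):
    """Yield every element of every array as its own output row."""
    for array in arrays:
        for element in array:
            yield element


def posexplode(arrays):
    """Like explode, but pair each element with its zero-based position."""
    for array in arrays:
        for pos, element in enumerate(array):
            yield (pos, element)


# The four lines of array_exm1, each parsed into a list of ints.
raw_lines = ["100,200,300,500", "400,200,201", "300,45", "101"]
arrays = [[int(x) for x in line.split(",")] for line in raw_lines]

exploded = list(explode(arrays))
pos_exploded = list(posexplode(arrays))

print(exploded)          # ten rows, one per array element
print(pos_exploded[:5])  # the position restarts at 0 for each input array
```

Four input rows become ten output rows, because every array element is emitted separately.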

Date functions in Hive

Posted by Sumit Kumar

1) from_unixtime: This function converts the number of seconds since the Unix epoch (1970-01-01 00:00:00 UTC) to a STRING that represents the TIMESTAMP of that moment in the current system time zone, in the format “1970-01-01 00:00:00”. The following example returns the current date including the time.

hive> SELECT FROM_UNIXTIME(UNIX_TIMESTAMP());
OK
2015-05-18 05:43:37
Time taken: 0.153 […]
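The same conversion can be sketched in Python to show what the function computes. This is an illustration of the semantics only, not Hive code. The helper name `from_unixtime` and its optional `tz` argument are made up here, and UTC is passed explicitly in the example purely so the result is reproducible, since the time zone of the session above is not stated.

```python
from datetime import datetime, timezone


def from_unixtime(seconds, tz=None):
    """Format seconds since the Unix epoch as 'YYYY-MM-DD HH:MM:SS'.

    With tz=None the local system time zone is used, which matches the
    behaviour described above; pass a tzinfo to pin the zone explicitly.
    """
    return datetime.fromtimestamp(seconds, tz).strftime("%Y-%m-%d %H:%M:%S")


# Assumption: UTC is chosen only to make the output deterministic.
epoch_start = from_unixtime(0, timezone.utc)
sample = from_unixtime(1431927817, timezone.utc)

print(epoch_start)  # 1970-01-01 00:00:00
print(sample)       # 2015-05-18 05:43:37
```

Calling `from_unixtime(seconds)` without a time zone gives a result that depends on the machine it runs on, which is why the same query can print different strings on differently configured servers.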

Hive Installation

Posted by Sumit Kumar

Hive installation:

1.) Search for the apache hive-2.2.0 bin in Google and download the tar file (the latest bin.tar.gz file) from http://www-eu.apache.org/dist/hive/hive-2.2.0/
e.g.:- apache-hive-2.2.0-bin.tar.gz
or download Hive from the Linux command line as below:
wget http://www-eu.apache.org/dist/hive/hive-2.2.0/apache-hive-2.2.0-bin.tar.gz

2.) Extract the file:
tar -xvf <filename>
e.g.:-
tar -xvf apache-hive-2.2.0-bin.tar.gz
mv apache-hive-2.2.0-bin hive2

3.) Download the MySQL connector by using the command below:
wget https://la-mirrors.evowise.com/mysql/Downloads/Connector-J/mysql-connector-java-5.1.45.tar.gz
Extract the file: […]

