Analysing Apache Pig, Apache Hive, and MySQL Query Performance on Large Datasets
Hadoop is a framework that allows the distributed processing of large data sets across clusters of computers using simple programming models. With its capability to process large data sets, Hadoop can help save computing resources. MySQL Cluster is a popular database used by companies, governments, and many other institutions. The problem with MySQL Cluster is that as the data grow larger, processing time increases and additional resources may be needed. With Hadoop and the tools built on top of it, processing time can be reduced while maintaining data consistency. The purpose of this research is to find at what point Hadoop and its tools can outperform MySQL Cluster's processing time while preserving data consistency. Another aspect that makes Hadoop suitable for big data is that adding an extra Hadoop node can further reduce processing time. The research is conducted by creating dummy datasets with large numbers of rows and running query statements on each dataset. The resulting query times determine when Hadoop overtakes MySQL Cluster.
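The benchmarking approach described above (build a dummy dataset with many rows, run the same query statement, record elapsed time) can be sketched as follows. This is a minimal illustration only: `sqlite3` stands in for MySQL Cluster and Hive, and the table schema, row count, and query are assumptions, not the thesis's actual datasets or measurements.

```python
import sqlite3
import time

def build_dummy_dataset(conn, n_rows):
    # Create a dummy table and fill it with synthetic rows,
    # mimicking the "dummy datasets with large numbers of rows" step.
    conn.execute("CREATE TABLE sales (id INTEGER, region TEXT, amount REAL)")
    rows = [(i, "region_%d" % (i % 10), float(i % 100)) for i in range(n_rows)]
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    conn.commit()

def time_query(conn, sql):
    # Run one query statement and measure its wall-clock time,
    # the metric the thesis compares across systems.
    start = time.perf_counter()
    result = conn.execute(sql).fetchall()
    return result, time.perf_counter() - start

conn = sqlite3.connect(":memory:")  # sqlite3 is a stand-in, an assumption
build_dummy_dataset(conn, 100_000)
result, elapsed = time_query(
    conn, "SELECT region, SUM(amount) FROM sales GROUP BY region")
print(len(result), elapsed)
```

In the actual study, the same query would be issued to MySQL Cluster and to Hive or Pig over Hadoop, and the elapsed times plotted against dataset size to find the crossover point.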