Add/Update Thesis
Title*
Author's Name*
Supervisor's Name
Abstract
The amount of data has been increasing over the last few years due to the emergence of various end-user applications. These applications utilize cloud computing infrastructure in data centers. Apart from the increasing volume of data, other factors such as the variety, velocity, and veracity of the data contribute to the problem of big data. Traditional database management systems cannot handle big data efficiently, so a big data platform is necessary. Hadoop is one such platform, and it uses a distributed storage system. Hive and HBase are two of the big data tools for storing big data in Hadoop. They run on top of the Hadoop Distributed File System (HDFS). Hive is an open-source data warehouse framework that lets programmers query and analyze large data sets stored in HDFS. HBase is a column-oriented, distributed, and highly fault-tolerant database used to store and manage big data. It can store billions of rows at a time. When data comes from multiple sources, it is stored in multiple tables in Hive and HBase. As a result, query performance degrades when join operations are needed. In this thesis, we propose an architecture which stores data from multiple sources in a single HBase table. A new table schema with a unique row key is designed which integrates multi-source data in one table. The proposed technique requires no join operations because the data is already integrated into a single HBase table. We evaluated the proposed technique on a real testbed using a dataset from two publishers. We compared the performance of storing the data in Hive with that of storing it in the proposed HBase table. Results show improved query performance of the proposed technique as compared to the traditional approach of using join operations across multiple tables in Hive.
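The single-table idea described in the abstract can be illustrated with a minimal sketch. The abstract does not give the actual schema, so everything here is an assumption made for illustration: the shared identifier used as the row key, the `make_row_key` normalization, the column family names `pub_a` and `pub_b`, and the sample records are all hypothetical. A plain Python dictionary stands in for the HBase table.

```python
from typing import Dict

# Hypothetical records from two publishers, keyed by a shared identifier.
publisher_a = {
    "978-0-00-000001-1": {"title": "Big Data Basics", "price": "30"},
}
publisher_b = {
    "978-0-00-000001-1": {"title": "Big Data Basics", "pages": "210"},
    "978-0-00-000002-8": {"title": "Hadoop Notes", "pages": "150"},
}


def make_row_key(identifier: str) -> str:
    """Normalize the shared identifier into a row key (assumed scheme)."""
    return identifier.replace("-", "")


def integrate(
    sources: Dict[str, Dict[str, Dict[str, str]]]
) -> Dict[str, Dict[str, str]]:
    """Merge all sources into one wide table.

    The result maps row key -> {"family:qualifier": value}, with one
    column family per source, mimicking an HBase-style layout.
    """
    table: Dict[str, Dict[str, str]] = {}
    for family, records in sources.items():
        for identifier, fields in records.items():
            row = table.setdefault(make_row_key(identifier), {})
            for qualifier, value in fields.items():
                row[f"{family}:{qualifier}"] = value
    return table


table = integrate({"pub_a": publisher_a, "pub_b": publisher_b})

# A single lookup by row key returns data from both sources, with no join.
row = table[make_row_key("978-0-00-000001-1")]
```

With this layout, a query for one item reads a single row instead of joining one table per source, which is the effect the abstract attributes to the proposed schema.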
Subject/Specialization
Language
Program
Faculty/Department's Name
Institute Name
University Type
Public
Private
Campus (if any)
Institute Affiliation Information (if any)
City where institute is located
Province
Country
Degree Starting Year
Degree Completion Year
Year of Viva Voce Exam
Thesis Completion Year
Thesis Status
Completed
Incomplete
Number of Pages
Urdu Keywords
English Keywords
Link
Select Category
Religious Studies
Social Sciences & Humanities
Science
Technology
Any other information you want to share, such as the Table of Contents or Conclusion.
Your email address*