Please use this identifier to cite or link to this item: http://bura.brunel.ac.uk/handle/2438/11577
Full metadata record
DC Field | Value | Language
dc.contributor.author | Khan, M | -
dc.contributor.author | Jin, Y | -
dc.contributor.author | Li, M | -
dc.contributor.author | Xiang, Y | -
dc.contributor.author | Jiang, C | -
dc.date.accessioned | 2015-11-10T14:16:36Z | -
dc.date.available | 2015 | -
dc.date.available | 2015-11-10T14:16:36Z | -
dc.date.issued | 2015 | -
dc.identifier.citation | IEEE Transactions on Parallel and Distributed Systems, (2015) | en_US
dc.identifier.issn | 1045-9219 | -
dc.identifier.uri | http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=7045505 | -
dc.identifier.uri | http://bura.brunel.ac.uk/handle/2438/11577 | -
dc.description.abstract | MapReduce has become a major computing model for data-intensive applications. Hadoop, an open source implementation of MapReduce, has been adopted by a growing user community. Cloud computing service providers such as Amazon EC2 Cloud offer Hadoop users the opportunity to lease a certain amount of resources and pay for their use. However, a key challenge is that cloud service providers do not have a resource provisioning mechanism to satisfy user jobs with deadline requirements. Currently, it is solely the user's responsibility to estimate the amount of resources required to run a job in the cloud. This paper presents a Hadoop job performance model that accurately estimates job completion time and then provisions the amount of resources required for a job to be completed within a deadline. The proposed model builds on historical job execution records and employs the Locally Weighted Linear Regression (LWLR) technique to estimate the execution time of a job. Furthermore, it employs the Lagrange multipliers technique for resource provisioning to satisfy jobs with deadline requirements. The proposed model is initially evaluated on an in-house Hadoop cluster and subsequently evaluated in the Amazon EC2 Cloud. Experimental results show that the accuracy of the proposed model in job execution estimation is between 94.97% and 95.51%, and that jobs are completed within the required deadlines under the resource provisioning scheme of the proposed model. | en_US
dc.description.sponsorship | This research is partially supported by the 973 project on Network Big Data Analytics funded by the Ministry of Science and Technology, China. No. 2014CB340404. | en_US
dc.format.extent | 1 - 1 | -
dc.language.iso | en | en_US
dc.publisher | IEEE | en_US
dc.subject | Cloud computing | en_US
dc.subject | Hadoop MapReduce | en_US
dc.subject | Job estimation | en_US
dc.title | Hadoop performance modeling for job estimation and resource provisioning | en_US
dc.type | Article | en_US
dc.identifier.doi | http://dx.doi.org/10.1109/TPDS.2015.2405552 | -
dc.relation.isPartOf | IEEE Transactions on Parallel and Distributed Systems | -
pubs.publication-status | Published | -
Appears in Collections: Dept of Electronic and Electrical Engineering Research Papers

Files in This Item:
File | Description | Size | Format
Fulltext.pdf | | 1.02 MB | Adobe PDF


Items in BURA are protected by copyright, with all rights reserved, unless otherwise indicated.