Please use this identifier to cite or link to this item: http://bura.brunel.ac.uk/handle/2438/5452
Full metadata record
DC field: value (language)
dc.contributor.advisor: Li, M
dc.contributor.author: Alham, Nasullah Khalid
dc.date.accessioned: 2011-06-30T11:13:22Z
dc.date.available: 2011-06-30T11:13:22Z
dc.date.issued: 2011
dc.identifier.uri: http://bura.brunel.ac.uk/handle/2438/5452
dc.description: This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University. (en_US)
dc.description.abstract: Machine learning techniques have facilitated image retrieval by automatically classifying and annotating images with keywords. Among these, Support Vector Machines (SVMs) are used extensively due to their generalization properties. However, SVM training is a notably computationally intensive process, especially when the training dataset is large. In this thesis, distributed computing paradigms are investigated to speed up SVM training by partitioning a large training dataset into small data chunks and processing each chunk in parallel, utilizing the resources of a cluster of computers. A resource-aware parallel SVM algorithm is introduced for large-scale image annotation using a cluster of computers, and a genetic algorithm-based load-balancing scheme is designed to optimize the performance of the algorithm in heterogeneous computing environments. SVM was initially designed for binary classification; however, most classification problems arising in domains such as image annotation involve more than two classes. A resource-aware parallel multiclass SVM algorithm for large-scale image annotation using a cluster of computers is therefore introduced. Combining classifiers leads to a substantial reduction of classification error in a wide range of applications, and SVM ensembles with bagging have been shown to outperform a single SVM in terms of classification accuracy. However, training SVM ensembles is a notably computationally intensive process, especially when the number of replicated samples generated by bootstrapping is large. A distributed SVM ensemble algorithm for image annotation is introduced which re-samples the training data by bootstrapping and trains an SVM on each sample in parallel using a cluster of computers. The above algorithms are evaluated in both experimental and simulation environments, showing that the distributed SVM, distributed multiclass SVM, and distributed SVM ensemble algorithms reduce training time significantly while maintaining a high level of classification accuracy. (en_US) (An illustrative code sketch of the bootstrap-and-train-in-parallel scheme follows the metadata record below.)
dc.language.iso: en (en_US)
dc.publisher: Brunel University School of Engineering and Design PhD Theses
dc.relation.uri: http://bura.brunel.ac.uk/bitstream/2438/5452/1/FulltextThesis.pdf
dc.subject: Image annotation (en_US)
dc.subject: Map reduce (en_US)
dc.subject: Machine learning (en_US)
dc.subject: SVM (en_US)
dc.subject: Distributed computing (en_US)
dc.title: Parallelizing support vector machines for scalable image annotation (en_US)
dc.type: Thesis (en_US)
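
The distributed SVM ensemble described in the abstract follows a simple pattern: draw bootstrap replicas of the training set, train one SVM per replica in parallel, and combine the members' predictions. The sketch below is a minimal single-machine illustration of that pattern, not the thesis's implementation: it assumes scikit-learn's SVC, the bundled digits dataset as a stand-in for annotated image features, and a multiprocessing pool in place of the MapReduce cluster indicated by the subject keywords; the helper names (train_on_bootstrap, majority_vote) and all parameter values are hypothetical.

# Illustrative sketch only: a single-machine stand-in, using a process pool
# in place of the cluster of computers described in the thesis. Dataset,
# parameters, and helper names are assumptions for demonstration.
from multiprocessing import Pool

import numpy as np
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC


def train_on_bootstrap(args):
    """Train one SVM on a bootstrap replica of the training data."""
    X, y, seed = args
    rng = np.random.RandomState(seed)
    idx = rng.randint(0, len(X), size=len(X))  # sample with replacement
    clf = SVC(kernel="rbf", gamma="scale")
    clf.fit(X[idx], y[idx])
    return clf


def majority_vote(classifiers, X):
    """Combine the ensemble members' predictions by majority voting."""
    votes = np.stack([clf.predict(X) for clf in classifiers])  # (n_clf, n_samples)
    return np.apply_along_axis(
        lambda col: np.bincount(col).argmax(), axis=0, arr=votes
    )


if __name__ == "__main__":
    X, y = load_digits(return_X_y=True)      # small multiclass image dataset
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    n_replicas = 8                           # number of bootstrap samples
    tasks = [(X_train, y_train, seed) for seed in range(n_replicas)]

    # Each bootstrap replica is trained in a separate worker process,
    # mirroring the "one SVM per node" idea at a much smaller scale.
    with Pool(processes=4) as pool:
        ensemble = pool.map(train_on_bootstrap, tasks)

    y_pred = majority_vote(ensemble, X_test)
    print("ensemble accuracy:", (y_pred == y_test).mean())

Because each replica is trained independently, the same structure maps directly onto a cluster: each worker receives one bootstrap sample and returns one trained classifier, and only the final voting step needs the results gathered in one place.
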
Appears in Collections: Electronic and Computer Engineering
Dept of Electronic and Electrical Engineering Theses

Files in This Item:
File                 Description   Size      Format
FulltextThesis.pdf   -             2.05 MB   Adobe PDF


Items in BURA are protected by copyright, with all rights reserved, unless otherwise indicated.