Please use this identifier to cite or link to this item: http://bura.brunel.ac.uk/handle/2438/15389
Full metadata record
DC Field | Value | Language
dc.contributor.author | Huang, Z | -
dc.contributor.author | Li, M | -
dc.contributor.author | Chousidis, C | -
dc.contributor.author | Mousavi, A | -
dc.contributor.author | Jiang, C | -
dc.date.accessioned | 2017-11-07T15:33:59Z | -
dc.date.available | 2017-11-07T15:33:59Z | -
dc.date.issued | 2017-12-12 | -
dc.identifier.citation | Huang, Z., Li, M., Chousidis, C., Mousavi, A. and Jiang, C. (2018) 'Schema Theory-Based Data Engineering in Gene Expression Programming for Big Data Analytics,' IEEE Transactions on Evolutionary Computation, 22 (5), pp. 792 - 804. doi: 10.1109/TEVC.2017.2771445. | en_US
dc.identifier.issn | 1089-778X | -
dc.identifier.uri | https://bura.brunel.ac.uk/handle/2438/15389 | -
dc.description.abstract | Gene expression programming (GEP) is a data driven evolutionary technique that well suits for correlation mining. Parallel GEPs are proposed to speed up the evolution process using a cluster of computers or a computer with multiple CPU cores. However, the generation structure of chromosomes and the size of input data are two issues that tend to be neglected when speeding up GEP in evolution. To fill the research gap, this paper proposes three guiding principles to elaborate the computation nature of GEP in evolution based on an analysis of GEP schema theory. As a result, a novel data engineered GEP is developed which follows closely the generation structure of chromosomes in parallelization and considers the input data size in segmentation. Experimental results on two data sets with complementary features show that the data engineered GEP speeds up the evolution process significantly without loss of accuracy in data correlation mining. Based on the experimental tests, a computation model of the data engineered GEP is further developed to demonstrate its high scalability in dealing with potential big data using a large number of CPU cores. | en_US
dc.description.sponsorship | National Fundamental Research Program (973) of China; 10.13039/501100003399-Science and Technology Commission of Shanghai Municipality; European Union’s Horizon 2020 research and innovation programme; | en_US
dc.format.extent | 792 - 804 | -
dc.format.medium | Print-Electronic | -
dc.language.iso | en | en_US
dc.rights | This work is licensed under a Creative Commons Attribution 3.0 License. For more information, see https://creativecommons.org/licenses/by/3.0/ | -
dc.rights.uri | https://creativecommons.org/licenses/by/3.0/ | -
dc.subject | gene expression programming (GEP) | en_US
dc.subject | schema theory | en_US
dc.subject | data engineering | en_US
dc.subject | big data analytics | en_US
dc.subject | parallelization and segmentation | en_US
dc.title | Schema Theory-Based Data Engineering in Gene Expression Programming for Big Data Analytics | en_US
dc.type | Article | en_US
dc.identifier.doi | https://doi.org/10.1109/TEVC.2017.2771445 | -
dc.relation.isPartOf | IEEE Transactions on Evolutionary Computation | -
pubs.issue | 5 | -
pubs.publication-status | Published | -
pubs.volume | 22 | -
dc.identifier.eissn | 1941-0026 | -
Appears in Collections: Dept of Health Sciences Research Papers

Files in This Item:
File | Description | Size | Format
Fulltext.pdf |  | 2.27 MB | Adobe PDF


This item is licensed under a Creative Commons License.