Please use this identifier to cite or link to this item: http://bura.brunel.ac.uk/handle/2438/3007
Full metadata record
DC Field | Value | Language
dc.contributor.author | Tan, A C | -
dc.contributor.author | Gilbert, D | -
dc.contributor.author | Tuson, A | -
dc.coverage.spatial | 7 | en
dc.date.accessioned | 2009-01-30T17:24:08Z | -
dc.date.available | 2009-01-30T17:24:08Z | -
dc.date.issued | 2002 | -
dc.identifier.citation | Proceedings of International Conference on Bioinformatics, (INCOB 2002), Bangkok, Thailand 2002. | en
dc.identifier.uri | http://bura.brunel.ac.uk/handle/2438/3007 | -
dc.description.abstract | Flavin adenine dinucleotide (FAD) and its derivatives play a crucial role in biological processes. They are major organic cofactors and electron carriers in both enzymatic activities and biochemical pathways. We have analysed the relationships between sequence and structure of FAD-containing proteins using a machine learning approach. Decision trees were generated using the C4.5 algorithm as a means of automatically generating rules from biological databases (TOPS, CATH and PDB). These rules were then used as background knowledge for an ILP system to characterise the four different classes of FAD-family folds classified in Dym and Eisenberg (2001). These FAD-family folds are: glutathione reductase (GR), ferredoxin reductase (FR), p-cresol methylhydroxylase (PCMH) and pyruvate oxidase (PO). Each FAD-family was characterised by a set of rules. The “knowledge patterns” generated from this approach are a set of rules containing conserved sequence motifs, secondary structure sequence elements and folding information. Every rule was then verified using statistical evaluation on the measured significance of each rule. We show that this machine learning approach is capable of learning and discovering interesting patterns from large biological databases and can generate “knowledge patterns” that characterise the FAD-containing proteins, and at the same time classify these proteins into four different families. | en
dc.format.extent | 59208 bytes | -
dc.format.mimetype | application/pdf | -
dc.language.iso | en | -
dc.publisher | INCOB | en
dc.subject | flavin adenine dinucleotide (FAD); protein structure-sequence-function; machine learning; decision tree; inductive logic programming; knowledge discovery in biological databases. | en
dc.title | Characterisation of FAD-family folds using a machine learning approach | en
dc.type | Conference Paper | en
Appears in Collections: Computer Science
Dept of Computer Science Research Papers

Files in This Item:
File | Description | Size | Format
InCoB2002.pdf | | 57.82 kB | Adobe PDF


Items in BURA are protected by copyright, with all rights reserved, unless otherwise indicated.