dc.contributor.author Barrett, John
dc.contributor.author Raga, J. A.
dc.contributor.author Kostadinova, A.
dc.date.accessioned 2009-09-28T11:30:41Z
dc.date.available 2009-09-28T11:30:41Z
dc.date.issued 2005
dc.identifier.citation Barrett, J., Raga, J. A. & Kostadinova, A. 2005, 'Mining parasite data using genetic programming', Trends in Parasitology, vol. 21, no. 5, pp. 207-209. doi:10.1016/j.pt.2005.03.007 en
dc.identifier.issn 1469-8161
dc.identifier.other PURE: 119293
dc.identifier.other dspace: 2160/3097
dc.identifier.uri http://hdl.handle.net/2160/3097
dc.description Barrett, J., Kostadinova, A., Raga, J.A. (2005). Mining parasite data using genetic programming. Trends in Parasitology, 21, (5), 207-209 en
dc.description.abstract Genetic programming is a technique that can be used to tackle the hugely demanding data-processing problems encountered in the natural sciences. Application of genetic programming to a problem using parasites as biological tags demonstrates its potential for developing explanatory models using data that are both complex and noisy. In many areas of biology, the ability to collect data outstrips the ability to analyse them. Techniques are needed to mine large datasets and extract biologically meaningful relationships. Genetic programming (GP) is a stochastic optimization approach that helps to discover comprehensible rules for data mining. It is one of a group of supervised, evolutionary programming techniques that use Darwinian concepts to generate and optimize predictive mathematical models. This is done by mimicking ‘natural selection’ using ‘populations’ of mathematical models. Initially, a population of n models (short computer programs) is generated, each model representing a different, random combination of variables, constants and mathematical functions. The fitness of each model is determined (in terms of how well it solves the problem). The ‘best’ models are then selected for ‘breeding’ to produce the next generation of ‘fitter’ models, and so on until a model is evolved that solves the problem with the required degree of accuracy or until a specified stopping criterion is reached. During breeding, different parts of the models are recombined, and the mathematical functions and variables can be changed: the equivalent of crossover and mutation. Because GP is a randomized algorithm, it is not deterministic, and each new run with a dataset evolves an independent model. Therefore, several alternative solutions to a problem can be evolved. For complex problems for which there is no single answer, each run can result in a different best model, and a validation process must then be devised to select the most appropriate one. en
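The loop described in the abstract (random initial population, fitness evaluation, selection, breeding by crossover and mutation, stopping criterion) can be sketched as follows. This is a minimal illustration, not the method used in the article: it swaps the parasite-tag problem for a toy symbolic-regression task, and the target function, the function set, the population size, the selection scheme (top quarter plus one elite) and all names are assumptions made for the example.

```python
import operator
import random

# Assumed function set: symbol -> two-argument function.
FUNCTIONS = {"+": operator.add, "-": operator.sub, "*": operator.mul}
SYMBOLS = list(FUNCTIONS)
MAX_SIZE = 50  # assumed cap on tree size to keep models small

def random_model(rng, depth=3):
    """Grow a random expression tree over the variable x and small constants."""
    if depth == 0 or rng.random() < 0.3:
        return "x" if rng.random() < 0.5 else float(rng.randint(-2, 2))
    return (rng.choice(SYMBOLS), random_model(rng, depth - 1), random_model(rng, depth - 1))

def evaluate(model, x):
    """Evaluate a tree: 'x' is the variable, a float is a constant."""
    if isinstance(model, tuple):
        symbol, left, right = model
        return FUNCTIONS[symbol](evaluate(left, x), evaluate(right, x))
    return x if model == "x" else model

def error(model, data):
    """Fitness as summed squared error; lower is fitter."""
    return sum((evaluate(model, x) - y) ** 2 for x, y in data)

def subtrees(model, path=()):
    """Yield (path, subtree) for every node of the tree."""
    yield path, model
    if isinstance(model, tuple):
        yield from subtrees(model[1], path + (1,))
        yield from subtrees(model[2], path + (2,))

def size(model):
    return sum(1 for _ in subtrees(model))

def replace(model, path, new):
    """Return a copy of model with the node at path replaced by new."""
    if not path:
        return new
    parts = list(model)
    parts[path[0]] = replace(model[path[0]], path[1:], new)
    return tuple(parts)

def crossover(rng, a, b):
    """Recombination: graft a random subtree of b onto a random point of a."""
    path, _ = rng.choice(list(subtrees(a)))
    _, donor = rng.choice(list(subtrees(b)))
    return replace(a, path, donor)

def mutate(rng, model):
    """Mutation: replace a random subtree with a freshly grown one."""
    path, _ = rng.choice(list(subtrees(model)))
    return replace(model, path, random_model(rng, depth=2))

def evolve(data, rng, pop_size=100, generations=30, tolerance=1e-9):
    population = [random_model(rng) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=lambda m: error(m, data))
        if error(population[0], data) <= tolerance:
            break  # required accuracy reached
        parents = population[: pop_size // 4]  # the 'best' models breed
        children = [population[0]]  # keep the current best unchanged
        while len(children) < pop_size:
            child = crossover(rng, rng.choice(parents), rng.choice(parents))
            if rng.random() < 0.2:
                child = mutate(rng, child)
            # Oversized children are discarded in favour of a parent.
            children.append(child if size(child) <= MAX_SIZE else rng.choice(parents))
        population = children
    return min(population, key=lambda m: error(m, data))

# Hypothetical noise-free dataset sampled from y = x*x + x + 1.
data = [(i / 2, (i / 2) ** 2 + i / 2 + 1) for i in range(-4, 5)]
best = evolve(data, random.Random(0))
print(best, error(best, data))
```

Running the script with a different seed will generally evolve a different best model, which is the non-determinism the abstract refers to; a real application would also hold out validation data to choose among runs.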
dc.format.extent 3 en
dc.language.iso eng
dc.relation.ispartof Trends in Parasitology en
dc.title Mining parasite data using genetic programming en
dc.type Text en
dc.type.publicationtype Article (Journal) en
dc.identifier.doi http://dx.doi.org/10.1016/j.pt.2005.03.007
dc.contributor.institution Institute of Biological, Environmental and Rural Sciences en
dc.description.status Peer reviewed en