Trevor I Dix

Papers by Trevor I Dix

A decision graph explanation of protein secondary structure prediction

[1993] Proceedings of the Twenty-sixth Hawaii International Conference on System Sciences

The machine-learning technique of decision graphs, a generalization of decision trees, i...

Exploring long DNA sequences by information content

Comparative analysis of long DNA sequences by per element information content using different contexts

Copyright information: Taken from "Comparative analysis of long DNA sequences by per element information content using different contexts", http://www.biomedcentral.com/1471-2105/8/S2/S10. BMC Bioinformatics 2007;8(Suppl 2):S10. Published online 3 May 2007. PMCID: PMC1892068.

Modelling Is More Versatile than Shuffling

End Fragment Constraints in Stochastic Assembly of Contig Restriction Maps

This paper considers the importance of end-fragment constraints in the construction of contig restriction maps. A representation for such maps is used in conjunction with an objective function based on minimum message length (MML) principles and two stochastic optimization methods. Results from the optimization of real and simulated data sets with and without end-fragment constraints are given, and it is shown that better scores can be obtained if end-fragment constraints are violated. The effectiveness of the MML objective function is illustrated by its ability to balance a number of conflicting constraints.

Database integration and querying in the bioinformatics domain

Given the exponential growth in the amount of genetic data being produced, it is more important than ever for researchers to have effective tools to help them manage this data. This paper describes a system that enables users, generally biologists, to construct components to answer specific questions in their field. The system allows the creation of modules and submodules via top-down decomposition. Concepts and terms can be defined through conversation. These are then used when composing base-level functions to produce code for modules and for interfacing modules.

A Restriction Mapping Engine Using Constraint Logic Programming

Proceedings. International Conference on Intelligent Systems for Molecular Biology, 1994

Restriction mapping generally requires the application of information from various digestions by restriction enzymes to find solution sets. We use both the predicate calculus and constraint solving capabilities of CLP(R) to develop an engine for restriction mapping. Many of the techniques employed by biologists to manually find solutions are supported by the engine in a consistent manner. We provide generalized pipeline and cross-multiply operators for combining sub-maps. Our approach encourages the building of maps iteratively. We show how other techniques can be readily incorporated.

Constructing a generic C++ class for customised C pointers

A probabilistic prediction-based replanning architecture for a UAV

IFAC Proceedings Volumes

Polyome: A Learning System for Extracting Bioinformatic Data

Bioinformatics & Computational Biology, 2006

The exponential growth in the quantity of publicly available genetic data and the proliferation of bioinformatic databases mean that scientists need computerized tools more than ever. Existing approaches to the problem all suffer from one or more basic problems. This paper describes Polyome, the core of a system for the integration and querying of data sources, designed to overcome ...

A Distributed Memory Multiprocessor

Proceedings of the 1994 International Conference on Parallel and Distributed Systems, Dec 19, 1994

Discovering simple DNA sequences by compression

Pacific Symposium on Biocomputing, Feb 1, 1998

End Fragment Constraints in Stochastic Assembly of Contig Restriction Maps

A platform for restriction mapping

The Journal of Logic Programming, 1997

... Yap [16] demonstrated the use of CLP(R) for experimental data from two single digests and one double digest ...

Errors between sites in restriction site mapping

Bioinformatics, 1988

Restriction site mapping programs construct maps by generating permutations of fragments and checking for consistency. Unfortunately, many consistent maps are often obtained within the experimental error bounds, even though there is only one actual map. A particularly efficient algorithm is presented that aims to minimize error bounds between restriction sites. The method is generalized for linear and circular maps. The time complexity is derived and execution times are given for multiple enzymes and a range of error bounds.

Advanced Query Mechanisms for Biological Databases, Agarwal, Pankaj, Allison, L., Aloy, Patrick,

aaai.org

Advanced Query Mechanisms for Biological Databases,  Agarwal, Pankaj,  Allison, L.,  Aloy, Pa... more Advanced Query Mechanisms for Biological Databases,  Agarwal, Pankaj,  Allison, L.,  Aloy, Patrick,  Altman, Russ B.,  Amir, Amihood,  Ananko, EA,  Atchley, William R.,  Atteson, Kevin,  Automated Clustering and Assembly of Large EST Collections,  Aviles, Francesc X.,  ... Babenko, VN,  Bafna, Vineet,  Baker, Patricia G.,  Baldi, Pierre,  Bayesian Protein Family Classifier,  Bechhofer, Sean,  BioSim—A New Qualitative Simulation Environment for Molecular Biology,  Brass, Andy,  Brunak, ...

Exceptions and interrupts in CSP

Science of Computer Programming, 1983

A Bit-String Longest-Common-Subsequence Algorithm

Information Processing Letters, Nov 24, 1986

Circular Clustering of Protein Dihedral Angles by Minimum Message Length

Pacific Symposium on Biocomputing, Feb 1, 1996

Robust Estimation of Evolutionary Distances with Information Theory

Molecular Biology and Evolution, 2016

Methods for measuring genetic distances in phylogenetics are known to be sensitive to the evolutionary model assumed. However, there is a lack of established methodology to accommodate the trade-off between incorporating sufficient biological reality and avoiding model overfitting. In addition, as traditional methods measure distances based on the observed number of substitutions, they tend to underestimate distances between diverged sequences due to backward and parallel substitutions. Various techniques were proposed to correct this, but they lack robustness against sequences that are distantly related and of unequal base frequencies. In this article, we present a novel genetic distance estimate based on information theory that overcomes the above two hurdles. Instead of examining the observed number of substitutions, this method estimates genetic distances using Shannon's mutual information. This naturally provides an effective framework for balancing model complexity and goodness of fit. Our distance estimate is shown to be approximately linear to elapsed time and hence is less sensitive to the divergence of sequence data and compositionally biased sequences. Using extensive simulation data, we show that our method 1) consistently reconstructs more accurate phylogeny topologies than existing methods, 2) is robust in extreme conditions such as diverged phylogenies, unequal base frequencies data, and heterogeneous mutation patterns, and 3) scales well with large phylogenies.