Frontiers in Bioinformatics: Unsolved Problems and Challenges
Organized by Samuel Karlin, David Eisenberg and Russ Altman
Beckman Center of the National Academies, Irvine, CA
October 15-17, 2004
Final Program and Presentations
Colloquium perspective and paper are available free for download on the PNAS website.
Meeting Overview
The Sackler Colloquium Frontiers in Bioinformatics, held October 16 and 17, 2004, provided a forum for discussing concepts and methods emerging in bioinformatics alongside recent advances in theory and experiment across the biological and medical sciences. The deluge of genome data in the last two decades has driven the emergence of bioinformatics as an important discipline. The first wave of genome sequence data created a demand for tools for the search, comparison, and analysis of nucleic acid and protein sequences and of macromolecular structures. The second wave of expression data has similarly created a demand for tools that allow the data to be understood and reduced. Future waves promise to bring innovations in proteomics research, including protein structure, interactions, compartmentalization, and turnover. In addition, experimental biologists are likely to create other new technologies that will further enable high-throughput collection of useful biological data. These sources of cellular data will also be correlated with higher levels of phenotypic data, based on observations of the nature of cells, organs, and organisms.
Our understanding of basic biology will be advanced by comparing organisms at different evolutionary distances in order to reconstruct both the tree of life and the emergence of important phenotypic traits. There is also a growing expectation that bioinformatics will help fuel the creation of computational models, both qualitative and quantitative, that allow us to capture, store, and maintain biological knowledge in a form that helps explain experimental observations.
Algorithmic research in bioinformatics spans all aspects of computational biology. The emphasis is on discrete algorithms that address important problems in molecular biology, genomics, and proteomics, that are computationally efficient, that have been implemented and tested in simulations and on real datasets, and that provide new biological results and insights. Exact and approximate algorithms have been developed for genomics, sequence analysis, gene and signal recognition, alignment, molecular evolution, phylogenetics, structure determination and prediction, gene expression and gene networks, proteomics, functional genomics, and drug design. In particular, bioinformatics tools include BLAST (homology searching), GENSCAN and GENIE (gene finding), SAPS (statistical analysis of protein sequences), CLUSTAL and ITERALIGN (multiple sequence alignment), and r-scan statistics (target array clustering and overdispersion). These programs are used by thousands of researchers every day in molecular biology and medicine. BLAST currently serves more than 100,000 queries per day at the National Center for Biotechnology Information (NCBI) in Bethesda, Maryland.
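To give a concrete sense of the kind of discrete algorithm that underlies sequence alignment, the following is a minimal sketch of a global pairwise alignment score computed by dynamic programming (the Needleman-Wunsch recurrence with a linear gap penalty). It is an illustration only and not the method used by BLAST, CLUSTAL, or ITERALIGN, which rely on heuristics and more elaborate scoring. The function name and the scoring values (match, mismatch, gap) are assumptions chosen for the example.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Best global alignment score of strings a and b.

    Uses the Needleman-Wunsch recurrence with a linear gap penalty.
    The scoring values are illustrative assumptions, not a standard matrix.
    """
    # prev[j] is the best score for aligning the processed prefix of a with b[:j].
    # Before any character of a is used, b[:j] can only be aligned to j gaps.
    prev = [j * gap for j in range(len(b) + 1)]

    for i in range(1, len(a) + 1):
        # Aligning a[:i] with the empty prefix of b costs i gaps.
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            # Align a[i-1] with b[j-1] (match or mismatch).
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Align a[i-1] with a gap in b.
            up = prev[j] + gap
            # Align b[j-1] with a gap in a.
            left = curr[j - 1] + gap
            curr.append(max(diag, up, left))
        prev = curr

    return prev[-1]


if __name__ == "__main__":
    print(global_alignment_score("ACGT", "AGT"))
```

The sketch keeps only two rows of the scoring table, so it uses memory proportional to the shorter dimension but returns the score alone; recovering the alignment itself would require keeping the full table or traceback pointers.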