Title: Next steps in DNA sequence analysis

Slide set [PowerPoint]

Abstract: DNA sequence analysis is often considered to be a mostly solved problem. However, the four DNA bases contain much information that is not currently accessible to biologists. One of the most prominent outstanding problems in biological sequence data analysis is deciphering intergenic, or regulatory, DNA sequence. We have developed an application, degsuite, that uses conventional statistics together with optimized algorithms to discover regulatory DNA motifs by comparing sequences from differentially regulated genes. We are using the motifs detected by this method to develop a machine learning solution for analyzing and predicting plant gene expression from regulatory sequence. In parallel, we are exploring the use of nanotechnology-based sequencing for the analysis of large, unsequenced crop genomes. We have sequenced 7% of the soybean genome and 90% of the soybean cyst nematode genome using nanowell sequencing. The informatics challenges raised by the nanowell sequencing method will be discussed.

Bio: Matthew Hudson has an MA from Cambridge University and a PhD from Leicester University. He was a post-doctoral fellow at the University of California, Berkeley from 1998 to 2002, and a Bioinformatics Scientist at the Torrey Mesa Research Institute (later Diversa Corporation) from 2002 to 2004, where he managed genomics, database and Java development projects, before joining the Crop Sciences department at UIUC as part of the IGB initiative. His interests include developing and applying computational methods to understand the large and complex genomes of plants, particularly crops.