Title: GeneWiz browser: An Interactive Tool for Visualizing Sequenced Chromosomes
By Peter F. Hallin, Hans-Henrik Stærfeldt, Eva Rotenberg, Tim T. Binnewies, Craig J. Benham, and David W. Ussery
Published in Standards in Genomic Sciences (2009) 1: 204-215
Citation count: 35
Background
• After more than 15 years of genome sequencing, public genome databases hold more than a thousand sequenced genomes.
• It is useful for biologists to analyze multiple genomes across different species for a broad range of interests, especially to:
o identify phylogenetic relationships and the genomic regions responsible for pathogenicity in humans and animals
o find new target genes of industrial and economic value
• Analytical tools for this purpose are limited and often require both analytical and programming knowledge, so the analysis of multiple genomes is not always easy across the broad range of biological research.
Function of the GeneWiz browser
• The GeneWiz browser visualizes genomic data of prokaryotic chromosomes.
• This tool provides several functions:
o visualizing whole-genome homology of genes and proteins in a reference strain compared to other strains or species
o visualizing DNA physical properties, such as curvature, along the chromosome
o identifying repeat sequences along the chromosome
o additionally, plotting custom numerical data such as gene expression and regulation data
• This web-interface service provides an interoperable method for whole-genome visualization
Implementation of GeneWiz browser
• The visualization converts numerical information into color-encoded lanes, using either a linear scale with a fixed minimum and maximum, or a dynamic scale based on standard deviations. Lanes can show:
o DNA properties, computed with various established methods, that indicate regions likely to have particular biological functions
o Mapping of homologous genes by BLAST (Basic Local Alignment Search Tool)
o Mapping of short sequencing reads with weighted coverage
o Custom lanes with pre-processed data provided by users
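The two scaling modes can be illustrated with a minimal Python sketch. It is not the GeneWiz implementation: the function names, the blue-to-red gradient, and the default of three standard deviations are all assumptions made for illustration.

```python
from statistics import mean, pstdev

def linear_scale(values, vmin, vmax):
    """Map values to [0, 1] with a fixed minimum and maximum, clamping outliers."""
    if vmax == vmin:
        raise ValueError("vmin and vmax must differ")
    span = vmax - vmin
    return [min(1.0, max(0.0, (v - vmin) / span)) for v in values]

def stddev_scale(values, n_sd=3.0):
    """Map values to [0, 1] over mean +/- n_sd standard deviations (n_sd is assumed)."""
    mu = mean(values)
    sd = pstdev(values)
    if sd == 0:
        # All values identical: place everything at the middle of the scale.
        return [0.5] * len(values)
    return linear_scale(values, mu - n_sd * sd, mu + n_sd * sd)

def to_rgb(t, low=(0, 0, 255), high=(255, 0, 0)):
    """Interpolate between two colors; the blue-to-red gradient is hypothetical."""
    return tuple(round(a + (b - a) * t) for a, b in zip(low, high))

def encode_lane(values, mode="linear", vmin=0.0, vmax=1.0, n_sd=3.0):
    """Turn one lane of numerical data into a list of RGB colors."""
    if mode == "linear":
        scaled = linear_scale(values, vmin, vmax)
    elif mode == "stddev":
        scaled = stddev_scale(values, n_sd)
    else:
        raise ValueError("mode must be 'linear' or 'stddev'")
    return [to_rgb(t) for t in scaled]
```

With the fixed scale, values outside the chosen range saturate at the end colors. The dynamic scale adapts to each data set, so lanes with very different units can share one gradient.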
Workflow of GeneWiz browser
• This web interface includes two parts:
1. the client, written as a Java applet, which obtains the data remotely from the server
2. the server, written in Perl/CGI, with a compiled C program handling access to the binary data files
• All input/output objects are defined in a separate XSD (XML Schema Definition) file within the WSDL file
• MySQL on the server stores the pre-binned data for each zoom level
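Pre-binning for each zoom level can be pictured with the short Python sketch below. It assumes each bin stores the mean of the positions it covers and uses hypothetical bin sizes; the slides do not state which aggregation or bin sizes GeneWiz uses, and the MySQL storage step is omitted.

```python
def bin_means(values, bin_size):
    """Average consecutive values in windows of bin_size (last bin may be shorter)."""
    if bin_size < 1:
        raise ValueError("bin_size must be at least 1")
    chunks = (values[i:i + bin_size] for i in range(0, len(values), bin_size))
    return [sum(chunk) / len(chunk) for chunk in chunks]

def prebin(values, bin_sizes=(1, 2, 4)):
    """Precompute one binned lane per zoom level, keyed by bin size (sizes are assumed)."""
    return {size: bin_means(values, size) for size in bin_sizes}

# Example: a tiny per-position lane reduced to three zoom levels.
levels = prebin([1, 2, 3, 4], bin_sizes=(1, 2, 4))
```

Precomputing the bins means a zoomed-out view only has to fetch a few values per pixel, instead of aggregating the full per-base data on every request.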
• The maximum uniqueness quality is shown for the actual reads (green-to-blue lane), plotted along the reference genome.
• The figure shows good correspondence between the in-silico and experimental reads, suggesting little bias towards particular chromosomal regions when read coverage is about 40-fold.
• BLAST comparison of 14 closely related bacterial chromosomes.
• The figure indicates that pathogenicity islands are preferentially deleted in the few bacteria that do not cause infection in humans.
• A final example illustrates how marks highlighting distinctive DNA physical properties can be combined with known regulatory elements and gene annotations to give a more complete picture of gene expression in a particular region.
Summary
• Most biologists consider visualization of multidimensional genomic information necessary, but find analytical tools relatively difficult to use.
• Unlike the many existing tools, which are command-line programs that generate publication-quality static images and vector graphics, the GeneWiz browser offers:
o easy navigation with the mouse
o a zooming function that lets users interpret genomic information at varying scales
o an automatic workflow that users can call directly via the client part
• This tool can be relevant in many pangenomic (cross-sequenced-species) as well as metagenomic (cross-unsequenced-species) studies, by giving a quick overview of clusters of insertion sites, genomic islands, and overall homology between a reference sequence and a data set.