bio linux

LIVE BIOTOOLS Through “VIGYAAN”, Palani Kannan. K, AU-KBC Research Centre

Upload: guest294984

Post on 24-May-2015


TRANSCRIPT

Page 1: Bio Linux

LIVE BIOTOOLS Through “VIGYAAN”

Palani Kannan. K

AU-KBC Research Centre

Page 2: Bio Linux

Bio Linux

• It is a Linux system customized for bioinformatics analysis and development work.

• Most such systems are open source.

Page 3: Bio Linux

Classification based on Usage

Package Repositories

A stored collection of installable packages for bioinformatics applications. It is available as a whole operating system, built on a Linux environment, with bioinformatics applications and tools included.

Eg.

Page 4: Bio Linux

Classification based on Usage

Live DVD or CD

It is a temporarily mountable operating environment for bioinformatics applications. The tools and applications are mounted along with the operating environment.

Eg.

Page 5: Bio Linux

Need of Bio Linux

• Provision of complete systems

• Provision of bioinformatics software repositories

• Addition of bioinformatics packages to standard distributions

• Live DVD/CDs with bioinformatics software added

• Community building and support systems

Page 6: Bio Linux

LIVE Vigyaan

• Live environment

• Based on a Linux environment (Knoppix)

• Tools and applications for computational biology and computational chemistry

• Provides the tools required to compile and install other applications

Page 7: Bio Linux

Vigyaan Demo…

• Artemis, Genome viewer: It loads a genome for viewing. In this demo a local copy of the genome is loaded; however, Artemis can also load sequences from the Internet.

• ClustalX, Sequence alignment: It calls clustalx with a filename as argument. The file is PPIases, which contains the sequences of 10 proteins in FASTA format. To perform the sequence alignment, select from the top menu: Alignment -> Do Complete Alignment. In the pop-up box, click ALIGN.

• Ghemical, MD (small molecule): This demo loads sample.mm1gp. To add hydrogens, right-click on the black area and select from the pop-up menu: Build -> Hydrogens -> Add. To optimize the structure, select from the pop-up menu: Compute -> Geometry Optimization, then click OK. To perform molecular dynamics, select: Compute -> Molecular Dynamics.

Page 8: Bio Linux

Vigyaan Demo…

• GROMACS, MD (peptide in water): This demo prepares and performs molecular dynamics for a peptide in water.

• Jmol, Quantum chemistry calculation visualization: This demo loads an output file from a GAMESS-US run (Hessian calculation for a proton between 2 water molecules). Use Extras -> Vibrate to view the different normal modes.

• Open Babel, Structure file format conversion: This demo converts a protein structure from PDB format to XYZ format using the babel command.
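To make the PDB-to-XYZ conversion in the Open Babel demo concrete, here is a minimal Python sketch of what such a conversion involves. It is not Open Babel's implementation and covers only the simplest case: it reads the fixed-column ATOM/HETATM records of a PDB file and writes the atom count, a comment line, and one "element x y z" line per atom. The helper `format_pdb_atom` and the two sample atoms are hypothetical and exist only to build a small demo input.

```python
def format_pdb_atom(serial, name, resname, chain, resseq, x, y, z, element):
    """Build one fixed-column PDB ATOM line (hypothetical helper, demo input only)."""
    head = "{:<6}{:>5} {:<4} {:>3} {:1}{:>4}".format(
        "ATOM", serial, name, resname, chain, resseq)           # columns 1-26
    coords = "{:8.3f}{:8.3f}{:8.3f}".format(x, y, z)            # columns 31-54
    tail = "{:6.2f}{:6.2f}".format(1.0, 0.0) + " " * 10 + "{:>2}".format(element)
    return head + " " * 4 + coords + tail


def pdb_to_xyz(pdb_text, comment="converted from PDB"):
    """Convert ATOM/HETATM records to XYZ text (coordinates stay in angstrom)."""
    atoms = []
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM", "HETATM")):
            x = float(line[30:38])
            y = float(line[38:46])
            z = float(line[46:54])
            # Element symbol sits in columns 77-78; fall back to the atom name.
            element = line[76:78].strip() or line[12:16].strip()[:1]
            atoms.append((element, x, y, z))
    lines = [str(len(atoms)), comment]
    lines += ["{:<2} {:10.5f} {:10.5f} {:10.5f}".format(*atom) for atom in atoms]
    return "\n".join(lines) + "\n"


# Two made-up atoms, used only to exercise the converter.
sample_pdb = "\n".join([
    format_pdb_atom(1, " N", "ALA", "A", 1, 11.104, 6.134, -6.504, "N"),
    format_pdb_atom(2, " CA", "ALA", "A", 1, 11.639, 6.071, -5.147, "C"),
])
xyz_text = pdb_to_xyz(sample_pdb)
print(xyz_text)
```

A real converter also has to handle multiple models, alternate locations and missing element columns, which is why a dedicated tool is used in the demo.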

Page 9: Bio Linux

Vigyaan Demo…

• NJPlot, Phylogenetic tree creation: This demo creates a phylogenetic tree based on a sample input file.

• PSI3, Quantum chemistry calculation: This demo runs a quantum chemistry calculation using the PSI3 package. The calculation performed is an SCF optimization of CH4 with the DZP basis set.

• PyMOL, Structure visualization and high-quality image rendering: This demo loads a sample project with a protein and 3 conformations of a peptide in the active site. To change the display, use the options from the S (show), H (hide), L (label) and color menus. To render a high-quality image, use the ray option in the main menu. To save the image, select from the main menu: File -> Save Image.

Page 10: Bio Linux

Vigyaan Demo…

• Rasmol, Structure visualization: This demo loads a protein for visualization. You can use the mouse to rotate the molecule. Display and colors can be changed from the top menu.

• Raster3D, High-quality image rendering (for biomolecules): This demo creates a high-quality image of a protein active site.

• TINKER, Molecular modeling: This demo creates a TINKER file from a PDB file for a protein, performs a single-point energy calculation, and computes the molecular volume and surface area.

Page 11: Bio Linux
Page 12: Bio Linux

GROMACS

Groningen Machine for Chemical Simulations

Page 13: Bio Linux

Advantages

- Though designed primarily for biochemical molecules with many complicated bonded interactions, GROMACS is extremely fast at calculating the non-bonded interactions that usually dominate simulations;

- GROMACS provides extremely high performance compared to other programs;

- GROMACS comes with a large selection of flexible tools for trajectory analysis;

- GROMACS is user-friendly, with topologies and parameter files written in clear text format.

Page 14: Bio Linux

Molecular dynamics (MD): a computer simulation technique in which the time evolution of a set of interacting atoms is followed by integrating their equations of motion.

In molecular dynamics we follow the laws of classical mechanics, most notably Newton's second law: F = ma. Here, m is the mass of the atom, a its acceleration, and F the force acting upon it due to its interactions with the other atoms.
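As a rough illustration of "integrating the equations of motion", the sketch below advances a single particle in one dimension under F = ma. The velocity Verlet scheme and the harmonic spring force are choices made here for simplicity; they are not taken from the slides, and GROMACS itself uses a leap-frog integrator by default. All names and values are illustrative.

```python
def velocity_verlet(x, v, mass, force, dt, n_steps):
    """Integrate F = m*a for one particle in 1-D using velocity Verlet."""
    a = force(x) / mass                      # a = F / m at the start
    trajectory = [(x, v)]
    for _ in range(n_steps):
        x = x + v * dt + 0.5 * a * dt * dt   # update position
        a_new = force(x) / mass              # force at the new position
        v = v + 0.5 * (a + a_new) * dt       # update velocity with the mean acceleration
        a = a_new
        trajectory.append((x, v))
    return trajectory


def spring_force(x, k=1.0):
    """Illustrative force: harmonic spring, F = -k*x."""
    return -k * x


def total_energy(x, v, mass=1.0, k=1.0):
    """Kinetic plus potential energy of the spring system."""
    return 0.5 * mass * v * v + 0.5 * k * x * x


traj = velocity_verlet(x=1.0, v=0.0, mass=1.0, force=spring_force,
                       dt=0.01, n_steps=1000)
x_end, v_end = traj[-1]
print("final position:", x_end, "final energy:", total_energy(x_end, v_end))
```

In a real MD code the same update is applied to every atom in three dimensions, and the force comes from the bonded and non-bonded terms of the force field rather than a single spring.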

Page 15: Bio Linux

Input Files

pdb: Protein Data Bank format

gro: Gromacs format (atom coordinates)

Information in the columns, from left to right: residue number, residue name, atom name, atom number; x, y, and z position, in nm; x, y, and z velocity, in nm/ps

itp: atom topologies (charges, mass, radii, etc.)

top: topology file; contains all the force field parameters, number of molecules, water, etc.
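The column layout of a .gro atom line can be read with a few string slices. The sketch below assumes the usual fixed-width layout (5 characters each for residue number, residue name, atom name and atom number, then 8 characters per position and per velocity component); check it against the file format documentation of the installed GROMACS version. The `GroAtom` type and the sample water atom are illustrative names and values, not part of GROMACS.

```python
from typing import NamedTuple, Optional, Tuple

Vector = Tuple[float, float, float]


class GroAtom(NamedTuple):
    resnr: int
    resname: str
    atomname: str
    atomnr: int
    position: Vector             # nm
    velocity: Optional[Vector]   # nm/ps, None when the file has no velocities


def parse_gro_atom(line: str) -> GroAtom:
    """Parse one atom line of a .gro file, assuming fixed-width columns."""
    position = tuple(float(line[20 + 8 * i:28 + 8 * i]) for i in range(3))
    velocity = None
    if len(line.rstrip()) >= 68:                 # velocity columns are optional
        velocity = tuple(float(line[44 + 8 * i:52 + 8 * i]) for i in range(3))
    return GroAtom(
        resnr=int(line[0:5]),
        resname=line[5:10].strip(),
        atomname=line[10:15].strip(),
        atomnr=int(line[15:20]),
        position=position,
        velocity=velocity,
    )


# Build a sample line with the same column widths (values are made up).
sample_line = "%5d%-5s%5s%5d%8.3f%8.3f%8.3f%8.4f%8.4f%8.4f" % (
    1, "WATER", "OW1", 1, 0.126, 1.624, 1.679, 0.1227, -0.0580, 0.0434)
atom = parse_gro_atom(sample_line)
print(atom)
```

Slicing by column rather than splitting on whitespace matters here, because adjacent fields can touch when residue or atom numbers fill their full width.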

Page 16: Bio Linux

Creating Input Files

Converting pdb to gro:

pdb2gmx

- reads in a pdb file and allows the user to choose a force field

- reads some database files to make special bonds (e.g. Cys-Cys)

- adds hydrogens to the protein

- generates a coordinate file in Gromacs (Gromos) format (*.gro) and a topology file in Gromacs format (*.top)

editconf

- converts gromacs files (*.gro) back to pdb files (*.pdb)

- allows the user to set up the box: define the type of box (e.g. cubic, triclinic, octahedron), set the dimensions of the box edges relative to the molecule (-d 0.7 will set the box edges 0.7 nm from the molecule), and center the molecule in the box
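The two preparation steps above could be scripted roughly as follows. This sketch only builds the command lines and prints them; it does not run GROMACS. The option names (-f, -o, -p, -bt, -d, -c) follow the classic stand-alone tools of this era as best understood here and should be confirmed with `pdb2gmx -h` and `editconf -h` on the installed version. The file names are hypothetical.

```python
from typing import List


def pdb2gmx_command(pdb_file: str, gro_file: str, top_file: str) -> List[str]:
    """Command line to turn a PDB file into .gro coordinates and a .top topology."""
    return ["pdb2gmx", "-f", pdb_file, "-o", gro_file, "-p", top_file]


def editconf_command(gro_in: str, gro_out: str, box_type: str = "cubic",
                     distance_nm: float = 0.7, center: bool = True) -> List[str]:
    """Command line to place the molecule in a box with edges distance_nm away."""
    cmd = ["editconf", "-f", gro_in, "-o", gro_out,
           "-bt", box_type, "-d", str(distance_nm)]
    if center:
        cmd.append("-c")   # center the molecule in the box
    return cmd


# Hypothetical file names, for illustration only.
steps = [
    pdb2gmx_command("protein.pdb", "protein.gro", "protein.top"),
    editconf_command("protein.gro", "boxed.gro"),
]
for cmd in steps:
    print(" ".join(cmd))
```

Note that pdb2gmx asks for the force field interactively unless one is given on the command line, so a fully unattended script needs that choice supplied as well.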

Page 17: Bio Linux

Input Files

mdp: molecular dynamics simulation parameters; allows the user to set up specific parameters for all the calculations that Gromacs performs.

tpr: contains the starting structure of the simulation, the molecular topology and all the simulation parameters (all of the above); binary format.

em.mdp file: sets the parameters for running energy minimizations; allows you to specify the integrator (steepest descent or conjugate gradients), the number of iterations, the frequency of neighbor list updates, constraints, etc.

md.mdp file: sets the parameters for running the molecular dynamics program.
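An em.mdp file is a plain-text list of "name = value" settings. The sketch below renders one from a Python dictionary. The parameter names (integrator, nsteps, nstlist, constraints, emtol, emstep) are standard mdp options, but every value shown is an illustrative placeholder rather than a recommended setting, and `render_mdp` is a hypothetical helper.

```python
def render_mdp(params: dict) -> str:
    """Render parameters as aligned 'name = value' lines of an .mdp file."""
    width = max(len(name) for name in params)
    return "".join(f"{name:<{width}} = {value}\n" for name, value in params.items())


# Illustrative energy-minimization settings; tune them for a real system.
em_params = {
    "integrator": "steep",    # steepest descent ("cg" selects conjugate gradients)
    "nsteps": 500,            # maximum number of iterations
    "nstlist": 10,            # neighbor list update frequency, in steps
    "constraints": "none",    # no bond constraints during minimization
    "emtol": 1000.0,          # stop when the maximum force drops below this
    "emstep": 0.01,           # initial step size, in nm
}

em_text = render_mdp(em_params)
print(em_text)
```

Writing `em_text` to a file named em.mdp gives the parameter file that grompp combines with the coordinates and topology.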

Page 18: Bio Linux

Solvation of the System

genbox

- solvates the box based on the dimensions specified using editconf

- solvates the given protein in the specified solvent (by default SPC, Simple Point Charge, water)

- removes water molecules from the box if the distance between any atom of the solute and the solvent is less than the sum of the van der Waals radii of both atoms (the radii are read from the database vdwradii.dat)
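The overlap rule used when removing water molecules can be sketched in a few lines. This is a naive illustration of the criterion only, not genbox's actual algorithm. The radii table holds placeholder values and is not copied from vdwradii.dat; atoms are represented as hypothetical (element, (x, y, z)) pairs with coordinates in nm.

```python
import math

# Placeholder van der Waals radii in nm (illustrative, NOT from vdwradii.dat).
RADII = {"C": 0.17, "O": 0.15, "H": 0.10}


def overlaps(atom_a, atom_b, radii=RADII, default_radius=0.12):
    """True if two atoms are closer than the sum of their van der Waals radii."""
    (elem_a, pos_a), (elem_b, pos_b) = atom_a, atom_b
    limit = radii.get(elem_a, default_radius) + radii.get(elem_b, default_radius)
    return math.dist(pos_a, pos_b) < limit


def remove_overlapping_waters(solute, waters, radii=RADII):
    """Keep only water molecules with no atom overlapping any solute atom."""
    return [
        water for water in waters
        if not any(overlaps(s, w, radii) for s in solute for w in water)
    ]


solute = [("C", (0.0, 0.0, 0.0))]
waters = [
    [("O", (0.20, 0.0, 0.0)), ("H", (0.28, 0.0, 0.0)), ("H", (0.20, 0.08, 0.0))],
    [("O", (1.00, 0.0, 0.0)), ("H", (1.08, 0.0, 0.0)), ("H", (1.00, 0.08, 0.0))],
]
kept = remove_overlapping_waters(solute, waters)
print(len(kept), "of", len(waters), "water molecules kept")
```

The all-pairs loop is fine for a demonstration; production tools use spatial grids so that only nearby atoms are compared.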

Electroneutrality:

Check the total charge. If the net charge is not zero, add counter ions to get a neutral system: select random water molecules and replace each with an ion, e.g. H2O -> Cl- (remove H1 and H2, and rename O as Cl-), or use genion, which does the same thing.
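The charge check itself is simple arithmetic: sum the partial charges, and the rounded total gives the number of counter ions to add. The sketch below assumes single-charge ions. The labels "CL" and "NA" are illustrative; the slides only mention Cl-, so using a sodium ion for a negatively charged system is an assumption made here.

```python
def counter_ions_needed(charges, tolerance=0.01):
    """Return (ion_label, count) that neutralizes the summed partial charges."""
    total = sum(charges)
    net = round(total)
    if abs(total - net) > tolerance:
        raise ValueError(f"net charge {total:.3f} is not close to an integer")
    if net > 0:
        return ("CL", net)      # negative ions cancel a positive net charge
    if net < 0:
        return ("NA", -net)     # assumed positive ion for a negative net charge
    return (None, 0)            # already neutral


# Made-up partial charges summing to +2.
ion, count = counter_ions_needed([0.5, 0.5, 1.0])
print("replace", count, "water molecules with", ion)
```

The non-integer check catches a common symptom of a broken topology, where the partial charges of a molecule do not add up to a whole number.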

Page 19: Bio Linux

grompp

- reads a molecular topology file (*.top) and checks the validity of the file

- expands the topology from a molecular description to an atomic description (*.tpr)

- reads the parameter file (*.mdp), the coordinate file (*.gro) and the topology file (*.top)

- outputs a *.tpr file for input into the MD program mdrun

- since *.tpr is a binary file, it cannot be read with ‘more’, but it may be read using gmxdump, which prints out the input file in readable format (it also prints out the contents of a *.trr file)

mdrun

- performs the Molecular Dynamics simulation

- can also perform Brownian Dynamics, Langevin Dynamics, and Conjugate Gradient or Steepest Descents energy minimization

- reads the *.tpr file, creates neighbor lists from that file and calculates the forces

- globally sums up the forces and updates the positions and velocities

- outputs at least three types of files:

(1) trajectory file (*.trr): contains coordinates, velocities, and forces

(2) structure file (*.gro): contains coordinates and velocities of the last step

(3) energy file (*.edr): contains energies, temperatures, pressures

Page 20: Bio Linux

gmxcheck: reads a trajectory (*.trr) or an energy file (*.edr) and prints out useful information about it.

g_energy: extracts energy components or distance restraint data from an energy file into a *.xvg file (may be read using Xmgr or Grace).

trjconv: allows compression of a trajectory file into a *.xtc file that can be analyzed using ngmx.

ngmx:

- Gromacs trajectory viewer

- plots a 3-D structure of the molecule

- allows rotation, scaling, translation, labels on atoms, animation of trajectories, etc.
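Besides Xmgr or Grace, the *.xvg files written by g_energy can be read with a short script, since they are plain text. The sketch below assumes the usual Grace-style layout: lines starting with "#" are comments, lines starting with "@" are plot directives (including series legends), and the remaining lines are whitespace-separated numeric columns. The sample content and its numbers are made up for illustration.

```python
import re

LEGEND = re.compile(r'@\s*s\d+\s+legend\s+"(.*)"')


def parse_xvg(text):
    """Return (legends, rows) from the text of an .xvg file."""
    legends, rows = [], []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue                        # blank line or comment
        if line.startswith("@"):
            match = LEGEND.match(line)      # e.g. @ s0 legend "Potential"
            if match:
                legends.append(match.group(1))
            continue                        # other plot directives are ignored
        rows.append([float(value) for value in line.split()])
    return legends, rows


# Made-up sample in the assumed layout.
sample_xvg = """\
# This file was created by g_energy
@    title "GROMACS Energies"
@    xaxis  label "Time (ps)"
@ legend on
@ s0 legend "Potential"
    0.000000  -33000.5
    1.000000  -33010.2
"""
legends, rows = parse_xvg(sample_xvg)
print(legends, rows)
```

The first column of each row is the time and the remaining columns follow the order of the legends.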

Page 21: Bio Linux

Output files:

trr: contains the trajectory data for the simulation; binary format. It contains all the coordinates, velocities, forces and energies, as indicated in the mdp file.

xtc: portable format for trajectories, which stores the information about the trajectories of the atoms in a compact manner (it contains only Cartesian coordinates).

edr: portable file that contains the energies.

log: CPU time, MFLOP, etc.

Page 22: Bio Linux

PSI3 is a program system and development platform for ab initio molecular electronic structure computations.

The PSI3 suite of quantum chemical programs is designed for efficient, high-accuracy calculations of properties of small to medium-sized molecules.

Its capabilities include a variety of Hartree-Fock, coupled cluster, complete-active-space self-consistent-field, and multi-reference configuration interaction models.

Molecular point-group symmetry is utilized throughout to maximize efficiency.

Non-standard computations are possible using a customizable input format.

PSI3 can perform ab initio computations employing basis sets of up to 32768 contracted Gaussian-type functions of virtually arbitrary orbital quantum number.

PSI3 can recognize and exploit the largest Abelian subgroup of the point group describing the full symmetry of the molecule.

Page 23: Bio Linux

It includes mature programming interfaces for parsing user input, accessing commonly used data such as basis-set information or molecular orbital coefficients, and retrieving and storing binary data especially multi-index quantities such as electron repulsion integrals.

This platform is useful both for the rapid implementation of standard quantum chemical methods and for the development of new models.

Features that have already been implemented include Hartree-Fock, multiconfigurational self-consistent-field, second-order Møller-Plesset perturbation theory, coupled cluster, and configuration interaction wave functions.

Distinctive capabilities include the ability to employ Gaussian basis functions with arbitrary angular momentum levels; linear R12 second-order perturbation theory; coupled cluster frequency-dependent response properties, including dipole polarizabilities and optical rotation; and diagonal Born-Oppenheimer corrections with correlated wave functions.

Page 24: Bio Linux