mascot · proteomics is the study of the protein complement of a cell or tissue in a specific...


Post on 20-Jul-2020


Page 1: MASCOT · Proteomics is the study of the protein complement of a cell or tissue in a specific physiological state. The core technologies of proteomics are gel electrophoresis and

MASCOT®

Take the guesswork out of protein identification...

www.matrixscience.com


MASCOT®: SOFTWARE FOR PROTEIN IDENTIFICATION USING MASS SPECTROMETRY DATA

Proteomics is the study of the protein complement of a cell or tissue in a specific physiological state. The core technologies of proteomics are gel electrophoresis and HPLC for separation, followed by mass spectrometry for analysis. Mascot takes the mass spectrometry data and searches it against molecular sequence databases to identify the constituent proteins and to characterize post-translational modifications. The search procedure is computationally intensive, requiring complex statistical calculations to be performed rapidly while streaming through protein or nucleic acid sequence databases.

Mascot has become established as the de facto standard for this application because of advantages such as these:

Supports all three proven search strategies in a single, integrated package.

Unique, true probability based scoring allows standard statistical tests of significance to be applied.

Search any FASTA database, whether protein, EST, or genomic DNA.

No time-consuming index building; total flexibility in specifying chemical or post-translational modifications; search with or without enzyme specificity.

Accepts mass spectrometry data from all the leading instrument manufacturers.

Fast, threaded code gives high throughput on a wide range of single and multi-processor systems and clusters.

Sophisticated client software to automate search submission without custom programming.

Summary and detail reporting of search results to any web browser, together with comprehensive, on-line help.

Licensed for in-house use by more than a thousand academic and commercial laboratories. Described by Frost & Sullivan as “the gold standard for the searching of databases with mass spectrometric data”.

Supported by a dynamic and independent company, committed to developing state-of-the-art bioinformatics software.


SUPPORTS ALL THREE PROVEN SEARCH STRATEGIES IN A SINGLE, INTEGRATED PACKAGE

One of many approaches to proteome analysis is to use 2D gel electrophoresis to separate and visualise the proteins in a cell lysate. Individual spots, containing one or a few proteins, are then excised and digested with trypsin. Mass spectrometric analysis of the intact digest mixture from a spot provides a set of peptide molecular masses for a Peptide Mass Fingerprint search. This is a rapid and sensitive technique, but may fail if the digest contains a complex mixture of proteins.

For protein mixtures, or where greater specificity is required, a digest can be separated by HPLC, coupled directly to an MS/MS instrument. Individual peptides are then selected and induced to fragment, yielding MS/MS spectra. A Mascot MS/MS Ions Search looks for the best peptide sequence match to each MS/MS spectrum, then groups these peptide matches into protein matches. This technique is applicable to even the most complex protein mixtures, such as those generated by digesting a cell lysate without any gel separation step.

The third type of search supported by Mascot is a Sequence Query, a powerful and flexible tool that allows molecular and fragment ion mass values to be combined with amino acid sequence and composition data.
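The idea behind a Peptide Mass Fingerprint search can be illustrated with a minimal Python sketch: digest a candidate sequence in silico with trypsin rules, compute the peptide masses, and count how many observed masses fall within a tolerance. This is not Mascot's algorithm; the function names, the tolerance, and the simple match count are assumptions made for illustration, and a bare count is not a probability based score.

```python
import re

# Standard monoisotopic residue masses in daltons.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # mass of H2O added to the sum of residue masses


def tryptic_digest(sequence):
    """Cleave after K or R, except when the next residue is P (hypothetical helper)."""
    return [p for p in re.split(r"(?<=[KR])(?!P)", sequence) if p]


def peptide_mass(peptide):
    """Monoisotopic mass of an unmodified, neutral peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def count_matches(observed_masses, sequence, tol=0.2):
    """Count observed masses within tol Da of any theoretical tryptic peptide mass."""
    theoretical = [peptide_mass(p) for p in tryptic_digest(sequence)]
    return sum(any(abs(m - t) <= tol for t in theoretical) for m in observed_masses)
```

A search engine would repeat a comparison of this kind for every entry in the sequence database and then rank the candidates; turning the comparison into a probability is the subject of the scoring section.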

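[Figure: workflow schematic. A protein is digested with an enzyme to give a peptide mixture. MS/MS Stage I yields a peptide map (peptide molecular masses). An isolated peptide undergoes gas phase fragmentation to give fragment ions, and MS/MS Stage II yields an MS/MS spectrum (fragment ion mass and intensity values). Both kinds of data go to the Mascot search engine, which searches protein/DNA sequence databases to give protein identification and characterisation.]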


UNIQUE, TRUE PROBABILITY BASED SCORING ALLOWS STANDARD STATISTICAL TESTS OF SIGNIFICANCE TO BE APPLIED

Mascot computes the probability that the observed match between the experimental data and mass values calculated from a candidate peptide or protein sequence is a random event. The correct match, which is not a random event, then has a very low probability.

True probability based scoring is the key to recognising and avoiding false positives. It is also an essential prerequisite for automation. Only by establishing scores on a fixed, absolute scale can the decision to accept or reject an identification be made by simple, rule-based software.

A histogram of the Mascot score distribution for the top 50 best matching proteins is displayed at the top of a peptide mass fingerprint report. Scores are -10*Log(P), where P is the probability that the observed match is a random event. Scores in the green shaded area represent random matches, while the correct match has a score of 117. The chance of this being a random match is 3 in 10^6.

Reference: Perkins, D.N. et al. (1999) Probability-based protein identification by searching sequence databases using mass spectrometry data. Electrophoresis, 20, 3551-3567.
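The relation between score and probability stated above, score = -10*Log(P), can be written out as a small Python sketch. It assumes Log means the base-10 logarithm; the function names and the threshold rule are hypothetical and only illustrate how a fixed, absolute scale lets simple rule-based software accept or reject a match.

```python
import math


def score_from_probability(p):
    """Score defined as -10 * log10(P), assuming a base-10 logarithm."""
    return -10.0 * math.log10(p)


def probability_from_score(score):
    """Invert the score definition to recover P."""
    return 10.0 ** (-score / 10.0)


def accept(score, threshold):
    """Hypothetical rule: accept an identification at or above a fixed threshold."""
    return score >= threshold
```

Applied to the score of 117 quoted above, this bare conversion gives a P of about 2e-12. Any adjustment for the size of the database searched is outside this sketch.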


SEARCH ANY FASTA DATABASE, WHETHER PROTEIN, EST, OR GENOMIC DNA

All calculations are performed on the fly, direct from the FASTA file, giving total flexibility.

Search with or without enzyme specificity. ‘No-enzyme’ searches using MS/MS data are essential for finding non-specific cleavage products, and for working with targets such as MHC peptides.

A wide range of chemical and post-translational modifications can be specified in a search. These modifications can be fixed (quantitative) or variable (non-quantitative). Arbitrary combinations of fixed modifications and up to 9 variable modifications can be included in a single search. If a peptide contains multiple potential modification sites, as in this example, Mascot can identify precisely which residues have been modified.

The choice of modifications is taken from a simple text file, which can be updated by editing or by downloading a new file from www.unimod.org. New modifications are available for searching immediately, with no waiting for new indexes to be built.

For interactive searching, the user interface is provided by any JavaScript-aware web browser.
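Why locating a variable modification takes extra work can be seen from a short Python sketch that enumerates every set of sites on a peptide that could carry the modification. Phosphorylation of S, T, or Y is used purely as an illustration, and the function names are hypothetical; this is not a description of how Mascot evaluates modifications.

```python
from itertools import combinations

PHOSPHO_DELTA = 79.96633  # monoisotopic mass added by one phosphate group (HPO3)


def modification_site_sets(peptide, target_residues="STY"):
    """Yield every subset of candidate positions (0-based) that could be modified."""
    sites = [i for i, aa in enumerate(peptide) if aa in target_residues]
    for k in range(len(sites) + 1):
        for chosen in combinations(sites, k):
            yield chosen


def candidate_mass_shifts(peptide, target_residues="STY", delta=PHOSPHO_DELTA):
    """Pair each site set with the total mass shift it would add to the peptide."""
    return [(chosen, len(chosen) * delta)
            for chosen in modification_site_sets(peptide, target_residues)]
```

For the peptide ASTK, the site sets (1,) and (2,) give the same mass shift, so the peptide mass alone cannot tell them apart. Only the fragment ion masses in an MS/MS spectrum can show which residue is modified.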


MASCOT ACCEPTS MASS SPECTROMETRY DATA FROM ALL THE LEADING INSTRUMENT MANUFACTURERS

Protein and peptide analysis by mass spectrometry uses a bewildering variety of instruments and techniques, which means that spectra can vary enormously in terms of mass resolution, signal to noise, and fragmentation behaviour.

Mascot isn’t tuned for data from one particular type of instrument. It is designed to extract all the statistically significant information and deliver optimum results, whatever the source of the data.

Most instrument vendors now incorporate a Mascot interface into their data analysis packages. In addition, you can always use Mascot Daemon to automate data processing and search submission, as illustrated elsewhere in this brochure for Finnigan Xcalibur data from

All trade marks and service marks on this page are the properties of their respective owners and are hereby acknowledged.


SUMMARY AND DETAIL REPORTING OF SEARCH RESULTS TO ANY WEB BROWSER, TOGETHER WITH COMPREHENSIVE, ON-LINE HELP

Detailed on-line help is provided in the form of HTML pages

By clicking on a link, you can drill down to the details of an individual peptide match.

Comprehensive reports help visualise and understand the search results. The example to the right shows a protein view report from a search of LC-MS/MS data.


SOPHISTICATED CLIENT SOFTWARE TO AUTOMATE SEARCH SUBMISSION WITHOUT CUSTOM PROGRAMMING

Data streams from multiple mass spectrometers can be routed to a Mascot Server for real-time searching by using Mascot Daemon, a Microsoft Windows application that is bundled with Mascot.

Each Mascot Daemon task defines the data source (a list of data files or a file path), how the data are to be searched, when the searches are to take place, and any follow-up activities, such as conditional repeat searches.

Batch task: A batch of data files to be searched immediately or at a defined time.

Real-time monitor task: New files on a defined path are searched as they are created.

Score dependent follow-up task: For example, automatically repeating a search against a different sequence database.

Search parameters are defined in the Parameter Editor, which closely resembles the HTML form used for interactive Mascot searches. Tasks and their associated search results are displayed on the status tree. The full result report can be displayed in a web browser by clicking on the blue hyperlink.

Mascot Daemon can take advantage of a variety of data import filters, including Mascot Distiller, to automate the processing of raw data files into peak lists. For example, Finnigan LCQ and LTQ raw files can be processed into peak lists using either the utility supplied with Xcalibur or Mascot Distiller.

Finnigan, Xcalibur, LCQ, and LTQ are trademarks of Thermo Electron Corporation.
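The real-time monitor and score dependent follow-up tasks described above can be pictured with a minimal Python sketch. It does not use or describe any Mascot Daemon interface: the function names, the `*.mgf` file pattern, and the threshold rule are all assumptions chosen for illustration.

```python
from pathlib import Path


def find_new_files(watch_dir, seen, pattern="*.mgf"):
    """Return files under watch_dir matching pattern that were not seen before.

    The set `seen` is updated in place so that each file is reported only once.
    """
    new_files = sorted(p for p in Path(watch_dir).glob(pattern) if p not in seen)
    seen.update(new_files)
    return new_files


def follow_up_database(score, threshold, fallback_db):
    """Hypothetical follow-up rule: name a second database when the score is low."""
    return fallback_db if score < threshold else None
```

A monitoring loop would call `find_new_files` at a fixed interval and hand each new file to whatever submits the search. Keeping the set of seen files outside the function makes a single polling step easy to test.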


FAST, THREADED CODE, OPTIMIZED FOR A WIDE RANGE OF SINGLE AND MULTI-PROCESSOR PLATFORMS

Mascot Server is available for all these popular computing platforms:

Microsoft Windows NT / 2000 / XP / 2003

Linux

IBM AIX

Solaris

HP Tru64 Unix

SGI IRIX

If required, Mascot can be executed on a multi-processor server or a cluster of servers connected by a standard LAN. Cluster-mode execution is a standard feature of the code, and is enabled whenever the license is for four or more processors. Throughput scales almost perfectly with the number of processors.

Images on this page are reproduced courtesy of International Business Machines Corporation. Unauthorized use not permitted. The IBM Business Partner emblem, IBM, eServer, and BladeCenter are trademarks of International Business Machines Corporation in the United States, other countries, or both.

In addition to our software-only solutions, Mascot Cluster provides a complete turn-key system for high throughput protein identification. The IBM eServer™ BladeCenter™ hardware is easy to manage and cost effective. The operating system can be either Microsoft Windows or Linux. A single, fully populated BladeCenter contains 14 blades (28 processors). Up to 6 BladeCenters (168 Mascot processors) can be mounted in a single industry-standard 19" rack. The BladeCenter chassis incorporates redundant power supplies and individual blades are hot-swappable.


MASCOT DISTILLER: CROSS-PLATFORM BROWSING AND PROCESSING OF RAW MASS SPECTROMETRY DATA

Mascot Distiller processes raw mass spectrometry data into high quality peak lists for database searching. The graphical user interface provides a simple and intuitive data browser for viewing spectra and generating peak lists from a wide range of mass spectrometers. The processing code is in the form of a Windows COM library that can be called by applications such as Mascot Daemon, or by programs you write yourself.

Mascot Distiller detects a peak by fitting an ideal isotopic distribution to the experimental data. The advantage of this approach is that the complete experimental distribution is fitted, not just the 12C peak. The charge state is automatically determined and the peak list contains only monoisotopic peaks, even when the signal to noise is poor or the isotopic distribution is not fully resolved. Smoothing is not necessary or desirable.
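How a charge state follows from an isotopic distribution can be shown with a much simpler technique than the distribution fitting described above: neighbouring isotope peaks are separated by roughly the 13C/12C mass difference divided by the charge. The Python sketch below uses that spacing heuristic only. It is not Mascot Distiller's method, and the function names and the `max_charge` limit are assumptions.

```python
C13_C12_DIFF = 1.003355  # mass difference between 13C and 12C in daltons
PROTON = 1.007276        # mass of a proton in daltons


def charge_from_spacing(mz_peaks, max_charge=6):
    """Estimate charge from the mean m/z spacing of consecutive isotope peaks.

    Expects at least two peaks from one isotope cluster, sorted by m/z.
    """
    spacings = [b - a for a, b in zip(mz_peaks, mz_peaks[1:])]
    mean_spacing = sum(spacings) / len(spacings)
    return min(range(1, max_charge + 1),
               key=lambda z: abs(C13_C12_DIFF / z - mean_spacing))


def neutral_monoisotopic_mass(mono_mz, charge):
    """Convert the m/z of a protonated monoisotopic peak to a neutral mass."""
    return charge * (mono_mz - PROTON)
```

A spacing heuristic like this breaks down when peaks are noisy or unresolved, which is the situation the paragraph above says a fit of the complete distribution is meant to handle.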

Mascot Distiller 2.0 includes a fast de novo sequencing algorithm, manual and automatic calling of sequence tags, in silico protein digestion and peptide fragmentation, and the ability to import and view Mascot search results.


MASCOT INTEGRA: DATA MANAGEMENT FOR PROTEOMICS

Mascot Integra is a complete solution for proteomics workflow automation and data mining.

Unlike a traditional LIMS system, Mascot Integra is fully functional “out of the box”. Both gel-based and chromatography-based workflows can be modelled using an intuitive, graphical user interface. Sample and workflow management is powered by the technology of Sapphire™ LIMS from LabVantage Solutions Inc.

The database engine is Oracle® and the system is truly multi-tier, requiring only a web browser on each client. Mascot Integra automates Matrix Science's proven tools, including Mascot Distiller for browsing and processing mass spectrometry data, and the Mascot search engine for protein identification and characterization.

The database schema is structured to facilitate data mining, and Microsoft Excel provides a familiar interface for custom reports.

Mascot Integra will scale to the largest projects, yet has a very affordable entry level, making it a practical choice for smaller laboratories.

All trade marks and service marks on this page are the properties of their respective owners and are hereby acknowledged.

21 CFR Part 11 ERES compliant

[Figure: Mascot Integra architecture in three tiers. Clients: web browsers, Mascot Daemons, and instruments, connected over HTTP(S). Applications: MASCOT Server (Perl CGI presentation components, MASCOT search engine, web server) and MASCOT Integra (JSP, Java applet, and servlet presentation components; LabVantage Sapphire™; J2EE application server). The labels Sybase, ODBC, and JDBC also appear in the diagram. Data: FASTA database and result files as flat files; taskDB, IntegraDB, and adminDB in the ORACLE® RDBMS; Oracle external files.]


Contacts

Headquarters:
Matrix Science Ltd.
8 Wyndham Place
London W1H 1PP
UK
Phone: +44 20 7723 2142
Fax: +44 20 7725 9360
Email: [email protected]

USA & Canada:
Matrix Science Inc.
225 Franklin Street, 26th Floor
Boston, MA 02110
USA
Phone: (800) 716 6702
Fax: (800) 716 6704
Email: [email protected]

Japan:
Matrix Science K.K.
KN-bldg 3F
6-10-12, Sotokanda
Chiyoda-ku, Tokyo 101-0021
Japan
Phone: 03 5807 7895
Fax: 03 5807 7896
Email: [email protected]

About Matrix Science

Matrix Science is an independent bioinformatics company specialising in products and services for mass spectrometry and proteomics. For further information on all of our products and for free access to the Mascot search engine, visit the Matrix Science web site: http://www.matrixscience.com

MASCOT is a registered trademark of Matrix Science Ltd.

All other company, product and service names are acknowledged as trademarks or registered trademarks of their respective owners.

© 2005 Matrix Science Ltd. All rights reserved. Publication 01-2/2005.

www.matrixscience.com