

Gene, 60 (1987) 299-302

Elsevier

GEN 02211

Plotting genetic maps on a microcomputer

(Recombinant DNA; linkage map; restriction map; Lotus 1-2-3; personal computer; spreadsheet; keyboard; IBM PC; database)

Donald P. Nierlich

Department of Microbiology and Molecular Biology Institute, University of California, Los Angeles, CA 90024 (U.S.A.)

Received 5 July 1987

Accepted 8 September 1987

SUMMARY

Maps of genetic linkage and restriction enzyme cleavage sites can be quickly prepared on an IBM PC microcomputer with the commercially available program Lotus 1-2-3. Data can be entered on the keyboard or imported from other programs. The maps can be displayed on the screen or with a printer or plotter. These procedures should be useful in the research laboratory, in preparing figures for publication and in teaching.

Correspondence to: Dr. D.P. Nierlich, Department of Microbiology, Life Sciences 5304, UCLA, Los Angeles, CA 90024 (U.S.A.) Tel. (213)825-6720.

Abbreviations: Lotus 1-2-3 and Freelance Plus, registered trademarks for software programs of Lotus Development Corp.; IBM PC, a registered trademark for a microcomputer of International Business Machines Corp.; kb, 1000 base pairs.

0378-1119/87/$03.50 © 1987 Elsevier Science Publishers B.V. (Biomedical Division)

INTRODUCTION

The purpose of this note is to point out that genetic maps can be quickly prepared on an IBM PC (International Business Machines Corp.) or a compatible microcomputer using the program Lotus 1-2-3 (Lotus Development Corp., Cambridge, MA, U.S.A.). Lotus 1-2-3 is a widely available program, intended originally for business applications, which can be used for recording, manipulating and graphing laboratory data. Data are entered in tabular form on a ‘spreadsheet’ that resembles accounting or engineering notebook paper, and can then be manipulated with the program’s arithmetic, exponential, logical, statistical and database functions. Data can be typed in or imported from the files of other programs. In the case of genetic data, the program can accept the names of loci and their map positions, sort them, and, with a few additional keystrokes, display the map on the screen. With a printer or plotter, graphs can be prepared without the tedious calculation of scaling factors and drawing. Finally, spreadsheets can be saved or copied so that revisions can be carried out with little effort, and portions of a spreadsheet can be protected (locked) so that they cannot be accidentally altered.
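For readers without access to a spreadsheet, the tabular entry and sort step mentioned in the Introduction can be sketched in a few lines of Python. This is only an illustration, not part of the published method: the function name `sort_loci` is hypothetical, and of the records shown only attP22 at position 6.2 is taken from Table I; `geneX` and `geneY` and their positions are placeholders.

```python
from typing import List, Tuple

Locus = Tuple[str, float]  # (locus name, map position)


def sort_loci(records: List[Locus]) -> List[Locus]:
    """Return the records ordered by map position (the spreadsheet 'sort' step)."""
    return sorted(records, key=lambda record: record[1])


if __name__ == "__main__":
    # Only attP22 (6.2) comes from Table I; the other two rows are placeholders.
    loci = [("geneY", 8.0), ("attP22", 6.2), ("geneX", 7.5)]
    for name, position in sort_loci(loci):
        print(f"{name:<8} {position:5.1f}")
```

Keeping each locus as one (name, position) record mirrors the two spreadsheet columns, so the names stay attached to their positions when the table is reordered.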

EXPERIMENTAL AND DISCUSSION

(a) Principle of the method

Conceptually, the maps are X-Y graphs in which the value of one parameter has been held constant, e.g., X = 0 for the loci on the vertical axis in Fig. 1A. The features of Lotus 1-2-3 that allow it to display such a map are the ability to scale and plot a series of numbers, and the ability to draw the names of the loci, called ‘data-labels’, adjacent to the points. Slightly more sophisticated maps, as shown in Fig. 1B, depend on the program’s ability to plot data-labels while suppressing the associated points and lines. These features may be found in other programs as well.

(b) Preparation of maps

Fig. 1 shows typical maps of genetic linkage data and restriction enzyme cleavage sites. In the simplest case (most of Fig. 1A), the names of the loci and map positions were entered in two columns on the screen. In a third column, zeros were entered. The graph mode was entered, an X-Y type of graph was designated, and the column of zeros and the map positions were designated as the X and Y (‘B’) values, respectively. The column of loci was designated as the data-labels (for the B-values). To display the loci that are offset on the map, e.g., lacZYA in Fig. 1A, they were given an X-value of 10; two additional ‘dummy’ data points with values of 0 (and without locus names) were included on each side of these points to specify the points at which the offset leaves and returns to the axis, and an additional dummy data point with a value of 100 was placed at the end of the column of data to provide a scale for drawing the figure.

For the restriction map in Fig. 1B, the map positions were entered as values of X, and a column of zeros designated for the values of Y (‘C’ in this case). Two additional columns were then created, one with the names of the enzymes (designated the data-labels for the A-range) and the other with the constant values (10, 20 or 30) used for plotting the names (designated as the A values); a dummy A-data point (200) was also included to scale the figure vertically. Using the program’s Graph/Options/Format command, the drawing of points and lines was turned off for the A range. Detailed instructions for preparing Fig. 1A are given in Table I. Similar methods can be used to prepare maps of other types; circular maps can be prepared but require subsequent editing (below) to put them in an acceptable format.

Fig. 1. Maps of genetic linkage and restriction enzyme cleavage sites prepared with Lotus 1-2-3. (A) Portion of the genetic map of Escherichia coli (Bachmann, 1983). The figure was prepared as described in Table I and in EXPERIMENTAL AND DISCUSSION, section b. The graph was printed on a Hewlett Packard LaserJet 500 Plus (Hewlett Packard, Palo Alto, CA, U.S.A.) after slight editing with Freelance Plus. (B) Restriction map of the lacZ gene. Restriction sites in the ECOLAC file of the GenBank database (available from BBN Laboratories, Boston, MA, U.S.A.) were identified using the program ANALYSEQ (Staden, 1986) running on a DEC VAX 750 computer. The resulting data file (moved to the PC) was entered directly into a 1-2-3 spreadsheet with the import and parse functions. The graph was prepared on a Hewlett Packard X-Y pen plotter, model 7470A.

TABLE I

Detailed instructions for Fig. 1A

A simplified version of Fig. 1A (without showing the lac genes offset ^a) can be made with the following entries:

(a) Start the 1-2-3 program and in cells A3, B3 and C3, respectively, enter the name of the first locus (attP22), its map position (6.2), and a zero ^b. In cells A4-C4 through A18-C18, repeat this with each successive locus. List the lac operon as a single locus, lacZYA.

(b) (optional) Use these keystrokes to adjust the map positions to two significant figures: /rff1(cr) b3..b18(cr). [(cr) indicates the Enter key.] These keystrokes center the locus names in column A: /rlc a3..a18(cr).

(c) With the following keystrokes, indicate an XY type of graph: /gtx.

(d) Indicate that the column of zeros (cells C3-C18) contains the X-values: x c3..c18(cr).

(e) Indicate that the column of map positions contains the B-values: b b3..b18(cr). (One can use another of the ranges, A-F, to mark the loci with a different symbol.)

(f) (optional) To view the map at this stage, touch ‘v’; hit any key to continue.

(g) To place the names of the loci on the map, associate them with the B-values: odb a3..a18(cr) rqq.

(h) To view the completed map, hit ‘v’; touch any key to return to the graphics menu.

(i) To save the map itself for printing later, hit ‘s’ and enter a file name. Hit (Esc) one or more times to return to the data screen. To save the data in a file, type: /fs. Finally, to map a subset of the loci, change the values for the range in steps e and g. A Lotus file can hold many thousands of loci (depending on how much ancillary information is included), and graphics settings for several such subsets can be stored in named sets with the command ‘/gnc’.

^a To draw a map with the lac genes offset as in Fig. 1A, see EXPERIMENTAL AND DISCUSSION, section b.

^b It is often helpful to enter the map-position data twice, in two columns. One column serves as a record of the actual values, and the other (identified as the B-values in step e) serves for making the drawing. In this way, changes can be made for the purpose of the drawing (as for the lac genes) without disturbing the record of data values. The copy command, /c, is useful in duplicating the column of map-position values, and also in creating the column of repeated values (e.g., zeros).
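The column layouts described in section b and Table I can be sketched outside the spreadsheet as plain Python lists. This is a minimal sketch under stated assumptions, not the author's procedure: the function names `linkage_columns` and `restriction_columns` are hypothetical, the dummy points are assumed to share the map position of the locus they bracket, the scaling point is assumed to sit at the last map position, and the enzyme names are assumed to cycle through the three heights in order. Only the constants 0, 10, 100, 10/20/30 and 200 are taken from the text.

```python
from typing import Iterable, List, Optional, Sequence, Tuple

LinkageRow = Tuple[Optional[str], float, float]             # (data-label, X, B)
RestrictionRow = Tuple[Optional[str], float, float, float]  # (data-label, X, C, A)


def linkage_columns(loci: Sequence[Tuple[str, float]],
                    offset: Iterable[str] = (),
                    offset_x: float = 10.0,
                    scale_x: float = 100.0) -> List[LinkageRow]:
    """Build the label, X and B columns for a vertical linkage map."""
    offset_names = set(offset)
    ordered = sorted(loci, key=lambda locus: locus[1])
    rows: List[LinkageRow] = []
    for name, position in ordered:
        if name in offset_names:
            # Assumption: unlabeled dummy points at the same map position
            # mark where the line leaves and returns to the axis (X = 0).
            rows.append((None, 0.0, position))
            rows.append((name, offset_x, position))
            rows.append((None, 0.0, position))
        else:
            rows.append((name, 0.0, position))
    if ordered:
        # Unlabeled dummy point at the end of the column that sets the X scale.
        rows.append((None, scale_x, ordered[-1][1]))
    return rows


def restriction_columns(sites: Sequence[Tuple[str, float]],
                        heights: Sequence[float] = (10.0, 20.0, 30.0),
                        scale_a: float = 200.0) -> List[RestrictionRow]:
    """Build the label, X, C (zeros) and A (label height) columns."""
    ordered = sorted(sites, key=lambda site: site[1])
    rows: List[RestrictionRow] = []
    for index, (enzyme, position) in enumerate(ordered):
        # Assumption: cycle through the heights so adjacent names do not collide.
        rows.append((enzyme, position, 0.0, heights[index % len(heights)]))
    if ordered:
        # Unlabeled dummy A value that scales the figure vertically.
        rows.append((None, ordered[-1][1], 0.0, scale_a))
    return rows


if __name__ == "__main__":
    # Placeholder data: only attP22 at 6.2 appears in Table I.
    for row in linkage_columns([("attP22", 6.2), ("geneX", 7.5), ("geneY", 8.0)],
                               offset=["geneY"]):
        print(row)
    # Placeholder positions in kb, for illustration only.
    for row in restriction_columns([("EcoRI", 1.2), ("HindIII", 0.4)]):
        print(row)
```

Each returned row corresponds to one spreadsheet row, so the lists can be written to a text file and imported, or passed to any plotting routine that draws text labels at X-Y coordinates. Rows whose label is `None` are the dummy points and should be plotted without a name.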

(c) Software and hardware requirements

The equipment requirements are not stringent. Version 1 or 2 of Lotus 1-2-3 runs on an IBM PC or a compatible machine that is equipped for monochrome or color graphics. A Hercules Graphics display board (Hercules Computer Technology, Berkeley, CA, U.S.A.) and an IBM monochrome monitor are relatively inexpensive and provide graphs of high resolution, which is important if locus names must be read on-screen. Graphs are displayed virtually instantaneously. Graphs to be printed are first saved and then printed as a separate operation, which requires a few minutes. A wide variety of printers and plotters can be used; with 1-2-3 version 2, these include dot-matrix printers, X-Y plotters, and laser-driven devices. Finally, it should be noted that there are programs, for example Lotus Freelance Plus, that permit manipulation of 1-2-3 graphs to allow editing, fine adjustment of the labeling, additional drawing, choice of fonts, combining of graphs and other features.

(d) Discussion

The utility of Lotus 1-2-3 has been previously pointed out by Campione-Piccardo and Ruben (1986), who incorporated it into a larger, menu-driven program for maintaining a recombinant DNA database that can combine genetic sequence information at different levels and draw restriction maps. In addition, of course, there are many microcomputer programs that will generate and draw a map of restriction enzyme cleavage sites from a nucleic acid sequence (Abremski and Ward, 1986; Pustell and Kafatos, 1986; and additional references in Söll and Roberts, 1986). On the other hand, I am not aware of any microcomputer program that will generate genetic linkage maps, although there is a program running on larger computers that will maintain a genetic database, calculate linkages and draw maps (Eppig and Eicher, 1983). The use of Lotus 1-2-3 to draw both types of maps will, however, be advantageous where its availability, varied capabilities, microcomputer operation, and ability to use a wide variety of printers are considerations.

ACKNOWLEDGEMENT

This project benefited from an IBM Project Advance award to UCLA.

REFERENCES

Abremski, K. and Ward, D.F.: PLASMID MAP: a microcomputer program for display and storage of plasmid data. Gene 46 (1986) 127-130.

Bachmann, B.: Linkage map of Escherichia coli K-12, edition 7. Microbiol. Rev. 47 (1983) 180-230.

Campione-Piccardo, J. and Ruben, M.: An integrated software system for microcomputer management of recombinant DNA data. Nucl. Acids Res. 14 (1986) 571-574.

Eppig, J.T. and Eicher, E.M.: The mouse linkage map. A computer program. J. Hered. 74 (1983) 218-231.

Pustell, J. and Kafatos, F.C.: A convenient and adaptable microcomputer environment for DNA and protein sequence manipulation and analysis. Nucl. Acids Res. 14 (1986) 479-488.

Söll, D. and Roberts, R.J. (Eds.): The applications of computers to research on nucleic acids III. IRL Press, Oxford, 1986.

Staden, R.: The current status and portability of our sequence handling software. Nucl. Acids Res. 14 (1986) 217-231.

Communicated by A.J. Podhajska