plotting genetic maps on a microcomputer
Gene. 60 (1987) 299-302
Elsevier
GEN 02211
Plotting genetic maps on a microcomputer
(Recombinant DNA; linkage map; restriction map; Lotus 1-2-3; personal computer; spreadsheet; keyboard;
IBM PC; database)
Donald P. Nierlich
Department of Microbiology and Molecular Biology Institute, University of California, Los Angeles, CA 90024 (U.S.A.)
Received 5 July 1987
Accepted 8 September 1987
SUMMARY
Maps of genetic linkage and restriction enzyme cleavage sites can be quickly prepared on an IBM PC
microcomputer with the commercially available program Lotus 1-2-3. Data can be entered on the keyboard
or imported from other programs. The maps can be displayed on the screen or with a printer or plotter. These
procedures should be useful in the research laboratory, in preparing figures for publication and in teaching.
INTRODUCTION
It is the purpose of this note to point out that genetic maps can be quickly prepared on an IBM PC
(International Business Machines Corp.) or a compatible microcomputer using the program Lotus 1-2-3
(Lotus Development Corp., Cambridge, MA, U.S.A.). Lotus 1-2-3 is a widely available program, intended
originally for business applications, which can be used for recording, manipulating and graphing
laboratory data. Data are entered in tabular form on a ‘spreadsheet’ that resembles accounting or
engineering notebook paper, and can then be manipulated with the program’s arithmetic, exponential,
logical, statistical and database functions. Data can be typed in or imported from the files of other
programs. In the case of genetic data, the program can accept the names of loci and their map positions,
sort them, and with a few additional keystrokes, display the map on the screen. With a printer or
plotter, graphs can be prepared without the tedious calculation of scaling factors and drawing. Finally,
spreadsheets can be saved or copied so that revisions can be carried out with little effort, and portions
of a spreadsheet can be protected (locked) so that they cannot be accidentally altered.
Correspondence to: Dr. D.P. Nierlich, Department of Microbiology, Life Sciences 5304, UCLA, Los Angeles,
CA 90024 (U.S.A.) Tel. (213)825-6720.
Abbreviations: Lotus 1-2-3 and Freelance Plus, registered trademarks for software programs of Lotus
Development Corp.; IBM PC, a registered trademark for a microcomputer of International Business
Machines Corp.; kb, 1000 base pairs.
0378-1119/87/$03.50 © 1987 Elsevier Science Publishers B.V. (Biomedical Division)
EXPERIMENTAL AND DISCUSSION
(a) Principle of the method
Conceptually, the maps are X-Y graphs in which the value of one parameter has been held constant,
e.g., X = 0 for the loci on the vertical axis in Fig. 1A. The features of Lotus 1-2-3 that allow it to
display such a map are the ability to scale and plot a series of numbers, and the ability to draw the
names of the loci, called ‘data-labels’, adjacent to the points. Slightly more sophisticated maps, as
shown in Fig. 1B, depend on the program’s ability to plot data-labels while suppressing the associated
points and lines. These features may be found in other programs as well.
(b) Preparation of maps
Fig. 1 shows typical maps of genetic linkage data and restriction enzyme cleavage sites. In the simplest
case (most of Fig. 1A), the names of the loci and map positions were entered in two columns on the screen.
In a third column, zeros were entered. The graph mode was entered, an X-Y type of graph was designated,
and the column of zeros and the map positions were designated as the X and Y (‘B’) values, respectively.
The column of loci was designated as the data-labels (for the B-values). To display the loci that are
offset on the map, e.g., lacZYA in Fig. 1A, they were given an X-value of 10; two additional ‘dummy’ data
points with values of 0 (and without locus names) were included on each side of these points to specify
the points at which the offset leaves and returns to the axis, and an additional dummy data point with a
value of 100 was placed at the end of the column of data to provide a scale for drawing the figure. For
the restriction map in Fig. 1B, the map positions were entered as values of X, and a column of zeros
designated for the values of Y (‘C’ in this case). Two additional columns were then created, one with the
names of the enzymes (designated the data-labels for the A-range) and the other with the constant values
(10, 20 or 30) used for plotting the names (designated as the A values); a dummy A-data point (200) was
also included to scale the figure vertically. Using the program’s Graph/Options/Format command, the
drawing of points and lines was turned off for the A range. Detailed instructions for preparing Fig. 1A
are given in Table I. Similar methods can be used to prepare maps of other types; circular maps can be
prepared but require subsequent editing (below) to put them in an acceptable format.
Fig. 1. Maps of genetic linkage and restriction enzyme cleavage sites prepared with Lotus 1-2-3.
(A) Portion of the genetic map of Escherichia coli (Bachmann, 1983). The figure was prepared as described
in Table I and in EXPERIMENTAL AND DISCUSSION, section b. The graph was printed on a Hewlett Packard
LaserJet 500 Plus (Hewlett Packard, Palo Alto, CA, U.S.A.) after slight editing with Freelance Plus.
(B) Restriction map of the lacZ gene. Restriction sites in the ECOLAC file of the GenBank database
(available from BBN Laboratories, Boston, MA, U.S.A.) were identified using the program ANALYSEQ
(Staden, 1986) running on a DEC VAX 750 computer. The resulting data file (moved to the PC) was entered
directly into a 1-2-3 spreadsheet with the import and parse functions. The graph was prepared on a
Hewlett Packard X-Y pen plotter, model 7470A.
TABLE I
Detailed instructions for Fig. 1A
A simplified version of Fig. 1A (without showing the lac genes offset ^a) can be made with the following entries:
(a) Start the 1-2-3 program and in cells A3, B3 and C3, respectively, enter the name of the first locus (attP22), its map position
(6.2), and a zero ^b. In cells A4-C4 through A18-C18, repeat this with each successive locus. List the lac operon as a single
locus, lacZYA.
(b) (optional) Use these keystrokes to adjust the map positions to two significant figures: /rff1(cr) b3..b18(cr). [(cr) indicates
the Enter key.] These keystrokes center the locus names in column A: /rlc a3..a18(cr).
(c) With the following keystrokes, indicate an XY type of graph: /gtx.
(d) Indicate that the column of zeros (cells C3-C18) contains the X-values: x c3..c18(cr).
(e) Indicate that the column of map positions contains the B-values: b b3..b18(cr). (One can use another of the ranges, A-F, to
mark the loci with a different symbol.)
(f) (optional) To view the map at this stage, touch ‘v’; hit any key to continue.
(g) To place the names of the loci on the map, associate them with the B-values: odb a3..a23(cr) rqq.
(h) To view the completed map, hit ‘v’; touch any key to return to the graphics menu.
(i) To save the map itself for printing later, hit ‘s’ and enter a file name. Hit (Esc) one or more times to return to the data screen.
To save the data in a file, type: /fs. Finally, to map a subset of the loci, change the values for the range in steps e and g. A
Lotus file can hold many thousands of loci (depending on how much ancillary information is included), and graphics settings
for several such subsets can be stored in named sets with the command ‘/gnc’.
^a To draw a map with the lac genes offset as in Fig. 1A, see EXPERIMENTAL AND DISCUSSION, section b.
^b It is often helpful to enter the map-position data twice, in two columns. One column serves as a record of the actual values, and the
other (identified as the B-values in step e) serves for making the drawing. In this way, changes can be made for the purpose of the drawing
(as for the lac genes) without disturbing the record of data values. The copy command, /c, is useful in duplicating the column of
map-position values, and also in creating the column of repeated values (e.g., zeros).
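The column layouts described in section b can also be sketched outside the spreadsheet. The Python fragment below is only an illustration, not part of the original procedure. The function names, the `gap` parameter (how far the unlabeled dummy points sit from an offset locus), the Y-value given to the scaling dummy point, and the cycling of label heights through 10, 20 and 30 are all assumptions, and the map positions other than attP22 (6.2, from Table I) are placeholders rather than values read from Fig. 1.

```python
from typing import Iterable, List, Sequence, Tuple

Locus = Tuple[str, float]  # (name, map position)


def linkage_columns(loci: Iterable[Locus],
                    offset: Sequence[str] = (),
                    offset_x: float = 10.0,
                    gap: float = 0.2,
                    scale_x: float = 100.0) -> List[Tuple[str, float, float]]:
    """Return (data-label, X, Y) rows for a vertical linkage map (Fig. 1A style).

    Ordinary loci get X = 0. Loci named in `offset` get X = offset_x and are
    bracketed by two unlabeled dummy points at X = 0. The placement of those
    dummy points at +/- gap along the map is an assumption of this sketch.
    """
    rows: List[Tuple[str, float, float]] = []
    for name, pos in sorted(loci, key=lambda item: item[1]):
        if name in offset:
            rows.append(("", 0.0, pos - gap))   # line leaves the axis here
            rows.append((name, offset_x, pos))  # the offset, labeled locus
            rows.append(("", 0.0, pos + gap))   # line returns to the axis
        else:
            rows.append((name, 0.0, pos))
    if rows:
        # Unlabeled dummy point that sets the horizontal scale (X = 100).
        # Reusing the last Y-value is an assumption of this sketch.
        rows.append(("", scale_x, rows[-1][2]))
    return rows


def restriction_columns(sites: Iterable[Locus],
                        levels: Sequence[float] = (10.0, 20.0, 30.0),
                        scale_y: float = 200.0
                        ) -> List[Tuple[str, float, float, float]]:
    """Return (data-label, X, C-value, A-value) rows for a horizontal map (Fig. 1B style).

    X is the map position, the C-value is the constant 0 that draws the axis,
    and the A-value is the height at which the enzyme name is written.
    Cycling through `levels` to stagger the names is an assumption.
    """
    ordered = sorted(sites, key=lambda item: item[1])
    rows = [(enzyme, pos, 0.0, levels[i % len(levels)])
            for i, (enzyme, pos) in enumerate(ordered)]
    if rows:
        # Unlabeled dummy A-point (200) that sets the vertical scale.
        rows.append(("", rows[-1][1], 0.0, scale_y))
    return rows


# Placeholder data: only attP22 = 6.2 comes from Table I.
for row in linkage_columns([("attP22", 6.2), ("lacZYA", 8.0)], offset=["lacZYA"]):
    print(row)
for row in restriction_columns([("EcoRI", 3.0), ("ClaI", 1.0), ("SacI", 2.0)]):
    print(row)
```

Each tuple corresponds to one spreadsheet row, so the output could be typed or imported into columns and assigned to the X, A, B or C ranges as described above. Keeping the rows sorted by map position mirrors the sorting step that the spreadsheet performs before the graph is drawn.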
(c) Software and hardware requirements
The equipment requirements are not stringent. Version 1 or 2 of Lotus 1-2-3 runs on an IBM PC or a
compatible machine that is equipped for monochrome or color graphics. A Hercules Graphics display board
(Hercules Computer Technology, Berkeley, CA, U.S.A.) and an IBM monochrome monitor are relatively
inexpensive and provide graphs of high resolution, which is important if locus names must be read
on-screen. Graphs are displayed virtually instantaneously. Graphs to be printed are first saved and then
printed as a separate operation, which requires a few minutes. A wide variety of printers and plotters
can be used; with 1-2-3 version 2, these include dot-matrix printers, X-Y plotters, and laser-driven
devices. Finally, it should be noted that there are programs, for example, Lotus Freelance Plus, that
permit manipulation of 1-2-3 graphs to allow editing, fine adjustment of the labeling, additional
drawing, choice of fonts, combining of graphs and other features.
(d) Discussion
The utility of Lotus 1-2-3 has been previously pointed out by Campione-Piccardo and Ruben (1986), who
incorporated it into a larger, menu-driven program for maintaining a recombinant DNA database that can
combine genetic sequence information at different levels and draw restriction maps. In addition, of
course, there are many microcomputer programs that will generate and draw a map of restriction enzyme
cleavage sites from a nucleic acid sequence (Abremski and Ward, 1986; Pustell and Kafatos, 1986; and
additional references in Söll and Roberts, 1986). On the other hand, I am not aware of any microcomputer
program that will generate genetic linkage maps, although there is a program running on larger computers
that will maintain a genetic database, calculate linkages and draw maps (Eppig and Eicher, 1983). The use
of Lotus 1-2-3 to draw both types of maps will, however, be advantageous where its availability, varied
capabilities, microcomputer operation, and ability to use a wide variety of printers are considerations.
ACKNOWLEDGEMENT
This project benefited from an IBM Project Advance award to UCLA.
REFERENCES
Abremski, K. and Ward, D.F.: PLASMID MAP: a microcomputer program for display and storage of plasmid data.
Gene 46 (1986) 127-130.
Bachmann, B.: Linkage map of Escherichia coli K-12, edition 7. Microbiol. Rev. 47 (1983) 180-230.
Campione-Piccardo, J. and Ruben, M.: An integrated software system for microcomputer management of
recombinant DNA data. Nucl. Acids Res. 14 (1986) 571-574.
Eppig, J.T. and Eicher, E.M.: The mouse linkage map. A computer program. J. Hered. 74 (1983) 218-231.
Pustell, J. and Kafatos, F.C.: A convenient and adaptable microcomputer environment for DNA and protein
sequence manipulation and analysis. Nucl. Acids Res. 14 (1986) 479-488.
Söll, D. and Roberts, R.J. (Eds.): The applications of computers to research on nucleic acids III.
IRL Press, Oxford, 1986.
Staden, R.: The current status and portability of our sequence handling software. Nucl. Acids Res. 14
(1986) 217-231.
Communicated by A.J. Podhajska