
Page 1: Tiamat: The Manual - Dhirubhai Ambani Institute of Information …courses.daiict.ac.in/.../content/0/tiamatManual.pdf · 2018-01-24 · Tiamat: The Manual Sean Williams 1 Preamble

Tiamat: The Manual

Sean Williams

1 Preamble

This document is intended to provide a high-level understanding of Tiamat, most notably its use and uses. This document will not be a fine-toothed combing of Tiamat’s functionality, nor will it be a technical specification. All images of Tiamat in this document were produced using Windows Vista; conveniently, this allows you to see translucent title bars, Windows Vista’s feature. Hopefully seeing the raw form of this document will subconsciously persuade users to begin using LaTeX for document creation.

Also, this documentation was produced by the programmer who wrote Tiamat. This prodigal programmer is not a biochemist – though he does sometimes play one at parties – so this document will also occasionally misuse, underuse, and fabricate biochemistry jargon. The biochemistry in the program itself – the geometry of the DNA backbone, the checks to prevent secondary structure formation, and knowledge of the existence of Watson and Crick, for example – is continually being verified by a group of very bright biochemists, so any biochemical incompetence should only manifest in this document.

The name Tiamat is from the Enuma Elish, the Babylonian creation epic, in which Tiamat, the mother of the other gods, gets angry at the other gods after they kill her husband; she creates terrible monsters to take her revenge. At the eleventh hour, the god-hero Marduk agrees to slay Tiamat for the low price of being crowned king of the gods, and after slaying her, Marduk creates the world out of Tiamat’s body.

Any serious questions or bugs not addressed by this document can be answered either by referring to the source code (just a joke...), or by sending an e-mail to [email protected] (also just a joke... or is it?).

2 The Very High Level

Tiamat is a tool for modeling structural DNA. The core components for this task are a specialized 3D modeling environment and a random sequence generator based on applying known constraints. The basics of 3D modeling are fairly well developed by the computer graphics industry, so the interface is very obviously derived from such applications as Autodesk’s Maya and Lionhead Studios’ Black & White series.


Tiamat is loosely derived from a previous structural DNA modeling utility called GIDEON (though the prodigal programmer did not actually know this until well into Tiamat’s development), and largely exists to solve two or three problems. First, GIDEON does not have a built-in sequence generator. Second, GIDEON does not handle very large structures gracefully. Third, GIDEON only runs under Apple’s OS X, which I state as a “problem” because I prefer to live in the real world, where the vast majority of computers run some version of Windows. Of course, the obvious corollary is that Tiamat only runs under Windows; it would be a fair undertaking to rewrite it enough to work under OS X, Linux, UNIX, Solaris, or, say, MenuetOS.

Another important difference, which may seem trivial at first, is what constitutes an “indivisible data unit” in the two applications. GIDEON uses DNA helices as the modeling primitive; modeling at this more abstract level allows higher-level operations, chief among them structure relaxation, which you will find does not exist in Tiamat. Tiamat uses nucleotides – which will be referred to throughout this document as “bases” for brevity and biochemistry incompetence – as the modeling primitive. Fear not, however, as you are not required to personally apply the parametric helix equation; instead, the operation that draws a helix puts the bases of the helix at their appropriate world space positions, then discards all other information. This might seem like a bad idea, but the reasoning behind this choice is twofold: first, during the initial specifications, structure relaxation was never brought up; second, and more importantly, this design allows greater flexibility. That is, a structure can be composed of basically anything so long as it’s nucleotides, which adds the ability to use funny-shaped probes.
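For the curious, the parametric helix placement mentioned above can be sketched in a few lines. This is only an illustration, not Tiamat's actual code: the function name helix_positions is hypothetical, the helix axis is assumed to run along +z, and the rise, repeat, and radius are textbook B-DNA approximations rather than values taken from the program.

```python
import math

# Assumed, approximate B-DNA parameters (not taken from Tiamat's source).
RISE_NM = 0.34          # axial rise per base
BASES_PER_TURN = 10.5   # helical repeat
RADIUS_NM = 1.0         # distance of the backbone from the helix axis

def helix_positions(n_bases, start=(0.0, 0.0, 0.0)):
    """Return world-space (x, y, z) positions for n_bases along one
    strand of a helix whose axis runs along +z from `start`."""
    twist = 2.0 * math.pi / BASES_PER_TURN  # rotation per base, in radians
    x0, y0, z0 = start
    return [
        (x0 + RADIUS_NM * math.cos(i * twist),
         y0 + RADIUS_NM * math.sin(i * twist),
         z0 + i * RISE_NM)
        for i in range(n_bases)
    ]
```

Once the list of positions is returned, nothing in it remembers that it came from a helix, which is the "discards all other information" step in miniature.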

3 Interfacing

It is the opinion of the prodigal programmer that there are exactly two acceptable uses for the term “interface.” It may be used to describe the boundary between two media, as for example is relevant to Snell’s Law, and it may be used to describe the component of a computer program that allows a user to interact with the program.

3.1 Four Panels, One World

We can imagine our DNA structures as existing within a three-dimensional Cartesian space. This is of course a good analogy, since the real world can also be imagined as a three-dimensional Cartesian space, but it does run into the rather important hitch that three-dimensional displays are cost- and space-prohibitive. The obvious solution to this problem is to project this world space onto the flat screen of a computer monitor and draw it there, but as the expression usually goes, the devil’s in the details.

Simply using a freeform three-dimensional view, such as the Battle Room of Orson Scott Card’s Ender’s Game, has the basic problem addressed by Ender: it’s very easy to get disoriented. Conveniently, this problem has already been addressed at some length by architects, who design buildings using front, side, and top views.

Figure 1: Tiamat’s user interface

This actually gets into a slightly separate issue of the kind of projection used. The lens of the human eye projects the real world onto what is almost a point, which results in angular artifacts summed up by the topic of parallax. To be a little less roundabout, objects close to you appear larger than objects far away, even though they could in fact be the same size. A “flat” projection, which would project space onto a plane rather than a point, removes the anomalies, but also removes the most obvious way the human brain distinguishes depth.

To get back to computer graphics, the solution that has been used quite successfully in the past is to employ both techniques: orthographic (flat) projections from the front, side, and top, and one unconstrained perspective (natural) projection. In this system, the perspective view can be used to view a scene naturally, while the three orthographic views provide a distortion-free environment to model the structure. In Tiamat, the front, side, and top views are cleverly labeled F, S, and T, respectively, and the perspective view is left unlabeled.
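The difference between the two kinds of projection is easy to see in code. The sketch below is an illustration only: the function names are hypothetical, and the choice of which world axis each of the F, S, and T views looks along is an assumption, not something this manual specifies.

```python
def orthographic(point, view):
    """Flat projection: drop the coordinate the view looks along.
    The axis assigned to each view is an assumption for illustration."""
    x, y, z = point
    if view == "F":   # front view, assumed to look along the z axis
        return (x, y)
    if view == "S":   # side view, assumed to look along the x axis
        return (z, y)
    if view == "T":   # top view, assumed to look along the y axis
        return (x, z)
    raise ValueError(f"unknown view {view!r}")

def perspective(point, focal_length=1.0):
    """Natural projection for a camera at the origin looking along +z:
    screen coordinates shrink in proportion to distance."""
    x, y, z = point
    if z <= 0:
        raise ValueError("point must be in front of the camera")
    return (focal_length * x / z, focal_length * y / z)
```

Moving a point from z = 4 to z = 8 halves its perspective coordinates but leaves its front-view coordinates untouched, which is why the orthographic views are the distortion-free ones.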

3.2 Mice and Other Woodland Creatures

It has been the standard of the computing world for some time that the majority of interaction occurs with the mouse; the lack of focus on the keyboard could explain why such Pulitzer-winning lines as “lf1m sm arm/cath need tank” qualify as sentences nowadays, but that (like most of this document) is a discussion for another time.

Of course, most of the nuts and bolts of interaction were defined long ago in a Gordian knot of intercompany idea theft that ultimately resulted in the current computing paradigms. Which is to say, most of the interaction is done through menus and a toolbar, which are fairly well understood by most users and therefore not worthy of any more discussion.

This primarily leaves the question of how to interact with the four views mentioned above. There are three navigation operations – rotate, pan, and zoom – though only the latter two apply to orthographic views. Panning, done by clicking and dragging with the right mouse button, moves the view side to side and up and down. Zooming, done by spinning the mouse wheel, moves the view in and out, or, closer to and further away from the point the view is looking at. Rotating, done by clicking and dragging the middle mouse button (the scroll wheel), orbits the view around the point the view is looking at.

You might be wondering by now what the left mouse button does. While the prodigal programmer does get a smile from the idea of leaving the left mouse button – the most commonly used of the lot – completely unused, its function is modal. Which is to say, there are a lot of functions for the left mouse button, so what it does must be chosen by the user. These options are all featured under the “Mode” menu, which will be detailed later.
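Returning for a moment to rotate, pan, and zoom: all three can be described as changes to a camera that orbits a target point. The following is a minimal sketch under stated assumptions, not Tiamat's implementation; the OrbitCamera class, its field names, and the zoom factor are all hypothetical, and panning is simplified to move along the world x and y axes rather than the camera's own right and up directions.

```python
import math
from dataclasses import dataclass

@dataclass
class OrbitCamera:
    target: tuple = (0.0, 0.0, 0.0)  # the point the view is looking at
    distance: float = 10.0           # how far the camera sits from the target
    yaw: float = 0.0                 # radians about the vertical (y) axis
    pitch: float = 0.0               # radians above the horizontal plane

    def zoom(self, wheel_steps, factor=0.9):
        """Mouse wheel: move closer to or further from the target."""
        self.distance *= factor ** wheel_steps

    def pan(self, dx, dy):
        """Right-button drag: slide the target (and so the view) sideways."""
        tx, ty, tz = self.target
        self.target = (tx + dx, ty + dy, tz)

    def rotate(self, d_yaw, d_pitch):
        """Middle-button drag: orbit around the target."""
        self.yaw += d_yaw
        limit = math.pi / 2 - 1e-3   # stop just short of straight up or down
        self.pitch = max(-limit, min(limit, self.pitch + d_pitch))

    def position(self):
        """Camera position in world space, derived from the orbit state."""
        tx, ty, tz = self.target
        horizontal = self.distance * math.cos(self.pitch)
        return (tx + horizontal * math.sin(self.yaw),
                ty + self.distance * math.sin(self.pitch),
                tz + horizontal * math.cos(self.yaw))
```

An orthographic view would simply ignore rotate, which matches the rule that only pan and zoom apply there.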

4 Bases and Connection

Tiamat is based entirely on the idea of connected bases. There are four kinds of connections in Tiamat – down, across, slide, and sticky – though only the first two have direct physical analogues.

Down connections are connections that run along the helix backbone. A strand is defined as a set of all bases that can be reached only through up and down connections.
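The strand definition above amounts to a walk along up and down links. The sketch below shows one way to express it; the Base class and its field names are hypothetical stand-ins for illustration, not Tiamat's data structures.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(eq=False, repr=False)
class Base:
    kind: str = "N"                    # "A", "T", "G", "C", or "N" for Generic
    up: Optional["Base"] = None        # neighbour one step up the backbone
    down: Optional["Base"] = None      # neighbour one step down the backbone
    across: Optional["Base"] = None    # paired base, if any

def connect_down(upper, lower):
    """Create the two-way down/up connection between two bases."""
    upper.down = lower
    lower.up = upper

def strand_of(base):
    """Return every base reachable from `base` through up and down
    connections only, ordered from the top of the strand to the bottom."""
    top = base
    seen = {id(top)}
    # Climb to the top; the `seen` set guards against circular strands.
    while top.up is not None and id(top.up) not in seen:
        top = top.up
        seen.add(id(top))
    strand, visited, node = [], set(), top
    while node is not None and id(node) not in visited:
        visited.add(id(node))
        strand.append(node)
        node = node.down
    return strand
```

Across connections are deliberately ignored here, so the partner strand of a helix is never pulled into the result.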

Across connections are connections across the helix backbone. Two bases linked by an across connection are paired, and must follow the Watson and Crick rule that Adenine pairs with Thymine, and Cytosine pairs with Guanine.

Slide connections are only used by the sequence generator, with the rule that two slide connected bases cannot be Watson-Crick pairs. That is, if one base in a slide connection is Adenine, the other base in that connection can be Adenine, Cytosine, or Guanine, but cannot be Thymine. Slide connections exist to ensure that certain junctions do not slide along the backbone axis, producing an unwanted structure.
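The across and slide rules are mirror images of each other, which a few lines make plain. The function names are hypothetical, and base types are written as single letters; Generic (unassigned) bases are assumed to be handled elsewhere.

```python
# Watson-Crick partners for the four assigned base types.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def is_watson_crick_pair(a, b):
    """True when base types a and b pair under the Watson-Crick rule."""
    return COMPLEMENT.get(a) == b

def across_ok(a, b):
    """An across connection requires a Watson-Crick pair."""
    return is_watson_crick_pair(a, b)

def slide_ok(a, b):
    """A slide connection forbids a Watson-Crick pair."""
    return not is_watson_crick_pair(a, b)
```

For example, slide_ok("A", "T") is False while slide_ok("A", "G") is True, matching the Adenine example in the text.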

Sticky connections are slightly more regional. Two groups of bases can be sticky connected if they are contained in exactly two strands, and the number of selected bases in each of those strands is equal. The sequence of one of the sticky ends must be Watson-Crick pairable with the sequence of the other sticky end. This can be used to create self-assembling array structures, where the right side of one instance of the structure links to the left end of another instance of the structure.
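The pairability requirement for sticky ends can be checked with a reverse complement. This sketch assumes both ends are written in the same direction along their own strands and pair antiparallel, as real DNA does; this manual does not say how Tiamat orders the bases internally, so treat the direction handling (and the function names) as assumptions.

```python
# Translation table mapping each base letter to its Watson-Crick partner.
_COMPLEMENT = str.maketrans("ATGC", "TACG")

def reverse_complement(seq):
    """Complement every base, then reverse the result."""
    return seq.translate(_COMPLEMENT)[::-1]

def sticky_ends_pair(end_a, end_b):
    """True when the two sticky ends have equal length and every base
    of one pairs with the corresponding base of the other."""
    return len(end_a) == len(end_b) and end_b == reverse_complement(end_a)
```

So an end reading ATTCAG would pair with an end reading CTGAAT, and with nothing shorter or longer.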


Figure 2: A very large structure, simplified

5 Simplification

Very large structures add two design problems. First, they are very time consuming for the computer to draw; if the draw rate slips below about fifteen frames per second, the tool becomes essentially unusable. Second, very large structures bombard the user with so much information that it isn’t possible to take it all in. Tiamat’s simplification algorithm replaces individual bases with lines where there would otherwise be a continuous series of bases.

6 Menu and Toolbar

The purpose of a toolbar is to provide easier access to commonly used functions, and to associate a snappy picture with those functions. This actually creates an interesting exercise: toolbar images are sixteen by sixteen pixels in size, with sixteen colors to choose from. You can tell a lot about the prodigal programmer by the mental associations used to represent a function with a sixteen by sixteen image. Regardless, since the toolbar buttons all map to a menu function, what will follow instead is an explanation of all the menu functions; if a toolbar image is too avant garde to make any sense of, just hover the mouse over it to get an explanation.

6.1 File

Options regarding the creation, loading, and manipulation of external files


New creates a new document; this has the effect of deleting all bases and resetting the views to their default positions

Open loads a previously saved document; this has the effect of returning the program roughly to the state it was in when the document was saved

Save stores the current document to an external file with the .dna extension; if the document has already been saved, the previous file will be updated to reflect the current version

Save As acts like Save in all ways except that it will always require you to specify the file to save to, even if the document has already been saved

Import loads a previously saved document and adds its contents to the current document; this can be useful if you create, for example, a stock four-arm junction that you might want to use in future structures

Export creates a plain text file containing a listing of the sequences of the strands that comprise a structure; this can be useful if you want to order those strands from a DNA synthesis company for some real world experiment or other

Render creates an image using one of the four views, and saves it to an external file, a printer, or the operating system clipboard (so you can “paste” it into a document)

Render Video creates an Audio Video Interleave (AVI) file of the perspective view rotated about a specified axis; this feature is somewhat hit or miss, so if at first you don’t succeed, failure may be your style

Print Setup gives you the operating system’s options for configuring your printer

Exit closes Tiamat, which will likely prompt you to save your document

6.2 Edit

Options regarding manipulation of items within a structure

Undo reverts the structure to the state it was in just before the last operation

Redo brings the structure to the state it was in just after the last undone operation

Cut puts all selected bases in the operating system clipboard (so they can be pasted) and deletes them

Copy puts all selected bases in the operating system clipboard (so they can be pasted)

Paste adds an instance of all bases in the operating system clipboard to the structure


Delete removes all selected bases from the structure, as well as all connections they may have to other bases

Select All of course selects all bases in the structure

Translate brings up a dialog box allowing you to specify a distance in the three directions (x, y, and z), and moves all selected bases

Rotate brings up a dialog box allowing you to specify rotations about the three axes (x, y, and z), and rotates all selected bases about their center of mass; the three directions of rotation are direct analogues of roll, pitch, and yaw in aeronautics

Fill Strand randomly assigns base types to all selected bases; this assignment is completely unconstrained and will thus fairly likely create secondary structures if used irresponsibly

Reset Bases assigns all selected bases the Generic type

Generate Sequences produces a dialog box allowing you to specify the six constraints of the sequence generator (which will be discussed in a later section), then assigns types to all Generic bases such that the overall structure is within the provided constraints

Edit Bases produces a dialog box containing the entire sequence of one selected strand, which can then be edited directly

Delete Connection operations only work if exactly one base is selected for which a given connection type exists; the operation deletes that connection

Create Down only works if exactly two bases are selected, one of which lacks a down connection and the other of which lacks an up connection; the operation creates a down/up connection between them

Create Across only works if exactly two bases are selected that both lack across connections; the operation creates an across connection between them

Create Slide only works if exactly two bases are selected that are not across connected; the operation creates a slide connection between them

Create Sticky only works if two groups of bases of the same size from exactly two strands are selected; the operation creates the appropriate sticky connections between bases in the two selected groups
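As an aside on the Rotate entry in the list above, rotating about the center of mass means shifting to the centroid, rotating, and shifting back. The sketch below handles a single axis (z) and treats every base as having equal mass; both simplifications, and the function names, are assumptions made for illustration.

```python
import math

def centroid(points):
    """Center of mass of equally weighted (x, y, z) points."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))

def rotate_about_centroid(points, angle_z):
    """Rotate points by angle_z radians about an axis parallel to z
    that passes through their centroid."""
    cx, cy, _ = centroid(points)
    c, s = math.cos(angle_z), math.sin(angle_z)
    rotated = []
    for x, y, z in points:
        dx, dy = x - cx, y - cy          # position relative to the centroid
        rotated.append((cx + c * dx - s * dy,
                        cy + s * dx + c * dy,
                        z))
    return rotated
```

Because the rotation is taken about the centroid, the selection spins in place instead of swinging around the world origin.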

6.3 Display

Options regarding how a structure is displayed in the four views

Edit Back Color specifies the background color for the four views; black and white make good choices


Edit Base Color specifies the color of the five base types – Adenine, Thymine, Guanine, Cytosine, and Generic

Edit Connection Color specifies the color in which all connections except up/down connections are drawn

Edit Strand Color specifies the six strand colors used to draw up/down connections; a strand is colored based on its strand number modulo six

Selection Offset specifies whether selected bases and connections are lighter (Additive) or darker (Subtractive) than unselected bases and connections

Grid Extents specifies how far the coordinate grids extend from the origin; very large numbers will badly clutter the display

Set Custom Strand Color sets the color for up/down connections only for the currently selected bases; this can be used to highlight certain sections of a structure

Reset Custom Strand Color returns the selected up/down connections to their default colors

Connection Mode specifies whether connections are drawn as lines or cylinders; cylinders look better but draw more slowly, and should only be used when nice-looking images are needed

Backbone Line Thickness specifies the thickness in pixels of up/down connections

Base Pair Line Thickness specifies the thickness in pixels of across connections

Backbone Cylinder Thickness specifies the thickness in world units [1] of up/down connections

Base Pair Cylinder Thickness specifies the thickness in world units of across connections

Sphere Detail specifies how smooth the spheres that represent bases appear; higher smoothness takes longer to draw but looks nicer

Cylinder Detail specifies how smooth the cylinders that represent connections appear; higher smoothness takes longer to draw but looks nicer

Simplified View sets the simplification mode; Always will simplify bases regardless of view; Sometimes will simplify bases far away from the view; Never will cause simplification to never occur

Default Settings contains options for setting all detail options at once

Color Scheme contains options for setting all color options at once

[1] The reason line thicknesses are specified in screen units – pixels – while cylinder thicknesses are specified in world units – nanometers – is extremely convoluted and ultimately not the fault of the prodigal programmer


6.4 Mode

Options regarding the effect of the left mouse button on the four views

Select Helix selects a base and all bases recursively connected by an up, down, or across connection

Select Strand selects a base and all bases recursively connected by an up or down connection

Select Base Pair selects a base and its across connected base

Select Base selects a base

Select Box draws a bounding box, selecting all bases within that box

Create Strand draws a line, then provides options for creating a helix or strand using that line as its backbone

Create Freeform Strand has successive clicks specify control points of a curve, with a final double click ending the curve; clicking on a base first and/or last in this series connects the freeform strand to those bases

Change Position moves all selected bases in the direction the mouse is dragged along

Change Rotation rotates all selected bases about their center of mass about axes specified by the direction the mouse is dragged

6.5 View

Options regarding what elements of the interface are displayed

Toolbar shows and hides the toolbar

Status Bar shows and hides the bar at the bottom of the screen that shows program status, selection information, and the mouse pointer’s world coordinates

Grids shows and hides the coordinate grids in the four views

Fog determines whether objects are “greyed” depending on their distance from the camera; this effect helps the brain determine distance

Strand Numbers shows and hides strand numbers near the top base of each strand

Slide Connections shows and hides stippled (dashed) lines between all slide connected bases

Bounding Boxes shows and hides colored boxes around sticky end connected groups

6.6 Help

An artifact left over from the stock menu created by Visual Studio

About Tiamat... contains most notably the name of the prodigal programmer


7 Sequences

The question is occasionally asked, how does a completely deterministic machine generate random numbers? It’s a closely guarded secret, but the truth is that all computers from the 286 onward have included microscopic gnomes that roll microscopic dice, and feed the results of their dice-rolling into the processor. That being out of the way, there are six constraints to the sequence generator.

Unique sequence limit determines the length at which subsequences must be unique; anything shorter may appear more than once in the structure. With a unique sequence limit of 6, if the sequence ATTCAG appears, that sequence can never appear again. However, the five-length sequence ATTCA can appear in other places, so the sequence ATTCAT is allowed to appear elsewhere. If a group of bases is set prior to running the sequence generator, all sequences of preset bases of the specified length can never appear again.

Repetition limit determines the maximum length of a run of identical bases. With a repetition limit of 4, the sequence AAAA may appear, but AAAAA may not.

Guanine repetition limit determines the maximum number of Guanines that can appear in a continuous sequence.

Guanine and Cytosine percentage determines what portion of bases in the structure must be Guanine or Cytosine. The actual proportion will generally be slightly in Guanine’s and Cytosine’s favor.

Check sliding bases simply determines whether the rule that slide connected bases must not be Watson-Crick pairs is enforced.

Operation timeout determines how long, in seconds, the sequence generator is allowed to run before terminating with no result. If the constraints are too tight for a valid set of sequences to be generated, the algorithm would otherwise run forever, but it will find a solution very quickly if one exists.
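Several of the constraints above are simple enough to state as checks on finished sequences. The following is a sketch under stated assumptions rather than the generator's actual logic: the function names are hypothetical, strands are plain strings of A, T, G, and C, and only literal repeats are considered (whether Tiamat also compares against complements is not stated in this manual).

```python
def violates_unique_limit(strands, limit):
    """True if any subsequence of length `limit` occurs more than once
    anywhere in the given strands."""
    seen = set()
    for seq in strands:
        for i in range(len(seq) - limit + 1):
            window = seq[i:i + limit]
            if window in seen:
                return True
            seen.add(window)
    return False

def longest_run(seq, bases="ATGC"):
    """Length of the longest run of one repeated base, counting only
    base types listed in `bases` (pass "G" for the Guanine limit)."""
    best = current = 0
    previous = None
    for base in seq:
        if base in bases and base == previous:
            current += 1
        elif base in bases:
            current = 1
        else:
            current = 0
        previous = base
        best = max(best, current)
    return best

def gc_fraction(strands):
    """Fraction of all bases that are Guanine or Cytosine."""
    total = sum(len(s) for s in strands)
    gc = sum(s.count("G") + s.count("C") for s in strands)
    return gc / total if total else 0.0
```

With a repetition limit of 4, a strand passes when longest_run(strand) is at most 4; the Guanine limit is the same test with bases="G".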

The sequence generator will affect all Generic bases in the scene – not just those that are selected. Bases set to a non-Generic type prior to running the sequence generator will not trip the constraints; the only constraint that especially interacts with preset bases is the unique sequence limit, which only adds all preset sequences to a database of unusable sequences. There have also been reports of random number generation gnomes forming unions and organizing strikes due to the number of random numbers required in the process, but it’s usually fairly trivial to locate enough scabs to offset this effect.
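To make the behavior described above concrete, here is a toy generator that fills Generic bases (written N) at random, leaves preset bases alone, records preset windows as unusable, and gives up after a timeout. It is a sketch under heavy assumptions: it handles one strand, enforces only the unique sequence and repetition limits, and restarts from scratch on a dead end. The function name, its parameters, and the restart strategy are all hypothetical; the manual does not describe the real algorithm.

```python
import random
import time

def generate(template, unique_limit=6, repetition_limit=4,
             timeout_s=5.0, rng=None):
    """Fill every 'N' in `template` with a random base, or return None
    if no valid sequence is found before the timeout."""
    rng = rng or random.Random()
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        seq, seen, ok = [], set(), True
        for preset in template:
            # Preset bases have one choice; Generic bases try all four
            # types in random order.
            choices = [preset] if preset != "N" else rng.sample("ATGC", 4)
            for base in choices:
                candidate = seq + [base]
                window = "".join(candidate[-unique_limit:])
                run = candidate[-(repetition_limit + 1):]
                too_repetitive = (len(run) > repetition_limit
                                  and len(set(run)) == 1)
                duplicate = len(window) == unique_limit and window in seen
                # Preset bases never trip the constraints.
                if preset != "N" or not (too_repetitive or duplicate):
                    break
            else:
                ok = False   # dead end: no base type fits here
                break
            seq.append(base)
            if len(window) == unique_limit:
                seen.add(window)  # this window may not be used again
        if ok:
            return "".join(seq)
    return None
```

Calling generate("ATTCAG" + "N" * 10) returns a 16-base string that still begins with ATTCAG, while an impossible request such as generate("N" * 10, unique_limit=1, timeout_s=0.05) runs out the clock and returns None.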

8 Exit Stage Left

The prodigal programmer would like to thank the following people for their assistance on the Tiamat project: Dr. Peter Wonka of ASU’s PRISM lab, who got the project started by hiring the prodigal programmer, and kept it going with thinly veiled threats, vague comments, and plenty of not quite praise. Drs. Stuart Lindsay and Hao Yan of ASU’s Biodesign Institute, who kept funding alive throughout the project, and sufficiently liked Tiamat to not fire me. And Yonggang Ke, Chenxiang Lin, Kyle Lund, and Sherri Rinker, of ASU’s Biodesign Institute, who like all users used Tiamat like it should work rather than how it does, and thus exposed every manner of bug. (They also kept the prodigal programmer up to date on all the gossip, which is perhaps more important.) Finally, the prodigal programmer would like to thank Jon Anderson, Chris Squire, Steve Howe, Rick Wakeman, and Alan White, of the progressive rock band Yes, for producing the album Tales from Topographic Oceans.
