component coding, three-item coding, and consensus methods

Society of Systematic Biologists

Component Coding, Three-Item Coding, and Consensus Methods
Author(s): David M. Williams and Christopher J. Humphries
Source: Systematic Biology, Vol. 52, No. 2 (Apr., 2003), pp. 255-259
Published by: Oxford University Press for the Society of Systematic Biologists
Stable URL: http://www.jstor.org/stable/3651129




Syst. Biol. 52(2):255-259, 2003 DOI: 10.1080/10635150390192753

Component Coding, Three-Item Coding, and Consensus Methods

DAVID M. WILLIAMS AND CHRISTOPHER J. HUMPHRIES

Department of Botany, Natural History Museum, Cromwell Road, London SW7 5BD, U.K.; E-mail: [email protected] (D.M.W.)

It has long been recognized that cladograms can be represented as binary data matrices (e.g., Farris, 1973). Conversion of suites of fundamental cladograms into matrices for parsimony analysis has been developed in biogeography (e.g., Wiley, 1988; Brooks, 1990) and gene-tree analysis in molecular systematics (Doyle, 1992; Guigó et al., 1996; Slowinski and Page, 1999). The idea has recently been applied to consensus tree techniques (Baum, 1992; Ragan, 1992a, 1992b; Nelson and Ladiges, 1994), with Sanderson et al. (1998) advocating its application to super-tree construction (see also Bininda-Emonds and Bryant, 1998, and Hugot and Cosson, 2000, who linked all these approaches together in a biogeographic study).

In the context of cladogram analysis and historical biogeography, Nelson and Ladiges (1996) contrasted two ways of assigning data to nodes of cladograms to express the phylogenetic relationships of organisms. The coding method determines how information is organized as 0 and 1 entries in a matrix for subsequent analysis by parsimony programs. Nelson and Ladiges described the two methods as component and three-item analyses. Component coding follows each internal node in a cladogram to all its tips and scores those terminal taxa as 1 in the matrix. All remaining taxa, which are not supported by that node, are scored as 0. If a cladogram has more than one node, then one component is coded for each node.

Component coding, when used for consensus tree construction, has been called matrix representation with parsimony (MRP; Ragan, 1992a, 1992b) or binary consensus (Nelson and Ladiges, 1994) and has been compared with parsimony approaches in general area cladogram construction in biogeography (Doyle, 1992; Baum and Ragan, 1993). Component coding has also been likened to additive binary coding, normally used for character data (Farris et al., 1970).
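The component coding rule can be sketched in a few lines of Python. The nested-tuple representation of a cladogram and the helper names `leaves`, `components`, and `component_matrix` are assumptions made for illustration and do not come from any published program. For the cladogram A(B(C(DE))) the sketch gives one column per internal node below the root.

```python
def leaves(tree):
    """Return the set of terminal taxa below a node (a string or a nested tuple)."""
    if isinstance(tree, str):
        return {tree}
    return set().union(*(leaves(child) for child in tree))


def components(tree):
    """List the taxon set of every internal node below the root."""
    found = []

    def walk(node, is_root):
        if isinstance(node, str):
            return
        if not is_root:
            found.append(frozenset(leaves(node)))
        for child in node:
            walk(child, False)

    walk(tree, True)
    return found


def component_matrix(tree):
    """Score each taxon 1 if it belongs to a component and 0 otherwise."""
    taxa = sorted(leaves(tree))
    # Largest component first, so the columns run from the base of the tree upward.
    columns = sorted(components(tree), key=len, reverse=True)
    return {taxon: [1 if taxon in column else 0 for column in columns] for taxon in taxa}


# Hypothetical encoding of the cladogram A(B(C(DE))) as nested tuples.
tree = ("A", ("B", ("C", ("D", "E"))))
for taxon, row in component_matrix(tree).items():
    print(taxon, *row)
```

The root is skipped because a component that contains every taxon carries no grouping information.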

In contrast to component coding, in three-item coding each node is considered a relation between branches (Nelson and Ladiges, 1996). Thus, three-item coding relates some branches more closely than other branches of the tree, with each separate relation expressed minimally as a three-item statement, e.g., A(BC), where B and C are more closely related to each other than either is to A.

In an earlier examination of component coding, Purvis (1995) suggested that redundancy occurred when additive binary representation of cladograms with more than one informative node was included in the matrix. He revised the protocol to remove the redundancy by coding taxa from each node on a cladogram with a 1, scoring the taxon sister to that node with a zero, and coding all remaining taxa in the matrix with a question mark instead of a zero. Ronquist (1996) questioned Purvis's protocol, suggesting that no redundancy was evident with the standard additive binary coding approach. More recently, Slowinski and Page (1999) offered further criticism of parsimony consensus trees and some general criticisms of consensus tree construction. The dialogue so far might lead one to imagine that there is a need to explore possible differences between additive binary coding, multistate coding, and three-item coding for the representation of fundamental trees (sensu Nelson, 1979) in consensus approaches (Pisani and Wilkinson, 2002).
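Purvis's modification, as described in the paragraph above, can be sketched in the same way. The function name `purvis_matrix` and the nested-tuple cladogram are illustrative assumptions. The sketch also assumes that every terminal taxon in the sister group of a node receives 0, which reduces to "the taxon sister to that node" in a pectinate cladogram.

```python
def leaves(tree):
    """Return the set of terminal taxa below a node (a string or a nested tuple)."""
    if isinstance(tree, str):
        return {tree}
    return set().union(*(leaves(child) for child in tree))


def purvis_matrix(tree):
    """Code each node as 1 (in the node), 0 (sister group), or ? (all other taxa)."""
    taxa = sorted(leaves(tree))
    columns = []

    def walk(node):
        if isinstance(node, str):
            return
        for child in node:
            if not isinstance(child, str):
                inside = leaves(child)
                sister = leaves(node) - inside
                columns.append(
                    {t: "1" if t in inside else "0" if t in sister else "?" for t in taxa}
                )
            walk(child)

    walk(tree)
    return {t: [column[t] for column in columns] for t in taxa}


# Hypothetical encoding of the cladogram A(B(C(DE))).
for taxon, row in purvis_matrix(("A", ("B", ("C", ("D", "E"))))).items():
    print(taxon, *row)
```

Taxa scored as ? in a column take no part in any relationship that the column expresses.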

Discussion so far has centered on the information content of different approaches to the coding regimes as applied to particular cladograms (Purvis, 1995; Ronquist, 1996). Here, we explore this issue with respect to a few simple examples.

ASYMMETRY BETWEEN CLADOGRAM AND MATRIX

Consider the pectinate cladogram A(B(C(DE))) (after Sanderson et al., 1998: box 2, left subtree). The cladogram can be represented by three components and coded in a matrix accordingly (Table 1). These data, when analyzed with a standard parsimony program, yield the original tree. In a similar fashion, the cladogram A(B(C(DE))) can be coded as a matrix of characters according to Purvis's method (Table 1) and when analyzed with a standard parsimony program also yields the original tree. Any cladogram with more than one node may be represented by a single multistate character, such as the cladogram A(B(C(DE))), and it also will yield the original tree (Table 1) after parsimony analysis. The three matrices differ yet produce the same result. This finding raises the question of whether component coding, Purvis coding, and multistate coding are equivalent simply because they yield the same correct cladogram.

TABLE 1. Three approaches to coding the cladogram A(B(C(DE))) in matrix form: component coding,a Purvis (1995) component coding, and ordered multistate coding (MS). All methods yield the cladogram A(B(C(DE))) when analyzed with a parsimony program.

         Component      Purvis component    MS
Taxon    1   2   3      1   2   3           1
A        0   0   0      0   ?   ?           1
B        1   0   0      1   0   ?           2
C        1   1   0      1   1   0           3
D        1   1   1      1   1   1           4
E        1   1   1      1   1   1           4

a Component analysis originated from the work of Nelson and Platnick (1981) not Wilkinson (1994), as suggested by Bininda-Emonds and Bryant (1998:497). See also Nelson and Platnick (1980).
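For this pectinate cladogram, the MS column of Table 1 can be recovered from the component columns by counting, for each taxon, the components to which it belongs. The sketch below assumes states numbered from 1, as printed in Table 1, and a pectinate tree, so that the states form a single linear order. The function name `to_additive_binary` is hypothetical.

```python
# Component columns for A(B(C(DE))), copied from Table 1.
component_rows = {
    "A": [0, 0, 0],
    "B": [1, 0, 0],
    "C": [1, 1, 0],
    "D": [1, 1, 1],
    "E": [1, 1, 1],
}

# Ordered multistate state = 1 + number of components that contain the taxon.
multistate = {taxon: 1 + sum(row) for taxon, row in component_rows.items()}


def to_additive_binary(state, n_columns):
    """Expand an ordered state (numbered from 1) into additive binary columns."""
    return [1 if state - 1 > column else 0 for column in range(n_columns)]


for taxon, state in multistate.items():
    print(taxon, state, to_additive_binary(state, 3))
```

The round trip shows why the two codings are often treated as interchangeable. Whether they carry the same information is the question taken up in the text that follows.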

To some (e.g., Kluge, 1993:248), there is an exact equivalence between a multistate character coded in a single column and its additive binary equivalent coded in several columns; to others that equation is not exact (Nelson, 1993:262). The question of exactitude between multistate coding and its binary equivalent was first raised by Sokal and Sneath (1963:77; see also Sneath and Sokal, 1973:150), and as far as we are aware, no satisfactory solution has been offered.

To investigate the problem, three-item coding may provide some insight or, at the very least, expose some differences in information content among the various binary options (cf. Pisani and Wilkinson, 2002).

THREE-ITEM CODING

If the cladogram A(B(C(DE))) is represented by three binary characters equivalent to the three components A(BCDE), AB(CDE), and ABC(DE), these characters yield a total of 15 three-item statements (Table 2). Among those 15 statements, 6 occur once; 1, A(DE), occurs three times; and 3, A(CD), A(CE), and B(DE), occur twice. Thus, there are 10 different statements in total. Within the array of 15 statements, some components contain statements that are logically dependent on one another.
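The tally of 15 statements, 10 of them distinct, can be checked by enumeration. In this sketch each component is written as a pair of strings (taxa scored 0, taxa scored 1); that encoding and the function name `three_item_statements` are assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def three_item_statements(outgroup, ingroup):
    """Pair every taxon scored 0 with every pair of taxa scored 1."""
    return [f"{o}({a}{b})" for o in outgroup for a, b in combinations(ingroup, 2)]


# The three components of A(B(C(DE))): A(BCDE), AB(CDE), and ABC(DE).
components = [("A", "BCDE"), ("AB", "CDE"), ("ABC", "DE")]
counts = Counter(
    statement
    for outgroup, ingroup in components
    for statement in three_item_statements(outgroup, ingroup)
)

print(sum(counts.values()), len(counts))  # total statements, distinct statements
for statement, n in sorted(counts.items()):
    print(statement, n)
```

The repeated statements are those in which a taxon lies outside more than one component that contains the same pair.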

Consider the component (character) A(BCD). This component yields three statements, A(BC), A(BD), and A(CD), of which any two will sum to recover the original component:

A(BC) + A(CD) = A(BCD)

A(BC) + A(BD) = A(BCD)

A(CD) + A(BD) = A(BCD)

Because any two of the three statements imply the original component, the three statements can be weighted such that they are equivalent to two statements; in this example a fractional weight of 2/3 can be applied (for details, see Nelson and Ladiges, 1992a; there is an error in their table 1 where values for n = 6 are 120 [not 130] for t = 14 and 135 [not 145] for t = 15).
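The weight of 2/3 can be written as a ratio of independent statements to all statements. The general form below is a generalization from the worked cases in this text rather than a formula quoted from Nelson and Ladiges (1992a), and the symbols m (taxa scored 0) and k (taxa scored 1) are introduced here only for the sketch.

```latex
% m = number of taxa scored 0, k = number of taxa scored 1 in one component
\[
  w \;=\; \frac{\text{independent statements}}{\text{all statements}}
    \;=\; \frac{m\,(k-1)}{m\binom{k}{2}}
    \;=\; \frac{2}{k}
\]
% A(BCD): m = 1, k = 3, so w = 2/3
```

Under this reading a component with four taxa scored 1 gives statements of weight 1/2, and a component with two taxa scored 1 gives statements of weight 1.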

Binary Coding

Using the example above, component A(BCDE) yields six statements. A minimum of three independent statements will sum to the original component, hence the absolute weight is 3/6 = 0.5 (Table 3; Nelson and Ladiges, 1992a).

TABLE 2. Three-item statements derived from three binary characters (three components) and one multistate character. Both are equivalent to the cladogram A(B(C(DE))).

                              A(BC)  A(BD)  A(BE)  A(CD)  A(CE)  A(DE)  B(CD)  B(CE)  B(DE)  C(DE)
Three binary characters (components)
  A(BCDE)                       +      +      +      +      +      +
  AB(CDE)                                            +      +      +      +      +      +
  ABC(DE)                                                          +                    +      +
  Total: 15 statements
One multistate character        +      +      +      +      +      +      +      +      +      +
  Total: 10 statements

TABLE 3. Fractional weights for three-item statements derived from the three components A(BCDE), AB(CDE), and ABC(DE).

Statement        A(BCDE)   AB(CDE)   ABC(DE)   Total
A(BC)            0.500                         0.500
A(BD)            0.500                         0.500
A(BE)            0.500                         0.500
A(CD)            0.500     0.667               1.167
A(CE)            0.500     0.667               1.167
A(DE)            0.500     0.667     1.000     2.167
B(CD)                      0.667               0.667
B(CE)                      0.667               0.667
B(DE)                      0.667     1.000     1.667
C(DE)                                1.000     1.000
Totals 10.000    3.000     4.002     3.000     10.002

The second component, AB(CDE), yields six statements. This component may be thought of as consisting of two parts, A(CDE) and B(CDE), each having three statements apiece:

A(CD) + A(CE) + A(DE) = A(CDE)

B(CD) + B(CE) + B(DE) = B(CDE)

Because any two of the three statements can recover the original component, the three statements can be appropriately weighted; in this example, a fractional weight of 2/3 is applied (Nelson and Ladiges, 1992a, 1994).

The third component, ABC(DE), yields three statements. In this example all three statements are required to achieve the correct result:

A(DE) + B(DE) = AB(DE)

A(DE) + C(DE) = AC(DE)

B(DE) + C(DE) = BC(DE)

A(DE) + B(DE) + C(DE) = ABC(DE)

As a consequence, for the total array of 15 three-item statements, application of fractional weighting reduces the total weight to 10, which reflects the total weight of the original 10 combinations. The precision rests in accounting for all of the relevant information (Table 3).
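The totals in Table 3 can be reproduced with exact fractions. The sketch assumes that every statement from a component with k taxa scored 1 receives weight 2/k, which matches the three weights used above (0.5, 2/3, and 1). The function name `weighted_statements` is hypothetical.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import combinations


def weighted_statements(outgroup, ingroup):
    """Give each statement of one component the weight 2 / (taxa scored 1)."""
    weight = Fraction(2, len(ingroup))
    return {f"{o}({a}{b})": weight for o in outgroup for a, b in combinations(ingroup, 2)}


totals = defaultdict(Fraction)  # Fraction() is zero, so missing keys start at 0
for outgroup, ingroup in [("A", "BCDE"), ("AB", "CDE"), ("ABC", "DE")]:
    for statement, weight in weighted_statements(outgroup, ingroup).items():
        totals[statement] += weight

for statement in sorted(totals):
    print(statement, f"{float(totals[statement]):.3f}")
print("total", sum(totals.values()))
```

With exact arithmetic the grand total is exactly 10. The figure of 10.002 in Table 3 arises from rounding 2/3 to 0.667 before summing.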

Multistate Coding

Seen from the perspective of three-item analysis, the multistate character, A(B(C(DE))), is equivalent to a suite of unique three-taxon statements with no statement appearing more than once (Nelson and Ladiges, 1992a, 1992b). The cladogram A(B(C(DE))), expressed as a multistate character, yields 10 unique statements, with either uniform or fractional weighting (Table 2).
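The 10 unique statements can be read directly from the cladogram by collecting each statement once, however many nodes imply it. The nested-tuple cladogram and the function names in this sketch are illustrative assumptions.

```python
from itertools import combinations


def leaves(tree):
    """Return the set of terminal taxa below a node (a string or a nested tuple)."""
    if isinstance(tree, str):
        return {tree}
    return set().union(*(leaves(child) for child in tree))


def clusters(tree):
    """List the taxon sets of all internal nodes, root included."""
    if isinstance(tree, str):
        return []
    found = [leaves(tree)]
    for child in tree:
        found.extend(clusters(child))
    return found


def statements_from_cladogram(tree):
    """Collect every three-item statement implied by the tree, each counted once."""
    all_taxa = leaves(tree)
    result = set()
    for cluster in clusters(tree):
        for outgroup in all_taxa - cluster:  # empty for the root cluster
            for a, b in combinations(sorted(cluster), 2):
                result.add(f"{outgroup}({a}{b})")
    return sorted(result)


statements = statements_from_cladogram(("A", ("B", ("C", ("D", "E")))))
print(len(statements), statements)
```

Using a set is what distinguishes this tally from the component tally: a statement implied by several nodes is still entered only once.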

Purvis Coding

When the cladogram A(B(C(DE))) is coded according to Purvis's method, the three modified binary characters obtained, A(BCDE), B(CDE), and C(DE) (Table 1), yield a total of 10 statements (Table 4). Coding in this way treats the cladogram A(B(C(DE))) as if it really were a multistate character (compare columns Purvis coding UW with multistate in Table 4).

TABLE 4. Comparison of numbers of three-item statements derived from the cladogram (= multistate character) A(B(C(DE))). Standard coding = component coding = binary coding. Weighting is either uniform (UW, all statements have equal weight) or fractional (FW, some statements have fractional weights).

Three-item    Standard coding       Purvis coding
statement     UW        FW          UW        FW          Multistate
A(BC)         1.000     0.500       1.000     0.500       1.000
A(BD)         1.000     0.500       1.000     0.500       1.000
A(BE)         1.000     0.500       1.000     0.500       1.000
A(CD)         2.000     1.167       1.000     0.500       1.000
A(CE)         2.000     1.167       1.000     0.500       1.000
A(DE)         3.000     2.167       1.000     0.500       1.000
B(CD)         1.000     0.667       1.000     0.667       1.000
B(CE)         1.000     0.667       1.000     0.667       1.000
B(DE)         2.000     1.667       1.000     0.667       1.000
C(DE)         1.000     1.000       1.000     1.000       1.000
Total         15.000    ≈10.000     10.000    ≈6.000      10.000

Comparison

There is a real difference (in terms of information content) between a single multistate character and its equivalent binary components despite the fact that they all yield the same result (Nelson, 1993; Williams and Siebert, 2000). Multistate characters are presumed to have dependent states, whereas binary characters are presumed to have independent states (Nelson, 1993:262-263).

Purvis coding is somewhat different. Of the 10 statements derived from the three truncated binary components, some are logically dependent and require correction. For instance, three statements are derived from B(CDE): B(CD), B(CE), and B(DE), any pair of which will summarize the original component, e.g., B(CD) + B(CE) = B(CDE). Hence, each statement should be given an appropriate value, which in this case is 2/3. Application of fractional weighting to the entire array of 10 statements yields an overall total weight of approximately 6 (Table 4). This reduced total weight is deficient, in part because of the missing values added for each component but also because of a critical loss of informative statements.
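The Purvis totals in Table 4 follow from the same arithmetic once taxa scored ? are left out of every statement. The sketch assumes the 2/k weighting used for the standard components (k = taxa scored 1), and the list `purvis_columns` is a hand-written encoding of the Purvis columns of Table 1.

```python
from fractions import Fraction
from itertools import combinations

# Each Purvis column as (taxa scored 0, taxa scored 1); taxa scored ? are omitted.
purvis_columns = [("A", "BCDE"), ("B", "CDE"), ("C", "DE")]

uniform_total = 0
fractional_total = Fraction(0)
for zeros, ones in purvis_columns:
    statements = [(o, pair) for o in zeros for pair in combinations(ones, 2)]
    uniform_total += len(statements)
    fractional_total += len(statements) * Fraction(2, len(ones))

print(uniform_total, fractional_total)
```

The three columns contribute 6, 3, and 1 statements under uniform weighting and 3, 2, and 1 units of weight under fractional weighting, which is the source of the totals 10 and 6.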

At first sight, one might conclude that Purvis's method of coding is more accurate than its three-item equivalent because it corrects for redundancy without any weighting. However, it remains inaccurate because the recovered information is significantly below that needed when compared with either the three separate components or the single multistate character. In effect, Purvis's coding method implies that there is missing information relative to the basal taxon, when the reverse is actually true.

Three-item coding exposes the differences between the various coding schemes that have been suggested. The example does not, however, suggest which approach may be most efficient. For consensus methods to be efficient, the components are best treated as independent, a view that corresponds to that of Ronquist (1996). But the use of binary components does indeed overweight, a view that corresponds to that of Purvis (1995). Three-item coding, corrected for redundancy, may satisfy both Ronquist and Purvis.

SOME EXAMPLES

Analysis of the pairs of cladograms in Table 5 reveals some differences. Example 1 provides a reference point by presenting the results for the analysis of two identical cladograms. Using binary consensus (standard parsimony), examples 2-4 each yield three cladograms with no overall solution. In each suite of three cladograms, two are repetitions of the original (fundamental) cladograms and the third is a unique solution. It might seem odd that parsimony analysis of two conflicting cladograms will produce both as possible solutions (Nelson et al., 2003). For Purvis coding, examples 2 and 4 yield only one cladogram, which is a repetition of one of the original pair, and example 3 yields three solutions identical to those for binary consensus. Using three-item consensus, examples 2-4 yield unique cladograms identical to the unique cladogram from the binary consensus solutions.

TABLE 5. Analyses of pairs of cladograms using binary consensus (standard parsimony), Purvis coding, and three-item coding, with fractional weighting.

Pairs of cladograms       Standard parsimony    Purvis coding    Three-item coding
1. A(B(CD)) + A(B(CD))    A(B(CD))              A(B(CD))         A(B(CD))
2. A(B(CD)) + D(A(BC))    A(B(CD))              A(B(CD))         A(D(BC))
                          D(A(BC))
                          A(D(BC))
3. A(B(CD)) + C(D(AB))    A(B(CD))              A(B(CD))         (AB)(CD)
                          C(D(AB))              C(D(AB))
                          (AB)(CD)              (AB)(CD)
4. A(B(CD)) + B(C(AD))    A(B(CD))              B(C(AD))         B(A(CD))
                          B(C(AD))
                          B(A(CD))
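One way to see why the three-item result for example 2 favors A(D(BC)) is to score candidate trees by the summed fractional weight of the source statements they display. This scoring is an assumption adopted for the sketch, a stand-in for the parsimony analysis reported in Table 5 and not the procedure the authors ran. The 2/k weighting (k = taxa in a component) and all function names are likewise hypothetical.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import combinations


def leaves(tree):
    """Return the set of terminal taxa below a node (a string or a nested tuple)."""
    if isinstance(tree, str):
        return {tree}
    return set().union(*(leaves(child) for child in tree))


def weighted_statements(tree):
    """Map (outgroup, (a, b)) statements to summed weights, component by component."""
    all_taxa = leaves(tree)
    weights = defaultdict(Fraction)

    def walk(node, is_root):
        if isinstance(node, str):
            return
        if not is_root:
            inside = leaves(node)
            weight = Fraction(2, len(inside))  # assumed fractional weight
            for outgroup in all_taxa - inside:
                for pair in combinations(sorted(inside), 2):
                    weights[(outgroup, pair)] += weight
        for child in node:
            walk(child, False)

    walk(tree, True)
    return dict(weights)


def score(candidate, sources):
    """Sum the source weights of every statement the candidate tree also displays."""
    displayed = set(weighted_statements(candidate))
    return sum(
        weight
        for source in sources
        for statement, weight in weighted_statements(source).items()
        if statement in displayed
    )


# Example 2 of Table 5: A(B(CD)) + D(A(BC)), scored against the three trees
# listed under standard parsimony.
sources = [("A", ("B", ("C", "D"))), ("D", ("A", ("B", "C")))]
candidates = sources + [("A", ("D", ("B", "C")))]
for tree in candidates:
    print(tree, score(tree, sources))
best = max(candidates, key=lambda tree: score(tree, sources))
```

Under these assumptions A(D(BC)) scores 17/3, against 5 for A(B(CD)) and 14/3 for D(A(BC)), which agrees with the three-item column of Table 5 for this example. Only the listed candidates are scored; the sketch does not search all possible trees.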

DISCUSSION

Under the assumption that repetition of the original fundamental cladograms in the result merely restates the problem (as if 2 + 2 = 2 + 2 is an informative solution), their removal seems to indicate that both binary and three-item consensus produce identical results. Purvis coding produces only one unique result (example 3, Table 6). Two factors seem to play a part in obtaining the results: (1) for binary consensus, there is too much information in the coded data and the consensus trees obtained are thus ambiguous; and (2) for Purvis coding, there is inappropriate information and results are consequently deficient. Furthermore, and perhaps more importantly, both binary consensus and Purvis coding are affected by the vagaries of optimization, as implemented in current parsimony programs. Because optimization is a feature of parsimony programs, poorly fitting characters (= components) result in at least some homoplasy, which always remains difficult to explain.

Slowinski and Page (1999:818; cf. Page, 1990:125) offered a similar criticism of binary consensus: "homoplasy in this context has no obvious biological meaning. When an extra step occurs for a character (= gene tree [in our example a fundamental cladogram sensu Nelson, 1979]) it is not clear just what that extra step means."

It might seem odd that although Page leveled this criticism at parsimony approaches to biogeography some years ago (Page, 1990), it has had little effect (Brooks, 1996:30). The three-item approach to coding cladograms deals directly with relationships, with the resulting tree maximizing those relationships. Homoplasy is not an issue; the resulting trees indicate which relationships are preserved in the final tree(s). Three-item analysis dispenses with optimization altogether. The relevance, biological or otherwise, of each statement reflects the most fundamental and significant aspect of systematics: A is more closely related to B than either is to C.

TABLE 6. Analyses of pairs of cladograms using binary consensus (standard parsimony), Purvis coding, and three-item coding with fractional weighting (FW) with duplicated results removed.

Pairs of cladograms       Standard parsimony    Purvis coding    Three-item coding
1. A(B(CD)) + A(B(CD))    A(B(CD))              A(B(CD))         A(B(CD))
2. A(B(CD)) + D(A(BC))    A(D(BC))                               A(D(BC))
3. A(B(CD)) + C(D(AB))    (AB)(CD)              (AB)(CD)         (AB)(CD)
4. A(B(CD)) + B(C(AD))    B(A(CD))                               B(A(CD))

TABLE 7. Results from consensus methods applied to the examples in Table 5. Consensus trees were derived using COMPONENT 2.0 (Page, 1993).

Pairs of cladograms       Strict      Semistrict    Nelson      Adams
1. A(B(CD)) + A(B(CD))    A(B(CD))    A(B(CD))      A(B(CD))    A(B(CD))
2. A(B(CD)) + D(A(BC))    ABCD        ABCD          ABCD        (AD(BC))
3. A(B(CD)) + C(D(AB))    ABCD        ABCD          ABCD        (AC(BD))
4. A(B(CD)) + B(C(AD))    ABCD        ABCD          ABCD        (AB(CD))

Slowinski and Page (1999:818) offered two general criticisms of consensus methods. Their first criticism was that "consensus methods cannot easily accommodate differing sets of terminal entities." This issue relates in part to the initial coding procedures and is somewhat similar to coding terminal taxa that occur in several areas in a biogeographic analysis. Examples, in the context of biogeography, were presented by Nelson and Ladiges (1996: tables 1-6). A general solution to the redundancy of areas was addressed by considering geographic paralogy (Nelson and Ladiges, 1996). Such an approach might usefully be extended to the analysis of gene trees as well as consensus.

The second criticism of Slowinski and Page (1999:818) was that "many consensus trees are too conservative." All commonly employed consensus methods yield uninformative results, with the exception of the Adams tree, which recovers a single node consensus (Table 7). We do not dispute Slowinski and Page's contention that consensus trees are conservative; they do appear to be so (Table 7). The core of the issue has more to do with the kind of information used for achieving results, i.e., the kind of information captured by differing coding protocols. The common aspect of all consensus tree methods is that they rely on components in one form or another, and components relate to binary coding methods. We see the issue not as a choice between different kinds of methods (cf. Pisani and Wilkinson, 2002) but as a choice between binary coding and three-item coding for the original cladograms. If sensitivity is a major issue, findings so far indicate that three-item coding is indeed more appropriate (Nelson, 1996; Platnick et al., 1996; Williams, 2002).
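The dependence of consensus methods on components can be illustrated with the strict consensus, taken here as the set of components shared by every input cladogram. The sketch is not a reimplementation of COMPONENT 2.0, and it does not cover the semistrict, Nelson, or Adams methods. The function names are assumptions.

```python
def leaves(tree):
    """Return the set of terminal taxa below a node (a string or a nested tuple)."""
    if isinstance(tree, str):
        return {tree}
    return set().union(*(leaves(child) for child in tree))


def clusters(tree):
    """Return the taxon sets of all internal nodes below the root."""
    found = set()

    def walk(node, is_root):
        if isinstance(node, str):
            return
        if not is_root:
            found.add(frozenset(leaves(node)))
        for child in node:
            walk(child, False)

    walk(tree, True)
    return found


def strict_consensus_clusters(trees):
    """Components present in every tree; an empty set means an unresolved bush."""
    shared = clusters(trees[0])
    for tree in trees[1:]:
        shared &= clusters(tree)
    return shared


# The four pairs of cladograms from Table 5.
reference = ("A", ("B", ("C", "D")))
partners = [
    ("A", ("B", ("C", "D"))),
    ("D", ("A", ("B", "C"))),
    ("C", ("D", ("A", "B"))),
    ("B", ("C", ("A", "D"))),
]
results = [strict_consensus_clusters([reference, partner]) for partner in partners]
for number, shared in enumerate(results, start=1):
    print(number, sorted("".join(sorted(c)) for c in shared) or "ABCD (unresolved)")
```

For examples 2-4 no component is shared, so the strict consensus collapses to the unresolved ABCD of Table 7, even though the two cladograms in each pair agree on some three-item statements.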

ENVOI

Bininda-Emonds and Bryant (1998:498) wrote, "We have referred to the coded components of source trees as 'matrix elements' rather than as 'characters' because the two are not equivalent (Baum and Ragan, 1993)." From a technical point of view, why are they not equivalent? The varying parameters of redundancy, logical dependency, and total information content have been applied to character evidence (Nelson and Platnick, 1991; Williams, 1996; Williams and Siebert, 2000; Ebach and McNamara, 2002) and geographic evidence (Nelson and Ladiges, 1991a, 1991b) with some measure of success.




Viewing all comparative data from a three-item perspective, rather than a component perspective, may allow greater precision in extracting the information that relates organisms. In any event, the three-item approach has already exposed some vagaries concerning optimization (Nelson, 1996; Platnick et al., 1996; Williams, 2002). The two options are open to empirical investigation.

REFERENCES

BAUM, B. R. 1992. Combining trees as a way of combining data sets for phylogenetic inference, and the desirability of combining gene trees. Taxon 41:3-10.

BAUM, B. R., AND M. A. RAGAN. 1993. Reply to A. G. Rodrigo's 'A comment on Baum's method for combining trees.' Taxon 42:637-640.

BININDA-EMONDS, O. R. P., AND H. N. BRYANT. 1998. Properties of matrix representation with parsimony analysis. Syst. Biol. 47:497-508.

BROOKS, D. R. 1990. Parsimony analysis in historical biogeography and coevolution: Methodological and theoretical update. Syst. Zool. 39:14-30.

BROOKS, D. R. 1996. Explanations of homoplasy at different levels of biological organization. Pages 3-36 in Homoplasy. The recurrence of similarity in evolution (M. J. Sanderson and L. Hufford, eds.). Academic Press, San Diego.

DOYLE, J. J. 1992. Gene trees and species trees: Molecular systematics as one-character taxonomy. Syst. Bot. 17:144-163.

EBACH, M. C., AND K. J. McNAMARA. 2002. A systematic revision of the family Harpetidae (Trilobita). Rec. West. Aust. Mus. 21:235-267.

FARRIS, J. S. 1973. On comparing the shapes of taxonomic trees. Syst. Zool. 22:50-54.

FARRIS, J. S., A. G. KLUGE, AND M. J. ECKHARDT. 1970. A numerical approach to phylogenetic systematics. Syst. Zool. 19:172-191.

GUIGÓ, R., I. MUCHNIK, AND T. F. SMITH. 1996. Reconstruction of ancient molecular phylogeny. Mol. Phylogenet. Evol. 6:189-213.

HUGOT, J.-P., AND J.-F. COSSON. 2000. Constructing general area cladograms by matrix representation with parsimony: A western Palearctic example. Belg. J. Entomol. 2:77-86.

KLUGE, A. G. 1993. Three-taxon transformation in phylogenetic inference: Ambiguity and distortion as regards explanatory power. Cladistics 9:246-259.

NELSON, G. J. 1979. Cladistic analysis and synthesis: Principles and definitions with a historical note on Adanson's "Familles des Plantes" (1763-1764). Syst. Zool. 28:1-21.

NELSON, G. J. 1993. Reply. Cladistics 9:261-265.

NELSON, G. J. 1996. Nullius in Verba. J. Comp. Biol. 1:141-152.

NELSON, G. J., AND P. Y. LADIGES. 1991a. Standard assumptions for biogeographic analysis. Austr. Syst. Bot. 4:41-58.

NELSON, G. J., AND P. Y. LADIGES. 1991b. Three-area statements: Standard assumptions for biogeographic analysis. Syst. Zool. 40:470-485.

NELSON, G. J., AND P. Y. LADIGES. 1992a. Information content and fractional weight of three-taxon statements. Syst. Biol. 41:490-494.

NELSON, G. J., AND P. Y. LADIGES. 1992b. TAX: MSDOS program for cladistics. American Museum of Natural History, New York.

NELSON, G. J., AND P. Y. LADIGES. 1994. Three-item consensus: Empirical test of fractional weighting. Pages 193-209 in Models in phylogeny reconstruction (R. W. Scotland, D. J. Siebert, and D. M. Williams, eds.). Systematics Association Special Volume 52. Clarendon Press, Oxford, U.K.

NELSON, G. J., AND P. Y. LADIGES. 1996. Paralogy in cladistic biogeography and analysis of paralogy-free subtrees. Am. Mus. Novit. 3167:1-58.

NELSON, G. J., AND N. I. PLATNICK. 1980. Multiple branching in cladograms: Two interpretations. Syst. Zool. 29:86-91.

NELSON, G. J., AND N. I. PLATNICK. 1981. Systematics and biogeography: Cladistics and vicariance. Columbia Univ. Press, New York.

NELSON, G. J., AND N. I. PLATNICK. 1991. Three-taxon statements: A more precise use of parsimony? Cladistics 7:351-366.

NELSON, G. J., D. M. WILLIAMS, AND M. C. EBACH. 2003. A question of conflict: Three item and standard parsimony compared. Syst. Biodiv. 2.

PAGE, R. D. M. 1990. Component analysis: A valiant failure? Cladistics 6:119-136.

PAGE, R. D. M. 1993. Component for Windows. Software and manual. Natural History Museum, London.

PISANI, D., AND M. WILKINSON. 2002. Matrix representation with parsimony, taxonomic congruence, and total evidence. Syst. Biol. 51:151-155.

PLATNICK, N. I., C. J. HUMPHRIES, G. J. NELSON, AND D. M. WILLIAMS. 1996. Is Farris optimization perfect? Cladistics 12:243-252.

PURVIS, A. 1995. A modification to Baum and Ragan's method for combining phylogenetic trees. Syst. Biol. 44:251-255.

RAGAN, M. A. 1992a. Matrix representation in reconstructing phylogenetic relationships among the eukaryotes. BioSystems 28:47-55.

RAGAN, M. A. 1992b. Phylogenetic inference based on matrix representation of trees. Mol. Phylogenet. Evol. 1:53-58.

RONQUIST, F. 1996. Matrix representation of trees, redundancy, and weighting. Syst. Biol. 45:247-253.

SANDERSON, M. J., A. PURVIS, AND C. HENZE. 1998. Phylogenetic supertrees: Assembling the trees of life. Trends Ecol. Evol. 13:105-109.

SLOWINSKI, J. B., AND R. D. M. PAGE. 1999. How should species phylogenies be inferred from sequence data? Syst. Biol. 48:814-825.

SNEATH, P. H. A., AND R. R. SOKAL. 1973. Numerical taxonomy: The principles and practice of numerical classification. W. H. Freeman, San Francisco.

SOKAL, R. R., AND P. A. S. SNEATH. 1963. Principles of numerical taxonomy. W. H. Freeman, San Francisco.

WILEY, E. O. 1988. Vicariance biogeography. Annu. Rev. Ecol. Syst. 19:513-542.

WILKINSON, M. 1994. Common cladistic information and its consensus representation: Reduced Adams and reduced cladistic consensus trees and profiles. Syst. Biol. 43:343-368.

WILLIAMS, D. M. 1996. Fossil species of the diatom genus Tetracyclus (Bacillariophyta, 'ellipticus' species group): Morphology, interrelationships and the relevance of ontogeny. Philos. Trans. R. Soc. Lond. Ser. B 351:1759-1782.

WILLIAMS, D. M. 2002. Parsimony and precision. Taxon 51:143-149.

WILLIAMS, D. M., AND D. J. SIEBERT. 2000. Characters, homology and three-item analysis. Pages 183-208 in Homology and systematics (R. W. Scotland and T. Pennington, eds.). Systematics Association Special Volume 58. Taylor and Francis, Philadelphia.

First submitted 11 June 2002; reviews returned 22 September 2002; final acceptance 3 December 2002

Associate Editor: Peter Linder

