We need another data matrix for that purpose, one that contains multi-state characters, i.e., characters with more than two states. We may choose these data from other published studies on the application of character compatibility analysis, for instance the data on the Alcidae from Strauch (1984).
Data Matrix (multi-state) : Alcidae
(Columns represent characters)

                                  1  2  3  4  5  6  7  8  9 10 11
-----------------------------------------------------------------
 1 Pinguinis_impennis          |  2  1  1  2  1  2  1  1  1  1  1
 2 Alca_torda                  |  2  1  1  2  1  1  1  1  1  1  1
 3 Uria                        |  2  1  1  2  1  1  1  1  1  1  1
 4 Uria_aalge                  |  2  1  1  2  1  1  1  1  1  1  1
 5 Alle_alle                   |  2  1  1  2  1  2  1  1  1  1  1
 6 Cepphus_grylle              |  2  1  1  2  2  2  1  1  1  1  1
 7 Cepphus_columba             |  2  1  1  2  2  1  1  1  1  1  1
 8 Brachyramphus_marmoratus    |  2  1  1  2  2  2  1  1  1  1  1
 9 Brachyramphus_brevirostris  |  2  1  1  2  2  2  1  1  1  1  1
10 Endomychura_hypoleucus      |  2  1  1  2  2  2  1  1  1  1  1
11 Synthliboramphus_antiquus   |  2  1  1  2  2  2  1  1  1  1  1
12 Synthliboramphus_wumizusume |  2  1  1  2  2  2  1  1  1  1  1
13 Ptychoramphus_aleuticus     |  2  1  2  1  2  2  1  1  1  1  1
14 Cyclorhynchus_psittacula    |  2  1  2  1  2  2  1  1  2  1  1
15 Aethia_cristatella          |  2  2  2  1  2  2  1  1  2  1  2
16 Aethia_pusilla              |  2  2  2  1  2  2  1  1  2  1  2
17 Aethia_pygmaea              |  2  2  2  1  2  2  1  1  2  1  2
18 Cerorhinca_monocerata       |  1  1  1  1  2  2  2  2  1  2  2
19 Fratercula_arctica          |  1  1  1  1  2  1  2  2  1  2  2
20 Fratercula_corniculata      |  1  1  1  1  2  1  2  2  1  2  2
21 Lunda_cirrhata              |  1  1  1  1  2  1  2  2  1  2  2
22 OutGroup                    |  1  1  1  1  1  1  1  1  1  1  1

                                 12 13 14 15 16 17 18 19 20 21 22
-----------------------------------------------------------------
 1 Pinguinis_impennis          |  2  1  2  2  1  2  2  3  1  1  1
 2 Alca_torda                  |  2  1  2  2  1  2  2  3  1  1  1
 3 Uria                        |  2  1  2  2  1  2  2  3  1  1  1
 4 Uria_aalge                  |  2  1  2  2  1  2  2  3  1  1  1
 5 Alle_alle                   |  1  2  1  2  1  2  1  2  1  1  2
 6 Cepphus_grylle              |  1  1  2  2  1  2  1  2  1  1  2
 7 Cepphus_columba             |  1  1  2  2  1  2  1  2  1  1  2
 8 Brachyramphus_marmoratus    |  2  2  2  2  1  2  1  2  1  1  2
 9 Brachyramphus_brevirostris  |  1  2  2  2  1  2  1  2  1  1  2
10 Endomychura_hypoleucus      |  2  1  2  2  1  2  1  2  2  1  2
11 Synthliboramphus_antiquus   |  2  1  2  2  1  2  1  2  2  1  2
12 Synthliboramphus_wumizusume |  1  1  2  2  1  2  1  2  2  1  2
13 Ptychoramphus_aleuticus     |  2  2  1  1  2  2  1  2  1  1  2
14 Cyclorhynchus_psittacula    |  1  2  1  1  2  2  1  2  1  1  2
15 Aethia_cristatella          |  1  2  1  1  2  2  1  2  1  1  2
16 Aethia_pusilla              |  1  2  1  1  2  2  1  2  1  1  2
17 Aethia_pygmaea              |  1  2  1  1  2  2  1  2  1  1  2
18 Cerorhinca_monocerata       |  2  1  2  1  1  1  1  2  1  1  3
19 Fratercula_arctica          |  2  1  2  1  1  1  1  1  1  2  3
20 Fratercula_corniculata      |  2  1  2  1  1  1  1  1  1  2  3
21 Lunda_cirrhata              |  2  1  2  1  1  1  1  1  1  2  3
22 OutGroup                    |  1  1  1  1  1  1  1  1  1  1  1

                                 23 24 25 26 27 28 29 30 31 32 33
-----------------------------------------------------------------
 1 Pinguinis_impennis          |  2  1  2  1  2  2  1  1  1  3  1
 2 Alca_torda                  |  2  1  2  1  1  2  1  1  1  3  1
 3 Uria                        |  2  1  2  1  1  1  1  1  1  3  1
 4 Uria_aalge                  |  2  1  2  1  1  1  1  1  1  3  1
 5 Alle_alle                   |  2  1  1  1  1  1  1  1  3  2  1
 6 Cepphus_grylle              |  1  1  1  2  1  1  2  2  3  2  1
 7 Cepphus_columba             |  1  1  1  2  2  1  2  2  3  2  1
 8 Brachyramphus_marmoratus    |  1  1  2  2  2  1  2  1  3  3  2
 9 Brachyramphus_brevirostris  |  1  1  2  2  2  1  2  1  3  3  2
10 Endomychura_hypoleucus      |  1  1  1  2  1  1  2  2  2  2  1
11 Synthliboramphus_antiquus   |  1  1  1  2  2  1  2  2  2  2  1
12 Synthliboramphus_wumizusume |  1  1  1  2  2  1  2  2  2  2  1
13 Ptychoramphus_aleuticus     |  1  1  1  2  2  1  2  1  3  1  1
14 Cyclorhynchus_psittacula    |  1  1  1  2  2  1  2  1  3  2  1
15 Aethia_cristatella          |  1  1  1  2  2  1  2  1  3  2  1
16 Aethia_pusilla              |  1  1  1  2  2  1  2  1  3  2  1
17 Aethia_pygmaea              |  1  1  1  2  2  1  2  1  3  2  1
18 Cerorhinca_monocerata       |  1  1  1  2  3  1  2  1  3  1  1
19 Fratercula_arctica          |  1  2  1  2  3  1  2  1  3  1  1
20 Fratercula_corniculata      |  1  2  1  2  3  1  2  1  3  1  1
21 Lunda_cirrhata              |  1  1  1  2  3  1  2  1  3  1  1
22 OutGroup                    |  1  1  1  1  1  1  1  1  1  1  1
Table 3.29: Data matrix for the Alcidae (Aves); after Strauch (1984), except the outgroup.
Strauch (1984) did not include an outgroup in his data matrix, but from the description he presents of the direction and ordering in the characters it is clear that an all-states-one outgroup is used. Most of the characters have only two states (and are thus always ordered). Only 5 characters have three states (# 19, 22, 27, 31, and 32). These latter characters constitute the real test case in our comparison of group- and character compatibility analysis. When ordered, multi-state characters are pairwise compatible if their binary factors are all pairwise compatible. Strauch's character compatibility analysis results in one maximum clique of 23 characters.

In order to make a direct comparison possible, a CAFCA primary analysis was run with option 1 (PMS), with all characters ordered, and directed by choosing taxon # 22 as outgroup. The analysis results in 50 partially monothetic sets for which 8 cladograms can be found. The table below presents the lengths of the cladograms as well as their number of compatible character states.
cladogram:            1   2   3   4   5   6   7   8
length:              67  70  68  71  72  75  73  76
compatible states:   55  56  54  55  51  52  50  51

As was already noted with Meacham's data, longer cladograms tend to have fewer compatible character states. This relation is not monotonic, as is indicated by cladograms # 5 and 8, which both have 51 compatible states but differ by four steps in length. Cladogram # 2 is an exception to this tendency: it is three steps longer than # 1 but has one more state compatible with it.
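The compatibility criterion for ordered multi-state characters stated above can be illustrated with a small sketch. It is not CAFCA's or Strauch's implementation. It assumes that every multi-state character is ordered linearly (1 -> 2 -> 3) with state 1 ancestral, so that the binary factor for state s collects all taxa with state s or higher, and it uses the usual test for directed binary characters: two of them are compatible when their sets of derived taxa are nested or disjoint. The function names are hypothetical; the three columns are transcribed from table 3.29 (taxa 1 to 22).

```python
def binary_factors(states):
    """Additive binary coding of one ordered, directed character.

    Assumption: linear order 1 -> 2 -> ... with state 1 ancestral.
    The factor for state s is the set of taxa (0-based) with state >= s.
    """
    top = max(states)
    return [frozenset(i for i, v in enumerate(states) if v >= s)
            for s in range(2, top + 1)]


def factors_compatible(a, b):
    """Directed binary factors are compatible if nested or disjoint."""
    return a <= b or b <= a or not (a & b)


def characters_compatible(x, y):
    """Two characters are compatible if all their binary factors are."""
    return all(factors_compatible(a, b)
               for a in binary_factors(x)
               for b in binary_factors(y))


# Columns of table 3.29, taxa 1..22 in order.
char7 = [1] * 17 + [2] * 4 + [1]         # state 2 in taxa 18-21
char18 = [2] * 4 + [1] * 18              # state 2 in taxa 1-4
char19 = [3] * 4 + [2] * 14 + [1] * 4    # state 3 in taxa 1-4, state 2 in taxa 5-18

print(characters_compatible(char18, char19))  # True
print(characters_compatible(char7, char19))   # False
```

Under these assumptions character 19 is compatible with character 18, because the taxa with state 2 of character 18 coincide with the taxa with state 3 of character 19. It conflicts with character 7, because the factor "state 2 or higher" of character 19 (taxa 1-18) overlaps the derived set of character 7 (taxa 18-21) in taxon 18 only.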
Within the constraints of the search option there is only one most parsimonious cladogram with 67 steps. This cladogram as well as its list of compatible character states is given in table 3.30. Most striking in the list of compatibilities are the states for the root-node (# 50). In directing and ordering all characters, state 1 in the additive binary coding for the multi-state characters is now present in all taxa, and for that reason also on the root of the cladogram.
Besides state 1 for all characters, there are 22 other states listed. Not all states of the multi-state characters are present in this list: characters 19, 22 and 27 appear with states 1 and 3 but not with state 2, whereas characters 31 and 32 appear with state 1 only (and that only as a result of ordering and directing the characters).
As all (binary expressions of the) states for multi-state characters must be pairwise compatible to make the characters compatible, the characters referred to above must be excluded as incompatible. All in all, the list of characters fully compatible with the cladogram (and therefore also mutually compatible) only counts 19 entries (in contrast to Strauch's 23). Some of the states of the other (incompatible) characters, however, are compatible with the cladogram, and CAFCA has used this information on state-compatibility in the search for cladograms.
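The count of 19 fully compatible characters can be retraced from the list of compatibilities in table 3.30. The sketch below is only an illustration of that bookkeeping, not CAFCA code; the names are hypothetical, and the (character, state) pairs are transcribed from the table with the root node left out. A character counts as fully compatible when every state above state 1 occurs in the list.

```python
# (character, state) pairs for all nodes except the root (table 3.30).
PAIRS = [
    (24, 2), (28, 2), (33, 2), (2, 2), (20, 2), (21, 2),
    (7, 2), (8, 2), (10, 2), (22, 3), (27, 3),
    (9, 2), (18, 2), (19, 3), (3, 2), (16, 2),
    (23, 2), (30, 2), (4, 2), (15, 2), (1, 2), (17, 2),
]
THREE_STATE = {19, 22, 27, 31, 32}  # all other characters are binary


def split_by_compatibility(pairs, three_state):
    """Return (fully compatible, partially compatible) character lists."""
    found = {}
    for char, state in pairs:
        found.setdefault(char, set()).add(state)
    fully, partial = [], []
    for char, states in sorted(found.items()):
        top = 3 if char in three_state else 2
        needed = set(range(2, top + 1))  # every state above state 1
        (fully if needed <= states else partial).append(char)
    return fully, partial


fully, partial = split_by_compatibility(PAIRS, THREE_STATE)
print(len(PAIRS), len(fully), partial)  # 22 19 [19, 22, 27]
```

Characters 31 and 32 do not occur in the list at all, so they end up in neither group.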
AlcidaeB:Cladogram - 1

                  /-- 1
               /--24
               |  \-- 2
            /--31
         /--33 |----- 3
         |  |  |
         |  |  \----- 4
         |  |
         |  \-------- 5
         |
         |     /----- 10
      /--38 |  |
      |  |  /--27---- 11
      |  |  |  |
      |  |--34 \----- 12
      |  |  |
   /--46 |  |-------- 6
   |  |  |  |
   |  |  |  \-------- 7
   |  |  |
/--49 |  |  /-------- 8
|  |  |  \--25
|  |  |     \-------- 9
|  |  |
|  |  |        /----- 15
|  |  |        |
|  |  |     /--26---- 16
|  |  |  /--30 |
|  |  \--32 |  \----- 17
|  |     |  |
|  |     |  \-------- 14
|  |     |
|  |     \----------- 13
|  |
|  |        /-------- 19
|  |     /--23
|  |  /--28 \-------- 20
|  \--29 |
|     |  \----------- 21
|     |
|     \-------------- 18
|
\-------------------- 22 Out

Cladogram-1 : COMPATIBILITIES
-------------------------
Cladon |Character| State
-------------------------
  23   |   24    |   2
  24   |   28    |   2
  25   |   33    |   2
  26   |    2    |   2
  27   |   20    |   2
  28   |   21    |   2
  29   |    7    |   2
       |    8    |   2
       |   10    |   2
       |   22    |   3
       |   27    |   3
  30   |    9    |   2
  31   |   18    |   2
       |   19    |   3
  32   |    3    |   2
       |   16    |   2
  33   |   23    |   2
  34   |   30    |   2
  38   |    4    |   2
       |   15    |   2
  46   |    1    |   2
       |   17    |   2
  50   |    1    |   1
       |    2    |   1
       |    3    |   1
       |    4    |   1
       |    5    |   1
       |    6    |   1
       |    7    |   1
       |    8    |   1
       |    9    |   1
       |   10    |   1
       |   11    |   1
       |   12    |   1
       |   13    |   1
       |   14    |   1
       |   15    |   1
       |   16    |   1
       |   17    |   1
       |   18    |   1
       |   19    |   1
       |   20    |   1
       |   21    |   1
       |   22    |   1
       |   23    |   1
       |   24    |   1
       |   25    |   1
       |   26    |   1
       |   27    |   1
       |   28    |   1
       |   29    |   1
       |   30    |   1
       |   31    |   1
       |   32    |   1
       |   33    |   1
-------------------------
Table 3.30: Most parsimonious cladogram and its compatible character states found by CAFCA for Strauch's Alcidae data.
When we compare the cladogram found by CAFCA with that given by Strauch (1984), we see that CAFCA's cladogram as presented in table 3.30 has all the groupings from Strauch's cladogram but is more resolved. As a consequence, fewer characters are fully compatible with it.

When we run a heuristic search in PAUP to find the most parsimonious cladograms for this data set, we find 11 cladograms with 64 steps (38 steps is the theoretical minimum). The number of character states compatible with these cladograms ranges from 53 to 55 (including the 33 states compatible with the root). None of these cladograms has taxa 18 through 21 as a sistergroup of taxa 1 through 17, as, according to Strauch, should be the case. This grouping is present in CAFCA's most parsimonious cladogram, apparently at the cost of three extra steps.
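The theoretical minimum of 38 steps follows from the numbers of states per character. As a minimal sketch, and assuming that every state of every character actually occurs in the data matrix, a character with k states needs at least k - 1 changes on any cladogram. The names below are hypothetical.

```python
N_CHARACTERS = 33
THREE_STATE = {19, 22, 27, 31, 32}  # the five three-state characters


def minimum_steps(n_characters, three_state):
    """Sum of (number of states - 1) over all characters."""
    return sum((3 if c in three_state else 2) - 1
               for c in range(1, n_characters + 1))


print(minimum_steps(N_CHARACTERS, THREE_STATE))  # 38
```

The 28 binary characters contribute one step each and the five three-state characters two steps each, giving 28 + 10 = 38.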
AlcidaeBtree: Cladogram - 1

                           /-- 15
                           |
                        /--27- 16
                     /--31 |
                  /--34 |  \-- 17
                  |  |  |
                  |  |  \----- 14
                  |  |
               /--38 \-------- 13
               |  |
            /--40 |        /-- 19
            |  |  |     /--25
            |  |  |  /--28 \-- 20
            |  |  \--32 |
         /--44 |     |  \----- 21
      /--45 |  |     |
   /--46 |  |  |     \-------- 18
   |  |  |  |  |
   |  |  |  |  |  /----------- 8
   |  |  |  |  \--24
/--47 |  |  |     \----------- 9
|  |  |  |  |
|  |  |  |  |     /----------- 10
|  |  |  |  |  /--26
|  |  |  |  |--29 \----------- 11
|  |  |  |  |  |
|  |  |  |  |  \-------------- 12
|  |  |  |  |
|  |  |  |  \----------------- 7
|  |  |  |
|  |  |  \-------------------- 6
|  |  |
|  |  \----------------------- 5
|  |
|  |     /-------------------- 1
|  |  /--23
|  |  |  \-------------------- 2
|  \--30
|     |----------------------- 3
|     |
|     \----------------------- 4
|
\----------------------------- 22

Cladogram-1 : COMPATIBILITIES
---------------------------
Cladon | Character | State
---------------------------
  23   |    28     |   2
  24   |    33     |   2
  25   |    24     |   2
  27   |     2     |   2
  28   |    21     |   2
  29   |    20     |   2
  30   |    18     |   2
       |    19     |   3
  31   |     9     |   2
  32   |     7     |   2
       |     8     |   2
       |    10     |   2
       |    22     |   3
       |    27     |   3
  34   |     3     |   2
       |    16     |   2
  45   |     5     |   2
       |    26     |   2
       |    29     |   2
  46   |    22     |   2
       |    31     |   2
---------------------------
Table 3.31: One out of 11 most parsimonious cladograms found by PAUP for Strauch's Alcidae data, and its character compatibilities as listed by CAFCA.
When we analyse one of the 11 most parsimonious cladograms found by PAUP (table 3.31; 64 steps) in CAFCA as a user-tree in order to trace its compatibilities, we find only 18 fully compatible characters. The reason is clear: this cladogram is even more resolved (although not completely) than the ones found by CAFCA. The extra resolution comes at the cost of fully compatible characters, but it makes the cladogram 3 steps shorter (64 instead of 67). It appears that maximising the number of fully compatible characters, or compatible character states for that matter, does not necessarily lead to minimal trees, i.e., is not equivalent to minimising the number of steps (see also dePinna, 1991). Sticking to the maximal set of fully compatible characters can block the route to the most parsimonious cladograms.
The states compatible with the root node are omitted from table 3.31. They are the same as shown earlier in the list of compatibilities, and only relate to state 1 of all the characters, as these are all directed and ordered. Characters 19, 22, and 27 must also be omitted from the list above, because they are incomplete; only one of their three states is compatible with groups in the cladogram. They do not comply with the definition of a compatible character.
We can explore the relationship between character state compatibility and cladogram length in more detail by using PAUP to generate 9470 cladograms (search incomplete) with a length less than or equal to 67 steps (the restricted MPC found by CAFCA has 67 steps). The maximum number of compatible character states remains constant for cladograms in the range of 64-67 steps (figure 3.6).
Figure 3.6: The relation between cladogram length (x-axis) and number of compatible character states as expressed by the CCSI (y-axis) in Strauch's Alcidae data.