
Axiomatic Panbiogeography

offers an application of incidence geometry to historical biogeography by defining collection localities as points, tracks as lines and generalized tracks as planes.

Multimodel Selection and the Croizat Method as Information-Theoretic Panbiogeography

It looks as though there really is a non-partisan, that is, non-"political", solution. It was hard to know how to make sense of Croizat's sour and forlorn attitude toward those who placed him on the "lunatic fringe", or who asserted that, without an adequate and reliable null test, the claimed output of his "methodology" (the tracks themselves) was not really statistical but rather a subjective artifact of his a priori approach. The rise of the information-theoretic approach (in wildlife management situations), which differentiates itself from Bayesian techniques, seems to provide a resolution to the difference of opinion between, say, Patterson and Croizat. Recent panbiogeographic consideration of congruence, brought on by the Martitrack algorithm, may be addressed by multiple model selection of generalized tracks.

Thus I will conclude that Croizat's contributions have been eschewed not for simple sociological differences (continental vs anglophonic philosophy, for instance: "Croizat's major works represented the development of a long line of evolutionary thought originating in continental Europe," Carmine Colacino and John Grehan), nor because of Croizat's use of paraphyla, but rather because the difference between frequentists and Bayesians does not present the full gamut ("verisimilitude" (of Kant)) of statistics applied to evolution. It is not possible to logically separate, empirically, an original event from full model truth, even though these are different.

The difference in view between, say, Patterson and Croizat might be thought of as one where Croizat was referring to f (full truth), with his own thoughts on the parameters, while Patterson realized that science really uses g (an approximating model, given the data). That conversation should have been about whether Croizat's records and data enabled him to express a panbiogeographic model g close to f, and about what quantitative statistical support there was for that model of tracks, nodes and so on.

The development of an anti-node notion means that this functionality could be rigorously sought. Here I try to show how to set up generalized track models a priori, from various topological Croizat-method ontologies, in a way that permits model averaging. This avoids the need to perfectly specify a definition of congruence, yet still provides for changing knowledge of generalized tracks and the Croizat method as new data sets are added to the synthesis of space, time and form. Further, the information-theoretic statistics are enriched as the panbiogeographic progression of acceptable models informs larger and larger sets of phylogenetic trees for systematic use (given a common historical geology). Thus the simple difference between TRUTH and model reality is actually more diverse than is currently the practice in mark-recapture analyses.

Here we show how panbiogeography may develop from its meager beginnings into a full-blown historical biogeography that evolves our understanding of space, time and form, by casting it as a form of information-theoretic statistics: the creation and re-creation of multiple-model inference sets, following my personal model evolution:

Eurycea --> Salamanders --> Salamanders & Fish --> Metacommunities with others

This progression is used to address both the relevant information-theoretic concepts and how other panbiogeographers' work might be merged into a whole and tested against further molecular and systematic multiple modeling.

Multimodel Inference in Panbiogeography

There is a suggestion to extend the program of AIC-type model construction beyond mark-recapture into a "void" around phylogeography.

Information-theoretic methods are finding increasing use where designed experiments are not possible (J. Johnson and K. Omland, "Model selection in ecology and evolution", Trends in Ecology and Evolution Vol. 19 No. 2, Feb 2004).

Millington and Perry have presented the general case for MMI in biogeography. I have no doubt that panbiogeographic inference of this form is better done with AIC than in a Bayesian way, since we do not know the location of the origin of life, and the discipline is going to begin, like it or not, from Earth's data first. The truth that Croizat enunciated was new, and that was difficult for others to realize. Now we can understand other, non-panbiogeographic biogeographic approaches as some kind of model with a loss of this information from the Croizat method. The true model, however, is never going to be in the set, so constructing Bayesian priors will always be open to question.


Millington and Perry (2011). Multi-model inference in biogeography. Geography Compass 5(7): 448–463. doi:10.1111/j.1749-8198.2011.00433.x

Can the panbiogeographic generalized track provide geographic closure in mark-recapture syntheses? The differences of opinion over generalized track construction can simply be cognized as different models once one relates the individual tracks to the parameterization of the general track model. This way one can avoid my pure-math approach (axiomatics based on some kind of logic, which is why I called it "terminological" panbiogeography, like that below)

and instead bring the "conflict resolution" of Anderson et al. into track construction and analysis. It was quite a relief to recognize that this approach fully supports the argument in Heads, Craw and Grehan (Panbiogeography: Tracking the History of Life)

that panbiogeography is a "progressive research program". So many people tried to argue that it cannot be one, but none of those criticisms takes the perspective presented by Anderson et al. It only remains to establish a "definitive" panbiogeographic data set. If I can get my Urodelan application to work within Heads's primate structure, such records might be able to become such a data set. "Terms colinear" become expressed as linear combinations of component parameters and do not depend on the specific use of some of Bertrand Russell's concepts (Strecke) that I used.


Here we show how to build those multiple generalized track models with this new methodology by using different numbers of individual track, node and mass parameters.

We develop an AIC criterion for model selection and demonstrate the new methodology by predicting distributions not in the original data set, as well as by projecting sets of compatible taxonomic trees for use in taxonomy.

Constructing a generalized track with more parameters results in a "better fit" (more parameters per area) to the biogeographic record data. Here one might model the apparent geographic trend below with 4 or 13 parameters: simply select a covering size, place the coverings over locations where all individual tracks are represented, and distribute them to maximize the total number of collection records within all of the coverings.
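The covering step can be sketched in code. This is only a minimal illustration under stated assumptions: records are planar (x, y) points, coverings are discs of one fixed radius centered on existing records, and a simple greedy rule stands in for the maximization. The function name `greedy_coverings` and the sample coordinates are hypothetical, and the requirement that all individual tracks be represented under each covering is not enforced here.

```python
import math

def greedy_coverings(records, radius, n_covers):
    """Greedily pick cover centers (taken from the records) that capture the most
    still-uncovered records. Returns (centers, number_of_records_covered)."""
    uncovered = set(range(len(records)))
    centers = []
    for _ in range(n_covers):
        best_center, best_hits = None, set()
        for c in records:
            hits = {i for i in uncovered if math.dist(c, records[i]) <= radius}
            if len(hits) > len(best_hits):
                best_center, best_hits = c, hits
        if best_center is None:  # nothing left to cover
            break
        centers.append(best_center)
        uncovered -= best_hits
    return centers, len(records) - len(uncovered)

# Hypothetical collection records (planar coordinates, arbitrary units)
records = [(0.0, 0.0), (0.5, 0.0), (1.0, 0.0), (10.0, 0.0), (10.5, 0.0), (30.0, 0.0)]
centers, covered = greedy_coverings(records, radius=1.0, n_covers=2)
```

Rerunning with a different `n_covers` (4 versus 13, say) gives the alternative parameterizations that can then be compared as candidate models. Greedy placement is not guaranteed to find the best possible arrangement; it is used here only because it is short.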

Weston wrote, "Croizat regarded generalised tracks as having a statistical basis, their degree of justification being directly related to the number of individual tracks consistent with them."

Here I show a use of the Akaike Information Criterion to develop "degrees of justification" for a particular generalised track model and, via model averaging, a cross-model average of the track parts (which include individual tracks but may also include nodes, anti-nodes, masses and different algebraic cosets associated with a particular directed baseline).
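One way to read "degrees of justification" numerically is through Akaike weights. The sketch below uses the standard formulas AIC = 2k - 2 ln(L) and w_i proportional to exp(-Δ_i / 2); the two candidate models and their log-likelihoods are made-up placeholders, not results from any real track analysis.

```python
import math

def aic(log_likelihood, k):
    """Akaike Information Criterion: AIC = 2k - 2 ln(L)."""
    return 2 * k - 2 * log_likelihood

def akaike_weights(aic_values):
    """Turn AIC scores into weights that sum to one (relative support per model)."""
    best = min(aic_values)
    rel = [math.exp(-0.5 * (a - best)) for a in aic_values]
    total = sum(rel)
    return [r / total for r in rel]

# Hypothetical candidate generalized-track models: (name, log-likelihood, parameter count)
candidates = [
    ("4-covering track", -120.0, 4),
    ("13-covering track", -112.0, 13),
]
scores = [aic(ll, k) for _, ll, k in candidates]
weights = akaike_weights(scores)
```

In this toy case the 13-parameter model fits better (higher log-likelihood) but is penalized for its extra parameters, so the 4-parameter model ends up with the larger weight.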

Model selection leads to generalized track statements, and model averaging of individual parameters can provide a statistical means to predict missing data in other phyla not used to create the generalized track. This is really exciting!!

In this sense tracks are eminently statistical and meaningful. (Some of the discrepancy about whether Croizat's tracks are statistical stems from the difference between null and multiple-model significance in statistics.) Thus, to be logically valid, the "explicit statistical basis" of Weston need not have a model which is "the truth", as in some Bayesian applications which might be tried here in expanding on Page's use of minimal spanning trees for constructing individual tracks. The statistics will be developed so that the different probabilities associated with different MST individual tracks, nodes, masses and coset baseline parts sum to one, both for the historical biogeography and for the total taxonomic or clade node information derivable.
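For readers unfamiliar with the minimal spanning tree construction of an individual track, here is a small sketch using Prim's algorithm. It assumes planar coordinates and straight-line distance; a real analysis would more likely use great-circle distances between localities. The function name and the four sample localities are hypothetical.

```python
import math

def minimum_spanning_track(localities):
    """Prim's algorithm: return the MST edges (i, j, length) linking all localities."""
    n = len(localities)
    if n < 2:
        return []
    in_tree = {0}
    edges = []
    while len(in_tree) < n:
        best = None
        # Find the shortest edge from the current tree to a locality outside it
        for i in in_tree:
            for j in range(n):
                if j in in_tree:
                    continue
                d = math.dist(localities[i], localities[j])
                if best is None or d < best[2]:
                    best = (i, j, d)
        edges.append(best)
        in_tree.add(best[1])
    return edges

# Hypothetical collection localities for one taxon
localities = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (5.0, 1.0)]
track = minimum_spanning_track(localities)
track_length = sum(d for _, _, d in track)
```

Each alternative individual track (for example, one built with a different distance measure) can then enter the model set as its own candidate part.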

The history of panbiogeographic notions of the node is related to this approach. Nicely, Heads (Molecular Panbiogeography of the Tropics, page 2) wrote, "In contrast, geologists (Chamberlin, 1890, reprinted 1965) and now molecular biologists (Hickerson, 2010) cite the method of "multiple working hypotheses," to explain a given phenomenon.  Accepting a single interpretation as definitive can be counter productive," and Anderson (Model Based Inference in the Life Sciences, page 1) opens with Chamberlin and a "science philosophy" based on multiple working hypotheses.


Robin Craw introduced the notion of null hypothesis testing of tracks with Clique Compatibility.

A solution to the congruence problem is presented, and a route to combination with phylogenetic systematic multi-model truthing is hinted at. We will be able to use the multiple hypotheses of molecular panbiogeography and molecular biology in terms of track parts, thus fusing the nodes of panbiogeography and cladistics, as Nelson realized was happening panbiogeographically.

Through the creation of multiple models it becomes possible to lessen the apparent subjectivity in the initial description of the parameters. I will account for this in the "evolution" of models, which proceeds wholly within panbiogeography and does not depend on systematics at all.

This may appear counterintuitive, since the multiple-model set can only approximate truth from the models it contains. The missing true model, however, must be phylogenetically informative overall. That model is just never the one the panbiogeographer works with.

While the true model need not be in the set, evolution as a fact specifies that there is some true model. How well the forces implied by the multiple models fit the substance of the truth, however, can be bettered by improved inference given the initial set.

For instance, further tests of Heads's molecular panbiogeography of the tropics will result in global systematic changes that may or may not be able to evolve as he proceeds to do the same for Australia.

So if the initial set is well ordered beyond an organon (threshold), it may work recursively and objectively to attain the true substance that gave rise to the various sets of forces derivable. A very poor, initially subjective set, however, will never be able to do this.

It is suggested that, with respect to salamander speciation in the eastern USA, the initial set example provided is sufficient to asymptotically approach the true evolution of the salamanders over increased research time.

As better and better approaches to reality are constructed (through multimodel inference to systematic tree alternatives after parameter testing relative to geology), the degree of apparent subjectivity can thus be lessened.

Thus we address the question raised: "Is the union of two individual tracks to be a generalized track?"

We show that other parameters (nodes, masses and baselines) shape the fit of the union of individual tracks when the criterion is either historical contingency or metacommunity data output creation under the constraint of geology and systematics.

The GT is both a model approximating a biogeographic reality and a representation of truth. The union of individual tracks can proceed first, or creation of the GT can be first, depending on whether the model inference is within the multiple-model panbiogeographic model set or is to be part of truth (then the GT is first).


What guides parameter alteration by the user is the fit of the total baseline set to all other systematic trees for the same taxa, as well as better and better biogeographic generalized tracks for larger and larger spatial taxa and parameter numbers. Admittedly the project remains subjective to a large extent, since multimodel creation starts with givens that may or may not be true, but the ability to make predictions and get better and better fits over time indicates that panbiogeography is indeed progressive and may even eventually come to be definitive for larger timing suggestions of evolution as a whole.

The multiple model approach allows panbiogeography to develop repeatable, reliable and quantitative criteria to capture congruence in species distributions. Maximum likelihood enables a definition of congruence. We use variance to measure deviance from similarity through nodes and anti-nodes, and masses to influence bias. Model averaging gives rise to different baseline derivations, while parameter averaging enables fits to other multimodel systematic trees.

The parameters used for GT creation are individual tracks it(1), it(2), it(3), …; nodes n(1), n(2), n(3), …; anti-nodes antin(1), antin(2), antin(3), …; and masses m(1), m(2), m(3), …

Baselines are reserved for the truth model.

Confusions over Croizat's use of deductive and inductive are cleared up, and the use of the null hypothesis, as used by Craw with clique compatibility, is advanced. The null cannot be created, because without data on all taxa we cannot be certain of the parameters as they work algebraically through the southern hemisphere.


Model selection vs model averaging – panbiogeography melds them together!!

The models are different sets of claimed tracks, nodes, masses and baselines, which in series present the parameters, and these can be selected to fit the collection locality data as closely as possible. Model selection is motivated by panbiogeography, but incremental averaging provides evidence of the need for a further model closer to the truth. Thus in panbiogeography there is a circuit from model selection to model averaging to better candidate models to better averaging as the increasing amount of usable data is incorporated. This is not the case in other uses, where the hypotheses are not embedded in larger hypotheses. This will be the case in panbiogeography until all of the temporality is investigated across the spaces compassed.

The first assumes that models can be used to assess the importance of a variable in the context of a given process/system by evaluating whether its parameter estimate is different from some specified null value (often zero). This approach emphasises uncertainty in model parameters but ignores uncertainty in the selection of the model itself (e.g., which variables should be selected).

So the Ozark node may be important in two different models, some in which it is connected to the Ouachita node and others in which it is not. When either of these larger models makes some use of the Ozark node (where the mass is to the west or the east, for instance), the node is important nonetheless. As the larger model uncertainty is resolved, refinement of the node differences into track width permits alteration of the structure through progressively better fit and predictability.

The second approach, which has been the subject of much recent discussion in many sub-fields of biogeography and ecology (e.g., Stephens et al., 2007a, Grueber et al., 2011), holds that uncertainty in model selection must not be ignored and is as important as, if not more than, uncertainty in parameter estimates.
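To make the Ozark node example concrete, here is a hedged sketch of how model selection uncertainty can be carried into a statement about a single parameter. It sums Akaike weights over every model containing the parameter (its relative importance) and averages the parameter's estimate over those models. The weights, parameter names and estimates are all hypothetical placeholders.

```python
def parameter_importance(models, name):
    """Relative importance: sum of the Akaike weights of all models containing `name`."""
    return sum(w for w, params in models if name in params)

def model_averaged_estimate(models, name):
    """Average a parameter's estimate over the models containing it,
    using the Akaike weights renormalized over just those models."""
    hits = [(w, params[name]) for w, params in models if name in params]
    total = sum(w for w, _ in hits)
    return sum(w * est for w, est in hits) / total

# Hypothetical model set: (Akaike weight, {parameter: estimate}); weights sum to one
models = [
    (0.5, {"ozark_node": 0.8, "ouachita_node": 0.6}),  # nodes connected
    (0.3, {"ozark_node": 0.6}),                        # Ozark node alone
    (0.2, {"ouachita_node": 0.4}),                     # no Ozark node
]
ozark_importance = parameter_importance(models, "ozark_node")
ozark_estimate = model_averaged_estimate(models, "ozark_node")
```

Here the Ozark node carries 0.8 of the total weight even though no single model is singled out as "the" model, which is the sense in which a node can be important across competing larger models.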

The uncertainty in the choice of one model or the other is due to a lack of clear anti-node data (where the localities are not), but this becomes findable as the models, which receive their support wholly biogeographically, are used to predict missing data in other phyla, which are then in turn used to re-create the initial model with refined parameters.

The reason that one needs multiple working hypotheses in panbiogeography is that there can be differentiation both within and between genera and species. This lies at the heart of the issue with Darwin's use of biogeography and its criticism by Croizat, as used by Nelson and others, over the center of dispersal. Without multiple hypotheses (that is, with essentially one process from the origin), the nested tests of cross-taxon differences, within and without, would not be possible. If there is actually a binary division in cause (which can now be hypothesized given DNA and molecular systematics), then there may be reason to start with multiple origin hypotheses rather than a single one. This does complicate the issue of the origin of life, but that is so far off that it may not be a hindrance to doing some good work in the meantime.

The general panbiogeographic MMI divides the total probability among three angles a, b, c, where the idea of seen or not seen is carried by the third angle c. Each parameter has a decided proportion of each angle's total probability.

T1(a,b,c) + T2(a,b,c) + … + N1(a,b,c) + … = 1. As the model fits improve and the data increase, there will be a shift in the proportion of the c amount toward one. As one moves off Earth and to the origin question, this proportion will be mostly one, unless there is strong mixing of life both on and off Earth.

Given {pan(t1, t2, t3, ..., n1, n2, n3, ..., m1, m2, m3, ...)}, what is the alpha, beta, gamma likelihood?

One track has a lower likelihood than three tracks and two nodes. Try to find the maximum likelihood for all combinations of tracks, nodes and masses.
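The search over all combinations can be sketched as a brute-force enumeration. Everything in this block is an illustrative assumption rather than the method itself: each track part is reduced to a single point, the likelihood is an equal-weight mixture of 2-D Gaussians centered on the chosen parts with a fixed spread `sigma`, and the parameter count is simply the number of parts. The names `node_A`, `node_B` and `mass_C` and all coordinates are hypothetical.

```python
import math
from itertools import combinations

def log_likelihood(records, parts, sigma=1.0):
    """Log-likelihood of the records under an equal-weight mixture of
    2-D Gaussians centered on the chosen parts (toy stand-in model)."""
    total = 0.0
    for r in records:
        dens = sum(
            math.exp(-math.dist(r, p) ** 2 / (2 * sigma ** 2)) / (2 * math.pi * sigma ** 2)
            for p in parts
        ) / len(parts)
        total += math.log(dens)
    return total

def rank_models(records, candidate_parts):
    """Score every non-empty subset of candidate parts by AIC, lowest (best) first."""
    ranked = []
    names = sorted(candidate_parts)
    for size in range(1, len(names) + 1):
        for subset in combinations(names, size):
            ll = log_likelihood(records, [candidate_parts[n] for n in subset])
            ranked.append((2 * len(subset) - 2 * ll, subset))
    return sorted(ranked)

# Hypothetical records clustered around two places, and three candidate parts
records = [(0.0, 0.0), (0.2, 0.1), (-0.1, 0.2), (4.0, 0.0), (4.1, -0.2), (3.9, 0.1)]
candidate_parts = {"node_A": (0.0, 0.0), "node_B": (4.0, 0.0), "mass_C": (2.0, 3.0)}
ranking = rank_models(records, candidate_parts)
best_aic, best_parts = ranking[0]
```

With these toy data the two-node model is ranked first: a single node cannot account for both clusters, while adding the third part costs a parameter without improving the fit. The number of subsets doubles with each added candidate part, so a full enumeration is only practical for small sets.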


What is extremely interesting is that one can actually do the Chamberlin idea but within the Croizat one, namely one can do biogeography first and foremost and THEN taxonomy.  One creates multiple working Panbiogeographic hypotheses (BEFORE LOOKING AT PHYLOGENETIC DATA) and then uses these to work out evolutionary relationships systematically.  The inferences from biogeography are used to support different Panbiogeographic concepts for incorporation later in systematics.  Thus one need not know the phyla relations but only the models.  Some differences in Panbiogeographic models can support the same phylogenetic trees but some of the models imply different trees.

The complex relationship between biogeography and systematics really requires that one use MMI.


Computing Panbiogeographic Model Likelihoods


Clearly the 3-track, 2-node model has a higher likelihood than the one-track model: it has more parameters and fits the data better. Here the two models have no parameters in common.

These models do have parameters in common.

So model averaging may focus down onto one parameter, say a node, which shows that it can be used to find anti-nodes in other parts of the range; this causes one to eliminate other nodes and use a single mass, and thus fit a larger systematic tree.


With AIC rather than Bayesian methods, one only needs to know that a true evolution exists, not that the evolutionary sequence is actually one of the models. This is a way around the wrongful creation-evolution controversy. One could still say there is no evolution, since we do not yet have the origin worked into the model structure, but as the model predictivity increases it will be hard to find a non-evolution truth for the same.

As more and more phyla are incorporated and more and more c values occur, the AIC number will go up for panbiogeography as a whole.

One does not want to use Bayesian methods in panbiogeography because one wants to keep biogeography and systematics independent. Integration across all variables and models would not permit this, and can only be used by evolutionary biology if and when it has all of the Earth's distributions well understood. There may be time for this, but it is not going to be prior to the collection of enough data to bind the species and gene trees together. And that will depend on whether or not there is indeed a binary underlying theory to the code's expressivity and evolvabilities.


“So how do we put all of these criteria together to select the ‘best’ model?  Unfortunately, this is where it get a bit more difficult.  As of yet I have not encountered a metric that combines fit, predictivity, and parsimony into one optimisable statistic.  Some of the metrics described above already attempt to combine two of the above criteria.  The information criteria metrics such as AIC, BIC, and DIC already consider fit and parsimony jointly.  Stone (1977) shows that model selection by AIC and leave-one-out cross validation are asymptotically equivalent so this suggests that there is a link between predictivity and the joint consideration of fit and parsimony.”

Predictivity comes when anti-nodes can be predicted by the model (places where data are not expected), as well as places where data are expected in other taxa but missing. Do Desmogs have other Ozark nodes? Does fish distribution show that salamanders in North Carolina are not connected so far? Is the crawfish anti-node part of the Eurycea data set?

Likelihoods for mixed continuous–discrete distributions

The above can be extended in a simple way to allow consideration of distributions which contain both discrete and continuous components. Suppose that the distribution consists of a number of discrete probability masses p_k(θ) (masses) and a density f(x | θ) (tracks, nodes, anti-nodes), where the sum of all the p's added to the integral of f is always one. Assuming that it is possible to distinguish an observation corresponding to one of the discrete probability masses from one which corresponds to the density component, the likelihood function for an observation from the continuous component can be dealt with as above by setting the interval length short enough to exclude any of the discrete masses. For an observation from the discrete component, the probability can either be written down directly or treated within the above context by saying that the probability of getting an observation in an interval that does contain a discrete component (of being in interval j (which is a geographic distance) which contains discrete component k) is approximately

\mathcal{L}_\text{approx}(\theta \mid x \text{ in interval } j \text{ containing discrete mass } k) = p_k(\theta) + f(x_{*} \mid \theta) \Delta_j,

where x_{*} can be any point in interval j. Then, on considering the lengths of the intervals to decrease to zero, the likelihood function for an observation from the discrete component is

\mathcal{L}(\theta \mid x) = p_k(\theta),

where k is the index of the discrete probability mass corresponding to observation x.

The fact that the likelihood function can be defined in a way that includes contributions that are not commensurate (the density and the probability mass) arises from the way in which the likelihood function is defined up to a constant of proportionality, where this "constant" can change with the observation x, but not with the parameter θ.
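The mixed likelihood described above can be sketched in a few lines. This is a minimal illustration under stated assumptions: the continuous part (tracks, nodes, anti-nodes) is taken to be a one-dimensional Gaussian density scaled to carry the probability not assigned to the masses, and an observation counts as coming from a discrete mass only when its position matches a mass location exactly. The function name and every number are hypothetical.

```python
import math

def mixed_likelihood(x, masses, density_weight, mu, sigma):
    """Likelihood of one observation under discrete masses plus a Gaussian density.

    masses: dict mapping a mass location to its probability p_k
    density_weight: total probability carried by the continuous component
    """
    if x in masses:
        return masses[x]  # discrete component: L(theta | x) = p_k(theta)
    z = (x - mu) / sigma
    # continuous component: scaled density f(x | theta)
    return density_weight * math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

# Hypothetical parameters: two masses along a baseline distance, rest in the density
masses = {0.0: 0.2, 10.0: 0.1}
density_weight = 1.0 - sum(masses.values())  # the p's plus the integral of f sum to one
at_mass = mixed_likelihood(0.0, masses, density_weight, mu=5.0, sigma=2.0)
on_track = mixed_likelihood(5.0, masses, density_weight, mu=5.0, sigma=2.0)
```

The two returned values are not commensurate (one is a probability, the other a density), which is acceptable for likelihood comparisons across parameter values for the reason given in the paragraph above.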

So one could develop a frequentist or a classical approach. In the classical approach the Croizat method is presumed to rely on a natural symmetry from track to node to mass to baseline per dendrogram derived. In the frequentist approach the anti-node can subtly alter the truth of any symmetry by interacting between the discrete and continuous differences (masses vs track-nodes per baseline). Denial of the existence of anything one can call the Croizat method is simply a denial that such a symmetry can be cognized or was found.

Croizat's panbiogeography is an advance over Darwin's understanding of geographic distributions relative to evolution, because Darwin was not able to separate discrete from continuous contributions in modification by descent from those dynamically formed through natural selection. Fisher worked out one half of this problem by focusing on the discrete (Mendelian) genetic part, but did not include (in principle) any discrete phenotype. Wright's approach, although not explicit about non-continuous phenotypes relative to the mechanics of natural selection, was able via verisimilitude (in the inverse probability possibility) to allow for the possibility of unobserved variables that affect the interaction of mutation and immigration with selection.




  • Panbiogeography depicts distribution patterns by reducing apparent localization complexity. Mapping many species distributions on top of each other results in visually dense images that are really pretty useless for the human observer to dissect in meaningful ways.

    One way to analyze such information is to automate algorithms and have a computer sort all of the edges and nodes in some predefined way, but another way is to deduce a generalized track from such data and thus induce a means to search and encounter patterns simply by visual inspection, later verified statistically. The amount of published panbiogeographic material more or less requires that the stronger deductive approach be developed.

    As the process evolves, computer searches and human recoveries merge as a horizon of past knowledge grows.
  • The tension between these two directions was brought out recently by Ferrari, Barao and Simoes (Quantitative panbiogeography: was the congruence problem solved? 2013), who noted that parameter setup was too subjective and congruence too particularly objective. Below I set up a circular practice that can improve congruence increasingly and make the parameter setup more objective (statistical) with increasing knowledge. I offer this as a possible solution in the development of quantitative panbiogeography by going beyond the cell or predefined area, as begun with clique analysis, and introduce definite use cases for track width, node shape, mass position and baseline direction. A point set state space is used instead.
  • Millington and Perry (2011) reviewed multi-model inference in biogeography but did not mention panbiogeography. Here I show how to develop that means towards the aforementioned goal of axiomatic panbiogeography. Multimodel panbiogeography enables one to use both subjective guesses and objective model averaging in the process of getting better pattern matches over time. Thus there is a role for both the objective and subjective components and for the inductive and deductive modes of investigation. It is then up to debate which ongoing algorithmic creation flows and associated logics are considered the consensus. I make no apology for the increased complexity of the total working protocol as compared to, say, that used in vicariance biogeography or comparative biogeography. As long as blocks are in place we cannot advance historical biogeography into evolutionary theory proper any further now than before Croizat's time.
  • One uses exploratory analyses to hone the biogeographic models, which are then used a priori in comparison with taxonomic nodal models to find an objective atlas of increasing extent. This differs from vicariance biogeography in that there is a synthesis of space rather than an analysis of it. Time, however, is expected to be the same. Thus geographic data are used for hypothesis generation and taxonomic data for hypothesis testing. The cladistic node is thus made to format not merely the panbiogeographic node but other dimensions of fundamental Croizatian concepts. It repairs a rather dogmatic thinking that had linked algebraic structure with genetic differentiation (Robinson on Ethington), noted by ABC in their analysis of the hierarchical problem with phylogenetic tree reconstruction. Vicariance comprehensively applied can divide where only combinations were otherwise designed. In this way multimodel panbiogeography is also an advance beyond comparative biogeography.
  • Axiomatic panbiogeography provides an epistemology from which multi-model formats can be cognized (generalized linear models as generalized tracks). With this multi-model perspective it is easy to see why Croizat's panbiogeography was not as warmly received as it could have been. The truth is really not in the model. This will be apparent as I compare the Bayesian and multinomial/maximum likelihood representations of the multimodel technique.

     There will no doubt be criticism of the ultimate statistical distributions behind the models (in the use of point set state spaces), but as the process continues to integrate more and more taxonomic data (both molecular and traditional) it will be hard to argue from pure a-priori-ness any further than the insights that are to be gainsaid. And since model averaging will enable an increased precision through parameter equalization under increased data incorporation, the subjectivity of initial choices and conditions will appear increasingly less. Bias can decrease with a better understanding of the variances involved. Vicariance can be replaced with directed orthogenesis per form.
  • I start with a multi-model approach to integrating salamander distributions across the Mississippi in the central US, expand this to include fish of similar biogeography (showing how historical biogeography and taxonomy can dovetail), find plants that fit the same model set, and then produce statements about metacommunity dispersal movements, provide geological covariates of further congruent intricacy, and suggest needs for conservation through the same means of landscape effects on possible speciation, all the while testing and generating increasingly robust potential challenges to existing taxonomic relations amongst the polyphyletic groups included in the study. Through null testing of the multinomial metacommunity conclusions we find a role for both the old and new statistical paradigms in the ever complex field of historical biogeography, panbiogeographicalized to the levels and disciplines of ecology, evolution and classification. The use of node-antinode density will improve this direction against bias, but it is dependent on the data and realizes sampling inclusion directly in the process. Also, where evolution is to go (with human-induced climate change, say) can be predicted from the movements of the metacommunities. This is a place for the study of dispersal.
  • In attempting to design and develop a multi-model paradigm for panbiogeography, it appears that the situation where the betas are all orthogonal may be a goal of construction, wherein "Croizat concepts" (track width, node, mass, baseline) remain optimized to maximal orthogonality. This would permit an objective criterion on which generalized tracks could be compounded from individual ones. Interestingly, with this format orthogenesis might be interpretable as the case where vicariant speciation follows spatially the given concept orthogonalities and yet the lineage itself may present stochastic variations nonetheless.

    What this formation envisions is that, for instance, track1beta, track2beta, node1beta and baseline1beta are in a limit such that the different concept predictors do not affect the other predictor betas (through the residual variance). The error in the technique will then simply be due to the concepts themselves not being able to capture the speciation process itself: more non-spatial information (particular genetic information) is involved, other data from other lineages need to be included, or not enough data on the lineages under consideration are included.

    Once a maximally orthogonal set of concepts is developed for a given data set, this spatial path can be used as a probe for search-encounter algorithms, as if it were the generalized track, so as to incorporate spatial covariates which might be used to improve the entire horizon of application.

    If this analysis is correct then one might think that model averaging could be used as a means to find this maximally orthogonal concept organization when it does not appear simply by inspection of the graphed data. Model averaging thus enables one to find, rather automatically and nonsubjectively, parameters of track width, node shape, mass amount and baseline direction (to sustain orthogonality necessarily) per generalized track ("good" model-averaged predictions). These parameters can then be used to write a fully statistical process in which to evaluate the relation of biogeographic distributions to the geographic spatial divisions on which panbiogeography, and also evolution inter se, depends.

    One possible use of these models may be for dissecting space usage of metacommunities (bound by cross sections of the conceptual orthogonalities) impacted by human-caused disruption and/or climate change. So panbiogeography can be used as a means to assess how whole communities (rather than individual species) may respond to global environmental effects.

    The notion of track congruence can be defined through the use of this method of synthesis.
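The orthogonality goal described in the last bullet can be checked with a simple diagnostic. The sketch below treats each Croizat concept as one predictor column and reports the largest absolute pairwise correlation; a value near zero means the predictors are close to orthogonal, so in a linear model each beta would be estimated nearly independently of the others. The column names and values are hypothetical, and pairwise correlation is only one possible measure of orthogonality.

```python
import math

def correlation(xs, ys):
    """Pearson correlation between two equally long predictor columns."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def max_abs_correlation(predictors):
    """Largest absolute pairwise correlation; 0 means mutually orthogonal predictors."""
    names = list(predictors)
    worst = 0.0
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            worst = max(worst, abs(correlation(predictors[a], predictors[b])))
    return worst

# Hypothetical concept predictors coded over four sampled areas
orthogonal = {
    "track_width": [1, -1, 1, -1],
    "node_shape": [1, 1, -1, -1],
    "baseline_direction": [1, -1, -1, 1],
}
# Adding a mass predictor that nearly duplicates track width breaks orthogonality
confounded = dict(orthogonal, mass_amount=[2, -2, 2, -1])
score_orthogonal = max_abs_correlation(orthogonal)
score_confounded = max_abs_correlation(confounded)
```

A candidate set of concepts could be revised until this score stops improving, which is one way to operationalize "maximal orthogonality" before the concepts are used as a search-encounter probe.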