Date of Degree


Document Type


Degree Name





Michael J Hickerson

Committee Members

Ana Carnaval

Frank Burbrink

Andrew Rominger

Brian Tilston Smith

Subject Categories

Biodiversity | Bioinformatics | Computational Biology | Evolution | Genetics


Community Ecology, Population Genetics, Computational Biology


Biodiversity in ecological communities is structured hierarchically across spatial and temporal scales. Many open questions remain as to how this structure accumulates. For example, what are the relative contributions of dispersal versus in situ speciation? How important is stochastic drift relative to deterministic processes? Up to this point, these questions have been investigated by isolated disciplines (e.g., macroecology, comparative phylogeography, macroevolution) using tools and data that tend to focus on only one axis of community-scale data (e.g., phylogenies, relative abundances, and/or trait information). Yet we know that there are feedbacks among processes that respond on short, medium, and long timescales (local changes of abundance, accumulation of population genetic variation, and speciation processes, respectively). Therefore, the focus of my work is: first, to develop a model of the distribution of genetic variation in ecological communities; second, to construct a multi-scale model of the accumulation of biodiversity in ecological communities that jointly models three axes of data that respond on ecological, population genetic, and phylogenetic timescales; and third, to incorporate abiotic variables with community-scale genetic data in a machine learning framework to make predictions about the distribution of genetic variation across the landscape.

First, I will present a modelling approach that merges Hubbell's neutral theory with neutral population genetic theory to construct a joint model of species abundance and genetic diversity. This model simulates joint distributions of abundance and genetic variation assuming both ecological and population genetic neutrality, and captures both equilibrium and non-equilibrium dynamics. These simulations can be used for a variety of applications, including estimating the shape of the abundance distribution using only a sample of community-scale genetic data.
Next, I will present a model that extends the double neutral model to incorporate non-neutral processes (such as ecological interactions) and to introduce a speciation process. The goal of this work is to fully integrate abundance and trait data with phylogenies and population genetic data into a unified framework, with the aim of testing community assembly models and estimating ecological parameters using observed community data. One result of this work is the finding that genetic diversity is distributed more uniformly in ecological communities than abundance. Another critical insight is that community-scale genetic data provide a record of community history on a population-genetic timescale, which can complement ecological information obtained from sampled abundance data and deep-time community history recorded in phylogenies.

Finally, I will describe a machine learning framework that integrates community-scale genetic data and abiotic (climatic/environmental) variables to make predictions about genetic diversity across the landscape. I demonstrate this method using densely sampled abundances and community-scale sequence data collected from 10 decapod crustacean communities distributed throughout the Coral Triangle. The observed distributions of abundance and genetic diversity in these communities largely agree with model predictions, in that abundance distributions showed higher dominance than distributions of genetic diversity. The machine learning inference procedure identified mean annual sea surface temperature and proximity of the sampling site to deep water as key factors contributing to the shape and magnitude of community-scale genetic diversity.

As community-scale genetic data become easier and cheaper to obtain, hierarchical models of biodiversity accumulation that account for feedbacks across timescales become increasingly important for making accurate inferences about community history from these data.
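
As a rough illustration of the joint neutral idea summarized in the abstract, the Python sketch below pairs a zero-sum neutral drift simulation of a local community with a textbook neutral expectation for pairwise genetic diversity. It is not the dissertation's model or software. The function names, the 1/rank metacommunity weights, the `ne_scale` mapping from abundance to effective population size, and all parameter values are assumptions made for this example.

```python
import math
import random
from collections import Counter


def simulate_neutral_community(J=200, S=50, m=0.05, steps=20000, seed=1):
    """Zero-sum neutral drift with immigration (illustrative sketch only).

    J individuals occupy the local community. At each step one individual
    dies and is replaced by an immigrant from the metacommunity (with
    probability m) or by the offspring of a random local individual.
    Returns (abundance per species, time since colonization in generations).
    """
    rng = random.Random(seed)
    species = list(range(S))
    # Assumption: metacommunity relative abundances decay as 1/rank.
    weights = [1.0 / (i + 1) for i in range(S)]
    community = rng.choices(species, weights=weights, k=J)
    counts = Counter(community)
    colonized_at = {sp: 0 for sp in counts}

    for t in range(1, steps + 1):
        dead = rng.randrange(J)
        old = community[dead]
        if rng.random() < m:
            new = rng.choices(species, weights=weights, k=1)[0]
        else:
            # Parent drawn from the whole community (Moran-style update).
            new = community[rng.randrange(J)]
        if new == old:
            continue
        counts[old] -= 1
        if counts[old] == 0:  # local extinction resets colonization history
            del counts[old]
            del colonized_at[old]
        if counts[new] == 0:  # species absent locally: record colonization
            colonized_at[new] = t
        counts[new] += 1
        community[dead] = new

    # Assumption: J death/replacement steps correspond to one generation.
    ages = {sp: (steps - colonized_at[sp]) / J for sp in counts}
    return dict(counts), ages


def expected_pi(abundance, age_generations, mu=1e-8, ne_scale=1000.0):
    """Expected pairwise diversity for a diploid population founded without
    variation: pi(t) = 4*Ne*mu * (1 - exp(-t / (2*Ne))).

    Assumption: Ne is constant and proportional to current abundance,
    a crude simplification used only for this sketch.
    """
    ne = abundance * ne_scale
    theta = 4.0 * ne * mu
    return theta * (1.0 - math.exp(-age_generations / (2.0 * ne)))


if __name__ == "__main__":
    abundances, ages = simulate_neutral_community()
    for sp in sorted(abundances, key=abundances.get, reverse=True)[:5]:
        print(sp, abundances[sp], expected_pi(abundances[sp], ages[sp]))
```

Because genetic diversity here depends on both current abundance and time since colonization, the two distributions need not have the same shape, which is the kind of contrast the abstract describes. The sketch is not calibrated to any data and makes no claim to reproduce the reported results.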

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.