The persistent homology of genealogical networks

ZM Boyd, N Callor, T Gledhill, A Jenkins, R Snellman, B Webb, R Wonnacott
Applied Network Science, 2023, Springer
Abstract
Genealogical networks (i.e., family trees) are of growing interest, with the largest known data sets now including well over one billion individuals. Interest in family history also supports an 8.5 billion dollar industry whose size is projected to double within 7 years [FutureWise report HC-1137]. Yet little mathematical attention has been paid to the complex network properties of genealogical networks, especially at large scales. The structure of genealogical networks is of particular interest due to the practice of forming unions, e.g., marriages, that are typically well outside one’s immediate family. In most other networks, including other social networks, no equivalent restriction exists on the distance at which relationships form. To study the effect this has on genealogical networks, we use persistent homology to identify and compare the structure of 101 genealogical and 31 other social networks. Specifically, we introduce the notion of a network’s persistence curve, which encodes the network’s set of persistence intervals. We find that the persistence curves of genealogical networks have a distinct structure when compared to other social networks. This difference in structure also extends to subnetworks of genealogical and social networks, suggesting that, even with incomplete data, persistent homology can be used to meaningfully analyze genealogical networks. We also describe how concepts from genealogical networks, such as common ancestor cycles, are represented using persistent homology. We expect that persistent homology tools will become increasingly important in genealogical exploration as popular interest in ancestry research continues to expand.
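To make the idea of a common ancestor cycle concrete, the following minimal sketch counts the independent cycles in a toy family network. It is not the paper's pipeline: it assumes the network is an undirected graph with one edge per parent-child link, and it computes only the first Betti number (edges minus vertices plus connected components), whereas persistent homology would also record the scale at which each cycle appears and is filled in. The function name `first_betti_number` and the toy family are hypothetical.

```python
def first_betti_number(vertices, edges):
    """Number of independent cycles in an undirected graph: E - V + C.

    Assumes `edges` holds distinct pairs of distinct vertices.
    """
    parent = {v: v for v in vertices}

    def find(v):
        # Union-find lookup with path halving.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    components = len(parent)
    for a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b
            components -= 1
    return len(edges) - len(vertices) + components


# Hypothetical toy family: G is a grandparent of cousins C1 and C2,
# and K is a child of C1 and C2 (other parents are omitted for brevity).
people = ["G", "P1", "P2", "C1", "C2", "K"]
parent_child = [
    ("G", "P1"), ("G", "P2"),    # G's children
    ("P1", "C1"), ("P2", "C2"),  # G's grandchildren
    ("C1", "K"), ("C2", "K"),    # the cousins' child closes a cycle through G
]

cycles_with_union = first_betti_number(people, parent_child)
cycles_without_union = first_betti_number(people, parent_child[:-1])
print(cycles_with_union, cycles_without_union)
```

In this toy example the single cycle G, P1, C1, K, C2, P2 exists only because K's two parents share the ancestor G; dropping the last parent-child link leaves a tree with no cycles.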