Low historical rates of cuckoldry in a Western European human population traced by Y-chromosome and genealogical data

Comment: Dispels the “cuckoldry myth” that 10% of children are fathered by “someone else.” In fact, Western Europe reveals a pattern of high paternity confidence, paired with male investment in offspring.

Proc Biol Sci.

Larmuseau M. H. D., Vanoverbeke J., Van Geystelen A., Defraene G., Vanderheyden N., Matthys K., Wenseleers T. and Decorte R.

Proc. R. Soc. B, 7 December 2013, Volume 280, Issue 1772, http://doi.org/10.1098/rspb.2013.2400

Abstract

Recent evidence suggests that seeking out extra-pair paternity (EPP) can be a viable alternative reproductive strategy for both males and females in many pair-bonded species, including humans. Accurate data on EPP rates in humans, however, are scant and mostly restricted to extant populations. Here, we provide the first large-scale, unbiased genetic study of historical EPP rates in a Western European human population based on combining Y-chromosomal data to infer genetic patrilineages with genealogical and surname data, which reflect known historical presumed paternity. Using two independent methods, we estimate that over the last few centuries, EPP rates in Flanders (Belgium) were only around 1–2% per generation. This figure is substantially lower than the 8–30% per generation reported in some behavioural studies on historical EPP rates, but comparable with the rates reported by other genetic studies of contemporary Western European populations. These results suggest that human EPP rates have not changed substantially during the last 400 years in Flanders and imply that legal genealogies rarely differ from the biological ones. This result has significant implications for a diverse set of fields, including human population genetics, historical demography, forensic science and human sociobiology.

1. Introduction

Over recent decades, it has been shown that both males and females in many pair-bonding species can seek out extra-pair copulations (EPCs) [1]. Although it has long been appreciated that selection will favour males to seek such EPCs, especially when these can be acquired at low cost [2], EPCs may also come with various distinct advantages for females, ranging from an improved genetic quality of offspring to insurance against male infertility, greater access to material resources and increased protection against infanticide [3–6]. In addition, increased EPC behaviour by females may be selected for as a correlated evolutionary response of selection on males to seek EPP [7], even when EPCs are actually detrimental to females. Hence, EPCs are likely to be driven by a combination of male and female interests [1,8]. Interestingly, several studies have shown that benefits of EPC might also apply to humans, with extra-pair paternity (EPP) sometimes being associated with clear reproductive and material benefits, particularly in some traditional small-scale societies [9,10]. Nevertheless, female adultery is also common in Western society, typically occurring in 15–50% of all relationships [11,12]. As the risk of losing paternity is therefore real, human males display various anti-cuckoldry tactics [13,14], including evolved psychological mechanisms (such as male sexual jealousy [15]), behavioural adaptations (such as mate-guarding and frequent in-pair copulation [16–18]), and morphological and physiological adaptations (such as large testes sizes [19] and high evolutionary rates of sperm and seminal fluid proteins caused by sperm competition [20]).

Given the potential benefits for males and females to seek EPCs, various studies have attempted to estimate EPP rates in humans using molecular techniques [21,22]. These studies have shown that median EPP rates are only between 1 and 3% in most Western European populations [11,21,23], although rates can be higher in low socioeconomic settings [24] and in some traditional small-scale societies, such as among Yanomami Indians [25] and the Himba [9]. Recent relatively unbiased studies carried out on bone marrow transplantation samples obtained maximum-likelihood estimates of EPP rates of only 0.94% in a German population [26] and of 0.65% in a Swiss population [27]—figures that are probably representative for most Western European populations. The relatively low rates of EPP documented in these studies contradict the frequently cited figure that the average rate would be 10–30% in Western populations [21]. These exaggerated rates most probably originate from the frequency with which they are seen in cases where there are grounds to question the biological paternity of a child [28].

Given that most studies on EPP in humans have focused on extant populations, where contraceptives are readily available, the question arises of whether EPP rates could perhaps have been higher in historical times [26,29]. As yet, however, temporal trends in historical EPP rates have not been extensively studied. A significant decrease in EPP events was reported when persons born before and after the introduction of the birth-control pill were compared [22]. In addition, some studies tried to estimate historical EPP rates based on present behavioural measures of EPCs or based on kin investments of matri- and patrilineal family members. These studies estimated historical EPP rates at 8–30% in Western populations [30–33], again suggesting that EPP rates were higher in historical times. If EPP rates were indeed this high before, then this would be of great significance as it would question studies that use genealogies to infer human life history and population genetic structures.

To directly study the historical EPP rate within a population, Y-chromosome genotyping can be useful. The absence of recombination between Y-chromosome-specific markers is unique and allows the identification of EPP even after many generations have passed using a pedigree analysis [34]. The relation between two deep-rooted pedigrees, that is collections of families connected by a common ancestor who lived many generations ago, can be tested by high-resolution Y-STR and Y-SNP genotyping [35,36]. Until now, historical EPP rates have been estimated using this approach only within single families (e.g. within a South African family [37] or within a North American family in the light of the President Jefferson study [38]). Studies which tried to estimate historical EPP in an entire society have only been done by linking Y-chromosome genotypes with patrilineal surnames (e.g. in England [39–41] and Ireland [42]). The problem with this approach, however, is that a discrepancy between the Y-chromosomal variant and the surname of two individuals can also occur for several reasons that are not linked to EPP events, for example when there were several unrelated founders of the same surname, when the surname was transmitted matrilineally or when there were name changes, resulting in strong biases of such estimates [43].

By using high-resolution Y-chromosomal genotyping of Flemish males, the aim of this research was to verify the difference between the historical and the current EPP rate within a Western European population. In order to do so, we developed two novel methods to estimate the historical EPP rate. First, based on an unbiased population-wide sample, the Y chromosomes of presumed patrilineally related males were compared with each other. Subsequently, EPP rates were estimated based on the discrepancy between the legal genealogy and the actual genetic relatedness. Second, the historical EPP rate within Flanders was estimated based on the genetic traces of a substantial past migration event from northern France to Flanders [44]. This was done by comparing the observed genetic differentiation between the authentic Flemish male subpopulation and the French immigrant male subpopulation with the outcome of simulations carried out for a range of EPP rates.

2. Material and methods

(a) Sampling procedure

Samples were collected from the genetic genealogy project organized by the Flemish genealogical society Familiekunde Vlaanderen and the KU Leuven. The only restriction for participation was to provide a patrilineal genealogy with the oldest reported paternal ancestor (ORPA) living in Flanders before 1800. For several candidates, these genealogies were obtained by archivists and (amateur) genealogists after an agreement to participate by the DNA donor. Genealogies of all the participants were checked using the online databanks of the parish records and the civil registry of the State Archives in Belgium (www.arch.be). All samples were collected with written consent from the donors, who gave permission for the DNA analyses, the storage of the samples and the scientific publication of their anonymized DNA results, even when an EPP event was detected within their genealogy. To assure an unbiased representation of the Flemish population, we included families of all social classes in our study. In addition, no one refused when asked to participate in this genetic genealogical study. Detailed inspection of the surnames and genealogical data of the selected individuals also showed that all highly frequent surnames were covered and that all main historical events in Flanders were reflected in our dataset. Furthermore, although all participants in our study were aware of the fact that it was possible to observe EPP, they were not aware whether namesakes or paternal relatives would be included in the sample as well. Hence, our sampling scheme resulted in very little if any bias.

(b) Y-chromosome genotyping

A buccal swab DNA sample from each selected participant was collected for DNA extraction by using the Maxwell 16 System (Promega, Madison, WI). In total, 38 Y-STR loci were genotyped, and the subhaplogroup was assigned for all participants to the most accurate level of the latest published Y-chromosomal tree (v. 1.2 of AMY-tree [45,46]), as described in the electronic supplementary material.

(c) Estimation of extra-pair paternity based on pair-analyses

In our first method to estimate historical EPP rates, all in-depth patrilineages of the DNA donors were entered in the genealogical software program Aldfaer v. 4.2 (Stichting Aldfaer, 2013; www.aldfaer.net), after which DNA donors with a common known paternal ancestor were detected based on the legal genealogies. Based on this comparison, couples of participants were selected when they were related to the seventh degree or higher (i.e. separated by more than six meioses). The DNA donors within each couple were not assumed to be family of each other and they were not aware of the fact that other genealogically related persons were involved in the project. Moreover, a DNA donor could be involved in only one couple such that all meioses in the analysis were analysed only once.

For each couple, the Y chromosomes were compared between the individuals to verify if their genealogical common ancestor (GCA) was also their biological common ancestor (BCA). In order to do so, we first compared the subhaplogroups between the two individuals at the finest resolution of the Y-chromosomal phylogenetic tree. Given that human Y-SNPs have a low mutation rate, approximately 2.0 × 10⁻⁸, they can be treated as unique evolutionary polymorphisms [47]. Male individuals who share the same Y-SNP also must share a common male lineal ancestor since the point of the SNP’s first appearance [48]. All Y-SNPs analysed in this study are known to be polymorphic in the population (i.e. we did not use private SNPs), non-recurrent and not located in the Y-SNP conversion hotspots on the Y chromosome [46]. Second, we compared the 38 Y-STR haplotypes of the two individuals with each other. Based on the calculated mean mutation rate for the 38 genotyped Y-STRs using the individual mutation rates measured by Ballantyne et al. [49], namely 5.91 × 10⁻³ mutations per generation, it is highly unlikely that more than seven mutational differences on the 38 Y-STRs were present between patrilineal relatives. Based on the formulae of Walsh [50], seven mutations on the 38 Y-STRs would mean that the biological ancestor of both individuals lived between 7 and 36 generations ago (95% credibility interval)—between 1110 and 1835 if we use a generation span of 25 years, or between 750 and 1765 if we use a generation span of 35 years.
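The calendar-year bounds quoted above follow from simple arithmetic on the generation counts. The sketch below reproduces them under the assumption that generations are counted back from a reference year of 2010; that year is inferred from the quoted bounds rather than stated in this paragraph, and the function name is purely illustrative.

```python
# Assumption: generations are counted back from 2010 (inferred from the
# bounds quoted in the text, not stated explicitly in this paragraph).
REFERENCE_YEAR = 2010


def ancestor_year_range(min_generations, max_generations, generation_span,
                        reference_year=REFERENCE_YEAR):
    """Convert a 'generations ago' interval into a calendar-year interval.

    Returns (earliest_year, latest_year) in which the common ancestor lived.
    """
    earliest = reference_year - max_generations * generation_span
    latest = reference_year - min_generations * generation_span
    return earliest, latest


# 95% credibility interval of 7 to 36 generations, as quoted in the text.
print(ancestor_year_range(7, 36, 25))  # (1110, 1835)
print(ancestor_year_range(7, 36, 35))  # (750, 1765)
```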

For each of the couples in the dataset, our analyses provided data on the number of different meioses between the two individuals (N) as well as the presence or absence of a match between the GCA and BCA. Based on these data, we could subsequently estimate the EPP rate per generation. This was done by modelling one meiosis as a Bernoulli trial with random outcome ‘yes’ (in this case an EPP event, occurring with probability p) or ‘no’ (occurring with probability 1−p). For a given couple, a binomial distribution B(N,p) then described the probability distribution of the total number of EPP events out of the N meioses. We made the assumption that the N meioses were independent and that the probability p was identical for all generations and in all family trees. The exact number of EPP events between two individuals separated by N meioses is unknown. From the observation of a mismatch between a GCA and a BCA, we initially assumed that just one EPP event had occurred in the genealogical tree. The maximum-likelihood estimation and the corresponding 95% confidence interval (CI) of the probability p were first computed based on the raw data (i.e. on the total number of EPP events and the summed number of meioses in the dataset). Based on this initial estimate of p, the number of EPP events was then updated for each couple. The initial values of one were changed into the expected value of the total number of EPP events that occurred between two individuals separated by N meioses. An updated dataset then led to a new estimation of p, after which this procedure was repeated to iteratively refine the estimated probability p until convergence was observed. In this way, a correction was made to take into account ‘hidden’ EPP events. All aforementioned analyses were performed in Matlab (Mathworks, Natick, MA).
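The iterative correction for 'hidden' EPP events described above can be sketched as follows. This is a minimal Python illustration, not the authors' Matlab code: the function names and the example data are hypothetical, and it assumes that the "expected value" used in the update is the mean of a binomial B(N,p) conditioned on at least one event, E[K | K ≥ 1] = Np / (1 − (1 − p)^N). Confidence intervals are not computed here.

```python
def expected_events(n_meioses, p):
    """Expected number of EPP events among n_meioses, given at least one occurred.

    Assumption: K ~ Binomial(n_meioses, p), so E[K | K >= 1] = N*p / (1 - (1-p)^N).
    """
    if p <= 0.0:
        return 1.0  # limit of the expression as p -> 0
    return n_meioses * p / (1.0 - (1.0 - p) ** n_meioses)


def estimate_epp_rate(pairs, tol=1e-12, max_iter=1000):
    """Iteratively estimate the per-generation EPP probability p.

    pairs: list of (n_meioses, mismatch) tuples, where mismatch is True when
    the genealogical common ancestor is not the biological common ancestor.
    """
    total_meioses = sum(n for n, _ in pairs)
    mismatched = [n for n, mismatch in pairs if mismatch]

    # Initial estimate: exactly one EPP event per mismatched pair.
    p = len(mismatched) / total_meioses

    for _ in range(max_iter):
        # Replace the initial count of one by the expected count under current p.
        events = sum(expected_events(n, p) for n in mismatched)
        new_p = events / total_meioses
        if abs(new_p - p) < tol:
            return new_p
        p = new_p
    return p


# Hypothetical data for illustration only: 60 pairs, each separated by
# 16 meioses, 8 of which show a GCA/BCA mismatch. The real per-pair
# meiosis counts are not given in the text.
example_pairs = [(16, False)] * 52 + [(16, True)] * 8
raw = 8 / (60 * 16)
corrected = estimate_epp_rate(example_pairs)
print(f"raw estimate: {raw:.4%}, corrected estimate: {corrected:.4%}")
```

Because a mismatch can hide more than one EPP event, the corrected estimate is always at least as large as the raw ratio of mismatches to meioses.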

(d) Estimation of extra-pair paternity rates based on remains of past immigration

Our second method to estimate historical EPP rates was based on a comparison of haplotype frequencies between the authentic Flemish subpopulation and a subpopulation derived from a substantial past migration event from northern France to Flanders which occurred at the end of the sixteenth century. To achieve this, we first analysed the language (including dialect) and the meaning of all surnames in the dataset and their first appearance in Belgium and northern France based on the study of Debrabandere [51] and data of the State Archives of Belgium (www.arch.be). Based on surname origin, two groups were defined, namely the autochthonous Flemish surnames subpopulation (AFS) sample, which contained all individuals with an authentic Flemish surname, and the French/Roman Surnames subpopulation (FRS) sample, which contained all individuals with a French/Roman surname observed since 1575–1625 in Belgian archives. For the reconstructed past AFS sample (hereafter referred to as rpAFS), only individuals with an ORPA born in Flanders before 1750 and a surname already present in Belgian archives since 1600 were retained. Different DNA donors with a recent GCA were excluded to avoid a family bias and to guarantee an independent analysis from the first pair-analysis-based method. Based on the genealogical data, known descendants of foundlings, adopted sons and sons with unknown fathers were also excluded owing to the known lack of a relationship between the origin of the surname and the Y-chromosomal variant. Finally, all DNA donors were excluded for rpAFS when they had a surname that referred to a toponym lying outside Flanders or a surname in a dialect or language from outside Flanders. For the current AFS sample (cAFS), all individuals with a Flemish surname were selected without any other restriction.

For the reconstructed past FRS sample (rpFRS), the pooled frequencies of the main subhaplogroups in two northern French regions, namely Île-de-France and Nord-Pas-de-Calais, were reconstructed based on data published by Ramos-Luis et al. [52] and Busby et al. [53]. Finally, for the current FRS (cFRS) sample, all Flemish individuals with a French/Roman surname were selected. These surnames are known to have been introduced only during the past gene flow at the end of the sixteenth century, as reported by Larmuseau et al. [44]. To avoid too complex a model, we did not consider the possibility of EPP events from foreign populations into the AFS as this would have had a marginal effect on the population diversity [44,54].

The genetic relationships between the four groups, namely rpAFS, cAFS, rpFRS and cFRS, were assessed by means of Weir & Cockerham’s [55] estimate of F_ST without taking evolutionary distance between individual subhaplogroups into account. This was based on the assumption that the different Y-chromosomal subhaplogroups were distributed independently from each other in Western Europe based on their wide and diverged distributions and their high mutual time to MRCA values (more than 5000 years ago [56–59]). All F_ST values were estimated using Arlequin v. 3.1 [60]. Significance of population subdivision was tested using a permutation test with correction for the large difference in sample size between the groups as described by Larmuseau et al. [44] using a script in R [61] (for script see electronic supplementary material, script S2). No further tests to observe population differentiation based on the Y-STRs were done because the set of Y-STRs used has insufficient power to detect population structure, as a consequence of the high homoplasy associated with these markers [62].

A simulation model was constructed to analyse the maximum EPP rate which allows for the observed level of F_ST between the present AFS and FRS subpopulations after more than 400 years since the past gene flow from northern France to Flanders (figure 1). In this model, the time since the past immigration event of FRS from northern France is set to 16 generations as this is the average number of generations observed by broad genealogical research. The census population size of AFS was assumed to be 10 times larger than FRS, both in 1600 and in 2010, which is in line with recent research [44]. According to archival research, it is clear that after one generation the FRS group was socially almost completely integrated in the population [44]. Therefore, it can be safely assumed that if EPP had occurred at a significant frequency in the Flemish population, the AFS genotypes would have invaded the FRS subpopulation (based on family names; and vice versa, although to a lesser extent owing to the differences in subpopulation sizes) and that genetic differentiation between the two subpopulations should be absent. In fact, our data revealed no signals of endogamy based on concrete archive research, and second-generation persons with a French/Roman surname were not a marginal group, as reflected by the fact that they had the opportunity to practise professions with a high societal value [44]. It should be noted, however, that this is not necessarily true for other surname-based sampling schemes in which no complete admixture can be guaranteed between the surname classes (e.g. for the Italian Arbereshe of Calabria [63]). Based on the reconstructed haplotype frequencies of rpAFS and rpFRS, the haplotype frequencies of cAFS and cFRS were simulated according to a given EPP rate P_np, assuming that the latter did not change since 1600. Technical details of the simulations are given in the electronic supplementary material.
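The intuition behind this model, that EPP across surname groups erodes the genetic differentiation between them, can be illustrated with a deliberately simplified sketch. It is not the authors' simulation, whose details are in the electronic supplementary material. Assumptions made here: each generation a fraction P_np of males in either surname group carries a Y chromosome drawn at random from the whole population (AFS weighted 10:1 over FRS); frequencies are updated deterministically, so genetic drift and sampling error are ignored; a simple heterozygosity-based F_ST is used instead of Weir & Cockerham's estimator; and the starting frequencies are invented for illustration. All function and variable names are hypothetical.

```python
def mix_one_generation(afs, frs, p_np, size_ratio=10.0):
    """Expected haplogroup frequencies in both surname groups after one generation.

    With probability p_np a son's Y chromosome comes from a random male of the
    whole population instead of from his legal father.
    """
    w_afs = size_ratio / (size_ratio + 1.0)  # AFS share of the population
    pool = [w_afs * a + (1.0 - w_afs) * f for a, f in zip(afs, frs)]
    new_afs = [(1.0 - p_np) * a + p_np * q for a, q in zip(afs, pool)]
    new_frs = [(1.0 - p_np) * f + p_np * q for f, q in zip(frs, pool)]
    return new_afs, new_frs


def simple_fst(afs, frs, size_ratio=10.0):
    """Size-weighted F_ST = (H_T - H_S) / H_T for haploid frequency vectors."""
    w_afs = size_ratio / (size_ratio + 1.0)
    pool = [w_afs * a + (1.0 - w_afs) * f for a, f in zip(afs, frs)]
    h_total = 1.0 - sum(q * q for q in pool)
    h_within = (w_afs * (1.0 - sum(a * a for a in afs))
                + (1.0 - w_afs) * (1.0 - sum(f * f for f in frs)))
    return (h_total - h_within) / h_total


def expected_fst_after(afs, frs, p_np, generations=16, size_ratio=10.0):
    """Expected F_ST between the surname groups after the given number of generations."""
    for _ in range(generations):
        afs, frs = mix_one_generation(afs, frs, p_np, size_ratio)
    return simple_fst(afs, frs, size_ratio)


# Invented starting frequencies for R-U106, R-M529, R-U152 and 'other'.
RP_AFS = [0.28, 0.08, 0.09, 0.55]
RP_FRS = [0.15, 0.12, 0.16, 0.57]

for p_np in (0.0, 0.01, 0.02, 0.08, 0.30):
    fst = expected_fst_after(RP_AFS, RP_FRS, p_np)
    print(f"P_np = {p_np:.2f}: expected F_ST after 16 generations = {fst:.5f}")
```

In this deterministic version the frequency difference between the groups shrinks by a factor (1 − P_np) per generation, so F_ST decays as (1 − P_np)^(2n). Because drift and finite sample sizes are left out, the sketch shows the direction of the effect only and is not expected to reproduce the values in figure 2.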

Figure 1. Illustration of the model used to estimate the maximum rate of EPP compatible with the observed population genetic differentiation between the AFS and the FRS. The four graphs provide the relative frequencies (as percentages) of the three main Y-chromosomal subhaplogroups R-U106, R-M529 and R-U152, and of all other subhaplogroups in the datasets (pooled under ‘other’) for reconstructed past (rp) and current (c) subpopulations; P_np, probability of EPP; n, number of generations. In the simulations, starting from rpAFS and rpFRS, F_ST between simulated AFS and FRS subpopulations after 16 generations is compared with the genetic differentiation between cAFS and cFRS for different levels of P_np.


It should be noted that neither of our estimation methods can differentiate between an EPP without knowledge of the legal father and an EPP caused by hidden adoption (i.e. when an adopted child is not reported as such). Moreover, the methods will also not detect EPP caused by matings with patrilineal relatives of the husband. Nevertheless, there is no possibility to avoid these events when estimating historical EPP rates. On the other hand, EPP in the latter case will only have limited negative evolutionary consequences for the legal father because his indirect fitness will still remain relatively high.

3. Results

Overall, we determined the Y-chromosome genotypes of 1071 individuals of whom detailed patrilineal records could be obtained. All Y-chromosomal data genotyped for this study have been submitted to the open access Y-STR Haplotype Reference Database (YHRD, www.yhrd.org) and are available under accession nos. YA003651, YA003652, YA003653, YA003738, YA003739, YA003740, YA003741 and YA003742. All individuals were correctly assigned to the main haplogroup using Whit Athey’s Haplogroup Predictor (www.hprg.com/hapest5/hapest5a/hapest5.htm). The single exception was a Y chromosome which could not be assigned to haplogroup BT. This single chromosome is further referred to as Y(xBT). In total, 55 subhaplogroups were observed in the dataset (see electronic supplementary material, table S2). Only six subhaplogroups had a frequency higher than 5%, namely I1* (I-M253*, 12.42%), R1b1b2a1a1b* (R-Z381*, 7.84%), R1b1b2a1a1b2 (R-L48, 12.32%), R1b1b2a1a2* (R-P312*, 11.86%), R1b1b2a1a2e* (R-M529*, 8.78%) and R1b1b2a1a2g3* (R-L2*, 5.32%).

(a) Estimation of extra-pair paternity based on pair-analyses

In the whole dataset of 1071 individuals, 60 independent couples of DNA donors with a GCA were observed based on their in-depth genealogies. The closest relation between donors was in the seventh degree, meaning there were seven meioses between both males. The most distant relation between donors was in the 31st degree, meaning that there were 31 meioses between the donors. Across all 60 pairs with a GCA, an average of 16 meioses separated both donors.

The GCA of eight of the 60 pairs was not the BCA based on the Y-chromosome comparison. This could be inferred from the fact that for seven of these eight cases, the assigned subhaplogroups of the individuals within each pair did not match with one another. In one further case, both individuals of the couple belonged to subhaplogroup I1* (I-M253*), but the 38-Y-STR haplotypes were different from each other by 23 mutations. This indicates that their most recent BCA lived thousands of years before the GCA, based on a Y-STR mutation rate of 5.91 × 10⁻³ mutations per generation and the formulae of Walsh [50]. For the remaining 52 couples, the GCA was also the BCA, as the individuals within each couple were assigned to the same subhaplogroup at the highest phylogenetic resolution and their haplotypes revealed no more than seven Y-STR differences. Based on our pair-analysis, this resulted in an estimated per-generation EPP rate of 0.91% (95% CI: 0.41–1.75%).

(b) Estimation of extra-pair paternity based on remains of past immigration

The haplotype frequencies in the rpAFS, cAFS, rpFRS and cFRS subpopulations were reconstructed for subhaplogroup R-U106 (inclusively R-U106*, R-Z18, R-Z381* and R-L48, but exclusively R-U198), R-M529 and R-U152 (inclusively R-U152*, R-L2* and R-L20). All other subhaplogroups were pooled because their frequencies were too low. The frequencies in the four samples are given in the electronic supplementary material, table S3. Pairwise F_ST-values between rpAFS and cAFS, and between rpFRS and cFRS, are not significant. The F_ST-value between rpAFS and rpFRS is 0.03072 (p = 0.0004), and the value between cAFS and cFRS is 0.02110 (p = 0.0297). Figure 2 shows the results of the simulation study in which the simulated genetic differentiation between AFS and FRS after 16 generations and for a given probability of EPP is compared with the actual observed genetic differentiation between cAFS and cFRS. The mean F_ST in the simulations is equal to the actual F_ST between cAFS and cFRS for an EPP rate of around 2%, and the observed F_ST exceeds the (one-sided) 95% upper confidence limit of the simulations for EPP rates of 8% or higher, indicating that higher levels of EPP are unlikely to match with the observed genetic differentiation between cAFS and cFRS.

Figure 2. Results of the simulation model used to estimate the past EPP rate per generation based on the population genetic data of the AFS and the FRS. The solid line represents the average F_ST-value between AFS and FRS after 16 generations in the simulations as a function of the EPP rate; the dashed line represents the upper 95% confidence limit of these F_ST values; the dotted grey line is the empirically observed F_ST-value between cAFS and cFRS (F_ST = 0.02110).


4. Discussion

Overall, our results provide the first large-scale, unbiased genetic study of historical EPP rates in a human Western European population, with two independent estimation methods giving largely concordant results. Using the most direct estimation method, based on pairs of males that had a GCA in the last few centuries, we estimated the average EPP rate at 0.91% per generation (95% CI: lower bound 0.41% and upper bound 1.75%). This method took advantage of the hypervariability and mutability of Y-STR haplotypes, and the high phylogenetic resolution of the used Y-SNP haplogroups, which allowed paternally unrelated males to be easily recognized as such [35]. In addition, using a second method that was based on the population genetic traces of a past immigration event which happened at the end of the sixteenth century, we estimated the EPP rate at around 2%. Although this estimate had a broader CI (upper 95% confidence limit = 8%), the actual estimate was close to the first one.

Both of our methods therefore estimated a substantially lower historical EPP rate for Flanders than the 8–30% per generation suggested by previous studies based on behavioural data on rates of EPCs in Western Europe and given the absence of reliable contraceptive methods [30–33]. Our estimates, in contrast, are close to those obtained from genetic studies of contemporary Western European populations, particularly to the ones that focused on subjects where the father was confident of his paternity, and which typically report EPP rates of around 1–2% [21,26,27]. This suggests that EPP rates have not changed substantially during recent centuries in Western European populations, and have not greatly decreased after the large-scale introduction of contraceptives in the 1960s. Our results, however, are consistent with the theory that seeking EPP could be an adaptive male and/or female strategy [1,3,9,10,12] and would therefore occur independently of whether reliable contraceptive methods are available. Despite the evidence for historical occurrence of EPP, the fact that human paternity certainty is high in both extant and historical Western European societies may provide a partial explanation for the high levels of paternal care observed in our species [64,65], which are absent in our closest relatives, the bonobos and chimpanzees, where paternity uncertainty is much higher [66,67]. With the observed low rates of EPP, the benefits of paternal care outweigh the risks of investing in someone else’s offspring [68]. In addition, low rates of EPP will reduce the need to invest in mate guarding, thereby allowing fathers to allocate more resources to paternal care [18].

Aside from the expected coevolution of paternity certainty and paternal care [64,65], the low rates of EPP and high rates of paternal care are also likely to be influenced by several cultural factors [10,17]. For example, a recent study emphasized the role of religion to assure paternity in an African population [69]. Religions use belief systems to set limits on sexual behaviour, especially to promote female chastity. According to Strassmann et al. [69], males have disproportionately influenced religious texts and sexual morals, embedding tactics that may serve their reproductive interests into the religious systems. Therefore, a female who does not conform to this moral code risks a loss of marital opportunities and paternal investment. She may submit to the moral code and impose its standards on herself and the other women of her community. In Flanders, low rates of EPP could likewise reflect the dominance of the Roman Catholic Church and the strict religious morals about sex and marriage which were introduced since the end of the sixteenth century, and which dominated till the middle of the twentieth century [70,71]. This change of moral senses following the Counter-Reformation [72] is apparent in many publicly accessible works of visual art, such as Pieter Brueghel the Elder’s ‘Luxuria’ [73]. A second explanation for the low historical EPP rate could be Flanders’ rural civilization, which historically was also characterized by strong regional endogamy [70]. Typically, paternal investment is very high in such rural civilizations owing to the patrilineal inheritance of land and other resources to the next generation. Indeed, both strict religious morals and regional endogamy will result in low probability of cuckoldry and EPP. As a result, the costs of mate guarding will be outweighed by the small risk of investing in someone else’s offspring. The high probability of biological paternity will instead justify investing more resources in paternal care.

If cultural factors indeed influence the rate of human EPP, it is probable that this rate would have changed in time and place, even in Flanders (e.g. in concordance with increased mobility in several Flemish regions since the Industrial Revolution or with the reduced influence of religious morals during specific time frames [71]). Nevertheless, the sampling design of this study was not able to observe temporal and spatial differences in past EPP rates within Flanders. In addition, this study only deals with a relatively modern population of humans living in an agricultural state system, and may not give an indication of the EPP rates in hunter–gatherer populations or in prehistoric times. Indeed, the observed anti-cuckoldry tactics observed in humans [13–20] may well be a remnant of the higher incidence of EPCs in such ancestral environments. Alternatively, these presumed tactics may also have been selected for other reasons than to avoid losing paternity (e.g. due to precopulatory sexual selection, which has recently been found to influence the evolution of human genital traits [74]).

We should note that the rate of historical EPP documented in our study might still be slightly overestimated. This is due to the fact that, historically, EPP events could also have been the result of a hidden adoption (i.e. when an adopted child is not reported as such [72]), whereas such cases would typically be excluded in contemporary studies. This might overestimate past EPP rates in comparison with the current ones. The observed low EPP rates in this study nevertheless suggest that the frequency of hidden adoption was low. A specific type of hidden adoption is grandparental adoption, which occurred when a daughter got pregnant before a marriage or engagement, and whereby the shame was hidden by pretending that the child was that of the girl’s mother. This type of hidden adoption is assumed to occur in a civilization with a strict sexual moral such as that in Flanders between ca 1600 and the 1950s [61,75]. Yet our results imply that grandparental adoption and unknown adoption only had a marginal occurrence in Flanders during the last four centuries.

Whereas hidden adoption might result in a slight overestimation of historical EPP rates, an opposing bias also exists: in both genetic genealogical methods used in our study, EPP would only be detected if the resulting offspring actually reproduced. All cases of EPP where the child did not survive to adulthood or did not reproduce as an adult would go undetected. Based on examples in animals, it is at least conceivable that extra-pair offspring had a lower fitness than within-pair children [1]. Indeed, in humans it has been observed that paternal investment is positively related to both face and odour similarities between fathers and children, and that such discriminative paternal investment was linked to the children’s health, thereby suggesting that extra-pair children might have a reduced fitness in humans too [73]. If so, this effect might have resulted in a slight underestimation of the past EPP rate in comparison with the current one.

Overall, our results suggest that historical rates of EPP are not higher than contemporary ones. These low rates of EPP justify the high levels of paternal care in our species. Moreover, the low rates in historical times in Flanders make sense in the light of the strict religious morals and the mainly rural society during most of this period. The low incidence of historical EPP events documented in our study implies that legal genealogies will only rarely differ from biological ones. Consequently, these results are favourable for genealogists, but also for researchers in population genetics, human sociobiology and historical demography in Western Europe, because they typically treat legal genealogies as biological ones in their studies [75]. In addition, our results are valuable for forensic studies in which, when no autosomal links can be found, researchers try to predict the surname of an unknown person by comparing the Y-haplotypes of DNA traces from a crime scene with those of DNA donors in a database in order to find distant patrilineal relatives [43]. Finally, the two estimation methods introduced in this study are shown to be efficient and concordant in estimating historical EPP rates for the period since the introduction of patrilineal heritable surnames and the listing of genealogical data. Future research may use these methods to investigate possible cross-cultural differences in past EPP rates, thereby helping us to understand the biocultural causes of variation in human reproductive parentage patterns.
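
To make the relation between a per-generation EPP rate and the agreement of legal and biological patrilineages concrete, the following minimal sketch computes the probability that a patrilineage of a given depth contains no EPP event. It assumes that EPP events are independent across generations and occur at a constant rate; both are simplifying assumptions made here for illustration, not results of this study. The function name and the chosen genealogical depths are hypothetical, and the rates of 1% and 2% are taken from the range of estimates reported above.

```python
def unbroken_lineage_probability(epp_rate: float, generations: int) -> float:
    """Probability that no EPP event occurs along a patrilineage.

    Assumes a constant per-generation EPP rate and independence between
    generations (illustrative assumptions, not claims of the study).
    """
    if not 0.0 <= epp_rate <= 1.0:
        raise ValueError("epp_rate must lie between 0 and 1")
    if generations < 0:
        raise ValueError("generations must be non-negative")
    return (1.0 - epp_rate) ** generations


if __name__ == "__main__":
    # Hypothetical genealogical depths, chosen only for illustration.
    for rate in (0.01, 0.02):
        for depth in (1, 5, 10):
            p = unbroken_lineage_probability(rate, depth)
            print(f"rate={rate:.0%}  generations={depth:2d}  unbroken={p:.3f}")
```

Under these assumptions, the chance that a legal patrilineage matches the biological one declines geometrically with genealogical depth, so the per-generation rate and the cumulative rate over a deep pedigree should be kept distinct when legal genealogies are used as proxies for biological ones.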

The genetic analysis to detect historic EPP events within families was approved by the KU Leuven Ethics committee (Reference S54010; Belgian no. B322201213404).

Acknowledgements

We thank all the volunteers who donated DNA samples and who were involved in the collection of samples and genealogical data. We also acknowledge two anonymous reviewers, Hanna Kokko, Mattijs Vandezande, Marc Van Den Cloot, Luc De Meester, Jean-Jacques Cassiman, Manfred Kayser, Mannis van Oven, Bruno Defraene, Marie Boz, Tom Havenith, Lucrece Lernout and Hendrik Larmuseau for useful assistance and discussions. M.H.D.L. is a postdoctoral fellow of the FWO-Vlaanderen (Research Foundation-Flanders).

Data accessibility

All Y-chromosomal data genotyped for this study have been submitted to the open access Y-STR Haplotype Reference Database (YHRD, www.yhrd.org) and are available under accession nos. YA003651, YA003652, YA003653, YA003738, YA003739, YA003740, YA003741 and YA003742.

Funding statement

This study was financially supported by the Flemish Society for Genealogical Research ‘Familiekunde Vlaanderen’ (Antwerp), the Flanders Ministry of Culture and the KU Leuven BOF-Centre of Excellence Financing on ‘Eco- and socio-evolutionary dynamics’ (project no. PF/2010/07). The authors declare that they have no conflict of interest.