“When genetic ancestry tests conflict with family lore, what does it mean?”
“Where does THAT come from?” my dad asked, his voice a mix of surprise, incredulity, and amusement. We were talking over the phone, he in Boston and I in Princeton, discussing the results he had just received from a commercial genetic ancestry test. The “that” he was referring to was the 5% of his ancestry that was identified as “English”.
The remaining 95% of my dad’s genetic ancestry was identified as “Eastern European Jewish”, which didn’t surprise anyone. Hanging framed in his home office are the original immigration papers of his great-grandfather, who came to Ellis Island on a boat from Russia at the beginning of the 20th century. He also has a journal from a great-uncle, which names the tiny town where they lived in the Russian Pale of Settlement. Between these and other documents, there wasn’t much doubt about our family’s history in Eastern Europe. So where did that “English” ancestry come from? Were the genetic data revealing a new twist in our family’s story? Or was the test flawed?
Situations where ancestry tests disagree with family lore or documented genealogies are not uncommon. The conflict can cause confusion about family history and doubt about the accuracy of the tests. The source of this conflict, however, can often be understood by appreciating a critical feature of genetic ancestry tests. These tests rely on a special tool called a “reference panel”: the aggregated genetic data from a large number of individuals from diverse populations. The genetic data for a customer, for example my dad, is compared to this reference panel, and complex algorithms then answer the question: Which people in this panel does the customer look most like, and from what populations are those reference individuals?
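To make that comparison concrete, here is a deliberately tiny sketch in Python. It is not how any commercial service actually works. The population labels, the 0/1 marker data, and the function names are all invented for illustration, and real tests use far larger panels and much more sophisticated statistics. The idea is simply that each stretch of the customer's genome is matched to the most similar reference individual, and the population labels of those matches are tallied into percentages.

```python
from collections import Counter

# Hypothetical toy panel: each entry is (population label, list of segments),
# and each segment is a tuple of 0/1 marker values. All values are invented.
REFERENCE_PANEL = [
    ("Population A", [(0, 0, 1, 1), (1, 0, 1, 0), (0, 1, 1, 0)]),
    ("Population A", [(0, 0, 1, 0), (1, 0, 1, 1), (0, 1, 1, 1)]),
    ("Population B", [(1, 1, 0, 0), (0, 1, 0, 1), (1, 0, 0, 1)]),
    ("Population B", [(1, 1, 0, 1), (0, 1, 0, 0), (1, 0, 0, 0)]),
]


def hamming(a, b):
    """Count the positions at which two equal-length segments differ."""
    return sum(x != y for x, y in zip(a, b))


def estimate_ancestry(customer, panel):
    """Assign each customer segment the label of its closest reference match.

    Returns a dict mapping population label to the fraction of segments
    assigned to it. Ties go to the earlier panel entry.
    """
    votes = Counter()
    for i, segment in enumerate(customer):
        label, _ = min(panel, key=lambda entry: hamming(segment, entry[1][i]))
        votes[label] += 1
    total = sum(votes.values())
    return {label: count / total for label, count in votes.items()}


# A made-up customer whose third segment resembles Population B.
customer = [(0, 0, 1, 1), (1, 0, 1, 0), (1, 1, 0, 0)]
print(estimate_ancestry(customer, REFERENCE_PANEL))
```

In this toy case, two of the customer's three segments match Population A best and one matches Population B, so the report would read roughly two-thirds A and one-third B. Notice that the answer depends entirely on who is in the panel and how they are labeled: relabel one reference individual, or add a closer match, and the percentages change.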
The composition of the reference panel (the number of individuals sampled and the number of different populations represented) determines the test’s precision. A reference panel of 10 Europeans, 10 East Asians, and 10 Africans is probably sufficient to determine from which global region a person derives most of their ancestry. A reference panel 100 times as large can be more accurate and more detailed, especially if it samples numerous subpopulations, for example by splitting African ancestry into North African or sub-Saharan, or European ancestry into Iberian or Scandinavian. A testing service may curate its reference panel for its target audience. For example, one of the earliest genetic ancestry testing companies, African Ancestry Inc., offered to identify the precise ancestral tribes of origin for African Americans through extensive sampling of contemporary African populations.
So what was that 5% English ancestry my dad supposedly carried? Our answer came a few months later, when the ancestry test service announced a major development: they had quintupled the size of their reference panel to more than 16,000 individuals and doubled the number of populations represented. When we looked at my dad’s updated results, the 5% English ancestry had disappeared and he was now classified as almost entirely Eastern European Jewish. It was a reminder of how young these ancestry tests still are. Most likely, some individuals in the reference panel labeled “English” also had some Eastern European Jewish ancestry. This small degree of relatedness initially made it look as though my dad had some English ancestry, but the error was corrected as more individuals were added to the reference panel. And how did that Eastern European ancestry get into the English population? Probably it came from other families, similar to my father’s, who also emigrated from Eastern Europe but who arrived in England many generations earlier. It was also a reminder that populations aren’t static, and that migration has been a consistent feature of human history.