Whose Past Gets Studied?
Bias, power, and indigenous sovereignty in ancient DNA research
Learning Objectives
By the end of this module you will be able to:
- Describe the African undersampling problem in aDNA research and explain what it does to global prehistory narratives.
- Explain the epistemic hierarchy concern: when genetic findings are positioned to override indigenous oral accounts of descent.
- Articulate the genetic essentialism critique and why tribal belonging cannot be reduced to DNA markers.
- Describe Free, Prior and Informed Consent (FPIC) as applied to ancient human remains and the debates it generates.
- Identify CARE and OCAP as community-centered alternatives to standard data governance, and locate the SING consortium as an institutional model.
Core Concepts
The sampling gap that shapes everything
Ancient DNA is only as informative as the bones that get sequenced. Before anything else, a structural problem warrants attention: the database is not a neutral cross-section of humanity — it is overwhelmingly European and Eurasian.
By 2019, more than 2,500 ancient European genomes had been published, while Africa, South Asia, Oceania, and the Americas remained severely undersampled. The gap is starkest for Africa, which holds the greatest human genetic diversity of any continent. A single 2022 Nature study presenting genome-wide aDNA data for six individuals from eastern and south-central Africa, spanning roughly 18,000 years, was described as having doubled the temporal depth of available sub-Saharan African ancient DNA. Six individuals were enough to double the time depth of the record: that is a diagnostic of how little existed before.
This is not just a data footnote. It means the reference panels used to model ancient population structure, the questions researchers are equipped to ask, and the narratives that end up in syntheses of human prehistory are all shaped disproportionately by European samples. European-rich reference panels produce skewed reconstructions when applied elsewhere.
In archaeogenetics, a reference panel is the set of known ancient and modern genomes used as comparison points when interpreting a newly sequenced individual. If the panel skews European, populations outside Europe become harder to model accurately — and may look more "European-influenced" than they actually are.
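The effect of a skewed panel can be sketched with a deliberately simplified toy model. The population names, allele frequencies, and nearest-match rule below are all invented for illustration; real analyses use far richer statistical methods, so treat this as a sketch of the logic rather than of any lab's actual pipeline.

```python
# Toy illustration only: allele frequencies at five hypothetical sites.
# Every population name and number here is invented, not real data.
FULL_PANEL = {
    "Europe_A": [0.9, 0.8, 0.2, 0.1, 0.5],
    "Europe_B": [0.8, 0.9, 0.3, 0.2, 0.4],
    "Africa_A": [0.2, 0.1, 0.9, 0.8, 0.6],
}

# A skewed panel: the same data with every non-European reference removed.
SKEWED_PANEL = {
    name: freqs for name, freqs in FULL_PANEL.items() if name.startswith("Europe")
}

# A newly sequenced (hypothetical) individual whose profile resembles Africa_A.
NEW_INDIVIDUAL = [0.3, 0.2, 0.8, 0.7, 0.6]


def squared_distance(a, b):
    """Sum of squared differences between two frequency profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def closest_reference(individual, panel):
    """Return (name, distance) of the panel population nearest to the individual."""
    name = min(panel, key=lambda pop: squared_distance(individual, panel[pop]))
    return name, squared_distance(individual, panel[name])


for label, panel in (("full panel", FULL_PANEL), ("skewed panel", SKEWED_PANEL)):
    name, dist = closest_reference(NEW_INDIVIDUAL, panel)
    print(f"{label}: best match {name}, distance {dist:.2f}")
```

With the full panel the individual is matched to the reference it actually resembles. With the skewed panel the same individual is forced onto a European reference, and the only hint that something is wrong is a much larger distance, which is easy to overlook if no better comparison exists.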
Burial practices compound the bias. aDNA sampling favors cultures that buried their dead intact in stable, oxygen-poor conditions. Populations that practiced cremation, water burial, or surface exposure leave skeletal remains that either don't survive or don't yield DNA. Even within a single site, different depositional contexts (infant burials under house floors versus adults in formal cemeteries, for example) can preserve different ancestral signals, which may then be attributed to different time periods or population groups even though the individuals were contemporaneous.
And institutional geography amplifies it further. Three institutions dominate the field: Harvard's Reich lab (15,000+ genomes analyzed, predominantly Western Eurasian), the MPI for Evolutionary Anthropology's Department of Archaeogenetics (explicitly focused on Eurasian history over 10,000–20,000 years), and Copenhagen's Centre for GeoGenetics (5,000 ancient human genomes, mostly from Europe and Western Asia). These are excellent institutions producing rigorous science — and their geographic focus is not random or neutral. It reflects where funding flows, where skeletal collections are accessible, and what questions get treated as historically central.
Susanne Hakenbeck's 2019 article "Genetics, Archaeology and the Far Right: An Unholy Trinity" in World Archaeology named this pattern directly: archaeogenetic research systematically reinforces existing imbalances by focusing on populations "perceived to have had the highest impact," creating a feedback loop where Western-lab-driven research questions and Eurasian-biased reference panels jointly determine who is visible in global syntheses.
The epistemic hierarchy problem
Archaeogenetics claims to offer a new kind of evidence — molecular, falsifiable, independent of who tells the story. That is genuinely powerful. But it also generates a specific risk: the positioning of genetic connections as the primary — or sole — valid measure of social ties and historical causation.
This has been called biodeterminism: a systematic tendency to explain human social phenomena through genetics while downplaying human agency, cultural transmission, and social structures. Paired with claims of positivist objectivity ("we're just reading the molecules"), this allows scientists to disclaim the political consequences of their findings even when those findings enter directly into disputes about indigenous land rights, cultural continuity, and legal identity.
The problem is sharpest when genetic findings are positioned against oral histories. An indigenous community's account of ancestral continuity, territorial origin, or kinship may be centuries old and internally consistent. When a research paper concludes, on the basis of ancient genomes, that "the ancestors of group X were not present in region Y before date Z," that finding can be leveraged in courts, land claims, and policy disputes — overriding testimony that communities regard as foundational. This is the epistemic hierarchy: a Western institutional framework, operating through the prestige of genetics, that treats molecular data as more authoritative than indigenous knowledge systems.
The Max Planck Institute's own ethics documentation acknowledges this: researchers note that "as scientists from a mostly homogenous background, they are not neutral with respect to ethical questions around archaeogenetic research." That acknowledgment is significant precisely because it comes from within one of the dominant labs — a recognition that scientific authority does not float free of social position.
Genetic essentialism: the identity problem
A specific version of the epistemic hierarchy concern operates at the level of identity. Genetic science can show who a person was biologically related to and which ancient populations they descend from. It cannot determine who they were — their cultural membership, legal status, or social belonging. The confusion of these things is what Kim TallBear calls genetic essentialism.
Tribal membership is a legal and social category. It cannot be reduced to genetic markers.
TallBear's foundational work, Native American DNA: Tribal Belonging and the False Promise of Genetic Science (2013), demonstrates how genomic methods can revive nineteenth-century racial science frameworks — frameworks that were historically used to delegitimize indigenous sovereignty by claiming that indigenous peoples were "not really" the original inhabitants of a territory, or that assimilation had already extinguished their distinctiveness. The same logic reappears when genomic ancestry tests are used to adjudicate who counts as "authentically" indigenous.
This matters because the consequences are not merely academic. If a genetic finding positions an ancient population as "discontinuous" with a modern indigenous group, it can be used to argue that the modern group has no special claim to the remains — or to the territory. The science becomes a political instrument.
TallBear's critique does not reject genetics wholesale. It demands that indigenous communities govern how genetic findings are interpreted and framed in relation to identity, ancestry, and belonging — and that researchers stop treating molecular ancestry as a proxy for the social and legal categories that actually govern rights and recognition.
Biopiracy and "vampire science"
Indigenous critics have described paleogenomics in sharper terms: as a form of biocolonialism or "vampire science" — extracting bodies and genetic material from indigenous communities without meaningful consent, benefit-sharing, or accountability.
This fits within the broader framework of biopiracy: the unauthorized appropriation of biological resources and traditional knowledge without permission, compensation, or benefit-sharing with the communities of origin. While biopiracy conventionally refers to medicinal plants and agricultural biodiversity, the same logic applies to ancestral genetic material: a community's biological heritage is extracted, analyzed at external institutions, published in high-prestige journals, and the community receives nothing — no authorship, no data access, no say in how findings are interpreted or used.
The harms are not abstract. Communities can find their own knowledge commodified or their ancestors repatriated slowly or not at all. Research that redefines their ancestry can affect land claims and treaty rights. And they have no mechanism to correct or contest findings they were never consulted about.
Scientific advancement in paleogenomics has substantially outpaced dialogue about research ethics, and the guidelines that do exist contradict one another: some prioritize research outcomes, others the wishes of descendants and local communities.
FPIC applied to ancient remains
Free, Prior and Informed Consent is a right established under UNDRIP, the Convention on Biological Diversity, and ILO Convention 169. It requires that indigenous communities be consulted, and that their consent be obtained, before any research or development that may affect them. That consent must be given freely, before the project begins, and with full information about what will happen and why.
The application of FPIC to ancient human remains is contested but increasingly accepted in the field. The dead cannot consent. But contemporary archaeogenetic practice increasingly employs "proxy informed consent" frameworks: descendant communities are identified and engaged in decision-making about aDNA research, repatriation, and data use, rather than treating ancient bones as ownerless research material.
This is not without complications. Who counts as a descendant community when lineages diverged thousands of years ago? What happens when multiple modern groups claim descent from the same ancient population? What does "consent" mean for research that was designed before a community was engaged? These are live debates without settled answers. But the direction of travel is clear: treating ancient remains as unowned data resources is no longer defensible, and the burden now falls on researchers to demonstrate meaningful community engagement.
CARE, OCAP, and the architecture of indigenous data sovereignty
The FPIC principle, once operationalized, runs into a second problem: most existing scientific data governance frameworks — above all the FAIR Principles (Findable, Accessible, Interoperable, Reusable) — were built around the assumption that greater data sharing benefits the public good. For indigenous communities with histories of extractive research, unrestricted sharing is not neutral. It can enable further harm, commodification of genetic information, or reuse in contexts never consented to.
The CARE Principles for Indigenous Data Governance (Collective Benefit, Authority to Control, Responsibility, Ethics) were developed through the Global Indigenous Data Alliance and set out in a 2020 paper led by Stephanie Russo Carroll and colleagues, precisely to counteract this. They are not a replacement for FAIR but a complement: FAIR addresses how data flows between scientists; CARE addresses who controls data and for whose benefit.
Stephanie Carroll's peer-reviewed research documents the underlying problem directly: much genomic data from indigenous peoples has been collected and reused without informed consent or benefit-return. Operationalizing CARE means building consent, benefit-sharing, and community control into the research infrastructure — not as post-hoc ethics review, but as design requirements.
In Canada, the First Nations Principles of OCAP — Ownership, Control, Access, Possession — provide a parallel and complementary framework, established in 1998. OCAP establishes that First Nations communities hold collective ownership of data about themselves; that they control all stages of research; that they can access information about themselves wherever it is held; and that they have practical means (possession/custody) to enforce these rights. OCAP has been applied to aDNA and archaeological data involving First Nations ancestors.
The reciprocity these frameworks demand is concrete: benefit-sharing directed back to indigenous communities, support for indigenous data literacy, indigenous data workforces, and digital infrastructure. This is distinct from extractivist research models that treat communities as sources of raw material.
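One way to picture consent and community control as design requirements rather than post-hoc review is a release gate that refuses to share data until governance conditions are recorded. The sketch below is purely illustrative: the record fields, the use labels, and the function name are hypothetical and do not come from the CARE or OCAP texts or from any existing repository schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class GovernanceRecord:
    """Hypothetical governance metadata attached to one sample."""
    sample_id: str
    community_authority: Optional[str] = None  # body with authority to control the data
    consent_documented: bool = False           # consent recorded before work began
    benefit_plan: Optional[str] = None         # agreed benefit-sharing arrangement
    permitted_uses: Set[str] = field(default_factory=set)  # uses the community approved


def release_blockers(record: GovernanceRecord, requested_use: str) -> List[str]:
    """List every reason the data may not be released for the requested use."""
    blockers = []
    if not record.community_authority:
        blockers.append("no community authority identified")
    if not record.consent_documented:
        blockers.append("consent not documented")
    if not record.benefit_plan:
        blockers.append("no benefit-sharing plan")
    if requested_use not in record.permitted_uses:
        blockers.append(f"use not permitted: {requested_use}")
    return blockers


# A record with nothing in place is blocked on every count.
bare = GovernanceRecord(sample_id="S1")
print(release_blockers(bare, "population_history"))

# A fully governed record is released only for uses the community approved.
governed = GovernanceRecord(
    sample_id="S2",
    community_authority="Example Nation Data Committee",  # hypothetical name
    consent_documented=True,
    benefit_plan="co-authorship and return of data",
    permitted_uses={"population_history"},
)
print(release_blockers(governed, "population_history"))
print(release_blockers(governed, "commercial_ancestry_panel"))
```

The design choice worth noticing is that the default state blocks release: sharing becomes possible only after the community-facing fields are filled in, which is the reverse of an open-by-default repository.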
The SING consortium: an institutional response
Abstract frameworks need institutional embodiments. The Summer Internship for Indigenous Peoples in Genomics (SING), founded in 2011 at the University of Illinois, provides hands-on training in genomic techniques combined with ethical frameworks for indigenous data governance. The purpose is not to produce indigenous scientists who then join existing labs on existing terms — it is to develop capacity for indigenous-led research, so that future genomic work is shaped by the communities most affected by it.
SING has since expanded to Aotearoa New Zealand (University of Otago), Canada, and Australia, each adapting the model to its own jurisdictional and governance context — OCAP in Canada, UNDRIP applications across regions. More than 120 indigenous participants have been trained since inception.
SING is notable not just as a training program but as an institutional model: it demonstrates that the ethics of indigenous data sovereignty can be taught alongside technical genomics, and that the two are not in conflict.
Thought Experiment
Consider a hypothetical, constructed from real elements:
A large archaeogenetics lab sequences 30 ancient genomes from burial sites in a region where an indigenous nation maintains territorial and cultural claims. The genomes are from individuals 1,500–3,000 years old. Analysis suggests substantial genetic discontinuity between the ancient individuals and the modern indigenous group — that is, the ancient people were not the direct biological ancestors of the community living there today.
The lab publishes the findings without having consulted the nation. Journalists run headlines: "DNA shows the [Nation] are not the original inhabitants of [Territory]." The nation's ongoing land claim cites a continuous ancestral presence in the region for at least 2,000 years.
Hold these tensions at the same time:
- The genetic data, if methodologically sound, is evidence. It is not nothing. What is the right epistemic weight to assign it, in the context of a land claim?
- The nation was not consulted. The publication process gave them no opportunity to contextualize the findings, contest the interpretation, or present oral histories and alternative evidence. Does that procedural failure change the epistemic status of the findings, or only the ethics of publishing them?
- "Genetic discontinuity" between ancient and modern populations is the norm in most regions of the world: population movements, admixture, and genetic drift mean almost no modern population is a direct linear descendant of everyone who lived in a territory 2,000 years ago. Does this mean the concept of "original inhabitant" as established by genetics is incoherent? Or is that a retreat from evidence?
- Tribal belonging, as TallBear establishes, is a legal and social category. Suppose the nation's oral histories say: "we are the people of this land and always have been." And suppose the genetics says something different about biological continuity. Are these claims in conflict? Or are they answering different questions entirely?
There is no single correct answer here. The thought experiment is designed to make you sit with the genuine difficulty: to hold scientific evidence, procedural ethics, and questions of sovereignty in the same frame without collapsing any of them into the others.
Common Misconceptions
"The ethics problem is separate from the science problem." The two are inseparable. Sampling bias — who is and isn't in the database — is not merely an equity concern, it is a methodological one. A database dominated by European and Eurasian genomes generates inaccurate inferences about global human prehistory. The skew in the database is simultaneously an ethical failure and a scientific limitation. You cannot fix one without addressing the other.
"Ancient remains are ownerless because the people are dead." This treats biological material as analogous to an abandoned physical object. Indigenous legal and governance frameworks explicitly reject this. Ancestral remains carry ongoing obligations — to descendants, to communities, to cultural continuity. The fact that remains are ancient does not dissolve those relationships; it changes their structure. FPIC frameworks applied to ancient DNA are not a philosophical curiosity; they are increasingly the normative expectation in the field.
"Genetic ancestry testing tells you who you really are." Genetic ancestry describes biological relatedness across generations. It does not determine cultural identity, legal tribal status, or social belonging. Tribal membership, as Kim TallBear establishes directly, is a legal and social category that operates according to criteria set by the community — not by a haplogroup or admixture percentage. The equation of genetic ancestry with identity has been used historically to delegitimize indigenous sovereignty, and the critique of genetic essentialism is a response to that specific pattern.
"CARE principles are anti-science." CARE and FAIR are designed to work together, not in opposition. FAIR addresses data interoperability and discoverability — it says nothing about who controls data or for whose benefit. CARE addresses the latter. Open science and indigenous data sovereignty are not inherently in conflict; what CARE challenges is the assumption that unrestricted openness is always neutral and always beneficial.
"Underrepresentation is just a matter of fewer samples — more sequencing will fix it." More sequencing helps, but it does not resolve the structural problem without changes in governance, community engagement, and whose research questions get funded. If the new samples are still extracted without community consent, analyzed within Western-lab-defined frameworks, and interpreted without indigenous input, more data does not address the underlying epistemic hierarchy.
Key Takeaways
- The aDNA database is structurally biased. European and Eurasian remains dominate the record, African genomes are dramatically undersampled, and institutional geography concentrates epistemic authority in a handful of Western labs. This is simultaneously an ethical problem and a methodological one.
- The epistemic hierarchy risk is real. Positioning genetic findings as the primary or sole valid evidence of ancestral descent can override indigenous oral accounts in legal and political contexts — with material consequences for land rights, repatriation, and recognition.
- Genetic essentialism conflates biology with identity. Tribal belonging is a legal and social category, not a genetic one. Using genomic ancestry to adjudicate who is "authentically" indigenous can reproduce nineteenth-century racial science frameworks that were historically used to undermine sovereignty.
- FPIC, CARE, and OCAP are the governance architecture. Free, Prior and Informed Consent applied to ancient remains, the CARE Principles developed by GIDA, and OCAP established by First Nations leadership in Canada provide community-controlled alternatives to standard scientific data governance. They are complementary, not identical.
- Institutional models like SING show a path forward. Building indigenous capacity for genomics research — rather than consulting communities as research subjects — is how the field shifts from extraction toward reciprocity. The ethical and the scientific improvements are the same project.
Further Exploration
Foundational critiques
- Native American DNA: Tribal Belonging and the False Promise of Genetic Science — The essential text on genetic essentialism and indigenous identity.
- Genetics, Archaeology and the Far Right: An Unholy Trinity — On geographic bias and narrative priorities in archaeogenetics.
- Biodeterminism and pseudo-objectivity as obstacles for the emerging field of archaeogenetics — On the ideological risks of treating genetics as socially neutral.
Governance frameworks
- The CARE Principles for Indigenous Data Governance — The foundational CARE document.
- Operationalizing the CARE and FAIR Principles for Indigenous data futures — How to implement both frameworks in practice.
- First Nations Principles of OCAP — FNIGC
- UN OHCHR — Free Prior and Informed Consent — International legal grounding for FPIC.
Applied ethics in ancient DNA
- Informed proxy consent for ancient DNA research — How FPIC is operationalized when research subjects are deceased.
- Balancing openness with Indigenous data sovereignty — Community-engaged research as a model.
- An ethical crisis in ancient DNA research — Survey of where guidelines have lagged practice.