FAQ — Phenotype Catalog

Predictable questions, substantive answers. For deeper detail see About · Methodology · Ethics.

Is this scientifically valid?

Each dimension in the taxonomy is grounded in a peer-reviewed scale: Fitzpatrick I–VI (Fitzpatrick 1988, Archives of Dermatology), Halls breast-shape classification (Halls 1998), Regnault ptosis grading (Regnault 1976, Clinics in Plastic Surgery), Heath-Carter somatotype (Carter & Heath 1990), Andre Walker hair texture, Hamilton-Norwood and Ludwig hair-loss scales, Mendieta buttock shape (2007), Manning 2D:4D digit ratio (1998, Human Reproduction), and others. Every value bucket has a definition and a source citation.

The dataset itself is descriptive, not predictive. Per-image observations document phenotype features visible in source photographs; per-group aggregations summarize observed distributions with explicit caveats. We make no causal claims about phenotype-genetics relationships, no predictive claims about ethnic identification from photographs, and no claims that the per-group distributions are population-representative.

Can this predict ethnicity from a photograph?

No, and that's an explicit out-of-scope use. The dataset is structured per-image, but the framing, aggregations, and use restrictions are all aggregate-level. Phenotype variance within ethnic groups overlaps substantially with variance between groups: many phenotype combinations are present in five or more ethnic groups simultaneously. A classifier built on per-group averages would therefore perform poorly, well below any accuracy that could justify the attempt.

More importantly, this kind of application has well-documented harms. The dataset card explicitly forbids individual classification, identification, or surveillance applications. We chose this boundary deliberately and maintain it through the dataset's licensing and intended-use documentation.

How accurate are the labels?

The vision-LLM annotation produces a self-reported confidence score per image (0.0–1.0). Mean confidence across the 5,668-image corpus is 0.67. Quality split: 42% high, 39% medium, 16% low, 3% very-low.

External calibration of per-dimension accuracy hasn't been measured at scale on this prompt; we recommend filtering on image_quality IN ('high', 'medium') (which excludes ~19% of rows) and considering confidence > 0.55 as an additional threshold for high-precision use cases. Each row carries its own confidence so downstream consumers can apply their own filters.
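The recommended filter can be sketched in a few lines. This is illustrative only: the column names `image_quality` and `confidence` follow the FAQ's description, and the sample rows are invented, not real dataset rows.

```python
# Sample rows with the two fields the FAQ describes; values are made up.
rows = [
    {"image_quality": "high", "confidence": 0.82},
    {"image_quality": "medium", "confidence": 0.61},
    {"image_quality": "low", "confidence": 0.70},    # fails quality filter
    {"image_quality": "medium", "confidence": 0.48}, # fails confidence filter
]

def high_precision(rows, min_confidence=0.55):
    """Keep high/medium-quality rows above the confidence threshold."""
    return [
        r for r in rows
        if r["image_quality"] in ("high", "medium")
        and r["confidence"] > min_confidence
    ]

filtered = high_precision(rows)  # keeps the first two sample rows
```

Because each row carries its own confidence, consumers can tighten or relax `min_confidence` per use case rather than relying on a global cutoff.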

Why these dimensions and not others?

Dimensions were selected based on (a) presence in established peer-reviewed scales — we prioritized scales that have specific medical, anthropometric, or aesthetic-anatomy literature backing, and (b) photo-observability — every dimension is flagged for whether it can be assessed from a typical photograph and at what minimum framing. Some dimensions (e.g. density_inference from BI-RADS, which requires mammography) are included in the schema for academic completeness but are flagged not_assessable and never get populated from photographs.
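The observability flagging described above might look like the following sketch. The field names here are hypothetical, inferred from the FAQ's description rather than taken from the actual vocabulary schema.

```python
# Hypothetical shape of one vocabulary dimension entry.
# Field names are illustrative, not the real schema.
DENSITY_INFERENCE = {
    "name": "density_inference",
    "source_scale": "BI-RADS",
    "photo_observable": False,           # requires mammography, not a photo
    "assessability": "not_assessable",   # kept in schema for completeness
    "min_framing": None,
}

def populate_from_photo(dimension, observed_value):
    """Refuse to populate dimensions that cannot be assessed from photos."""
    if dimension["assessability"] == "not_assessable":
        return None  # schema slot exists, but stays empty
    return observed_value
```

The point of the flag is that such dimensions remain citable in the vocabulary while the annotation pipeline never writes values into them.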

How does this handle mixed-heritage individuals?

The dataset is structured per-image, not per-individual-as-ethnic-representative. Each image-row documents the phenotype features visible in that specific photograph; the aggregation layer surfaces per-group distributions with substantial overlap between groups. A mixed-heritage individual's phenotype combination might appear in multiple per-group distributions simultaneously, and that's correct — it reflects the reality of phenotype distributions rather than collapsing them into stereotypical modes.

For AI generation, the per-dimension structure of the vocabulary lets users override specific dimensions independently. The system can generate "Cherokee woman, but with light brown eyes specifically" as a single-dimension override on the group's distribution sampling.
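A single-dimension override on distribution sampling can be sketched as below. The group distributions and dimension names are invented for illustration; only the mechanism (sample each dimension independently, pin any dimension the user overrides) reflects the FAQ's description.

```python
import random

# Illustrative per-group distributions; values and weights are made up.
GROUP_DISTRIBUTIONS = {
    "eye_color": {"dark brown": 0.7, "light brown": 0.2, "hazel": 0.1},
    "hair_texture": {"straight": 0.6, "wavy": 0.4},
}

def sample_phenotype(distributions, overrides=None, rng=random):
    """Sample each dimension independently, honoring user overrides."""
    overrides = overrides or {}
    result = {}
    for dim, dist in distributions.items():
        if dim in overrides:
            result[dim] = overrides[dim]  # user-pinned value wins
        else:
            values, weights = zip(*dist.items())
            result[dim] = rng.choices(values, weights=weights)[0]
    return result

features = sample_phenotype(
    GROUP_DISTRIBUTIONS, overrides={"eye_color": "light brown"}
)
```

Every non-overridden dimension still follows the group's observed distribution, so one pinned trait doesn't collapse the rest of the sample into a stereotype.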

How is this different from race science?

Race science relies on collapsing phenotype into a small number of typological categories with claimed hierarchical relationships and assumed genetic determinism. Contemporary craniofacial anthropology does the opposite: high-dimensional descriptive measurement, explicit environmental-variance acknowledgment, individual variation captured rather than collapsed, no hierarchical claims. This dataset operates in the second tradition.

A specific example: the cephalic index dimension in head-shape.json uses the dolichocephalic / mesocephalic / brachycephalic categories — terms with a documented history of misuse in 19th-century racial typology. The vocabulary file's framing_caveats block explicitly disclaims that history, citing Boas 1912 (which demonstrated environmental variability in cephalic index that undermines its use as a stable population-classification marker), and scopes the dimension to clinical / forensic / individual-descriptor use.

Where does the data come from?

Per-image observations are derived exclusively from public-domain Wikipedia photographs of notable people. The photographs are sourced via Wikipedia's "List of {Ethnicity} people" articles (where available — 291 of 484 groups have such articles) and the linked per-person Wikipedia biographical pages. Images are fetched from upload.wikimedia.org and analyzed via Anthropic Claude Sonnet 4.6 vision on AWS Bedrock.

Each row in the dataset preserves the source URL of the analyzed image, so per-image license can be verified and the analysis can be audited row-by-row.
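A minimal row-level audit might check that every preserved source URL actually points at Wikimedia's upload host. The field name `source_url` is an assumption for illustration; the actual column name may differ.

```python
from urllib.parse import urlparse

def audit_row(row):
    """True if the row's source URL (assumed field name) is on
    upload.wikimedia.org, the host the FAQ says images come from."""
    host = urlparse(row.get("source_url", "")).netloc
    return host == "upload.wikimedia.org"

ok = audit_row(
    {"source_url": "https://upload.wikimedia.org/wikipedia/commons/x.jpg"}
)
```

A fuller audit would also fetch each URL and confirm its license page, but host verification is a cheap first pass over all 5,668 rows.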

What's the dominant source of bias?

The Wikipedia source frame: "notable people Wikipedia has a list-of-X-people article for, with a photograph in their individual article." This sample skews male, skews toward public life (politicians, scientists, athletes, entertainers, historical figures), is biased toward subjects with English-language coverage, and is biased by photographic era. The aggregator surfaces this caveat on every per-group summary that's 100% Wikipedia-sourced (which is currently every summary). Future releases that incorporate user-submitted images or a second public-domain source will dilute this skew.

How can I contribute corrections or additions?

The vocabulary system and pipeline are open source under Apache 2.0 at github.com/Agaveis/phenotype-catalog-pipeline; pull requests for vocabulary refinements, additional citations, or pipeline improvements are welcomed there. For corrections to specific group entries on the live site, contact admin@agaveis.com.

Why is this on a site called Ethnic Erotic?

The catalog originated as a curatorial side project of an adult-creator platform that wanted accurate, ethnically grounded AI-generation prompts rather than the stereotype-collapsed defaults most image generators produce. The structured taxonomy turned out to be substantially broader in scope and more rigorous than the platform's immediate needs, so the public artifacts (dataset, vocabulary, methodology paper) are released under permissive licenses for reuse by anyone, including academic researchers who may prefer to mirror the dataset under a different brand. Apache 2.0 + CC BY 4.0 licensing makes that reuse explicit.

Read more: About · Methodology · Ethics · Glossary