Generalized People Diversity: Learning a Human Perception-Aligned Diversity Representation for People Images

Srinivasan, Hansa; Schumann, Candice; Sinha, Aradhana; Madras, David; Olanubi, Gbolahan Oluwafemi; Beutel, Alex; Ricco, Susanna; Chen, Jilin

Computer Science > Computer Vision and Pattern Recognition

arXiv:2401.14322v1 (cs)

[Submitted on 25 Jan 2024]

Abstract: Capturing the diversity of people in images is challenging: recent literature tends to focus on diversifying one or two attributes, requiring expensive attribute labels or building classifiers. We introduce a method for ranking people images for diversity that aligns more flexibly with human notions of people diversity in a less prescriptive, label-free manner. The Perception-Aligned Text-derived Human representation Space (PATHS) aims to capture all or many relevant features of people-related diversity and, when used as the representation space in the standard Maximal Marginal Relevance (MMR) ranking algorithm, is better able to surface a range of types of people-related diversity (e.g., disability, cultural attire). PATHS is created in two stages. First, a text-guided approach is used to extract a person-diversity representation from a pre-trained image-text model. Then this representation is fine-tuned on perception judgments from human annotators so that it captures the aspects of people-related similarity that humans find most salient. Empirical results show that the PATHS method achieves better diversity than baseline methods, according to side-by-side ratings from human annotators.
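
To illustrate how a representation space plugs into MMR ranking, here is a minimal sketch of the textbook greedy MMR procedure. It is not the authors' implementation: the function names (`cosine`, `mmr_rank`), the trade-off parameter `lam`, the use of cosine similarity, and the toy two-dimensional vectors are all assumptions made for illustration. In the setting the abstract describes, the embeddings would come from PATHS rather than being hand-written.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def mmr_rank(relevance, embeddings, k, lam=0.5):
    """Greedy MMR: pick up to k item indices, trading each item's relevance
    against its maximum similarity to the items already selected.

    relevance  -- one relevance score per item (hypothetical inputs)
    embeddings -- one vector per item in the chosen representation space
    lam        -- 1.0 ranks purely by relevance; lower values favor diversity
    """
    selected = []
    candidates = list(range(len(relevance)))
    while candidates and len(selected) < k:

        def score(i):
            # Redundancy is the similarity to the closest already-selected item.
            max_sim = max(
                (cosine(embeddings[i], embeddings[j]) for j in selected),
                default=0.0,
            )
            return lam * relevance[i] - (1 - lam) * max_sim

        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


# Toy example: items 0 and 1 are near-duplicates, item 2 is different.
relevance = [0.9, 0.85, 0.8]
embeddings = [[1.0, 0.0], [1.0, 0.01], [0.0, 1.0]]
print(mmr_rank(relevance, embeddings, k=2))           # [0, 2]
print(mmr_rank(relevance, embeddings, k=2, lam=1.0))  # [0, 1]
```

With `lam=0.5` the near-duplicate item 1 is skipped in favor of the dissimilar item 2, while `lam=1.0` reduces to plain relevance ordering. What the ranking treats as "similar" is determined entirely by the embedding space, which is the component the abstract says PATHS supplies.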

Subjects:	Computer Vision and Pattern Recognition (cs.CV); Computers and Society (cs.CY)
Cite as:	arXiv:2401.14322 [cs.CV]
	(or arXiv:2401.14322v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2401.14322

Submission history

From: Candice Schumann
[v1] Thu, 25 Jan 2024 17:19:22 UTC (4,039 KB)
