Manipulating and Mitigating Generative Model Biases Without Retraining

Jordan Vice, Naveed Akhtar, Richard Hartley, Ajmal Mian

Research output: Chapter in Book/Conference paper › Conference paper › peer-review

Abstract

Text-to-image (T2I) generative models have gained increased popularity in the public domain. While boasting impressive user-guided generative abilities, their black-box nature exposes users to intentionally and intrinsically biased outputs. Bias manipulation (and mitigation) techniques typically rely on careful tuning of learning parameters and training data to adjust decision boundaries and thereby influence model bias characteristics, which is often computationally demanding. We propose a dynamic and computationally efficient manipulation of T2I model biases by exploiting their rich language embedding spaces without model retraining. We show that leveraging foundational vector algebra allows convenient control over language model embeddings to shift T2I model outputs and control the distribution of generated classes. As a by-product, this control serves as a form of precise prompt engineering to generate images that are generally implausible using regular text prompts. We demonstrate a constructive application of our technique by balancing the frequency of social classes in generated images, effectively balancing class distributions across three social bias dimensions. We also highlight a negative implication of bias manipulation by framing our method as a backdoor attack with severity control using semantically-null input triggers, reporting up to 100% attack success rate.
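
The abstract does not state the exact formulation, so the following is only a minimal sketch of what "vector algebra over language embeddings" could look like. It assumes a linear shift of a prompt embedding along the direction from a source-class embedding to a target-class embedding, scaled by a severity value. The function names, the `severity` parameter, and the three-dimensional toy vectors are all hypothetical and are not taken from the paper; a real T2I text encoder produces much higher-dimensional embeddings.

```python
import math
from typing import List

Vector = List[float]


def shift_embedding(prompt: Vector, source: Vector, target: Vector,
                    severity: float) -> Vector:
    """Move `prompt` along the direction (target - source).

    `severity` scales the shift: 0.0 leaves the prompt embedding unchanged,
    larger values push it further toward the target class.
    This linear form is an assumption made for illustration only.
    """
    if not (len(prompt) == len(source) == len(target)):
        raise ValueError("all embeddings must share one dimensionality")
    return [p + severity * (t - s) for p, s, t in zip(prompt, source, target)]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    # Toy stand-ins for text-encoder outputs (hypothetical values).
    prompt = [1.0, 0.0, 0.5]
    source = [1.0, 0.0, 0.0]
    target = [0.0, 1.0, 0.0]

    # Similarity to the target class grows as the severity increases.
    for severity in (0.0, 0.5, 1.0):
        shifted = shift_embedding(prompt, source, target, severity)
        print(severity, shifted, round(cosine_similarity(shifted, target), 3))
```

Under this assumption, the shifted embedding would be passed to the frozen image generator in place of the original prompt embedding, so no model weights change. Varying the severity per request is one plausible way to steer how often a given class appears across many generations.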

Original language: English
Title of host publication: Computer Vision – ECCV 2024 Workshops, Proceedings
Editors: Alessio Del Bue, Cristian Canton, Jordi Pont-Tuset, Tatiana Tommasi
Place of Publication: Switzerland
Publisher: Springer Science + Business Media
Pages: 63-79
Number of pages: 17
ISBN (Print): 9783031920882
DOIs
Publication status: Published - 2025
Event: 18th European Conference on Computer Vision - Milan, Italy
Duration: 29 Sept 2024 to 4 Oct 2024

Publication series

Name: Lecture Notes in Computer Science
Volume: 15644 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 18th European Conference on Computer Vision
Abbreviated title: ECCV 2024
Country/Territory: Italy
City: Milan
Period: 29/09/24 to 4/10/24

Funding

Funders: ARC Australian Research Council
Funder number: FT210100268, DE230101058