FAIRFLOW - Fair Representation Learning with Fine-grained Adversarial Regulation of Bias Flow
Language of title:
English
Original abstract:
Societal biases and stereotypes are reflected in various deep
learning (DL) and natural language processing (NLP) models and
applications, among them contextualized word embeddings and text
classification.
The current paradigm for mitigating such biases adds fairness
criteria to the optimization of the model, resulting in a new model
fixed at a single point of the fairness-utility tradeoff.
In the proposed FAIRFLOW project, we pursue a fundamentally
different approach to bias mitigation in DL/NLP models. We introduce
novel bias regulation networks, which exploit adversarial training
to provide fine-grained control over biases in the information flow of
the main network. These regulation networks are stand-alone
extensions integrated into the main network's architecture. This
novel paradigm will give end-users extensive flexibility at
runtime (in contrast to the current paradigm), is expected to lead to
better bias mitigation results, and will enable the simultaneous
mitigation of several biases with respect to different protected
attributes, i.e., gender, race, and age.
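The abstract does not specify the internals of the regulation networks, so the following is only a speculative sketch of the general idea: an adversarial probe, trained through a gradient reversal layer, pushes a stand-alone debiasing transform to strip one protected attribute from the hidden states, while a runtime gate lets the end-user scale how much of the debiased flow replaces the original one. All names here (GradReverse, BiasRegulator, gate) are hypothetical, not FAIRFLOW's actual design.

```python
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    """Gradient reversal layer: identity in the forward pass, negated
    gradient in the backward pass (the standard adversarial-training trick)."""
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lam * grad_output, None

class BiasRegulator(nn.Module):
    """Hypothetical stand-alone regulation module for one protected attribute.
    The probe tries to recover the attribute from the hidden states; the
    reversed gradient trains the debiasing transform to remove that
    information. A gate in [0, 1] controls, at runtime and without
    retraining, how much of the debiased flow is used."""
    def __init__(self, hidden_dim: int, n_groups: int):
        super().__init__()
        self.debias = nn.Linear(hidden_dim, hidden_dim)
        self.probe = nn.Sequential(
            nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, n_groups),
        )
        self.gate = 1.0  # end-user control knob at inference time

    def forward(self, hidden: torch.Tensor):
        cleaned = self.debias(hidden)
        # The probe's parameters train normally; the reversed gradient trains
        # `debias` to strip the protected attribute from `cleaned`.
        attr_logits = self.probe(GradReverse.apply(cleaned, 1.0))
        # Runtime interpolation between original and debiased hidden states.
        regulated = self.gate * cleaned + (1.0 - self.gate) * hidden
        return regulated, attr_logits
```

In this reading, the probe's classification loss would simply be added to the main task loss during training, while at inference only the gate setting changes the information flow.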
In the FAIRFLOW project, we will study the effectiveness of this
approach on (1) various contextualized word embeddings and
(2) downstream text classification tasks, and will compare the results
with strong recent baselines. Besides basic research, we will
showcase the benefits of FAIRFLOW by implementing a prototype of
an adaptable, bias-aware biography classifier, and will release
software packages for convenient adoption of the bias mitigation
solution. The FAIRFLOW project aims to benefit society by providing
bias-free DL solutions, and is particularly in line with the
gender-equality Sustainable Development Goal of the United Nations.
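Continuing the hypothetical sketch above, runtime adaptability for such a biography classifier could amount to tuning one regulator per protected attribute at inference time, without any retraining; the attribute keys, group counts, and gate values below are illustrative only.

```python
# One hypothetical regulator per protected attribute, attached to the
# classifier's hidden states (768 matches a common encoder width).
regulators = {
    "gender": BiasRegulator(hidden_dim=768, n_groups=2),
    "race":   BiasRegulator(hidden_dim=768, n_groups=4),
    "age":    BiasRegulator(hidden_dim=768, n_groups=3),
}
regulators["gender"].gate = 1.0  # fully regulate gender bias
regulators["age"].gate = 0.3     # milder regulation of age bias
```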