
Increasing Fairness in Predictions Using Bias Parity Score Based Loss Function Regularization

Bhanu Jain, Manfred Huber, Ramez Elmasri

[email protected] [email protected] [email protected]

Abstract

Increasing utilization of machine learning based decision support systems emphasizes the need for resulting predictions to be both accurate and fair to all stakeholders. In this work we present a novel approach to increase a Neural Network model's fairness during training. We introduce a family of fairness enhancing regularization components that we use in conjunction with the traditional binary-cross-entropy based accuracy loss. These loss functions are based on Bias Parity Score (BPS), a score that helps quantify bias in the models with a single number. In the current work we investigate the behavior and effect of these regularization components on bias. We deploy them in the context of a recidivism prediction task as well as on a census-based adult income dataset. The results demonstrate that with a good choice of fairness loss function we can reduce the trained model's bias without deteriorating accuracy even in unbalanced datasets.

Introduction

The use of automated decision support and decision-making systems (ADM) (Hardt, Price, and Srebro 2016) in applications with direct impact on people's lives has increasingly become a fact of life, e.g. in criminal justice (Kleinberg, Mullainathan, and Raghavan 2016; Jain et al. 2020b; Dressel and Farid 2018), medical diagnosis (Kleinberg, Mullainathan, and Raghavan 2016; Ahsen, Ayvaci, and Raghunathan 2019), insurance (Baudry and Robert 2019), credit card fraud detection (Dal Pozzolo et al. 2014), electronic health record data (Gianfrancesco et al. 2018), credit scoring (Huang, Chen, and Wang 2007) and many more diverse domains. This, in turn, has led to an urgent need for study and scrutiny of the bias-magnifying effects of machine learning and Artificial Intelligence algorithms and thus their potential to introduce and emphasize social inequalities and systematic discrimination in our society. Appropriately, much research is being done currently to mitigate bias in AI-based decision support systems (Ahsen, Ayvaci, and Raghunathan 2019; Kleinberg, Mullainathan, and Raghavan 2016; Noriega-Campero et al. 2019; Feldman 2015; Oneto, Donini, and Pontil 2020; Zemel et al. 2013).

Bias in Decision Support Systems. As our increasingly digitized world generates and collects more data, decision makers are increasingly using AI based decision support systems. With this, the need to keep the decisions of these systems fair for people of diverse backgrounds becomes essential. Groups of interest are often characterized by sensitive attributes such as race, gender, affluence level, weight, and age, to name a few. While machine learning based decision support systems often do not consider these attributes explicitly, biases in the data sets, coupled with the used performance measures, can nevertheless lead to significant discrepancies in the system's decisions. For example, as many minorities have traditionally not participated in domains such as loans, education, employment in high-paying jobs, and receipt of health care, the resulting datasets are often highly unbalanced. Similarly, some domains like homeland security, refugee status determination, incarceration, parole, loan repayment, etc., may already be riddled with bias against certain subpopulations. Thus human bias seeps into the datasets used for AI based prediction systems which, in turn, amplify it further. Therefore, as we begin to use AI based decision support systems, it becomes important to ensure fairness for all who are affected by these decisions.

Contributions. We propose a technique that uses Bias Parity Score (BPS) measures to characterize fairness and develop a family of corresponding loss functions that are used as regularizers during training of Neural Networks to enhance fairness of the trained models. The goal here is to permit the system to actively pursue fair solutions during training while maintaining as high a performance on the task as possible. We apply the approach in the context of several fairness measures and investigate multiple loss function formulations and regularization weights in order to study the performance as well as potential drawbacks and deployment considerations. In these experiments we show that, if used with appropriate settings, the technique measurably reduces race-based bias in recidivism prediction, and demonstrate on the gender-based Adult Income dataset that the proposed method can outperform state-of-the-art techniques aimed at more targeted aspects of bias and fairness. In addition, we investigate potential divergence and stability issues that can arise when using these fairness loss functions, in particular when shifting significant weight from accuracy to fairness.

arXiv:2111.03638v1 [cs.LG] 5 Nov 2021


Notation

In this paper we present the proposed approach largely in the context of class prediction tasks with subpopulations indicated by a binary sensitive attribute. However, the approach can easily be extended to multi-class sensitive attributes as well as to regression tasks. For the description of the approach in our primary domain, we use the following notation to describe the classification problem and the group performance measures used to assess prediction bias.

Each element of the dataset, D = {(X, A, Y)_i}, is represented by an attribute vector (X, A), where X ∈ R^d represents its features and A ∈ {0, 1} is the sensitive attribute indicating the group it belongs to. C = c(X, A) ∈ {0, 1} indicates the value of the predicted variable, and Y ∈ {0, 1} is the true value of the target variable.

To obtain fairness characteristics, we use parity over statistical performance measures, m, across the two groups indicated by the sensitive attribute, where m(A = a) indicates the measure for the subpopulation where A = a.

Since this paper will introduce the proposed framework in the context of binary classification, the fairness related performance measures used will center around elements of the confusion matrix, namely false positive, false negative, true positive, and true negative rates. In the notation used here, these metrics and the corresponding measures are:

False positive rate (FPR): P(C = 1 | Y = 0).
m_FPR(A = a) = P(C = 1 | Y = 0, A = a)

False negative rate (FNR): P(C = 0 | Y = 1).
m_FNR(A = a) = P(C = 0 | Y = 1, A = a)

True positive rate (TPR): P(C = 1 | Y = 1).
m_TPR(A = a) = P(C = 1 | Y = 1, A = a)

True negative rate (TNR): P(C = 0 | Y = 0).
m_TNR(A = a) = P(C = 0 | Y = 0, A = a)

Strict parity in these or similar measures implies that they are identical for both groups, i.e. m(A = 0) = m(A = 1).
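To make the notation concrete, the following minimal NumPy sketch (not from the paper; the helper name and toy data are illustrative) computes these group-conditional rates from hard predictions.

# Illustrative sketch: group-conditional confusion-matrix rates for a
# binary sensitive attribute, following the notation above.
import numpy as np

def group_rates(y_true, y_pred, a, group):
    """Return FPR, FNR, TPR, TNR restricted to records with A = group."""
    mask = (a == group)
    y, c = y_true[mask], y_pred[mask]
    fp = np.sum((c == 1) & (y == 0))
    fn = np.sum((c == 0) & (y == 1))
    tp = np.sum((c == 1) & (y == 1))
    tn = np.sum((c == 0) & (y == 0))
    return {
        "FPR": fp / max(fp + tn, 1),   # P(C=1 | Y=0, A=group)
        "FNR": fn / max(fn + tp, 1),   # P(C=0 | Y=1, A=group)
        "TPR": tp / max(tp + fn, 1),   # P(C=1 | Y=1, A=group)
        "TNR": tn / max(tn + fp, 1),   # P(C=0 | Y=0, A=group)
    }

# Toy usage with hypothetical labels, predictions, and group memberships:
y_true = np.array([0, 1, 1, 0, 1, 0, 0, 1])
y_pred = np.array([0, 1, 0, 1, 1, 0, 1, 1])
a      = np.array([0, 0, 0, 0, 1, 1, 1, 1])
print(group_rates(y_true, y_pred, a, group=0))
print(group_rates(y_true, y_pred, a, group=1))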

Related Work

Some recent works (Ntoutsi et al. 2020; Krasanakis et al. 2018) summarize bias mitigation techniques into three categories: i) preprocessing input data approaches (Calders, Kamiran, and Pechenizkiy 2009; Feldman et al. 2015; Feldman 2015), ii) in-processing approaches or training under fairness constraints that focus on the algorithm (Celis et al. 2019; Zafar et al. 2017; Iosifidis and Ntoutsi 2020), and iii) postprocessing approaches that seek to improve the model (Hardt, Price, and Srebro 2016; Mishler, Kennedy, and Chouldechova 2021). Expanding on this, (Orphanou et al. 2021) sums up mitigating algorithmic bias as detection of bias, fairness management, and explainability management. Their "fairness management" step is the same as the aforementioned "bias mitigation techniques", but they added "Auditing" (fairness formalization) as an additional category.

The first bias mitigation technique, involving preprocessing data, is built on the premise that disparate impact in training data results in disparate impact in the classifier. Therefore these techniques are comprised of massaging data labels and reweighting tuples. Massaging is altering the class labels that are deemed to be mislabeled due to bias, while reweighting increases the weights of some tuples over others in the dataset. Calders et al. (Calders, Kamiran, and Pechenizkiy 2009) assert that these massaging and reweighting techniques yield a classifier that is less biased than without such a process. They also note that while massaging labels is intrusive and can have legal implications (Barocas and Selbst 2016), reweighting does not have these drawbacks.

The second technique, in-processing or fairness under constraint (Celis et al. 2019; Zafar et al. 2017), chooses impact metrics (Iosifidis and Ntoutsi 2020; Zafar et al. 2017; Calders et al. 2013) and then modifies the imposed constraints during classifier training or adds constraints that steer the model towards the optimization goals (Zafar et al. 2017; Calders et al. 2013; Iosifidis and Ntoutsi 2020; Oneto, Donini, and Pontil 2020). Similarly, (Iosifidis and Ntoutsi 2020) change the training data distribution and monitor the discriminatory behavior of the learner. When this discriminatory behavior exceeds a threshold, they adjust the decision boundary to prevent discriminatory learning.

The third technique to improve the results involves a postprocessing approach to comply with the fairness constraints. The approach can entail selecting a criterion for unfairness relative to a sensitive attribute while predicting some target. In the presence of the target and the sensitive attribute, (Hardt, Price, and Srebro 2016) show how to adjust the predictor to eliminate discrimination as per their definition.

Our work uses supervised learning without balancing the dataset. The reason is to avoid issues from massaging labels and the consideration that, while reweighting can be effective for simple sensitive attributes and predictions, in general multiple overlapping sensitive attributes and classes could exist, making reweighting complex and potentially impossible. Some works employing unsupervised clustering also seek to maintain the original balance in the dataset (Abbasi, Bhaskara, and Venkatasubramanian 2021; Abraham, Sundaram, and others 2019; Chierichetti et al. 2018).

Definitions of Fairness. The literature contains several concepts of fairness requiring one or more demographic or statistical properties to be constant across subpopulations. Demographic parity (or statistical parity) mandates that decision rates are independent of the sensitive attribute (Noriega-Campero et al. 2019; Louizos et al. 2015; Calders, Kamiran, and Pechenizkiy 2009; Zafar et al. 2015). For binary classification this implies P(C = 1 | A = 0) = P(C = 1 | A = 1), where C ∈ {0, 1} is the system's decision. This, however, makes an equality assumption between subpopulations which might not hold. To address this, (Hardt, Price, and Srebro 2016) focus on error rate balance, where fairness requires subpopulations to have equal FPR, FNR, or both. Another common condition is equality of odds, demanding equal TPR and TNR. While perfect parity would be desirable, it often is not achievable and thus measures of the degree of parity have to be used (Jain et al. 2020a). Refer to (Mehrabi et al. 2019; Chouldechova and Roth 2018; Verma and Rubin 2018) for a more complete survey of computational fairness metrics.

Here we use a quantitative parity measure as our concept of fairness in the form of Bias Parity Score (BPS), applied to metrics that include FPR, FNR, TPR, and TNR, as well as the prediction rate.

Approach

This paper is aimed at providing a framework to allow a deep learning-based prediction system to learn fairer models for known sensitive attributes by actively pursuing improved fairness during training. For this, we first need a quantitative measure of fairness that can be widely addressed. We propose to use Bias Parity Score (BPS), which evaluates the degree to which a common measure in the subpopulations described by the sensitive attribute is the same. Based on this fairness measure we then derive a family of corresponding differentiable loss functions that can be used as a regularization term in addition to the original task performance loss.

Bias Parity Score (BPS)

In machine learning based predictions, bias is the differential treatment of "similarly situated" individuals. It manifests itself in unfairly benefitting some groups (Jiang and Nachum 2020; Angwin et al. 2016; Dressel and Farid 2018; Zeng, Ustun, and Rudin 2017; Jain et al. 2019; Jain et al. 2020b; Ozkan 2017). Bias in recidivism, for example, may be observed in a higher FPR and lower FNR for one subgroup, implying that members of that group are more frequently incorrectly predicted to reoffend and thus denied parole. Similar bias may be observed when predicting whom to invite for interviews or when making salary decisions, resulting in preferential treatment of one cohort.

To capture this, the relevant property can be encoded as a measure and parity between subgroups can be defined as equality for this measure. In this work we capture the similarity of the measures for the subgroups quantitatively in the form of Bias Parity Score (BPS). Given that m_s(A = 0) and m_s(A = 1) are the values of any given statistical measure, s, for the sensitive and non-sensitive subpopulations, respectively, BPS can be computed as:

BPS_s = 100 * min(m_s(A = 1), m_s(A = 0)) / max(m_s(A = 1), m_s(A = 0))        (1)

where the multiplication factor of 100 leads to a percentage measure, making it easier to read.

A BPS of 100 represents perfect parity while a BPS of 0 represents maximal bias between the groups. This formulation yields a symmetric measure for two groups and provides us with a universal fairness measure that can be applied to any property across subgroups to evaluate fairness in a predictive model generated using a machine learning classifier.
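As a concrete illustration of Eq. (1), a minimal Python sketch (ours, not the authors' code) might look as follows; the handling of the all-zero case is an added assumption.

# Illustrative sketch of the Bias Parity Score of Eq. (1) for a statistical
# measure already computed for each of the two subpopulations.
def bias_parity_score(m_group0, m_group1):
    """BPS = 100 * min(m0, m1) / max(m0, m1); 100 = perfect parity, 0 = maximal bias."""
    hi = max(m_group0, m_group1)
    if hi == 0.0:                 # assumption: both measures zero is treated as parity
        return 100.0
    return 100.0 * min(m_group0, m_group1) / hi

# Example: an FPR of 0.12 for one group and 0.18 for the other
print(bias_parity_score(0.12, 0.18))   # ~66.7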

We have defined BPS here in terms of a binary sensitive attribute, i.e. in the context of fairness between two groups. While this is the case we investigate in this paper, the fairness measure can easily be expanded to a multi-valued sensitive attribute A ∈ {a_1, ..., a_k} in a form such as:

BPS_s = (100 / k) * Σ_{a_i} min(m_s(A = a_i), m_s()) / max(m_s(A = a_i), m_s())

where m_s() is the value of the underlying measure for the entire population. BPS here thus measures fairness as the average bias of all classes compared to the population.
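A corresponding sketch of this multi-group average, again illustrative and assuming the averaging form shown above, could be:

# Illustrative sketch of the multi-valued-attribute BPS, comparing each
# subgroup to the whole population and averaging the per-group parities.
def multi_group_bps(measure_per_group, measure_population):
    """measure_per_group: dict {group_value: m_s(A=a_i)}; returns the average BPS."""
    k = len(measure_per_group)
    total = 0.0
    for m_a in measure_per_group.values():
        hi = max(m_a, measure_population)
        total += 100.0 * min(m_a, measure_population) / hi if hi > 0 else 100.0
    return total / k

# Example with a hypothetical three-valued sensitive attribute:
print(multi_group_bps({"a1": 0.10, "a2": 0.15, "a3": 0.12}, measure_population=0.12))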

Even though BPS can be used for many statistical measures, as described in (Jain et al. 2020a), we will use it here with FPR, FNR, TPR, and TNR as these are often used for classification systems. In addition, we will employ it on the prediction rate to obtain a BPS equivalent to quantitative statistical parity to facilitate comparisons with previous approaches in the Adult Income domain.

BPS-Based Fairness Loss Functions

While the BPS score of a statistical entity represents a measure of fairness, it does not lend itself directly to training a deep learning system since it is not generally differentiable. To address this, we need to translate the underlying measure into a differentiable form and combine it into a differentiable version of the BPS score that can serve as a training loss function. As we are using FPR, FNR, TPR, and TNR here, we first have to define continuous versions of these functions. For this, we build our Neural Network classifiers with a logistic activation function as the output (or a softmax if using multi-attribute predictions), leading, when trained for accuracy using binary cross-entropy, to the network output, y, representing the probability of the positive class, y = P(Y = 1 | X, A). In the experiments we will withhold the sensitive attribute and thus y = P(Y = 1 | X). Using this continuous output, we can define a continuous measure approximation mc_s() for FPR, FNR, TPR, and TNR:

mc_FPR(A = k) = Σ_{(X,A,Y)_i : A_i = k, Y_i = 0} y_i / Σ_{(X,A,Y)_i : A_i = k} y_i

mc_FNR(A = k) = Σ_{(X,A,Y)_i : A_i = k, Y_i = 1} (1 − y_i) / Σ_{(X,A,Y)_i : A_i = k} (1 − y_i)

mc_TPR(A = k) = Σ_{(X,A,Y)_i : A_i = k, Y_i = 1} y_i / Σ_{(X,A,Y)_i : A_i = k} y_i

mc_TNR(A = k) = Σ_{(X,A,Y)_i : A_i = k, Y_i = 0} (1 − y_i) / Σ_{(X,A,Y)_i : A_i = k} (1 − y_i)        (2)

This is not equal to m_s() as it is sensitive to deviations in the exact prediction probability. For example, if the output changes from 0.6 to 0.7 the continuous measure, mc_s(), changes, while the full measure, m_s(), would not change since both values would result in the positive class.
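A minimal PyTorch sketch of these continuous approximations, as reconstructed in Eq. (2), might look as follows; the epsilon guard and the helper name are our assumptions, not the authors' code.

# Illustrative PyTorch sketch of the continuous measure approximations in
# Eq. (2), computed from the probabilistic network output y.
import torch

def continuous_measures(y, y_true, a, group):
    """Differentiable approximations of FPR/FNR/TPR/TNR for subgroup A = group."""
    eps = 1e-8                               # guard against zero denominators (assumption)
    in_group = (a == group)
    pos = in_group & (y_true == 1)
    neg = in_group & (y_true == 0)
    pos_mass = y[in_group].sum() + eps       # denominator of the y-based measures
    neg_mass = (1.0 - y[in_group]).sum() + eps
    return {
        "FPR": y[neg].sum() / pos_mass,
        "FNR": (1.0 - y[pos]).sum() / neg_mass,
        "TPR": y[pos].sum() / pos_mass,
        "TNR": (1.0 - y[neg]).sum() / neg_mass,
    }

# Toy usage with hypothetical outputs (gradients flow through y):
y = torch.tensor([0.9, 0.2, 0.7, 0.4], requires_grad=True)
y_true = torch.tensor([1, 0, 1, 0])
a = torch.tensor([0, 0, 1, 1])
print(continuous_measures(y, y_true, a, group=0))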

To reduce this discrepancy, we designed a second approximation, ms_s(), that uses a sigmoid function, S(x) = 1 / (1 + e^{−x}), to more closely approximate the full statistical measure by reducing intermediate prediction probabilities:

ms_FPR(A = k) = Σ_{(X,A,Y)_i : A_i = k, Y_i = 0} S(y_i − 0.5) / Σ_{(X,A,Y)_i : A_i = k} S(y_i − 0.5)

ms_FNR(A = k) = Σ_{(X,A,Y)_i : A_i = k, Y_i = 1} S(0.5 − y_i) / Σ_{(X,A,Y)_i : A_i = k} S(0.5 − y_i)

ms_TPR(A = k) = Σ_{(X,A,Y)_i : A_i = k, Y_i = 1} S(y_i − 0.5) / Σ_{(X,A,Y)_i : A_i = k} S(y_i − 0.5)

ms_TNR(A = k) = Σ_{(X,A,Y)_i : A_i = k, Y_i = 0} S(0.5 − y_i) / Σ_{(X,A,Y)_i : A_i = k} S(0.5 − y_i)        (3)
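The sigmoided variant of Eq. (3) can be sketched analogously (again an illustrative implementation, not the authors' code; the epsilon guard is an assumption):

# Illustrative PyTorch sketch of the sigmoided measure approximations in
# Eq. (3), which downweight intermediate prediction probabilities.
import torch

def sigmoided_measures(y, y_true, a, group):
    """Differentiable FPR/FNR/TPR/TNR surrogates using S(y - 0.5) and S(0.5 - y)."""
    eps = 1e-8                                # numerical guard (assumption)
    in_group = (a == group)
    pos = in_group & (y_true == 1)
    neg = in_group & (y_true == 0)
    s_pos = torch.sigmoid(y - 0.5)            # larger when y leans toward class 1
    s_neg = torch.sigmoid(0.5 - y)            # larger when y leans toward class 0
    pos_mass = s_pos[in_group].sum() + eps
    neg_mass = s_neg[in_group].sum() + eps
    return {
        "FPR": s_pos[neg].sum() / pos_mass,
        "FNR": s_neg[pos].sum() / neg_mass,
        "TPR": s_pos[pos].sum() / pos_mass,
        "TNR": s_neg[neg].sum() / neg_mass,
    }

# Toy usage with hypothetical outputs:
y = torch.tensor([0.9, 0.2, 0.7, 0.4])
print(sigmoided_measures(y, torch.tensor([1, 0, 1, 0]), torch.tensor([0, 0, 1, 1]), group=1))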

Once measures are defined, a continuous approximation of BPS fairness for both types of measures can be defined as:

BPSc_s = min(mc_s(A = 1), mc_s(A = 0)) / max(mc_s(A = 1), mc_s(A = 0))

BPSs_s = min(ms_s(A = 1), ms_s(A = 0)) / max(ms_s(A = 1), ms_s(A = 0))        (4)

Page 4: Bhanu Jain, Manfred Huber Ramez Elmasri

These, in turn, can be inverted into loss functions that can be used during training and further expanded by allowing small versus large biases to be weighed differently, by raising the loss to the k-th power, which depresses the importance of fairness losses close to 0 (i.e. when the system is almost fair).

LFc(s,k) = (1 − BPSc_s)^k

LFs(s,k) = (1 − BPSs_s)^k        (5)

These loss functions are continuous and differentiable in all but one point, namely the point where numerator and denominator are equal and thus at the minimum of the loss function. This, however, can easily be addressed in the training algorithm when optimizing the overall loss function.
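A hedged sketch of the loss in Eq. (5), using subdifferentiable min/max operations and a small epsilon (our assumption) to sidestep the single non-smooth point, could be:

# Illustrative PyTorch sketch of the fairness losses in Eq. (5):
# LF = (1 - BPS)^k, built from either measure approximation above.
import torch

def bps_fairness_loss(m_group0, m_group1, k=1):
    """m_group0 / m_group1: differentiable measure values (Eq. 2 or 3) per subgroup."""
    eps = 1e-8                                            # assumption: avoid 0/0
    bps = torch.minimum(m_group0, m_group1) / (torch.maximum(m_group0, m_group1) + eps)
    return (1.0 - bps) ** k                               # k > 1 de-emphasizes near-fair solutions

# Example with hypothetical continuous FPR values for the two groups:
loss = bps_fairness_loss(torch.tensor(0.20), torch.tensor(0.30), k=3)
print(loss)   # (1 - 0.6667)^3 ≈ 0.037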

Fairness Regularization for Neural Network Training

The approach in this paper is aimed at training Neural Network deep learning classifiers to obtain fairer results while preserving prediction accuracy. The underlying task is thus maximizing accuracy, which is commonly encoded in terms of a binary cross entropy loss function, LF_BCE.

Starting from this, we utilize the fairness loss functions derived in the previous sections as regularization terms, resulting in an overall loss function, LFc and LFs, for continuous and sigmoided fairness losses, respectively:

LFc(~α, ~k) = LF_BCE + Σ_{s_i} α_i LFc(s_i, k_i)

LFs(~α, ~k) = LF_BCE + Σ_{s_i} α_i LFs(s_i, k_i)        (6)

where ~α is a weight vector determining the contribution of each fairness loss function, ~k is a vector of powers to be used for each of the fairness losses, and s is the vector of the loss metrics, <FPR, FNR, TPR, TNR>. Setting an α_i to 0 effectively removes the corresponding fairness criterion. These loss functions can be used to train a Neural Network classifier, where different values for ~α, ~k, and the choice of sigmoided vs. continuous loss put different emphasis on different aspects of the underlying fairness characteristics.
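Putting the pieces together, one plausible implementation of the regularized objective in Eq. (6) is sketched below; it is illustrative only, reuses the measure helpers sketched earlier, and the dictionary-based configuration is our assumption.

# Illustrative PyTorch sketch of the regularized training objective in
# Eq. (6): binary cross-entropy plus weighted BPS-based fairness losses.
import torch
import torch.nn.functional as F

def total_loss(y, y_true, a, alphas, powers, measure_fn):
    """y: network outputs in (0,1); alphas/powers: dicts keyed by statistic name
    (e.g. {"FPR": 0.3, "FNR": 0.3}); measure_fn: one of the measure sketches above."""
    eps = 1e-8
    loss = F.binary_cross_entropy(y, y_true.float())       # accuracy term, LF_BCE
    m0 = measure_fn(y, y_true, a, group=0)
    m1 = measure_fn(y, y_true, a, group=1)
    for stat, alpha in alphas.items():
        if alpha == 0:
            continue                                        # alpha_i = 0 removes that criterion
        bps = torch.minimum(m0[stat], m1[stat]) / (torch.maximum(m0[stat], m1[stat]) + eps)
        loss = loss + alpha * (1.0 - bps) ** powers[stat]   # add (1 - BPS)^k term
    return loss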

Figure 1 shows an overview of the basic model selection process for the proposed approach. Based on selected fairness criteria and hyperparameter ranges, an architecture is trained in a grid search and the best model is selected.

Figure 1: Design of a fair classifier. Based on selected fairness criteria, loss function type, hyperparameter ranges, and network architecture, the best performing classifier is picked after a grid search based on the task's requirements.

Experiments

To study the applicability of the proposed use of fairness losses as regularization terms, we conducted experiments on three different datasets, two in the recidivism domain and one in the income prediction domain, and analyzed the behavior and effects of different function and weight choices.

Dataset 1: This is the main dataset used and is the raw data from the study "Criminal Recidivism in a Large Cohort of Offenders Released from Prison in Florida, 2004-2008 (ICPSR 27781)" (United States Department of Justice 2010) (Bhati and Roman 2014). It contains 156,702 records with a 41:59 recidivist to non-recidivist ratio. This ratio for our two subpopulations is 34:66 for Caucasians and 46:54 for African Americans, making it very unbalanced and leading to significant bias in traditional approaches. In each crime category, the dataset has a higher proportion of non-recidivist Caucasians than African Americans. It covers six crime categories and provides a large range of demographic features, including crime committed, age, time served, gender, etc. We employed one-hot encoding for categorical features and trained to predict reconviction within 3 years.

Dataset 2: This is a secondary dataset used to validate results. This dataset ensued from the "Recidivism of Prisoners Released in 1994" study (United States Department of Justice 2014). It contains data from 38,624 offenders that were released in 1994 from one of 15 states in the USA, with each record containing up to 99 pre- and post-1994 criminal history records, treatments, and courses taken by offenders. As described in (Jain et al. 2020b; Jain et al. 2020a), each record was split to create one record per arrest cycle, resulting in approximately 442,000 records that use demographic and history information to predict reconviction. This dataset is significantly more balanced, allowing us to verify effects from Dataset 1 in data with different characteristics.

Dataset 3: This is the commonly used "Adult Income Data Set" (Kohavi and others 1996) from the UCI repository (Dua and Graff 2017) that was extracted from 1994 Census data. The dataset has 48,842 records, each representing an individual over 16 who had an income greater than $100. The dataset holds socioeconomic status, education, and job information pertaining to the individual and is thus largely demographic, just like Dataset 1, but in a different domain. Unlike Datasets 1 and 2, where race is the sensitive attribute, gender is the sensitive attribute in Dataset 3, with the goal of predicting a salary higher than $50K. This dataset is here mainly used to verify that effects translate to other domains and to permit comparisons with state-of-the-art techniques.

Constructing Neural Networks

For the recidivism datasets we trained networks with 2 hidden layers with 41 units each. Each hidden layer used ReLU activation and 10% dropout (Srivastava et al. 2014), followed by Batch Normalization (Santurkar et al. 2018). The output layer had 1 logistic unit as indicated previously. We tuned various hyperparameters to select a batch size of 256 and 100 epochs and trained using the Adam (Kingma and Ba 2014) optimizer. The hyperparameters were chosen based on previous experience with the datasets (Jain et al. 2020b; Jain et al. 2020a; Jain et al. 2019).

Figure 2: BPS Measures, Accuracy, and Loss function values as a function of regularization weight, α, for the continuous, linear case using LFc(FPR,1) (left), LFc(FNR,1) (middle), and LFc(FPR,1) + LFc(FNR,1) (right).
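For concreteness, the following PyTorch sketch mirrors the recidivism network described under "Constructing Neural Networks" (two 41-unit ReLU hidden layers with 10% dropout and batch normalization, a single logistic output, Adam); the paper provides no code, so all names and the feature count are illustrative.

# Illustrative PyTorch sketch of the recidivism classifier architecture.
import torch
import torch.nn as nn

class RecidivismNet(nn.Module):
    def __init__(self, n_features):           # n_features depends on the one-hot encoding
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_features, 41), nn.ReLU(), nn.Dropout(0.1), nn.BatchNorm1d(41),
            nn.Linear(41, 41),         nn.ReLU(), nn.Dropout(0.1), nn.BatchNorm1d(41),
            nn.Linear(41, 1),          nn.Sigmoid(),               # y = P(Y=1 | X)
        )

    def forward(self, x):
        return self.net(x).squeeze(-1)

model = RecidivismNet(n_features=64)           # hypothetical feature count
optimizer = torch.optim.Adam(model.parameters())
# Training would loop over mini-batches of size 256 for 100 epochs,
# minimizing the regularized loss of Eq. (6).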

Evaluation Study

To evaluate the characteristics of fairness loss regularization in recidivism prediction, we conducted experiments with 6 measures for both continuous and sigmoided loss functions, employed 4 different exponents for the continuous case, and ran experiments for 10 different weight settings. In particular, we used the 4 statistics individually as well as FPR and FNR or TPR and TNR simultaneously with equal weights. A grid search varied weights α between 0.1 and 1 in steps of 0.1, and for the continuous loss used powers of 1, 2, 3, and 4. The goal was to compare the effects of different settings on the behavior of the system in terms of accuracy, fairness, and stability. The Baseline used no fairness regularization.

To capture variance, Monte Carlo cross-validation with 10 iterations was used and the means of the metrics are reported. Stability was evaluated using the variance.
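The evaluation protocol can be summarized by the following illustrative sketch; the helpers referenced in the comments (random_split, train_model, evaluate) are hypothetical placeholders, not functions from the paper.

# Illustrative sketch of the grid search over regularization weights and loss
# powers, with Monte Carlo cross-validation (10 random splits per setting).
import numpy as np

alphas = np.round(np.arange(0.1, 1.01, 0.1), 1)     # 0.1, 0.2, ..., 1.0
powers = [1, 2, 3, 4]                                # only used for the continuous loss
n_repeats = 10

results = {}
for alpha in alphas:
    for k in powers:
        accs, bps_scores = [], []
        for seed in range(n_repeats):
            # Hypothetical helpers: split the data, train with the fairness-
            # regularized loss, and report accuracy and BPS on the test split.
            # train_idx, test_idx = random_split(dataset, seed=seed)
            # model = train_model(train_idx, alpha=alpha, power=k)
            # acc, bps = evaluate(model, test_idx)
            acc, bps = 0.0, 0.0                      # placeholders for the sketch
            accs.append(acc)
            bps_scores.append(bps)
        results[(alpha, k)] = (np.mean(accs), np.std(accs),
                               np.mean(bps_scores), np.std(bps_scores))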

Results and Discussion

The goal of these experiments is to evaluate whether the proposed approach can achieve the desired goal and to evaluate the effect of different loss function and regularization weight choices on the performance, both in terms of accuracy and fairness. To perform this study we utilized mainly Dataset 1 due to its higher imbalance and studied the effect of different aspects of the loss function design. We then used Dataset 2 to validate some of the results on a more balanced dataset.

Measuring Bias in Results: After conducting the experiments, we investigated the effect of the loss functions on residual bias, measured by the Bias Parity Score for FPR, FNR, TPR, and TNR, as well as for Accuracy. In addition, we recorded accuracy itself as well as the values of the BCE loss and of each of the fairness loss functions used in the respective experiment.

Applicability and Effect of Regularization Weight. To study the basic performance we analyzed the experiments for the six continuous fairness measures in the linear case (i.e. with a power of 1). As indicated, the regularization weight was increased step-wise, starting from the Baseline with no regularization (α = 0) to equal contributions of BCE and fairness (α = 1). Figure 2 shows the average accuracy (solid grey line), BPS scores (solid lines), and loss function values (corresponding dashed lines) as a function of the regularization weight, α. Only results for FNR-, FPR-, and FNR+FPR-based regularization are shown. Behavior for TNR, TPR, and TNR+TPR was similar.

These graphs show that introducing fairness regularization immediately increases the corresponding BPS score, and thus fairness, while only gradually decreasing accuracy. This demonstrates the viability of the technique. Moreover, in this case, regularization for one statistic also yields improvement in other statistics, which can be explained with the close relation of FPR, FNR, TPR, and TNR.

However, the experiments also show some differences that give information regarding important considerations for regularization weights. In particular, while the experiments with LFc(FNR,1) and LFc(FNR,1) + LFc(FPR,1) show a relatively steady increase in fairness, the case of LFc(FPR,1) shows that after an initial strong increase, the active fairness measure, BPSc_FPR, starts to decrease once α exceeds 0.2. At the same time the regularization loss, LFc(FPR,1), continues to decrease, showing a decoupling between the core fairness measure and the loss function at this point. The reason is that in order to obtain a differentiable loss function it was necessary to interpret the network output as a probabilistic prediction. Thus, for example, decreasing the output for one negative item from 0.4 to 0.1 while simultaneously decreasing the output for a positive data item from 0.51 to 0.49 represents an improvement in the loss function value while reducing the corresponding fairness BPS, as the classification of the second item became incorrect.

Sigmoided Loss Function. One way to address this decoupling is to use the proposed sigmoided fairness loss functions. Figure 3 shows the corresponding results using the sigmoided versions of the fairness loss, LFs(FNR,1), LFs(FPR,1), and LFs(FNR,1) + LFs(FPR,1). Again, BPS values, Accuracy, and loss function values are shown.

Compared to the linear results in Figure 2, these results show less decoupling between loss and fairness as expected, tend to impose higher levels of fairness earlier, and maintain fairness more reliably. However, they also show a stronger degradation in accuracy and, when looking at the regularization loss for higher weights and the corresponding variances, also exhibit strong signs of instabilities. This implies that while there are advantages in terms of how well sigmoided loss functions represent fairness, optimizing them is significantly harder, leading to less stable convergence. When choosing between these two options it is thus important to consider this tradeoff and to have ways to monitor gradient stability.

Figure 3: BPS Measures, Accuracy, and Loss function values as a function of regularization weight, α, for sigmoided loss functions LFs(FPR,1) (left), LFs(FNR,1) (middle), and LFs(FPR,1) + LFs(FNR,1) (right).

Figure 4: BPS Measures, Accuracy, and Loss function values as a function of regularization weight, α, for different powers of the continuous loss functions LFc(FPR,1) (left), LFc(FPR,2) (center-left), LFc(FPR,3) (center-right), and LFc(FPR,4) (right).

Effect of Loss Function Power. Another way to modify the effect of regularization losses is to increase the loss function power. This reduces the impact of small amounts of bias near a fair solution while increasing the importance when fairness losses are high. The goal is to reduce the small-scale adjustments near a solution that are most responsible for decoupling. Figure 4 shows the effect of different powers for the continuous loss LFc(FPR,k), the case where strong decoupling occurred.

These graphs show that as the power increases, the improvement in fairness becomes smoother and the decoupling of loss and fairness occurs significantly later. However, a larger weight is needed to optimize, with power 4 not reaching the best fairness. This might be a problem in datasets with larger fairness loss. The best performance seems to be achieved with power 3.

Effects in More Balanced Data. Dataset 1 contains relatively biased data, reflected in the low base fairness scores of 70 - 80 and an accuracy of 65%. Thus it offers significant room for improvement in fairness without loss of accuracy. To see if similar benefits can be achieved in less biased datasets, the experiments were repeated on Dataset 2, which has initial fairness near 90 and an accuracy of 89%. To see how similar settings work here, Figure 5 shows results for the best continuous and sigmoided settings for FPR-based regularization. In particular, it shows the continuous case with power 3, LFc(FPR,3), and the sigmoided case of LFs(FPR,1).

These results show that even for less biased datasets where fairness gain is more difficult, the proposed method can increase fairness without a dramatic drop in accuracy. However, improved fairness in one metric here yields degradations in others. This is not unexpected as the smaller improvement potential is bound to require more tradeoffs.

Figure 5: BPS, Accuracy, and Loss values for Dataset 2 with LFc(FPR,3) (left), and sigmoided LFs(FPR,1) (right).

Performance on Adult Income Data

To show applicability beyond recidivism and to compare performance with other state-of-the-art approaches, we applied our approach on Dataset 3 and compared the results with those in (Krasanakis et al. 2018; Beutel et al. 2017; Zafar et al. 2017; Kamishima et al. 2012).

Neural Network Architectures

To find the best network we experimented with a number of hyperparameters to find the network with the highest baseline accuracy. Starting from Architecture 1 from the recidivism datasets (two ReLU hidden layers, each with 108 neurons), we derived Architecture 2 (two leaky ReLU hidden layers with 108 and 324 neurons, respectively), yielding a small baseline accuracy increase from 84.5% to 84.75%.

Figure 6: BPS Measures, Accuracy, and Loss function values as a function of regularization weight, α, for Architecture 2 on Adult Income data for LFc(STP,1) (left), LFc(STP,2) (center-left), LFc(STP,3) (center-right), and LFc(STP,4) (right).

Table 1: Adult Income Accuracy and pRule Comparison.
Architecture 1: STP loss function, k=4, α=0.8
Architecture 2: STP loss function, k=4, α=0.84

Fairness Technique                        pRule    acc
(Krasanakis et al. 2018) (no fairness)     27%     85%
(Krasanakis et al. 2018)                  100%     82%
(Zafar et al. 2017)                        94%     82%
(Kamishima et al. 2012)                    85%     83%
Baseline (current work)                    34%     85%
Architecture 1 (current work)              99%     82%
Architecture 2 (current work)             100%     83%

Results and Comparison

Effect of Loss Function Parameters. Varying the loss function parameters again yielded very similar observations, as shown in Figure 6 for Architecture 2 and various powers. The graphs show results for the positivity rate (P(C = 1 | X)) as the underlying measure which, as a BPS score, corresponds to Statistical Parity. As previously, increasing the power reduced decoupling. Studies with the sigmoided loss further demonstrated the same behavior as for the recidivism data, demonstrating the framework's generality. Here, power 4 on the continuous loss achieved the best results.
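Although the paper does not spell out its continuous prediction-rate measure, one natural choice, by analogy with Eq. (2), is the mean predicted probability of the positive class per subgroup, as sketched below; this is an assumption on our part.

# Illustrative PyTorch sketch of a continuous positivity-rate measure for the
# statistical-parity (STP) BPS used on the Adult Income data.
import torch

def continuous_prediction_rate(y, a, group):
    """Mean predicted probability of the positive class within subgroup A = group."""
    in_group = (a == group)
    return y[in_group].mean()

# The STP fairness loss would then follow Eq. (5):
# LF_STP = (1 - min(rate0, rate1) / max(rate0, rate1)) ** k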

Comparison with State of the Art Results. To compare the approach to recent previous work, we compared it to (Krasanakis et al. 2018), which optimized the p-rule (i.e. statistical parity), and to (Zhang, Lemoine, and Mitchell 2018), which used adversarial training to achieve results with closely matching FPR and FNR across genders.

To compare to the p-rule, we utilized BPS on the prediction rate as our fairness measure. As seen in the previous experiments, applying BPS-based fairness again changed the overall accuracy only to a small degree, yielding a drop from 84.5% to 82.3% using Architecture 1, while the p-rule improved from 33.9% to 99.128%. Using Architecture 2 with a power 4 continuous loss and α = 0.84, the p-rule and accuracy further improved to 99.9% and 83%, as shown in Table 1. As shown, we achieved higher accuracy than (Krasanakis et al. 2018) while maintaining a pRule of approximately 100%.

Table 2: Adult Income FPR and FNR Equality Comparison.
Arch 1: FPR-FNR sigmoided loss function, k=4, α1=0.05, α2=0.05.
Arch 2: FPR-FNR sigmoided loss function, k=3, α1=0.1, α2=0.125.

                              Female              Male            BPS
                          Without    With    Without    With
Beutel et al. 2017   FPR   0.1875   0.0308   0.1200    0.1778    17.3
                     FNR   0.0651   0.0822   0.1828    0.1520    52.6
Zhang et al. 2018    FPR   0.0248   0.0647   0.0917    0.0701    92.0
                     FNR   0.4492   0.4458   0.3667    0.4349    97.5
Arch 1               FPR   0.0319   0.0589   0.1203    0.0628    93.9
                     FNR   0.4098   0.4431   0.3739    0.5105    86.8
Arch 2               FPR   0.0319   0.0610   0.1203    0.0708    93.3
                     FNR   0.4098   0.4785   0.3739    0.4862    98.4

To compare to the work in (Zhang, Lemoine, and Mitchell 2018), who aim to achieve similar FNR and FPR values for both genders, we utilized a combined BPS_FNR and BPS_FPR fairness loss. As shown in Table 2, the FPR and FNR using our technique with Architecture 1 were approximately equal across genders, at 0.0589 versus 0.0628 and at 0.4431 versus 0.5105, respectively. Architecture 2 further improved BPS_FPR and BPS_FNR while maintaining low absolute values of FPR and FNR for the two gender-based cohorts, and thus outperformed the BPS_FPR and BPS_FNR in (Zhang, Lemoine, and Mitchell 2018).

Conclusions

In this work we proposed Bias Parity Score-based fairness metrics and an approach to translate them into corresponding loss functions to be used as regularization terms during training of deep Neural Networks to actively achieve improved fairness between subpopulations. For this, we introduced a family of fairness loss functions and conducted experiments on recidivism prediction to investigate the applicability and behavior of the approach for different hyperparameters. We demonstrated the generality of the approach and discussed considerations for loss function choices. This work does not depend on changing input data or output labels to make fair predictions, while not forsaking accuracy.

Additional experiments performed on the common Adult Income dataset to compare the approach to more specialized state-of-the-art approaches showed that the proposed regularization framework is highly flexible and, when provided with an appropriate BPS-based fairness measure, can compete with and even outperform these methods.

References

[Abbasi, Bhaskara, and Venkatasubramanian 2021] Abbasi, M.; Bhaskara, A.; and Venkatasubramanian, S. 2021. Fair clustering via equitable group representations. In Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency, 504–514.

[Abraham, Sundaram, and others 2019] Abraham, S. S.; Sundaram, S. S.; et al. 2019. Fairness in clustering with multiple sensitive attributes. arXiv preprint arXiv:1910.05113.

[Ahsen, Ayvaci, and Raghunathan 2019] Ahsen, M. E.; Ayvaci, M. U. S.; and Raghunathan, S. 2019. When algorithmic predictions use human-generated data: A bias-aware classification algorithm for breast cancer diagnosis. Information Systems Research 30(1):97–116.

[Angwin et al. 2016] Angwin, J.; Larson, J.; Mattu, S.; and Kirchner, L. 2016. Machine bias: There’s software used across the country to predict future criminals. And it’s biased against blacks. ProPublica.

[Barocas and Selbst 2016] Barocas, S., and Selbst, A. D. 2016. Big data’s disparate impact. Calif. L. Rev. 104:671.

[Baudry and Robert 2019] Baudry, M., and Robert, C. Y. 2019. A machine learning approach for individual claims reserving in insurance. Applied Stochastic Models in Business and Industry 35(5):1127–1155.

[Beutel et al. 2017] Beutel, A.; Chen, J.; Zhao, Z.; and Chi, E. H. 2017. Data decisions and theoretical implications when adversarially learning fair representations. arXiv preprint arXiv:1707.00075.

[Bhati and Roman 2014] Bhati, A., and Roman, C. G. 2014. Evaluating and quantifying the specific deterrent effects of DNA databases. Evaluation Review 38(1):68–93.

[Calders et al. 2013] Calders, T.; Karim, A.; Kamiran, F.; Ali, W.; and Zhang, X. 2013. Controlling attribute effect in linear regression. In 2013 IEEE 13th International Conference on Data Mining, 71–80. IEEE.

[Calders, Kamiran, and Pechenizkiy 2009] Calders, T.; Kamiran, F.; and Pechenizkiy, M. 2009. Building classifiers with independency constraints. In 2009 IEEE International Conference on Data Mining Workshops, 13–18. IEEE.

[Celis et al. 2019] Celis, L. E.; Huang, L.; Keswani, V.; and Vishnoi, N. K. 2019. Classification with fairness constraints: A meta-algorithm with provable guarantees. In Proceedings of the Conference on Fairness, Accountability, and Transparency, 319–328.

[Chierichetti et al. 2018] Chierichetti, F.; Kumar, R.; Lattanzi, S.; and Vassilvitskii, S. 2018. Fair clustering through fairlets. arXiv preprint arXiv:1802.05733.

[Chouldechova and Roth 2018] Chouldechova, A., and Roth, A. 2018. The frontiers of fairness in machine learning. arXiv preprint arXiv:1810.08810.

[Dal Pozzolo et al. 2014] Dal Pozzolo, A.; Caelen, O.; Le Borgne, Y.-A.; Waterschoot, S.; and Bontempi, G. 2014. Learned lessons in credit card fraud detection from a practitioner perspective. Expert Systems with Applications 41(10):4915–4928.

[Dressel and Farid 2018] Dressel, J., and Farid, H. 2018. The accuracy, fairness, and limits of predicting recidivism. Science Advances 4(1):eaao5580.

[Dua and Graff 2017] Dua, D., and Graff, C. 2017. UCI machine learning repository.

[Feldman et al. 2015] Feldman, M.; Friedler, S. A.; Moeller, J.; Scheidegger, C.; and Venkatasubramanian, S. 2015. Certifying and removing disparate impact. In Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 259–268.

[Feldman 2015] Feldman, M. 2015. Computational fairness: Preventing machine-learned discrimination. Ph.D. Dissertation.

[Gianfrancesco et al. 2018] Gianfrancesco, M. A.; Tamang, S.; Yazdany, J.; and Schmajuk, G. 2018. Potential biases in machine learning algorithms using electronic health record data. JAMA Internal Medicine 178(11):1544–1547.

[Hardt, Price, and Srebro 2016] Hardt, M.; Price, E.; and Srebro, N. 2016. Equality of opportunity in supervised learning. In Advances in Neural Information Processing Systems, 3315–3323.

[Huang, Chen, and Wang 2007] Huang, C.-L.; Chen, M.-C.; and Wang, C.-J. 2007. Credit scoring with a data mining approach based on support vector machines. Expert Systems with Applications 33(4):847–856.

[Iosifidis and Ntoutsi 2020] Iosifidis, V., and Ntoutsi, E. 2020. Fabboo - online fairness-aware learning under class imbalance. In Discovery Science.

[Jain et al. 2019] Jain, B.; Huber, M.; Fegaras, L.; and Elmasri, R. A. 2019. Singular race models: addressing bias and accuracy in predicting prisoner recidivism. In Proceedings of the 12th ACM International Conference on PErvasive Technologies Related to Assistive Environments, 599–607. ACM.

[Jain et al. 2020a] Jain, B.; Huber, M.; Elmasri, R.; and Fegaras, L. 2020a. Using bias parity score to find feature-rich models with least relative bias. Technologies 8(4):68.

[Jain et al. 2020b] Jain, B.; Huber, M.; Elmasri, R. A.; and Fegaras, L. 2020b. Reducing race-based bias and increasing recidivism prediction accuracy by using past criminal history details. In Proceedings of the 13th ACM International Conference on PErvasive Technologies Related to Assistive Environments, 1–8.

[Jiang and Nachum 2020] Jiang, H., and Nachum, O. 2020. Identifying and correcting label bias in machine learning. In International Conference on Artificial Intelligence and Statistics, 702–712. PMLR.

[Kamishima et al. 2012] Kamishima, T.; Akaho, S.; Asoh, H.; and Sakuma, J. 2012. Fairness-aware classifier with prejudice remover regularizer. In Joint European Conference on Machine Learning and Knowledge Discovery in Databases, 35–50. Springer.


[Kingma and Ba 2014] Kingma, D. P., and Ba, J. 2014. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980.

[Kleinberg, Mullainathan, and Raghavan 2016] Kleinberg, J.; Mullainathan, S.; and Raghavan, M. 2016. Inherent trade-offs in the fair determination of risk scores. arXiv preprint arXiv:1609.05807.

[Kohavi and others 1996] Kohavi, R., et al. 1996. Scaling up the accuracy of naive-Bayes classifiers: A decision-tree hybrid. In KDD, volume 96, 202–207.

[Krasanakis et al. 2018] Krasanakis, E.; Spyromitros-Xioufis, E.; Papadopoulos, S.; and Kompatsiaris, Y. 2018. Adaptive sensitive reweighting to mitigate bias in fairness-aware classification. In Proceedings of the 2018 World Wide Web Conference, 853–862.

[Louizos et al. 2015] Louizos, C.; Swersky, K.; Li, Y.; Welling, M.; and Zemel, R. 2015. The variational fair autoencoder. arXiv preprint arXiv:1511.00830.

[Mehrabi et al. 2019] Mehrabi, N.; Morstatter, F.; Saxena, N.; Lerman, K.; and Galstyan, A. 2019. A survey on bias and fairness in machine learning. arXiv preprint arXiv:1908.09635.

[Mishler, Kennedy, and Chouldechova 2021] Mishler, A.; Kennedy, E. H.; and Chouldechova, A. 2021. Fairness in risk assessment instruments: Post-processing to achieve counterfactual equalized odds. In Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency, 386–400.

[Noriega-Campero et al. 2019] Noriega-Campero, A.; Bakker, M. A.; Garcia-Bulle, B.; and Pentland, A. 2019. Active fairness in algorithmic decision making. In Proceedings of the 2019 AAAI/ACM Conference on AI, Ethics, and Society, 77–83.

[Ntoutsi et al. 2020] Ntoutsi, E.; Fafalios, P.; Gadiraju, U.; Iosifidis, V.; Nejdl, W.; Vidal, M.-E.; Ruggieri, S.; Turini, F.; Papadopoulos, S.; Krasanakis, E.; et al. 2020. Bias in data-driven artificial intelligence systems—an introductory survey. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery 10(3):e1356.

[Oneto, Donini, and Pontil 2020] Oneto, L.; Donini, M.; and Pontil, M. 2020. General fair empirical risk minimization. In 2020 International Joint Conference on Neural Networks (IJCNN), 1–8. IEEE.

[Orphanou et al. 2021] Orphanou, K.; Otterbacher, J.; Kleanthous, S.; Batsuren, K.; Giunchiglia, F.; Bogina, V.; Tal, A. S.; Kuflik, T.; et al. 2021. Mitigating bias in algorithmic systems: A fish-eye view of problems and solutions across domains. arXiv preprint arXiv:2103.16953.

[Ozkan 2017] Ozkan, T. 2017. Predicting Recidivism Through Machine Learning. Ph.D. Dissertation.

[Santurkar et al. 2018] Santurkar, S.; Tsipras, D.; Ilyas, A.; and Madry, A. 2018. How does batch normalization help optimization? In Advances in Neural Information Processing Systems, 2483–2493.

[Srivastava et al. 2014] Srivastava, N.; Hinton, G.; Krizhevsky, A.; Sutskever, I.; and Salakhutdinov, R. 2014. Dropout: a simple way to prevent neural networks from overfitting. The Journal of Machine Learning Research 15(1):1929–1958.

[United States Department of Justice 2010] United States Department of Justice, Office of Justice Programs, Bureau of Justice Statistics. 2010. Criminal recidivism in a large cohort of offenders released from prison in Florida, 2004-2008. Available at https://www.icpsr.umich.edu/icpsrweb/NACJD/studies/27781/version/1/variables (2010/07/29).

[United States Department of Justice 2014] United States Department of Justice, Office of Justice Programs, Bureau of Justice Statistics. 2014. Recidivism of prisoners released in 1994. Available at https://www.icpsr.umich.edu/icpsrweb/NACJD/studies/3355/variables (2014/12/05).

[Verma and Rubin 2018] Verma, S., and Rubin, J. 2018. Fairness definitions explained. In 2018 IEEE/ACM International Workshop on Software Fairness (FairWare), 1–7. IEEE.

[Zafar et al. 2015] Zafar, M. B.; Valera, I.; Rodriguez, M. G.; and Gummadi, K. P. 2015. Learning fair classifiers. arXiv preprint arXiv:1507.05259 1(2).

[Zafar et al. 2017] Zafar, M. B.; Valera, I.; Gomez Rodriguez, M.; and Gummadi, K. P. 2017. Fairness beyond disparate treatment & disparate impact: Learning classification without disparate mistreatment. In Proceedings of the 26th International Conference on World Wide Web, 1171–1180.

[Zemel et al. 2013] Zemel, R.; Wu, Y.; Swersky, K.; Pitassi, T.; and Dwork, C. 2013. Learning fair representations. In International Conference on Machine Learning, 325–333.

[Zeng, Ustun, and Rudin 2017] Zeng, J.; Ustun, B.; and Rudin, C. 2017. Interpretable classification models for recidivism prediction. Journal of the Royal Statistical Society: Series A (Statistics in Society) 180(3):689–722.

[Zhang, Lemoine, and Mitchell 2018] Zhang, B. H.; Lemoine, B.; and Mitchell, M. 2018. Mitigating unwanted biases with adversarial learning. In Proceedings of the 2018 AAAI/ACM Conference on AI, Ethics, and Society, 335–340.