6.2 Explanation Faithfulness (RQ1, RQ2)
Figure 5 shows how the fairness (i.e., Head-tailed Rate@K) and recommendation (i.e., NDCG@K) performance of our CFairER and the baselines change with erasure, where the x-axis shows the erasure iteration and the y-axis shows the corresponding fairness and recommendation performance at \(K=\lbrace 5,20\rbrace\). Each data point in Figure 5 is generated by cumulatively erasing a batch of attributes. The erased attributes are selected from, at most, the top 10 (i.e., \(E=10\)) attribute sets of the explanation lists provided by each method. As the PopUser and PopItem baselines exhibit very similar trends, we do not present both in Figure 5. In addition, we plot the relationship between fairness and recommendation performance at each erasure iteration in Figure 6, showcasing the fairness-accuracy trade-off of our CFairER and the baselines. Table 2 presents the final recommendation and fairness performance of all methods after erasing \(E = \lbrace 5, 10, 20\rbrace\) attributes from the explanations. Note that in Figure 5, Figure 6, and Table 2, larger NDCG@K and Hit Ratio@K values indicate better recommendation performance, while smaller Head-tailed Rate@K and Gini@K values indicate better fairness. Analyzing Figure 5, Figure 6, and Table 2, we have the following findings.
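To make the evaluation metrics concrete, the sketch below implements common formulations of NDCG@K, Head-tailed Rate@K, and a Gini coefficient over item exposure. These are standard definitions assumed for illustration; the paper's exact metric definitions (e.g., how head items are chosen) may differ.

```python
import numpy as np

def ndcg_at_k(ranked_relevance, k):
    """NDCG@K for one user; `ranked_relevance` is 0/1 relevance in rank order."""
    rel = np.asarray(ranked_relevance[:k], dtype=float)
    if rel.sum() == 0:
        return 0.0
    discounts = 1.0 / np.log2(np.arange(2, rel.size + 2))
    dcg = float((rel * discounts).sum())
    idcg = float((np.sort(rel)[::-1] * discounts).sum())
    return dcg / idcg

def head_tailed_rate_at_k(rec_lists, head_items, k):
    """Fraction of top-K recommendation slots occupied by head (popular)
    items; lower is fairer (assumed definition)."""
    hits = total = 0
    for recs in rec_lists:
        topk = recs[:k]
        hits += sum(1 for i in topk if i in head_items)
        total += len(topk)
    return hits / total

def gini_at_k(rec_lists, n_items, k):
    """Gini coefficient of item exposure across top-K lists; 0 = even exposure."""
    exposure = np.zeros(n_items)
    for recs in rec_lists:
        for i in recs[:k]:
            exposure[i] += 1
    x = np.sort(exposure)           # ascending, for the Lorenz-curve formula
    cum = np.cumsum(x)
    if cum[-1] == 0:
        return 0.0
    n = x.size
    return float((n + 1 - 2 * (cum / cum[-1]).sum()) / n)
```

With these helpers, each erasure iteration simply re-scores the recommendation lists to produce one (fairness, accuracy) point.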
From Table 2, we observe that our CFairER achieves the best recommendation and fairness performance among all methods after erasing attributes from its explanations. For instance, CFairER beats the strongest baseline CEF by 25.9%, 24.4%, 8.3%, and 36.0% on NDCG@40, Hit Ratio@40, Head-tailed Rate@40, and Gini@40 with erasure length \(E=20\) on Yelp. This indicates that, compared with all baseline methods, the explanations generated by CFairER are more faithful in explaining unfair factors while not harming recommendation accuracy. Unlike the heuristic approaches (i.e., RDExp, PopUser, and PopItem) that rely on random attribute selection, CFairER incorporates a more principled attribute selection mechanism for fairness explanation generation. In particular, CFairER prioritizes relevant attributes that change model fairness by using a counterfactual reward (cf. Equation (9)), incorporating the two criteria of Rationality and Proximity to ensure the quality of the generated counterfactual explanations. FairKGAT mitigates the unfairness in explanation diversity caused by differing user activeness, but ignores the impact of item exposure imbalance on explanation fairness. Our CFairER mitigates the unfairness of item exposure to promote the fair allocation of user-preferred but less exposed items, thus achieving better recommendation and fairness performance than FairKGAT. Regarding CEF, although it generates counterfactual explanations as our CFairER does, it conducts feature-level optimizations and does not apply to discrete attributes such as gender and age. Our CFairER uses an off-policy learning agent to directly visit attributes in a given HIN, adapting to both discrete and continuous attributes when finding counterfactual explanations. As a result, our CFairER outperforms CEF due to its generalizability to discrete attributes. Another interesting finding is that PopUser and PopItem perform even worse than RDExp (i.e., randomly selecting attributes) on LastFM. Recommendation models largely recommend items that have popular attributes favored by users. Though intuitive, recommending items with popular attributes deprives less-noticeable items of exposure, causing serious model unfairness and degraded recommendation performance. This further highlights the importance of mitigating item exposure unfairness in recommendations.
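The Rationality and Proximity criteria behind the counterfactual reward can be sketched as follows. This is an illustrative stand-in, not the paper's actual Equation (9): it rewards the drop in an unfairness measure after an intervention (Rationality) and penalizes the size of the counterfactual change to keep explanations minimal (Proximity). The trade-off weight `lam` is a hypothetical parameter.

```python
def counterfactual_reward(fairness_before, fairness_after, n_erased, lam=0.1):
    """Illustrative counterfactual reward (assumed form, not the paper's
    exact Eq. (9)): larger when erasing attributes improves fairness
    (Rationality) with as few erased attributes as possible (Proximity)."""
    rationality = fairness_before - fairness_after  # > 0 if unfairness dropped
    proximity = -lam * n_erased                     # prefer minimal explanations
    return rationality + proximity
```

Under this form, an attribute set that achieves the same fairness gain with fewer erasures receives a strictly higher reward, which is the behavior the two criteria are meant to enforce.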
From Figure 5, the fairness of all models consistently improves as attributes are erased from explanations, shown by the decreasing trend of the Head-tailed Rate@K values. Figure 5 also shows a decreasing trend of the NDCG@K values, i.e., decreased recommendation performance. The improved fairness of all methods is reasonable: erasing attributes, even those selected randomly from the attribute sets, can remove unfair factors and thus narrow the representation gap between popular and long-tailed items. This is consistent with the finding in CEF [19]. Unfortunately, we also observe degraded recommendation performance for all models in Figure 5, as likewise evidenced by CEF [19]. For example, in Figure 5, the NDCG@5 of CEF drops from approximately 1.17 to 0.60 on LastFM between erasure iterations 0 and 50. This is due to the well-known fairness-accuracy trade-off, in which the fairness constraint is achieved at the cost of recommendation performance. Facing this issue, the baselines suffer from large declines in recommendation performance. On the contrary, our CFairER still enjoys favorable recommendation performance and outperforms all baselines. Moreover, the decline rates of our CFairER are much slower than those of the baselines on both datasets in Figure 5. We hence conclude that the attribute-level explanations provided by our CFairER achieve a much better fairness-accuracy trade-off than the other methods. This is because our CFairER finds minimal but vital attributes as explanations for model fairness: the attributes produced by CFairER are fairness-related factors rather than ones that affect recommendation accuracy. As a result, our CFairER alleviates item exposure unfairness while maintaining stable recommendation performance. Another finding is that our CFairER may not outperform FairKGAT and PopUser in fairness evaluations when the number of erasure iterations is insufficient. In Figure 5(a), the HT@5 of FairKGAT and PopUser degrades more quickly than that of CFairER at the beginning of the erasure iterations. This can be attributed to the fact that FairKGAT and PopUser construct explanations of a fixed length (i.e., \(N=20\)), which may absorb more attributes into each explanation than CFairER's adaptive explanation length, which is kept minimal for each explanation. Consequently, CFairER may not have as many opportunities to hit the attributes that explain model unfairness as FairKGAT and PopUser, resulting in a slower decrease of HT@5 during the initial erasure iterations. However, as more erasure iterations are performed, CFairER exhibits stable performance and eventually surpasses FairKGAT and PopUser in fairness evaluations. This indicates that CFairER consistently discovers suitable explanations that align with model fairness during the learning process. The explanations generated by CFairER prioritize simple yet essential attributes, in contrast to the complex combinations of attributes used by FairKGAT and PopUser.
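The cumulative erasure protocol underlying Figure 5 can be sketched as a simple loop. The `model.erase(attrs)` and `eval_fn(model)` interfaces below are assumptions for illustration, not the paper's actual API: each iteration erases one more batch of attributes drawn from a method's explanation lists and re-evaluates fairness and accuracy to produce one data point.

```python
def erasure_curve(model, explanations, eval_fn, batch_size=5, iterations=10):
    """Cumulatively erase batches of explanation attributes and record the
    (fairness, accuracy) pair after each batch (one point per iteration).
    `model.erase` and `eval_fn` are hypothetical interfaces."""
    curve = []
    erased = set()
    pool = [a for expl in explanations for a in expl]  # flatten explanation lists
    for _ in range(iterations):
        batch = [a for a in pool if a not in erased][:batch_size]
        if not batch:                 # explanation attributes exhausted
            break
        erased.update(batch)
        model.erase(batch)            # remove the attributes from the model
        curve.append(eval_fn(model))  # e.g., (HT@K, NDCG@K) after this batch
    return curve
```

Plotting the first component of each pair against the iteration index reproduces a fairness curve of the kind shown in Figure 5.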
Figure 6 provides deeper insight into the fairness-accuracy trade-off, specifically the relationship between the fairness and recommendation performance metrics, i.e., Head-tailed Rate@K and NDCG@K. Analyzing Figure 6, we find that our CFairER achieves the best fairness-accuracy trade-off among all methods on the Douban Movie and LastFM datasets. This is evident from the blue curves positioned to the left-hand side of the diagonals in Figure 6(c)(d)(e)(f). Moreover, we observe that RDExp performs the poorest on the Douban Movie dataset, while PopUser exhibits the worst performance on the LastFM dataset. In our experiments, the trade-off is caused by the disagreement between the goals of item exposure fairness and user preference. While we aim to align fair allocations with item exposures, the recommendation model primarily focuses on selecting items similar to those in users' historical interactions. We thus conclude that the attributes found by our CFairER are not necessarily similar to the attributes of historical items. Instead, they are sensitive attributes that cause the recommendations to favor historically popular items. Another finding is that our CFairER initially does not outperform the other baselines during the early erasure process on the Yelp dataset, as depicted in Figure 6(a)(b). We attribute this sub-optimal performance on Yelp to the extremely sparse nature of the dataset, i.e., a density of only 0.086%. Compared with Douban Movie (0.63% density) and LastFM (0.28% density), Yelp records a larger number of users that have few interactions with items. While the baseline methods may rely on popular attributes to predict the preferences of those users, CFairER takes a different approach: it aims to identify sensitive attributes that are not necessarily the most popular ones. Consequently, at the beginning of the erasure process, CFairER may struggle to adapt to the data sparsity of the Yelp dataset, resulting in sub-optimal performance. However, as more erasure iterations are performed, CFairER surpasses the baseline models and achieves the best fairness-accuracy trade-off. This demonstrates the effectiveness of CFairER in progressively refining its attribute selection and, with more iterations, identifying the attributes that better explain item exposure fairness and align with user preferences.
6.4 Parameter Analysis (RQ4)
We analyze how the erasure length E (cf. Section 6.1.3) and the candidate size n (as in Equation (7)) impact the performance of CFairER. The erasure length E is the number of erased attributes selected from each explanation, which determines the erasure size for evaluation. The candidate size n is the number of candidate actions selected by our attentive action pruning. We present the evaluation results of CFairER under different E and n on Yelp and LastFM in Figure 7. Since the results on Douban Movie lead to similar conclusions, we do not present them here.
The performance of CFairER first decreases from \(E=5\) and then becomes stable after \(E=10\). The decreased performance is due to the increasing erasure of attributes identified by our generated explanations, which indicates that CFairER finds valid attribute-level explanations that impact fair recommendations. After the lowest point, the performance degrades only slightly and then stabilizes. This is reasonable: the number of attributes provided in the datasets is limited, so increasing the erasure length mainly re-selects attributes that overlap with previous erasures.
Varying the candidate size n over \(n=\lbrace 10, 20, 30, 40, 50, 60\rbrace\) in Figure 7(c)(d), we observe that the performance of CFairER first improves drastically as the candidate size increases on both datasets. The performance of our CFairER peaks at \(n=40\) and \(n=30\) on Yelp and LastFM, respectively. After the peaks, model performance degrades as the candidate size increases further. We attribute the poorer performance of CFairER before the peaks to the limited candidate pool, i.e., insufficient attributes limit the exploration ability of CFairER to find appropriate candidates as fairness explanations. Meanwhile, a too-large candidate pool (e.g., \(n=60\)) offers more chances for the agent to select inadequate attributes as explanations. Based on these two findings, we believe it is necessary for our CFairER to perform the attentive action search, so as to select high-quality attributes as candidates based on their contributions to the current state.
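The attentive action pruning can be sketched as scoring each candidate attribute against the current state and keeping the top-n. The dot-product scoring and softmax below are a stand-in for the paper's attention mechanism (Equation (7) is not reproduced here); the embedding shapes are assumptions.

```python
import numpy as np

def attentive_action_pruning(state, attribute_embs, n=30):
    """Keep the n candidate attributes with the highest attention weight
    w.r.t. the current state. Dot-product attention is an assumed stand-in
    for the paper's attentive action search."""
    scores = attribute_embs @ state           # (num_attrs,) relevance scores
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                  # softmax over all attributes
    top = np.argsort(-weights)[:n]            # indices of the top-n actions
    return top, weights[top]
```

A moderate n keeps exploration tractable while retaining enough high-quality candidates, which matches the peak behavior observed at \(n=40\) and \(n=30\).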