What 'The Voice' Auditions Reveal About Gender Bias in Hiring

30 Jun 2024


(1) Anuar Assamidanov, Department of Economics, Claremont Graduate University, 150 E 10th St, Claremont, CA 91711. (Email: anuar.assamidanov@cgu.edu).








Gender discrimination in the hiring process is one significant factor contributing to labor market disparities. However, there is little evidence on the extent to which gender bias by hiring managers is responsible for these disparities. In this paper, I exploit a unique dataset of blind auditions from The Voice television show as an experiment to identify own-gender bias in the selection process. In the first televised stage, the blind audition, four noteworthy recording artists serve as coaches and listen to the contestants “blindly” (chairs facing away from the stage) so that they cannot see the contestant. Using a difference-in-differences estimation strategy in a setting where the coach (hiring person) is demonstrably exogenous with respect to the artist’s gender, I find that artists are 4.5 percentage points (11 percent) more likely to be selected when the coach is of the opposite gender. I also use the machine-learning approach in Athey et al. (2018) to examine heterogeneity by team gender composition, order of performance, and failure rates of the coaches. The findings offer a new perspective that enriches past research on gender discrimination, shedding light on how gender bias varies with the gender of the decision maker and team gender composition.

Gender discrimination in the hiring process is one significant factor contributing to labor market disparities (Blau and Kahn, 2007). Gender differences in hiring are challenging to measure, making it difficult to distinguish their effect on employment from confounding variables, such as differences in human capital and relevant skills. However, Baert (2018) has cataloged studies that document gender discrimination in hiring using experimental methods, in which confounding factors are disentangled to estimate the true effect on labor market outcomes.

Studies have found hiring discrimination against women in the labor market, but most of this research has relied on experimental methods (Bertrand and Duflo, 2017; Baert, 2018; Neumark, 2018). Within these experiments, pairs of fictitious job applications, differing only by the gender of the candidate, are sent to actual job openings. Discrimination is identified from the relationship between the employer's subsequent call-back and the gender of the candidate. The correspondence testing methodology is the gold standard for estimating hiring discrimination in the labor market (Baert, 2015). However, in this hiring literature, the gender of hiring decision-makers is unobservable, or decisions are made collectively. Furthermore, these experiments only capture call-back rates and do not go beyond the initial stage of the hiring process. In addition, if employers’ hiring decisions are based on a cut-off rule, differences between the two groups in the variance of unobserved productivity can bias the discrimination measures (Neumark, 2012).

Another part of the literature on gender discrimination concerns own-gender bias (i.e., favoritism toward people of one’s own gender) in hiring decisions. Laboratory experiments often find that women show own-gender bias when information is processed automatically, but these results are not found for men (Rudman and Goodwin, 2004). However, it is unclear whether these results hold in real-world hiring decisions, which are likely to be characterized by a reflective (non-automatic) process. The existing studies on real-world hiring decisions typically consider particular segments of the labor market and find mixed evidence for own-gender bias (Booth and Leigh, 2010; Bagues and Esteve-Volart, 2010; Bagues et al., 2017). Because of the data limitations in these settings, it is hard to determine whether employers favor candidates of their own gender. This paper’s primary purpose is to test whether own-gender bias exists in the hiring process. The findings could inform our understanding of the current labor market at the individual level and the subsequent large-scale implications.

The main difficulty in testing gender bias in the hiring process is that hiring decision-makers and applicants are not matched through a random selection process, so a causal effect on own-gender bias cannot be claimed. In addition, the fundamental identification challenge is that researchers only observe real-world data on workers who have already been hired: if an applicant is not hired, the application is never recorded. Due to these limitations, it remains an open question in this literature whether the gender of the hiring decision-maker determines the gender of the applicants who are hired. In this paper, I exploit a unique dataset of blind auditions from The Voice TV show as a natural experiment to identify own-gender bias in the selection process.

I assess own-gender bias in The Voice show, a setting in which a coach (hiring person) is demonstrably exogenous with respect to the artist’s gender. The first televised stage in The Voice TV show is the blind auditions. The four coaches, all noteworthy recording artists, listen to the contestants in chairs facing away from the stage so that they cannot see the contestants. If a coach likes the contestant’s voice, they press a button to rotate their chair, signaling that they are interested in working with that contestant. The advantage of this setting is that the coaches observe only the voice of the contestant and infer the contestant's probable gender from it. So, due to the “blind” decision process, other characteristics related to the artist and coach are eliminated in this setting. I believe this is a plausible environment in which to use a generalized difference-in-differences identification strategy with random assignment of coaches to participants. Like Price and Wolfers’s (2010) study on racial bias by NBA referees, I compare differences in the probability of female and male contestants being chosen by female and male coaches.
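The comparison described above can be illustrated with a minimal sketch. It assumes a simple record layout of (coach gender, artist gender, whether the chair turned); the records, the function names, and the 2x2 calculation are hypothetical illustrations of the difference-in-differences logic, not the paper's data or its regression specification.

```python
from collections import defaultdict


def selection_rates(records):
    """Share of auditions with a chair turn, per (coach_gender, artist_gender) cell."""
    turns = defaultdict(int)
    totals = defaultdict(int)
    for coach_gender, artist_gender, turned in records:
        key = (coach_gender, artist_gender)
        totals[key] += 1
        turns[key] += int(turned)
    return {key: turns[key] / totals[key] for key in totals}


def own_gender_did(rates):
    """Difference-in-differences of selection rates.

    Positive values point to own-gender favoritism,
    negative values to opposite-gender favoritism.
    """
    female_coach_gap = rates[("F", "F")] - rates[("F", "M")]
    male_coach_gap = rates[("M", "F")] - rates[("M", "M")]
    return female_coach_gap - male_coach_gap


def make_cell(coach_gender, artist_gender, n_turns, n_total):
    """Build made-up audition records for one coach/artist gender cell."""
    return [
        (coach_gender, artist_gender, i < n_turns) for i in range(n_total)
    ]


# Hypothetical toy data: 10 auditions per cell, invented for illustration only.
records = (
    make_cell("F", "F", 4, 10)
    + make_cell("F", "M", 5, 10)
    + make_cell("M", "F", 5, 10)
    + make_cell("M", "M", 4, 10)
)

rates = selection_rates(records)
did = own_gender_did(rates)
print(f"difference-in-differences estimate: {did:+.3f}")
```

With these invented numbers the estimate is negative, which is the sign an opposite-gender bias would produce. A regression version would add controls and fixed effects, which this sketch omits.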

Drawing on several types of inference, the analysis shows systematic evidence of an opposite-gender bias (favoritism toward the opposite gender). Male coaches predominantly prefer female contestants, while female coaches prefer male contestants. Contestants are 4.5 percentage points (11 percent) more likely to be selected by a coach of the opposite gender. To examine heterogeneity more systematically, I adapt the machine-learning approach of Athey and Wager (2019). I extend the analysis to include heterogeneity in the gender composition of each coach’s team, the coaches’ failure rates, and the performance order. The results show that gender bias varies widely with team composition during the selection of new contestants.
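The two ways of stating the effect, 4.5 percentage points and 11 percent, are linked by the baseline selection probability. The short calculation below backs out the baseline those two figures imply. The baseline is not reported in this section, so the result is an inference from rounded numbers and should be read as approximate; the function name is hypothetical.

```python
def implied_baseline_percent(effect_pp, relative_effect):
    """Baseline rate (in percent) implied by an absolute and a relative effect.

    relative_effect = effect_pp / baseline, so baseline = effect_pp / relative_effect.
    """
    return effect_pp / relative_effect


# Reported in the text: 4.5 percentage points, described as 11 percent.
baseline = implied_baseline_percent(effect_pp=4.5, relative_effect=0.11)
print(f"implied baseline selection rate: about {baseline:.0f} percent")
```

In other words, the reported figures are consistent with coaches turning for roughly four in ten auditions when the contestant is of their own gender, under the assumption that the 11 percent is measured relative to that baseline.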

This paper makes two contributions. First, it offers one of the first explorations of hiring practices that uses random variation in gender to examine effects on the selection process. The random variation in gender enables me to overcome potential concerns that observed disparities could be due to unobserved differences across contestant gender. Second, this paper applies the Athey and Wager (2019) causal forest method for machine-learning inference on heterogeneous treatment effects to a quasi-experimental difference-in-differences identification strategy.

This paper is closely related to work by Carlsson and Eriksson (2019). Their work provides related evidence suggesting a role for in-group gender preferences, which have been documented in many other contexts. They investigated in-group and own-gender bias in real-world hiring decisions by combining data on the gender of the recruiter and the share of female employees in many firms with data from a large-scale field experiment on hiring. The results suggested that women (female recruiters or firms with a high share of female employees) favor women in the recruitment process. Nevertheless, their study observes recruiters’ gender for only a small fraction of the participating firms, which might result in considerable measurement error.

Similar to this work, Goldin and Rouse (2000) studied blind auditions in the orchestra context, treating them as a policy intervention to examine sex-biased selection of orchestra members. This paper differs from their work in that I can observe the gender of both the hiring decision-maker and the contestant, and coaches can infer the gender of the performer from their voice. The coaches’ decisions are also made individually, whereas in the orchestra blind auditions the decision was collective. Finally, another unique feature of this setting is that I can analyze the market structure, treating each coach as a firm. I am able to illustrate the market power of the coaches by leveraging the order of the performances and the genre of the songs performed. Controlling for all these factors adds a new dimension to the study of gender discrimination in a competitive market.

The rest of the paper is organized as follows. Section 2 provides a brief background on The Voice, summarizes the blind audition stage of the show, and gives a preview of the estimation strategy. Section 3 describes the data. Section 4 describes the empirical strategy in detail. Section 5 presents results, and Section 6 concludes.

This paper is available on arXiv under a CC 4.0 license.