Race and Data: Race as a Predictive Factor

“Hi Siri - what’s the weather today?”

“Okay Google, is there life on Mars?”

“Alexa, add toothpaste to my cart.”

It is hard to argue that data and technology have not fundamentally changed our day-to-day lives, in many ways for the better. It is no surprise, then, that the growing application of data in the social sector has generated excitement among stakeholders: government officials, service providers, and philanthropic partners. From assessing policy decisions to determining resource allocations, the use of data and evaluation is slowly becoming the norm rather than the exception.

At Third Sector, we partner with state and county governments to enhance how social services are contracted, aligning funding to the achievement of positive life outcomes. As part of this movement towards greater outcomes orientation, data access and evaluation have been core tenets of our theory of change. Access to data has allowed us to evaluate the unmet need across jurisdictions, identify the characteristics of the intended beneficiary population, and define outcomes that help us quantify impact for the community.

At first glance, this probably sounds great; we should all strive to be more data-driven. However, just as we are susceptible to availability bias in decision-making, evaluation tends to over-rely on accessible data, which exacerbates the following issues:

  • Machine Bias: More than 60 American police departments have adopted a form of “predictive policing” to guide where police patrol, whom they target, and how crime is investigated. Even with efforts to control for racial disparities, machine bias can distort policing because the algorithms build on historical data that, through the omission of key data fields, carries human prejudices.
  • Barriers to Inclusive Participation: For individuals to fully participate in state or federal data collection efforts (e.g., U.S. Census), they must trust the government. Consequently, a lack of trust has led to the underrepresentation of historically marginalized communities, which inadvertently creates biased counts that are used to inform policy decisions.
  • Human-encoded Bias: With the growing reliance on artificial intelligence, it is easy to assume that unbiased data will produce reliable and trustworthy results. However, it is important to remember that algorithms are designed by humans who bring prior experiences and contextual beliefs. As a result, the biases embedded in the questions asked and the data used are inadvertently codified in the analysis.
  • Inattention to Culturally Sensitive Metrics: The data collected tend to reflect the needs and wants of the individuals designing the dataset. For most administrative datasets, the designers generally assume the perspective of the government and its measures of impact or compliance needs. If the community has different conceptions of success, those viewpoints are often left out.

Given the existence of structural biases in datasets and inadequate attention to inclusion, Third Sector is taking the time to critically examine how we can be more deliberate when working with data in our local communities.

Whether you have been to the doctor, filled out a customer survey, or applied for a loan, you have probably come across a question related to your race or ethnicity. Although these fields are usually optional, they are often readily available in datasets and are therefore commonly leveraged when evaluating events, services, or ideas. Correlations of outcomes with race can reinforce stereotypes, justify differential policing, or create other negative effects. We need not only to exercise caution when distinguishing between correlation and causation, but also to continue asking “why” and seeking out other factors, even when those data elements are harder to access. For example, de facto neighborhood segregation turns geography into a surrogate measure of race. Correlates such as this obscure the degree to which race is a predictor of negative outcomes. Thus, we must be careful not to keep over-relying on the data that happens to be available.
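To make the point about surrogate measures concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the counts are invented, the group labels "A" and "B" and the neighborhoods "North" and "South" are hypothetical, and the helper `rates` is ours rather than part of any real evaluation. It compares the rate of a negative outcome by group alone with the same rate broken out by group and neighborhood.

```python
from collections import defaultdict

# Invented counts for illustration only (not real data):
# (group, neighborhood, number of people, number with a negative outcome)
ROWS = [
    ("A", "North", 80, 8),
    ("B", "North", 20, 2),
    ("A", "South", 20, 6),
    ("B", "South", 80, 24),
]


def rates(rows, key):
    """Return the negative-outcome rate for each label produced by key(row)."""
    people = defaultdict(int)
    negatives = defaultdict(int)
    for row in rows:
        label = key(row)
        people[label] += row[2]
        negatives[label] += row[3]
    return {label: negatives[label] / people[label] for label in people}


# Rate by group only, ignoring where people live.
overall_by_group = rates(ROWS, key=lambda r: r[0])

# Rate by group within each neighborhood.
by_group_and_neighborhood = rates(ROWS, key=lambda r: (r[0], r[1]))

if __name__ == "__main__":
    for label, rate in sorted(overall_by_group.items()):
        print(f"group {label}, all neighborhoods: {rate:.0%}")
    for (group, hood), rate in sorted(by_group_and_neighborhood.items()):
        print(f"group {group}, {hood}: {rate:.0%}")
```

In this invented data the overall rates differ by group (14% for A, 26% for B), yet within each neighborhood the two groups have identical rates (10% in North, 30% in South). The gap in the aggregate comes entirely from where people live, which is exactly the kind of pattern that goes unnoticed when only the readily available field is examined. Real data will rarely separate this cleanly, so the sketch shows why asking "why" matters, not how to settle the question.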

This past year, Third Sector has made a commitment to diversity, equity, and inclusion (DEI) by deliberately thinking about the role of DEI in shaping our organizational identity, as well as its implications for our work in the field. This post is the first in a three-part series; the next two posts will dive deeper into how we are applying the lessons we have learned about balancing intentionality with impact, especially as they relate to broadening our scope, disaggregating data, and embedding community voice.