This post by Lonneke van Kampen is part of our series “Global Digital Cultures in times of COVID-19”, written by students of the research master Media Studies at the University of Amsterdam.
The COVID-19 pandemic has brought attention to global health and socio-economic vulnerabilities, as certain groups find themselves in a more precarious situation than others. Gaining insight into these inequalities through data, however, remains complicated, as it requires an understanding of the infrastructures through which data is produced. Categories such as race and ethnicity often lack a clear definition in medical research and population statistics, and they carry rich and complex histories. In the Dutch medical context, the terminology around race and ethnicity often remains vague and outdated. I concluded my previous blog post by discussing the possibilities of a race-conscious approach to the registration of data on COVID-19 patients. The objective is to reveal the impact of racism, rather than race itself, with regard to health inequalities and COVID-19. Taking this approach, however, one still has to grapple with the complicated terminology of race and its histories.
Part of this history is the connection between data and colonialism, or what Flavia Dzodan refers to as ‘the coloniality of the algorithm’. For Dzodan this refers to the use of contemporary technologies as a tool for racial, gender and class exclusions that can be traced back to the early days of modern capitalism in the eighteenth century. As Anna Carlson argues in a blog post for the DATACTIVE research project: ‘The gathering of data has long been a strategy of colonialism’. Statistics, the ability to record the characteristics of a population based on various categories, was central to the development of the modern state in the early nineteenth century (Yanow et al., 2016, p. 188). An example of such a large-scale data project is the online Surinamese slavery register, which grants insight into the way that data was collected and registered.
The result of this colonial past is the emergence of race as what Amade M’charek, Katharina Schramm and David Skinner (2014) refer to as an absent presence: ‘a shadowy and slippery object’ (p. 462). As an absent presence it ‘oscillates between reality and nonreality because it is not a singular object but rather a pattern of various elements, some of which are made present and others absent’ (p. 462). Race in the Netherlands is an absent presence not only normatively, remaining taboo, but also methodologically, as it continues to emerge as a slippery concept that ‘might come in many different guises’ (p. 462).
The historical legacy of data classification
Taxonomy, the science of classification, has been used to categorize data since colonial times. According to Dzodan, this history means that classification cannot be separated from race, gender, and class hierarchies. Its history continues in what Nick Couldry and Ulises Mejias (2018) have referred to as data colonialism, which ‘combines the predatory extractive practices of historical colonialism with the abstract quantification methods of computing’ (p. 336). Underlining the importance of historical context, Samarth Gupta, writing for FemLab, discusses bias in AI. Gupta highlights how digital technology itself is not solely responsible: the issue of bias predates the digital and can be traced back to large-scale data collection and classification systems. Understanding these issues requires a historical perspective on the way that social categories are constructed and standardized.
These histories are important for any discussion of data gathering in the battle against COVID-19. According to Milan et al. (2021), writing in COVID-19 from the Margins, we are experiencing ‘the first pandemic narrative of the datafied society’ (p. 14). While data analysis has played a large part in the response to COVID-19, for the Dutch public, issues around data collection have probably been brought to attention more by the recent tax subsidy scandal, which led to the resignation of the government. Wrongful accusations of tax fraud caused financial ruin for thousands of families, and an independent inquiry showed that these accusations stemmed from ethnic profiling. Ethnic registration is, however, forbidden in the Netherlands. Profiling took place through the over-monitoring of citizens with a second nationality. This is striking given that the registration of second nationalities stopped in 2014, yet the tax authorities failed to remove the existing data. Ethnic profiling thus remained possible through an assemblage of available data (birthplace, postcode, dual nationality), as Chris de Ploeg argues in an article for OneWorld.
Race in the Netherlands: Allochtoon/Autochtoon
Important for data collection in the Dutch context are the categories used to discuss race and ethnicity, namely ‘allochtoon’ (of foreign descent) and ‘autochtoon’ (of native descent). This terminology has been the main method of separating ethnic groups since the 1990s, along with the additional subcategories of ‘Western’ and ‘non-Western’. These categorizations are part of the data sets used to train predictive algorithms, and bias against certain groups has been known to influence the way that algorithms function. More clarity is needed to assess how algorithms make certain decisions. The categorical vagueness of autochtoon/allochtoon has allowed race to continue to exist in the Dutch context as an absent presence, and the slipperiness of the subject is potentially what aided its (mis)use in the tax system.
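The mechanism by which biased training data shapes an algorithm's behaviour can be illustrated with a minimal sketch. Everything below is hypothetical (invented records, invented field names, a deliberately naive ‘model’); the point is only that a rule fit to historically skewed decisions inherits that skew:

```python
# Hypothetical sketch: a naive predictive rule fit to historically
# biased decisions simply reproduces the bias. All data is invented.

# Past decisions: (category, was_flagged_for_extra_scrutiny)
history = [
    ("non-Western", True), ("non-Western", True), ("non-Western", False),
    ("Western", False), ("Western", False), ("Western", True),
]

def learn_flag_rate(history):
    """Compute the per-category rate of being flagged in the past data."""
    totals, flags = {}, {}
    for cat, flagged in history:
        totals[cat] = totals.get(cat, 0) + 1
        flags[cat] = flags.get(cat, 0) + int(flagged)
    return {cat: flags[cat] / totals[cat] for cat in totals}

rates = learn_flag_rate(history)

def predict(cat):
    # The 'trained' rule: flag whoever was flagged most often before.
    return rates[cat] > 0.5

print(predict("non-Western"), predict("Western"))  # True False
```

A real predictive system is far more complex, but the dynamic is the same: if the historical record encodes over-monitoring of one category, any model optimized to reproduce that record will perpetuate it.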
Helberg-Proctor et al. (2018) discuss the situatedness of the allochtoon/autochtoon binary, showing how ‘scientific practices involved in research on race and ethnicity in health and genomics have been shown to be situated in and shaped by various specific national, sociopolitical, and historical contexts’ (p. 412). Their analysis shows how categories are often used interchangeably while remaining largely undefined, so that ethnicity is continually recategorized and renamed. While racial categories are not included in the official Dutch categorization system, the status of ethnicity as an absent presence has allowed it to be enacted as biological race in health research. Helberg-Proctor et al. (2017) have also described the political invention of the term ‘allochtoon’ in the 1990s: it came into being as a replacement for what had previously been referred to as foreign workers and ethnic minorities. The distinction between Western and non-Western was made ‘based on perceived socioeconomic and cultural similarities and differences’ [italics in original] (p. 6). The way that ethnicity and race are represented in health policy and discourse is contingent on political discourse, and it is therefore inherently a ‘co-production between society and science’ (p. 10); their institutionalization, of course, does have a real material impact.
The tax subsidy scandal and ethnic profiling
In this context it makes sense to note, as Yanow et al. (2016) do, that categories are ‘being created through administrative practices in everyday (or quasi-everyday) life’ (p. 189). Yet that framing risks suggesting that once categories are no longer ‘being created’, they also stop having an impact. As the tax subsidy scandal shows, this is definitely not the case. Data that was previously available, even once it has formally become unavailable, continues to influence how data is used: data assemblages fill the knowledge gaps on which the tax authorities were used to operating, despite the changes in categorizations. The operationalization of these assemblages shows that it does not suffice to discuss a singular category, as it is in the combination of categories and information that ethnic profiling has remained (partly) possible. When it comes to data, the whole is clearly greater than the sum of its parts.
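This assemblage logic can be made concrete in a short sketch. The records, field names, and selection rule below are entirely invented for illustration; the point is only that a marker which no single field contains can be reconstructed from the combination of fields, even after the explicit category has been removed:

```python
# Illustrative sketch (hypothetical data): after an explicit
# 'ethnicity' field is deleted, a combination of remaining,
# individually 'neutral' fields can act as a proxy for it.

records = [
    {"id": 1, "birthplace": "NL", "dual_nationality": False, "postcode": "1012"},
    {"id": 2, "birthplace": "TR", "dual_nationality": True,  "postcode": "1103"},
    {"id": 3, "birthplace": "NL", "dual_nationality": True,  "postcode": "1103"},
    {"id": 4, "birthplace": "SR", "dual_nationality": False, "postcode": "1091"},
]

def proxy_flag(record):
    """Combine fields into a single marker.

    No single field names ethnicity, yet their conjunction can
    single out much the same group that an explicit category once did.
    """
    return record["birthplace"] != "NL" or record["dual_nationality"]

flagged = [r["id"] for r in records if proxy_flag(r)]
print(flagged)  # [2, 3, 4]
```

Deleting a column, in other words, does not delete the ability to reconstruct it; that is what makes the combination of data sets, rather than any singular category, the relevant unit of analysis.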
As an example of the issues around data collection and ethnicity, the tax subsidy scandal has resurfaced ethnic profiling as something that many had hinted at but that remained difficult to prove. Given the importance of population statistics for the development of the modern state, it is difficult to imagine a way to transform the current system without addressing its past. The scandal further displays the importance of an intersectional approach to data: categories of race and ethnicity clearly cannot be discussed without noting the impact of poverty, gender, (dis)ability, age, and so on, precisely because of what different data assemblages reveal when combined. The same applies to COVID-19. Fighting the spread of the virus, medical professionals have noted inequalities along lines of race and ethnicity. Discussing the large impact of social determinants of health remains difficult given the status of race as an absent presence. Instead, health outcomes are assumed to be based only on ‘biological factors’, which in the case of race (among others) begs the question of how one even goes about making such clear distinctions. What would a dataset that addresses these intersections look like? And how do we prevent institutions from profiling and surveilling groups by combining data in such a way that information becomes available that should not be?
Lonneke van Kampen is a research master’s student in Media Studies at the University of Amsterdam with a background in Cultural Studies. Lonneke is interested in Disability Studies, environmental humanities, glitch art, bias in AI and narratives around AI. She is also interested in experimental methodologies and exploring how disability activist practices can enrich more abstract and theoretical research (Twitter @Lonnieflex).