“When you look into them, you start to realize that they almost universally intersect with the interests of the most vulnerable.” (Mimi Onuoha, 2017)
While reading D’Ignazio and Klein’s (2020) Data Feminism, I came across the work of Mimi Onuoha, specifically her project The Library of Missing Datasets. When I first saw the work, I was reminded of the blocky steel solidity of the filing cabinet. In the digital era, we see fewer and fewer of these prototypical filing cabinets, and it is easy to forget just how weighty and authoritative they are.
It’s an interactive work, so once the cabinet is opened the audience can see the labelled files inside: ‘People excluded from public housing because of criminal records’; ‘Public list of citizens on domestic violence list’; ‘Muslim mosques/communities surveilled by the FBI/CIA’. You might presume these files would provide the impetus for improving community justice. However, these are datasets that do not exist: the files are empty. In addition to issues of data justice, Onuoha’s empty files highlight other uncomfortable facts that we often choose to overlook, such as ‘Sales and prices in the art world (and relationships between artists and gallerists)’ and ‘How much Spotify pays each of its artists per play of song’.
The Library of Missing Datasets is essentially an empty library but, as in the work of Trevor Paglen, what might appear empty ends up signalling quite loudly. Onuoha’s work reveals the social indifference and bias that exist across society and draws attention to the fact that when it comes to data, we measure what we value. That is why you might find datasets that highlight issues to do with the maternal health of white women in the US, but no equivalent datasets on black women (D’Ignazio & Klein, 2020). With public health increasingly understood through the collection and analysis of data, omissions such as these are deeply troubling. Onuoha’s work highlights the fact that data often intensifies the structural and systemic inequality that already exists in society.
From the outset, the work also offers a rebuttal to the goals of what Andrejevic (2020) calls ‘framelessness’. Framelessness is data collection with no frames or limits: it is total information capture. As Andrejevic explains, the goal of framelessness is to overcome ‘the biases of partiality’ through ‘greater range, depth, and functions of monitoring’ (p.117). So how do framelessness and total information capture articulate with Onuoha’s work and the concerns of other feminist and data justice scholars (e.g. D’Ignazio & Klein, 2020 and Hintz, Dencik & Wahl-Jorgensen, 2019)? Does framelessness mean that the issues that intersect with the interests and concerns of the marginalised and the vulnerable will be addressed?
The answer highlights the highly political process of datafication. It is quite likely that we already have the data to help address important community and social justice issues like those raised by Onuoha. This is possibly why the work is called The Library of Missing Datasets and not The Library of Missing Data. However, whether those with the expertise and authority to collate, process and draw inferences from computational analysis will turn (or be paid to turn) their attention to these issues is another matter entirely. The solution does not lie in collecting more data. More data will not lead to the end of bias and partiality. Instead, it hinges on two different parts of the datafication process: first, whether there is the motivation to address these issues, and second, whether algorithms are designed to process data in a way that addresses the concerns of the most vulnerable.
Onuoha’s work is a physical rendering of the authority and durability of ‘datasets’. Paradoxically, the work materialises the data not collected by documenting its absence. Indeed, Onuoha’s work implicitly draws attention to the material conditions of data and data processing. The empty datasets that are collected in the files and listed on the project website are only a very small proportion of the data that is continuously collected about individuals. With total information capture looming on our digital horizon, attention must be turned to the curation and processing of data, as well as to what is collected and how.
Images: Mimi Onuoha, 2018, The Library of Missing Datasets. http://mimionuoha.com/the-library-of-missing-datasets-v-20
Andrejevic, M. (2020). Automated Media. London and New York: Routledge.
D’Ignazio, C., & Klein, L. (2020). Data Feminism. Cambridge, MA: The MIT Press.
Hintz, A., Dencik, L., & Wahl-Jorgensen, K. (2019). Digital Citizenship in a Datafied Society. Cambridge, UK: Polity Press.
Onuoha, M. (2017). Mimi Onuoha – Missing Datasets, Data + Society. https://vimeo.com/219717800