Attuning to Data Doubles

I travelled to Transmediale 2015 to chair a panel, in part as commemoration of AVANT’s launch in 2014 [1] and in part to elaborate on a topic close to my active research interest in data subjectivities. Transmediale is a Berlin-based festival and year-round project that has been running for 28 years. In the course of its history media art has emerged as its central concern. This was my third time in attendance and a back-to-back visit following last year’s ‘Afterglow’ festival. Elvia Wilk has a great review of the festival on Rhizome that details Transmediale’s return to form following a dip in 2014. This year’s Transmediale was themed CAPTURE ALL, and its conference programme, screenings, performances, exhibition and admirably robust workshop programme all responded to this theme.

When quantification becomes part of our daily routines, we more or less consciously contribute to a permanent capture of life into data. Value can now potentially be extracted from everything, and productivity can be measured in all fields of life … Life is increasingly governed by a logic of CAPTURE ALL as a never-ending enterprise of predictive control.

Transmediale Festival

The entire festival ostensibly followed a three-threaded specification of the Capture All thematic: Capture Life, Capture Work and Capture Play. In my experience this division only played out in conference and panel discussions. The panel which I proposed didn’t fit neatly into any one of these categories and instead set out to map and explore data doubles (or, as they are elsewhere designated, data shadows, data doppelgängers or digital dossiers [2]). I define “data doubles” as aggregated data profiles: abstractions construed through, and which operate by, probabilistic analysis of population-scale cohorts. These data doubles come to bear upon, and exert agency over, a given person’s range of action. In short, data doubles are actors in informatic society.

Heath Bunting: Status Project

I consider this a non-controversial statement, especially for a media art audience: artworks which have addressed this terrain include seminal database artworks like Heath Bunting’s Status Project. That noted, I felt that this panel responded to some of the stated ambitions of the festival:

“The constant datafication of physical and online interactions enables new modes of identification and normalisation. … CAPTURE ALL … responds to the asymmetries of a datafying world… and focuses on the ambiguous relationship and uncanny tension between the user and the algorithm, the self and the constantly evolving apparatus.”

Transmediale Exhibition Curators [3]

The only caveat that I would add is that my panel was concerned with the tension between user, algorithm and data structure – which is also the fundamental recipe for human-computer interaction. There were also questions which I was keen to air that didn’t have an explicit root in the concerns of the festival. These included whether data doubles are useful for exploring systems subjectivities, which I loosely define as the individual dispositions requisite to living as a planetary species coupled with distributed technological systems. I also wanted to determine what statistical literacy is requisite to agential participation within informatic systems.

YouGov Profiler

As the scope for encounters with these abstractions increases (in step with the range of enterprises and disciplines interested in dissecting our digital dossiers) the question of ‘how do we relate to them?’ has grown more pressing. How much attention should we (or can we) pay to the data shadows which inflect our agency within data systems? Might we consider our acquaintance with these abstractions as a form of intimacy? From my vantage point it is clear that we all participate in databases. The terms of access implicit to that participation, or what is perhaps better framed as asymmetries of access, are important starting points for critique. But so too is the question of how we experience that which we are partaking in. How do you experience something you participate in, but which provides no points of purchase for your sensory and/or affective mechanisms?

The panelists I invited to speak were Matthew Plummer-Fernandez, Ilaria Biotti and Andrea Núñez Casal, and I encouraged each to elaborate either on their practice or field research. Matt is a researcher and artist, currently conducting a praxis-based PhD in addition to his ongoing artistic interventions. Ilaria is a media artist, part of the UdK KontextSchule and a member of the artist association Peninsula. Andrea is a “la Caixa”-funded PhD candidate at Goldsmiths, University of London. Each of them took up the challenge of responding to a panel theme that connected to facets of their work, rather than being explicitly about their central thesis. Part of my motivation in writing up this reflection is that I felt I did a poor job as panel moderator in explicating the links between my panelists. Those connections were crystal clear in my head but I could have better conveyed them to the audience assembled at the Cafe Stage. In addition, each panelist presented some genuinely insightful reasoning that I’d like to share with you all.


Ilaria Biotti’s Appytopia

The running order of the panel aspired to traverse the scales of data capture and representation which are conducive to producing data doubles. Matt commenced proceedings, followed by Ilaria and Andrea. Matt drew upon his excellent algopop Tumblr to deliver an overview of proximate (some intentional, others less so) encounters with these ‘doubles’. Ilaria elaborated upon the theory behind her conceptual app ‘Appytopia’, raising questions such as “How do mappings constrain our ability to relate to place? How do benchmarks dictate our affective experiences in lived space?” To conclude, the focus returned again to the intimate – arguably the most intimate – as Andrea Núñez Casal reflected on her ethnographic research of microbiome sequencing sites.

Matt used anecdotes to amusing and engaging effect, and his selection of many of them reflected Matt’s ability to use software programming in his art praxis. Some of the data double encounters he outlined were unintentional:
…while others exhibited greater intentionality, like John Matrix. Internet security consultant Peter Fillmore generated short MIDI clips using WolframTones, uploaded the tracks to Spotify and then generated a myriad of bot fans. This automated audience amassed ‘John Matrix’ thousands of dollars in royalties. Such instances illustrate how those who know how data capture systems anticipate us are capable of gaming the systems, injecting duplicitous data doubles back into the system to do the bidding of their human creators. The most interesting such example Matt selected was Chris McKinlay’s intervention with OkCupid. The OkCupid data insights blog used to be one of the most interesting data mining reads back in its heyday between 2009 and 2011 (it resumed publishing late last summer). OkCupid’s demographics and ‘partner preference’ drop-down boxes are supplemented by their compulsive ‘questions’ section. Correlating these sources of data enabled many humorous and occasionally poignant insights. That entire process is one way by which data doubles can be extracted from aggregate analysis. What McKinlay had done was to parasitise the existing doppelgänger affordances of OkCupid and, in his own words:

(conduct) a slightly more algorithmic, large-scale, and machine-learning-based version of what everyone does on the site

Automation and data scraping were the tools with which McKinlay amassed his data substrate. After three weeks he’d harvested 6 million questions and answers from 20,000 women all over the country. Matt explained how a statistical analysis tool initially developed to study diseased soybeans enabled McKinlay to categorise the responses into seven statistically distinct clumps, or groupings. He identified two of interest, Dog and Tattoo (for the full context on this, be sure to read Wired’s feature on McKinlay), mostly made up of creative professionals and young arty types respectively. He then designed his doppelgänger accounts, profiles tailored to match the desires of Dog and Tattoo. Further automation was leveraged to visit every individual profile in the sets Dog and Tattoo, before McKinlay went on a marathon 88 dates, found the love of his life and then published a book on his method (of course).
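For readers who want a feel for what “categorising responses into clumps” involves, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the post does not name the tool Matt referred to, so I have swapped in a simple k-modes-style procedure (a standard way of grouping categorical answers by their most common values), and the answer data, function names and the choice of two clusters are entirely hypothetical rather than McKinlay’s.

```python
from collections import Counter

def hamming(a, b):
    """Number of questions on which two answer sheets disagree."""
    return sum(x != y for x, y in zip(a, b))

def k_modes(records, k, max_iter=20):
    """Group categorical records around k 'modes' (most common answers).

    Illustrative only: initialisation simply takes the first k distinct
    records, which keeps the example deterministic.
    """
    modes = []
    for record in records:
        if record not in modes:
            modes.append(record)
        if len(modes) == k:
            break
    labels = None
    for _ in range(max_iter):
        # Assign each record to the nearest mode by Hamming distance.
        new_labels = [
            min(range(len(modes)), key=lambda j: hamming(record, modes[j]))
            for record in records
        ]
        if new_labels == labels:
            break  # assignments are stable, so stop
        labels = new_labels
        # Recompute each mode as the most common answer per question.
        for j in range(len(modes)):
            members = [r for r, lab in zip(records, labels) if lab == j]
            if members:
                modes[j] = tuple(
                    Counter(column).most_common(1)[0][0]
                    for column in zip(*members)
                )
    return labels, modes

# Hypothetical answers to three multiple-choice questions.
answers = [
    ("yes", "no", "often"),
    ("no", "yes", "rarely"),
    ("yes", "no", "rarely"),
    ("no", "yes", "often"),
    ("yes", "no", "often"),
    ("no", "yes", "rarely"),
]
labels, modes = k_modes(answers, k=2)
```

Each resulting mode reads as a composite answer sheet that no single respondent need have filled in, which is about as compact an illustration of a data double as I can think of.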


The one detailed technical digression which Matt entertained was explaining Bayesian inference, and the basis of Bayesian logic. This helped explain how Facebook ad targeting works (and thus added information, if not explication, to the aforementioned gaydar in Facebook’s advertisements) while also permitting mention of a strategy of resistance adopted by Kevin Ludlow: Bayesian flooding.

Over the past several months I have entered a myriad of life-events to my Facebook profile… Some of those life-events are true, and some of them are not. In my fictitious life I’ve explored a dozen different religions, had countless injuries and broken bones, suffered twice through cancer, been married, divorced, fathered children all around the world, and have even fought for numerous foreign militaries. My intent was to coin the term Bayesian Flooding within the same sphere as Bayesian Filtering, a common method of filtering junk email by word analysis (emphasis added)
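Ludlow sets Bayesian flooding against Bayesian filtering, so a small illustration of the latter may help. What follows is a minimal sketch in Python of a word-based naive Bayes junk mail filter, under loudly stated assumptions: the four training messages, the function names and the smoothing choice are all made up for this post, and none of it reflects how Facebook or any real mail provider actually works. The final lines gesture at the intuition behind flooding, namely that mixing contradictory signals into the evidence leaves the filter much less sure of its verdict.

```python
import math
from collections import Counter

# Hypothetical toy corpus, invented purely for illustration.
training = [
    ("win cash prize now", "spam"),
    ("cheap prize win win", "spam"),
    ("lunch meeting at noon", "ham"),
    ("notes from the meeting", "ham"),
]

def train(messages):
    """Count how often each word appears under each label."""
    word_counts = {"spam": Counter(), "ham": Counter()}
    label_counts = Counter()
    for text, label in messages:
        label_counts[label] += 1
        word_counts[label].update(text.lower().split())
    return word_counts, label_counts

def spam_probability(text, word_counts, label_counts):
    """Posterior probability of 'spam' given the words, via Bayes' rule."""
    vocab = set(word_counts["spam"]) | set(word_counts["ham"])
    total_messages = sum(label_counts.values())
    log_score = {}
    for label in ("spam", "ham"):
        # Prior: how common the label is in the training data.
        log_score[label] = math.log(label_counts[label] / total_messages)
        n_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word not in vocab:
                continue  # ignore words never seen in training
            # Likelihood with add-one (Laplace) smoothing.
            likelihood = (word_counts[label][word] + 1) / (n_words + len(vocab))
            log_score[label] += math.log(likelihood)
    # Normalise the two scores into a probability.
    top = max(log_score.values())
    weights = {label: math.exp(s - top) for label, s in log_score.items()}
    return weights["spam"] / (weights["spam"] + weights["ham"])

word_counts, label_counts = train(training)
clean = spam_probability("win prize", word_counts, label_counts)
flooded = spam_probability(
    "win prize meeting notes lunch noon", word_counts, label_counts
)
```

On this toy corpus the `flooded` message sits far closer to an even-odds verdict than the `clean` one does. That is only an analogy for what Ludlow describes, but it shows why feeding a probabilistic profiler a mix of true and false life events might blunt its inferences.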

(Ludlow’s experiments followed a spate of such tinkering with Facebook, as variously explored by Caleb Garling, Elan Morgan and Mat Honan [4].) Though there wasn’t time during the panel to tease out yet another important role of Bayesian logic, this post is the perfect opportunity to do so, as it bridges neatly between Matt’s and Ilaria’s presentations.


xkcd comic 1132

Though it is beyond the scope of this post to summarise, there is a body of research out there which contends that the mathematics of Bayesian probability is inherent to how we humans act and make decisions in the world [5]. Bayesian reasoning is attractive to a generation of computational theory of mind scholars who have marvelled at its ability to ape human decision-making processes. It’s also appealing to those researchers confounded that the notion of “humans = rational actors” ever held sway, and who are labouring to overturn just such a presumption. Part of Ilaria’s research dealt with some famous economists aligned with the irrational-humans opinion, namely Amos Tversky and Daniel Kahneman and their prize-winning work on ‘rational choice’:

“Because of imperfections of human perception and decision, changes of perspective often reverse the relative apparent size of objects and the relative desirability of options”

Kahneman and Tversky [6]

Kahneman’s theory was developed into Nudgenomics by Richard Thaler and Cass R. Sunstein. The examples of dark patterns highlighted by that discipline as ‘best practice’ practically illustrate what Kahneman and Tversky were getting at regarding human imperfections. Decision making is important to Ilaria’s question of “whether mappings constrain our ability to relate to place” because more often than not mappings contain many implicit decisions. When figuring data mining and data modelling (mapping by any other name) of a given space (or domain), discourse often draws on the metaphor of Borges’ ‘map exceeding the territory’. Ilaria’s talk invited us to imagine what a Cartesian coordinate on just such a map might make of its surroundings.

“My central question is, how do new informations shift the boarders between social constructions and personal perception?”

Ilaria Biotti

Emotive factors can also distort the ‘relative desirability of options’, and this was something Ilaria expounded upon with reference to a combination of Thomas More’s Utopia, Debord’s psychogeography and Bhutan’s Happiness Index.


In contrasting More and Debord, Ilaria deemed that:

“a precise choice has been made in order to guide the user’s perception through a representation of … affectivity within a normative frame”.

Ilaria Biotti

In More’s Utopia we have an imaginary “mapping (of) a (non)place who’s inhabitants are normatively living a happy life”, whereas within Debord’s urban defamiliarisation we witness “a practice devoted to breaking of normativities contained within urban models and social codes”. The imaginary of More contextualises the site where Ilaria’s app intervenes, whereas psychogeography designates its strategy for action. Ilaria framed the formulation of the Happiness Index as instituting a ‘happiness race’ between countries:


“How does the representation of a ‘happy race’ between countries shift the perception of the country itself in terms of value and status? Does this operate a concrete change in the everyday of it’s inhabitants?”

Ilaria Biotti

Revisit that map exceeding the territory. Constrain its borders, but add dimensions in the form of happiness benchmarks. Make the question concrete for the citizen-participant with the ‘Ladder of Life’. In the words of Hunter & Werbach:

“think about your participants as player. They are participating because they want to. They get out of it value and enjoyment. This is how you can squeeze more out of individuals”

Hunter & Werbach

‘Squeezing more’ speaks to the capitalist urge for growth, an urge with which the imperative of logistics is aligned. There are always gains to be made, once the correct benchmarks, indices and requisite data mapping and extraction infrastructure can be set in place.


Ilaria’s presentation toured us through the perspectives on happiness which are conducive to the abstraction of “Happiness” and its subsequent rendering as a benchmark. The Bhutan Index operates at the level of the nation state, allowing novel (or additional) comparison and sorting of countries. Appytopia intervenes in this logic to toy with it on the urban scale, but is attentive to the phenomenological pliability of the urban quotidian in a manner which feels absent from much Smart City discourse, or from self-tracking apps that enumerate one’s subjective happiness and index it to geographic coordinates. The tension between the artifice and seeming imperative of the Bhutan Index contrasted with what Andrea detailed: microbiome sequencing, a biotechnology that can detect and measure microbial diversity within us and in our surrounding environment, diversity which has evaded our knowledge until very recently. The microbiome is vitality that can only be empirically understood through data (in my opinion anyway).

Andrea’s presentation drew upon her ethnographic appraisal of Professor Rob Knight’s (University of Colorado Boulder) lab, which sampled both microbes from the Amazon basin and microbes from American Gut. As Andrea remarks:

In both instances, the molecular and fleshy lives of microbes become informational codes of DNA within data repositories, its final destination.

Andrea Núñez Casal


The transit between the environmental microbiome and the microbiome resident upon and within humans is a key part of Andrea’s research. It is also an area where comprehending the entanglement of human and non-human agency is perhaps easier [7] than with algorithms. Be that as it may, Casal made important contributions to problematising microbiome research as instantiated in its current trajectory. Based on her research findings, Casal argued that the microbiome is presently figured in predominantly and problematically Western terms.

“Biomedicine’s general assumption is that bodies are the same and they can be normalised through biomedical technologies”

Andrea Núñez Casal

In that respect microbiome sequencing moves in step with the urge to capture data, which is in large part due to the urge to have a means to compare across and within populations. In spite of this, Casal cautioned that “‘data doubles’ or ‘data shadows’ are not universal realities. Data doubles are more present in some geographies (Western) than in others (non-western)”, an instance of ‘bioinequalities’.

In both of their presentations Ilaria and Andrea made mention of a 2014 study by Turiano et al. which I had used in the description of my panel:

Research by Turiano, N. et al. [8] correlated categorical data (how ‘in control’ a given point of data, i.e. a person, feels) with longitudinal epidemiological data. Less education tallied with a greater sense of mastery, and less risk of mortality in comparison to well educated members of the same populace. All caveats with respect to generalising noted, this instance highlights the problem of sufficient information when it comes to acting in accordance with ‘risk profile’ medicinal predictions.

Stephen Fortune

It was this notion of ‘acting with sufficient information’ with respect to systems of data capture and inference that prompted me to ask the panelists what sort of literacy they thought was necessary. Andrea referred to the potential pitfalls of what ‘is read’. Regarding accessibility, the data provided in American Gut is comprehensively represented and thus quite accessible. However, medical data is not provided. The question of who can access American Gut, however, becomes one of economics: ‘microbiome projects are generally oriented towards an individual approach to healthcare which is likely to benefit sectors with a higher socioeconomic status within western populations’ (Casal). Matt continued this considered line of reasoning, pointing out that questions of comprehension carry an implicit assumption of agency residing with the human actor. The issue of agency was reframed towards comprehension of what decisions are being made, and how they are being made in accordance with tools, or subsequent to a tool’s labour (or execution, to stay specific to software). As Natalie Kane (referencing both Dan Williams and Matt) summarises:

‘algorithms’ (are) ‘a set of instructions written by someone’ … confronting complex computation systems demands a socio-technical, not a purely technical solution.’

Sascha Pohflepp made an intervention with respect to the multiplicity of data doubles, noting that there exist many databases where a trace of you resides. This raised the idea of data migration (whether in open data or in terms of third-party cookie data harvested then licensed to a bigger vendor like Acxiom), which in a way connected to the themes of Andrea’s work on the transit and travel of disembodied data.


For my part I best liked the notion provoked by Matt’s mention of a rhino’s data double, forced upon it by snap-happy tourists. Acting responsibly towards others (human and non-human) within these systems is certainly a productive step, and raises an interesting question: if we have dispersed agency with respect to data systems, is our responsibility similarly dispersed?


  1. Last year AVANT launched with a panel including myself, Julian Oliver, editor Sam Hart and former editor Lisa Baldini.
  2. Solove, D. (2004). The Digital Person: Technology and Privacy in the Information Age. New York: New York University Press.
  3. Daphne Dragona and Robert Sakrowski
  4. the latter two being particularly interesting in how their respective responses were introspective and extrospective
  5. Griffiths, T.L. and Tenenbaum, J.B. ‘Optimal Predictions in Everyday Cognition’, Psychological Science. 2006 Sep;17(9):767-73, and Chater, N. & Oaksford, M. (eds.) ‘The Probabilistic Mind: Prospects for Bayesian Cognitive Science’, 2008, to pick but two
  6. Amos Tversky and Daniel Kahneman: “The Framing of Decisions and the Psychology of Choice”, Science vol. 211, 30 January 1981
  7. and more startling, see recent findings about phenotypes transmitted through microbiome
  8. Turiano NA, Chapman BP, Agrigoroaei S, Infurna FJ, Lachman M. “Perceived control reduces mortality risk at low, not high, education levels.” Health Psychology. 2014 Aug;33(8):883-90
