Minting Faces in Latent Spaces: How SMPLverse is inverting tools for machine cognition
Repurposing avatars generated for the infrastructure of the metaverse reveals its embedded preferences in the process
When people in Web3 curate pseudonymous identities, they do so to decorate their cryptographic transactions; to avoid being the product on free, centralized servers; or, for some, to virtually transmogrify into their non-fungible token profile pictures. The aim is to make deciphering who’s behind the monitor more difficult. But in artist and researcher William Wiebe’s SMPLverse, a matrix of synthetic faces (cue various balding patterns and skin tones; facial adornments and accessories; identities and expressions), what might seem uncanny or meaningless to a human is actually an over-identifier for machines—an idealized model that is part of the computational infrastructure built to better interface with humans, interpret our expressions, and understand our behavior.
“Machine learning models trained on synthetic data, which is diverse and perfectly labeled by design,” writes Wiebe, in a Mirror article authored this May, “generalize better than models trained on real data, enabling avatars to represent users in granular detail.” What now may appear to be a “funny face” could soon be a tool used to control or predict human behavior. Moving beyond merely representing human faces in immersive reality, facial recognition technologies have long been weaponized by law enforcement to facilitate arrests, limit migration, and perpetuate incarceration.
“Such computational culture is the front along which contemporary power shapes itself, engaging formal logics and the insights of experts in adjacent fields,” writes Nora Khan, “to disappear extractive goals.” She aptly locates these technologies of recognition and identification within a dark history of racist, hegemonically minded pseudosciences like phrenology. “That it works to seem rational, logical, without emotion, when it is also designed to have deep, instant emotional impact, is one of the greatest accomplishments of persuasive technology design.”
While the face images spit out by SMPLverse don’t necessarily appear rational, we must learn to recognize their potential utility. SMPLverse NFTs employ facial recognition to obtain data from HoloLens’s synthetic training dataset (data synthesized by Microsoft researchers to inform face-tracking algorithms for their mixed-reality headsets). After minting, token-holders will receive an NFT with which they can submit a single image via webcam. The submitted image is compressed and written into the token as a hash, and the facial model uses that hash to match the input to a SMPL. Once an image is matched, it receives one attribute: Confidence. Confidence indicates how closely the input image matches the SMPL that token-holders receive. Mechanics, statistics, and systems theory aside, Wiebe’s endgame materializes in a redistribution model. “By fractionalizing Microsoft’s dataset into NFTs,” he says, “SMPLverse inverts its function, redistributing the metaverse’s centralized assets through its paradigmatic decentralized property form.”
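The mint flow described above—hash the submitted image into the token, match the input to the closest synthetic face, and record a Confidence attribute—can be sketched in Python. Everything here beyond the SHA-256 step is a hypothetical illustration: the `embed` function, the cosine-similarity matching, and all names are stand-ins, not SMPLverse’s actual implementation.

```python
import hashlib
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def mint_token(image_bytes, embed, smpl_embeddings):
    """Sketch of the described mint: hash the input, match it to the
    closest synthetic face, and record the match confidence."""
    # Only a one-way SHA-256 digest of the submitted image is stored.
    token_hash = hashlib.sha256(image_bytes).hexdigest()
    # Hypothetical matcher: pick the SMPL face whose embedding is closest.
    query = embed(image_bytes)
    best_id = max(smpl_embeddings, key=lambda k: cosine(query, smpl_embeddings[k]))
    confidence = cosine(query, smpl_embeddings[best_id])
    return {"hash": token_hash, "smpl_id": best_id, "confidence": confidence}

# Toy demo with a stand-in "embedding" (a real system would use a face model).
token = mint_token(
    b"webcam frame",
    embed=lambda b: [float(len(b)), float(sum(b) % 97), 1.0],
    smpl_embeddings={"SMPL-001": [13.0, 40.0, 1.0], "SMPL-002": [1.0, 0.0, 0.0]},
)
print(token["smpl_id"], round(token["confidence"], 3))
```

The point of the sketch is the asymmetry it makes visible: the match is probabilistic (a confidence score over a fixed synthetic population), while the record of your input is exact but irreversible (a digest).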
Leading up to SMPLverse’s launch this Friday, Zora Zine Contributing Editor Blaine O’Neill spoke with Wiebe on the legacy of institutional critique, post-corporate responses to racial and gender bias in machine learning, and over-identification.
Blaine O’Neill (ZORA): You are an artist who works with machine learning. Is this your first NFT/Web3-specific project? Could you explain a little bit about your practice and what led you here?
William Wiebe: Yeah, I'm an artist based in New York. My work addresses how communities, organizations, and technologies all generate their own interpretive frameworks. I work by constructing formats or scenarios that activate these frameworks in order to elaborate on how they produce a certain kind of reality.
In 2018, I did a project looking at ShotSpotter, an acoustic surveillance system in Chicago that uses sound as forensic evidence. Like many vaporware products, the technology hasn't quite caught up to the premise. So, in the case of ShotSpotter, the bulk of the labor is actually done by in-person engineers. This was a particular instance in which a provisional way of looking at the world—when applied on a larger scale—actually starts to move the world towards its interpretive framework, whether or not it was correct to begin with.
Then as part of a Fulbright program in Cyprus, I focused on a provisional framework the EU had set up as part of Cyprus’s accession to the Union. I was interested in how this long-running territorial dispute was being adjudicated from afar by the EU. So I exploited a loophole within this legal framework that allowed one to bring goods over the border from Northern Cyprus into the Republic temporarily if they were to be used for a public exhibition.
BO: Did the research you did in Cyprus result in writing or an artwork?
WW: The main artwork was an exhibition of these artifacts or things that I had brought into the state on this temporary dispensation, because displaying them was one of the conditions for their entry. Since I had the perspective of an outsider, on provisional terms in Cyprus myself, I was more interested in looking at how other outsiders understood this conflict, namely EU legislators in Brussels.
BO: How did you go from there to thinking about the blockchain as a place where you might want to make or show work?
WW: I was doing the Independent Study Program at the Whitney when COVID hit and then, when the George Floyd protests happened, I felt this internal/external ambivalence. I was the most proximate I’d ever been to the art world, while watching this double failure to address what was happening. I had always felt my audience wasn't necessarily in that world, but witnessing the institutional malaise—everyone waiting with bated breath to be able to go back to doing what they were doing before—was disconcerting to me. So I wrote an essay that tried to formulate what other types of cultural practices are available, considering these 20th century institutions arose to support a particular type of practice and then became the horizon of possibility for many artists.
BO: You mean, institutional critique?
WW: Yeah. Or, some artists formulated their practices in response to these institutions that were provisional to begin with rather than finding new ways to address audiences or deal with the world in which they lived.
BO: Institutional critique seems to be a way an artist could find purpose and locate their practice within a neoliberal landscape yet also the stability afforded by institutional affiliation. The essay that you wrote differs from this mode (IC) in that instead of a call for reform or a conceptual conceit, you end with a call to experiment with decentralization. The authority (or existence) of the cultural institution is not taken for granted.
WW: I mean, in my particular experience—“all that is solid melts into air”—everything kind of vaporized instantly when the pandemic hit. So, for a moment, that institutional context was not even there to respond to. We were often on the streets looking at these cultural institutions from the outside. There wasn’t this kind of formal barrier where discourse is contained within an institutional vessel. Immediately, that very physical experience prompts you to start thinking outside of the institution.
BO: Was your SMPLverse project a result of this experience? How did you arrive at working with machine learning and how did you learn about this synthetic training data?
WW: It arose out of a larger project I am still working on that looks at attention, visual cognition, and immersive environments. I’m interested in “visual saliency prediction,” which is like teaching robots or human-approximating machines to see or interpret a scene in a similar way to how a human eye sees it. I’m triangulating between the “porting” of human vision into machine vision, new articulations of how humans see, and ways to manipulate visual perception that are being developed within the framework of immersive technology. There's so much variability in terms of how to develop technology that can accurately track someone's gaze.
Originally I was looking at a synthetic eye model that is being used to train eye tracking algorithms to be able to more granularly identify where the pupil is, since a lot of immersive technology is dependent on knowing where someone is directing their attention at any given millisecond in order to reformulate the environment. One of the researchers who developed the model went on to Microsoft, where he developed this synthetic training dataset for human faces.
BO: Is this particular data set developed specifically for the HoloLens for Microsoft? Does it include the entire human skeleton or just the face?
WW: Yes, it was developed specifically for the HoloLens, a mixed-reality headset. The HoloLens uses entirely synthetic hand and face data and then kind of projects someone's avatar just from knowing the spatial coordinates of their hands and head. The way the HoloLens “stitches” these sparse inputs together is through a human body model called SMPL, a learned body model based on 3D scans of a large population of humans, which can then be extrapolated into different use cases for avatars.
It's just the face and it's just the outputs. Like any PFP project, it's procedurally generated. It takes in a number of parameters: 27,000 expression parameters that the face model has learned from 511 individual 3D scans, plus 11 types of eyewear and 37 types of headwear. Like a Bored Ape or whatever, the dataset mixes and matches these parameters to create a diverse dataset. Diversity here means not only physical appearance but also the ocular conditions for image generation. When you're making a synthetic image you can model any type of camera setting, lens, or perspective; you can “grow out” diversity to a lot of different parameters.
BO: Would you say this dataset is an example of a corporate response to racial and gender bias in machine learning?
WW: They explicitly address this in the dataset, claiming to have “unprecedented realism.” You look at some of these images and you're like, “this is neither realistic nor diverse,” because it's procedurally generated without human oversight over the outputs. You can see it as an instance of corporate overreaction to these critiques. From one perspective, the critique is that the dataset isn't diverse enough. A counterpoint to that is: do we really want an equality of surveillance? It never asks whether the technology is something that we want.
BO: Is the idea that the synthetic faces in the dataset will allow developers to better create virtual avatars of users in the metaverse?
WW: I would say it's not so much to create better avatars, but to make the avatars more responsive to the gestures or facial expressions that the real-life person is making. So, it's less about how you appear in the virtual environment, it's more about how you appear to it.
BO: You're exploring a system that may predetermine how machines understand individual subjects.
WW: I'm trying to tease out the difference between the elective avatar or the elective prompt that you give the machine and what is already embedded in the machine's grammar. These images are avatars for the infrastructure of the metaverse, a proxy for the machine, because they are ultimately indifferent to human vision. They weren't produced for people to look at them yet they tempt this kind of over-identification. You project onto them even though they fundamentally or foundationally refuse that identification.
BO: Interesting. It's funny how that phenomenon of identification aligns with the popularity of PFPs. Whether or not they are pseudonymous, people in certain corners of Web3 are proud to identify with portraits that don’t usually represent their IRL selves.
WW: I hope the work can kind of occupy both sides of that divide between identity and pseudonymity, insofar as neither of them is really commensurate with our subjectivity—thinking about how the concept of pseudonymity is reconfigured or broadened by its reference to the presentations we make of ourselves online.
I would also add, on a technical note, what's happening in the NFT minting process. You receive this approximation of the image that you've submitted, but the image you submit is also stored in the token as a hash. The output image mimics the form of visual appearance onto which we over-identify, while the SHA-256 hash stored in the token preserves its content. Because we’re using a one-way compression function, the original input image can never really be deciphered, but its vestige is maintained as a cipher inside the token.
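The one-way property Wiebe describes is easy to demonstrate with Python's standard `hashlib`. The image bytes below are placeholders; the point is that the digest stored in the token can verify a submitted image but can never be reversed into it.

```python
import hashlib

# The token stores only the 64-hex-character SHA-256 digest, never the image.
stored_digest = hashlib.sha256(b"submitted webcam image").hexdigest()

def verify(candidate_bytes, token_digest):
    """Recompute the SHA-256 digest and compare it to the one in the token.

    This works in one direction only: you can confirm that candidate_bytes
    produced the digest, but you cannot derive the bytes from the digest.
    """
    return hashlib.sha256(candidate_bytes).hexdigest() == token_digest

print(verify(b"submitted webcam image", stored_digest))  # the original matches
print(verify(b"any other image", stored_digest))         # anything else does not
```

Even a one-byte change in the input produces a completely different digest, which is what lets the token carry a “vestige” of the submission without exposing it.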
BO: Can you elaborate more on the project’s minting process? In designing this project, are you looking at minting as a character creation process based on someone's face or whatever image they choose to upload?
WW: The minting process is a character creation process but all of the character creation happens in the real world and not in the virtual space. It depends on how much you actually look like one of these images that weren't meant to approximate a real person to begin with. When you match the image, you get the confidence that you were/are the image. You're assigned to the highest confidence image available and then you get the attribute of that confidence level.
BO: Do you think the secondary market will be a little bit less active since the people who make these NFTs may feel reluctant to part with them because they feel that they're connected to their own faces or identities?
WW: I'm curious about that—if confidence becomes a parameter of value. It's not guaranteed that it's going to look anything like you but yes, the marketplace for avatars is difficult when they internally link to a bodily identity.
BO: Is there anyone you’d like to thank, or any inspirations towards this project that you want to mention?
WW: I’d like to thank Piotr Ostrowski, the Solidity/back-end developer I collaborated with on this project. I was really inspired by mannys.game, which is kind of the opposite of my project, where everybody was acceding to the same identity. I learned a lot being a part of Friends with Benefits both in terms of technical challenges and participating in a discourse around NFTs. I was inspired by Holly+, thinking about different ways of conceiving intellectual property and the idea of constructing a community as distributed identity.
I've been listening to the NEW MODELS podcast and their project is very inspiring in terms of how to assemble new institutional frameworks. And, of course, ZORA for building a permissionless protocol and public code. Creating more fluid community dynamics that aren't centralized and organized around an algorithm or attention metrics or profit motives—that’s inspiring to me.
BO: Anything else?
WW: I’d also just say that I don't think it's particularly productive for an artwork to polemicize against different speculative dystopias. So I'm more interested in using art to read the real existing technology prismatically or understanding the preferences that are embedded in the technology and how those preferences contradict the narratives that are told about it.