Computer vision and emotional privacy
A study published last week (and summarized here and here) demonstrated that a computer could be trained to detect real versus faked facial expressions of pain significantly better than humans. Participants were shown video clips of the faces of people actually in pain (elicited by submerging their arms in icy water) and clips of people simulating pain (with their arms in warm water). The participants had to indicate for each clip whether the expression of pain was genuine or faked.
Whilst human observers could not discriminate real expressions of pain from faked expressions better than chance, a computer vision system that automatically measured facial movements and performed pattern recognition on those movements attained 85% accuracy. Even when the human participants practiced, accuracy only increased to 55%.
The authors explain that the system could also be trained to recognize other potentially deceptive actions involving a facial component. They say:
In addition to detecting pain malingering, our computer vision approach may be used to detect other real-world deceptive actions in the realm of homeland security, psychopathology, job screening, medicine, and law. Like pain, these scenarios also generate strong emotions, along with attempts to minimize, mask, and fake such emotions, which may involve dual control of the face. In addition, our computer vision system can be applied to detect states in which the human face may provide important clues about health, physiology, emotion, or thought, such as drivers’ expressions of sleepiness and students’ expressions of attention and comprehension of lectures, or to track response to treatment of affective disorders.
The possibility of using this technology to detect whether someone’s emotional expressions are genuine raises interesting ethical questions. I will outline and give preliminary comments on a few of the issues:
Identifying the primary concern
The overarching concern about widespread use of this technology is related to the idea of emotional privacy. Whilst some of our emotional responses are so strong that we cannot hide them, most of the time we are able to exert considerable control over the extent to which our inner emotional lives are made visible to those with whom we interact. The main worry about the use of emotion recognition technology is likely to be that our private emotions are made public. However, at this point it is important to make sure that we are clear about what the system actually does. It is not the case that it reads minds. It does not have direct access to our thoughts and feelings. Rather, it detects micro movements of the face, and particularly identifies whether it is likely that they are produced by the cortical pyramidal motor system, which enables humans to simulate facial expressions of emotions not actually experienced. It thus reads facial cues to make predictions about underlying emotional experiences (or lack thereof). Although this is somewhat less troubling than the idea of mindreading, many people would nonetheless object to the use of a system that correctly identified whether their emotional expressions were being produced by the ‘genuine’ or ‘fake’ system (the subcortical extrapyramidal motor system and the cortical pyramidal motor system, respectively).
Would all applications raise emotional privacy concerns?
The authors suggest a few applications for their computer vision approach. I suggest that some of these are more problematic than others. For example, the idea that the system could be used to detect sleepiness (arguably not an emotion) in drivers seems less problematic than using it in job screening or in the classroom, and I don’t think the difference is accounted for by an argument about the potential gains in safety on the roads. If this intuition is widespread, I think it can instead be accounted for by the fact that sleepiness is a physiological state that reveals nothing about the agent’s values or preferences. Everyone gets tired if they do not sleep. If there were to be objections to the use of computer vision to assess driver sleepiness, they would most likely derive from resistance to excessive paternalism (‘we’re going to tell you when you’re not fit to drive’) or concerns about accuracy and individual differences in ability to function when fatigued, if data were to be used as evidence in negligent driving cases.
The use of the system in job interviews, however, would potentially reveal (or at least predict as being likely) information relating to the agent’s values and preferences. Imagine, for example, that the system could distinguish between real and faked enthusiasm, or could identify anxiety in the interviewee that would otherwise be imperceptible. Although this would understandably be of interest to employers, whether access to and use of this privileged information would be permissible is not clear. So, whilst detecting sleepiness tells us nothing about the character or values of a person, reading involuntary cues about their anxieties and interests is more – and possibly too – intrusive.
Humans can be trained to recognize micro expressions, so why is a machine different?
One of the most interesting challenges raised by this technology is to try to discern whether there is any relevant difference between computer detection of micro expressions and the human ability to do something comparable. The ability of the computer vision system is superior due to its much higher temporal resolution, and so human vision is unlikely ever to be able to parse facial movements with such detail and precision. This being said, some individuals who are trained in micro expressions (perhaps psychologists trained in Ekman’s Micro Expression Training Tool, or those involved in espionage) have the capacity to see involuntary movements not visible to the untrained observer. Neither computer vision nor micro expression training involves ‘seeing’ anything more of a person’s face than is already open to observation. It is the higher temporal resolution and knowledge of how to interpret what is thereby revealed that allows the computer or trained human to ‘see more’. It is somewhat analogous to using a microscope to see features not visible to the naked eye, except that trained humans do not use any comparable temporal magnification device.
If it is possible for some humans to train themselves to see micro expressions, does this mean that we can’t object to the use of computer vision for the same purpose? If we were still to object to computer vision, would we also have to object to the use of enhanced human capacities in order to remain consistent? A comparison might be drawn with casino bans that are imposed on people who are really good at counting cards: their unacceptable behavior consists only in deploying well-honed perceptual and mathematical skills, not in using external devices or enhancing drugs or some other aid. However, whilst we might similarly be able to ban micro expression experts from poker tournaments (if such training were thought to undermine the spirit or interest of the game), they obviously cannot be banned from wider social interactions, nor can they stop seeing what they’ve learnt to see. If some individuals have acquired the capacity to read people’s emotions, it becomes more difficult to explain why a machine that does the same should be a cause for concern. Might the difference simply be a matter of accuracy?
Further, we can ask whether it would be more or less problematic for such an individual to sit on the interview panel discussed above. Whereas informed consent could be required for the use of computer vision – in the same way that consent must be given to carry out lie detector tests – it is more difficult to assess whether certain levels of micro expression detection ability in people should have to be declared, and under what circumstances.
The ability to simulate emotion expressions is important for social cohesion
The authors of the computer vision study do not envisage a world in which these systems are ubiquitous. However, we can consider another, social reason why one might resist their more widespread use. It is good for smooth social functioning that people are able to simulate emotions. It will not be the case that all of our colleagues elicit genuine smiles from us, or that we never find a conversation with a family member boring. But pretending to smile and pretending to be interested are crucial for positive social dynamics at work, when socializing and even at home.
Further, by reducing emotional expression to existing on a ‘genuine’ or ‘faked’ dichotomy, more subtle, contextualizing emotions/attitudes are ignored: people have emotions about, and attitudes towards, their emotions. Someone might be upset by her anger towards a loved one or ashamed of her jealousy of a friend. It’s possible to find an interlocutor both interesting and frustrating and to not feel enthused about topics one believes to be of great importance. Thus, although it provides information about subtleties of emotional physiology, computer vision will be unlikely to reveal everything that is necessary to thoroughly understand how a person feels.