ConceptFusion: Open-set Multimodal 3D Mapping
Jatavallabhula, K., Kuwajerwala, A., Gu, Q., Omama, M., Chen, T., Maalouf, A., Li, S., Iyer, G., Saryazdi, S., Keetha, N., Tewari, A., Tenenbaum, J., de Melo, C., Krishna, M., Paull, L., Shkurti, F., Torralba, A., Proceedings of Robotics: Science and Systems (RSS), 2023
Building 3D maps of the environment is central to robot navigation, planning, and interaction with objects in a scene. Most existing approaches that integrate semantic concepts with 3D maps largely remain confined to the closed-set setting: they can only reason about a finite set of concepts, pre-defined at training time. Further, these maps can only be queried using class labels, or in recent work, using text prompts. We address both these issues with ConceptFusion, a scene representation that is: (i) fundamentally open-set, enabling reasoning beyond a closed set of concepts, and (ii) inherently multi-modal, enabling a diverse range of possible queries to the 3D map, from language, to images, to audio, to 3D geometry, all working in concert. ConceptFusion leverages the open-set capabilities of today’s foundation models pre-trained on internet-scale data to reason about concepts across modalities such as natural language, images, and audio. We demonstrate that pixel-aligned open-set features can be fused into 3D maps via traditional SLAM and multi-view fusion approaches. This enables effective zero-shot spatial reasoning, not needing any additional training or finetuning, and retains long-tailed concepts better than supervised approaches, outperforming them by a margin of more than 40% on 3D IoU. We extensively evaluate ConceptFusion on a number of real-world datasets, simulated home environments, a real-world tabletop manipulation task, and an autonomous driving platform. We showcase new avenues for blending foundation models with 3D open-set multimodal mapping.
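As a rough illustration of the idea of fusing per-view features into map points and then querying the map, the sketch below keeps one feature vector per 3D point, updates it with a weighted running average as new views arrive, and answers a query by cosine similarity. The running-average rule, the function names, and the toy 2-D vectors are assumptions made for illustration; the abstract does not specify the fusion rule, and a real system would use embeddings from a foundation model.

```python
import math

def normalize(v):
    """Scale a vector to unit length (returns a copy if the norm is zero)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)

def fuse(map_feat, map_weight, obs_feat, obs_weight=1.0):
    """Weighted running average of a map point's feature with a new observation.

    Assumed fusion rule for illustration only.
    """
    total = map_weight + obs_weight
    fused = [(map_weight * m + obs_weight * o) / total
             for m, o in zip(map_feat, obs_feat)]
    return normalize(fused), total

def cosine(a, b):
    """Cosine similarity between two vectors."""
    a, b = normalize(a), normalize(b)
    return sum(x * y for x, y in zip(a, b))

def query(point_feats, query_feat):
    """Return the index of the best-matching map point and all scores."""
    scores = [cosine(f, query_feat) for f in point_feats]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores

# Toy example: point 0 is seen from two views, point 1 from one view.
p0, w0 = fuse([0.0, 0.0], 0.0, [1.0, 0.0])
p0, w0 = fuse(p0, w0, [0.8, 0.2])
p1, w1 = fuse([0.0, 0.0], 0.0, [0.0, 1.0])
best_index, similarity = query([p0, p1], [1.0, 0.0])
```

Because the query is just another vector in the same embedding space, the same lookup would apply whether the query embedding came from text, an image, or audio, provided the encoders share that space.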
Open-Set Automatic Target Recognition
Safaei, B., VS, V., de Melo, C., Hu, S., & Patel, V., Proceedings of ICASSP 2023, 2023
Automatic Target Recognition (ATR) is a category of computer vision algorithms which attempts to recognize targets on data obtained from different sensors. ATR algorithms are extensively used in real-world scenarios such as military and surveillance applications. Existing ATR algorithms are developed for traditional closed-set methods where training and testing have the same class distribution. Thus, these algorithms have not been robust to unknown classes not seen during the training phase, limiting their utility in real-world applications. To this end, we propose an Open-set Automatic Target Recognition framework where we enable open-set recognition capability for ATR algorithms. In addition, we introduce a plugin Category-aware Binary Classifier (CBC) module to effectively tackle unknown classes seen during inference. The proposed CBC module can be easily integrated with any existing ATR algorithms and can be trained in an end-to-end manner. Experimental results show that the proposed approach outperforms many open-set methods on the DSIAC and CIFAR-10 datasets. To the best of our knowledge, this is the first work to address the open-set classification problem for ATR algorithms. Source code is available at: https://github.com/bardisafa/Open-set-ATR.
Synthetic-to-real domain adaptation for action recognition: A dataset and baseline performances
Reddy, A., Shah, K., Paul, W., Mocharla, R., Hoffman, J., Katyal, K., Manocha, D., de Melo, C., & Chellappa, R., Proceedings of International Conference on Robotics and Automation (ICRA), 2023
Human action recognition is a challenging problem, particularly when there is high variability in factors such as subject appearance, backgrounds and viewpoint. While deep neural networks (DNNs) have been shown to perform well on action recognition tasks, they typically require large amounts of high-quality labeled data to achieve robust performance across a variety of conditions. Synthetic data has shown promise as a way to avoid the substantial costs, and potential practical and ethical issues associated with collecting and labeling enormous amounts of data in the real-world. However, synthetic data may differ from real data in important ways. This phenomenon, known as domain shift, can limit the utility of synthetic data in robotics applications. To mitigate the effects of domain shift, substantial effort is being dedicated to the development of domain adaptation (DA) techniques. Yet, much remains to be understood on how best to develop these techniques. In this paper, we introduce a new dataset, called Robot Control Gestures (RoCoG-v2), composed of corresponding real and synthetic videos, to support the study of synthetic-to-real domain shift in video action recognition. Our work expands upon existing datasets by focusing the action classes on gestures for human-robot teaming, as well as by enabling investigation of domain shift in both ground and aerial views. We present baseline results using state-of-the-art action recognition and domain adaptation algorithms and offer initial insight on tackling the synthetic-to-real and ground-to-air domain shifts. A link to the dataset and corresponding documentation can be found at https://github.com/reddyav1/RoCoG-v2.
AZTR: Aerial video action recognition with auto zoom and temporal reasoning
Wang, X., Xian, R., Guan, T., de Melo, C., Nogar, S., Bera, A., & Manocha, D., Proceedings of International Conference on Robotics and Automation (ICRA), 2023
We propose a novel approach for aerial video action recognition. Our method is designed for videos captured using UAVs and can run on edge or mobile devices. We present a learning-based approach that uses customized auto zoom to automatically identify the human target and scale it appropriately. This makes it easier to extract the key features and reduces the computational overhead. We also present an efficient temporal reasoning algorithm to capture the action information along the spatial and temporal domains within a controllable computational cost. Our approach has been implemented and evaluated both on the desktop with high-end GPUs and on the low power Robotics RB5 Platform for robots and drones. In practice, we achieve 6.1-7.4% improvement over SOTA in Top-1 accuracy on the RoCoG-v2 dataset, 8.3-10.4% improvement on the UAV-Human dataset and 3.2% improvement on the Drone Action dataset.
Multi-view action recognition using contrastive learning
Shah, K., Shah, A., Lau, C., de Melo, C., & Chellappa, R., Proceedings of Winter Conference on Applications of Computer Vision (WACV), 2023
In this work, we present a method for RGB-based action recognition using multi-view videos. We present a supervised contrastive learning framework to learn a feature embedding robust to changes in viewpoint, by effectively leveraging multi-view data. We use an improved supervised contrastive loss and augment the positives with those coming from synchronized viewpoints. We also propose a new approach to use classifier probabilities to guide the selection of hard negatives in the contrastive loss, to learn a more discriminative representation. Negative samples from confusing classes based on posterior are weighted higher. We also show that our method leads to better domain generalization compared to the standard supervised training based on synthetic multi-view data. Extensive experiments on real (NTU-60, NTU-120, NUMA) and synthetic (RoCoG) data demonstrate the effectiveness of our approach.
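To make the loss described above more concrete, the following is a minimal sketch of a standard supervised contrastive loss for a single anchor, with an optional per-negative weight in the denominator. It is not the paper's exact formulation: the weighting scheme, the `neg_weights` parameter, and the function name are assumptions for illustration. In the setting described, the positives would include same-class clips as well as synchronized views of the anchor, and the weights would come from classifier posteriors.

```python
import math

def supcon_anchor_loss(anchor, others, labels, anchor_label,
                       neg_weights=None, temperature=0.1):
    """Supervised contrastive loss for one anchor embedding.

    anchor, others: L2-normalized embeddings (lists of floats).
    labels: class label of each entry in `others`.
    neg_weights: hypothetical per-sample weights; only applied to negatives.
    """
    if neg_weights is None:
        neg_weights = [1.0] * len(others)
    # Temperature-scaled dot-product similarities to the anchor.
    sims = [sum(a * b for a, b in zip(anchor, o)) / temperature for o in others]
    positives = [i for i, y in enumerate(labels) if y == anchor_label]
    if not positives:
        return 0.0
    # Positives enter the denominator with weight 1, negatives with their weight.
    denom = sum(
        (1.0 if labels[i] == anchor_label else neg_weights[i]) * math.exp(s)
        for i, s in enumerate(sims)
    )
    log_denom = math.log(denom)
    # Average negative log-probability over all positives.
    return -sum(sims[i] - log_denom for i in positives) / len(positives)

# Toy example: one positive and one negative, with and without upweighting.
base_loss = supcon_anchor_loss([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]],
                               [0, 1], 0, temperature=1.0)
hard_loss = supcon_anchor_loss([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]],
                               [0, 1], 0, neg_weights=[1.0, 2.0],
                               temperature=1.0)
```

Giving a negative a weight above 1 increases its share of the denominator, so the loss penalizes similarity to that negative more strongly.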
The influence of emotional expressions of an industrial robot on human collaborative decision-making
Usui, K., Terada, K., & de Melo, C., Proceedings of Affective Computing and Intelligent Interaction (ACII), 2022
In recent years, robots have been equipped with the ability to express emotions and have begun building social relationships with people. However, the significance and effectiveness of incorporating emotion in industrial robots, which have a strong instrumental nature, is not fully understood. We investigated how emotional expressions of an industrial robot influence human collaborative decision-making. The participants (n=52), in a laboratory experiment, engaged in a desert survival task with an arm robot in a 2 (emotion expression: present vs. absent) × 2 (competence: high vs. low) between-participants study. Emotion was expressed using color through an LED strip of lights - e.g., anger was conveyed by flashing red. The results showed that emotion expression and competence did not influence the final agreement and, in fact, emotion expressions made the interaction longer, emphasizing the difficulty in communicating emotion and the reason for those expressions. We discuss lessons learnt and provide insight on improving the value of emotion expression in industrial robots.
Not just streaks: Towards ground truth for single image deraining
Ba, Y., Zhang, H., Yang, E., Suzuki, A., Pfahnl, A., Chandrappa, C., de Melo, C., You, S., Soatto, S., Wong, A., & Kadambi, A., Proceedings of European Conference on Computer Vision (ECCV), 2022
We propose a large-scale dataset of real-world rainy and clean image pairs and a method to remove degradations, induced by rain streaks and rain accumulation, from the image. As there exists no real-world dataset for deraining, current state-of-the-art methods rely on synthetic data and thus are limited by the sim2real domain gap; moreover, rigorous evaluation remains a challenge due to the absence of a real paired dataset. We fill this gap by collecting a real paired deraining dataset through meticulous control of non-rain variations. Our dataset enables paired training and quantitative evaluation for diverse real-world rain phenomena (e.g. rain streaks and rain accumulation). To learn a representation robust to rain phenomena, we propose a deep neural network that reconstructs the underlying scene by minimizing a rain-robust loss between rainy and clean images. Extensive experiments demonstrate that our model outperforms the state-of-the-art deraining methods on real rainy images under various conditions. Project website: https://visual.ee.ucla.edu/gt_rain.htm/.
The impact of partner expressions on felt emotion in the iterated prisoner’s dilemma: An event-level analysis
Angelika-Nikita, M., de Melo, C., Terada, K., & Gratch, J., Proceedings of the Ninth Annual Conference on Advances in Cognitive Systems (ACS), 2021
Social games like the prisoner’s dilemma are often used to develop models of the role of emotion in social decision-making. Here we examine an understudied aspect of emotion in such games: how an individual’s feelings are shaped by their partner’s expressions. Prior research has tended to focus on other aspects of emotion. Research on felt-emotion has focused on how an individual’s feelings shape how they treat their partner, or whether these feelings are authentically expressed. Research on expressed-emotion has focused on how an individual’s decisions are shaped by their partner’s expressions, without regard for whether these expressions actually evoke feelings. Here, we use computer-generated characters to examine how an individual’s moment-to-moment feelings are shaped by (1) how they are treated by their partner and (2) what their partner expresses during this treatment. Surprisingly, we find that partner expressions are far more important than actions in determining self-reported feelings. In other words, our partner can behave in a selfish and exploitive way, but if they show a collaborative pattern of expressions, we will feel greater pleasure collaborating with them. These results also emphasize the importance of context in determining how someone will feel in response to an expression (i.e., knowing a partner is happy is insufficient; we must know what they are happy-at). We discuss the implications of this work for cognitive-system design, emotion theory, and methodological practice in affective computing.
Vision-based gesture recognition in human-robot teams using synthetic data
de Melo, C., Rothrock, B., Gurram, P., Ulutan, O., & Manjunath, B. S., Proceedings of International Conference on Intelligent Robots and Systems (IROS), 2020
Building successful collaboration between humans and robots requires efficient, effective, and natural communication. Here we study a RGB-based deep learning approach for controlling robots through gestures (e.g., “follow me”). To address the challenge of collecting high-quality annotated data from human subjects, synthetic data is considered for this domain. We contribute a dataset of gestures that includes real videos with human subjects and synthetic videos from our custom simulator. A solution is presented for gesture recognition based on the state-of-the-art I3D model. Comprehensive testing was conducted to optimize the parameters for this model. Finally, to gather insight on the value of synthetic data, several experiments are described that systematically study the properties of synthetic data (e.g., gesture variations, character variety, generalization to new gestures). We discuss practical implications for the design of effective human-robot collaboration and the usefulness of synthetic data for deep learning.
Reducing task load with an embodied intelligent virtual assistant for improved performance in collaborative decision making
Kim, K., de Melo, C., Norouzi, N., Bruder, G., & Welch, G., Proceedings of IEEE Conference on Virtual Reality and 3D User Interfaces (IEEE VR), 2020
Collaboration in a group has the potential to achieve more effective solutions for challenging problems, but collaboration per se is not an easy task, rather a stressful burden if the collaboration partners do not communicate well with each other. While Intelligent Virtual Assistants (IVAs), such as Amazon Alexa, are becoming part of our daily lives, there are increasing occurrences in which we collaborate with such IVAs for our daily tasks. Although IVAs can provide important support to users, the limited verbal interface in the current state of IVAs lacks the ability to provide effective non-verbal social cues, which is critical for improving collaborative performance and reducing task load. In this paper, we investigate the effects of IVA embodiment on collaborative decision making. In a within-subjects study, participants performed a desert survival task in three conditions: (1) performing the task alone, (2) working with a disembodied voice assistant, and (3) working with an embodied assistant. Our results show that both assistant conditions led to higher performance over when performing the task alone, but interestingly the reported task load with the embodied assistant was significantly lower than with the disembodied voice assistant. We discuss the findings with implications for effective and efficient collaborations with IVAs while also emphasizing the increased social presence and richness of the embodied assistant.
Shaping cooperation between humans and agents with emotion expressions and framing
de Melo, C., Khooshabeh, P., Amir, O., & Gratch, J., Proceedings of Autonomous Agents and Multiagent Systems (AAMAS 18), 2018
Emotion expressions can help solve social dilemmas where individual interest is pitted against the collective interest. Building on research that shows that emotions communicate intentions to others, we reinforce that people can infer whether emotionally expressive computer agents intend to cooperate or compete. We further show important distinctions between computer agents that are perceived to be driven by humans (i.e., avatars) vs. by algorithms (i.e., agents). Our results reveal that, when the emotion expression reflects an intention to cooperate, participants will cooperate more with avatars than with agents; however, when the emotion reflects an intention to compete, participants cooperate just as little with avatars as with agents. Finally, we present first evidence that the way the dilemma is described – or framed – can influence people’s decision-making. We discuss implications for the design of autonomous agents that foster cooperation with humans, beyond what game theory predicts in social dilemmas.
Increasing fairness by delegating decisions to autonomous agents
de Melo, C., Marsella, S., & Gratch, J., Proceedings of Autonomous Agents and Multiagent Systems (AAMAS 17), 2017
There has been growing interest in autonomous agents that act on our behalf, or represent us, across various domains such as negotiation, transportation, health, finance, defense, etc. As these agent representatives become immersed in society, it is critical we understand whether and, if so, how they disrupt the traditional patterns of interaction with others. In this paper we study how programming agents to represent us shapes our decisions in social settings. Here we show that, when acting through agent representatives, people are considerably less likely to accept unfair offers from others, when compared to direct interaction with others. This result, thus, demonstrates that agent representatives have the potential to promote fairer outcomes. Moreover, we show that this effect can also occur when people are asked to “program” human representatives, thus revealing that the effect is caused by the act of programming itself. We argue this happens because programming requires the programmer to deliberate on all possible situations that might arise and, thus, promotes consideration of social norms – such as fairness – when making their decisions. These results have important theoretical, practical, and ethical implications for the design of agents that act on our behalf and for understanding the nature of people's decision making when they act through such agents.
"Do as I say, not as I do:" Challenges in delegating decisions to automated agents
de Melo, C., Marsella, S., & Gratch, J., Proceedings of Autonomous Agents and Multiagent Systems (AAMAS 16), 2016
There has been growing interest, across various domains, in computer agents that can decide on behalf of humans. These agents have the potential to save considerable time and help humans reach better decisions. One implicit assumption, however, is that, as long as the algorithms that simulate decision-making are correct and capture how humans make decisions, humans will treat these agents similarly to other humans. Here we show that interaction with agents that act on our behalf or on behalf of others is richer and more interesting than initially expected. Our results show that, on the one hand, people are more selfish with agents acting on behalf of others, than when interacting directly with others. We propose that agents increase the social distance with others which, subsequently, leads to increased demand. On the other hand, when people task an agent to interact with others, people show more concern for fairness than when interacting directly with others. In this case, higher psychological distance leads people to consider their social image and the long-term consequences of their actions and, thus, behave more fairly. To support these findings, we present an experiment where people engaged in the ultimatum game, either directly or via an agent, with others or agents representing others. We show that these patterns of behavior also occur in a variant of the ultimatum game – the impunity game – where others have minimal power over the final outcome. Finally, we study how social value orientation – i.e., people’s propensity for cooperation – impacts these effects. These results have important implications for our understanding of the psychological mechanisms underlying interaction with agents, as well as practical implications for the design of successful agents that act on our behalf or on behalf of others.
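For readers unfamiliar with the two games, the sketch below encodes their payoff rules as commonly defined in experimental economics; the function names and the integer stakes are illustrative assumptions and are not taken from the paper. In the ultimatum game a rejection leaves both players with nothing, whereas in the impunity game a rejection only costs the responder, which is why the responder has minimal power over the final outcome.

```python
def ultimatum_payoffs(total, offer, accepted):
    """Return (proposer, responder) payoffs in the ultimatum game."""
    if accepted:
        return total - offer, offer
    # Rejection destroys the whole pie for both players.
    return 0, 0

def impunity_payoffs(total, offer, accepted):
    """Return (proposer, responder) payoffs in the impunity game."""
    if accepted:
        return total - offer, offer
    # Rejection only forfeits the responder's share; the proposer is unaffected.
    return total - offer, 0

# Toy example: an offer of 3 out of 10 is rejected in each game.
rejected_ultimatum = ultimatum_payoffs(10, 3, accepted=False)
rejected_impunity = impunity_payoffs(10, 3, accepted=False)
```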
Beyond believability: Quantifying the differences between real and virtual humans.
de Melo, C., & Gratch, J., Proceedings of the 15th International Conference on Intelligent Virtual Agents (IVA 15), 2015
“Believable” agents are supposed to “suspend the audience's disbelief” and provide the “illusion of life”. However, beyond such high-level definitions, which are prone to subjective interpretation, there is not much more to help researchers systematically create or assess whether their agents are believable. In this paper we propose a more pragmatic and useful benchmark than believability for designing virtual agents. This benchmark requires people, in a specific social situation, to act with the virtual agent in the same manner as they would with a real human. We propose that perceptions of mind in virtual agents, especially pertaining to agency – the ability to act and plan – and experience – the ability to sense and feel emotion – are critical for achieving this new benchmark. We also review current computational systems that fail, pass, and even surpass this benchmark and show how a theoretical framework based on perceptions of mind can shed light on these systems. We also discuss a few important cases where it is better if virtual humans do not pass the benchmark. We discuss implications for the design of virtual agents that can be as natural and efficient to interact with as real humans.
People show envy, not guilt, when making decisions with machines.
de Melo, C., & Gratch, J., Proceedings of the 6th International Conference on Affective Computing and Intelligent Interaction (ACII 15), 2015
Research shows that people consistently reach more efficient solutions than those predicted by standard economic models, which assume people are selfish. Artificial intelligence, in turn, seeks to create machines that can achieve these levels of efficiency in human-machine interaction. However, as reinforced in this paper, people's decisions are systematically less efficient – i.e., less fair and favorable – with machines than with humans. To understand the cause of this bias, we resort to a well-known experimental economics model: Fehr and Schmidt's inequity aversion model. This model accounts for people's aversion to disadvantageous outcome inequality (envy) and aversion to advantageous outcome inequality (guilt). We present an experiment where participants engaged in the ultimatum and dictator games with human or machine counterparts. By fitting this data to Fehr and Schmidt's model, we show that people acted as if they were just as envious of humans as of machines; but, in contrast, people showed less guilt when making unfavorable decisions to machines. This result, thus, provides critical insight into this bias people show, in economic settings, in favor of humans. We discuss implications for the design of machines that engage in social decision making with humans.
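The inequity aversion model referenced above has a compact two-player form: a player's utility is their own payoff, minus an envy penalty when they are behind, minus a guilt penalty when they are ahead. The sketch below implements that standard form; the parameter values and the acceptance rule for the ultimatum game are illustrative assumptions, not estimates from the paper.

```python
def fehr_schmidt_utility(own, other, envy, guilt):
    """Two-player Fehr-Schmidt utility.

    envy:  weight on disadvantageous inequality (other earns more than own).
    guilt: weight on advantageous inequality (own earns more than other).
    """
    disadvantage = max(other - own, 0.0)
    advantage = max(own - other, 0.0)
    return own - envy * disadvantage - guilt * advantage

def responder_accepts(offer, total, envy, guilt):
    """Ultimatum-game responder accepts if utility is at least that of rejecting.

    Rejecting leaves both players with 0, which has utility 0.
    """
    return fehr_schmidt_utility(offer, total - offer, envy, guilt) >= 0.0

# Toy example with made-up parameters: a low offer is rejected, a fairer one accepted.
low_offer_utility = fehr_schmidt_utility(2.0, 8.0, envy=0.5, guilt=0.25)
accepts_low = responder_accepts(2.0, 10.0, envy=0.5, guilt=0.25)
accepts_fair = responder_accepts(4.0, 10.0, envy=0.5, guilt=0.25)
```

Under this reading, the reported result corresponds to a similar envy weight for human and machine counterparts but a smaller guilt weight when the counterpart is a machine.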
The importance of cognition and affect for artificially intelligent decision makers.
de Melo, C., Gratch, J., Carnevale, P., Proceedings of the 28th Conference on Artificial Intelligence (AAAI 14), 2014
Agency – the capacity to plan and act – and experience – the capacity to sense and feel – are two critical aspects that determine whether people will perceive non-human entities, such as autonomous agents, to have a mind. There is evidence that the absence of either can reduce cooperation. We present an experiment that tests the necessity of both for cooperation with agents. In this experiment we manipulated people's perceptions about the cognitive and affective abilities of agents, when engaging in the ultimatum game. The results indicated that people offered more money to agents that were perceived to make decisions according to their intentions (high agency), rather than randomly (low agency). Additionally, the results showed that people offered more money to agents that expressed emotion (high experience), when compared to agents that did not (low experience). We discuss the implications of this agency-experience theoretical framework for the design of artificially intelligent decision makers.
Using virtual confederates to research intergroup bias and conflict.
de Melo, C., Carnevale, P., & Gratch, J., Best Paper Proceedings of the Annual Meeting of the Academy of Management (AOM 14), 2014
Virtual confederates–i.e., three-dimensional virtual characters that look and act like humans–have been gaining in popularity as a research method in the social and medical sciences. Interest in this research method stems from the potential for increased experimental control, ease of replication, facilitated access to broader samples and lower costs. We argue that virtual confederates are also a promising research tool for the study of intergroup behavior. To support this claim we replicate and extend with virtual confederates key findings in the literature. In Experiment 1 we demonstrate that people apply racial stereotypes to virtual confederates, and show a corresponding bias in terms of money offered in the dictator game. In Experiment 2 we show that people also show an in-group bias when group membership is artificially created and based on interdependence through shared payoffs in a nested social dilemma. Our results further demonstrate that social categorization and bias can occur not only when people believe confederates are controlled by humans (i.e., they are avatars), but also when confederates are believed to be controlled by computer algorithms (i.e., they are agents). The results, nevertheless, show a basic bias in favor of avatars (the in-group in the “human category”) to agents (the out-group). Finally, our results (Experiments 2 and 3) establish that people can combine, in additive fashion, the effects of these social categories; a mechanism that, accordingly, can be used to reduce intergroup bias. We discuss implications for research in social categorization, intergroup bias and conflict.
The effect of agency on the impact of emotion expressions on people's decision making.
de Melo, C., Gratch, J., Carnevale, P., Proceedings of the International Conference of Affective Computing and Intelligent Interaction (ACII 13), 2013
Recent research in neuroeconomics reveals that people show different behavior and lower activation of brain regions associated with mentalizing (i.e., the inference of others' mental states) when engaged in decision making tasks with a computer, when compared to a human. These findings are important for affective computing because they suggest people's decision making might be influenced differently according to whether they believe the emotional expressions shown by a computer are being generated by a computer algorithm or a human. To test this, we had people engage in a social dilemma (Experiment 1) or a negotiation (Experiment 2) with virtual humans that were either agents (i.e., controlled by computers) or avatars (i.e., controlled by humans). The results show a clear agency effect: in Experiment 1, people cooperated more with virtual humans that showed facial cooperative displays (e.g., joy after mutual cooperation) rather than competitive displays (e.g., joy when the participant was exploited), but the effect was only significant with avatars; in Experiment 2, people conceded more to an angry than a neutral virtual human but, once again, the effect was only significant with avatars.
The effect of virtual agent's emotion displays and appraisals on people's decision making in negotiation.
de Melo, C., Carnevale, P., & Gratch, J., Proceedings of The 12th International Conference on Intelligent Virtual Agents (IVA 12), 2012
There is growing evidence that emotion displays can impact people's decision making in negotiation. However, despite increasing interest in AI and HCI on negotiation as a means to resolve differences between humans and agents, emotion has been largely ignored. We explore how emotion displays in virtual agents impact people's decision making in human-agent negotiation. This paper presents an experiment (N=204) that studies the effects of virtual agents' displays of joy, sadness, anger and guilt on people's decision to counteroffer, accept or drop out from the negotiation, as well as on people's expectations about the agents' decisions. The paper also presents evidence for a mechanism underlying such effects based on appraisal theories of emotion whereby people retrieve, from emotion displays, information about how the agent is appraising the ongoing interaction and, from this information, infer about the agent's intentions and reach decisions themselves. We discuss implications for the design of intelligent virtual agents that can negotiate effectively.
Bayesian model of the social effects of emotion in decision-making in multiagent systems.
de Melo, C., Carnevale, P., Read, S., Antos, D., & Gratch, J., Proceedings of Autonomous Agents and Multiagent Systems (AAMAS 12), 2012
Research in the behavioral sciences suggests that emotion can serve important social functions and that, more than a simple manifestation of internal experience, emotion displays communicate one's beliefs, desires and intentions. In a recent study we have shown that, when engaged in the iterated prisoner's dilemma with agents that display emotion, people infer, from the emotion displays, how the agent is appraising the ongoing interaction (e.g., is the situation favorable to the agent? Does it blame me for the current state-of-affairs?). From these appraisals people, then, infer whether the agent is likely to cooperate in the future. In this paper we propose a Bayesian model that captures this social function of emotion. The model supports probabilistic predictions, from emotion displays, about how the counterpart is appraising the interaction which, in turn, lead to predictions about the counterpart's intentions. The model's parameters were learnt using data from the empirical study. Our evaluation indicated that considering emotion displays improved the model's ability to predict the counterpart's intentions, in particular, how likely it was to cooperate in a social dilemma. Using data from another empirical study where people made inferences about the counterpart's likelihood of cooperation in the absence of emotion displays, we also showed that the model could, from information about appraisals alone, make appropriate inferences about the counterpart's intentions. Overall, the paper suggests that appraisals are valuable for computational models of emotion interpretation. The relevance of these results for the design of multiagent systems where agents, human or not, can convey or recognize emotion is discussed.
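The inference chain described above (emotion display, then appraisal, then intention) can be sketched as a marginalization over appraisals. The example below is a deliberately simplified illustration: the single binary appraisal variable, the function name, and every probability in the tables are made-up assumptions, whereas the paper learns its parameters from empirical data.

```python
def predict_cooperation(emotion, p_appraisal_given_emotion, p_coop_given_appraisal):
    """P(cooperate | emotion) = sum over appraisals a of P(a | emotion) * P(cooperate | a)."""
    return sum(
        p_a * p_coop_given_appraisal[appraisal]
        for appraisal, p_a in p_appraisal_given_emotion[emotion].items()
    )

# Hypothetical tables: how a displayed emotion maps to the counterpart's
# appraisal of the interaction, and how appraisals map to cooperation.
P_APPRAISAL_GIVEN_EMOTION = {
    "joy": {"favorable": 0.8, "unfavorable": 0.2},
    "anger": {"favorable": 0.1, "unfavorable": 0.9},
}
P_COOP_GIVEN_APPRAISAL = {"favorable": 0.9, "unfavorable": 0.2}

p_coop_joy = predict_cooperation("joy", P_APPRAISAL_GIVEN_EMOTION, P_COOP_GIVEN_APPRAISAL)
p_coop_anger = predict_cooperation("anger", P_APPRAISAL_GIVEN_EMOTION, P_COOP_GIVEN_APPRAISAL)
```

Because the second table depends only on appraisals, the same prediction can be made from appraisal information alone when no emotion display is available, which mirrors the second evaluation described in the abstract.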
A computer model of the interpersonal effect of emotion displayed in a social dilemma.
de Melo, C., Carnevale, P., Antos, D., & Gratch, J., Proceedings of Affective Computing and Intelligent Interaction (ACII 11), 2011
The paper presents a computational model for decision-making in a social dilemma that takes into account the other party's emotion displays. The model is based on data collected in a series of recent studies where participants play the iterated prisoner's dilemma with agents that, even though following the same action strategy, show different emotion displays according to how the game unfolds. We collapse data from all these studies and fit, using maximum likelihood estimation, probabilistic models that predict likelihood of cooperation in the next round given different features. Model 1 predicts based on round outcome alone. Model 2 predicts based on outcome and emotion displays. Model 3 also predicts based on outcome and emotion but considers contrast effects found in the empirical studies regarding the order with which participants play cooperators and non-cooperators. To evaluate the models, we replicate the original studies but substitute the models for the humans. The results reveal that Model 3 best replicates human behavior in the original studies and Model 1 does the worst. The results, first, emphasize recent research about the importance of nonverbal cues in social dilemmas and, second, reinforce that people attend to contrast effects in their decision-making. Theoretically, the model provides further insight into how people behave in social dilemmas. Pragmatically, the model could be used to drive an agent that is engaged in a social dilemma with a human (or another agent).
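A minimal sketch of how such models could be fit is shown below, assuming each model is a table of cooperation probabilities conditioned on discrete features. For a Bernoulli outcome with categorical conditions, the maximum likelihood estimate is the observed cooperation frequency in each condition. The record format, field names, and toy data are hypothetical and do not come from the original studies.

```python
from collections import defaultdict

def fit_cooperation_model(rounds, features):
    """Maximum likelihood estimate of P(cooperate next | features).

    rounds: list of dicts with the feature fields plus 'cooperated_next'.
    features: names of the fields the model conditions on.
    """
    counts = defaultdict(lambda: [0, 0])  # condition -> [cooperations, total]
    for record in rounds:
        key = tuple(record[name] for name in features)
        counts[key][0] += int(record["cooperated_next"])
        counts[key][1] += 1
    # The MLE for each condition is its relative frequency of cooperation.
    return {key: coop / total for key, (coop, total) in counts.items()}

# Hypothetical toy data: outcome of the round, agent's emotion display,
# and whether the participant cooperated in the following round.
TOY_ROUNDS = [
    {"outcome": "CC", "emotion": "joy", "cooperated_next": True},
    {"outcome": "CC", "emotion": "joy", "cooperated_next": True},
    {"outcome": "CC", "emotion": "anger", "cooperated_next": False},
    {"outcome": "DC", "emotion": "joy", "cooperated_next": False},
]

model_1 = fit_cooperation_model(TOY_ROUNDS, ["outcome"])             # outcome only
model_2 = fit_cooperation_model(TOY_ROUNDS, ["outcome", "emotion"])  # outcome + emotion
```

In this toy data, conditioning on the emotion display separates the mutual-cooperation rounds into two groups with very different cooperation rates, which is the kind of extra predictive signal the second model is meant to capture.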
The effect of expression of anger and happiness in computer agents on negotiations with humans.
de Melo, C., Carnevale, P., & Gratch, J., Proceedings of Autonomous Agents and Multiagent Systems (AAMAS 11), 2011
There is now considerable evidence in social psychology, economics, and related disciplines that emotion plays an important role in negotiation. For example, humans make greater concessions in negotiation to an opposing human who expresses anger, and they make fewer concessions to an opponent who expresses happiness, compared to a no-emotion-expression control. However, in AI, despite the wide interest in negotiation as a means to resolve differences between agents and humans, emotion has been largely ignored. This paper explores whether expression of anger or happiness by computer agents, in a multi-issue negotiation task, can produce effects that resemble effects seen in human-human negotiation. The paper presents an experiment where participants play with agents that express emotions (anger vs. happiness vs. control) through different modalities (text vs. facial displays). An important distinction in our experiment is that participants are aware that they negotiate with computer agents. The data indicate that the emotion effects observed in past work with humans also occur in agent-human negotiation, and occur independently of modality of expression. The implications of these results are discussed for the fields of automated negotiation, intelligent virtual agents and artificial intelligence.
The influence of emotion expression on perceptions of trustworthiness in negotiation.
Antos, D., de Melo, C., Gratch, J., & Grosz, B., Proceedings of The 25th Conference on Artificial Intelligence (AAAI 11), 2011
When interacting with computer agents, people make inferences about various characteristics of these agents, such as their reliability and trustworthiness. These perceptions are significant, as they influence people's behavior towards the agents, and may foster or inhibit repeated interactions between them. In this paper we investigate whether computer agents can use the expression of emotion to influence human perceptions of trustworthiness. In particular, we study human-computer interactions within the context of a negotiation game, in which players make alternating offers to decide on how to divide a set of resources. A series of negotiation games between a human and several agents is then followed by a “trust game.” In this game people have to choose one among several agents to interact with, as well as how much of their resources they will trust to it. Our results indicate that, among those agents that displayed emotion, those whose expression was in accord with their actions (strategy) during the negotiation game were generally preferred as partners in the trust game over those whose emotion expressions and actions did not mesh. Moreover, we observed that when emotion does not carry useful new information, it fails to strongly influence human decision-making behavior in a negotiation setting.
The influence of emotions in embodied agents on human decision-making.
de Melo, C., Carnevale, P., & Gratch, J., Proceedings of Intelligent Virtual Agents (IVA 10), 2010
Acknowledging the social functions that emotions serve, there has been growing interest in the interpersonal effect of emotion in human decision making. Following the paradigm of experimental games from social psychology and experimental economics, we explore the interpersonal effect of emotions expressed by embodied agents on human decision making. The paper describes an experiment where participants play the iterated prisoner's dilemma against two different agents that play the same strategy (tit-for-tat), but communicate different goal orientations (cooperative vs. individualistic) through their patterns of facial displays. The results show that participants are sensitive to differences in the facial displays and cooperate significantly more with the cooperative agent. The data indicate that emotions in agents can influence human decision making and that the nature of the emotion, as opposed to mere presence, is crucial for these effects. We discuss the implications of the results for designing human-computer interfaces and understanding human-human interaction.
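For readers unfamiliar with tit-for-tat, the strategy both agents follow can be sketched in a few lines: cooperate on the first round, then repeat the counterpart's previous move. The move labels "C" and "D" and the helper names are assumptions made for this illustration; the facial displays studied in the paper are not modeled here.

```python
def tit_for_tat(opponent_history):
    """Cooperate first, then copy the opponent's most recent move."""
    return "C" if not opponent_history else opponent_history[-1]

def play(human_moves):
    """Return the agent's moves against a fixed sequence of human moves."""
    agent_moves = []
    for round_index in range(len(human_moves)):
        # The agent only sees the human's moves from earlier rounds.
        agent_moves.append(tit_for_tat(human_moves[:round_index]))
    return agent_moves

agent = play(["C", "D", "D", "C"])  # -> ["C", "C", "D", "D"]
```

Because the action strategy is identical across agents, any difference in participants' cooperation can be attributed to the displays rather than to the moves.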
Evolving expression of emotions through color in virtual humans using genetic algorithms.
de Melo, C., & Gratch, J., Proceedings of the 1st International Conference on Computational Creativity (ICCC 10), 2010
For centuries artists have been exploring the formal elements of art (lines, space, mass, light, color, sound, etc.) to express emotions. This paper takes this insight to explore new forms of expression for virtual humans which go beyond the usual bodily, facial and vocal expression channels. In particular, the paper focuses on how to use color to influence the perception of emotions in virtual humans. First, a lighting model and filters are used to manipulate color. Next, an evolutionary model, based on genetic algorithms, is developed to learn novel associations between emotions and color. An experiment is then conducted where non-experts evolve mappings for joy and sadness, without being aware that genetic algorithms are used. In a second experiment, the mappings are analyzed with respect to their features and how general they are. Results indicate that the average fitness increases with each new generation, thus suggesting that people are succeeding in creating novel and useful mappings for the emotions. Moreover, the results show consistent differences between the evolved images of joy and the evolved images of sadness.
Expression of emotions using wrinkles, blushing, sweating and tears.
de Melo, C., & Gratch, J., Proceedings of the Intelligent Virtual Agents (IVA 09), 2009
Wrinkles, blushing, sweating and tears are physiological manifestations of emotions in humans. Therefore, the simulation of these phenomena is important for the goal of building believable virtual humans which interact naturally and effectively with humans. This paper describes a real-time model for the simulation of wrinkles, blushing, sweating and tears. A study is also conducted to assess the influence of the model on the perception of surprise, sadness, anger, shame, pride and fear. The study follows a repeated-measures design where subjects compare how well each emotion is expressed by virtual humans with or without these phenomena. The results reveal a significant positive effect on the perception of surprise, sadness, anger, shame and fear. The relevance of these results is discussed for the fields of virtual humans and expression of emotions.
Expression of moral emotions in cooperating agents.
de Melo, C., Zheng, L., & Gratch, J., Proceedings of Intelligent Virtual Agents (IVA 09), 2009
Moral emotions have been argued to play a central role in the emergence of cooperation in human-human interactions. This work describes an experiment which tests whether this insight carries to virtual human-human interactions. In particular, the paper describes a repeated-measures experiment where subjects play the iterated prisoner's dilemma with two versions of the virtual human: (a) neutral, which is the control condition; (b) moral, which is identical to the control condition except that the virtual human expresses gratitude, distress, remorse, reproach and anger through the face according to the action history of the game. Our results indicate that subjects cooperate more with the virtual human in the moral condition and that they perceive it to be more human-like. We discuss the relevance these results have for building agents which are successful in cooperating with humans.
The effect of color on expression of joy and sadness in virtual humans.
de Melo, C., & Gratch, J., Proceedings of the Affective Computing and Intelligent Interaction (ACII 09), 2009
For centuries artists have been exploring color to express emotions. Following this insight, the paper describes an approach to learn how to use color to influence the perception of emotions in virtual humans. First, a model of lighting and filters inspired by the visual arts is integrated with a virtual human platform to manipulate color. Next, an evolutionary model, based on genetic algorithms, is created to evolve mappings between emotions and lighting and filter parameters. A first study is then conducted where subjects evolve mappings for joy and sadness without being aware of the evolutionary model. In a second study, the features which characterize the mappings are analyzed. Results show that virtual human images of joy tend to be brighter, more saturated and have more colors than images of sadness. The paper discusses the relevance of the results for the fields of expression of emotions and virtual humans.
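A generic genetic algorithm of the kind this abstract refers to can be sketched as follows. Everything specific here is an assumption: the genome is a list of hypothetical lighting and filter parameters in [0, 1], and `brightness_fitness` is a stand-in for the ratings that human subjects provide in the studies. The selection, crossover, and mutation operators are common textbook choices, not necessarily those used in the paper.

```python
import random

def evolve(fitness, genome_len=4, pop_size=20, generations=30,
           mutation_rate=0.1, seed=0):
    """Evolve parameter vectors in [0, 1]; return the best genome and
    the best fitness observed at the start of each generation."""
    rng = random.Random(seed)
    pop = [[rng.random() for _ in range(genome_len)] for _ in range(pop_size)]
    best_per_gen = []
    for _ in range(generations):
        ranked = sorted(pop, key=fitness, reverse=True)
        best_per_gen.append(fitness(ranked[0]))
        next_pop = [ranked[0][:]]  # elitism: carry the best genome forward
        while len(next_pop) < pop_size:
            # Tournament selection: best of three random individuals.
            p1 = max(rng.sample(pop, 3), key=fitness)
            p2 = max(rng.sample(pop, 3), key=fitness)
            cut = rng.randrange(1, genome_len)  # single-point crossover
            child = p1[:cut] + p2[cut:]
            # Gaussian mutation, clamped to the valid parameter range.
            child = [
                min(1.0, max(0.0, g + rng.gauss(0, 0.1)))
                if rng.random() < mutation_rate else g
                for g in child
            ]
            next_pop.append(child)
        pop = next_pop
    return max(pop, key=fitness), best_per_gen

def brightness_fitness(genome):
    """Hypothetical stand-in for a human rating: prefer brighter settings."""
    return sum(genome) / len(genome)

best, history = evolve(brightness_fitness)
```

In an interactive setting, the call to `fitness` would be replaced by a score collected from a person viewing the rendered image.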
Creative expression of emotions in virtual humans.
de Melo, C., & Gratch, J., Proceedings of the International Conference on the Foundations of Digital Games (FDG 09), 2009
We summarize our work on creative expression of emotion based on techniques from the arts.
Evolving expression of emotions in virtual humans using lights and pixels.
de Melo, C., & Gratch, J., Proceedings of Intelligent Virtual Agents (IVA 08), 2008
We summarize our work on using genetic algorithms to evolve emotion expression through lighting and color.
Expression of emotions in virtual humans using lights, shadows, composition and filters.
de Melo, C., & Paiva, A., Proceedings of Affective Computing and Intelligent Interaction (ACII 07), 2007
Artists use words, lines, shapes, color, sound and their bodies to express emotions. Virtual humans use postures, gestures, face and voice to express emotions. Why are they limiting themselves to the body? The digital medium affords the expression of emotions using lights, camera, sound and the pixels in the screen itself. Thus, leveraging accumulated knowledge from the arts, this work proposes a model for the expression of emotions in virtual humans which goes beyond embodiment and explores lights, shadows, composition and filters to convey emotions. First, the model integrates the OCC emotion model for emotion synthesis. Second, the model defines a pixel-based lighting model which supports extensive expressive control of lights and shadows. Third, the model explores the visual arts techniques of composition in layers and filtering to manipulate the virtual human pixels themselves. Finally, the model introduces a markup language to define mappings between emotional states and multimodal expression.
Mainstream games in the multi-agent classroom.
de Melo, C., Prada, R., Raimundo, G., Pardal, J., Pinto, H., & Paiva, A., Proceedings of IEEE/WIC/ACM Intelligent Agent Technology (IAT 06), 2006
Computer games make learning fun and support learning through doing. Edutainment software tries to capitalize on this; however, it has failed in reaching the levels of motivation and engagement seen in mainstream games. In this context, we have integrated a mainstream first-person shooter game, Counter-Strike, into the curriculum of our Autonomous Agents and Multi-agent Systems course. In this paper we describe this integration and a platform to support the creation of Counter-Strike agents. In addition, a questionnaire was posed to our students to assess the success of our approach. Results show that students found the idea of applying a first-person-shooter game motivating and the integration with the curriculum useful for their education.
A story about gesticulation expression.
de Melo, C., & Paiva, A., Proceedings of the Intelligent Virtual Agents Conference (IVA 06), 2006
Gesticulation is essential for the storytelling experience; thus, virtual storytellers should be endowed with gesticulation expression. This work proposes a gesticulation expression model based on psycholinguistics. The model supports: (a) real-time gesticulation animation described as sequences of constraints on static (Portuguese Sign Language hand shapes, orientations and positions) and dynamic (motion profiles) features; (b) multimodal synchronization between gesticulation and speech; (c) automatic reproduction of annotated gesticulation according to GestuRA, a gesture transcription algorithm. To evaluate the model, two studies, involving 147 subjects, were conducted. In both cases, the idea consisted of comparing the narration of the Portuguese traditional story “The White Rabbit” by a human storyteller with a version by a virtual storyteller. Results indicate that synthetic gestures fared well when compared to real gestures; however, subjects preferred the human storyteller.
Environment expression: Expressing emotions through cameras, lights and music.
de Melo, C., & Paiva, A., Proceedings of Affective Computing and Intelligent Interaction (ACII 05), 2005
Environment expression is about going beyond the usual human emotion expression channels in virtual worlds. This work proposes an integrated storytelling model – the environment expression model – capable of expressing emotions through three channels: cinematography, illumination and music. Stories are organized into prioritized points of interest which can be characters or dialogues. Characters synthesize cognitive emotions based on the OCC emotion theory. Dialogues have collective emotional states which reflect the participants' emotional state. During storytelling, at each instant, the highest priority point of interest is focused through the expression channels. The cinematography channel and the illumination channel reflect the point of interest's strongest emotion type and intensity. The music channel reflects the valence of the point of interest's mood. Finally, a study was conducted to evaluate the model. Results confirm the influence of environment expression on emotion perception and reveal moderate success of this work's approach.
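The selection logic in this abstract can be sketched as a small data structure plus one function: pick the highest-priority point of interest, let cinematography and illumination follow its strongest emotion, and let music follow the valence of its mood. The class, field names, emotion labels, and numeric scales below are hypothetical, chosen for illustration only.

```python
from dataclasses import dataclass

@dataclass
class PointOfInterest:
    name: str
    priority: int        # higher value = more important (assumed convention)
    emotions: dict       # emotion type -> intensity, e.g. {"fear": 0.7}
    mood_valence: float  # negative = unpleasant mood, positive = pleasant

def express(points):
    """Choose what each expression channel should reflect at this instant."""
    focus = max(points, key=lambda p: p.priority)
    # Strongest emotion of the focused point of interest.
    emotion, intensity = max(focus.emotions.items(), key=lambda kv: kv[1])
    return {
        "focus": focus.name,
        "cinematography": (emotion, intensity),
        "illumination": (emotion, intensity),
        "music": "positive" if focus.mood_valence >= 0 else "negative",
    }

# Hypothetical scene with one character and one dialogue.
scene = [
    PointOfInterest("hero", 2, {"fear": 0.7, "hope": 0.3}, -0.4),
    PointOfInterest("dialogue", 1, {"joy": 0.9}, 0.6),
]
channels = express(scene)
```

Calling `express` once per instant, as priorities and emotional states change, would approximate the behavior described above.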
Environment expression: Telling stories through cameras, lights and music.
de Melo, C., & Paiva, A., Proceedings of The International Conference on Virtual Storytelling (ICVS 05), 2005
This work proposes an integrated model – the environment expression model – which supports storytelling through three channels: cinematography, illumination and music. Stories are modeled as a set of points of interest which can be characters, dialogues or sceneries. At each instant, the audience's focus is drawn to the highest priority point of interest. Expression channels reflect the type and emotional state of this point of interest. A study, using a cartoon-like application, was also conducted to evaluate the model. Results were inconclusive regarding influence on story interpretation, but succeeded in showing preference for stories told with environment expression.