“Emo Sim”: Expressing Voice–Based Emotions in Mobile Interfaces

“Emo Sim”: Expressing Voice–Based Emotions in Mobile Interfaces

Prabath Weerasinghe, Rasika Ranaweera, Senaka Amarakeerthi, & Michael Cohen

Spatial Media Group, University of Aizu, Aizu-Wakamatsu, Fukushima-ken 965-8580; Japan

e-mail: {m5141110, d8121104, d8111101, mcohen}@u-aizu.ac.jp

Abstract

Human interaction with mobile devices is currently a very active research area. Speech enriched with emotions is one of the major ways of exchanging ideas, especially via telephony. By analyzing a voice stream with a program written in matlab, different emotions in a human voice can be recognized. Using a simple Java client, each recognized emotion is delivered to a server as an index number. A mobile client then retrieves the emotion and displays it through colored icons. Each emotion is mapped to a particular color, since it is natural to use colors to represent various expressions. We believe that with the help of this application one could conceivably avoid chatting with somebody whose emotional state is negative! Immersive virtual environments such as Second Life and Wonderland allow users to meet in synthetic worlds, performing shared object manipulation and other group tasks, including voice chat. By extending the emotion classifier, a user's feelings can be conveyed as gestures (actually animations) of an avatar inside the virtual world.

Keywords: emotion detection, emotion classification, emotion representation, iαppli, colors and emotions, Collaborative Virtual Environment, Open Wonderland

1

Introduction

Although perceptions of color are somewhat subjective, some color effects have universal meaning [3]. To display various kinds of emotions on a mobile screen, the emotions were therefore mapped to carefully selected colors. Communication between the mobile phone and the matlab program is done through the cve server, a Java client–server protocol. The application was also extended for use in conjunction with virtual environments such as Wonderland [2][8]: through the cve server, emotions can be exchanged, so an avatar-represented user in Wonderland can voice chat while a mobile application displays each avatar's emotions.

2

Mobile Devices

Most modern mobile devices, such as the iPhone, iPad, and netbooks, are internet-capable and offer rich hardware capabilities. With them, people schedule daily activities (such as shopping, dining, and picnics), business activities (such as banking, auctions, and mail), and social activities (such as events, chatting, and dating). Today's phones not only have larger screens, high-speed processors, and generous memory, but also support a wide variety of programming languages, including Java, Python, C/C++, and Objective-C. Vendors also encourage developers to create new applications by opening marketplaces. NTT DoCoMo provides an integrated platform for Java-based applications, a jme profile-compliant framework called DoJa (DoCoMo Java), used to develop such applications with the iαppli framework [4][16] (Table 1).

NTT DoCoMo, Japan's premier provider of leading-edge mobile voice services, offers iαppli, an integrated platform for Java-based application programs. A program was developed on this platform to display the emotions of a human voice using colors. It is recognized that colors have a strong impact on emotions and feelings [6].

3

Human Voice–Based Emotion

We experience countless subtle emotions all the time. Human speech is one of the major ways of exchanging ideas, especially via telephones.

Handset Model   Color Depth    Bits
F502it          256 colors     8
N502it          256 colors     8
D503i           4096 colors    12
F503i           256 colors     8
P503i           256 colors     8
N503i           4096 colors    12
SO503i          65536 colors   16

Table 1: Color Depths of i-mode Handsets

Normally, human speech is enriched with emotions. Emotions are not discrete entities with clear boundaries, and a neutral state is not defined in most human emotion models. Such issues complicate modeling human emotions in virtual environments; consequently, most popular cves, such as Wonderland and Second Life, have ignored emotional communication. One of our intentions is to address this issue. In contemporary research, the six basic emotions classified by Paul Ekman (anger, dislike, fear, happiness, sadness, surprise) are often used [5].
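Since the recognized emotion travels over the network as an index number (see the Abstract), the six Ekman categories can be represented as a simple enumeration. The index ordering below is our illustrative assumption, not one fixed by the classifier:

```java
// Ekman's six basic emotions [5], indexed so that a classifier result
// can be sent over the wire as a single integer. The ordering is an
// assumption for illustration; the real classifier may use another order.
public enum Emotion {
    ANGER, DISLIKE, FEAR, HAPPINESS, SADNESS, SURPRISE;

    // Decode an index received from the network back into an emotion.
    public static Emotion fromIndex(int i) {
        Emotion[] all = values();
        if (i < 0 || i >= all.length)
            throw new IllegalArgumentException("no emotion with index " + i);
        return all[i];
    }

    public static void main(String[] args) {
        System.out.println(Emotion.fromIndex(3)); // HAPPINESS
    }
}
```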

3.1

Emotion Representation using Colors

Figure 1: Extended ι·Con Interface: The triangles represent mobile users in the virtual environment, and color represents each user's emotional state.

People have long used colors to represent various kinds of emotions. Buddhist priests wear yellow robes to represent happiness. In most cultures white denotes purity [15]. Red, known as a warm color, evokes emotions ranging from feelings of warmth to feelings of anger and hostility [3]. The American vernacular is full of ties between color and mood: when people are sad they are blue, they are green with envy, and anger is often associated with red [7] (Fig. 2). Colors can also be classified along dimensions such as warm–cool, heavy–light, modern–classical, clean–dirty, active–passive, hard–soft, tense–relaxed, fresh–stale, masculine–feminine, and like–dislike. Experimental results show no significant difference between male and female color preferences, whereas results do differ between cultures [11][12][13].

• Purple represents arrogance and mourning [7].
• Green is calming and healing, and is the color doctors wear in operating rooms. It symbolizes generosity, fertility, jealousy, envy, and misfortune [7].
• Yellow symbolizes joy, happiness, and optimism [7]. Yellow is also a warm color [3], so it is a perennial favorite in interior design.
• Black, even though it is the absence of light, means power, fear, unhappiness, sadness, and death [7]. This supports the practice of wearing black at funerals in Western cultures.

• Blue represents confidence, security, order, loyalty, and depression [7].
• Red is associated with emotions like love and danger; it also represents life and vitality [7]. The combination of red and white signifies happiness and celebration to the Japanese [15].

In their research, Robert Plutchik (Fig. 2), Claudia Cortes, Naz Kaya, Shirley Willett, and Johann Wolfgang von Goethe have suggested different models for emotion–color mappings [9]. Perceptions of color are somewhat subjective [3], and classifications of colors differ between cultures [11]. Even though emotion–color models do not follow a standard set of emotions, we used the emotions classified by Paul Ekman and summarized the mappings in Table 2.

Log Frequency Cepstral Coefficients (lfcc) of voice frames are used as features [10][2]. A real-time system acquires sound via a microphone, and the sound stream is digitized by pulse-code modulation (pcm). The stream is then segmented, and the resulting segments are used for feature extraction. As the final step, emotion classification is done by six hmms, one per emotion.
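The final classification step — choosing the emotion whose hmm scores the segment highest — reduces to an argmax over six log-likelihoods. A minimal sketch; the score values and label ordering below are hypothetical, not taken from the trained models:

```java
// Pick the emotion whose HMM scored a segment highest. Labels and
// scores are illustrative; real scores would come from the six trained
// HMMs described in the text.
public class HmmArgmax {
    static final String[] LABELS =
        {"anger", "dislike", "fear", "happiness", "sadness", "surprise"};

    // Returns the index of the best-scoring model.
    public static int classify(double[] logLikelihoods) {
        int best = 0;
        for (int i = 1; i < logLikelihoods.length; i++)
            if (logLikelihoods[i] > logLikelihoods[best])
                best = i;
        return best;
    }

    public static void main(String[] args) {
        double[] scores = {-310.2, -305.7, -298.4, -290.1, -301.9, -299.5};
        System.out.println(LABELS[classify(scores)]); // happiness
    }
}
```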

4

System Architecture

4.1

CVE Client–Server Architecture

Cve is a Java client–server protocol developed by the Spatial Media Group of the University of Aizu. Clients connect to a session server host by subscribing to channels; clients that need to communicate with each other subscribe to the same channel. The cve server captures all updates (sway, surge, & heave translation coordinates and roll, pitch, & yaw rotation coordinates) from session clients and multicasts them (actually via replicated unicast) to other clients on the relevant channels. An extensible "extra parameter" can also be used to exchange information other than position details.
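The six position coordinates plus the "extra parameter" could be packaged, for instance, as a name–value string. The field names and encoding below are our assumption for illustration; the paper does not specify the actual cve wire format:

```java
// Sketch of packaging a CVE position update plus the "extra parameter"
// as a name-value string. Field names are illustrative assumptions;
// the real cve protocol encoding is not specified in the paper.
public class CveUpdate {
    public static String encode(double sway, double surge, double heave,
                                double roll, double pitch, double yaw,
                                int emotion) {
        return "sway=" + sway + "&surge=" + surge + "&heave=" + heave
             + "&roll=" + roll + "&pitch=" + pitch + "&yaw=" + yaw
             + "&emotion=" + emotion; // the extra parameter
    }

    public static void main(String[] args) {
        System.out.println(encode(0.0, 1.5, 0.0, 0.0, 0.0, 90.0, 3));
    }
}
```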

Figure 2: Plutchik's Wheel of Emotions: Plutchik's three-dimensional circumplex model describes the relations among emotion concepts. The cone's vertical dimension represents intensity, and the circle represents degrees of similarity among the emotions. The eight sectors indicate eight primary emotion dimensions defined by the theory, arranged as four pairs of opposites.

3.2

Emotion Classification

Voice-based emotion classification is a relatively old research field, but many avenues remain open for real-time emotion classification. Acoustic data such as energy, pitch, fundamental frequency, and formants can be used with hidden Markov models (hmms) to obtain considerable accuracy in emotion classification [1].

4.2

Emo Sim Architecture

The emotion-detector matlab program analyzes the voice stream and outputs six weighted emotions. The highest-weighted emotion is taken as the current emotion and sent to the cve server as an extra parameter. To simulate this process we developed a simple cve client named Emo(tion) Sim(ulator). Once the cve server captures this parameter, it multicasts it (actually via replicated unicast) to other clients subscribed to the same channel. Another program, cve servent, a servlet running on top of a Tomcat server and itself a cve client, keeps this parameter in memory. ι·Con cannot maintain a continuous connection to cve like other Java clients; instead, it retrieves/sends data from/to cve servent through http. When an http request is sent to cve servent, it responds with a query string consisting of position data and extra parameters. These values are updated/overridden whenever the cve server receives new values. The received query string is tokenized into name–value pairs, one of which is the emotion.
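The tokenization step can be sketched as a small parser that splits the query string into name–value pairs and pulls out the emotion. The example string and delimiter choices are assumptions for illustration:

```java
import java.util.HashMap;
import java.util.Map;

// Tokenize a query string from cve servent into name-value pairs,
// then pull out the "emotion" parameter. The sample string below is
// hypothetical; the real servent response format is not reproduced here.
public class QueryTokenizer {
    public static Map<String, String> tokenize(String queryString) {
        Map<String, String> params = new HashMap<>();
        for (String pair : queryString.split("&")) {
            int eq = pair.indexOf('=');
            if (eq > 0)
                params.put(pair.substring(0, eq), pair.substring(eq + 1));
        }
        return params;
    }

    public static void main(String[] args) {
        Map<String, String> p =
            tokenize("sway=0.0&surge=1.5&yaw=90.0&emotion=3");
        System.out.println(p.get("emotion")); // 3
    }
}
```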

Model columns: Plutchik's, Cortes's, Kaya's, Goethe's, Willett's, Summary

Anger:     Red, Red, Red, Red, Red
Dislike:   Purple, Purple, Green-Yellow, Red-Blue, Purple
Fear:      Green, Yellow, Yellow, Green
Happiness: Yellow, Yellow, Yellow, Yellow, Yellow
Surprise:  Cyan, Yellow-Red, Blue
Sadness:   Blue, Blue, Purple, Blue, Black

Table 2: Mapping between colors and emotions of different models

The emotion value is then mapped to a particular color and displayed on the screen. As each user/avatar connects via a different channel, this process is repeated for all subscribed channels (Fig. 3).

Pseudocode for ι·Con emotion–color mapping:

begin
    comment: initialize emotion–color mappings
    mapping = new HashTable(Emotion, Color)
    mapping.put(0, YELLOW)
    comment: retrieve data from cve servent
    paramValueList = tokenize(queryString)
    for paramValue : paramValueList
        if paramValue.equals("emotion")
            bgColor = mapping.get(emotionValue)
            icon.setBackground(bgColor)
        end
    end
end
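The pseudocode above can be made concrete as follows. The index-to-color assignments are illustrative (Table 2 suggests, e.g., yellow for happiness); the exact table used by ι·Con is not reproduced here:

```java
import java.util.HashMap;
import java.util.Map;

// Concrete version of the emotion-color mapping pseudocode. Colors are
// named as strings to stay platform-neutral; the index-to-color pairs
// are illustrative assumptions, not the exact ι·Con table.
public class EmotionColors {
    private static final Map<Integer, String> MAPPING = new HashMap<>();
    static {
        MAPPING.put(0, "RED");     // anger
        MAPPING.put(1, "PURPLE");  // dislike
        MAPPING.put(2, "GREEN");   // fear
        MAPPING.put(3, "YELLOW");  // happiness
        MAPPING.put(4, "BLUE");    // sadness
        MAPPING.put(5, "CYAN");    // surprise
    }

    // Unknown indices fall back to a neutral gray.
    public static String colorFor(int emotionIndex) {
        return MAPPING.getOrDefault(emotionIndex, "GRAY");
    }

    public static void main(String[] args) {
        System.out.println(colorFor(3)); // YELLOW
    }
}
```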

[Figure 3 schematic components: Emo Sim, Wonderland, Wonderland–CVE Bridge, CVE Server, CVE Servent, Other Clients, Extended ι·Con]

Figure 3: System Schematic: Emo Sim sends the emotion as a parameter to the cve server, which multicasts it to other clients subscribed to the same channel. The cve servent servlet, itself a cve client, runs on top of a Tomcat server and retains this parameter in memory; when ι·Con retrieves/sends data from/to cve servent, it returns/updates the value it kept.

4.3

Data Communication

Packet communication, used in i-mode for browsing sites and for mail, is a communications method that divides transmission data into smaller segments called packets. A packet in i-mode communication is equivalent to 128 bytes. Packet communication charges are billed according to the volume of data (number of packets) sent and received. To minimize these charges we intentionally used asynchronous communication, so the user has to press/tap a button to send/receive data to/from cve.

5

Conjunction with Immersive Virtual Environments

Cves such as Second Life and Wonderland allow users to meet in synthetic worlds, performing shared object manipulation and other group tasks, including voice chat. Using lexical analysis, emotions can be extracted from text (chat) and mapped to animations of avatars [2][8]. By extracting emotions from voice instead, better immersiveness can be conveyed, especially if such operations are performed in real time. Using the "Wonderland–cve Bridge" [14], Wonderland can communicate with the cve server, so an avatar-represented user in Wonderland can voice chat while a mobile application detects each avatar's emotions.
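Since i-mode bills per 128-byte packet, the cost of an update is proportional to the ceiling of its byte length divided by 128 — which is why minimizing the number of exchanges (asynchronous, on-demand communication) reduces charges. A small sketch of that arithmetic:

```java
// i-mode bills per 128-byte packet (Sec. 4.3), so an update of n bytes
// costs ceil(n / 128) packets. The 300-byte example size is invented
// for illustration, not measured from the real query strings.
public class PacketCost {
    static final int PACKET_SIZE = 128; // bytes per i-mode packet

    public static int packetsFor(int bytes) {
        return (bytes + PACKET_SIZE - 1) / PACKET_SIZE; // ceiling division
    }

    public static void main(String[] args) {
        System.out.println(packetsFor(300)); // 3
    }
}
```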

6

Challenges and Problems

For this project we did not use voice captured from a phone; instead, we used a microphone. When capturing voice outside controlled environments, noise can be added to the voice stream, which makes it difficult to classify emotions. As human emotions are not always expressed as single, monotonic feelings, the emotion recognizer can distinguish mixed emotions as probabilities of each static emotion; our current system, however, ignores such complexities and displays only the most probable emotion. Because the DoJa ide runs only on the Windows platform and iαppli runs only on DoCoMo phones, both the development and execution platforms have dependencies. As the phone user has to watch the display to identify the callee's emotion while talking, s/he is expected to use a hands-free kit or turn the speaker on.

7

Future Work and Directions

We would like to improve our system by handling mixed emotions. In that case we hope to display emotions as bar charts, which are natural as signal indicators. We also hope to port our program to other phones, such as Android and the iPhone.
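The proposed bar-chart display for mixed emotions could be sketched as follows, with each of the six probabilities rendered as a row of marks. The probability values here are invented for illustration:

```java
// Sketch of the proposed bar-chart display for mixed emotions: each of
// the six emotion probabilities is drawn as a row of '#' marks scaled
// to a fixed width. The probability values are invented examples.
public class EmotionBars {
    static final String[] LABELS =
        {"anger", "dislike", "fear", "happiness", "sadness", "surprise"};

    public static String render(double[] probs, int width) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < probs.length; i++) {
            int bar = (int) Math.round(probs[i] * width);
            sb.append(String.format("%-9s ", LABELS[i]));
            for (int j = 0; j < bar; j++) sb.append('#');
            sb.append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        double[] probs = {0.05, 0.05, 0.10, 0.60, 0.10, 0.10};
        System.out.print(render(probs, 20));
    }
}
```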

References

[1] Senaka Amarakeerthi, Rasika Ranaweera, and Michael Cohen. Speech-based emotion characterization using postures and gestures in cves. In Proc. 10th Int. Conf. on Cyberworlds, Singapore, October 2010. http://www3.ntu.edu.sg/SCE/cw2010/cw2010.htm.

[2] Senaka Amarakeerthi, Rasika Ranaweera, Michael Cohen, and Nicholas Nagel. Mapping selected emotions to avatar gesture. In Proc. IWAC: 1st Int. Workshop on Aware Computing, Aizu-Wakamatsu, Japan, September 2009. Japan Society for Fuzzy Theory and Intelligent Informatics. www.u-aizu.ac.jp/misc/fan09/iwac09.html.

[3] Kendra Cherry. How Colors Impact Moods, Feelings, and Behaviors. http://psychology.about.com/od/sensationandperception/a/colorpsych.htm.

[4] Michael Cohen and Norbert Győrbíró. Personal and portable plus practically panoramic: Mobile and ambient display and control of virtual worlds. Innovation: The Magazine of Research and Technology, 9(2):33–35, 2008.

[5] P. Ekman. Emotion in the Human Face. Cambridge University Press, 1982.

8

Conclusion

We have developed a mobile application that can display human emotions using colors, and extended it for use in conjunction with virtual environments like Wonderland. Basically, we recognize six different emotions, take the most probable emotion, map it to a carefully selected color, and display that color on a DoCoMo mobile phone. Recognition accuracy could be improved by incorporating syntax and semantics in addition to acoustic data. The system can also be used with Wonderland, whose avatar-represented users can communicate with each other while their emotions are displayed on the mobile screen as colorful icons.

[6] Naz Kaya and Helen H. Epps. Relationship between color and emotion: a study of college students. College Student Journal, September 2004.

[7] Jan Landon. The color of emotion. Topeka Capital-Journal, October 2008.

[8] A. Neviarouskaya, H. Prendinger, and M. Ishizuka. EmoHeart: Conveying emotions in Second Life based on affect sensing from text. Advances in Human-Computer Interaction, 2010, 2010.

[9] Niels A. Nijdam. Mapping emotion to color. http://hmi.ewi.utwente.nl/verslagen/capita-selecta/CS-Nijdam-Niels.pdf.

[10] T. L. Nwe, S. W. Foo, and L. C. De Silva. Speech emotion recognition using hidden Markov models. Speech Communication, 41(4):603–623, 2003.

[11] Li-Chen Ou, M. Ronnier Luo, Andre Woodcock, and Angela Wright. A study of colour emotion and colour preference. Part I: Colour emotions for single colours. Color Research and Application, 29:232–240, April 2004.

[12] Li-Chen Ou, M. Ronnier Luo, Andre Woodcock, and Angela Wright. A study of colour emotion and colour preference. Part II: Colour emotions for two-colour combinations. Color Research and Application, 29:292–298, August 2004.

[13] Li-Chen Ou, M. Ronnier Luo, Andre Woodcock, and Angela Wright. A study of colour emotion and colour preference. Part III: Colour preference modeling. Color Research and Application, 29:381–389, October 2004.

[14] Rasika Ranaweera, Nick Nagel, and Michael Cohen. Wonderland–cve bridge. In Proc. HC-2009: 12th Int. Conf. on Humans and Computers, Hamamatsu, Japan, December 2009. http://ktm11.eng.shizuoka.ac.jp/HC2009.

[15] Ikko Tanaka and Kazuko Koike. Japan Color. Libro Port Co., Ltd., Tokyo, 1982.

[16] Paul Wallace, Andrea Hoffmann, Daniel Scuka, Zev Blut, and Kyle Barrow. i-Mode Developer's Guide. Pearson Education, Inc., Boston, MA, April 2002.