Nonverbal Communication in Social VR: Recent Academic Research

Gestures (like this peace sign) are an example of nonverbal communication (Photo by Dan Burton on Unsplash)

In the real world, much of our communication is non-verbal: facial expression, gaze, gestures, body movements, even spatial distance (proxemics).

While older, flat-screen virtual worlds such as Second Life are somewhat limited in the forms of nonverbal communication available (most people rely on text or voice chat), modern VR equipment and social VR platforms allow for more options:

  • Hand/finger movement: most VR headsets have hand controllers; the Valve Index has Knuckles hand controllers which allow you to move your fingers as well as your hands;
  • Body movement: the Vive pucks can be attached to your waist, hips, feet, and other parts of your body to track their movement in real time;
  • Eye movements/gaze: for example, the Vive Pro Eye VR headset can track the blinking and movement of the eyes;
  • Facial expression: add-ons such as the Vive Facial Tracker (which attaches to your VR headset) allow you to convey lower face and mouth movements on your avatar.

In addition, many social VR platforms also employ emoticons, which can be pulled up via a menu and displayed over the head of the avatar (e.g. the applause emoji in AltspaceVR), as well as full-body pre-recorded animations (e.g. doing a backflip in VRChat). The use of all these tools, in combination or alone, allows users in social VR to approach the level of non-verbal communication found in real life, provided they have the right equipment and are on a platform which supports that equipment (e.g. NeosVR, where you can combine all these into an avatar which faithfully mimics your facial and body movements).
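To make that combination of channels concrete, here is a minimal TypeScript sketch of how a social VR client might represent the per-frame nonverbal state of an avatar. This is purely illustrative: the type and field names are my own invention, not the API of NeosVR or any other platform.

```typescript
// A hypothetical per-frame "nonverbal state" a social VR client might
// synchronize for one avatar. None of these names come from NeosVR or
// any other platform's actual API -- this is an illustration only.

interface Pose {
  position: [number, number, number];          // metres, world space
  rotation: [number, number, number, number];  // quaternion (x, y, z, w)
}

interface NonverbalFrame {
  head: Pose;                                  // headset tracking
  hands: { left: Pose; right: Pose };          // hand controllers
  fingerCurl?: number[];                       // 0..1 per finger (e.g. Index "Knuckles")
  bodyTrackers?: Pose[];                       // e.g. Vive trackers on waist/hips/feet
  gaze?: { direction: [number, number, number]; blinking: boolean }; // eye tracking
  faceBlendshapes?: Record<string, number>;    // 0..1 weights, e.g. { jawOpen: 0.4 }
  emote?: string;                              // pre-baked animation/emoji trigger
}

// A client animates whichever channels the user's hardware provides,
// and falls back to canned idle animation for the rest.
function channelsAvailable(frame: NonverbalFrame): string[] {
  return Object.entries(frame)
    .filter(([, value]) => value !== undefined)
    .map(([key]) => key);
}
```

The optional fields reflect the hardware reality described above: a user with only a headset and controllers transmits head and hand poses, while someone with trackers, eye tracking, and a facial tracker can drive every channel at once.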

Two recently published research papers investigate nonverbal communication on social VR platforms, adding to the growing academic literature on social VR. (I am happy to see that social VR is starting to become a topic of academic research!)


Maloney, D., Freeman, G., & Wohn, D. Y. (2020). “Talking without a Voice”: Understanding Non-Verbal Communication in Social Virtual Reality. Proceedings of the ACM on Human-Computer Interaction, 4(CSCW2). https://doi.org/10.1145/3415246

Unfortunately, there is no open-access version of this conference proceeding available; you’ll have to obtain a copy from your local academic or public library. This paper, by Divine Maloney and Guo Freeman of Clemson University and Donghee Yvette Wohn of the New Jersey Institute of Technology, consists of two parts:

  • conducting unobtrusive observations of 61 public events held in AltspaceVR over the span of four weeks, to see what non-verbal interactions were being used naturally on the platform; and
  • interviewing 30 users of social VR platforms (of which I was one!), with the paper’s authors reading through the transcribed interview data to build a picture of how social VR users used, perceived, and experienced non-verbal communication for further analysis.

In the first of the two studies, the authors noted the following kinds of nonverbal communication:

  • the use of movement to indicate that someone was paying attention. These included nodding behaviors and moving the body or head toward the person or object that was the subject of attention;
  • the use of applause to indicate approval;
  • pointing and patting one’s own chest as a form of directing attention either at a remote object/person or oneself;
  • behaviours such as waving, dancing, and kissing, which were mostly used in social grooming contexts (dancing was also used as entertainment);
  • and finally, the behaviour of trolls: interpersonal provocation and social disruptions.

With respect to the thirty interviews conducted, the transcripts were analyzed to answer two research questions:

Using quotes from users’ own accounts, in this section we present our findings as two parts. First, to answer RQ2 (How do people perceive and understand non-verbal communication in social VR?), we identified three common themes that demonstrated how users perceive and understand non-verbal communication in social VR: as more immersive and embodied interactions for body language; as a similar form of communication to offline face-to-face interaction in terms of spatial behavior, hand behavior, and facial expressions; and as a natural way to initiate communication with online strangers.

Second, to answer RQ3 (How, if at all, does non-verbal communication affect interaction outcomes in social VR?), we described the social consequences of interacting through non-verbal communication in social VR for various user groups, including marginalized users such as cis women, trans women, and disabled users. We specially highlighted how non-verbal communication in social VR afforded privacy and social comfort as well as acted as a protection for marginalized users.

Unsurprisingly, the researchers discovered that most participants considered non-verbal communication to be a positive aspect of their social VR experience. Those interviewed highly praised body tracking (either just the hands and head, or in some cases the whole body), as it allowed for a more immersive and embodied form of non-verbal communication than is possible in traditional, flat-screen virtual worlds.

In addition to supporting more immersive and embodied interactions for body language, participants also considered non-verbal communication in social VR similar to offline face-to-face interaction in terms of spatial behavior, hand behavior, and facial expressions. This familiarity and naturalness greatly contributed to their generally positive perceptions.

Participants also viewed non-verbal communication in social VR as positive and effective because it offered a less invasive way to start interactions with online strangers (e.g. waving hello at someone you’ve just met). Nonverbal communication also afforded some users a sense of privacy and social comfort, and in some cases became an effective protection against unwanted interactions, attention, and behaviors (especially for LGBTQ people and women).

The paper made three design recommendations for improved nonverbal communication in social VR platforms: providing support for facial tracking (which is already on its way with products like the Vive Facial Tracker); supporting more accurate hand and finger tracking (again, already underway with the Knuckles controllers for the Valve Index); and enabling alternative modes of control, especially for users with physical disabilities. While most of the study participants highly praised full body tracking in social VR, disabled users in fact complained about this feature and demanded alternatives.

The conference paper concludes:

Recently, commercial social VR applications have emerged as increasingly popular digital social spaces that afford more naturally embodied interaction. How do these novel systems shape the role of non-verbal communication in our online social lives? Our investigation has yielded three key findings. First, offline non-verbal communication modalities are being used in social VR and can simulate experiences that are similar to offline face-to-face interactions. Second, non-verbal communication in social VR is perceived overall positive. Third, non-verbal interactions affect social interaction consequences in social VR by providing privacy control, social comfort, and protection for marginalized users.


Tanenbaum, T. J., Hartoonian, N., & Bryan, J. (2020). “How do I make this thing smile?”: An Inventory of Expressive Nonverbal Communication in Commercial Social Virtual Reality Platforms. Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems (CHI ’20), 1–13. https://doi.org/10.1145/3313831.3376606

This paper is available free to all via Open Access. In this conference proceeding, Theresa Jean Tanenbaum, Nazely Hartoonian, and Jeffrey Bryan of the Transformative Play Lab at the Department of Informatics at the University of California, Irvine, did a study of ten social VR platforms:

  • VRChat
  • AltspaceVR
  • High Fidelity (which shut down in January of 2020)
  • Sansar
  • TheWave VR (this social VR platform shut down in early 2021)
  • vTime XR
  • Rec Room
  • Facebook Spaces (since shut down and replaced by Facebook Horizon)
  • Anyland
  • EmbodyMe

For each platform, investigators answered the following eight questions:

  1. Can the user control facial expressions, and if so, how? (Pre-baked emotes, puppeteering, etc.)
  2. Can the user control body language, and if so, how? (Pre-baked emotes, puppeteering, postures, etc.)
  3. Can the user control proxemic spacing (avatar position), and if so, how? (Teleport, hotspots, real world positioning, etc.) How is collision handled between avatars? (Do they overlap, push each other, etc.)
  4. How is voice communication handled? Is audio spatialized, do lips move, is there a speaker indicator, etc.
  5. How is eye fixation/gaze handled? (Do avatars lock and maintain gaze, is targeting gaze automatic, or intentional, or some sort of hybrid, do eyes blink, saccade, etc.)
  6. Are different emotions/moods/affects supported, and how are they implemented? (Are different affective states possible, and do they combine with other nonverbal communications, etc.)
  7. Can avatars interact physically, and if so, how? (Hugging, holding hands, dancing, etc.) What degree of negotiation/consent is needed for multi-avatar interactions? (One-party, two-party, none at all?)
  8. Are there any other kinds of nonverbal communication possible in the system that have not been described in the answers to the above questions?

The result is a rather complete inventory of nonverbal communication in social VR, with the goal of cataloguing common design elements for avatar expression and identifying gaps and opportunities for future design innovation. Here is the table from the paper (which can be viewed in full size at the top of page 6 of the document).

An inventory of non-verbal communication in ten social VR platforms (source)
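One way to appreciate the structure of that inventory is to imagine encoding it as data: one record per platform, one field per question. The following TypeScript sketch does exactly that; the field names and enumerated values are mine, loosely paraphrasing the eight questions above, not the authors’ actual coding scheme.

```typescript
// A hypothetical encoding of the paper's eight-question inventory as one
// record per platform. Field names and enumerations are mine (loosely
// paraphrasing the questions above), not the authors' coding scheme.

type ControlScheme = "none" | "pre-baked emote" | "puppeteering" | "tracked";

interface NvcInventoryEntry {
  platform: string;                                   // e.g. "VRChat"
  facialExpression: ControlScheme[];                  // Q1
  bodyLanguage: ControlScheme[];                      // Q2
  locomotion: ("teleport" | "hotspot" | "real-world positioning")[]; // Q3
  avatarCollision: "overlap" | "push" | "blocked";    // Q3
  voice: {                                            // Q4
    spatialized: boolean;
    lipSync: boolean;
    speakerIndicator: boolean;
  };
  gaze: "locked" | "automatic" | "intentional" | "hybrid" | "none"; // Q5
  distinctMoods: boolean;                             // Q6
  physicalInteraction: {                              // Q7
    supported: boolean;
    consent: "none" | "one-party" | "two-party";
  };
  other?: string;                                     // Q8
}
```

The paper’s authors write: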

VR development is proliferating rapidly, but very few interaction design strategies have become standardized…

We view this inventory as a first step towards establishing a more comprehensive guide to the commercial design space of NVC [non-verbal communication] in VR. As a design tool this has two immediate implications for designers. First, it provides a menu of common (and less common) design strategies, and their variations, from which designers may choose when determining how to approach supporting any given kind of NVC within their platform. Second, it calls attention to a set of important social signals and NVC elements that designers must take into consideration when designing for Social VR. By grounding this data in the most commonly used commercial systems, our framework can help designers anticipate the likelihood that a potential user will be acquainted with a given interaction schema, so that they may provide appropriate guidance and support.

Our dataset also highlights some surprising gaps within the current feature space for expressive NVC. While much social signaling relies upon control of facial expression, we found the designed affordances for this aspect of NVC to be mired in interaction paradigms inherited from virtual worlds. Facial expression control is often hidden within multiple layers of menus (as in the case of vTime), cannot be isolated from more complex emotes (as in the case of VR Chat), hidden behind opaque controller movement (as in Facebook Spaces), or unsupported entirely. In particular, we found that with the exception of dynamic lip-sync, there were no systems with a design that would allow a user to directly control the face of their avatar through a range of emotions while simultaneously engaging in other forms of socialization.

The authors go on to say that they observed no capacity in any of the systems to recombine and blend the various forms of nonverbal communication, such as can be done in the real world:

As we saw in our consideration of the foundations of NVC in general, and Laban Movement Analysis in particular, much NVC operates by layering together multiple social signals that modify, contextualize, and reinforce other social signals. Consider, for instance, that it is possible to smile regretfully, laugh maliciously, and weep with joy. People are capable of using their posture to indicate excitement, hesitation, protectiveness, and many other emotional states, all while performing more overt discourse acts that inherit meaning from the gestalt of the communicative context.

The conference paper concludes:

As is evident in the scholarly work around social VR, improving the design space for NVC in VR has the potential to facilitate deeper social connection between people in virtual reality. We also argue that certain kinds of participatory entertainment such as virtual performance will benefit greatly from a more robust interaction design space for emotional expression through digital avatars. We’ve identified both common and obscure design strategies for NVC in VR, including design conventions for movement and proxemic spacing, facial control, gesture and posture, and several strategies unique to avatar mediated socialization online. Drawing on previous literature around NVC in virtual worlds, we have identified some significant challenges and opportunities for designers and scholars concerned with the future of socialization in virtual environments. Specifically, we identify facial expression control, and unconscious body posture as two critical social signals that are currently poorly supported within today’s commercial social VR platforms.

It is interesting to note that both papers cite the proper conveyance of facial expressions as key to expanding the nonverbal communication abilities of avatars in social VR!

Breakroom Implements High Fidelity’s Three-Dimensional Audio

Photo by Jason Rosewell on Unsplash

Sine Wave Entertainment’s Breakroom (the corporate cousin of their social VR/virtual world platform Sinespace) has recently implemented the spatialized, three-dimensional audio API offered by the revamped High Fidelity.

VentureBeat reports:

The deal is a convergence of pioneers who have made their mark on the development of virtual life. Philip Rosedale is the CEO of High Fidelity and launched Second Life in 2003. Sine Wave Entertainment, the creator of Breakroom, got its start as a content brand in Second Life before it spun out to create its own virtual meeting spaces for real-world events.

Adam Frisby, chief product officer and cofounder of Sine Wave, said in an interview conducted inside Breakroom that the High Fidelity spatial audio will help Breakroom create a triple-A quality experience in a virtual world.

“The real benefit of having 3D audio in a virtual world like this is you can have lots of conversations going on simultaneously,” Frisby said. “3D audio is the only way to replicate the real-world experience in an online environment. You can have a 150-person conference and end up with 10 groups of people talking at the same time. That has helped us with engagement.”

Breakroom is among the first group of clients for Philip Rosedale’s company. Adam tells me that they are looking at implementing the same 3D audio in Sinespace at some point in the future.
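For readers curious what “spatialized audio” actually means in practice: High Fidelity’s API is a hosted service with its own client library, but the underlying idea can be illustrated with the standard Web Audio API, which runs in any modern browser. The TypeScript sketch below is a conceptual illustration only, not High Fidelity’s SDK.

```typescript
// Conceptual illustration of spatialized 3D audio using the standard
// Web Audio API (NOT High Fidelity's actual SDK): each remote speaker's
// voice stream is panned according to their avatar's position, so
// simultaneous conversations separate naturally in space.

const ctx = new AudioContext();

// The listener follows the local user's avatar/camera.
ctx.listener.positionX.value = 0;
ctx.listener.positionY.value = 1.6; // roughly ear height, in metres
ctx.listener.positionZ.value = 0;

function attachSpeaker(stream: MediaStream, x: number, y: number, z: number): PannerNode {
  const source = ctx.createMediaStreamSource(stream);
  const panner = new PannerNode(ctx, {
    panningModel: "HRTF",      // head-related transfer function: true 3D cues
    distanceModel: "inverse",  // volume falls off with distance
    positionX: x,
    positionY: y,
    positionZ: z,
  });
  source.connect(panner).connect(ctx.destination);
  return panner; // update panner.positionX/Y/Z as that avatar moves around
}
```

Because each voice arrives from its own direction and distance, ten simultaneous conversations at a 150-person event remain distinguishable, which is exactly the engagement benefit Frisby describes above.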

Here’s a two-minute YouTube video where Adam Frisby explains and demonstrates the new 3D audio:


This blogpost is sponsored by Sinespace, and was written in my role as an embedded reporter for this virtual world (more details here). 

Notes from The Metaverse: A Glimpse into the Future of Work, with Charlie Fink, Cathy Hackl, Alex Howland, and Philip Rosedale, Held in VirBELA on May 13th, 2021

Charlie Fink (at podium) addresses an audience of over 160 avatars in VirBELA

Today, four well-known people in the metaverse shared a virtual stage in VirBELA to talk about how the metaverse will impact the future of work. (I was not in-world, but I did receive a special livestream link on YouTube to follow the proceedings at virbe.la/metaverse-stream, which I hope works for you as well. Here’s a second link if the first one doesn’t work: https://www.youtube.com/watch?v=iH6Lj1AKi3o.)

Author and columnist Charlie Fink was moderator, asking questions and guiding the wide-ranging conversation among the panelists:

  • Alex Howland, the founder and CEO of VirBELA
  • Cathy Hackl, VR/AR/XR columnist and author of the new book The Augmented Workforce, who used to work at Magic Leap
  • Philip Rosedale, founder of Linden Lab (Second Life) and CEO of the spatial audio firm High Fidelity

Some of the interesting things from the panel which caught my ear were:

  • Cathy Hackl stated that the “metaverse” is not just limited to Ready Player One, but also Pokémon GO (even though I personally do not agree that cellphone-based “AR” is true augmented reality). She doesn’t want to see everything in one walled-garden marketplace like Oculus. She works a lot in the crypto space and wants to support decentralization, such as the portability of avatars between metaverses.
  • Philip said that COVID-19 introduced everybody to the idea of virtual worlds, or shared virtual spaces.
  • Alex was an organizational psychologist whose original idea for VirBELA was to create environments for business leaders to learn from each other, practice leadership skills, and observe behaviour.
  • Philip recognized Second Life when he visited VirBELA, and really enjoyed walking around the virtual campus. He feels there is still a lot of work to be done to build platforms which allow people to be creative together.
  • Cathy sees ROBLOX and similar platforms as entry points for new generations of virtual world users. Her 12-year-old daughter’s friend is already making money creating and selling skins in ROBLOX.
  • Charlie commented on the fact that VirBELA lets you “lean in”, as opposed to more passive video-based services such as Zoom.
  • Philip talked about real-life use cases of his new company High Fidelity’s technology, stressing how three-dimensional, spatialized audio is better than a Zoom call. The company offers an SDK so that companies can integrate spatial audio into their products. High Fidelity works within the browser, and the company is working on native clients for iOS, Unity, etc.
  • Philip feels that avatars are extraordinarily important, saying that Second Life has a $600 million economy, with one of the largest segments being avatar hairstyles! But facial expression and lip movement are not yet there, and we have not yet crossed the Uncanny Valley (where almost-realistic avatars can appear creepy). Cathy notes that her daughter really cares a great deal about how her avatar looks in ROBLOX!
  • Alex talked about the FRAME platform, which he launched to pursue WebXR, to provide people ease of access from a wide variety of devices. There are tradeoffs between ecosystems (FRAME versus VirBELA), and they are still experimenting and innovating.
  • Cathy feels that VR/AR/XR is incredibly important to the development of the metaverse, giving an enhanced sense of presence and impacting the way in which we engage with environments. She encourages people not to restrict their thinking to just being in a VR headset.
  • Charlie asked Philip and Alex to explain what Agora is (a toolbox that lets you build audio and video delivery into platforms, and which is used in Clubhouse!). Philip noted that if the pandemic had happened even a decade earlier, without services such as Agora, it would have had a much bigger impact.
  • Alex said that they had recently hosted a bar mitzvah in VirBELA, among many other unexpected uses (like speakeasies!).
  • Philip says that things are never going to be the same after the pandemic is over. It has now been shown that virtual events can be successful. New technologies such as High Fidelity’s spatial audio, starting with virtual events, are going to have many applications in the future. There are also important cost and environmental benefits to holding meetings such as conferences in virtual spaces.
  • Alex notes that his company builds a lot of custom spaces for clients, and that there are different approaches to world-building (i.e. building your own versus having someone build it for you).
  • Philip notes that Zoom and similar videoconferencing services don’t have a natural network effect. If communications do embrace virtual worlds, then that will have network effects (i.e. more people want to join bigger networks, an example being Facebook and, to a lesser extent, Second Life). We still don’t know how everything is going to play out in the marketplace. A lot of CEOs are wrestling with the fact that many of their employees do not want to come back into the office, and with how to build corporate culture in that new environment.
  • Philip advocated for a stable, cross-platform identity (not tied to your real-world identity), that serves to make us accountable for our behaviour.

Well, those are my rough notes. If you missed the talk, use one of the two links up top to watch and listen. It was an engaging one-hour conversation!

UPDATED: Second Life Founder and High Fidelity CEO Philip Rosedale Will Do an AMA (Ask Me Anything) on Reddit on February 23rd, 2021

Philip shared the following photo when posting about his Reddit AMA on Twitter (source)

Mark your calendars! Philip tweeted late tonight:

Join me for a Reddit AMA on Feb. 23rd from 11:00 a.m. – 2:00 p.m. Pacific Time. Ask me about Spatial Audio, VR, virtual worlds and virtual economies, avatars, and … anything.

So if you have any burning questions you’ve wanted to ask Philip, this is your perfect opportunity! When the AMA starts tomorrow, I will link to it here.

See you there!

UPDATE Feb. 23rd, 2021, 3:51 p.m.: Please accept my apologies for not linking to this AMA sooner; I was so tired that I lay down for a nap and ended up sleeping through the entire event!

Here’s the link to the Ask Me Anything posted to the r/IAmA subReddit, with the following introduction, plus the above photo as proof that he is, indeed, THE Philip Rosedale!

Hi Reddit!

I am the founder of the virtual civilization Second Life, populated by one million active users, and am now CEO and co-founder of High Fidelity — which has just released a real-time spatial audio API for apps, games, and websites. If you want to check it out, I’d love to hear what you think: highfidelity.com/api

High Fidelity’s Spatial Audio was initially built for our VR platform — we have been obsessive about audio quality from day one, spending our resources lowering latency and nailing spatialization.

Ask me about immersive spatial audio, VR, virtual worlds and spaces, avatars, and … anything.

(With me today I have /u/MaiaHighFidelity and /u/Valefox to answer technical questions about the API, too.)

This AMA has also been reposted to the r/secondlife, r/HighFidelity, r/WebRTC, and r/GameAudio subReddits.

UPDATE 4:26 p.m.: I have been informed that the AMA is still going on, as of this writing!