
Current approaches to moderation and safety in the metaverse

As the debate over online safety grows ever louder, ILKER BAHAR dons his head-mounted display and looks at the steps taken by virtual reality platforms to protect their users

The evolution of virtual reality (VR) is significantly transforming how users experience intimacy and social relationships. Social VR platforms such as VRChat (2014), Rec Room (2016) and Meta Horizon Worlds (2019) provide immersive environments where users present themselves with avatars to play games, chat, watch movies and interact with people from different countries. These environments offer unique ways to build and maintain relationships—such as attending a virtual concert on a first date, meditating with a friend in a serene landscape or exploring ancient Athens with classmates.

Users socialising in Rec Center, the primary meeting room in Rec Room (image by the author)

Examining user behaviour in VR environments reveals both new challenges and opportunities that come with this emerging technology. Previous research showed that these immersive experiences can significantly foster empathy, reduce social stigma and increase inclusivity.1 2

Yet the advent of social VR platforms also presents significant issues such as trolling, stalking, sexual abuse and hate speech.3 4 The immersive nature of these platforms, coupled with the blurred boundaries between virtual and real-world identities, can amplify emotional and psychological responses to these harmful occurrences. Traditional forms of moderation are increasingly inadequate in this context and, given that these platforms are predominantly frequented by younger users, the concerns are all the more pressing.

Despite the rapid expansion of virtual environments, policy and regulation have not kept pace. At the EU level, the adoption of the Digital Services Act,5 the communication on Web 4.0 and virtual worlds6 and the citizens’ panel report7 represent critical steps forward in addressing these challenges. However, there remains a lack of comprehensive research and insight into the safety and moderation of these spaces.

Drawing on my ethnographic research into social VR platforms, in this article I briefly examine the current approaches to moderation and safety employed by Rec Room and VRChat—two of the most widely used social VR platforms with c.3 million and 9 million estimated total users respectively.8

Focusing on key areas such as platform governance, age verification, community moderation and personal safety features, I aim to guide developers and policymakers in adopting effective strategies to ensure that VR environments are safe and inclusive.

Users socialising with friends in a surreal world in VRChat (image by the author)

Platform governance

Platforms hosting many users will inevitably have to deal with misbehaviour and even crime online, so they must establish guidelines, rules and resources to curb harmful behaviour.

VRChat employs a bottom-up approach to moderation, encouraging community moderation and individual reports. Recently the platform introduced a content gating system, which requires creators to label worlds and avatars containing adult content, graphic violence, gore or horror elements and make them inaccessible to users under 18, based on their registered birth date. The platform also benefits from a trust and safety system that ranks users based on their adherence to community guidelines and experience on the platform. Higher trust levels (such as ‘known user’ or ‘trusted user’) unlock additional features, such as uploading content to the platform, which motivates users to avoid inappropriate behaviour and engage positively with others.

There is also a special rank called ‘nuisance’ assigned to users who have been repeatedly muted, blocked and reported by others for inappropriate behaviour. These users have a visible indicator above their nameplate and their avatars are fully invisible and muted, preventing them from disrupting others.

VRChat trust ranks
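To make the logic of such a reputation system concrete, the sketch below models feature gating by trust rank and the demotion of repeatedly reported users to the ‘nuisance’ rank. It is purely illustrative: the rank names echo VRChat’s public tiers, but the thresholds, feature list and demotion rule are assumptions rather than the platform’s actual implementation.

from dataclasses import dataclass
from enum import IntEnum

# Illustrative trust ranks, loosely echoing VRChat's public tiers.
class TrustRank(IntEnum):
    NUISANCE = -1
    VISITOR = 0
    NEW_USER = 1
    USER = 2
    KNOWN_USER = 3
    TRUSTED_USER = 4

# Hypothetical thresholds: the rank at which a feature unlocks.
FEATURE_REQUIREMENTS = {
    "upload_avatars": TrustRank.NEW_USER,
    "upload_worlds": TrustRank.USER,
    "custom_emoji": TrustRank.KNOWN_USER,
}

@dataclass
class User:
    name: str
    rank: TrustRank
    reports: int = 0
    blocks: int = 0

def can_use(user: User, feature: str) -> bool:
    """A feature unlocks once the user's rank meets its threshold."""
    return user.rank >= FEATURE_REQUIREMENTS[feature]

def apply_nuisance_rule(user: User, limit: int = 10) -> None:
    """Demote users who are repeatedly reported and blocked by others."""
    if user.reports + user.blocks >= limit:
        user.rank = TrustRank.NUISANCE

def render_settings(user: User) -> dict:
    """Nuisance-ranked avatars are hidden and muted for everyone else."""
    disruptive = user.rank == TrustRank.NUISANCE
    return {"avatar_visible": not disruptive, "voice_audible": not disruptive}

troll = User("troll", TrustRank.USER, reports=8, blocks=4)
apply_nuisance_rule(troll)
print(can_use(troll, "upload_worlds"), render_settings(troll))
# False {'avatar_visible': False, 'voice_audible': False}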

Rec Room implements a combination of automated and human moderation. The platform uses AI-based voice moderation (ToxMod), available in 18 languages, to detect and filter inappropriate language. Its automated systems scan text and images for violations of community guidelines. Human moderators and volunteers complement these efforts by responding to user reports and enforcing community standards. Their role is crucial in addressing more subtle forms of inappropriate behaviour (non-verbal communication, gestures, etc.), such as non-consensual groping of a person’s avatar or performing a Nazi salute, which might easily escape detection by automated systems.
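This division of labour can be thought of as a routing rule: violations the system can reliably match in text or transcribed voice are actioned automatically, while everything else is queued for human review. The following sketch is a hypothetical illustration in Python; the report types, term list and action names are invented and do not reflect Rec Room’s or ToxMod’s actual pipeline.

from dataclasses import dataclass
from typing import Optional

@dataclass
class Report:
    kind: str          # e.g. "voice", "text", "gesture"
    content: str       # transcript, message, or the reporter's description
    reporter: Optional[str] = None

# Hypothetical list of disallowed terms an automated filter might match.
BLOCKED_TERMS = {"slur_example_1", "slur_example_2"}

def automated_check(report: Report) -> Optional[str]:
    """Automated filters handle what they can match: text and transcribed voice."""
    if report.kind in {"voice", "text"}:
        if any(term in report.content.lower() for term in BLOCKED_TERMS):
            return "auto_action"   # e.g. warn or temporarily mute
    return None

def route(report: Report) -> str:
    """Clear textual violations are actioned automatically; everything else
    (gestures, groping, context-dependent behaviour) goes to human moderators."""
    return automated_check(report) or "human_review_queue"

# A gesture-based report cannot be caught by a keyword filter.
print(route(Report(kind="gesture", content="avatar performed a Nazi salute")))
# -> human_review_queue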

Recently a community backlash erupted over the platform’s automated voice moderation, with users criticising it for false detections and for infringing on freedom of expression even when they use relatively mild slang. In response, platform officials issued a statement defending the system’s accuracy.9 They emphasised that action is only taken after multiple detections, in order to minimise false positives. The announcement also highlighted that the introduction of voice moderation has led to a 70 per cent reduction in toxic voice chat over the past year.

Moderation and safety challenges

The use of AI-based automated moderation systems raises important questions about their effectiveness and accuracy. Like other algorithmic systems, they may be prone to discriminatory decisions, such as removing content or banning users in racially or ethnically biased ways. Automated moderation systems can also struggle to detect more subtle forms of inappropriate behaviour that human moderators might easily catch. For example, a recent Bloomberg Businessweek report highlighted how child predators in Roblox circumvented automatic chat moderation by using coded language, such as referring to Snapchat with a ghost emoji or writing ‘Cord’ instead of ‘Discord’, when inviting minors to those apps.10 This demonstrates the limitations of AI-driven moderation in recognising and addressing nuanced or disguised forms of harmful behaviour.
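The Roblox example illustrates the underlying weakness: a keyword filter only catches the strings it already knows. The toy filter below, with invented terms and substitutions rather than any platform’s real blocklist, shows how coded references pass straight through a naive match, and how a substitution map only helps once moderators have discovered the code words.

# A naive keyword filter, of the kind an automated chat moderator might use.
BLOCKED = {"discord", "snapchat"}   # off-platform invitations are disallowed

def naive_filter(message: str) -> bool:
    return any(word in message.lower() for word in BLOCKED)

# Coded references slip straight through the keyword match.
print(naive_filter("add me on cord"))                 # False: 'cord' != 'discord'
print(naive_filter("hit me up on the ghost app 👻"))  # False: the emoji stands in for the name

# Moderators can add known substitutions, but the mapping is always one step
# behind users who invent new codes, which is the limitation described above.
SUBSTITUTIONS = {"cord": "discord", "👻": "snapchat", "ghost app": "snapchat"}

def normalise(message: str) -> str:
    text = message.lower()
    for code, real in SUBSTITUTIONS.items():
        text = text.replace(code, real)
    return text

print(naive_filter(normalise("add me on cord")))  # True, but only for codes already known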

Age verification

Given the presence of adult content on these platforms, it is crucial to implement robust age verification mechanisms to protect younger users from exposure to harmful content.

VRChat allows users aged 13 and above, encouraging parental monitoring for those under 18. It implements age gates for certain content, requiring users to confirm they are 18 or older to access mature worlds. Despite these measures, the system relies on self-verification, which is insufficient to prevent minors from accessing harmful content. In July 2024, the platform developers announced plans to partner with a third-party system for age verification to enhance the safety of minors.
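To see why self-declared dates are such a weak barrier, consider that an age gate of this kind ultimately reduces to a single comparison against whatever birth date the user registered. The snippet below is a generic, hypothetical illustration in Python, not VRChat’s actual check.

from datetime import date
from typing import Optional

def years_between(born: date, today: date) -> int:
    """Full years elapsed since the given birth date."""
    return today.year - born.year - ((today.month, today.day) < (born.month, born.day))

def can_enter_mature_world(registered_birth_date: date, today: Optional[date] = None) -> bool:
    """The gate is only as truthful as the date the user typed at sign-up."""
    today = today or date.today()
    return years_between(registered_birth_date, today) >= 18

# A minor who registers a false birth date passes the same check.
print(can_enter_mature_world(date(1990, 1, 1)))  # True, regardless of the user's real age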

Currently some worlds, such as bars and clubs, display disclaimers about adult content or employ different age verification methods, such as bouncers asking for birth dates. This illustrates how the 3D and immersive nature of VR interactions allows users to engage in moderation practices akin to real-world settings, through a combination of visual and auditory cues such as bodily gestures and voice. Yet the effectiveness of these measures in preventing minors from accessing these spaces is still questionable. A user can, for example, change their voice using the app Voicemod to sound older and falsely declare their date of birth.

‘Verification required’: a bar in VRChat where users provide their date of birth to the bouncer to enter (image by the author)

Rec Room offers a junior mode for users under 13, encouraging them to add their parent’s or guardian’s email address. Junior mode restricts communication with strangers via text or audio and limits access to certain content. However, many users under 13 still join the platform using non-junior accounts if they are not detected by moderators or reported by other users. The platform’s age verification method, which requires a $1 payment through the third-party system Stripe, is reportedly bypassed by juniors who simply create new accounts with incorrect birth dates.

Moderation and safety challenges

Third-party verification processes raise significant concerns with regard to data privacy and protection: identifiable personal information might be leaked if, for example, users are required to scan their ID to verify their age.

Community moderation

Social VR platforms also offer more autonomous and decentralised ways of moderating and governing one’s immediate social circle and interactions.

VRChat offers audience segregation tools through the ‘instance system’, enabling users to create parallel lobbies or rooms of a world based on the privacy they prefer and the users they want to interact with. For example, they can create a ‘friends’ instance of a world, allowing only their friends to join them while blocking others from teleporting to their location. They can also form groups and host events in ‘group’ instances, where they can manage and moderate content with the authority to warn, mute, kick out or ban unwanted users from the world. Compliance with the platform’s code of conduct in private instances is relatively flexible, provided that all parties involved consent to the nature of the activities within the instance.

Audience segregation is crucial in maintaining a safe and enjoyable experience on the platform, as it allows users to control their social environments and limit exposure to unwanted interactions. Certain communities and individuals also engage in vigilante efforts to maintain peace and safety on the platform. This sometimes results in unique initiatives, such as the well-known Local Police Department (LPD), a community formed in 2018 whose members role-play as police officers and loosely address issues such as harassment in public worlds, combining elements of entertainment with a bottom-up form of community moderation.

The ‘instance’ options available in VRChat
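In access-control terms, an instance is essentially a membership check attached to a copy of a world. The sketch below, written in Python with invented field names rather than VRChat’s real data model, shows how public, friends, group and invite instances translate into different join rules, and how group moderation reduces to editing a ban list.

from dataclasses import dataclass, field
from enum import Enum

class InstanceType(Enum):
    PUBLIC = "public"
    FRIENDS = "friends"
    GROUP = "group"
    INVITE = "invite"

@dataclass
class Instance:
    world: str
    kind: InstanceType
    owner: str
    friends_of_owner: set = field(default_factory=set)
    group_members: set = field(default_factory=set)
    invited: set = field(default_factory=set)
    banned: set = field(default_factory=set)

def can_join(instance: Instance, user: str) -> bool:
    """Audience segregation expressed as an access-control check on the instance."""
    if user in instance.banned:
        return False
    if instance.kind is InstanceType.PUBLIC:
        return True
    if instance.kind is InstanceType.FRIENDS:
        return user == instance.owner or user in instance.friends_of_owner
    if instance.kind is InstanceType.GROUP:
        return user in instance.group_members
    return user in instance.invited   # invite-only

# Group moderators removing a disruptive user simply update the ban list.
party = Instance("rooftop_bar", InstanceType.GROUP, owner="ana",
                 group_members={"ana", "ben", "cem"})
party.banned.add("cem")
print(can_join(party, "cem"))  # False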

Rec Room similarly features audience segregation mechanisms through public and private rooms. Users can create rooms that only invited players are allowed to enter; otherwise, joining a public room places the user in an instance of that room alongside other players already present. Users also have a personal ‘dorm room’ where they can enjoy their privacy or invite close friends to spend time together. Similar to groups in VRChat, they can launch their own clubs and moderate them according to their preferences, inviting club members to customised rooms (clubhouses) and spending time with like-minded individuals in a curated and controlled social space.

Dorm room in Rec Room (image retrieved from rec-room.fandom.com)

Moderation and safety challenges

Audience segregation raises important questions about the extent of freedom allowed in private rooms. While these spaces can provide consenting adults with a private venue for sexual and romantic intimacy, the lack of effective age verification mechanisms means that users of all ages can access private instances on these platforms. This can result in minors entering spaces where harmful conduct may occur.

Personal safety measures

Platform users are provided with certain features to self-moderate their environment by controlling what they see and hear and how they experience their presence in these 3D spaces.

VRChat offers tools for personal safety, including blocking and muting disruptive users to prevent unwanted interactions. The votekick feature enables players to collectively remove users causing trouble in a room, and users can also report inappropriate behaviour to VRChat staff. The trust and safety system allows users to customise settings for each trust rank, controlling aspects such as voice chat, avatar visibility, custom emojis and sound effects to enhance protection against malicious content (for example, muting all users with ‘visitor’ rank). By toggling safe mode in their launchpad, users can also disable all features for surrounding users, adding an extra layer of security.

VRChat trust and safety system interface
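Conceptually, these per-rank settings form a small matrix: for each trust rank of the people around me, which of their features do I see and hear? The sketch below uses invented defaults to illustrate the idea, including how safe mode overrides everything; it is not VRChat’s actual configuration.

# Illustrative per-rank shield settings: which features of *other* users
# are shown to me, keyed by their trust rank (values are hypothetical).
DEFAULT_SHIELD = {
    "visitor":      {"voice": False, "avatar": False, "custom_emoji": False},
    "new_user":     {"voice": True,  "avatar": False, "custom_emoji": False},
    "user":         {"voice": True,  "avatar": True,  "custom_emoji": False},
    "known_user":   {"voice": True,  "avatar": True,  "custom_emoji": True},
    "trusted_user": {"voice": True,  "avatar": True,  "custom_emoji": True},
}

def effective_settings(other_rank: str, safe_mode: bool) -> dict:
    """Safe mode overrides everything, disabling all features of surrounding users."""
    if safe_mode:
        return {feature: False for feature in ("voice", "avatar", "custom_emoji")}
    return DEFAULT_SHIELD[other_rank]

print(effective_settings("visitor", safe_mode=False))      # visitors are muted and hidden
print(effective_settings("trusted_user", safe_mode=True))  # everything off in safe mode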

Rec Room includes personal safety measures such as a personal space bubble that makes others’ avatars invisible when they cross a set boundary, a stop hand gesture to instantly hide nearby avatars and options to mute, block or votekick disruptive users. Users can also report harassment or inappropriate behaviour to the platform staff.

‘Stop hand’ gesture, making the other avatar invisible (image from Rec Room’s tutorial video)
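Both the space bubble and the stop gesture amount to a client-side rendering rule: hide and mute another avatar when a distance threshold is crossed or an explicit gesture is raised. A minimal sketch, assuming a one-metre default radius:

import math

def within_bubble(my_pos: tuple, other_pos: tuple, radius: float = 1.0) -> bool:
    """True if another avatar has crossed my personal space boundary."""
    return math.dist(my_pos, other_pos) < radius

def should_render(my_pos, other_pos, bubble_on: bool, stop_gesture: bool, radius: float = 1.0) -> bool:
    """Hide the avatar if the bubble is on and they are too close, or if the
    'stop hand' gesture is raised at them."""
    if stop_gesture:
        return False
    if bubble_on and within_bubble(my_pos, other_pos, radius):
        return False
    return True

print(should_render((0, 0, 0), (0.5, 0, 0), bubble_on=True, stop_gesture=False))  # False: too close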

Moderation and safety challenges

As these platforms grow, the volume of individual reports becomes increasingly difficult to process within short time periods.

Towards a safe and inclusive policy

As VR environments become increasingly accessible and widespread with the advent of more affordable consumer headsets, the opportunities and challenges presented here will grow even more relevant. As this article shows, social VR platforms implement various innovative measures to enhance safety and moderation for their users. They are trying to strike a balance between privacy and surveillance; however, much work remains to be done.

For example, platforms can deploy professional awareness teams in public worlds to support the community in cases of inappropriate behaviour and guide users in addressing their concerns. Educational workshops, videos and more comprehensive, mandatory onboarding tutorials can inform users about the unique challenges and issues they might experience in virtual environments. As recommended by its citizens’ panel,11 the EU could introduce voluntary certificates for virtual world applications to inform users about their safety, reliability and security.

As the regulatory landscape evolves with new initiatives, ongoing collaboration between platform developers, researchers, policymakers and users will be essential to ensure that VR is a safe and inclusive space for all.

This article was first published at Metaverse.EU


Ilker Bahar

Ilker Bahar is a PhD candidate in Media and Cultural Studies at the University of Amsterdam. [email protected]

1 Maister L, Slater M, Sanchez-Vives M and Tsakiris M (2015). Changing bodies changes minds: owning another body affects social cognition. Trends in Cognitive Sciences, Volume 19 Issue 1, January. bit.ly/4dJokjk

2 Herrera F, Bailenson J, Weisz E, Ogle E and Zaki J (2018). Building long-term empathy: A large-scale comparison of traditional and virtual reality perspective-taking. PLOS ONE, 17 October. bit.ly/3yT8KTi

3 Oppenheim M (2022). Woman reveals ‘nightmare’ of being ‘gang raped’ in virtual reality. Independent, 3 February. bit.ly/3XsVdey

4 Carville O and D’Anastasio C (2024). Roblox’s Pedophile Problem. Bloomberg, 22 July. bloom.bg/3ZcAA7M

5 See bit.ly/4dOt1bB

6 Grady P and Vona G (2023). EU Communication on Virtual Worlds: A brief explainer. Metaverse EU, 14 July. bit.ly/4e9HqyT

7 European Citizens’ Panel on Virtual Worlds: Final Report, July 2023. bit.ly/3XsiJIz

8 See mmostats.com

9 State of Voice Moderation, Rec Room Blog, 28 March 2024. bit.ly/4geS0Xa

10 See note 4.

11 See note 7.