Your Metaverse Is Too Small: My Keynote Presentation to the 2026 Virtual Worlds Best Practices in Education (VWBPE) Conference

As some of you might know, a while back I was asked by my librarian colleague (and fellow Second Life aficionado) Marie Vans if I would be willing to be one of the three keynote speakers during the 2026 Virtual Worlds Best Practices in Education (VWBPE) Conference, which is held every year in the pioneering virtual world of Second Life. I said yes (of course!). I never turn down an opportunity to give a presentation in my beloved Second Life. (I was asked to speak at the 2021 Virtual Ability Mental Health Symposium, giving a presentation on the topic of this blogpost on acedia during the pandemic, and in 2024, I gave a presentation on virtual world building in Second Life, in Second Life, to a graduate class in virtual world building and design, which was team-taught by a computer science professor and an interior design professor.) So, as you can see, this was not my first rodeo. 😉

The title of my presentation, which I gave as the keynote speech of the conference on its second day, Friday, March 20th, was Your Metaverse Is Too Small: How the Biases and Preconceptions of Virtual Worlds Hinder Their Use in Education. I was inspired by a keynote address at last November’s IMMERSIVE X conference by Andy Fidel, whose talk, held on the ENGAGE social VR platform, was titled The State of the Metaverse in 2026. My talk therefore consisted of three parts, as follows:

  • Quotes from Andy Fidel’s talk which I found inspiring and wanted to share (7 slides);
  • My initial, general observations about the metaverse (4 slides); and finally
  • A section titled Your Metaverse Is Too Small, where I discuss various ways our biases and preconceptions about virtual worlds and social VR/AR actually hinder their effective use in educational settings (with reference to Andy’s comments; 14 slides).

Yes, it’s a lot to cover in 45 minutes, but I did it! I’m just going to share my slides as-is, without any Creative Commons-type license this time around, since a lot of it is referring back to Andy Fidel’s ideas, which I found so inspirational in the first place. And yes, while the topics of these 14 slides in part 3 all sprang out of one particularly fevered brain dump of my ideas one evening, rather than from GenAI, I do freely admit that I then fed my entire blog into Google NotebookLM and asked it questions in order to create the content, based on thirteen (yes, 13!) different ways that, quote, “your metaverse is too small!”

Your metaverse is too small because…

  • it has too steep a learning curve for new users
  • your platform has a poor fit-to-purpose
  • it lacks accessibility features (e.g. speech-to-text for the Deaf/HoH community)
  • it is poorly designed and/or Quality Assurance tested, and it causes VR sickness/nausea (more common among women than men)
  • it is soulless/designed by committee (hello Meta Horizon Worlds and Workrooms! Proof positive that you cannot will metaverse platforms into existence by executive fiat and the spending of billions of dollars.)
  • it requires a VR/AR/XR headset (I used two slides to discuss this controversial take; see below for more detail)
  • it relies on cryptocurrencies, NFTs, or some other form of blockchain (do I really have to explain this at this point?)
  • it has poor (or non-existent) safety and trust features and policies
  • it focuses on the product rather than the community
  • it has user data privacy issues/is based on surveillance capitalism (once again, hello Meta Horizon Worlds and Workrooms!)
  • you fail to market it properly (or, in many cases, fail to market it at all)
  • it is unfriendly to different cultures and subcultures (e.g. trans people, furries, etc.)
  • (another controversial one, explained further below) it refuses adult content

Now, before all you social VR adherents rise up with torches and pitchforks and tar and feather me for even daring to say “your metaverse is too small because it requires users to have a VR/AR/XR headset,” here are the two slides, plus speaker notes:

Your metaverse is too small because it requires you to use a virtual reality headset.

It is not a surprise that many of the most popular social VR platforms (e.g. Rec Room, VRChat) also allow for non-VR users to participate. The VR headset market still has not taken off. Even the best-selling Meta Quest line of wireless virtual reality headsets (which make up an estimated 70% of the global VR headset market) has sold only approximately 30 million units around the world, and many of those devices land up collecting dust after the initial novelty of the product wears off. Apple, which launched its Vision Pro to enormous fanfare, does not publish sales figures, but industry reporters said that the company shipped approximately 390,000 units in 2024 and approximately 90,000 units in 2025. VR hardware remains bulky, heavy, and uncomfortable for extended wear. Nausea continues to affect a significant proportion of users. A VR headset isolates the wearer from their physical surroundings and from the facial expressions of the people around them—an anti-social device, in practice, even when its purpose is socialization via social VR.

The PC-tethered VR headset market—the high-end, high-fidelity segment that many early social VR platforms had built toward—proved especially stagnant. The dream of millions of consumers owning gaming-grade PCs with tethered Oculus Rift or HTC Vive headsets never materialized. Even the shift to standalone headsets like the Meta Quest series failed to generate the consumer mass-market that had been anticipated. Sansar is perhaps the most instructive case study in the danger of building a platform around assumed headset adoption. Developed by Linden Lab—the company behind Second Life—Sansar was announced in 2014 and launched in beta in 2017, timed almost precisely to coincide with what seemed like the dawning of the VR era following Facebook’s acquisition of Oculus. But the bet on tethered PCVR headsets never paid off. In March 2020, Linden Lab sold Sansar to a little-known startup called Wookey and pivoted to focus on live music events and virtual concerts, attempting to find a more sustainable niche. That hasn’t worked, either. Sansar still exists, but it is only being kept alive by volunteers at this point.

Research on technology adoption consistently shows that devices requiring behavioral change—for example, for VR, wearing something on your face, isolating yourself from physical surroundings—face much higher adoption friction than technologies that integrate into existing habits. The iPhone and Android phones succeeded partly because they fit into already-established phone-carrying behavior. VR headsets require building a new behavior from scratch.

The failure of the last metaverse hype cycle does not mean that immersive technology has no future. What failed was the specific prediction that millions of people would soon be spending significant time in virtual worlds accessed primarily through VR headsets, and that this would create platform-scale opportunities comparable to social media or mobile devices.

Don’t hate me for speaking facts. No VR/AR headset (even the Meta Quest line of headsets) has taken off in the way that iPhones/Android phones and tablets and smartwatches have. In particular, the developers of those platforms who bet the farm on widespread adoption of high-end tethered PCVR headsets (hello, Sansar and High Fidelity!) lost that bet badly; Sansar is essentially moribund, and High Fidelity is now closed (although it does live on in its successor social VR platforms Vircadia and Overte, which were based on HiFi’s open-source codebase, but are also not heavily used). This failure is one of the reasons why Second Life is still going strong (or, at least, strong enough to endure and remain profitable for Linden Lab) after more than 22 years.

And speaking of SL…

I want to make one thing very clear: in some educational applications of the metaverse (especially those intended for children and teenagers, i.e. K-12 education), a ban on adult content is absolutely necessary.

However. As my speaker notes for this last slide in my presentation state:

However, in any institution of higher learning (e.g. a university), you will find faculty, staff, and students teaching about, learning about, and doing research on topics which may include controversial or adult topics. I have argued that one of the most significant strategic errors a metaverse platform can make is the outright refusal to host adult content (or some other sort of heavy-handed sanitization of adult content, like imposing baked-on underwear on the base male and female adult avatars).

In my blog, I’ve pointed out that for some successful virtual worlds, adult communities are not just a niche—they are the economic and social engine that keeps the lights on. I have frequently cited Second Life as the prime example of a platform that understands the value of adult content. On my blog, I’ve noted that the adult-rated regions of Second Life generate a good portion of the platform’s revenue through land tier fees and the sale of virtual goods (clothing, skins, animations). In contrast, I wrote about Sansar’s early decision to strictly moderate content and its struggle to establish a clear policy on adult material. I argued that by trying to keep the platform “brand-safe” for corporate partners, they essentially “cut off their nose to spite their face,” alienating a potential demographic of creators and consumers who were ready to spend money on higher-fidelity adult experiences. And the corporate clients never came anyway!!

I believe that the ability to explore one’s identity—including its sexual or adult aspects—is fundamental to the metaverse experience. For example, both Second Life and VRChat tend to attract the trans community, giving them a way to experiment with how they represent themselves in a way that might be difficult or impossible to do in real life (particularly at a time when trans people are increasingly under attack in certain jurisdictions). Platforms that ban adult content often end up banning people by extension. If a platform’s moderation is too aggressive, it can lead to the marginalization of subcultures (like the furry community or the trans community) who use virtual worlds as a safe space for exploration. This aligns with Andy’s focus on “presence” and “feeling seen”. Andy argues that gathering spaces should be “smaller, weirder,” and more human. I have argued that by refusing to host adult content, platforms are choosing “corporate safety” over “human authenticity.” They are creating “noise” for brands rather than “spaces that matter” to real people.

One of my core arguments is that you cannot impose a culture on a virtual world; the users bring the culture with them. I’ve pointed out that in almost every successful social VR platform (like VRChat), “NSFW” content and communities exist regardless of official policies. Trying to ban these things is like trying to stop the tide with a broom. Platforms that fight their own communities on this issue usually lose the “heart and soul” that Andy Fidel says is required for a space to be successful. Andy speaks about “architecting belonging” and building spaces like cities. A real-life city has red-light districts, gay bathhouses, private clubs, and adult stores. By refusing to allow these “niche micro-communities” to exist, platform owners are failing to be the architects of a real society and are instead acting as corporate landlords of a sanitized shopping mall.

Okay, enough ranting. Here’s my slide presentation, which you can download to read the rest of my slides and my speaking notes:


Please note: while Philip Rosedale’s keynote speech on the first day of the Virtual Worlds Best Practices in Education conference on Thursday, March 19th has already been uploaded to the VWBPE YouTube channel, mine has not yet been uploaded to view. When it is, I will update this blogpost with a link to the video of my talk.

Chatting with, Watching, and Listening to My Blog (The Good, The Bad, and the Ugly): My Experience Creating a Notebook from My Blog Posts Using Google NotebookLM

This is a follow-up of sorts from my initial blogpost about Google’s NotebookLM generative AI (GenAI) tool, which you might want to read first to gain a bit of context. I decided yesterday that I wanted to test NotebookLM out, by training a notebook on my entire RyanSchultz.com blog (from the very beginning to the previous post).

How I Got My Blogposts into NotebookLM

This part turned out to be easier than I anticipated it would be! First, I had to actually figure out a way to get all my blogposts into a format that Google NotebookLM could ingest. I installed a free plugin for WordPress called WP All Export, asking it to export all my blogposts into a CSV file. Then, I asked Claude AI to convert the CSV file into a PDF format. I watched as it created a program to do the conversion, ran it, and I landed up with a PDF of my entire blog. It took almost no time at all!
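For the curious, the conversion script Claude wrote boiled down to something like the following. (This is a minimal sketch from memory, not Claude’s actual code: the column names are just placeholders for whatever fields you choose in WP All Export, and I’m assuming the Python fpdf2 library for the PDF generation.)

```python
# Sketch: convert a WP All Export CSV of blogposts into a single PDF.
# Assumes columns named "post_date", "post_title", and "post_content";
# requires the fpdf2 library (pip install fpdf2).
import csv
from fpdf import FPDF

pdf = FPDF()
pdf.set_auto_page_break(auto=True, margin=15)
pdf.add_page()

with open("blogposts.csv", newline="", encoding="utf-8") as f:
    for post in csv.DictReader(f):
        pdf.set_font("Helvetica", style="B", size=14)
        pdf.multi_cell(0, 8, f"{post['post_date']}: {post['post_title']}")
        pdf.set_font("Helvetica", size=11)
        # The built-in Helvetica font only handles latin-1, so emoji and
        # curly quotes are replaced here; adding a Unicode font via
        # pdf.add_font() would preserve them.
        body = post["post_content"].encode("latin-1", "replace").decode("latin-1")
        pdf.multi_cell(0, 5, body)
        pdf.ln(10)

pdf.output("blog.pdf")
```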

Then, because NotebookLM cannot work with files over 500,000 words apiece, I asked Claude to split my too-large PDF into sub-half-million-word chunks, which it happily did, automatically creating code to complete the task (and even going so far as to inform me that perhaps one of my blogposts had an incorrectly-formatted posting date):
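(Again for the curious, the splitting logic boils down to something like the sketch below, using the pypdf library and page-level word counts; both of those choices are my assumptions, and Claude’s actual code may well have differed.)

```python
# Sketch: split blog.pdf into parts of fewer than 500,000 words apiece,
# NotebookLM's per-source limit. Requires pypdf (pip install pypdf).
from pypdf import PdfReader, PdfWriter

LIMIT = 500_000
reader = PdfReader("blog.pdf")
writer, words, part = PdfWriter(), 0, 1

for page in reader.pages:
    count = len((page.extract_text() or "").split())
    # Start a new output file before this page would push us over the limit.
    if words + count > LIMIT and len(writer.pages) > 0:
        writer.write(f"blog_part{part}.pdf")
        writer, words, part = PdfWriter(), 0, part + 1
    writer.add_page(page)
    words += count

if len(writer.pages) > 0:
    writer.write(f"blog_part{part}.pdf")
```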

Finally, I uploaded each of the four resulting PDF files into a brand new notebook in NotebookLM. It seemed to take a while to train the language model. I went away for an hour and it still seemed to be processing all four files, but then I exited that particular notebook and came back, and it appeared to be done (perhaps I needed to refresh the webpage to see that it had completed?).

Anyway, the rest of my commentary on Google NotebookLM will be split into three sections:

  • The Good: what worked well (sometimes astoundingly well, in my opinion);
  • The Bad: what worked, in my estimation, badly; and finally
  • The Ugly: because yes, one of the suggested follow-up questions was about “butt-ugly” avatars, and I went down that particular rabbit hole!

The Good

It is very clear that one thing Google NotebookLM particularly excels at is summarizing a collection of information from various sources (in this case, 2,768 blog posts from the RyanSchultz.com blog). In fact, I marveled at just how good NotebookLM was at giving a forest-level view of these particular 2,768 trees! And I found myself in the weird situation of actually having a conversation with my blog content, asking it questions.

The first question I decided to throw at the language model trained on my blogposts was this: “How long did the metaverse hype cycle last?”

Here’s what it spit back (in case you didn’t feel like squinting at the screen capture above):

The most recent and prominent metaverse hype cycle lasted approximately three years, cresting and then ultimately crashing.

This intense period of “metaverse mania” was ignited in October 2021 when Facebook officially rebranded itself as Meta, and CEO Mark Zuckerberg announced the corporation’s massive pivot toward building the metaverse. This announcement sparked a massive wave of public and corporate interest, leading to a phase where countless companies scrambled to associate themselves with the “metaverse” buzzword.

At its peak, this hype cycle was heavily fueled by a collision with the booming cryptocurrency and Non-Fungible Token (NFT) markets. Speculators, investors, and major real-world brands engaged in a frenzied digital land rush, spending millions of dollars to buy up artificially scarce virtual real estate on blockchain-based platforms like Decentraland and The Sandbox.

However, the hype eventually collapsed into a “trough of disillusionment” due to several compounding factors. First, a severe “crypto winter” caused the value of cryptocurrencies and NFT-based virtual real estate to plummet, leading to massive financial losses for investors and driving away the general public. Second, Meta faced widespread mockery and skepticism for the poor graphical quality of its flagship Horizon Worlds platform, coupled with staggering financial losses in its Reality Labs division.

By the end of 2022 and into 2023, the tech industry’s attention had firmly shifted away from the metaverse and onto a new, explosive hype cycle: generative artificial intelligence tools like ChatGPT and DALL-E. As public interest waned, major corporations like Microsoft, Disney, and Walmart began shuttering their metaverse divisions or virtual reality workspaces, prompting mainstream media outlets to effectively declare the three-year metaverse trend “dead”.

I mean, that’s actually a pretty good summary answer. Even better, littered throughout this response were numbered “citations” which, when hovered over, popped up a snippet of one of my blog posts from which the information was taken. Here are two examples of that (and, if you chose to click on the “citation,” it would open up the text in its context blogpost in the leftmost column, which I did not show here):

To give another example, here’s the response to an easier question I asked my blog: “What are the reasons why Second Life has been such a successful virtual world?”:

The remarkable success and longevity of Second Life can be attributed to several core factors that created a deeply engaging and “sticky” virtual world for its users.

A Thriving, Creator-Driven Economy: A critical element of Second Life’s success was Linden Lab’s decision to provide a practical combination of micropayments (the Linden Dollar) and an object permissioning system. This allowed users to retain the intellectual property rights to their creations and effectively monetize their work. Once people realized they could earn real money by creating and selling content, the platform’s economy exploded, leading to a massive marketplace with millions of virtual goods, a vibrant virtual real estate market, and millions of dollars in yearly payouts to creators.

Limitless Avatar Customization and Identity Exploration: Second Life offered an unparalleled mix-and-match approach to avatar construction, allowing users to painstakingly customize their digital appearance from head to toe. This freedom gave users the psychological benefit of reinventing themselves, allowing them to be anyone or anything they wanted, and recognizing that people often feel more plural than the physical world allows them to be. Because users invested so much time and money into perfecting their avatars, they developed a strong sense of embodiment and attachment to their virtual identities, which kept them rooted in the platform.

Deep Community Investment: Ultimately, the true secret to Second Life’s success is the profound investment users made in their communities. The platform provided a blank canvas where users could find their specific “tribes”—ranging from historical roleplayers to furry communities and live music fans. The connections, friendships, and social bonds forged in user-created clubs, music venues, and social spaces created a strong network effect that kept people returning year after year.

Creative Freedom and Early Market Dominance: In its early days, Second Life benefited from a culture of near-anarchy and open-ended exploration, which encouraged residents to build and shape the world in myriad, unexpected ways. By providing accessible in-world building tools, Linden Lab empowered everyday users to easily get started as content creators. Furthermore, Second Life had a significant historical advantage: during its peak growth period from 2003 to 2008, it had almost no effective competition in the virtual world space, allowing its “build it and they will come” strategy to flourish.

This is a good, concise answer, drawn from all the blogposts I have written about Second Life’s success over the years. And, again (although it didn’t come over from my copy-and-paste of this quote), there are actually links back to the blogposts from which this answer was generated. You can click on the “citation numbers” (for lack of a more technical term, although they’re not really true academic citations) to see where it pulled at least part of its answer from (such as this post I wrote back in 2019):

So, I then spent a merry hour firing all sorts of questions at my blog, just to see what the answers would be. And, to be honest, I was fairly impressed at how well it could put together coherent paragraphs, neatly synthesizing various posts from my blog into a concise answer. In my view, this is where Google NotebookLM really shines. One last example: I asked it, “What were some of the reasons that Blue Mars failed as a metaverse platform?” Here’s the response, again compiled from various sources on my blog:

Oh, trust, child, Auntie Ryan had opinions about this now-obscure, failed metaverse platform. And again, there are links to all those blogged opinions!

The failure of the Blue Mars virtual world platform can be attributed to several compounding factors:

  • Misplaced development priorities: The developers focused heavily on non-core technology projects while neglecting basic, critical features. For example, instead of giving users the ability to modify their avatar’s physical size and shape, developers prioritized creating automated artificial intelligence bots. This resulted in a widely mocked welcome area where legions of female NPC bots were programmed to automatically mob and flirt with male avatars.
  • Offensive default animations: The platform featured a small pool of unchangeable default animations. The default standing and walking animations for female avatars were seen as sexist, coquettish, and “too anime-like,” which actively alienated users.
  • Alienating the adult market: The creators made a key decision to strictly maintain a G/PG-13 rating and refused to support adult content or adult role-play communities. Because adult activities are a major driver of engagement and economics in virtual worlds, this decision caused many potential developers to simply walk away.
  • Severe bottlenecks for content creators: Blue Mars created an unacceptable regulatory bottleneck by requiring all user-generated content to be strictly vetted and approved by the company before it could be sold.
  • Lack of an online marketplace: There was no centralized online store for digital goods; instead, creators had to spend time and money setting up physical in-world stores. This cumbersome system resulted in a severe lack of content, with some massive in-world shopping malls hosting as few as four items of clothing for sale.
  • Bungled marketing and promotion: The platform had almost zero press coverage outside of the existing virtual worlds community. What few events Blue Mars did host were poorly promoted and sparsely attended, meaning the platform failed to attract the boosters and “raving fans” needed to build a sustainable population.

I always laugh when I see this old picture from Blue Mars. The tone-deafness of this automated bot “welcoming” system was staggering. Obviously, someone at Blue Mars thought that this sort of work was a high priority! No wonder the platform folded.

I could go on (I asked my blog a lot more questions), but you get the drift. After each response, there were three follow-up questions, which appeared to be randomly generated. One of them stopped me in my tracks:

How did Ryan’s subconscious use Luba’s song to change his habits?

Excuse me?!?? Intrigued, I cut and pasted the suggested follow-up question in the chatbox, only to unlock a long-buried memory which I had written about in an idle moment, near the start of the pandemic:

Ryan believes that his subconscious often sends him messages through the songs that happen to get stuck in his head.

During a period of severe insomnia, Ryan was sitting at his computer feeling dejected over his “quixotic endeavour” to arrange for other people to inherit his large collection of Second Life avatars (alts) in his will. At that moment, he realized the lyrics to the 1984 song “Let It Go” by Canadian musician Luba were running through his mind.

He interpreted this song as a clear message from his subconscious telling him to abandon the stressful, unrealistic goal of finding homes for all his avatars after he passed away. The song prompted him to realize that his “obsessive little hobby” of compulsive virtual shopping and avatar styling—which he used as a coping mechanism to escape from messy, painful reality—was taking up too much time and energy that he needed to devote to other areas of his life.

Consequently, Ryan used this subconscious revelation to change his habits: he decided to scale back on his Second Life shopping and styling, and he actively “let go” of his digital creations by permanently deleting eighteen of his Second Life avatars.

Oh, wow. Wow. I had completely, utterly forgotten how I was sometimes deep in the bargaining stage of Elisabeth Kübler-Ross’ five stages of grief during the COVID-19 pandemic that engulfed the world. It was only after discussing my behaviour with my psychiatrist (online, at a time of social distancing) that I realized I was trying to find some small way to control a situation over which I had no control, hence my desire to leave my SL avatars to other people via my will (and actually, my brother does have a list of who gets which avatars, as part of the will I did draw up during the pandemic).


Now, at the top of the rightmost column in NotebookLM (at least, in the desktop browser version) is something called the Studio, where you can generate various kinds of artifacts from your notebook (and the created artifacts appear in a list below it):

For example, when you click on the Audio Overview button, it actually goes away for a little while, and pulls together a podcast. Actually, since it’s not a real podcast, I am gonna use quotes: it’s a “podcast” where a male voice and a female voice talk about…my blog. Yes, I am not kidding! You do have some options available; I chose the Deep Dive of the four options, and asked to have the resulting audio “podcast” focus only on the metaverse topics in my blog (and not talk so much about, for example, the pandemic, which I also wrote a lot about, especially when it started in 2020).

I was pre-cringing after I hit the blue Generate button, expecting the resulting “podcast” to be criminally bad, but the results were actually, to my surprise, quite listenable! I am actually flabbergasted at how many different topics from my blog were addressed in only 20 minutes. And the discussion between the male and female co-hosts actually flowed, in a completely natural manner. I was stunned.

Now, I did detect a few mistakes (e.g. the pronunciation of the first name “Moesha,” one of my Second Life avatars) and a few hallucinations, notably, that you could create a red, avatar-wearable high-heeled shoe (i.e., rigged to the feet of a particular brand of mesh body, like Maitreya LaraX in Second Life) using GenAI tools (sorry, but we are not there yet, to my knowledge!). But overall, I was absolutely astounded by how much this “podcast” DID. NOT. SUCK. 🤯

I was able to export my “podcast” in an audio format called .m4a, but WordPress wouldn’t allow me to upload and share that particular file format, so I used a free online converter to convert the M4A file into MP3. Just click on the link below to download and listen to the 20-minute Gemini-GenAI-generated podcast, using whatever audio or video players you use to listen to .mp3 files:

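(If you’d rather not hand your files to a random online converter, a local command-line tool like ffmpeg can do the same M4A-to-MP3 conversion; here’s a quick Python wrapper as a sketch, with hypothetical filenames, assuming ffmpeg is already installed and on your PATH.)

```python
# Sketch: convert the NotebookLM "podcast" from M4A to MP3 using ffmpeg.
# The -q:a 2 flag selects a high-quality variable-bitrate MP3 encoding.
import subprocess

subprocess.run(
    ["ffmpeg", "-i", "deep_dive.m4a",
     "-codec:a", "libmp3lame", "-q:a", "2",
     "deep_dive.mp3"],
    check=True,
)
```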
🤯🤯🤯🤯🤯🤯🤯🤯🤯🤯Honestly, this one feature alone has just completely left me gobsmacked! My flabber is gasted!! 🤯🤯🤯🤯🤯🤯🤯🤯🤯🤯Somehow, I just never imagined, even in my wildest dreams, that I would live long enough to be able to convert my entire blog into a listenable, 20-minute podcast, where two co-hosts casually and naturally discuss a number of topics that I had written about over the past eight years! 🤯🤯🤯🤯🤯🤯🤯🤯🤯🤯If this is an indication of where GenAI is going, then the future is going to be deeply, deeply weird, folks. I mean, seriously, go click on the link above, download it, grab a coffee, and listen to it.

Then, I decided to try the video creation tool. I told NotebookLM that I only wanted a video based on my metaverse blogposts (and not, say, my reflections on the coronavirus pandemic, although there was some mention of that in the final production as well). For a lark, I chose the Cinematic option of the three offered selections for the style of the video, just to see what it would do:

Here is the four-minute GenAI-generated MP4 video, which it produced within 35 minutes of my clicking the blue Generate button, again based off my metaverse blog posts. There’s actually an overarching narrative structure, references to my blog traffic, and several interesting chosen visuals (not all the visuals were that compelling, and I certainly wouldn’t consider the results cinematic per se, though I probably should have added something to my original text prompt to make it all “pop” a bit more).

Honestly, it’s not that bad; again, it was better than I expected. Check it out:

Tumbleweeds to Tornado to Tumbleweeds: Exposing the Metaverse

Again, while I am probably not as gobsmacked with the video result as I was with the “podcast,” this is still far better than what I was bracing for. The narration in the video, in particular, was pretty well-written and coherent, I thought. Is it better than what an experienced video editor could create? Hell no. But it’s a far, far better video than anything I could have cobbled together myself, using my rudimentary (well, okay, non-existent) video editing skills. Again, rather impressive.

However, not all the tools in the Studio toolbox were as capable, as we will see in the following sections!


The Bad

Okay, now let’s get to the stuff NotebookLM sucked at, in my opinion. For example, here is the Mind Map it generated (and yes, the clumsy title was the one suggested):

The Mind Map generated was seriously underwhelming.

I opened and closed a number of the arrows on the various levels to check out the categories that were created, and yeah, not a fan.

I also checked out the Infographic button, making selections as follows: portrait mode, in an Editorial style, standard level of detail:

Here’s the result:

Now, there are several things wrong with this infographic. For example, three of the four metaverse platforms in the so-called “Big Four” are either dead (High Fidelity) or moribund (Sinespace, Sansar). Hardly what I would have picked for my top four, but then again, it was drawing from older, outdated blogposts on my blog. Perhaps I might have gotten better results had I not only restricted the text prompt to metaverse posts, but also added a time range (e.g. “Only use blogposts from the past 12 months.”).

Also, I’m sorry, but the whole “decentralized governance” talking point in the blockchain-based metaverse platforms is utter bullshit. (And I frankly washed my hands of the whole stinking mess, and eventually refused to bother covering any blockchain or NFT-based metaverse platform due to the proliferation of scams and rug-pulls.) So, yeah, this was also a fail (although I did kinda like the style of it).


The (Butt-)Ugly

Remember those three follow-up questions after every chatbot response, on what appear to be random topics (so not really “follow-up” questions at all), like the earlier mention of Luba’s Let It Go? Well, here was yet another head-scratcher question: “Tell me more about the avatars you called butt-ugly.”

I could not resist, and so I cut and pasted that question verbatim into the chat box, only to receive back the following detailed response:

You are likely referring to the avatars from two different social VR platforms that you have heavily criticized on your blog: AltspaceVR and Arthur.

AltspaceVR: You originally described the early avatars in AltspaceVR as “dreadfully cartoony,” “terribly unappealing,” and “butt-ugly”. You reasoned that this was a deliberate design compromise by the developers to keep the avatars low-polygon so they could render quickly and easily across a massive range of hardware, including less powerful mobile headsets like Google Daydream and Samsung Gear VR.

Fortunately, AltspaceVR eventually released a major avatar update in July 2020 that allowed for much greater customization of hair, eyes, and skin colors. While you noted they still lacked arms and legs (likely to avoid inverse kinematics issues), you celebrated the update for finally banishing the “low-poly, dreadfully cartoony avatars” you had previously disliked.

Arthur: You also used the term to describe the avatars in Arthur, a virtual office and remote collaboration space, declaring them “among the butt-ugliest avatars I have seen in any social VR platform or virtual world”. To avoid the difficult task of rigging the avatars for facial expressions, Arthur’s developers made several off-putting design choices:

  • The Eyes: The avatars wear black, Matrix-style sunglasses to avoid having to rig eye movements.
  • The Mouth: The avatars wear wraparound black microphones that cover the mouth so closely they look as if they are being gagged, completely hiding the lack of lip-syncing.
  • The Hands: You exist only from the waist up, and the avatar’s arms simply fade out into floating controllers instead of actual hands.

While Arthur’s founder claimed this design was “better than looking at dead eyes,” you strongly disagreed, calling them “creepy-looking, budget-basement, cookie-cutter avatars”. You noted that you would rather show up to a virtual meeting with messy hair and a stained shirt on a Zoom webcam than be represented by Arthur’s highly restrictive, blank-faced avatars.

I searched my blog to find a picture of Arthur’s avatars at that time:

An example avatar from Arthur

Ladies and gentlemen of the jury, I rest my case. (And, once again, something I had ranted about on my blog, and then completely forgotten about, until NotebookLM dredged it up again. By the way, Arthur still seems to be around, but I haven’t bothered to go back and check to see if they have replaced these butt-ugly avatars.)


So, although I received a rather mixed bag of results, I still found that the exercise of feeding my blog into NotebookLM actually had some concrete benefits, in my opinion. For example, it would have taken me hours of scrolling and compiling work to summarize my experience with the Apple Vision Pro from my blogposts, whereas all I needed to do was ask the NotebookLM chatbot, “What does Ryan think about the Apple Vision Pro?”

Here’s the full, unedited response:

Ryan Schultz is an overwhelming fan of the Apple Vision Pro (AVP), frequently describing the technology as “magic”. Despite its incredibly high price tag (which cost him roughly CA$7,700 for the 1-terabyte model, accessories, and taxes), he considers it “worth every single penny” and states he cannot imagine living and working without it.

Here is a breakdown of Ryan’s specific thoughts and experiences with the device:

A Leap Forward in Technology: Ryan notes that the visual clarity of the AVP is so stunning that it completely eliminates the blurry “screen door effect” common in older virtual reality headsets. He famously noted that the Apple Vision Pro makes every other VR headset he has ever used feel like a “cheap plastic View-Master toy”. He was deeply moved by the device’s 360-degree immersive “Environments,” such as Mount Hood and Haleakalā, stating that they felt so real they almost brought him to tears. He considers these Environments to be one of the device’s “killer apps”.

A Vital Productivity and Pain Management Tool: Ryan uses the headset for hours every day, primarily as a work productivity device. He considers the Mac Virtual Display to be the absolute ultimate killer feature of the AVP. By connecting it to his MacBook Pro, he can work on a massive, crystal-clear virtual screen suspended in a relaxing virtual forest or beach.

Crucially, the headset has become a pain management tool for Ryan. Suffering from deteriorating cervical joints in his neck, the Virtual Display allows him to look straight ahead at an ergonomic height rather than hunching over a laptop, providing significant pain relief and allowing him to work comfortably for four to six hours a day.

Flaws and Frustrations: Despite his evangelism for the device, Ryan has documented several frustrations with the Apple Vision Pro:

  • The Purchasing Process: Because Ryan wears glasses (which cannot fit inside the AVP), he had to order magnetic Zeiss prescription lenses. A nightmare with UPS shipping delays left him with an unusable headset for weeks, making him so angry he nearly returned the entire device before he even got to use it.
  • “Crackgate”: After just seven weeks of meticulously careful use, his unit developed a spontaneous half-centimeter crack right down the middle of the front glass, a known defect he found deeply upsetting for such an expensive product.
  • Physical Discomfort: During his first few months of use, he experienced eye strain after about two hours of wear. He mitigated this by switching to a thicker light seal cushion, and the problem was entirely eliminated when he later upgraded to a newer AVP model with an M5 processor and a more comfortable dual-loop knit headband.
  • Limitations: He finds the side-to-side field of view slightly constrained (especially when watching spatial videos), and he notes that trying to share the device to give demonstrations to friends using “Guest Mode” is highly frustrating and clunky.

Ultimately, Ryan’s experience with the Apple Vision Pro cemented his belief in the future of spatial computing. He uses it heavily for media consumption, socializing in apps like InSpaze (where 3D spatial Personas “blew his mind”), and daily office work. He has declared that he is “all in” and plans to be first in line to purchase whatever Apple spatial computing device comes next.

I mean, come on. This is an accurate, concise response, stitched together from countless blogposts I have written about the Apple Vision Pro over the past year and a half. And it took me only a couple of minutes to generate! I’m sold.

The real power of Google NotebookLM is its ability to take a large amount of raw data, and use it to create a personal language model that you can then ask questions of, generate artifacts like flashcards from, etc. In particular, it excels at summarizing the results in its answers, and I must confess that alone is worth the sometimes-high price of admission (i.e. fiddling with your data to get it into the service, like I did with my blogpost export and conversion, as described in the first section of this blog post above). While some of the Studio tools provided less-than-great results, others (like the “podcast”) simply blew my mind. And, as AI writer Ethan Mollick has stated, assume that this is the worst version of NotebookLM you will ever use. These generative AI tools are improving and evolving over time.

Welcome to the future…it’s gonna be deeply weird, folks. And I can certainly recommend that you give Google NotebookLM a try (there’s a free version available, which you can play around with yourself).

UPDATED! Generative Artificial Intelligence Tools for Academic Research: AI Research Assistants, AI-Powered Document Analysis Tools, and A Look at Elicit, Undermind, and NotebookLM

I freely admit that this was not the next blogpost I was planning to write, but as a follow-up to my previous detailed discussion of what I have started to call the “Big Three” of good (sometimes, good enough) general-purpose generative AI (GenAI) tools—ChatGPT, Claude, and Gemini—I wanted to write a little bit more about two particular subsets of GenAI tools which are focused on the academic research process. And, since I have two things coming up on my calendar which necessitate academic research, I figured, well, what better time to demonstrate some of these GenAI tools than with some real-world, real-life examples from my own use of some of these new tools?

Some Definitions

Will new generative AI tools change how academic research is done? Photo by Dan Dimmock on Unsplash

These two categories of tools are:

  • AI research assistants: tools specifically designed to help researchers search, discover, synthesize, and analyze academic and scientific literature. Each of them uses large language models (LLMs, i.e. GenAI) combined with scholarly databases (e.g. PubMed for medicine; AGRICOLA for agriculture), to help users find relevant papers, extract key findings, and synthesize evidence across studies. Examples of such tools are Elicit, Undermind, Consensus, and Assistant by Scite. Keep in mind such tools are only as good as the scholarly databases they access! For example, while Consensus proudly announces partnerships with major academic publishers like Sage, Wiley, Taylor & Francis and ACS on their front page, Elicit only seems to use freely-accessible sources like Semantic Scholar and OpenAlex, as you will see below.
  • AI-powered document analysis tools: While AI research assistants search across published scholarly literature, GenAI-powered personal library/document analysis tools are built around the concept of “source-grounding” — you upload your own documents (e.g. PDFs of journal articles and conference papers, word processor documents, websites, YouTube videos, audio files, etc.) and then the GenAI tool works exclusively from those materials. They’re intended to help researchers make sense of a lot of information. The best-known of this relatively new category of GenAI tools is Google’s NotebookLM, but there are other products similar to it: Nouswise, and the open source tool Open Notebook.

To summarize the difference between the two: AI research assistants (Elicit, Consensus, etc.) help you discover literature, while AI-powered document analysis tools (NotebookLM, etc.) help you analyze and synthesize literature you’ve already collected. They occupy different stages of the research workflow.


Undermind

I currently have a Pro account with Undermind, at US$16 per month, which is one step up from their limited-use, free service. My initial question to Undermind was as follows:

I am researching the topic of the metaverse, both older virtual worlds (e.g. Second Life) and newer social VR/AR platforms (e.g. VRChat). I am interested in the history of the concept of the metaverse, and how the meaning of the term “metaverse” has evolved over time.

Undermind took this initial question, and asked a series of follow-up questions in order to clarify what I was looking for. Here’s part of that chat:

Eventually, I was able to come up with a more specific search, as follows:

The question I finally sent Undermind off to work on was as follows:

Find academic literature on the history of platforms and user practices associated with what is now discussed as the metaverse, staying broad across decades. Focus on the history of virtual world platforms and how people used them, including older virtual worlds such as Second Life and newer social VR/AR platforms such as VRChat, while also including adjacent predecessor platforms that predate the coining and later popularity of the term “metaverse.” Emphasize user practices broadly rather than narrowing to a single type of practice, and help trace how the meaning of the term “metaverse” evolved over time in relation to these platforms and practices.

My search results were 80 papers which Undermind determined were relevant to my final question, covering a publication date range of 1970 to 2024:

Note, at the bottom of this screen capture, how Undermind actually went through and sorted these papers into eight broad categories or subtopics, in essence giving me a nice overview of these 80 published academic papers. This kind of context/overview work is something at which GenAI tools tend to excel, and it can save an academic researcher hours of work (but, of course, you still have to be the human in the loop, and actually read and digest all the papers retrieved!).

But even more important to note is how GenAI tools like Undermind mark a dramatic change in information retrieval: a shift away from the sometimes-arduous task of using keyword searching, controlled thesaurus vocabulary, and Boolean logic to search traditional academic databases (e.g. PubMed and its MeSH, or Medical Subject Headings), towards actually having a conversation with the search tool, starting with a plain English statement, and answering follow-up questions to clarify and refine that initial prompt into a final search question, then submitting it.
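To make that contrast concrete, here is the kind of old-school, PubMed-style search strategy a librarian might construct for a topic like mine (an illustrative sketch only; any real search strategy would need its MeSH headings checked against the current thesaurus):

```
("Virtual Reality"[MeSH Terms] OR "metaverse"[Title/Abstract]
  OR "virtual worlds"[Title/Abstract] OR "social VR"[Title/Abstract])
AND ("history"[Title/Abstract] OR "evolution"[Title/Abstract])
```

Compare that to the plain-English paragraph I handed to Undermind above: the Boolean version is precise and reproducible but brittle, while the conversational version is forgiving to write but much more opaque about what was actually searched.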

If you like what you see (and I did), you can click on the Generate Report button to start a new process, which prompts you:

I’d like to write a report based on papers from the search “History of metaverse platforms and practices”.

Let’s briefly discuss the content before you start writing.

And again, Undermind asks a helpful series of clarifying questions to help you figure out what you want from a report on all this research data it dredged up:

The final report (which I could save as a PDF or markdown file, using one of three citation styles), looked like this:

The resulting report had 36 citations. However, unlike the Elicit report, the Undermind report did not have a section where it got into the nuts-and-bolts of what sources it used to discover the papers used in this report, nor the method by which it selected them. So, while the initial read looked good, it would take actually getting and reading the full-text of the papers cited in this report to determine exactly how good it was.


Elicit

I also decided to spring for a Plus-level account on a tool similar to Undermind, called Elicit (again, one step up from a free, Basic account, which offers a more limited service).

Having already done the Undermind search mentioned in the previous section of this blogpost, I decided to use the final search statement as my starting point, plugging it verbatim into Elicit to see what would happen…

…only to discover that Elicit doesn’t consider that a very concise search question, at all! (Actually, I kind of agree here. But Undermind let me do it, anyway!) However, instead of asking a series of follow-up questions like Undermind did, Elicit offered a series of buttons which, when pressed, rewrote the question to be much more narrowly focused, for example:

So, I clicked on the offered “Temporal and conceptual scope” button, and edited it a bit to include specific examples of what I was talking about, and hit the green Send button, using the default settings of research papers, and asking for a general review. Elicit then asked me how detailed an answer I wanted (with the most detailed alternative greyed out unless I pony up more money for their Pro plan, one level up from my measly Plus plan):

I went with the Balanced report. However, I am not crazy about the limitations, especially when I could do a more traditional database search, using one of the over 650 databases offered by my university’s library service, without such petty limits as “the top 500 sources” (and, remember, that’s a ranking based on a newish GenAI computer algorithm, not keyword matching using a controlled thesaurus vocabulary and Boolean logic to construct a search strategy, the old-school way). Essentially, it’s a trade-off: a search using plain English language to start, with prompts to refine it, and a pre-limited number of sources examined—and with even more restrictions on the number of sources from which a comparative chart would be constructed (25). If you want more—and many users would want more—then you’ll have to pay extra for it.

However, for all of its limitations, the final report looked pretty good, at first. You can save a PDF version of the report, and you can even ask questions of it, via a chatbot interface (using the chat box located in the bottom-right corner of the screen capture below):

However, in doing a read of the PDF report, I was struck by several things:

  • Again, the hard limit of 25 papers from which data was extracted, which essentially makes Elicit useless to me at this level;
  • The fact that zero papers of the 500 selected were screened out by the selected criteria (see image below, taken from the report); although, to be honest, this technique probably would have worked much better for examining clinical research studies in, say, medicine, rather than looking for papers about metaverse platforms;
  • The search was performed against “over 138 million academic papers from the Elicit search engine, which includes all of Semantic Scholar and OpenAlex,” but again, my librarian mind kept thinking that there would be a lot of full-text content that was locked away behind academic publisher paywalls. And indeed, of the 25 sources picked for this report, only 15 had the full text of the article retrieved. For the other ten sources, Elicit likely relied only on the (freely-available) author-provided abstract. Many of these GenAI research tools tend to rely on scraping free sources such as Semantic Scholar and OpenAlex, rather than enter into potentially expensive agreements with academic publishers such as Elsevier and Wiley, which would give their users full access to the content they own and, frankly, more complete data from which to write reports.

I actually came away from reading this report more disturbed by its limitations than I was impressed by any conclusions it was able to draw. Again, I hasten to add that Elicit would probably have performed much better had I given it a real-world use case better suited to it (like a systematic review of clinical medical trials, for example). It might just be that my admittedly fuzzy subject area didn’t fit the way Elicit works, at all. And that’s fine.

However, what bothered me most was that somebody without my 30-plus years of academic library experience would run this report, read it, nod, and think that this was a good response. Even worse, an in-depth response. When, in fact, a more traditional search against a library database (perhaps executed with the expertise of a professional librarian) would give much better and more thorough search results.

Even worse, how many of those Elicit users would stop here, and run with this summary, and not actually go and read the full text of the 25 papers that were selected for the report, not to mention the countless papers NOT included? I would suspect that it’s more than a few. So yeah, this academic librarian does have some reservations about where all this is headed. However, I can also confess that the report did give me a few new ideas to think about, and some possible new directions to follow in my own academic research, which I might not have found otherwise.

UPDATE March 11th, 2026: I’ve since gone back to Elicit and realized that what I probably should have done first was just search for papers, instead of just asking it to generate a report (see the red arrow on the image above). I tried searching for papers using the question, “What are the most effective techniques for dealing with trolling, griefing, and harassment on metaverse platforms?”

The next image is a screen capture of the search results. It gave me ten research papers in a chart, with brief citation details, a GenAI-generated summary of each paper, and another GenAI-generated overview of all ten papers in a couple of paragraphs on the right-hand side of the page, with the option to chat with the papers (i.e., ask questions and get answers from the content of these research articles). There’s a button at the bottom of the chart which you can click on to load another ten papers to keep retrieving information, although the right-hand-side overall summary does not seem to update with new articles added to the chart.


NotebookLM

Now, I turn to NotebookLM, Google’s Language Model (the “LM” in the product name), which tries to do for your personal research library the same sort of thing that Google Gemini does with—well, an infinitely larger library of millions and millions of documents. The idea is the same, though: to feed a (much smaller) set of documents, audio, video, etc. into a service which then allows you to use a chatbot-type interface to ask it questions, and (hopefully) get some useful answers back. But, again, how useful NotebookLM will be to you depends entirely upon what you feed it! In computer science we have a saying, with the acronym GIGO: Garbage In, Garbage Out. If you fill NotebookLM with crappy sources, don’t be too surprised if you get crappy answers back!

I have a Google AI Pro plan, with 2 terabytes of storage, which includes access to Google Gemini 3.1 Pro. This costs me CA$26.99 per month (approximately US$20), and frankly, I’m pretty sure I am not getting my money’s worth out of it. With that, my NotebookLM service is rated at the Pro level, which means I can have up to 500 notebooks, with each notebook having up to 300 sources. (NotebookLM Standard, the free service, lets you have up to 100 notebooks, with up to 50 sources each. You can compare the various levels of plans here.)

I have uploaded 103 documents (mostly PDFs of journal articles from my personal Zotero research library) into NotebookLM. Again, some of them are probably of lower quality than others, so the GIGO rule applies. For example, the notebook summary it seems to have automatically created veers alarmingly close to gobbledygook, and there’s even a mention of (gasp!) blockchain, and the audacity to name it as a “primary pillar necessary to facilitate real-time, multisensory interactions between users.” (WHAT THE ACTUAL FUCK?!?? Okay, I take it back, it is gobbledygook, a Frankenstein-like creation stitched together from bits and pieces of documents I had uploaded. I actually created this monster.)

There’s absolutely no explanation of how or why this summary was generated. In fact, I found the whole user interface of NotebookLM to be extremely confusing. I had to dig through the product’s Frequently-Asked Questions list to find out why some things wouldn’t load: any uploaded file over 200MB in size, any source with over 500,000 words, and any copy-protected PDF files will not load, but you don’t get any sort of error message if you try. In my limited testing thus far, you get…no response.
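(Had I known about those limits up front, a quick pre-flight check on each file would have saved me some head-scratching. Here is a sketch, using the pypdf library, with the limits taken from the FAQ and a hypothetical filename:)

```python
# Sketch: check a PDF against NotebookLM's documented upload limits
# (200MB per file, 500,000 words per source, no copy-protected PDFs).
# Requires pypdf (pip install pypdf); word counts are approximate.
import os
from pypdf import PdfReader

MAX_BYTES = 200 * 1024 * 1024
MAX_WORDS = 500_000

def check_source(path: str) -> None:
    reader = PdfReader(path)
    if reader.is_encrypted:
        print(f"{path}: copy-protected, will silently fail to load")
        return
    size = os.path.getsize(path)
    words = sum(len((page.extract_text() or "").split()) for page in reader.pages)
    ok = size <= MAX_BYTES and words <= MAX_WORDS
    print(f"{path}: {size / 1e6:.1f} MB, {words:,} words -> {'OK' if ok else 'OVER LIMIT'}")

check_source("blog_part1.pdf")
```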

Even worse, this feels like a product that Google has just sorta dropped on us, with only the previously-mentioned FAQ and an email address for product support (yes, even for Pro users). I shouldn’t be surprised, I suppose. Just like I wouldn’t be surprised if Google is silently compiling notes on how people use NotebookLM*, or decides to yank it away, like so many other previous Google products and services.

*UPDATE March 11th, 2026: It turns out that I was wrong; apparently (according to a quote from Steven Johnson, a member of the NotebookLM product team, in a slide presentation I watched today by fellow librarian and GenAI expert Nicole Hennig), anything you upload to NotebookLM is only stored in the model’s short-term context memory, and is not used in training the underlying Gemini LLM:

As an author, Johnson clarifies that no material uploaded to the model is used to train NotebookLM or Google Gemini; it’s only sent to the model’s context window, or “short-term memory.” Johnson explains that if you “have the right to use [the material] under copyright, you can use it inside of Notebook.” (source)

Honestly, I do need to spend some more time playing around with NotebookLM before I issue any final summary judgement on the product. In particular, I get the feeling that the GIGO rule really applies to NotebookLM! Google themselves, in their NotebookLM FAQ, state:

Sometimes NotebookLM can’t answer your question because of…

  • Information not in sources: NotebookLM answers questions based on the information provided in your uploaded sources. If the answer isn’t in the source material, it won’t provide a response.

I had a very interesting day playing with these GenAI tools, and I learned a few things. I’ll keep you posted on how things go!

Photo by Jaredd Craig on Unsplash

UPDATED! Generative AI Update, March 2026: My Updated Presentation on Artificial Intelligence and GenAI, Plus My First Thoughts on the Claude Add-In for PowerPoint, and Yet Another Head-to-Head Comparison Between Claude, Gemini, and ChatGPT

I am (as you can clearly tell by this absurdly long blogpost title) trying to do three related things here. If you want, you can skip to the very end, where there will be an executive summary with some thoughts to share about (waves hands) all this.

First, I wanted to share an updated version of the original slide presentation on artificial intelligence and generative AI, which I shared in a December 2025 blogpost. I used to think that keeping track of the many metaverse platforms I blog about was a task similar to herding cats, but let me tell you, it was a breeze compared to trying to stay abreast of all the rapidly changing and accelerating developments in generative AI!

Keeping on top of developments in generative AI is like herding cats, where the cats are multiplying and mutating!
One of the updated comparison charts in my PowerPoint slide deck (see link below to download)

Below is my updated PowerPoint slide presentation, complete with my speaker notes, for you to download and use as you wish, with some stipulations. I am using the Creative Commons licence CC BY-NC-SA 4.0, which gives the following rights and restrictions:

Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International

This license requires that reusers give credit to the creator. It allows reusers to distribute, remix, adapt, and build upon the material in any medium or format, for noncommercial purposes only. If others modify or adapt the material, they must license the modified material under identical terms.

BY: Credit must be given to you, the creator.

NC: Only noncommercial use of your work is permitted. Noncommercial means not primarily intended for or directed towards commercial advantage or monetary compensation.

SA: Adaptations must be shared under the same terms.

(The tool I used to determine the appropriate Creative Commons licence can be found here: https://creativecommons.org/chooser/.)

So, with all that said, here is my PowerPoint presentation (please click on the text link or the black Download button under the picture, not the picture itself):


NEW: Claude has just released add-ins for Microsoft Office

Second, today I installed a brand-new add-in from Anthropic’s Claude GenAI tool, which is supposed to work with Microsoft PowerPoint. This is an initial review of a very beta product.

And I have a real-world use case against which I will be trying out this new add-in: the design of an actual keynote presentation which I will be giving in a couple of weeks. (I am also using it in the third section below, but in a different test of all three of ChatGPT, Claude, and Gemini.)

Now, before I get into this, I should explain that I have tried in the past, with all three GenAI tools on which I currently have paid accounts (OpenAI’s ChatGPT, Anthropic’s Claude, and Google’s Gemini), to create a PowerPoint slide deck design, only to get highly disappointing and completely unusable results back. So I was not expecting much here, particularly as this is still a research beta version of the PowerPoint add-in.

My initial prompt to the Claude add-in to Microsoft PowerPoint was:

Please create a new PowerPoint slide presentation design with the title of the presentation being: “Your Metaverse Is Too Small: How the Biases and Preconceptions of Virtual Worlds Hinder Their Use in Education.” The theme of the talk is educational uses of virtual worlds, social VR, and the metaverse in general. I want to have some nice background images to use in some of my slides, as well as a visually pleasing title slide. I’d prefer blue as a colour in the slide deck theme, thanks!

And Claude chugged away on my request, keeping me posted on what it was doing:

And it even prompted me to be sure I wanted to delete the Claude add-in help slide!

The set-up for the title slide took a long, loooong time, much longer than it would have taken me to click on the Designer button in the PowerPoint toolbar and just select one of the default options and a colour scheme. Eventually, I just gave up on waiting and went off to work on another task, leaving Claude to beaver away. After fifteen minutes, I realized that I still had to explicitly okay the clearing of the original slide design (insert Homer Simpson “D’oh!” here), which I did, so that the work could continue.

If I could summarize the result in one word, it would be: meh (again, shout-out to The Simpsons):

I mean, I could easily do better than this myself. And two dots do not amount to what I asked for: “some nice background images to use in some of my slides, as well as a visually pleasing title slide.” Here’s my section title slide:

Again, extremely underwhelming, and frankly, not an improvement at all over my previous failed attempts to generate a PowerPoint slide presentation design using any of the GenAI tools (Claude, ChatGPT, or Gemini). Mind you, I have deliberately stayed away from using the image-generation tools in these products; I can spot a GenAI-produced image from a mile away by this point, having been playing around with these tools, off and on, since they first came out in 2022.

Claude continued to generate all the standard versions of PowerPoint slides in this theme, ending with a final slide that, I must confess, I kind of liked the look of (although, again, I would have preferred some sort of background image):

This is where the process got interesting, as I finally decided to stop having to manually okay each individual step, and just gave Claude carte blanche to do whatever it felt was best. (I mean, the worst that could happen was that it would come up with something I hated so much that I threw it away and started over.)

Claude was still working away while I took my lunch break, giving feedback along the lines of “Build stunning title slide design.” 🙄 (I’ll be the judge of what’s considered stunning, Claude. Calm the fuck down.)

Here’s the final result, my “stunning” title slide (insert RuPaul’s Drag Race shade death rattle):

The addition of three pieces of clip art in the upper right corner of the slide, plus a few more bubbles/dots, is all I got. So, yes, this is, once again, a complete fail. I will probably still use this as a basic slide design, but obviously I will be locating and using my own images to illustrate it. This is now the second new tool in a week (first Claude Cowork and now the Claude PowerPoint add-in) which has utterly failed at the tasks I gave it. I am not impressed.


Third, and finally (thank God), I had much better luck issuing all three general-purpose GenAI tools the exact same text prompt, a technique I had used before here (and one which I found very useful in comparing and contrasting the responses):

I am writing a keynote presentation on the mistakes companies make when creating, designing, and marketing the following product category: virtual worlds, social VR/AR, and metaverse platforms in general. Please give me a list of failed or shut down metaverse platforms, along with reasons why they might have failed. Please cite both academic and industry sources of information in your answer.

In all cases, I used the latest models as specified in Ethan Mollick’s latest AI Guide (see the code sketch after this list if you’re curious how such a comparison could be scripted):

  • ChatGPT’s GPT 5.2 Thinking with the Extended Thinking option;
  • Claude Opus 4.2 Extended Thinking with the Research option; and
  • Gemini 3 Thinking with the Deep Research option.
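As an aside: if you wanted to re-run this sort of head-to-head test regularly, you could script it against each company’s official Python SDK rather than pasting the prompt into three separate chat windows. Here is a minimal sketch, with some loudly flagged assumptions: the model names below are placeholders (substitute whatever the current models are), and the chat-interface options I used (Extended Thinking, Research, Deep Research) are not all exposed in the same way through the APIs:

```python
# compare_genai.py -- send the identical prompt to ChatGPT, Claude, and Gemini.
# The model names are placeholders; the SDK calls themselves are real.
# pip install openai anthropic google-genai
import anthropic
from google import genai
from openai import OpenAI

PROMPT = """I am writing a keynote presentation on the mistakes companies make
when creating, designing, and marketing the following product category:
virtual worlds, social VR/AR, and metaverse platforms in general. Please give
me a list of failed or shut down metaverse platforms, along with reasons why
they might have failed. Please cite both academic and industry sources of
information in your answer."""

def ask_chatgpt(prompt: str) -> str:
    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    response = client.chat.completions.create(
        model="gpt-4o",  # placeholder model name
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

def ask_claude(prompt: str) -> str:
    client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY
    response = client.messages.create(
        model="claude-sonnet-4-20250514",  # placeholder model name
        max_tokens=4096,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.content[0].text

def ask_gemini(prompt: str) -> str:
    client = genai.Client()  # reads GEMINI_API_KEY
    response = client.models.generate_content(
        model="gemini-2.0-flash",  # placeholder model name
        contents=prompt,
    )
    return response.text

if __name__ == "__main__":
    for name, ask in [("ChatGPT", ask_chatgpt), ("Claude", ask_claude),
                      ("Gemini", ask_gemini)]:
        print(f"=== {name} ===\n{ask(PROMPT)}\n")
```

The advantage of scripting it this way is that all three tools get exactly the same wording, and you keep a record of each run to compare over time.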

Unlike the last comparison, I’m not going to go into great detail on the results (because I will be using some of these results, once they are double-checked against more authoritative sources, in an actual keynote presentation I will be delivering later this month). Instead, I will give my general overall impression of each report (and all three did provide a detailed report with citations).

Please note that I deliberately left it up to the specific GenAI tool to define what “failed” or “shut down” means, how far back and how thoroughly to search for failed platforms, and what metaverse platforms to include or exclude from its final report. As always, I find the differences between the reports to be an interesting way to compare and contrast the results, so below I will give some basic statistics:

| GenAI Tool | # Failed Platforms Listed | Time Range of Failed Platforms | # Citations in Final Report |
|------------|---------------------------|--------------------------------|-----------------------------|
| ChatGPT    | 15 | 2003 to 2026 | 23 |
| Claude     | 13 | (start dates not given) to 2023 / “effectively failed, still limping along” | 30 |
| Gemini     | 9  | 2009 to 2024 (some platforms had no timeline information given) | 33 |

While ChatGPT was the most thorough in listing failed metaverse platforms, and went the furthest back in time (including There.com, which launched back in 2003!), it also had the fewest citations, and most of them were historical, platform-related announcements (e.g. a 2020 announcement of the shutdown of the then-social-VR platform High Fidelity by its CEO) rather than peer-reviewed academic journal articles (although there were a couple of those, too). While Claude had more citations, a review of those showed mostly blogs and news websites, with fewer references to actual academic research papers (probably because much of that content is locked behind academic publisher paywalls, although there were still quite a few academic references to free sources such as ResearchGate and PubMed Central/PMC; see the Claude report image below for one section which did focus on academic sources). Of the three, Gemini’s 33 citations included the most resources which I would consider academic, from a good range of different publishers (as well as more informal websites). Interestingly, Gemini also included a list of resources which it looked at but chose not to include in the final report, something which neither ChatGPT nor Claude offered! I found that particularly valuable, in case something else caught my eye to follow up on. Gemini for the win here.

Gemini was also notable for the strong, overarching narrative structure of its report, something which I had also noticed in previous queries using this GenAI tool. Gemini has clearly been trained well in telling a cohesive story! However, Claude was also notable for listing, in a separate section of its report, what it called “cross-cutting failure themes” in its 13 examined metaverse failures (which is definitely a phrase I will be stealing for my final keynote presentation!). By comparison, the final report from ChatGPT, while thorough, was jargon-heavy, poorly formatted, and seemed to lack the final polish of its competitors. For example, there were three separate sections titled “failure themes and comparative analysis,” “theme-to-platform mapping” (?!??), and “top 10 failures by primary cause.” It was, in my opinion, the poorest of the three reports generated, just in terms of its sheer (lack of) organization and narrative. Again, Gemini for the win!

Gemini’s report had a strong, overarching narrative structure—something which I have noticed seems to be a particular strength of this GenAI tool, a sort of final overall polish to the text that ChatGPT, in particular, was lacking in its report (see below).
Claude’s report had a summary section titled “cross-cutting failure themes,” which I am definitely stealing for my keynote presentation!
Compared to the Gemini report, the ChatGPT report was jargon-heavy and poorly-formatted.

EXECUTIVE SUMMARY: So, here are my final thoughts.

  • It is getting harder and harder (in fact, almost a full-time job) to keep on top of what is fast becoming an arms race between the top three general-purpose generative AI tools (ChatGPT, Claude, Gemini), not to mention an ever-growing legion of more narrowly focused applications, which might be better at certain specific tasks, such as writing programming code or generating music.
  • While Claude seems to be good at putting new agentic (e.g. Claude Cowork) and add-in tools (Claude for PowerPoint) into the hands of its users first, my personal experience with these new tools has been very disappointing, even comically bad. However, Claude’s chatbot interface works well for generating detailed answers with citations (although slightly edged out by Gemini).
  • I am impressed by Gemini’s consistent ability to create a strong narrative structure within its generated reports, something in which ChatGPT in particular is noticeably lacking. It also came first in a key metric: actual citations to academic literature, not just freely-accessible websites (blogs and news articles).
  • If I were forced to rank the three GenAI tools by just this one head-to-head-to-head comparison (i.e. the third part of my blogpost), I would rank them as follows:
    • 1st: Google Gemini.
    • 2nd: Anthropic Claude.
    • 3rd: OpenAI ChatGPT.
  • Again, when these GenAI tools work, they work well (sometimes very well!), but when they fail, they fail spectacularly. Which, in my mind, is another reason why it is good to put these tools to the test regularly, and use them in real-life situations, so that you can learn what they are good and bad at!