Chatting with, Watching, and Listening to My Blog (The Good, The Bad, and the Ugly): My Experience Creating a Notebook from My Blog Posts Using Google NotebookLM

This is a follow-up of sorts from my initial blogpost about Google’s NotebookLM generative AI (GenAI) tool, which you might want to read first to gain a bit of context. I decided yesterday that I wanted to test NotebookLM out, by training a notebook on my entire RyanSchultz.com blog (from the very beginning to the previous post).

How I Got My Blogposts into NotebookLM

This part turned out to be easier than I anticipated it would be! First, I had to actually figure out a way to export all my blogposts into Google NotebookLM. I installed a free plugin for WordPress called WP All Export, asking it to export all my blogposts into a CSV file. Then, I asked Claude AI to convert the CSV file into a PDF format. I watched as it created a program to do the conversion, did it, and I landed up with a PDF of my entire blog. It took almost no time at all!
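For readers who want to try something similar, here is a minimal Python sketch of this kind of conversion. It is an illustration only, not the program Claude actually generated. The column names (Title, Date, Content) are assumptions about what the CSV export contains, and the sketch stops at plain text rather than PDF, since writing a PDF would need a third-party library.

```python
import csv
import html
import io
import re

TAG_RE = re.compile(r"<[^>]+>")

def strip_html(raw):
    """Remove HTML tags, decode entities, and collapse extra whitespace."""
    text = TAG_RE.sub(" ", raw)
    text = html.unescape(text)
    return re.sub(r"\s+", " ", text).strip()

def csv_to_text(csv_file):
    """Turn an exported CSV of blog posts into one plain-text document.

    Assumes (hypothetically) that the export has Title, Date, and Content columns.
    """
    reader = csv.DictReader(csv_file)
    sections = []
    for row in reader:
        title = row.get("Title", "").strip()
        date = row.get("Date", "").strip()
        body = strip_html(row.get("Content", ""))
        sections.append(f"{title}\n{date}\n\n{body}")
    # Separate posts with a visible divider so they stay distinguishable.
    return "\n\n----\n\n".join(sections)

# Made-up sample row, standing in for a real export file.
sample = io.StringIO(
    "Title,Date,Content\n"
    'Hello World,2020-01-01,"<p>My <b>first</b> post &amp; more.</p>"\n'
)
print(csv_to_text(sample))
```

With a real export, you would open the CSV file (with newline="" and the right encoding) and pass the file object to csv_to_text instead of the in-memory sample.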

Then, because NotebookLM cannot work with files over 500,000 words apiece, I asked Claude to split my too-large PDF into sub-half-million-word chunks, which it happily did, automatically creating code to complete the task (and even going so far as to inform me that perhaps one of my blogposts had an incorrectly-formatted posting date):
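The splitting step can be sketched in a few lines of Python. Again, this is a rough illustration under stated assumptions, not the code Claude wrote: posts are assumed to be dictionaries with hypothetical "date" and "text" keys, words are counted by splitting on whitespace (which may not match how NotebookLM counts them), and dates are assumed to be in YYYY-MM-DD format.

```python
from datetime import datetime

MAX_WORDS = 500_000  # the per-file limit mentioned above

def split_posts(posts, max_words=MAX_WORDS):
    """Group posts into chunks whose total word count stays within max_words.

    Posts are never cut in half; a single post longer than max_words
    ends up alone in its own (oversized) chunk.
    """
    chunks, current, count = [], [], 0
    for post in posts:
        words = len(post["text"].split())
        # Start a new chunk if adding this post would exceed the limit.
        if current and count + words > max_words:
            chunks.append(current)
            current, count = [], 0
        current.append(post)
        count += words
    if current:
        chunks.append(current)
    return chunks

def has_valid_date(post, fmt="%Y-%m-%d"):
    """Return True if the post's date string parses in the assumed format."""
    try:
        datetime.strptime(post["date"], fmt)
        return True
    except ValueError:
        return False

# Tiny made-up example, with a limit of 5 words so the split is visible.
posts = [
    {"date": "2020-01-01", "text": "one two three"},
    {"date": "2020-01-02", "text": "four five"},
    {"date": "01/03/2020", "text": "six seven eight nine"},
]
chunks = split_posts(posts, max_words=5)
print([len(c) for c in chunks])                             # [2, 1]
print([p["date"] for p in posts if not has_valid_date(p)])  # ['01/03/2020']
```

Splitting on post boundaries, rather than at an exact word count, keeps every blogpost whole inside a single file, at the cost of chunks that are somewhat uneven in size.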

Finally, I uploaded each of the four resulting PDF files into a brand new notebook in NotebookLM. It seemed to take a while to train the language model. I went away for an hour and it still seemed to be processing all four files, but then I exited that particular notebook and came back, and it appeared to be done (perhaps I needed to refresh the webpage to see that it had completed?).

Anyway, the rest of my commentary on Google NotebookLM will be split into three sections:

  • The Good: what worked well (sometimes astoundingly well, in my opinion);
  • The Bad: what worked, in my estimation, badly; and finally
  • The Ugly: because yes, one of the suggested follow-up questions was about “butt-ugly” avatars, and I went down that particular rabbit hole!

The Good

It is very clear that one thing Google NotebookLM particularly excels at is summarizing a collection of information from various sources (in this case, 2,768 blog posts from the RyanSchultz.com blog). In fact, I marveled at just how good NotebookLM was at giving a forest-level view of these particular 2,768 trees! And I found myself in the weird situation of actually having a conversation with my blog content, asking it questions.

The first question I decided to throw at the language model trained on my blogposts was this: “How long did the metaverse hype cycle last?”

Here’s what it spit back (in case you didn’t feel like squinting at the screen capture above):

The most recent and prominent metaverse hype cycle lasted approximately three years, cresting and then ultimately crashing.

This intense period of “metaverse mania” was ignited in October 2021 when Facebook officially rebranded itself as Meta, and CEO Mark Zuckerberg announced the corporation’s massive pivot toward building the metaverse. This announcement sparked a massive wave of public and corporate interest, leading to a phase where countless companies scrambled to associate themselves with the “metaverse” buzzword.

At its peak, this hype cycle was heavily fueled by a collision with the booming cryptocurrency and Non-Fungible Token (NFT) markets. Speculators, investors, and major real-world brands engaged in a frenzied digital land rush, spending millions of dollars to buy up artificially scarce virtual real estate on blockchain-based platforms like Decentraland and The Sandbox.

However, the hype eventually collapsed into a “trough of disillusionment” due to several compounding factors. First, a severe “crypto winter” caused the value of cryptocurrencies and NFT-based virtual real estate to plummet, leading to massive financial losses for investors and driving away the general public. Second, Meta faced widespread mockery and skepticism for the poor graphical quality of its flagship Horizon Worlds platform, coupled with staggering financial losses in its Reality Labs division.

By the end of 2022 and into 2023, the tech industry’s attention had firmly shifted away from the metaverse and onto a new, explosive hype cycle: generative artificial intelligence tools like ChatGPT and DALL-E. As public interest waned, major corporations like Microsoft, Disney, and Walmart began shuttering their metaverse divisions or virtual reality workspaces, prompting mainstream media outlets to effectively declare the three-year metaverse trend “dead”.

I mean, that’s actually a pretty good summary answer. Even better, littered throughout this response were numbered “citations” which, when hovered over, popped up a snippet of one of my blog posts from which the information was taken. Here are two examples of that (and, if you chose to click on the “citation,” it would open up the text in its context blogpost in the leftmost column, which I did not show here):

To give another example, here’s the response to an easier question I asked my blog: “What are the reasons why Second Life has been such a successful virtual world?”:

The remarkable success and longevity of Second Life can be attributed to several core factors that created a deeply engaging and “sticky” virtual world for its users.

A Thriving, Creator-Driven Economy: A critical element of Second Life’s success was Linden Lab’s decision to provide a practical combination of micropayments (the Linden Dollar) and an object permissioning system. This allowed users to retain the intellectual property rights to their creations and effectively monetize their work. Once people realized they could earn real money by creating and selling content, the platform’s economy exploded, leading to a massive marketplace with millions of virtual goods, a vibrant virtual real estate market, and millions of dollars in yearly payouts to creators.

Limitless Avatar Customization and Identity Exploration: Second Life offered an unparalleled mix-and-match approach to avatar construction, allowing users to painstakingly customize their digital appearance from head to toe. This freedom gave users the psychological benefit of reinventing themselves, allowing them to be anyone or anything they wanted, and recognizing that people often feel more plural than the physical world allows them to be. Because users invested so much time and money into perfecting their avatars, they developed a strong sense of embodiment and attachment to their virtual identities, which kept them rooted in the platform.

Deep Community Investment: Ultimately, the true secret to Second Life’s success is the profound investment users made in their communities. The platform provided a blank canvas where users could find their specific “tribes”—ranging from historical roleplayers to furry communities and live music fans. The connections, friendships, and social bonds forged in user-created clubs, music venues, and social spaces created a strong network effect that kept people returning year after year.

Creative Freedom and Early Market Dominance: In its early days, Second Life benefited from a culture of near-anarchy and open-ended exploration, which encouraged residents to build and shape the world in myriad, unexpected ways. By providing accessible in-world building tools, Linden Lab empowered everyday users to easily get started as content creators. Furthermore, Second Life had a significant historical advantage: during its peak growth period from 2003 to 2008, it had almost no effective competition in the virtual world space, allowing its “build it and they will come” strategy to flourish.

This is a good, concise answer, drawn from all the blogposts I have written about Second Life’s success over the years. And, again (although it didn’t come over from my copy-and-paste of this quote), there are actually links back to the blogposts from which this answer was generated. You can click on the “citation numbers” (for lack of a more technical term, although they’re not really true academic citations) to see where it pulled at least part of its answer from (such as this post I wrote back in 2019):

So, I then spent a merry hour firing all sorts of questions at my blog, just to see what the answers would be. And, to be honest, I was fairly impressed at how well it could put together coherent paragraphs neatly synthesizing various posts from my blog into a concise answer. In my view, this is where Google NotebookLM really shines. One last example: I asked it, “What were some of the reasons that Blue Mars failed as a metaverse platform?” Here’s the response, again compiled from various sources on my blog:

Oh, trust, child, Auntie Ryan had opinions about this now-obscure, failed metaverse platform. And again, there are links to all those blogged opinions!

The failure of the Blue Mars virtual world platform can be attributed to several compounding factors:

  • Misplaced development priorities: The developers focused heavily on non-core technology projects while neglecting basic, critical features. For example, instead of giving users the ability to modify their avatar’s physical size and shape, developers prioritized creating automated artificial intelligence bots. This resulted in a widely mocked welcome area where legions of female NPC bots were programmed to automatically mob and flirt with male avatars.
  • Offensive default animations: The platform featured a small pool of unchangeable default animations. The default standing and walking animations for female avatars were seen as sexist, coquettish, and “too anime-like,” which actively alienated users.
  • Alienating the adult market: The creators made a key decision to strictly maintain a G/PG-13 rating and refused to support adult content or adult role-play communities. Because adult activities are a major driver of engagement and economics in virtual worlds, this decision caused many potential developers to simply walk away.
  • Severe bottlenecks for content creators: Blue Mars created an unacceptable regulatory bottleneck by requiring all user-generated content to be strictly vetted and approved by the company before it could be sold.
  • Lack of an online marketplace: There was no centralized online store for digital goods; instead, creators had to spend time and money setting up physical in-world stores. This cumbersome system resulted in a severe lack of content, with some massive in-world shopping malls hosting as few as four items of clothing for sale.
  • Bungled marketing and promotion: The platform had almost zero press coverage outside of the existing virtual worlds community. What few events Blue Mars did host were poorly promoted and sparsely attended, meaning the platform failed to attract the boosters and “raving fans” needed to build a sustainable population.
I always laugh when I see this old picture from Blue Mars. The tone-deafness of this automated bot “welcoming” system was staggering. Obviously, someone at Blue Mars thought that this sort of work was a high priority! No wonder the platform folded.

I could go on (I asked my blog a lot more questions), but you get the drift. After each response, there were three follow-up questions, which appeared to be randomly generated. One of them stopped me in my tracks:

How did Ryan’s subconscious use Luba’s song to change his habits?

Excuse me?!?? Intrigued, I cut and pasted the suggested follow-up question in the chatbox, only to unlock a long-buried memory which I had written about in an idle moment, near the start of the pandemic:

Ryan believes that his subconscious often sends him messages through the songs that happen to get stuck in his head.

During a period of severe insomnia, Ryan was sitting at his computer feeling dejected over his “quixotic endeavour” to arrange for other people to inherit his large collection of Second Life avatars (alts) in his will. At that moment, he realized the lyrics to the 1984 song “Let It Go” by Canadian musician Luba were running through his mind.

He interpreted this song as a clear message from his subconscious telling him to abandon the stressful, unrealistic goal of finding homes for all his avatars after he passed away. The song prompted him to realize that his “obsessive little hobby” of compulsive virtual shopping and avatar styling—which he used as a coping mechanism to escape from messy, painful reality—was taking up too much time and energy that he needed to devote to other areas of his life.

Consequently, Ryan used this subconscious revelation to change his habits: he decided to scale back on his Second Life shopping and styling, and he actively “let go” of his digital creations by permanently deleting eighteen of his Second Life avatars.

Oh, wow. Wow. I had completely, utterly forgotten how I was sometimes deep in the bargaining stage of Elisabeth Kübler-Ross’ five stages of grief during the COVID-19 pandemic that engulfed the world. It was only after discussing my behaviour with my psychiatrist (online, at a time of social distancing) that I realized I was trying to find some small way to control a situation over which I had no control, hence my desire to leave my SL avatars to other people via my will (and actually, my brother does have a list of who gets what avatars, as part of the will I did draw up during the pandemic).


Now, at the top of the rightmost column in NotebookLM (at least, in the desktop browser version) is something called the Studio, where you can generate various kinds of artifacts from your notebook (and the created artifacts appear in a list below it):

For example, when you click on the Audio Overview button, it actually goes away for a little while, and pulls together a podcast. Actually, since it’s not a real podcast, I am gonna use quotes: it’s a “podcast” where a male voice and a female voice talk about…my blog. Yes, I am not kidding! You do have some options available; I chose the Deep Dive of the four options, and asked to have the resulting audio “podcast” focus only on the metaverse topics in my blog (and not talk so much about, for example, the pandemic, which I also wrote a lot about, especially when it started in 2020).

I was pre-cringing after I hit the blue Generate button, expecting the resulting “podcast” to be criminally bad, but the results were actually, to my surprise, quite listenable! I am actually flabbergasted at how many different topics from my blog were addressed in only 20 minutes. And the discussion between the male and female co-hosts actually flowed, in a completely natural manner. I was stunned.

Now, I did detect a few mistakes (e.g. the pronunciation of the first name “Moesha,” one of my Second Life avatars) and a few hallucinations, notably, that you could create a red, avatar-wearable high-heeled shoe (i.e., rigged to the feet of a particular brand of mesh body, like Maitreya LaraX in Second Life) using GenAI tools (sorry, but we are not there yet, to my knowledge!). But overall, I was absolutely astounded by how much this “podcast” DID. NOT. SUCK. 🤯

I was able to export my “podcast” in an audio format called .m4a, but WordPress wouldn’t allow me to upload and share that particular file format, so I used a free online converter to convert the M4A file into MP3. Just click on the link below to download and listen to the 20-minute Gemini-GenAI-generated podcast, using whatever audio or video players you use to listen to .mp3 files:

🤯🤯🤯🤯🤯🤯🤯🤯🤯🤯 Honestly, this one feature alone has just completely left me gobsmacked! My flabber is gasted!! 🤯🤯🤯🤯🤯🤯🤯🤯🤯🤯 Somehow, I just never imagined, even in my wildest dreams, that I would live long enough to be able to convert my entire blog into a listenable, 20-minute podcast, where two co-hosts casually and naturally discuss a number of topics that I had written about over the past eight years! 🤯🤯🤯🤯🤯🤯🤯🤯🤯🤯 If this is an indication of where GenAI is going, then the future is going to be deeply, deeply weird, folks. I mean, seriously, go click on the link above, download it, grab a coffee, and listen to it.

Then, I decided to try the video creation tool. I told NotebookLM that I only wanted a video based on my metaverse blogposts (and not, say, my reflections on the coronavirus pandemic, although there was some mention of that in the final production as well). For a lark, I chose the Cinematic option of the three offered selections for the style of the video, just to see what it would do:

Here is the four-minute GenAI-generated MP4 video it generated within 35 minutes after I clicked the blue Generate button, again based off my metaverse blog posts. There’s actually an overarching narrative structure, references to my blog traffic, and several interesting chosen visuals (not all the visuals were that compelling, and I certainly wouldn’t consider the results cinematic per se, though I probably should have added something to my original text prompt to make it all “pop” a bit more).

Honestly, it’s not that bad; again, it was better than I expected. Check it out:

Tumbleweeds to Tornado to Tumbleweeds: Exposing the Metaverse

Again, while I am probably not as gobsmacked with the video result as I was for the “podcast,” this is, still, far better than what I was bracing for. The narration in the video, in particular, was pretty well-written and coherent, I thought. Is it better than what an experienced video editor could create? Hell no. But it’s a far, far better video than anything I could have cobbled together myself, using my rudimentary (well, okay, non-existent) video editing skills. Again, rather impressive.

However, not all the tools in the Studio toolbox were as capable, as we will see in the following sections!


The Bad

Okay, now let’s get to the stuff NotebookLM sucked at, in my opinion. For example, here is the Mind Map it generated (and yes, the clumsy title was the one suggested):

The Mind Map generated was seriously underwhelming.

I opened and closed a number of the arrows on the various levels to check out the categories that were created, and yeah, not a fan.

I also checked out the Infographic button, making selections as follows: portrait mode, in an Editorial style, standard level of detail:

Here’s the result:

Now, there are several things wrong about this infographic. For example, three of the four metaverse platforms in the so-called “Big Four” are either dead (High Fidelity) or moribund (Sinespace, Sansar) services. Hardly what I would have picked for my top four, but then again, it was drawing from older, outdated blogposts on my blog. Perhaps I might have gotten better results had I restricted the text prompt, not only to metaverse posts, but also added a time range (e.g. “Only use blogposts from the past 12 months.”)
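An alternative to relying on a text prompt for the time range would be to filter the export itself before uploading, so that older posts never reach the notebook in the first place. This is a different technique from the prompt-based restriction suggested above, and it has not been tested against NotebookLM. A minimal Python sketch, assuming each post is a dictionary with hypothetical "title" and "date" keys:

```python
from datetime import date, timedelta

def recent_posts(posts, today, days=365):
    """Keep only the posts published within the last `days` days."""
    cutoff = today - timedelta(days=days)
    return [p for p in posts if p["date"] >= cutoff]

# Made-up example data.
posts = [
    {"title": "Old platform review", "date": date(2018, 5, 1)},
    {"title": "Recent headset notes", "date": date(2025, 3, 15)},
]
kept = recent_posts(posts, today=date(2025, 6, 1))
print([p["title"] for p in kept])  # ['Recent headset notes']
```

The trade-off is that a filtered notebook can no longer answer questions about the older posts, so it may make sense to keep a separate, unfiltered notebook for the full archive.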

Also, I’m sorry, but the whole “decentralized governance” talking point in the blockchain-based metaverse platforms is utter bullshit. (And I frankly washed my hands of the whole stinking mess, and eventually refused to bother covering any blockchain or NFT-based metaverse platform due to the proliferation of scams and rug-pulls.) So, yeah, this was also a fail (although I did kinda like the style of it).


The (Butt-)Ugly

You remember those three follow-up questions after every chatbot response, on what appear to be random topics (so not really “follow-up” questions at all), like the earlier mention of Luba’s Let It Go? Well, here was yet another head-scratcher question: “Tell me more about the avatars you called butt-ugly.”

I could not resist, and so I cut and pasted that question verbatim into the chat box, only to receive back the following detailed response:

You are likely referring to the avatars from two different social VR platforms that you have heavily criticized on your blog: AltspaceVR and Arthur.

AltspaceVR You originally described the early avatars in AltspaceVR as “dreadfully cartoony,” “terribly unappealing,” and “butt-ugly”. You reasoned that this was a deliberate design compromise by the developers to keep the avatars low-polygon so they could render quickly and easily across a massive range of hardware, including less powerful mobile headsets like Google Daydream and Samsung Gear VR.

Fortunately, AltspaceVR eventually released a major avatar update in July 2020 that allowed for much greater customization of hair, eyes, and skin colors. While you noted they still lacked arms and legs (likely to avoid inverse kinematics issues), you celebrated the update for finally banishing the “low-poly, dreadfully cartoony avatars” you had previously disliked.

Arthur You also used the term to describe the avatars in Arthur, a virtual office and remote collaboration space, declaring them “among the butt-ugliest avatars I have seen in any social VR platform or virtual world”. To avoid the difficult task of rigging the avatars for facial expressions, Arthur’s developers made several off-putting design choices:

  • The Eyes: The avatars wear black, Matrix-style sunglasses to avoid having to rig eye movements.
  • The Mouth: The avatars wear wraparound black microphones that cover the mouth so closely they look as if they are being gagged, completely hiding the lack of lip-syncing.
  • The Hands: You exist only from the waist up, and the avatar’s arms simply fade out into floating controllers instead of actual hands.

While Arthur’s founder claimed this design was “better than looking at dead eyes,” you strongly disagreed, calling them “creepy-looking, budget-basement, cookie-cutter avatars”. You noted that you would rather show up to a virtual meeting with messy hair and a stained shirt on a Zoom webcam than be represented by Arthur’s highly restrictive, blank-faced avatars.

I searched my blog to find a picture of Arthur’s avatars at that time:

An example avatar from Arthur

Ladies and gentlemen of the jury, I rest my case. (And, once again, something I had ranted about on my blog, and then completely forgotten about, until NotebookLM dredged it up again. By the way, Arthur still seems to be around, but I haven’t bothered to go back and check to see if they have replaced these butt-ugly avatars.)


So, although I received a rather mixed bag of results, I still found that the exercise of feeding my blog into NotebookLM actually had some concrete benefits, in my opinion. For example, it would have taken me hours of scrolling and compiling work to summarize my experience with the Apple Vision Pro from my blogposts, whereas all I needed to do was ask the NotebookLM chatbot, “What does Ryan think about the Apple Vision Pro?”

Here’s the full, unedited response:

Ryan Schultz is an overwhelming fan of the Apple Vision Pro (AVP), frequently describing the technology as “magic”. Despite its incredibly high price tag (which cost him roughly CA$7,700 for the 1-terabyte model, accessories, and taxes), he considers it “worth every single penny” and states he cannot imagine living and working without it.

Here is a breakdown of Ryan’s specific thoughts and experiences with the device:

A Leap Forward in Technology: Ryan notes that the visual clarity of the AVP is so stunning that it completely eliminates the blurry “screen door effect” common in older virtual reality headsets. He famously noted that the Apple Vision Pro makes every other VR headset he has ever used feel like a “cheap plastic View-Master toy”. He was deeply moved by the device’s 360-degree immersive “Environments,” such as Mount Hood and Haleakalā, stating that they felt so real they almost brought him to tears. He considers these Environments to be one of the device’s “killer apps”.

A Vital Productivity and Pain Management Tool: Ryan uses the headset for hours every day, primarily as a work productivity device. He considers the Mac Virtual Display to be the absolute ultimate killer feature of the AVP. By connecting it to his MacBook Pro, he can work on a massive, crystal-clear virtual screen suspended in a relaxing virtual forest or beach.

Crucially, the headset has become a pain management tool for Ryan. Suffering from deteriorating cervical joints in his neck, the Virtual Display allows him to look straight ahead at an ergonomic height rather than hunching over a laptop, providing significant pain relief and allowing him to work comfortably for four to six hours a day.

Flaws and Frustrations Despite his evangelism for the device, Ryan has documented several frustrations with the Apple Vision Pro:

  • The Purchasing Process: Because Ryan wears glasses (which cannot fit inside the AVP), he had to order magnetic Zeiss prescription lenses. A nightmare with UPS shipping delays left him with an unusable headset for weeks, making him so angry he nearly returned the entire device before he even got to use it.
  • “Crackgate”: After just seven weeks of meticulously careful use, his unit developed a spontaneous half-centimeter crack right down the middle of the front glass, a known defect he found deeply upsetting for such an expensive product.
  • Physical Discomfort: During his first few months of use, he experienced eye strain after about two hours of wear. He mitigated this by switching to a thicker light seal cushion, and the problem was entirely eliminated when he later upgraded to a newer AVP model with an M5 processor and a more comfortable dual-loop knit headband.
  • Limitations: He finds the side-to-side field of view slightly constrained (especially when watching spatial videos), and he notes that trying to share the device to give demonstrations to friends using “Guest Mode” is highly frustrating and clunky.

Ultimately, Ryan’s experience with the Apple Vision Pro cemented his belief in the future of spatial computing. He uses it heavily for media consumption, socializing in apps like InSpaze (where 3D spatial Personas “blew his mind”), and daily office work. He has declared that he is “all in” and plans to be first in line to purchase whatever Apple spatial computing device comes next.

I mean, come on. This is an accurate, concise response, stitched together from countless blogposts I have written about the Apple Vision Pro over the past year and a half. And it took me only a couple of minutes to generate! I’m sold.

The real power of Google NotebookLM is its ability to take a large amount of sheer data, and use it to create a personal language model that you can then use to ask questions of it, generate artifacts like flashcards, etc. In particular, it excels at summarizing the results in its answers, and I must confess that alone is worth the sometimes-high price of admission (i.e. fiddling with your data to get it into the service, like I did with my blogpost export and conversion, as described in the first section of this blog post above). While some of the Studio tools provided less-than-great results, others (like the “podcast”) simply blew my mind. And, as AI writer Ethan Mollick has stated, assume that this is the worst version of NotebookLM you will ever use. These generative AI tools are improving and evolving over time.

Welcome to the future…it’s gonna be deeply weird, folks. And I can certainly recommend that you give Google NotebookLM a try (there’s a free version available, which you can play around with yourself).

UPDATED! Generative Artificial Intelligence Tools for Academic Research: AI Research Assistants, AI-Powered Document Analysis Tools, and A Look at Elicit, Undermind, and NotebookLM

I freely admit that this was not the next blogpost I was planning to write, but as a follow-up to my previous detailed discussion of what I have started to call the “Big Three” of good (sometimes, good enough) general-purpose generative AI (GenAI) tools—ChatGPT, Claude, and Gemini—I wanted to write a little bit more about two particular subsets of GenAI tools which are focused on the academic research process. And, since I have two things coming up on my calendar which necessitate academic research, that is:

I figured, well, what better time to demonstrate some of these GenAI tools than with some real-world, real-life examples from my own use of some of these new tools?

Some Definitions

Will new generative AI tools change how academic research is done? Photo by Dan Dimmock on Unsplash

These two categories of tools are:

  • AI research assistants: tools specifically designed to help researchers search, discover, synthesize, and analyze academic and scientific literature. Each of them uses large language models (LLMs, i.e. GenAI) combined with scholarly databases (e.g. PubMed for medicine; AGRICOLA for agriculture), to help users find relevant papers, extract key findings, and synthesize evidence across studies. Examples of such tools are Elicit, Undermind, Consensus, and Assistant by Scite. Keep in mind such tools are only as good as the scholarly databases they access! For example, while Consensus proudly announces partnerships with major academic publishers like Sage, Wiley, Taylor & Francis and ACS on their front page, Elicit only seems to use freely-accessible sources like Semantic Scholar and OpenAlex, as you will see below.
  • AI-powered document analysis tools: While AI research assistants search across published scholarly literature, GenAI-powered personal library/document analysis tools are built around the concept of “source-grounding” — you upload your own documents (e.g. PDFs of journal articles and conference papers, word processor documents, websites, YouTube videos, audio files, etc.) and then the GenAI tool works exclusively from those materials. They’re intended to help researchers make sense of a lot of information. The best-known of this relatively new category of GenAI tools is Google’s NotebookLM, but there are other products similar to it: Nouswise, and the open source tool Open Notebook.

To summarize the difference between the two: AI research assistants (Elicit, Consensus, etc.) help you discover literature, while AI-powered document analysis tools (NotebookLM, etc.) help you analyze and synthesize literature you’ve already collected. They occupy different stages of the research workflow.


Undermind

I currently have a Pro account with Undermind, at US$16 per month, which is one step up from their limited-use, free service. My initial question to Undermind was as follows:

I am researching the topic of the metaverse, both older virtual worlds (e.g. Second Life) and newer social VR/AR platforms (e.g. VRChat). I am interested in the history of the concept of the metaverse, and how the meaning of the term “metaverse” has evolved over time.

Undermind took this initial question, and asked a series of follow-up questions in order to clarify what I was looking for. Here’s part of that chat:

Eventually, I was able to come up with a more specific search, as follows:

The final question I sent Undermind off to work on was as follows:

Find academic literature on the history of platforms and user practices associated with what is now discussed as the metaverse, staying broad across decades. Focus on the history of virtual world platforms and how people used them, including older virtual worlds such as Second Life and newer social VR/AR platforms such as VRChat, while also including adjacent predecessor platforms that predate the coining and later popularity of the term “metaverse.” Emphasize user practices broadly rather than narrowing to a single type of practice, and help trace how the meaning of the term “metaverse” evolved over time in relation to these platforms and practices.

My search results were 80 papers which Undermind determined were relevant to my final question, covering a publication date range of 1970 to 2024:

Note, at the bottom of this screen capture, how Undermind actually went through and sorted these papers into eight broad categories or subtopics, in essence giving me a nice overview of these 80 published academic papers. This kind of context/overview work is something at which GenAI tools tend to excel, and it can save an academic researcher hours of work (but, of course, you still have to be the human in the loop, and actually read and digest all the papers retrieved!).

But even more important to note is how GenAI tools like Undermind mark a dramatic change in information retrieval: a shift away from the sometimes-arduous task of using keyword searching, controlled thesaurus vocabulary, and Boolean logic to search traditional academic databases (e.g. PubMed and its MeSH or Medical Subject Headings), towards actually having a conversation with the search tool, starting with a plain English statement, and answering follow-up questions to clarify and refine that initial prompt into a final search question, then submitting it.
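For readers who have never built a search the old-school way, here is a rough Python sketch of what Boolean logic means in practice: synonyms for each concept are joined with OR, and the concepts are then joined with AND. The helper name and the search terms are my own illustrative assumptions, not a strategy I actually ran against any database, and real databases each have their own syntax quirks.

```python
def build_boolean_query(concept_groups):
    """Join synonyms with OR inside each concept, then join concepts with AND."""
    clauses = []
    for synonyms in concept_groups:
        # Quote multi-word phrases so they are searched as a single unit.
        quoted = [f'"{term}"' if " " in term else term for term in synonyms]
        clauses.append("(" + " OR ".join(quoted) + ")")
    return " AND ".join(clauses)


# Hypothetical concept groups for the metaverse history question.
concepts = [
    ["metaverse", "virtual world", "social VR"],
    ["history", "evolution"],
]
query = build_boolean_query(concepts)
print(query)
# (metaverse OR "virtual world" OR "social VR") AND (history OR evolution)
```

Compare that single line of operators and parentheses with the paragraph-long plain English question I gave Undermind, and you can see how different the two styles of searching are.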

If you like what you see (and I did), you can click on the Generate Report button to start a new process, which prompts you:

I’d like to write a report based on papers from the search “History of metaverse platforms and practices”.

Let’s briefly discuss the content before you start writing.

And again, Undermind asks a helpful series of clarifying questions to help you figure out what you want from a report on all this research data it dredged up:

The final report (which I could save as a PDF or markdown file, using one of three citation styles), looked like this:

The resulting report had 36 citations. However, unlike the Elicit report, the Undermind report did not have a section where it got into the nuts-and-bolts of what sources it used to discover the papers used in this report, nor the method by which it selected them. So, while the initial read looked good, it would take actually getting and reading the full text of the papers cited in this report to determine exactly how good it was.


Elicit

I also decided to spring for a Plus-level account on a tool similar to Undermind, called Elicit (again, one step up from a free, Basic account, which offers a more limited service).

Having already done the Undermind search mentioned in the previous section of this blogpost, I decided to use the final search statement as my starting point, plugging it verbatim into Elicit to see what would happen…

…only to discover that Elicit doesn’t consider that a very concise search question, at all! (Actually, I kind of agree here. But Undermind let me do it, anyway!) However, instead of asking a series of follow-up questions like Undermind did, Elicit offered a series of buttons which, when pressed, rewrote the question to be much more narrowly focused, for example:

So, I clicked on the offered “Temporal and conceptual scope” button, and edited it a bit to include specific examples of what I was talking about, and hit the green Send button, using the default settings of research papers, and asking for a general review. Elicit then asked me what level of detail I wanted in the answer (with the most detailed alternative greyed out unless I pony up more money for their Pro plan, one level up from my measly Plus plan):

I went with the Balanced report. However, I am not crazy about the limitations, especially when I could do a more traditional database search, using one of the over 650 databases offered by my university’s library service, without such petty limits as “the top 500 sources” (and, remember, that’s a ranking based on a newish GenAI computer algorithm, not keyword matching using a controlled thesaurus vocabulary and Boolean logic to construct a search strategy, the old-school way). Essentially, it’s a trade-off: a search using plain English language to start, with prompts to refine it, and a pre-limited number of sources examined—and with even more restrictions on the number of sources from which a comparative chart would be constructed (25). If you want more—and many users would want more—then you’ll have to pay extra for it.

However, for all of its limitations, the final report looked pretty good, at first. You can save a PDF version of the report, and you can even ask questions of it, via a chatbot interface (using the chat box located in the bottom-right corner of the screen capture below):

However, in doing a read of the PDF report, I was struck by several things:

  • Again, the hard limit of 25 papers from which data was extracted, which essentially makes Elicit useless to me at this level;
  • The fact that zero papers of the 500 selected were screened out by the selected criteria (see image below taken from the report: although, to be honest this technique probably would have worked much better for examining clinical research studies in, say, medicine, rather than looking for papers about metaverse platforms);
  • The search was performed against “over 138 million academic papers from the Elicit search engine, which includes all of Semantic Scholar and OpenAlex,” but again, my librarian mind kept thinking that there would be a lot of full-text content that was locked away behind academic publisher paywalls. And indeed, of the 25 sources picked for this report, only 15 sources had the full text of the article retrieved. For the other ten sources, Elicit likely relied only on the (freely-available) author-provided abstract. And indeed, many of these GenAI research tools tend to rely on scraping free sources such as Semantic Scholar and OpenAlex, rather than enter into potentially expensive agreements with academic publishers such as Elsevier and Wiley, which would give their users full access to content they own, and frankly, more complete data from which to write reports.

I actually came away from reading this report more disturbed by its limitations than I was impressed by any conclusions it was able to draw. Again, I hasten to add that Elicit would probably have performed much better if I had a real-world use case that fit it better (like a systematic review of clinical medical trials, for example). It might just be that my admittedly fuzzy subject area didn’t fit the way Elicit works, at all. And that’s fine.

However, what bothered me most was that somebody without my 30-plus years of academic library experience would run this report, read it, nod, and think that this was a good response. Even worse, an in-depth response. When, in fact, a more traditional search against a library database (perhaps executed with the expertise of a professional librarian) would give much better and more thorough search results.

Even worse, how many of those Elicit users would stop here, and run with this summary, and not actually go and read the full text of the 25 papers that were selected for the report, not to mention the countless papers NOT included? I would suspect that it’s more than a few. So yeah, this academic librarian does have some reservations about where all this is headed. However, I can also confess that the report did give me a few new ideas to think about, and some possible new directions to follow in my own academic research, which I might not have found otherwise.

UPDATE March 11th, 2026: I’ve since gone back to Elicit and realized that what I probably should have done first was just search for papers, instead of just asking it to generate a report (see the red arrow on the image above). I tried searching for papers using the question, “What are the most effective techniques for dealing with trolling, griefing, and harassment on metaverse platforms?”

The next image is a screen capture of the search results. It gave me ten research papers in a chart, with brief citation details, a GenAI-generated summary of each paper, and another GenAI-generated overview of all ten papers in a couple of paragraphs on the right-hand side of the page, with the option to chat with the papers (i.e., ask questions and get answers from the content of these research articles). There’s a button at the bottom of the chart which you can click on to load another ten papers to keep retrieving information, although the right-hand-side overall summary does not seem to update with new articles added to the chart.


NotebookLM

Now, I turn to NotebookLM, Google’s Language Model (the “LM” in the product name) which tries to do the same sort of thing to your personal research library that Google Gemini tries to do with—well, with an infinitely larger library of millions and millions of documents. The idea is the same, though: to feed a (much smaller) set of documents, audio, video, etc. into a service which then allows you to use a chatbot-type interface to ask it questions, and (hopefully) get some useful answers back. But, again, how useful NotebookLM will be to you depends entirely upon what you feed it! In computer science we have a saying, with the acronym GIGO: Garbage In, Garbage Out. If you fill NotebookLM with crappy sources, don’t be too surprised if you get crappy answers back!

I have a Google AI Pro plan, with 2 Terabytes of storage, which includes access to Google Gemini 3.1 Pro. This costs me CA$26.99 per month (approximately US$20), and frankly, I’m pretty sure I am not getting my money’s worth out of it. With that, my NotebookLM service is rated at the Pro level, which means I can have up to 500 notebooks, with each notebook having up to 300 sources. (NotebookLM Standard, the free service, lets you have up to 100 notebooks, with up to 50 sources each. You can compare the various levels of plans here.)

I have uploaded 103 documents (mostly PDFs of journal articles from my personal Zotero research library) into NotebookLM. Again, some of them are probably of lower quality than others, so the GIGO rule applies. For example, the notebook summary it seems to have automatically created veers alarmingly close to gobbledygook, and there’s even a mention of (gasp!) blockchain, and the audacity to name it as a “primary pillar necessary to facilitate real-time, multisensory interactions between users.” (WHAT THE ACTUAL FUCK?!?? Okay, I take it back, it is gobbledygook, a Frankenstein-like creation stitched together from bits and pieces of documents I had uploaded. I actually created this monster.)

There’s absolutely no explanation of how or why this summary was generated. In fact, I found the whole user interface of NotebookLM to be extremely confusing. I had to dig through the product’s Frequently Asked Questions list to find out why some things wouldn’t load: any uploaded file over 200MB in size, any source with over 500,000 words, and any copy-protected PDF files will not load, but you don’t get any sort of error message if you try. In my limited testing thus far, you get…no response.
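Since NotebookLM fails silently, one workaround is to check your sources yourself before uploading. Here is a minimal Python sketch of that idea, built around the two numeric limits from the FAQ (200MB per file, 500,000 words per source). The function name, the crude whitespace-based word count, and my reading of 1MB as 1024 × 1024 bytes are all my own assumptions, and the sketch makes no attempt to detect copy-protected PDFs.

```python
MAX_BYTES = 200 * 1024 * 1024  # 200MB limit (assumes 1MB = 1024 * 1024 bytes)
MAX_WORDS = 500_000            # per-source word limit


def check_source(size_bytes, text):
    """Return a list of reasons a source might silently fail to load."""
    problems = []
    if size_bytes > MAX_BYTES:
        problems.append("file is over 200MB")
    # Crude estimate: count whitespace-separated tokens as words.
    word_count = len(text.split())
    if word_count > MAX_WORDS:
        problems.append(f"source has {word_count:,} words (limit 500,000)")
    return problems


# Hypothetical example: a small file that is nonetheless far too wordy.
print(check_source(1_000_000, "word " * 600_000))
# ['source has 600,000 words (limit 500,000)']
```

An empty list means the source passes both checks. For a real file, you would get the size from the file system and the text from whatever PDF-to-text tool you trust, and NotebookLM may well count words differently than this sketch does.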

Even worse, this feels like a product that Google has just sorta dropped on us, with only the previously-mentioned FAQ and an email address for product support (yes, even for Pro users). I shouldn’t be surprised, I suppose. Just like I wouldn’t be surprised if Google is silently compiling notes on how people use NotebookLM*, or decides to yank it away, like so many other previous Google products and services.

*UPDATE March 11th, 2026: It turns out that I was wrong; apparently (according to a quote from Steven Johnson, a member of the NotebookLM product team, in a slide presentation I watched today by fellow librarian and GenAI expert Nicole Hennig) anything you upload to NotebookLM is only stored in the model’s short-term context memory, and it is not used in training the Gemini LLM used:

As an author, Johnson clarifies that no material uploaded to the model is used to train NotebookLM or Google Gemini; it’s only sent to the model’s context window, or “short-term memory.” Johnson explains that if you “have the right to use [the material] under copyright, you can use it inside of Notebook.” (source)

Honestly, I do need to spend some more time playing around with NotebookLM before I issue any final summary judgement on the product. In particular, I get the feeling that the GIGO rule really applies to NotebookLM! Google themselves, in their NotebookLM FAQ, state:

Sometimes NotebookLM can’t answer your question because of…

  • Information not in sources: NotebookLM answers questions based on the information provided in your uploaded sources. If the answer isn’t in the source material, it won’t provide a response.

I had a very interesting day playing with these GenAI tools, and I learned a few things. I’ll keep you posted on how things go!

Photo by Jaredd Craig on Unsplash