UPDATED! Dallying with DALL-E 2: My First Three Days Testing Out AI-Generated Art from Text Prompts (and Some Resulting Images!)

I know this post is off-topic, but I do hope you will indulge me! Today I checked my email and discovered that I have been among the first few lucky people to be accepted into the testing phase of DALL-E 2!

What is DALL-E 2? DALL·E 2 is a new AI system that can create realistic images and art from a description in natural language. Here’s a two-minute video that explains the concept:

Vox has released a 13-minute YouTube video that explains the concept behind DALL-E 2 and related AI-generated art systems in more detail:

DALL-E 2 is a significant step up from the original DALL-E system, promising more realistic and accurate images with four times greater resolution! It can combine artistic concepts, attributes, and styles, as well as make realistic edits to existing images. It can also create variations of an image based on the original.

UPDATE June 23rd: According to its creators, who participated in an AMA session on the r/dalle2 subreddit, DALL-E 2 was trained using roughly 650,000,000 images along with their captions. These images were either licensed or publicly available on the internet.

So today, on my first day using DALL-E 2, I decided to put it through its paces, and I discovered some of the strengths, and weaknesses, of this AI program from OpenAI.

First, I wanted to see what it could do with a selfie from Second Life of my main avatar, Vanity Fair.

I uploaded a picture and clicked on the Variations button, and it generated what looked like reasonable Second Life avatars with slight changes to the original, as if I had fiddled with the face sliders and tried on different wigs:

Then, I wanted to try erasing the background of the image, and using it with a text prompt: “Vanity Fair wearing a ballgown in a highly-realistic Regency Era ballroom with elegant dancers”.

Among the results I got back were these:

I love how it gave Vanity elf ears in the second picture! Then, I decided to erase the background from a shot of my main male SL avatar, Heath Homewood:

The text prompt I gave DALL-E 2 to fill in the erased area was “man in a highly detailed photograph of an elaborate steampunk landscape with airships and towers”. Here are five of the six results it spit back at me (please click on each image to see it in a larger size):

The backgrounds are all quite varied, and also quite intricate in some cases! I also noticed that the AI “augmented” Heath Homewood’s hair in some of the pictures, while it left it alone in others. Innnteresting…..

My next prompt, “smiling man wearing a virtual reality headset with a fantasy metaverse background very colourful and clean detailed advertising art”, also generated some astoundingly good results, any of which could easily be used in a magazine advertisement or article illustration! (Again, please click on the images to see them in full size.)

So, I continued. As my apartment patio looks out over a small forest known for its deer and rabbits, I decided to enter the same text prompt, “a lush green forest with deer and rabbits”, each time appending an artistic style to the text. For each prompt, I picked the best of the six pictures DALL-E 2 gave me back; the text prompt I used appears in the caption below each picture.

A lush green forest with deer and rabbits digital art
A lush green forest with deer and rabbits impressionist art
A lush green forest with deer and rabbits by Johannes Vermeer
A lush green forest with deer and rabbits by Salvador Dali
A lush green forest with deer and rabbits by Andy Warhol
A lush green forest with deer and rabbits in the style of Sunday on La Grande Jatte by Georges Seurat
A lush green forest with deer and rabbits in the style of Inuit art
A lush green forest with deer and rabbits by Piet Mondrian
A lush green forest with deer and rabbits as a Disney cartoon
A lush green forest with deer and rabbits as a medieval tapestry
A lush green forest with deer and rabbits synthwave
A lush green forest with deer and rabbits cyberpunk
A lush green forest with deer and rabbits kawaii anime style (this wasn’t what I was expecting, but it’s so beautiful, like an illustration from a children’s book!)
A lush green forest with deer and rabbits chibi cartoon style
A lush green forest with deer and rabbits horror movie film still high quality
A lush green forest with deer and rabbits ancient Egyptian carvings

While I am mightily impressed by these results, I did notice a few things. First, sometimes DALL-E 2 gave me a misshapen or mutated deer or rabbit, or even a mixture of a deer and a rabbit (and in one case, a deer merging into a tree!). Second, DALL-E 2 still seems to have a lot of trouble with faces, both of animals and of people (you can see this most clearly in the Disneyesque image above). In particular, you get terrible results when you put in the name of a real person, e.g. “Philip Rosedale wearing a crown and sitting on a throne in Second Life”, which gave some rather terrifying Frankenstein-looking versions of Philip that I will not share with you!

I did try “Strawberry Singh and Draxtor Despres dressed in Regency costumes in an episode of Bridgerton in Second Life”, and this is the best of the six results it spit back:

Strawberry Singh and Draxtor Despres dressed in Regency costumes in an episode of Bridgerton in Second Life

If you squint (a lot), you can just about make out the resemblances, but it’s very clear that presenting realistic human (or avatar!) faces is something DALL-E 2 is not really very good at yet. However, given how alarmingly quickly this technology has developed in a year (from DALL-E to DALL-E 2), the ability of AI-generated art to depict human faces realistically is probably not too far off…

However, the fact that you can already generate some amazing (if imperfect) art shows the power of the technology! This is AMAZING stuff.

But it also raises some rather unsettling questions. Will the realm of the professional human artist be supplanted by artificial intelligence? (More likely, tools like DALL-E 2 might be used as a prompt to inspire artists.) And, if so, what does that mean to other creative pursuits and jobs currently done by human beings? Will artists be out of a job, in much the same way as factory workers at Amazon are being replaced by robots?

Will we eventually have such realistic deep fake pictures and videos that they will be indistinguishable from unretouched shots filmed in real life? Are we going to reach the point where we can no longer distinguish what’s “real” from what’s AI-generated—or trust anything we see?

And how will all this impact the metaverse? (One metaverse platform, Sensorium Galaxy, is already experimenting with AI chatbots.)

So, like WOMBO and Reface (which I have written about previously on this blog), DALL-E 2 is equal parts diverting and discomforting. But one thing is certain: I do plan to keep plugging text prompts into DALL-E 2, just to get a glimpse of where we’re going in this brave new world!

UPDATE June 23rd, 2022: I’ve spent the past couple of days playing around with DALL-E 2 a bit more, and I have discovered that, with the right kind of text prompts, you can generate some astoundingly photorealistic human profiles! Here are a couple of examples:

Prompt: “show the entire head and shoulders in a face forward picture of a handsome blonde man with blue eyes and a strong chin award winning photography 35mm realistic realism”
Prompt: “stunning breathtaking head and shoulders portrait of a beautiful African woman golden hour lighting. brilliant use of light and bokeh. Canon 85mm”

It doesn’t have to be a human, either; how about a wood nymph with green hair?

Prompt: “stunning breathtaking photo of a wood nymph with green hair and elf ears in a hazy forest at dusk. Dark, moody, eerie lighting, brilliant use of glowing light and shadow. Sigma 85mm f/1.4”

I’ve also discovered you can combine two or more artistic styles in one result. Here are the six pictures DALL-E 2 spit back in response to the text prompt: “a cottage in a lush green forest with mountains in the background and a blue cloudy sky by Albert Bierstadt and Charles Victor Guilloux and Vilhelm Hammershøi” (please click on each picture to see it in a larger size):

However, there are also some prompts which fail miserably! For example, I tried to create an image using the text prompt: “steampunk gentleman in a top hat riding a penny farthing bicycle in a steampunk landscape with airships in the sky colorful digital art”. Here’s what I got back:

Here are four of those AI-generated pictures (click on each thumbnail to see a larger version):

It’s very clear that DALL-E 2 has no concept of what a penny farthing bicycle looks like! For your reference, here are the results of a Google image search for the vehicle in question:

I assume that DALL-E 2 will get better the more images it is fed (including, hopefully, images of penny farthing bicycles!).

My last prompt yesterday was “Vogue fashion models eating cheeseburgers at MacDonalds”.

Now, while the thumbnails may look good, most of these pictures are nightmare material when you look at them full-size: mismatched, misshapen eyes, wonky face shapes, etc. Really uncanny valley stuff. In thumbnail number six, you can also clearly see that several of the Vogue fashion models have more than two hands!

So, while DALL-E 2 is certainly capable of generating stunning results, it is far from a perfect tool. I don’t think that human artists and designers have to worry about losing their jobs just yet! 😉

I leave you with this thought-provoking half-hour YouTube video by an industrial designer and professor named John Mauriello who claims, “with recent advancements in Artificial Intelligence design tools, we are about to see the biggest creative and cultural explosion since the invention of electricity in the 1890s.”

P.S. With my blogposts about AI tools such as WOMBO, Reface, and now DALL-E 2, plus my coverage of AI implementations of NPCs in social VR platforms such as Sensorium Galaxy, I decided it was time to create a new blogpost category called Artificial Intelligence (please give me a bit of time to go back and add this category to older blogposts, thanks!).

South Korea Launches a Metaverse Alliance with Industry

Photo by Daniel Bernard on Unsplash

This morning I learned that the South Korean government has recently launched a metaverse alliance with over 200 businesses, including Samsung. The Korea Economic Daily reports:

Samsung Electronics Co. has onboarded South Korea’s government-led metaverse alliance in a move to strengthen partnership in virtual reality (VR), augmented reality (AR) and artificial intelligence (AI) sectors.  

According to the Ministry of Science and ICT (MSIT) on Aug. 3, a total of 202 firms are participating in the metaverse alliance with a number of new joiners that include Samsung. Other companies that have newly joined the alliance include Shinhan Bank, KB Kookmin Bank, SM Entertainment Co., MegaStudyEdu Co. and Cheil Worldwide.

The MSIT-led metaverse alliance kicked off on May 18 with 25 firms and institutions as participants. The initial members include the country’s three telecom giants SK Telecom Co., KT Corp. and LG Uplus, automaker Hyundai Motor Co., Naver affiliate Naver Labs Co. and Kakao affiliate Kakao Entertainment Inc. Other current members of the metaverse alliance include GIANTSTEP Inc. MAXST Co. and CJ ENM Co.

The main objective of the alliance is to establish the country’s metaverse ecosystem by growing the metaverse-related AR, VR and extended reality (XR) sectors. The alliance has also designated six key industries that will adopt more metaverse technologies moving forward: medicine, manufacturing, construction, education, retail and national defense.

It will be interesting to see what comes out of this alliance, and I wonder if other nations will follow suit (if they haven’t already).


Thank you to Vytek for the heads up!

Vanity Goes Vintage! (As Does Moesha!)

The Reface app on my iPhone just uploaded a whole whack of vintage photographs to play with, so I had some fun tonight! Here’s what Vanity Fair looks like in Second Life:

And here is what my Vanity looks like as a vintage model! Just click on any thumbnail to see it in full size:

Not to be left out of the fun is my Afro-Canadian model, Moesha Heartsong, who looks like this in her native Second Life (and who has been through the Reface app before here on my blog):

And here is Moesha in a variety of vintage poses! Once again, you can click on any thumbnail to pull up a full-sized version.

Of course, you can then feed the Reface-d image into WOMBO, for even more fun and genre-bending, history-defying hilarity!!!

My Milkshake
Hollaback Girl

Between WOMBO and Reface, I am having so much fun! It’s helping me stay sane and entertained whilst under pandemic lockdown here in Winnipeg.

Andrew “Boz” Bosworth and John Carmack Have a Discussion About Next-Generation Virtual Reality

When John Carmack and Andrew “Boz” Bosworth have a conversation, people tend to listen. Andrew is vice president in charge of augmented and virtual reality at Facebook, and of course John is the millionaire Chief Technology Officer of Oculus, who is currently working away on an Artificial General Intelligence project.

The two recently held a half-hour conversation on Twitter Spaces (Twitter’s version of the hot new drop-in audio app Clubhouse), which offered a fascinating glimpse into the heads of two key people who are driving Facebook’s move into virtual reality.

Right now [VR is] still largely an early adopters’ toy where a lot of people that have VR already have everything else, and we’re just adding some new spice, but we need to be a displacement device where we need to be something that somebody hard up for money decides “I’m going to buy a VR headset instead of a Chromebook or instead of a tablet.” And we need to do everything that those devices do. You know, we need to have similar app libraries. We need to be just as effective with keyboard and mouse. We need it to be something that you could put on your head and do the work that you need to do during a normal day.

—John Carmack

Anybody who uses what Philip Rosedale has pejoratively called a “marimba keyboard” (i.e. where you use a mallet-like device to awkwardly type on a virtual keyboard) can immediately relate to what John says here. Despite the many technical advances of the past five years, we are still not anywhere near the ease of use that is required for people to actually opt for a VR headset instead of a tablet!

Here’s the whole half-hour discussion, which I can highly recommend: