For 50,000 years, artistic expression has been unique to mankind.
Today, this hallmark of humanity is claimed by another.
These images, generated by A.I., offer a glimpse into a future with unfathomable creative possibilities.
What will the next 50,000 years bring?
(BINARY DREAMS: How A.I. Sees the Universe)
This quote comes from an imaginative 3-minute YouTube video by melodysheep, illustrated with images created using Midjourney, one of many new AI-based systems that can create realistic images and art from a text description in natural language.
Such computer systems have surprised most observers with how rapidly they are evolving and learning, taking on tasks that were formerly thought to be the exclusive domain of humans. They have sparked curiosity, creativity, and, in some cases, dread, along with much frustration among people who cannot yet get their hands on these tools! Some are still unavailable to the public (like Google’s Imagen), while others have long waiting lists (e.g. Midjourney, by an independent research lab, and OpenAI’s DALL-E 2).
As of July 1st, I am one of a little over 50,000 people who have been lucky enough to receive invitations to test out one of the leading text-to-image AI art tools, called DALL-E 2. DALL-E 2 is an initiative by OpenAI, an artificial intelligence research company funded by Microsoft and other companies. (Among OpenAI’s earlier offerings is GPT-3, an AI tool which uses deep learning to produce ever more human-like text.)
Over the past two weeks (since I got my invite via email on June 19th, 2022, and set up my account), I have been spending almost every day crafting and submitting text descriptions, and waiting for DALL-E 2 to spit back six result images. Each image in turn can be used as the basis to generate six variations, or if you wish, you can upload an image, erase part of its background, and then use it as a start for your creativity. Some people have uploaded famous works of art from throughout art history, to have DALL-E 2 expand the canvas beyond its original borders, a technique called “uncropping”.
Here’s one example of uncropping which somebody posted to the r/dalle2 community on Reddit, using the famous painting The Swing by French painter Jean-Honoré Fragonard. Here’s the original painting, and here’s the uncrop:
See the tiny coloured squares in the bottom-right corner of the image? Those are watermarks generated by DALL-E 2. You might be wondering if such images can be used for commercial purposes (advertising, album covers, etc.). The answer, from DALL-E 2’s detailed Content Policy, is:
As this is an experimental research platform, you may not use generated images for commercial purposes. For example:
• You may not license, sell, trade, or otherwise transact on these image generations in any form, including through related assets such as NFTs.
• You may not serve these image generations to others through a web application or through other means of third-parties initiating a request.
I have noticed that there are some kinds of images which DALL-E 2 seems to excel at. Among them is food photography. Check out these pictures, based on the following text prompt: “Food photography of delicious freshly fried chicken tenders with a side of honey mustard dipping sauce topped with green onion” (click on each thumbnail below to see it in greater detail).
You would be hard-pressed to find any difference between these AI-generated pictures and actual photographs taken by professional food photographers! As one person commented on Reddit, “Incredible. It really got this one. So many people are going to lose their jobs.”
You can also specify the brand of camera, shutter speed, style of photography, etc. in your text prompts. There are still many problem areas, but people have been able to create some amazing “photographs” and “movie stills”, as the following examples illustrate (text prompts are in the caption of each image):
Another popular pastime is bizarre juxtapositions, where text prompts combine unlikely subjects with various art styles, such as Star Wars stormtrooper recruitment in the style of Soviet-era propaganda posters:
Or, perhaps, some advertising for McDonald’s new Minecraft Hamburger?
As you may have noticed, one area where DALL-E 2 fails (often quite humorously!) is in text captions. It’s smart enough to know that an advertisement needs some text along with the image, but it’s not bright enough to get the spelling right! (It’s become a bit of an inside joke within the DALL-E 2 subreddit.)
So, how have I been using DALL-E 2 over the past couple of weeks?
Well, I generated the following image using the text prompt: “Jesus at the Sermon on the Mount award-winning portrait by Annie Leibovitz dramatic lighting.” (The faces were messed up, so I used DALL-E 2’s built-in erase function to erase both faces, and then regenerated variations of the original image until I found one I quite liked.)
Inspired by another member of the r/dalle2 subreddit, I tried the following prompt:
Then, I tried my hand at several variations of the wording: “Human female face in a colorful galactic nebula detailed dreamlike digital art”, to get the following series (please click on each one to see it in a larger size):
(Adding the words “digital art” and “colorful” really makes a difference in the results!)
I also tried my hand at creating some improbable art! Here’s Jesus scrolling through Twitter on his iPhone, by Gustave Doré:
And the same subject as a print by Albrecht Dürer (interestingly, using the word “woodprint” gave me monochrome results, while just “print” threw up a few coloured prints!):
(I love how cranky Jesus is in the last image! He’s definitely gotten into an argument with a Twitter troll!!!)
Finally, I did the same subject as a stained-glass window:
I absolutely love how DALL-E 2 even tried to include some garbled text messages in a few of the resulting images it spit back at me!
Yesterday, I wanted to see how well DALL-E 2 could mimic an existing artist’s style, so I selected renowned French knife-painter Françoise Nielly (website; some examples of her work), who has a very distinctive, vibrant look to her oeuvre:
Here are some of the better results I was able to get after trying various prompts over the course of a couple of hours (interestingly, most of these portraits are of African faces, although I did not specify that in my text prompts!). Again, please click on each thumbnail to see the full image.
And, as I have with previous AI apps like WOMBO and Reface, I have also been feeding Second Life screen captures into DALL-E 2. Here’s an example of an uncrop of one of my favourite SL profile pictures, of my main male avatar Heath Homewood (note that among the many beta test restrictions imposed by OpenAI, you cannot upload photographs of celebrities or other human faces, but the stylized look of SL mesh avatars doesn’t trigger the system!):
Here are five results I got back, using the text prompt: “Man standing in a library holding a book very detailed stunning award-winning digital art trending on artstation” (click on each to see it in full size):
I had an image of Vanity Fair dressed in an Alice in Wonderland Queen of Hearts costume, where I erased the background of the screen capture, and tried out several different prompts, with some surprising results (I certainly wasn’t expecting a playing card!):
Here are some variations of the SL selfie of one of my alts, where I once again erased the background and expanded the canvas size using Photopea (I asked DALL-E 2 to fill in all the blank white space in this image for me):
Here are some results of variations of the following text prompt: “fairytale lake forest and mountains landscape by Albert Bierstadt and Ivan Shishkin and Henri Mauperché” (notice again the text failures, and also in some cases how DALL-E 2 “enhanced” the model’s original flower headdress!). Again, click through to see the full-size images.
So, as you can see, I am having fun! But I have also been pondering what this creative explosion within AI means for society as a whole.
I think that we are going to begin to see an accelerating wave, as these AI tools and apps improve, and start to encroach upon existing creative industries. The days of companies meticulously compiling and licensing stock photography are surely numbered, in an age when you can create photorealistic depictions of just about anything you can imagine. And I suspect that the food photography industry is in for an unexpected shake-up!
Many creative types have suggested that tools like DALL-E 2 will become a useful way to mock up design ideas, saving hours of work at the easel, behind the camera, or sitting in front of Photoshop. But others fear that many artists and photographers will someday be out of a job, and sooner than they anticipate, in the face of this AI onslaught. For example, why pay an artist to design wallpaper when you can create any sort of pleasing, repeating design yourself, matching specific colours on demand? And keep rerunning the prompts until you get a result you like, in a fraction of the time it would take a human artist to churn them out?
I don’t know how long the closed beta test of DALL-E 2 will run, or when and how OpenAI will start charging for the service; I suspect I will be writing more blog posts about this over time.
UPDATE July 5th, 2022: Laura Lane writes about DALL-E 2 in The New Yorker magazine, in an article titled DALL-E, Make Me Another Picasso, Please.
UPDATE July 10th, 2022: Photographer Thomas Voland has written a lengthy blog post about DALL-E 2, including over 100 generated images. The original is in Polish, but here is an English version via Google Translate. Well worth the read!