Entertainment & Elephants: Experiments with Image Generators

Images created by DALL•E (left) and Stable Diffusion (right)

I don’t know why I have a thing for elephants, but that’s what always comes to mind when I try out a new artificial intelligence image generator. You’d think that they’d be able to draw an image of an elephant on horseback, but so far—nada.

On the other hand, DALL•E (Openai.com/dall-e-2) did a good job of fulfilling my request that it draw an elephant seated at a table drinking wine. And the Stable Diffusion demo (Bit.ly/3h0k1aL) did okay when I asked it to draw Santa Claus sitting on an elephant while delivering Christmas presents.

In each case, though, these were my initial attempts. If there’s one crucial technique for getting good results from AI image or text generators, it’s that you need to learn how to refine your prompts and make them more specific. I might yet get my elephant riding on horseback.

DALL•E has perhaps made the biggest splash this year. When I first had Writesonic write about it in my September column, it was gradually being rolled out and wasn’t yet available to the general public. Even with that limited rollout, 1.5 million users were generating 2 million images a day. Then, at the end of September, it opened up to everyone.

What’s the appeal? It’s fun. But it’s also useful. Graphic designers, artists, illustrators, and even interior designers and architects can quickly create images that might otherwise take hours.

Artist Jason Allen’s submission to the “digitally manipulated photography” category at the Colorado State Fair won the first-place prize of $300. He created it using Midjourney (Midjourney.com). According to The Washington Post, “The portrait of three figures, dressed in flowing robes, staring out to a bright beyond, was so finely detailed the judges couldn’t tell.”

Well, there’s a bit of a gulf between that and a cartoonish rendition of an elephant sitting at a table drinking wine, but I’ll get better.

The three that I’ve mentioned so far (DALL•E, Midjourney, Stable Diffusion) seem to be leading the way. And they all work similarly: their developers feed an AI neural network millions of images, and it uses “deep learning” to assimilate the images and recognize their content. A recent AI breakthrough, called diffusion models, trains the AI by gradually corrupting those images with random noise. It then generates novel images by reversing that process, starting from pure noise. (I don’t understand it either.)

Added to that is the ability to comprehend natural language and then redraw the assimilated images according to any request (except for an elephant on horseback).
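For the curious, here is a toy Python sketch of the "corrupt, then reverse" idea behind diffusion models. It is not the code that DALL•E, Midjourney, or Stable Diffusion actually run. The function names, the four-number "image," and the noise schedule are all assumptions made for illustration. A real system trains a neural network to guess the noise; this sketch cheats by handing the exact noise back, just to show that the corruption can be undone once the noise is known.

```python
import math
import random


def signal_kept(t, num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Fraction of the original image left after t noising steps.

    Uses a linear noise schedule; the specific numbers are assumed for
    illustration, not taken from any particular product.
    """
    kept = 1.0
    for step in range(t):
        beta = beta_start + (beta_end - beta_start) * step / (num_steps - 1)
        kept *= 1.0 - beta
    return kept


def add_noise(pixels, t, rng):
    """Forward process: blend the image with random Gaussian noise."""
    kept = signal_kept(t)
    noise = [rng.gauss(0.0, 1.0) for _ in pixels]
    noisy = [math.sqrt(kept) * p + math.sqrt(1.0 - kept) * n
             for p, n in zip(pixels, noise)]
    return noisy, noise


def remove_noise(noisy, noise_guess, t):
    """Reverse step: subtract the guessed noise and rescale.

    In a real diffusion model, noise_guess comes from a trained network.
    """
    kept = signal_kept(t)
    return [(x - math.sqrt(1.0 - kept) * n) / math.sqrt(kept)
            for x, n in zip(noisy, noise_guess)]


if __name__ == "__main__":
    rng = random.Random(0)
    image = [0.1, 0.5, 0.9, -0.3]          # a made-up four-pixel "image"
    noisy, noise = add_noise(image, 500, rng)
    restored = remove_noise(noisy, noise, 500)
    print("noisy:   ", [round(v, 3) for v in noisy])
    print("restored:", [round(v, 3) for v in restored])
```

After all 1,000 steps in this sketch, almost nothing of the original image remains, which is why generation can start from pure noise and work backward.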

Once DALL•E became available, others quickly jumped in to make their image generators available. Google announced a forthcoming service and Meta said that it would be introducing a service that creates five-second videos (MakeaVideo.studio).

Typically these services require you to register by entering information such as email and cell phone number. And they typically give you a certain number of free images and then let you purchase more for a small amount of money.

The exception as I write this is the Stable Diffusion demo, which doesn’t require registration and lets you generate images for free. Stable Diffusion also offers DreamStudio Beta (Beta.DreamStudio.ai), which generates images more quickly and has access to more tools.

The Stable Diffusion software that runs the demo and the DreamStudio Beta is actually “open source” and freely available, which may be part of the reason there’s currently an explosion of image generators available online.

As I write this, Midjourney has the most users. Oddly, to use it, you need to join a Discord chat server and type in your prompt in a chat post. It took me a half hour to figure out how it works, and it entailed reading the Quick Start Guide on the Midjourney website. I signed up on the Midjourney website and then went to their Discord server (Discord.gg/midjourney) to type in my prompts. I was impressed with the features that let you quickly create variations of the images it presents and then get a high-resolution version of the one you like best.

How much do they cost after you use up their free offer? DALL•E offers 50 free credits to new users and then 15 free per month after that. Each credit can generate three to four images. You can purchase 115 additional credits for $15. DreamStudio gives new users 200 free credits. The default setting is one image per credit, but depending on factors such as resolution, it can be as many as 28 credits per image. I couldn’t figure out their fee structure after that. It’s seemingly $10 for some number of credits.

Midjourney costs $10 per month for around 200 images and $30 per month for unlimited personal use. As mentioned, the Stable Diffusion demo is free.
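If you want to compare these prices per image, a few lines of Python will do the arithmetic. This is only a rough sketch using the figures quoted above; the function name is my own, and the Midjourney number assumes exactly 200 images for the $10 plan, which is only approximate.

```python
def cost_per_image(price_dollars, credits, images_per_credit):
    """Price of one image in dollars, given a credit bundle."""
    return price_dollars / (credits * images_per_credit)


# DALL-E: 115 credits for $15, three to four images per credit.
dalle_best = cost_per_image(15, 115, 4)
dalle_worst = cost_per_image(15, 115, 3)

# Midjourney basic plan: $10 for around 200 images (treated as 200 here).
midjourney = cost_per_image(10, 200, 1)

if __name__ == "__main__":
    print(f"DALL-E: {dalle_best * 100:.1f} to {dalle_worst * 100:.1f} cents per image")
    print(f"Midjourney: {midjourney * 100:.1f} cents per image")
```

That works out to roughly 3 to 4 cents per image for DALL•E's paid credits and about 5 cents per image on Midjourney's $10 plan.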

Other image generators include Photosonic (Photosonic.writesonic.com) and Jasper Art (Jasper.ai/art).

I hope you have some fun with this.

Find column archives at JimKarpen.com.