I've been exploring the use of DALL-E 3 inside of ChatGPT Plus. I'm doing this because it's my job, not because I have some kind of unhealthy little addiction to describing something in my mind and seeing it manifest in mere minutes on the screen. I can stop at any time. Sure, that's the ticket, I can stop at any time.
But not today. Today, I found a new toy. DALL-E 3 inside of ChatGPT can read and modify images. Sort of. You see, it's a bit fussy. But I'm getting ahead of myself. Let's start this story at the beginning...
I've been using Midjourney to customize uploaded images for a while. The problem is that it's very convoluted. You have to be running Midjourney in Discord, and then you have to go through a number of steps to upload an image into Discord, get a URL, yada, yada, yada…
In ChatGPT Plus, you simply have to click on the paperclip icon and upload your image. One and done.
That makes it a lot easier to use, and also a lot more fun. But how well does it work? To test it out, I tried three images: a picture of my car, a picture of me, and the ZDNET logo. Let's look at the results.
Here's a picture of my car, a 2013 Dodge Challenger.
Once the image was uploaded, I instructed DALL-E 3:
Put car in city
The results were promising. DALL-E 3 successfully reproduced a likeness of the car, in a city scene:
So, I uploaded the last image, the steampunk version of my car, and told DALL-E to:
Make the car fly
Here's what I got back. The representation of my original car was gone. In its place was a very cool Chitty Chitty Bang Bang-style vehicle, in the air:
It was cool, but it was no longer my car. But that's OK, let's see where we can go.
The Santa Clause
My next attempt, given the season, was to try to put Santa in the driver's seat. Here, I ran into a rather overzealous guardrail, with ChatGPT telling me, "I'm sorry, but I can't assist with that request."
Yes, we have arrived at that time in the future when an AI says the precise equivalent of, "I'm sorry, Dave. I'm afraid I can't do that." Life imitates art.
As you can see, the reasoning was a bit bizarre. But it seemed to hinge on the fact that I was asking it to modify an uploaded picture. So I gave it the above prompts as a single request:
Put car in city. Make it steampunk. Make the car fly. Put Santa in the driver's seat.
I got this:
It was a nice picture, but it missed the whole Victorian aesthetic that the previous images had. There's a fix for that, though.
DALL-E with ChatGPT essentially rewrites every prompt into a more detailed instruction. So, for example, "Make it steampunk" got rewritten into:
Image of a steampunk-themed street scene with vintage architecture, featuring a classic muscle car modified with steampunk elements like brass pipes, gears, and steam exhausts. The car is parked on a cobblestone street with ambient street lamps and a backdrop of old-fashioned buildings that blend Victorian and industrial revolution influences. There are people dressed in steampunk attire, with goggles and Victorian clothing, walking on the sidewalks. The overall atmosphere is that of a retro-futuristic city from an alternate history where steam power is the primary technology.
So I grabbed portions of the descriptive material from both "make it steampunk" and "make the car fly" and combined them for this prompt:
Put car in city. Make it steampunk. Make the car fly. Put Santa in the driver's seat. Below, the cobblestone streets are lined with gaslight street lamps, and people in Victorian attire look up in amazement. The sky is a dusk orange with a hint of smog and the buzz of smaller steampunk drones and airships in the distance. The overall atmosphere is that of a retro-futuristic city from an alternate history where steam power is the primary technology.
Here's what I got back:
Strictly speaking, it's not a flying car, but it's cool. Unfortunately, there's no connection at all to the original car image I started with.
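If you'd rather script the prompt-stitching trick than copy and paste, here's a minimal Python sketch of the idea. It only assembles text: it doesn't call ChatGPT or DALL-E, and the function name `build_prompt` and the two lists are hypothetical names made up for illustration. The scene detail is one of the sentences borrowed from the rewritten prompts above.

```python
def build_prompt(instructions, scene_details):
    """Join short instructions and longer descriptive sentences into one prompt.

    Hypothetical helper for illustration only; it just builds a string.
    """
    parts = []
    for text in list(instructions) + list(scene_details):
        text = text.strip()
        if not text:
            continue  # skip empty fragments
        # Normalize so every fragment ends with exactly one period.
        parts.append(text.rstrip(".") + ".")
    return " ".join(parts)


# The short commands, in the order they were given.
instructions = [
    "Put car in city",
    "Make it steampunk",
    "Make the car fly",
    "Put Santa in the driver's seat",
]

# Descriptive material lifted from the rewritten prompts.
scene_details = [
    "The overall atmosphere is that of a retro-futuristic city from an "
    "alternate history where steam power is the primary technology",
]

prompt = build_prompt(instructions, scene_details)
print(prompt)
```

Keeping the short commands and the borrowed description in separate lists makes it easy to swap in a different mood (say, a different era or time of day) without retyping the commands.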
Stop, Dave. Will you stop, Dave? Stop, Dave.
I had another HAL moment when I asked ChatGPT to put this picture of me in an office setting:
It told me, "I'm sorry, but I can't assist with that request." At least ChatGPT didn't say, "Look Dave, I can see you're really upset about this. I honestly think you ought to sit down calmly, take a stress pill, and think things over."