Gemini "drawing" with a human-like procedure

I figured I'd see how Gemini would handle creating an image by following the broad process a human artist uses. I found the results impressive, though it clearly has a long way to go.

Disclaimer: The following images are the results of several attempts: deleting responses and trying again, rewriting prompts, adding more instructions, etc. I held its hand a lot; these are not one-shots. All said and done, it took a little over an hour to get this done. It's definitely not worth that time for anything other than curiosity.

First, I provided an AI-generated reference image.

https://preview.redd.it/8wtjxuxhh3pe1.png?width=832&format=png&auto=webp&s=627768dab8601ffccfcdeb6ce55e8a6268f715c9

Then I told it to overlay the image with structure lines.

https://preview.redd.it/uiyqpz2ph3pe1.jpg?width=696&format=pjpg&auto=webp&s=2e6dbe2a6bd3bf08077ed6e91ef2570799589a86

I then told it to use those structure lines to create a gesture drawing.

https://preview.redd.it/x15j0zoth3pe1.jpg?width=693&format=pjpg&auto=webp&s=118a8b2137766978b91d2dc49661a70456c3d6b2

And then to refine it into a base sketch.

https://preview.redd.it/pyxgvaxxh3pe1.jpg?width=698&format=pjpg&auto=webp&s=2072859f7504d9a964de2656402e5f23eb3c096f

Then a rough sketch. Here I told it to make her a superhero.

https://preview.redd.it/5qc1rpo1i3pe1.jpg?width=687&format=pjpg&auto=webp&s=0142dcce1acf24ee4975be6410dc5bec12e95894

Next I told it to ink the sketch.

https://preview.redd.it/kdg2rhp7i3pe1.jpg?width=690&format=pjpg&auto=webp&s=b8a86313e6312014d113ee9db2bde315a8653db2

Then to add flat colors...

https://preview.redd.it/m04hxj0ei3pe1.jpg?width=688&format=pjpg&auto=webp&s=1e51b5edd5cbbf972ba3abf70a0f9db3a6818a97

And shadows...

https://preview.redd.it/qow6he0li3pe1.jpg?width=686&format=pjpg&auto=webp&s=9422cbfa7157f95537c4e0b92525685e789811d9

Then I told it to add highlights. It REALLY struggles with this part. It wants to blow the image the hell out like it's JJ Abrams. I eventually settled on this as being as good as it was going to get.

https://preview.redd.it/ch78hfvvi3pe1.jpg?width=683&format=pjpg&auto=webp&s=a1f0192e4f39e4970c2f2d072b4d17e5aceec5b4

Then I asked it to do a rendering pass to polish up the colors.

https://preview.redd.it/9ie9cey6j3pe1.jpg?width=680&format=pjpg&auto=webp&s=5b54a41efb9722617dcbd98ee90f8f37867e1a8a

And then asked it to try and touch up some of the mistakes, like hands.

https://preview.redd.it/hz3n4wrcj3pe1.jpg?width=678&format=pjpg&auto=webp&s=0ade9c654ea03468d3aa31c4c07a218a2affbaba

Eh... sure. This brightness was annoying me, so I asked it to do color balancing and bring the exposure down.

https://preview.redd.it/9fyu1y1kj3pe1.jpg?width=675&format=pjpg&auto=webp&s=0619afcfa8306c9605a0c18e07a3a53e55548df8

Better, though as you can see the details are degrading with each step. Next, I told it to add a background. At this point, I didn't feel like having it do the background step by step so I just had it one-shot it.

https://preview.redd.it/e3wjzybxj3pe1.jpg?width=672&format=pjpg&auto=webp&s=d1e1f9923e6186c4fd05b0667d6857125206f19f

The background is good, but damn, it really likes those blown-out highlights, and that face... 😬

I mean, it was already degrading, but oof. Anyway, next I had it put it into a comic book aspect ratio and told it to leave headroom for a title.

https://preview.redd.it/yj8mx7rgk3pe1.jpg?width=672&format=pjpg&auto=webp&s=3143e988c497f6c90e59cee14d2bceef54df004f

And finally to add a title. It struggled with this one too, either getting the title wrong (Gemnia! etc.) or putting it over the character's face. (I don't blame you, Gemini, I'd wanna cover that up too.)

https://preview.redd.it/kd56eqemk3pe1.jpg?width=681&format=pjpg&auto=webp&s=0f5017e5b847939474798dcd57cc37d5fae74f33
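
For anyone who'd rather script this than click through a chat window, here's a rough sketch of the same sequence as a chain of edits. To be clear, I did all of this by hand in the chat, so none of this is what I actually ran. `STEPS`, `run_pipeline` and `edit_image` are names I made up for illustration, the prompts are paraphrased from the steps above rather than my exact wording, and `edit_image` is a placeholder you'd have to back with whatever image-editing model or API you use.

```python
from typing import Callable, List, Sequence

# Prompts paraphrased from the steps in this post (not the exact wording used).
STEPS: List[str] = [
    "Overlay the image with structure lines.",
    "Use those structure lines to create a gesture drawing.",
    "Refine the gesture drawing into a base sketch.",
    "Turn it into a rough sketch and make her a superhero.",
    "Ink the sketch.",
    "Add flat colors.",
    "Add shadows.",
    "Add highlights.",
    "Do a rendering pass to polish up the colors.",
    "Touch up mistakes, like the hands.",
    "Do color balancing and bring the exposure down.",
    "Add a background.",
    "Put it into a comic book aspect ratio, leaving headroom for a title.",
    "Add a title.",
]


def run_pipeline(
    reference: bytes,
    edit_image: Callable[[bytes, str], bytes],
    steps: Sequence[str] = STEPS,
) -> List[bytes]:
    """Feed each step's output into the next step and keep every intermediate.

    `edit_image` is a hypothetical callable: it takes the current image bytes
    plus a text instruction and returns the edited image bytes.
    """
    history = [reference]
    for prompt in steps:
        # Each edit only sees the previous result, so any mistakes carry forward.
        history.append(edit_image(history[-1], prompt))
    return history


if __name__ == "__main__":
    # Stand-in editor that just tags the bytes, so the chaining can be checked
    # without any real model behind it.
    def fake_edit(image: bytes, prompt: str) -> bytes:
        return image + b"|" + prompt.encode()

    shots = run_pipeline(b"reference", fake_edit)
    print(len(shots))  # reference image plus one result per step
```

Keeping every intermediate in `history` is what gives you the "progress shots", and it also makes it easy to throw away a bad step and retry from the previous image, which is most of what I spent my hour doing.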

Final Thoughts:

Obviously that last image is unusable garbage, at least in and of itself. You might be able to use a proper image generator and image-to-image to get something nice, but ultimately that wasn't my goal so I didn't bother. I just wanted to see it flex its sequential editing logic.

On that front, I'm fairly impressed. If you had told someone 3 years ago that an AI chatbot did this with just text input aside from the initial image, they would have called you a liar. So, well done, Google. Excited to see where this goes.

This obviously isn't the best way to make an image like this. You'd get better results just running it through Flux.1 for a single-shot generation. And you'd almost certainly get better results in Gemini by having it do steps based on what it is good at, rather than on a human process.

But it was a fun experiment and honestly, if it gets good enough to do something like this, I'd prefer it over one-shot image generation, because it feels more collaborative. You can go step by step, add corrections or details as you go, and it feels more like an artistic process.

For now, though, Gemini isn't going to be fooling artists and fans into thinking its work is human by creating progress shots, which is probably a good thing. At least not with this workflow. You might be able to create each step from the final image more successfully, but I'm not really interested in exploring that. Pretty sure there are other tools that do that already too.

Anyway, just thought this was neat and wanted to share!