15 July, 2022

cry to the angels and let them swallow you

So, I've developed something of a thing for artificial intelligence and machine learning. There are a lot of different ways to do it, and it feels like I've read articles on all of them. But one that's growing in popularity is image generation. I started playing on Deep Dream, then, a few years later, moved to DALL-E mini (which is now Craiyon, https://www.craiyon.com/), then to NightCafe Studio, and finally I was accepted to the Midjourney open beta (that last link will open a Discord prompt to join their Discord server, which is how they enable the image generation system through a bot function).

Let's take an example I found through Midjourney. The search prompt was "beautiful woman modeling a stunning silk floral kimono outfit, in the style of Alexis Franklin and Tom Bagshaw, no blur". So this is what that turned up for Midjourney:

Midjourney set of four women in kimono

Great detail, wonderful differentiation, little fuzzy on the hat, and in the fourth image, the hand looks somewhat detached, but otherwise, this could be four different portraits hanging in a gallery somewhere.

Meanwhile, the same prompt through NightCafe brings us, well...this:

NightCafe Studio's version of a woman in kimono

Great pattern, style very similar to a kimono, black sticks for arms, headless, but decent.

And that brings us to Craiyon, where I ran the same prompt. Midjourney nets the user four to six detailed images, depending on the detail of the prompt, which the requestor can then reiterate, or evolve. NightCafe nets the user only one image, but generally extremely detailed, which can then be evolved. Craiyon does nine quick "sketches", and it's abysmal at rendering faces. These were the six that made the most visual sense:

Craiyon's woman in kimono, variant 1

Definite kimono, it's the right cut, the pattern's good, the pose is good. And at least with a side profile, it doesn't look so completely like the face is distorted.

And for all of these, I didn't chop off the heads, they came that way.

Craiyon's woman in kimono, variant 3

Remember what I said about their AI engine not quite..."getting"...faces? This is an excellent example. You can tell that it IS a "face", but...not much beyond that.

Craiyon's woman in kimono, variant 4

The third only works, I think, because the head is cut off...

Craiyon's woman in kimono, variant 5

The fourth is not a kimono, but it's a great pattern, and it's a pretty perfect attempt at a sari. Face is still very odd.

Craiyon's woman in kimono, variant 8

This one's not as bad as some of the others, but it's still not the best face.

Craiyon's woman in kimono, variant 7

And this one could have walked straight out of a J-horror film.

There are some ethical concerns with the technology, to be fair--this article lays them out fairly well for Google's Imagen generator--but it's one step further for machine iteration toward the goal of independent consciousness. And that is still fairly exciting to me.

Ultimately, we'll just have to realize we're moving into the era of post-reality, and adjust as we can.
