Shocking! – Absolutely Baching

This website tries to be image-rich, and finding sufficient royalty-free imagery for it can be a bit of a challenge at times (I'm too cheap to pay actual royalties!). So, for fits and giggles, I thought I'd experiment with a bit of ChatGPT artificial image creation.

So here is a very famous and very genuine photograph of Ralph Vaughan Williams with his cat, Foxy:

I love that photo, because the genuine kindness of the man oozes out without any difficulty or holding back at all. And there's a cat in it, of course, which is just icing on the cake 🙂 It's just a shame that the image is in a particularly grotty, grainy, blurred and over-contrasted state. If only it could be, er, 'improved' somewhat to bring it up to modern standards!

With that in mind, I asked ChatGPT to 'create me an imagined image of the composer Ralph Vaughan Williams with a cat on his lap', and it came up with this:

Now, the cat's all wrong, because I failed to mention in the prompt I gave to ChatGPT that the cat needed to be a tabby, not a black one. But I feel the representation of RVW himself is just about perfect: he may be a little more neatly dressed than he ever was in reality, but the face is pretty darn'd good, I'd say! Frankly, I was a bit shocked to see how good the artificial image generation capability was. I have read that one of the common complaints about ChatGPT was that, 'it can't really do hands properly'. The results here, however, suggest that maybe that's a problem that has been largely overcome.

I then uploaded the original (genuine) photograph directly to ChatGPT and asked it to re-imagine it as if it had been a black and white photograph taken recently, on modern equipment:

Pretty good, though the cat looks fake as hell! However, at least the system worked out that we ought to be using a tabby cat without me having to tell it. I still think the earlier artificial image (which was created simply from a text prompt of 'imagine RVW with a cat on his lap') captures RVW's face far more accurately than this one, even though this one was based on an actual photo. That struck me as a little odd. Instead of gentle kindness and a hint of a smile, here we have a fake-RVW looking like he's just been woken up from an afternoon nap and is regretting it very much! The results are, nevertheless, still surprisingly decent.

One final experiment, then: "Can you imagine a 1024x1024 photograph of the composer Benjamin Britten looking out over the canals of Venice as he composes something on his lap":

...which is really shockingly accurate, too (though it needs to learn that music staves contain five lines, not two as would appear to be the case here)! So, I simply typed in: "Can you do that again, this time remembering that music staffs consist of five lines, not two. And can there be some actual music on the music paper?" and got this result:

...which is hilariously bad. Interesting that having specified five lines per staff, AI decided to give me six. And as for the 'actual music'... well, it looks like something he might have composed aged 2, if he'd been capable of writing at a 90° angle whilst staring off into the middle distance! I think this demonstrates that "AI" is very artificial and not at all intelligent!

I had one final go (well, you are supposed to 'chat' with ChatGPT, right?!) with the new prompt, "I did specify *five* lines per music staff, not six. And a human cannot write music that's rotated 90° to one's line of sight like that. Can you try to improve please". And got this:

...which is worse, if anything, only because the head and body proportions are now all wrong. He also appears to be morphing into a grumpy Richard Nixon! You'll note, too, that the music is still rotated and still written on 6-line staves (note, I try to remember to talk American to ChatGPT, since it's largely an American-based technology: apparently, Americans talk of 'staff' where I'd normally use 'stave').

I found this quite a bit with ChatGPT: you have to be very precise in your first prompt (specifying the cat should be a tabby one, for example) because if you ever try to refine that initial prompt, the images tend to become worse rather than better. Additionally, it was fairly common for very explicit requests ('five-line staff') to be completely ignored, which rapidly became frustrating! Nevertheless, I can see plenty of mischief capable of being wrought by this technology.

Anyway, I have to say I'm quite beguiled by it as an amusement and, possibly, an occasionally useful tool, despite its tendency to produce completely bonkers results in response to some prompts and to turn an utterly deaf ear to others. Indeed, I was so impressed with some of the results shown here that I got it to 'imagine' the images that make up my new 'genre icons' (see previous blog post), thereby getting it pretty immediately to prove its worth.
