With AI image generation platforms like DALL-E, Midjourney, and Stable Diffusion flooding social feeds and media coverage, it feels like we’re just around the corner from AI rendering illustrators, graphic designers, and other artists obsolete. A disconcerting thought for Creatives like myself, especially given the technology’s seemingly exponential pace of improvement.
Having spent the past few weeks generating images in DALL-E 2 (all of the following images herein are AI-generated), I’ve found that the current state of AI art raises more questions than answers. So, rather than rattling off hot takes or predictions about the future of this technology, it’s crucial that we think through the questions we should be asking to contextualize the potential long-term benefits, shortcomings, and effects of AI art on our profession and the world at large. Let’s take a page out of Dr. Ian Malcolm’s book and ask some questions instead.
At a glance it feels like the answer is a resounding “no.” However, looking under the hood at how these AI platforms generate images, things become much less clear. While there are plenty of whitepapers and explainers that detail the process in depth, there are two discrete elements that might sway opinions.
Firstly, these platforms aren’t compositing photos that already exist. When you enter a prompt like “cat banana” into DALL-E 2, the program isn’t just mashing together pictures of cats and pictures of bananas from an image library; it’s using upwards of 500 different variables across images cataloged in its training data to find things that match “cat” and “banana.” This is not unlike how a human artist uses reference material before capturing a photograph or painting a picture, except the AI can reference millions of images whereas a human might only source a few dozen.
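As a loose illustration of that matching process, everything boils down to comparing feature vectors. To be clear, this is a toy sketch: the three-dimensional vectors and the labels below are invented for the example, whereas real systems learn hundreds of dimensions from training data.

```python
import math

def cosine(a, b):
    """Cosine similarity: how closely two feature vectors point
    in the same direction, regardless of their magnitude."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

# Hypothetical feature vectors -- real systems use hundreds of
# learned dimensions; three are used here purely for illustration.
prompt_vec   = [0.9, 0.8, 0.1]    # stand-in features for "cat banana"
cat_photo    = [0.95, 0.1, 0.05]  # image strong on "cat-ness"
banana_photo = [0.1, 0.9, 0.0]    # image strong on "banana-ness"
car_photo    = [0.05, 0.02, 0.9]  # unrelated image

# Score every cataloged image against the prompt's features.
scores = {name: cosine(prompt_vec, vec) for name, vec in
          [("cat", cat_photo), ("banana", banana_photo), ("car", car_photo)]}
```

Under this sketch, the cat and banana images score far higher against the prompt than the car image does, which is the sense in which the system “finds things that match” rather than pasting existing photos together.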
Secondly, the AI doesn’t take just one shot at creating the new image; it’s an iterative process. These AI-generated images start as random pixels in the software’s latent space, which are then re-rendered through a diffusion model thousands of times in a few seconds until the AI lands on a combination that best matches the concept of “cat banana.” This is a randomized process, so the same prompt will produce different output images each time it’s run through the generator. Is this kind of randomness any less creative than, say, Jackson Pollock’s serendipitous trial and error when discovering his own iconic style?
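That refine-from-noise loop can be sketched in miniature. This is emphatically not a real diffusion model; the step size, the shrinking noise schedule, and the “concept vector” are all made up to show the two properties described above: the image starts as pure noise and improves iteratively, and a different random seed yields a different result for the same concept.

```python
import math
import random

def toy_diffusion(target, steps=1000, seed=0):
    """Toy sketch of iterative denoising: start from random noise and
    repeatedly nudge the 'image' toward the target concept vector,
    shrinking the injected noise as the steps progress."""
    rng = random.Random(seed)
    # Start as random pixels in a pretend latent space.
    image = [rng.uniform(-1, 1) for _ in target]
    for step in range(steps):
        noise_scale = 1.0 - step / steps  # noise fades out over the run
        image = [
            px + 0.05 * (t - px) + noise_scale * rng.gauss(0, 0.01)
            for px, t in zip(image, target)
        ]
    return image

def distance(a, b):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

target = [0.2, -0.7, 0.5, 0.9]        # stand-in for the "cat banana" concept
out1 = toy_diffusion(target, seed=1)
out2 = toy_diffusion(target, seed=2)  # different seed, different output
```

After a thousand steps both outputs sit close to the target concept, yet they are not identical to each other, mirroring how the same prompt produces a different image on every run.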
So it’s clear that AI can replicate some of the same creative practices that are standard protocol for human artists; AI is even being recognized at film festivals and art competitions. However, let’s make sure we don’t overlook the definition of creativity itself. Creativity is inextricably linked to imagination and its ability to produce new and original ideas. While AI can certainly generate new images, it’s hard to say that they are original ideas. The software still needs human input in the form of a prompt to function, and it needs a database of human-created images to draw from for training. So is the AI exhibiting creativity, or is it just brute-force automation that’s “largely a cultural-mining operation with a clever assembly line on top”? Maybe so, but is this any different from the art of remixing and sampling that’s commonplace in music and other creative disciplines?
Let’s set aside the idea that AI may outright replace human artists and instead look at what cultural impact AI art could have if it simply becomes more commonplace. After all, the printing press didn’t kill oral storytelling (witness podcasts), and photography didn’t drive painting extinct, so the advent of AI art might just be another step in technology redefining how we as humans express our own creativity.
In the past, AI was promised to be mankind’s savior from monotony, taking over the dull, repetitive, or dangerous tasks and freeing people up to use the soft skills that software cannot replicate. However, we’re seeing instances where AI is actually having the opposite effect: creating more mundane and untenable work environments for people rather than liberating them to follow more fulfilling pursuits.
Perhaps these are just growing pains and we actually are headed toward a future where AI is just a means of production, fueled by that initial spark of a creative idea that only the human imagination can dream up. If that’s the case, then a field like prompt engineering not only becomes the profession of the future but also the primary way that people of all skill levels and backgrounds can express themselves creatively. A kind of democratization of creativity, similar to how the internet democratized information. To put it another way, the conductor doesn’t play an instrument; they play the orchestra.
While AI may unlock creativity and a new appreciation of art history for some people, we should also consider the possible negative side effects of this kind of frictionless experience in creating “new” art. After all, any AI is only as good as its ability to understand and negotiate adversarial inputs. That’s easy enough to wrap our heads around if we picture what could happen if a self-driving car mistook a stop sign for a green light, but what does this situation look like when applied to the creative space?
Illustrated examples like Loab show us how easily AI can take a swan dive into generating macabre, violent, and otherwise disturbing imagery, but it also has an uncanny ability to perpetuate a kind of regressive monoculture. While the AI in and of itself cannot exhibit bias, its training data can certainly have a similar net effect. This is why, for some image generators, a prompt like “doctor” mostly generates images of Caucasian men, while the prompt “nurse” more often generates images of BIPOC women. AI art isn’t just the output of the inherent or unrealized biases of the software’s developers; it’s also subject to the shortcomings of the collective unconscious of internet culture as a whole.
Relying too heavily on AI in our creative pursuits could lead us down a rabbit hole we never anticipated: a dumbing down of art, not unlike how Zipf’s Law points to a dumbing down of language, where words that are shorter and easier are used more often. Could AI art be the first step in art boiling down to some kind of Corporate Memphis on steroids, where creative risks and innovation are abandoned for the safety of the algorithm? AI might not know any better, but will we be wise enough to realize this and make the necessary improvements?
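The Zipf-style claim that frequent words tend to be short can be checked on even a throwaway snippet of text. The sample sentence below is arbitrary and chosen only to make the pattern visible; it is an illustration of the tendency, not a proof of it.

```python
from collections import Counter

# An arbitrary sample: common "glue" words repeat, rare words are long.
text = """the cat sat on the mat and the cat looked at the
extraordinarily complicated contraption on the windowsill"""

counts = Counter(text.split())

frequent = [w for w, c in counts.items() if c > 1]   # used more than once
rare     = [w for w, c in counts.items() if c == 1]  # used exactly once

def avg_length(words):
    """Average character length of a list of words."""
    return sum(len(w) for w in words) / len(words)
```

In this sample, the words used repeatedly (“the,” “cat,” “on”) average under three characters, while the one-off words (“extraordinarily,” “contraption”) average far more, which is the shorter-words-get-used-more pattern the argument leans on.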
The intersection of technology, culture, and the law always yields interesting tensions, and AI art is no different. It should be no surprise that legal precedents for AI art will likely develop far more slowly than the technology itself. These questions become especially important as AI art is bought and sold for commercial purposes.
The image databases used to train these AI platforms contain hundreds of thousands of works by past masters and multitudes of lesser-known artists. It doesn’t seem too far a stretch to say these artists are entitled to royalties for the use of their work, the ability to opt out of inclusion in the training data, or at the very least an attribution in the generated images. It seems easy enough to say that AI-generated art is not fair use, but who knows how that argument would be met in a court of law. What if Picasso’s adage “good artists borrow, great artists steal” becomes the law of the land?
It’s also interesting to consider what claims the prompt writer has to the images they create via AI. We’re already seeing marketplaces pop up where you can sell your own prompts, so it would appear there is some kind of legitimate intellectual ownership over a seemingly nonsensical phrase like “Tony the Tiger playing solitaire in the style of Vincent van Gogh.”
“100 years from now the idea is still going to be more important than all the technology in the world.”
These are the words of Bill Bernbach, one of the founders of the world-renowned agency DDB, and they are doubly prescient when we consider AI’s role in developing creative strategy and content, and how we can best leverage it as a tool to create rather than automate. AI’s ever-quickening development is owed to the endless race to improve efficiency, reduce costs, and thereby increase profit margins, but in this race to some fully automated future it’s easy to fall victim to the “disease of more,” a thought that’s especially troubling if we picture it afflicting mass culture instead of just a single individual or profession. So how do we avert this grim fate?
The answer lies in carefully considering when and where AI is best suited: tasks that are highly repetitive, where higher-level storytelling and brand building are secondary needs. High-volume, low-touch creative like social content or performance display ads could prove the perfect place to implement AI, saving hundreds of hours of unfulfilling human labor. After all, I’ve yet to meet a designer who aspires to a career of creating 10,000 banner ads.
The iterative nature of AI image generation can also lend itself to the ideation process. An AI generating thousands of mockup images would allow a skilled human creative to explore a vastly larger landscape of visual concepts instead of just a handful. Casting this wider conceptual net would inevitably surface creative directions that might otherwise never have existed, given the constraints of time and effort needed for a person to draft mockup artwork, even in rudimentary form. This frees up our Creatives’ mental bandwidth and time to hone and refine the most promising directions to align with a brand’s look, feel, and values.
That said, AI’s strengths turn to weaknesses when we consider higher-level brand building and storytelling. Whether it’s witty copy or thought-provoking visuals, the most compelling content is that which engages people’s emotions, and this is where today’s creative professionals have an advantage over AI. Emotion and imagination are uniquely human qualities; they are the fulcrum for telling impactful stories, and they are qualities that AI can’t reproduce on its own. While AI will be infinitely better at parsing datasets to create “art derived from the massive cultural archives we already inhabit,” it will struggle to analyze what that data means. This is why DALL-E 2 can easily render an image of a concrete subject like a “smiling person” but struggles to generate a cohesive image when given the prompt “happiness.”
Imparting context, insights, and meaning to information, whether it’s a prompt for an AI or a creative brief, is where people will always be more capable than machines. This means creative strategy driven by people will have an even more pivotal role in a future where AI creativity is commonplace. Whether it’s an irreverent TV spot or an epic novel, getting people to feel something is what motivates action, and it takes human insight to create truly emotional experiences. So while AI could easily crank out those 10,000 banner ads in mere moments, if brands want to create insightful, meaningful, and lasting connections with an audience, then AI would be a bad fit for the job.
CONCLUSION
It’s still not clear what future we’re heading toward here at the onset of AI-generated art, and that’s why it’s doubly important that we poke, prod, and question this technology as it evolves, grows, and becomes more commonplace. Perhaps another legendary creative director, Lee Clow, is right: “Most ideas are a bit scary, and if an idea isn’t scary, it’s not an idea at all.” But scary or not tells us nothing about whether AI art is a good idea or a bad one. Maybe it will usher in a new era of unprecedented creative freedom and expression, or maybe we’ll all just become IRL versions of the much-feared hovering Art Director.