The Art of Prompt Engineering: BU Researchers Explore How Generative AI Impacts Human Creativity in Artistic Communities

Mar 18, 2024

Welcome to the world of co-creation.

By Katherine Gianni

In the evolving landscape of creative expression, the integration of generative artificial intelligence has sparked both excitement and apprehension within artistic communities. As concerns grow over the potential displacement of human creativity by AI, it has become imperative to explore how this technology intersects with and augments human innovation. In their latest SSRN paper, “Generative AI, Human Creativity, and Art,” Boston University researchers Dokyun “DK” Lee and Eric Zhou have done just that. Through an analysis of a dataset of over 4 million artworks from more than 50,000 unique users, the co-authors provide an examination of the Human-AI Co-Creative process, the concept of prompt engineering, and the emergence of “generative synesthesia.”

Professor Lee is a Kelli Questrom Associate Professor of Information Systems Management & Computing and Data Sciences in BU’s Questrom School of Business. He also runs the Business Insights through Text (BIT) Lab to study application domains such as content engineering and advertising, social media marketing, brand sentiment, technological innovation, and persuasion. Zhou is a third-year PhD candidate in information systems at Questrom and a member of the BIT Lab. His research interests include computational creativity, economics of unstructured data, and human-AI collaboration.

How does your research address concerns raised within artistic communities regarding the potential replacement of human creativity by generative AI?

The natural concern regarding generative AI is that it “automates” many of the knowledge-based and creative tasks that have traditionally been reserved for humans. What is unique about those types of tasks is that there are often latent talents that humans possess that enable certain individuals to excel at those particular types of tasks.

What we found in our research is that a knack for producing novel ideas, captured by the focal subject matter or the composition of various constructs, enables creatives to leverage text-to-image generative AI more effectively than their peers in producing meaningful artifacts. While text-to-image tools certainly have the potential to automate specific steps in a creative workflow, the ideas still originate from the artist. We should consider the benefit of AI as augmenting artists’ ability to explore and refine interesting ideas rather than automating humans’ creative process.

Which specific text-to-image generative AI tools were analyzed?

We primarily identified the adoption of the three mainstream text-to-image generative AI tools: DALL-E 2, Midjourney, and Stable Diffusion.

Art sharing social media platform. Photo courtesy of DK Lee.

Your findings indicate that these text-to-image generative AI tools enhance human creative productivity by 25%. How do you define “creative productivity” in this context?

Creative productivity can be defined as the volume of output that an artist produces. To elaborate on the period-over-period trends, we found that in the month that artists begin using text-to-image tools, their output increases by approximately 50% and spikes to 100% in the following month. While the average user sees an increase of 3–4 artworks in the adoption month and 7 in the following month, we also observe the presence of superusers who began publishing several hundred or even upwards of one thousand artworks when assisted by generative AI.

Could you explain the concept of “prompt engineering” and how it factors into the co-creative process between humans and text-to-image generative AI?

Prompt engineering can be defined as the creative practice of writing text inputs for text-to-image models. While seemingly simple on the surface, the art of writing a meaningful prompt comes from experimentation and understanding how the actual mechanics of the model work. For example, a model’s training data constitutes the known set of concepts that can be sampled via prompting. Thus, obscure or exceedingly novel ideas cannot always be represented by the model’s outputs through simple prompting. Prompt engineering is not the sole source of model guidance, as users can also provide reference images via techniques such as ControlNet, inpainting, and IP-Adapters, which allow them to exercise creative liberties in modifying or repurposing the contents of existing images.

The co-creative process with text-to-image models depends on the artist as the originator of the idea, which is captured in the prompt. Thus, there is no meaningful output without human ideation and expression. While the model can automate the visual execution of an idea, the artist is still responsible for making sense of model outputs, iterating on ideas, and manually refining outputs, essentially curating the final piece.

Can you define “generative synesthesia” and its significance?

Generative synesthesia captures the idea that humans and generative AI can possess complementary competencies that enable the discovery of new creative workflows that augment those competencies. It signals the potential for generative AI to be a source of human flourishing via augmentation rather than one of human stagnation via automation and displacement. Generative AI serves as a conduit to the vast repository of knowledge and ideas upon which the model has been trained, thereby enriching the user’s generation workflow — synesthesia of worldly senses enabled through AI.

Human-AI generative synesthesia. Photo courtesy of DK Lee.

Your paper uncovers that AI-assisted artists who produce more novel content ideas are evaluated more favorably by their peers. Could you elaborate on the implications of this finding and discuss potential reasons?

The most immediate implications are twofold. First, artists who excel at ideation but may lack the requisite skill to execute the visuals envisioned with a novel idea can still contribute their talents to the creative world when assisted by generative AI. Second, artists with the talent to produce novel visuals may benefit more from leveraging generative AI to explore new interesting ideas.

A potential reason for this is that text-to-image models are guided by the artists’ ideas in the prompting, refinement, and curation phases of the workflow, whereas the model itself handles the visual realization of the ideas. Thus, individuals with a proclivity for generating ideas have a skill that naturally complements what the text-to-image model offers. Skilled creatives exhibit a refined sensitivity to making sense of artifacts and their meaning, so the artworks that they ultimately publish are likely of greater value to their peers.

Were there any specific findings in your work that stood out as particularly unexpected or surprising?

Our research reveals an expansion in the creative content domain through the utilization of generative AI tools, though not without inefficiencies. Specifically, while artists employing these tools tend to produce content with recurring themes, instances of unique and innovative content have been observed. These instances contribute to the broadening of human creativity, surpassing the creative boundaries of those artists who do not use generative AI tools. We also found that the adoption of generative AI tools reduces the concentration of value capture, as measured by favorites per view on the platform.

Human-AI collaboration. Photo courtesy of DK Lee.

Your paper contributes to the emerging field of Human-AI Co-Creative systems. How do you envision this field evolving in the future, and what potential impact could it have on various industries beyond the art world?

The mainstream text-to-image models that were released in 2022 were the first to exhibit robust performance, so we are very much still in the nascent stages of what is possible. Even in the past one and a half years, we’ve seen many new models released like Stable Diffusion XL, Turbo, Cascade, DALL-E 3, and Gemini as well as new advancements like Animated Diffusion, text-to-video, and much more. As creatives continue to explore generative AI’s potential in augmenting their tasks, it is possible that creative workflows will split into more specialized tasks that require specific complementary skills to perform. A natural separation in the text-to-image context would occur between the ideation and the visual refinement stage.

This separation may not be exclusive to the art world either. Our overarching takeaway is that the value of generative AI systems depends on the complementary skills of the individual interacting with the system. Firms faced with the decomposition of workflows into specialized tasks may need to reskill and upskill their workforce and establish formal coordination systems between the various tasks. Hiring for knowledge and/or creativity-based tasks will likely change as well, as firms must consider the domains where generative AI is appropriate, how to elicit effective collaboration between human and system, and how to identify the requisite complementary skills that would enable a fruitful collaboration.

As we wrote this paper, an interesting question arose regarding the expansion of the creative concept space: does this expansion originate from a few innovative individuals, or is it the result of collaborative effort? We will explore this question in our next paper.

When interviewing designers, we’ve found that similar generative AI models for creating both 2D and 3D models will probably come next, or at least be in high demand. Personally, we are looking forward to text-to-virtual-world models now that the first generation of spatial computing (Vision Pro) has arrived.

For additional commentary by Boston University experts, follow us on X at @BUexperts. For research news and updates from Boston University’s Questrom School of Business and Center for Computing & Data Sciences, follow @BUQuestrom and @BU_CDS.