
Publication | Open Access

Why Johnny Can’t Prompt: How Non-AI Experts Try (and Fail) to Design LLM Prompts

Citations: 670

References: 28

Year: 2023

TLDR

Large language models such as GPT-3 can follow multi-turn instructions fluently, which makes them attractive for designing natural-language interactions, but crafting effective prompts is difficult and prompt-based interactions are brittle. The study asks whether non-AI-experts can succeed at end-user prompt engineering. As a design probe, the authors used a prototype LLM-based chatbot design tool that supports developing and systematically evaluating prompting strategies. Participants explored prompt designs opportunistically rather than systematically, and struggled in ways that echo end-user programming and interactive machine learning systems. Expectations carried over from human-to-human instruction, and a tendency to overgeneralize, were barriers to effective prompt design. The findings carry implications for tool design and for LLM-and-prompt literacy, and point to opportunities for further research.

Abstract

Pre-trained large language models ("LLMs") like GPT-3 can engage in fluent, multi-turn instruction-taking out-of-the-box, making them attractive materials for designing natural language interactions. Using natural language to steer LLM outputs ("prompting") has emerged as an important design technique potentially accessible to non-AI-experts. Crafting effective prompts can be challenging, however, and prompt-based interactions are brittle. Here, we explore whether non-AI-experts can successfully engage in "end-user prompt engineering" using a design probe—a prototype LLM-based chatbot design tool supporting development and systematic evaluation of prompting strategies. Ultimately, our probe participants explored prompt designs opportunistically, not systematically, and struggled in ways echoing end-user programming systems and interactive machine learning systems. Expectations stemming from human-to-human instructional experiences, and a tendency to overgeneralize, were barriers to effective prompt design. These findings have implications for non-AI-expert-facing LLM-based tool design and for improving LLM-and-prompt literacy among programmers and the public, and present opportunities for further research.
