Composing New Urban Futures with AI: Speculative Design and AI Text-to-Image Synthesis

Jamie Littlefield, Texas Tech University

Figure 1. Three AI-generated images invite speculation toward alternative futures. DALL-E 3 and the author, 2023.

Introduction

In 2010, I co-founded a local organization dedicated to creating safer streets in Provo, Utah, a mid-sized city about an hour south of Salt Lake. The primary goal of our grassroots group was to encourage the construction of more walkable, bicycle-friendly, and accessible infrastructure in the built environment. As the organization matured and eventually became a 501(c)(3) non-profit, I often found myself debating developers and city officials in the public forum, in both textual exchanges (online forums, newspaper op-eds, mailers) and in face-to-face environments (town halls, engineering open houses, city council meetings). I soon discovered that citizen advocates are disadvantaged in debates about city infrastructure due to our lack of access to expensive, specialized technologies.

The professionals designing U.S. cities access industry software to extract place-based data and compose rhetorically persuasive visual renderings of the infrastructural futures they propose. Architects have AutoCAD and Revit. Urban planners rely on ArcGIS and CityEngine. City engineers use Infraworks and MicroStation. While these technologies are often described as helping professionals plan projects, they also work in tandem to help professionals communicate potentialities. Through polished renderings of street re-designs, skyscrapers, pedestrian crossings, and bus stops, members of the public are prompted to visualize the city not as it is but as it could be. City planning professionals are engaged in a perpetual act of storytelling in which “alternative versions of city space are explored, evaluated, and enacted over time” and “designers and architects envision new possibilities…then work to build them in physical form” (Hoffman, 2022, p. 11).

Without access to professional tools, citizens and grassroots advocates often struggle to convey alternative possibilities. For example, street safety organizations may have to pay outside consultants significant fees to create alternative images of sidewalk designs that prioritize vulnerable road users, such as people using wheelchairs or families with strollers. Visual rhetoricians have also attempted to demonstrate urban possibilities through free or low-cost design programs such as SketchUp, the method used by Gries et al. (2020) to counter gentrification by re-imagining a civic space at the University of Nevada, Reno. However, these tools can be time-consuming, and the results are generally more oriented toward communicating concrete design details than constructing rhetorically persuasive, stylized imagery.

Now, with the popularization of text-to-image tools, the disparity between professionals and grassroots communicators is beginning to decrease. AI is threatening to disrupt the monopoly urban professionals hold on a number of communicative affordances. Text-to-image synthesis (the processes used by programs such as DALL-E, Midjourney, and Stable Diffusion) is an emergent AI technology offering underresourced communicators a way to produce customized renderings with a lower price tag and a lower educational barrier to entry. A grassroots urbanist group might generate an image that adds a streetcar to an existing road, re-imagines an entire neighborhood around different values, or displays a visual prototype to demonstrate the consequences of a particular decision.

In my professional communications work, I’ve engaged with a number of non-profit leaders and community advocates who are beginning to explore how AI tools can help them more effectively communicate with the public. In 2022, social media saw an influx of AI-generated imagery showing cities not as they were but as they could be. That summer, I released a short prompt book for this niche audience, “Re-Imagining Urban Spaces with DALL-E AI,” which has been downloaded by just over 1,500 early adopters as of this writing. A second wave of AI-generated cities appeared on social media in late 2023, with the release of Microsoft Bing’s no-cost image generator powered by DALL-E 3 and ChatGPT. Social sites like X (formerly Twitter) circulated user-created images such as a Golden Gate Bridge favoring trolleys and pedestrians, a Baltimore Inner Harbor with extensively remodeled public spaces, and dozens of public beaches adjacent to multi-use paths rather than highways. For example, using DALL-E 2’s outpainting feature, I was able to take photos of real streets in my own community and edit them to include AI-generated elements such as highways, walking paths with native foliage, and water features (see Figure 2).

Figure 2. Three AI-generated images present different possibilities for the design of a real street in downtown Provo, Utah. DALL-E 2 and the author, 2022.

This chapter critically reflects on my experiences with AI text-to-image synthesis as a practitioner composing materials for grassroots advocacy and examines how this practical knowledge might translate to a classroom setting. I outline the essentials of how these tools work and how they offer new communicative possibilities to underresourced groups such as fledgling non-profits and young political movements. I also describe some of the risks and challenges of AI text-to-image synthesis, including the possibility of value-locking the future. Finally, I suggest a method for critically engaging with AI text-to-image synthesis: speculative design.

Speculative design is a method used by designers, writers, and artists to envision and create concepts for possible futures rather than practical solutions for the present (Hoffman, 2022; Galloway & Caudwell, 2018; Dunne & Raby, 2013). It goes beyond traditional design by questioning societal norms and considering the broader impacts of technology and innovation on our lives. Speculative design involves crafting artifacts (such as AI-generated images) that demonstrate or visualize potential futures and encourage us to think critically about what could be (Opel & Rhodes, 2018; Wilkie et al., 2017). In their constitutive text, Dunne and Raby (2013) argue that speculative practices can function to generate greater elasticity for human futures: “...by speculating more, at all levels of society…reality will become more malleable and, although the future cannot be predicted, we can help set in place today factors that will increase the probability of more desirable futures happening…” (p. 6).

The tools of speculative design are uniquely equipped to help us critically consider the ways that AI is bound up in time, continually replicating the discourses of the past and artificially projecting limitations onto the ways humans will experience the future. In the case of urban communications, speculative design projects can help non-profits and grassroots organizations persuade the public by presenting images that deviate from the detrimental patterns of the past. In the writing classroom, speculative design practices can provide students with concrete approaches to analyzing and composing AI-generated images that demonstrate awareness of the ways AI is entangled with the past.