Automation of Urban Plan Studies

Streamlining Urban Planning Studies Using Text-To-Image Models

Companies such as Stability AI and OpenAI have rapidly introduced innovative generative AI tools in the form of LargeLanguageModels and Text-To-Image Models, which have already significantly changed work processes around the world.

In order to remain competitive in the long term, these developments must be closely monitored. The potential integration of these tools to automate recurring, monotonous processes should be actively explored. Some startups are attempting to monetise these automation processes by offering them as services. However, open-source, commercially usable systems can help to remove access barriers to such tools.

“Produce 20 different urban design iterations as perimeter block developments, tailored for the selected portion of the plan” Excerpt of a figure-ground diagram utilised for training purposes, taken from Berlin. The selected area was deliberately chosen to represent the broadest possible range of different building types.

What is the purpose of our project?

Urban planning studies in variations of figure-ground diagrams are an essential part of the working process of architects in the preliminary design stage. In addition to the irreplaceable manual creation of such urban planning variations, automation of the process is intended to facilitate and accelerate work on variations. The human design process based on personal experience remains irreplaceable, as a person familiar with the building site has knowledge about the constraints and possibilities of the site that are not replicated in automated processes. As a tool to add additional variations to the self-created studies and to expand the spectrum of possible urban placements of buildings, the method offers a lot of potential.

The project objectives: What are we aiming to achieve?

The aim of the project is to automate figure-ground diagram studies in an architectural context based on Deep Learning Text-to-Image Generators (Stable Diffusion). The Open-Source Generator Stable Diffusion developed by Stability AI, first released in August 2022, is currently mainly used for generating images based on text input. At the same time, it is also suitable for tasks such as inpainting, the addition and exchange of marked areas in images. Due to the Open-Source architecture of the Stable Diffusion model, it is possible to use Deep Learning to use one’s own images as the basis for image generation and to train the open models on one’s own graphics and image content.¹

Using open map bases from Openstreetmaps and the Open-Source program QGIS, a dataset for training a Stable Diffusion urban plan model is to be created. Based on about 100 images of figure-ground diagram sections, a model will be generated that can fill building areas with urban plan variations within seconds. This process can be executed locally, for free, and commercially usable on consumer hardware systems.

Can the project objectives be quantified and validated?

Following the generation of a defined number of variations, the quality and usability of the automated generated variations should be manually checked for the following parameters:

Error rate in the output of the generated variations
Potential approval capability

Are there risks in the publication of the research?

The automation in creating variations could tempt individuals to neglect their own active design process, or even discredit the human design process itself as replaceable by machines. In its internal structure, however, Deep Learning Text-to-Image Generators are always an image synthesis of already existing images, and therefore already existing structures. By training on already built city structures, Stable Diffusion generates suitable additions from a “habit of seeing”. However, the application, selection, and integration of suitable variations into one’s own design remains a manual task based on personal experience and knowledge.

State of research

What are the recent advancements in research on this topic?

In the field of architecture, the use of such generative AI tools is currently limited to the generation of perspective images of architecture. The models of Deep Learning Text-to-Image Generators trained on photos naturally suggest the generation of such images. For the generation of architectural perspectives, there are exemplary examples from Andrew Kudless² or studies by Olivier Campagne³.

In the area of plan graphics, whether floor plans, urban plans, views, or sections, the basic models of Stable Diffusion, and also commercial alternatives like Midjourney, apparently lack training data. The plans generated here are often repetitive in their mode of representation and are currently not really usable. Training the models on plan graphics, and orthographic representations of buildings for generative use as a planning tool, currently seems to be a research gap.

Simplified representation of the training and generation process for the inpainting method

Methodology

The following is a bullet-point list for generating the basic training data for the creation of an figure-ground diagram model:

Creating the training data using QGIS and Openstreetmaps

Building footprints are read from Openstreetmaps using QGIS
- Install QuickOSM plugin in QGIS
- Vector → QuickOSM → Query: building — In: Canvas Extend → Run query
- Secure the imported layers with “Save Scratch Layer” and set the colour to black
  - Copy Layer Style → Paste Layer Style (apply to all imported building layers)
- Repeat this step for all areas you want to train the model on
- Create a Vector Grid for export
  - Vector → Research Tools → Create Grid
    - Grid Type: Polygon, Grid Extend: Draw on Map Canvas, Horizontal / Vertical Spacing in Meters: 500 m → Run
    - Grid → Symbology → outline blue
- Create a layout for export
  - Project → New Print Layout… → Name → right-click Page properties… → Custom 1024 x 1024 px
  - Add Item → Add Map
    - Resize it by clicking and dragging the boxes on the edges, adjust to page size
  - Atlas → Atlas Settings → Coverage Layer: Grid
  - Hidden coverage Layer ✓
  - Item Properties: Controlled by Atlas ✓
    - Margin around feature: 0%
- Atlas → Export Atlas as images…

The training of your own figure-ground diagram model and the subsequent generation of urban plan studies is currently still in the testing phase and will be supplemented in a further article shortly.

Urban architectural structures such as perimeter block developments, row buildings, and standalone buildings are automatically categorised using Liner.ai to enhance the precision of the model training. This allows for the addition of text captions to the training data. Subsequently, these parameters can be applied to the inpainting process, enabling the generation of urban architectural structures “on demand” in various forms. This increases control over the iterative process.

The refinement of this model will also be the focus of a separate article to be published in the near future.

References

The fundamental motivation for the automation of processes in our case goes back to reading “Automate the Boring Stuff with Python”⁴ several years ago. The possibilities described therein for beginners to simplify everyday life, whether privately or at work, served as the initiator for the ever-ongoing desire to optimise processes, even before the development of today’s generative AI’s.

The best part of programming is the triumph of seeing the machine do something useful. (- Hilary Mason)

Resources

The following sites are recommended to further explore the topics:

Creating the data basics for training a figure-ground diagram model:

Training the figure-ground plan model:

How To Install And Use Kohya GUI for SDXL Training

https://en.wikipedia.org/wiki/Stable_Diffusion ↩︎
http://digitalcraft.cca.edu/kudless @matsysdesign↩︎
http://oliviercampagne.com @oliver_country↩︎
https://automatetheboringstuff.com ↩︎