Abstract:
Objective This research explores a new question: how can artificial intelligence (AI) understand design features? The question is important and urgent for landscape architecture, a field that stands to benefit from the new possibilities offered by AI technology. In particular, deep-learning image generation models such as Midjourney, DALL-E and Stable Diffusion can create images from simple user input and appear to produce satisfactory design results. However, can they capture the essence, logic and rules of design works, or are they merely generating graphics from graphics? These questions have significant theoretical and practical implications, but they are challenging and involve a number of sub-problems. This research focuses on one aspect: how can an AI algorithm based on StyleGAN (a style-based generative adversarial network) identify and recognize high-dimensional design features? This is a challenging technical problem that involves both design understanding and feature disentanglement. The research trains StyleGAN on design schemes, captures the features of its latent space, and analyzes whether the algorithm can recognize abstract design features of landscape architecture schemes, which features it can recognize, and whether it can disentangle coupled features.
Methods The research adopts StyleGAN as the main method to generate and analyze landscape architecture schemes. StyleGAN is a style-based generative adversarial network proposed by Karras et al. in 2018 to generate high-quality, high-resolution and diverse images. It controls style features at different levels, which allows fine-grained editing of the generated images. The StyleGAN algorithm consists of two parts: a mapping network and a synthesis network. The mapping network transforms a random noise vector z into a latent vector w, which contains style features at different levels. The synthesis network generates an image from a constant input by progressively adding details from coarse to fine resolution. The style features are injected into each layer of the synthesis network through adaptive instance normalization (AdaIN) operations. Two datasets are used for training: a general dataset of 4,047 diverse design schemes collected from public sources, and a directional dataset of 105 “multiple solutions for one problem” schemes for a specific site in Beijing. Two generators (a general generator and a directional generator) are trained with the StyleGAN2 model at 512 × 512 resolution. Two techniques are used to analyze the latent vector w: dimensionality reduction and the truncation trick. Dimensionality reduction visualizes and clusters the w vectors in a two-dimensional space using principal component analysis (PCA) and k-means. The truncation trick manipulates and edits the w vectors by changing their influence strength on different layers of the synthesis network. It is used to compare each generated scheme with an “average scheme” in which specific design features are erased, and thereby to infer which design features each w vector contains.
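The dimensionality reduction step described above can be illustrated with a minimal sketch. It is not the authors' code: the w vectors here are synthetic stand-ins drawn from two Gaussian blobs (an assumption, since the trained mapping network is not reproduced), the 512-dimensional latent size is assumed, and the helper names `pca_2d` and `kmeans` are hypothetical. PCA is computed with a singular value decomposition and k-means with plain Lloyd iterations, using NumPy only.

```python
import numpy as np

def pca_2d(vectors):
    """Project row vectors onto their first two principal components."""
    centered = vectors - vectors.mean(axis=0)
    # Rows of vt are the principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:2].T

def kmeans(points, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means; returns (labels, centroids)."""
    rng = np.random.default_rng(seed)
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        # Assign each point to its nearest centroid.
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute centroids; keep the old one if a cluster is empty.
        new_centroids = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels, centroids

# Hypothetical stand-in for latent vectors: in practice these would come
# from the trained mapping network, which is not reproduced here.
rng = np.random.default_rng(42)
w_vectors = np.concatenate([
    rng.normal(loc=-3.0, scale=1.0, size=(100, 512)),
    rng.normal(loc=3.0, scale=1.0, size=(100, 512)),
])

w_2d = pca_2d(w_vectors)               # shape (200, 2), ready for a scatter plot
labels, centroids = kmeans(w_2d, k=2)  # one cluster label per w vector
```

With real latent vectors, the 2-D coordinates in `w_2d` could be plotted and colored by `labels` to inspect which schemes fall into the same cluster. The number of clusters `k` is a free choice that this sketch does not attempt to justify.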
Results The analysis results are presented in two parts: data feature analysis and semantic information analysis. In the data feature analysis, PCA is used to reduce the dimensionality of the w vectors and compare them with the z vectors. The w vectors show more distinctive features, whereas the z vectors are close to a standard normal distribution. K-means is then used to cluster the w vectors and embed the corresponding images into them. The results show that w vectors can roughly extract and classify some features from diverse design schemes, but the classification logic differs between categories. Some categories are based on morphology, water area, hard-soft ratio, road network structure, park type and other design features, while others are based on how frequently certain design nodes appear. In the semantic information analysis, the truncation trick is used to manipulate the w vectors by changing their influence strength from 0 to 1. The results show that w vectors can control design features at different levels in the generated schemes, such as vegetation density, water area, pavement area, road network structure and other high-level design attributes. Some features are entangled with each other, so changing one feature may affect others as well. This is due to the complexity of landscape design and the difficulty of feature disentanglement.
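The strength sweep from 0 to 1 can be sketched with the standard form of the truncation trick, in which a latent is interpolated toward the average latent: w' = w_avg + psi * (w - w_avg). This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `truncate`, the `layer_cutoff` parameter, the 16 × 512 latent shape and the random stand-in latents are all hypothetical, and no generator is called.

```python
import numpy as np

def truncate(w, w_avg, psi, layer_cutoff=None):
    """Interpolate per-layer latents toward the average latent.

    w:            array (num_layers, dim), one style vector per synthesis layer
    w_avg:        array (dim,), the mean latent (the "average scheme")
    psi:          strength in [0, 1]; 0 returns w_avg, 1 returns w unchanged
    layer_cutoff: if given, only layers [0, layer_cutoff) are truncated
    """
    w = np.asarray(w, dtype=float)
    out = w.copy()
    end = w.shape[0] if layer_cutoff is None else layer_cutoff
    out[:end] = w_avg + psi * (w[:end] - w_avg)
    return out

# Illustrative sizes and random stand-ins (assumptions, not trained latents).
rng = np.random.default_rng(0)
num_layers, dim = 16, 512
w_samples = rng.normal(size=(1000, dim))
w_avg = w_samples.mean(axis=0)               # estimate of the average latent
w = np.tile(w_samples[0], (num_layers, 1))   # same w broadcast to every layer

# Sweep the strength from 0 (average scheme) to 1 (original scheme).
sweep = [truncate(w, w_avg, psi) for psi in np.linspace(0.0, 1.0, 5)]

# Truncate only the first 8 (coarser) layers, leaving the finer layers intact.
partial = truncate(w, w_avg, psi=0.0, layer_cutoff=8)
```

Feeding each element of `sweep` to a trained synthesis network would produce a sequence of images running from the average scheme to the original one. Comparing neighboring images is one way to see which features a given w contributes and which of them change together.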
Conclusion The research concludes that AI algorithms can identify and extract some high-dimensional design features from landscape architecture schemes, including not only image morphology but also semantically rich design features. However, most features remain difficult to disentangle because of the complexity of landscape design and the limited interpretability of the algorithms. The research proposes that feature disentanglement is necessary before exploring how AI algorithms understand design logic and rules, and that feature interpretation is an important topic for intelligent evidence-based design research, as it can help constrain algorithms to meet designers’ needs.