Abstract:
Objective Cities, as the core carriers of human civilization, have spatial forms and physical environments that profoundly affect residents’ quality of life and well-being. In micro-scale urban settings, such as the streets and squares closely tied to daily life, interactions between people and urban spaces are most direct and frequent, and design quality directly determines residents’ comfort and convenience. Urban design, as a key practice in shaping these spaces, has evolved into a human-centered, participatory process of placemaking. In practice, however, time and cost constraints often lead decisions to rely too heavily on professional judgment, making it difficult to fully reflect public needs. Urban perception captures the public’s preferences for specific urban environments and their experiential feedback on spatial demands. Rooted in environmental psychology, the concept treats human perception as a vital link in human-environment interaction, one that bridges spatial design and human experience and plays a central role in shaping urban environments. Among the sensory modalities, visual perception is dominant, which makes it a core focus of urban perception studies. Examining urban perception reveals authentic public needs: although members of the public lack the professional design expertise to propose specific solutions, their sensory experiences and perceptual feedback effectively highlight real demands and provide critical input for design decisions. Understanding and integrating public urban perception therefore not only makes design more responsive to user needs but also fosters a balance between functional efficiency and humanistic care. Crowdsourced visual perception data and analysis methods, with their advantages of efficiency, broad representativeness, and low cost, have become key methodological tools in urban built environment research.
However, existing research has yet to systematically explore how these methods facilitate the integration of public perspectives into urban design decision-making, and their pathways and methodological effectiveness still need refinement and synthesis. This research aims to analyze the pathways and effectiveness of crowdsourced visual perception methods, systematically uncovering the theoretical logic and mechanisms by which they promote public involvement in urban design decisions, and thereby to provide a scientifically robust and practical methodological foundation for human-centered urban design practice.
Methods The research employs a case study approach, analyzing multiple urban research papers focused on Tokyo, Japan, that use crowdsourced visual perception methods. It systematically investigates the specific pathways and mechanisms through which these methods integrate public perspectives into design decisions across three key stages of urban design decision-making: data collection, data analysis, and scheme generation. The study places particular emphasis on micro-scale urban design.
Results In the data collection phase, deep learning perception inference models trained on crowdsourced visual perception data efficiently compute and evaluate perceptions of micro-scale urban scenes, offering designers an automated, low-cost means of gaining insight into public perspectives early in a project. This approach addresses the limitations of traditional methods in data coverage and cost, significantly enhancing the breadth and representativeness of public perspectives in design decisions. In terms of integration, crowdsourcing breaks down the barrier of traditional top-down data acquisition, enabling ordinary citizens to contribute indirectly to the informational foundation of urban design by sharing their individual perceptions. This shift not only increases data diversity and inclusivity but also empowers the public to express needs and preferences in the early stages of design decision-making. In the data analysis phase, statistical methods establish quantitative models linking built environment factors with visual perception evaluations, uncovering how public preferences interact with built environment elements. Through regression analysis, these methods identify the key factors that significantly affect visual perception and determine optimal parameter ranges for design interventions, providing a scientific basis for formulating design strategies and goals. Notably, nonlinear analysis methods capture complex relationships more precisely than linear ones. In the scheme generation phase, integrating crowdsourced perception data with generative AI transforms public perceptual preferences into an intuitive, visual design language, introducing a novel human-machine collaboration approach to urban design decision-making. The visual outputs of AI-generated schemes offer a transparent and comprehensible basis for negotiation in subsequent decisions, making it easier for stakeholders to interpret the schemes and participate.
Among generative AI techniques, the Stable Diffusion (SD) model outperforms generative adversarial networks (GANs) in generation quality, diversity, and flexibility.
Conclusion The research uses visual perception as a lens for integrating public perspectives, exploring the role of crowdsourced visual perception in micro-scale urban design decision-making through detailed case studies. It systematically examines the pathways and mechanisms by which this approach embeds public input, highlighting its applicability, technical implementation, and inherent limitations. The findings offer a robust framework for incorporating public perspectives into micro-scale urban design decisions and lay theoretical groundwork for advancing the scientific rigor and inclusivity of the design decision-making process.