CN 11-5366/S     ISSN 1673-1530
"Landscape Architecture is more than a journal."
ZHONG Y, LIU Y X, YE Y. Measurement of the Public Activity Richness of Urban Park Based on Large Language Models and Social Media Data: A Case Study of Shanghai[J]. Landscape Architecture, 2024, 31(9): 34-41.

Measurement of the Public Activity Richness of Urban Park Based on Large Language Models and Social Media Data: A Case Study of Shanghai

    Objective Urban parks are among the most important carriers of public services, and how the public perceives and uses them can significantly affect their planning and management. In recent years, social media data has become a critical source for understanding how people interact with urban spaces, and park analysis based on social media has become a research hotspot. However, current research typically analyzes a single data modality (such as text or images) and relies on traditional machine learning and natural language processing (NLP) techniques, which may limit the comprehensiveness and accuracy of its results. Advances in artificial intelligence, particularly large language models (LLMs), have brought breakthroughs in language understanding, reasoning, and image recognition, providing the technical foundation for using multi-modal social media data, including images and text, to analyze the rich range of activities in urban parks. This research explores methods for the quantitative analysis of multi-modal social media big data in order to build a more accurate measurement system for the public activity richness of parks.
    Methods Taking Shanghai Gongqing National Forest Park, the most popular and most discussed urban park on the social media platform “Xiaohongshu”, as an example, this research combines a classical questionnaire method, LLM analysis, and traditional classical analysis. First, a semantic analysis questionnaire is designed and multiple uniform surveys are conducted at the 43 most popular spots in the park to understand public activity preferences and perceptions of different scenes. Respondents are shown images of park scenes together with their locations and asked to describe the activities they would expect to do there, such as walking, running, or picnicking; these activity intention data are analyzed with descriptive statistics. Site perception data are analyzed with the semantic differential (SD) method: respondents’ ratings on different perception dimensions are analyzed statistically to produce a comprehensive perception evaluation of each scene, which helps construct quantitative indicators of activity preference and emotional tendency. GIS technology is then used to visualize public activity richness. Second, for the LLM analysis, multi-modal data (text, images, video, etc.) about the same 43 spots are mined from the Xiaohongshu platform. For text data, the application programming interface (API) of Wenxin Yiyan, a leading Chinese LLM, is used to extract activity information and calculate sentiment values, identifying the activities and emotions of “Xiaohongshu” users in the park. For image data, the API of ChatGPT-4 is used to extract activity information. Because LLMs cannot process video directly, videos are first converted into frames, which are then analyzed in the same way as images. Shannon’s diversity index is calculated from the types and counts of activities extracted from the multi-modal data, and a quantitative portrait of urban park public activity richness is constructed on that basis. Third, for the traditional classical analysis, only the text portion of the multi-modal data (all text in the “Xiaohongshu” notes) is used as the original data. The latent Dirichlet allocation (LDA) model is adopted for topic modeling, and NLP techniques are used to calculate a sentiment value for each topic. Activity diversity and sentiment values are then combined to construct single-modal indicators.
    Results This research explores several methods for measuring public activity richness. Using the traditional questionnaire-based perception measurement as a benchmark, correlation analysis is conducted to compare the accuracy of the traditional classical analysis and the LLM analysis. The statistical results show that the LLM analysis significantly outperforms the traditional classical analysis in accuracy for both public activity richness and emotional perception data, and is highly consistent with the benchmark questionnaire method. Based on these findings, LLM technology and multi-modal social media data are used for large-scale data retrieval and analysis of the 20 largest urban parks within Shanghai’s Outer Ring. Public activity richness and its sub-indicators are calculated for these parks, forming an activity portrait of each park that includes activity heat data, activity types, and emotional perception data. Specific suggestions for urban park improvement strategies are also provided, achieving a panoramic, high-precision analysis of park public activity richness.
    Conclusion This research applies LLMs and multi-modal social media data to urban analysis, supporting comprehensive and rapid monitoring of urban park activities and user perceptions at the city scale and beyond. This can improve research efficiency and accuracy and provide scientific evidence for urban park planning and management. The successful application of the method points to a shift in research practice and a deepening use of artificial intelligence in urban research, which is significant for promoting smart city construction and management.
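
The two calculations named in the abstract, Shannon's diversity index over extracted activity labels and a correlation against the questionnaire benchmark, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the example labels and scores, the use of the natural logarithm, and the choice of Pearson correlation are all assumptions, since the abstract does not specify them.

```python
import math
from collections import Counter


def shannon_diversity(activity_labels):
    """Shannon's diversity index H = sum(p_i * ln(1 / p_i)).

    `activity_labels` is a list with one label per observed activity
    (for example, one label per note or image). Natural log is an
    assumption; the abstract does not state the base.
    """
    counts = Counter(activity_labels)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return sum((n / total) * math.log(total / n) for n in counts.values())


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences.

    Pearson is assumed here; the abstract only says "correlation analysis".
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Hypothetical activity labels extracted for one spot.
    labels = ["walking", "walking", "picnicking", "running", "picnicking"]
    print("diversity:", shannon_diversity(labels))

    # Hypothetical per-spot scores: questionnaire benchmark vs. one method.
    benchmark = [0.9, 1.2, 0.4, 1.5]
    method_scores = [1.0, 1.1, 0.5, 1.4]
    print("correlation with benchmark:", pearson_r(benchmark, method_scores))
```

Under these assumptions, each analysis method would yield one diversity value per spot, and the method whose values correlate more strongly with the questionnaire benchmark would be judged the more accurate.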