Objective Digital technologies have opened new avenues for quantitative research on heritage landscapes. Web image-text data, consisting primarily of user-generated content, are frequently used in research on heritage landscape perception. However, existing research is often limited by reliance on a single data type and inadequate data integration, and it makes insufficient use of advanced methods such as machine learning. Novel methods are therefore urgently needed for the effective fusion of multimodal data and for the integration of multivariate machine learning models.
Methods This research examines 12 world cultural heritage sites in Quanzhou. The Octopus Collector is used to gather comments, images, and other web image-text data. Data cleansing and pre-processing yield a total of 100,292 valid entries. Based on this dataset, the following analyses are completed. 1) Popularity analysis. Based on the number of annual comments, a streamgraph built with pyecharts is used to visualize the temporal and spatial evolution of heritage site popularity. 2) Image perception analysis. A latent Dirichlet allocation (LDA) topic clustering model is used to mine unsupervised topic clusters from all comment texts, in order to explore the landscape perception dimensions associated with world cultural heritage in Quanzhou; a one-for-all (OFA) image description model is used to generate natural language descriptions of all collected images, and the landscape perception network is then analyzed through word frequency analysis and a semantic network. 3) Sentiment perception analysis. Based on all comment texts and image description texts, a long short-term memory (LSTM) sentiment analysis model is used to analyze the sentiment tendency of overall landscape perception and of the landscape perception of each heritage site.
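As an illustration only, the counting steps behind the popularity analysis and the word frequency and semantic network analysis can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the placeholder site labels, and the toy records are hypothetical, and the texts are assumed to be already segmented into tokens. The LDA, OFA, and LSTM models and the pyecharts streamgraph are omitted because they depend on third-party libraries and trained models.

```python
from collections import Counter
from itertools import combinations


def annual_comment_counts(records):
    """Count comments per (site, year); each record is (site, 'YYYY-MM-DD')."""
    counts = Counter()
    for site, date in records:
        counts[(site, int(date[:4]))] += 1
    return counts


def word_frequencies(docs):
    """Count how often each token occurs across all tokenized documents."""
    freq = Counter()
    for tokens in docs:
        freq.update(tokens)
    return freq


def cooccurrence_edges(docs, top_n=50):
    """Build weighted edges between the top_n most frequent words.

    Two words are linked once for every document in which both appear,
    which is one common way to derive a semantic network from word counts.
    """
    vocab = {word for word, _ in word_frequencies(docs).most_common(top_n)}
    edges = Counter()
    for tokens in docs:
        present = sorted(set(tokens) & vocab)  # sorted so (a, b) is canonical
        for a, b in combinations(present, 2):
            edges[(a, b)] += 1
    return edges


# Hypothetical toy data, not taken from the study's dataset.
toy_records = [
    ("site_A", "2020-03-14"),
    ("site_A", "2021-07-02"),
    ("site_A", "2021-10-05"),
    ("site_B", "2021-08-19"),
]
toy_docs = [
    ["temple", "ancient", "culture"],
    ["temple", "culture", "history"],
    ["bridge", "ancient", "sea"],
]

popularity = annual_comment_counts(toy_records)
frequencies = word_frequencies(toy_docs)
network = cooccurrence_edges(toy_docs)
```

The per-year counts are the kind of series a streamgraph would take as input, and the edge weights can be passed to any graph library for clustering or layout. Restricting the network to the most frequent words keeps it readable, at the cost of dropping rare terms.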
Results 1) In terms of the spatial and temporal evolution of heritage sites' popularity and tourists' landscape perceptions, despite significant spatial and temporal variability, there is a discernible overall trend of rapid growth in both popularity and landscape perceptions. Various policies and events are the primary driving factors behind this trend. The gradient of heritage sites' popularity and tourists' landscape perceptions is shifting from "high-low" to "high-middle-low". 2) In terms of perception dimensions and networks, pluralistic integration serves as the cultural core of landscape perception, resulting in a multifaceted landscape perception system driven by cultural value. This framework identifies three categories of perception and delineates seven subcategories within the perception dimensions. Overall, the dimensions rank as cultural value perception > landscape appreciation perception > characteristic experience perception > material carrier perception. Notably, topic proportions vary significantly across heritage sites, yielding four predominant types of heritage sites characterized by the four perception dimensions. In the heritage landscape perception network, the high-frequency words predominantly align with three key dimensions of landscape perception. The semantic network exhibits a "center-edge" structure without absolute core words. The four semantic clusters of the semantic network align closely with the perception dimensions from LDA topic clustering; intersections among these clusters predominantly reflect both common perception dimensions and local common perception dimensions. 3) In terms of the sentiment perception of heritage landscapes, tourists effectively perceive Quanzhou's world heritage landscape along with its profound historical and cultural attributes. Overall, the sentiment tendencies of landscape perception range from neutral to positive, with greater dispersion in the probabilities of neutral and negative sentiments. The sentiment tendencies reflected in comment texts are predominantly concentrated and more positive, whereas those observed in image description texts display greater variability and lean towards neutrality and negativity. The sentiment tendency of landscape perception for each heritage site can be categorized as either text-image synergistic or text-image discrete, and the sentiment indices of landscape perception differ significantly among heritage sites. Factors such as proximity to the ancient city of Quanzhou and the degree of aggregation of heritage sites fundamentally influence these sentiment indices. Furthermore, insufficient cultural and scientific outreach, together with inadequate services and supporting facilities, also contributes significantly to a lower sentiment index.
Conclusion This research integrates multimodal web image-text data with multivariate machine learning models to explore a novel method for quantitative research on heritage landscape perception. It addresses issues identified in previous research, namely reliance on a single data type, insufficient data integration, and inadequate application of new technologies and methods such as machine learning.