
Recognizing Building Group Patterns in Topographic Maps by Integrating Building Functional and Geometric Information

Published: 1 June 2022
This analysis is AI-generated and may not be fully accurate. Please refer to the original paper.

TL;DR Summary

This study integrates building function and geometric data for building group recognition, using Tencent user density, POIs, constrained Delaunay triangulation, and graph segmentation, achieving over 81% accuracy and improving spatial delineation in map generalization.

Abstract

Citation: He, X.; Deng, M.; Luo, G. Recognizing Building Group Patterns in Topographic Maps by Integrating Building Functional and Geometric Information. ISPRS Int. J. Geo-Inf. 2022, 11, 332. https://doi.org/10.3390/ijgi11060332

Academic Editors: Florian Hruby and Wolfgang Kainz

Received: 1 April 2022; Accepted: 31 May 2022; Published: 1 June 2022

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Copyright: © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Authors: Xianjin He 1,2, Min Deng 1,* and Guowei Luo 2

1 School of Geosciences and Info-Physics, Central South University, Changsha 410083, China; xjhe9@nnnu.edu.cn

2 Key Laboratory of Environment Change and Resources Use in Beibu Gulf, Ministry of Education, Nannin

1. Bibliographic Information

1.1. Title

Recognizing Building Group Patterns in Topographic Maps by Integrating Building Functional and Geometric Information

1.2. Authors

Xianjin He, Min Deng, and Guowei Luo.

  • Xianjin He: Affiliated with the School of Geosciences and Info-Physics, Central South University, Changsha, China, and Key Laboratory of Environment Change and Resources Use in Beibu Gulf, Ministry of Education, Nanning Normal University, Nanning, China.
  • Min Deng: Corresponding author, affiliated with the School of Geosciences and Info-Physics, Central South University, Changsha, China.
  • Guowei Luo: Affiliated with the Key Laboratory of Environment Change and Resources Use in Beibu Gulf, Ministry of Education, Nanning Normal University, Nanning, China.

1.3. Journal/Conference

ISPRS International Journal of Geo-Information (ISPRS Int. J. Geo-Inf.). This is a peer-reviewed open access journal focusing on geographic information science. It is well-regarded in the fields of photogrammetry, remote sensing, and geo-information, indicating a reputable publication venue for this type of research.

1.4. Publication Year

2022

1.5. Abstract

The paper addresses the challenge of recognizing building group patterns in topographic maps, which is crucial for urban landscape evaluation, social analysis, and map generalization. The authors highlight the limitations of existing methods that primarily rely on geometric features, leading to unsatisfactory grouping results due to insufficient information. To overcome this, the study proposes a novel approach that integrates both building function and geometric information. The methodology involves inferring building functions using the dynamic time warping (DTW) algorithm based on Tencent user density data and Points of Interest (POIs). Subsequently, constrained Delaunay triangulations (CDTs) are generated for each building block, from which various spatial indices (e.g., the spatial continuity index (SCI), direction, and distance) between adjacent buildings are derived. Finally, each building block is modeled as a graph, incorporating these derived matrices and building function information, and a graph segmentation approach is applied to extract building groups. A case study in Chengdu, China, demonstrates the effectiveness of the proposed method, achieving a correctness value above 81.63%. Comparative analysis shows that methods lacking building function information are ineffective, especially when buildings with different functions are in close proximity. The paper concludes that the proposed method yields generalization results that are more aligned with daily map use, providing more accurate spatial divisions of urban buildings.

Publication Status: Officially published in ISPRS Int. J. Geo-Inf. 2022, 11, 332.

2. Executive Summary

2.1. Background & Motivation

The core problem this paper aims to solve is the inadequate recognition of building group patterns in topographic maps. This recognition is fundamental for various applications, including urban landscape evaluation, social analysis, and crucially, map generalization (the process of simplifying map features for smaller scales).

The problem is important because accurately identifying building groups provides structural information essential for automating map generalization operators (algorithms that modify map features). Prior research has largely focused on geometric features of buildings (e.g., size, distance, direction, shape, area, orientation) to define groups. However, the authors argue that this leads to unsatisfactory results because it neglects the semantic information or function of buildings. For instance, buildings with different functions (e.g., residential vs. commercial) might be geometrically close but should logically belong to separate groups. This disconnect between purely geometric grouping and real-world functional divisions creates a gap that the paper seeks to address.

The paper's entry point and innovative idea lie in integrating building function information with geometric characteristics. By leveraging geospatial big data (like Tencent user density data and Points of Interest or POIs), it becomes possible to infer building functions, which can then serve as a critical constraint in the grouping process. This move from purely geometric to a geo-semantic approach is the core innovation.

2.2. Main Contributions / Findings

The primary contributions and key findings of this paper are:

  • Novel Integrated Method: Proposing a novel building grouping method that effectively combines both building function and geometric information. This addresses the limitation of previous methods that only considered geometric features.
  • Building Function Inference: Demonstrating a method to infer building functions using dynamic time warping (DTW) applied to Tencent user density data and POIs. This provides a practical way to acquire semantic information for buildings, which is often missing in traditional topographic maps.
  • Graph-based Grouping with Spatial Indices: Developing a robust graph segmentation approach for building group recognition. This involves creating constrained Delaunay triangulations (CDTs) to derive various spatial indices (e.g., SCI, distance, direction) that characterize adjacency relationships, and then modeling building blocks as graphs where nodes are buildings and edges represent proximity relationships constrained by function.
  • Improved Accuracy and Realism: The case study in Chengdu, China, shows that the proposed method achieves satisfactory results with a correctness value above 81.63%. Comparative studies reveal that incorporating building function information significantly improves grouping accuracy, especially when functionally distinct buildings are geometrically close.
  • Enhanced Map Generalization: The generalization results derived from the proposed method are more consistent with maps for daily use, providing users with more accurate spatial divisions of urban buildings. This demonstrates the practical utility and real-world impact of integrating semantic information.
  • Addressing Under-segmentation: The comparative analysis highlights that methods without functional constraints tend to under-segment (group dissimilar buildings together) when buildings with different functions are close, a problem effectively mitigated by the proposed method.

3. Prerequisite Knowledge & Related Work

3.1. Foundational Concepts

To fully understand this paper, a reader should be familiar with several fundamental concepts from geographic information science, data mining, and graph theory.

  • Building Group Patterns: In cartography, this refers to the characteristic spatial arrangement or distribution of buildings. It's often determined by attributes like the size, distance, and orientation of individual buildings. Recognizing these patterns helps in understanding urban structures and simplifying maps.

  • Map Generalization: This is a crucial process in cartography that involves simplifying the representation of geographic features (like buildings, roads, rivers) when creating maps at smaller scales from larger-scale data. The goal is to maintain legibility, highlight important features, and reduce clutter while preserving essential spatial relationships. For example, a cluster of individual buildings might be generalized into a single block at a very small scale. Recognizing building groups is a prerequisite for effective map generalization, as it helps identify features that should be treated as a single entity during simplification.

  • Delaunay Triangulation (DT): A fundamental geometric construction. For a set of points in a plane, a Delaunay triangulation is a triangulation such that no point in the set is inside the circumcircle of any triangle in the triangulation. It maximizes the minimum angle of all triangles, avoiding "skinny" triangles, which makes it useful for various spatial analyses, including proximity and adjacency detection.

    • Constrained Delaunay Triangulation (CDT): An extension of Delaunay triangulation where certain edges are forced to be part of the triangulation. These "constraints" (e.g., building boundaries, road networks) ensure that specific predefined lines or polygons are preserved in the triangulation, which is essential for modeling real-world spatial relationships more accurately. In this paper, CDTs are used to model the adjacency between buildings and derive various spatial indices.
  • Dynamic Time Warping (DTW): An algorithm used to measure the similarity between two temporal sequences that may vary in speed. For instance, if one person says "hello" slowly and another says it quickly, DTW can align the two speech patterns and calculate their similarity. It finds an optimal alignment between two time series by "warping" one or both of them non-linearly in the time dimension. The minimum distance achieved through this warping indicates their similarity. In this paper, DTW is applied to compare user density time series of buildings to infer their functions.

  • Points of Interest (POIs): Specific locations that people might find useful or interesting. Examples include restaurants, shops, parks, schools, hospitals, and landmarks. POIs often come with categorical information (e.g., "restaurant," "supermarket") that can be used as ground truth or complementary data for inferring the function of nearby buildings.

  • Tencent User Density Data: This refers to real-time user density data collected from users of Tencent products (e.g., WeChat, Tencent QQ, Tencent Maps). These products track user locations, and aggregated, anonymized data can provide insights into population distribution and activity patterns over time. High user density during business hours might indicate an office or commercial building, while high density during evenings and weekends might indicate a residential area.

  • Graph Segmentation: A process in graph theory where a graph is divided into several subgraphs (segments or clusters) based on certain criteria. In the context of building grouping, buildings are represented as nodes, and their spatial relationships (proximity, similarity) as edges. Graph segmentation algorithms then partition these nodes into groups, aiming to make nodes within a group more similar to each other than to nodes in other groups.

  • Spatial Continuity Index (SCI): A measure of how continuous or aligned two adjacent spatial features are. In the context of buildings, a high SCI might indicate that two buildings are part of the same linear arrangement or block, suggesting they should be grouped together. It often considers both distance and alignment.
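The empty-circumcircle condition in the Delaunay triangulation definition above can be checked with a standard determinant test. The sketch below is only an illustration of that definition in plain Python; the function names and the sample coordinates are hypothetical and do not come from the paper.

```python
def orientation(a, b, c):
    """Twice the signed area of triangle abc (positive if counterclockwise)."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def in_circumcircle(a, b, c, p):
    """Return True if point p lies strictly inside the circumcircle of abc."""
    ax, ay = a[0] - p[0], a[1] - p[1]
    bx, by = b[0] - p[0], b[1] - p[1]
    cx, cy = c[0] - p[0], c[1] - p[1]
    det = ((ax * ax + ay * ay) * (bx * cy - cx * by)
           - (bx * bx + by * by) * (ax * cy - cx * ay)
           + (cx * cx + cy * cy) * (ax * by - bx * ay))
    # The sign of det depends on the vertex order, so normalize by orientation.
    return det * orientation(a, b, c) > 0


def violates_delaunay(triangle, other_points):
    """True if any other point falls inside the triangle's circumcircle."""
    a, b, c = triangle
    return any(in_circumcircle(a, b, c, p) for p in other_points)


# Hypothetical sample: a right triangle and two candidate points.
tri = ((0.0, 0.0), (1.0, 0.0), (0.0, 1.0))
inside = in_circumcircle(*tri, (0.5, 0.5))    # point at the circumcenter
outside = in_circumcircle(*tri, (2.0, 2.0))   # point far from the triangle
```

A triangulation is Delaunay when `violates_delaunay` is false for every triangle; a constrained triangulation relaxes this test only where a forced edge blocks visibility.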

3.2. Previous Works

The paper categorizes previous works on building group pattern recognition into two main types:

  1. Clustering Methods: These methods typically model buildings and their proximities using graphs, often built upon constrained Delaunay triangulation (CDT).

    • Graph Representation: Nodes represent buildings, and edges represent proximity relationships. Edges are weighted with spatial similarity values.
    • Similarity Measurements:
      • Distance-based: Euclidean distances such as nearest distance [12], average distance [13], and visible distance [14] are common.
      • Geometric Attributes: Shape, area, and orientation of buildings are also used to measure similarity [15-18].
    • Techniques: Graph segmentation methods [16] are frequently employed to partition the graph into groups.
    • Machine Learning: More recently, machine learning methods like convolutional neural networks (CNNs) [21], random forest [22], and SVM [23] have also been applied.
  2. Template Matching Methods: These methods identify specific types of group patterns by defining template parameters and then matching potential groups against these templates.

    • Examples: Centerline alignment patterns [5], linear alignment patterns [19], and road alignment patterns [20].

Common Limitation of Previous Works: The paper critically points out that "the abovementioned methods only consider buildings' geometric information and do not address building semantics (i.e., building functions), leading to great differences between the grouping results and those derived manually." This is the core gap the current paper aims to fill.

Evolution of Building Function Inference: The paper notes a trend in leveraging geospatial big data (e.g., POIs, social media data, GPS, trajectory data) to infer urban structures and building functions [24].

  • Scale of Research: This research has typically focused on the building block or community level [25,26] or the individual building level [27,28].
  • Limitations for Mapping: The paper argues that building block-level inference is too coarse for users to understand building scenes accurately, while individual building-level inference is too granular. Furthermore, these studies often neglect geometric information, leading to a lack of building pattern information and making them unsuitable for daily map use or constraining building group recognition.

3.3. Technological Evolution

The evolution of building pattern recognition has moved from purely manual cartographic methods to increasingly automated approaches, driven by advancements in GIS, computational geometry, and machine learning. Initially, efforts focused on defining geometric similarity based on basic properties like distance and orientation. The introduction of Delaunay triangulation provided a robust way to model adjacency and proximity. Later, more sophisticated graph theory approaches and machine learning algorithms offered powerful tools for clustering and classification.

The most recent significant shift, highlighted by this paper, is the integration of semantic information. The proliferation of geospatial big data from sources like social media, mobile phone usage, and POIs has made it feasible to infer the function or purpose of buildings, adding a crucial layer of intelligence that was previously unavailable or difficult to acquire. This evolution reflects a broader trend in GIS from purely spatial data processing to geo-semantic understanding.

3.4. Differentiation Analysis

Compared to the main methods in related work, the core differences and innovations of this paper's approach are:

  • Integration of Function and Geometry: The most significant differentiation is the explicit integration of building function information with geometric characteristics. While previous methods relied heavily on geometric features (distance, shape, orientation) or explored building functions separately, this paper combines both at the building group recognition stage. This is a direct response to the identified gap where purely geometric approaches produce grouping results that differ greatly from manually derived groups or real-world functional divisions.
  • Leveraging Geospatial Big Data for Semantics: The paper innovatively uses Tencent user density data and POIs in conjunction with DTW to infer building functions. This provides a concrete, data-driven method to obtain the necessary semantic information, moving beyond general classifications to actual activity patterns.
  • Function as a Grouping Constraint: Unlike traditional clustering methods where function is often ignored or implicitly assumed, this method explicitly uses function as a constraint during graph creation and segmentation. Only buildings with the same inferred function can form an edge in the graph, thereby preventing functionally distinct but geometrically close buildings from being grouped together (a common issue in under-segmentation for purely geometric methods).
  • Enhanced Realism for Map Generalization: By producing building groups that reflect both spatial proximity and functional coherence, the resulting generalization outputs are more in line with daily map use and provide more accurate spatial divisions, which is a practical improvement over geometrically-driven generalizations that might misrepresent urban structures.

4. Methodology

4.1. Principles

The core idea of the method is to enhance the accuracy and realism of building group pattern recognition by integrating semantic information (building function) with traditional geometric information. The theoretical basis is that buildings with similar functions are more likely to form coherent groups, especially in the context of urban planning and map generalization, even if their geometric arrangement might be complex. The intuition behind this is that human perception of urban areas often involves functional zones (e.g., residential areas, commercial districts, office blocks), and a grouping method should reflect these inherent divisions. This approach leverages geospatial big data to infer these functional semantics, which are then used as a critical constraint in a graph-based clustering framework that also considers geometric proximity and alignment.

4.2. Core Methodology In-depth (Layer by Layer)

The proposed methodology involves a series of steps, as outlined in Table 1 and detailed below.

4.2.1. Infer Building Functions

The first part of the methodology focuses on inferring the function of each building.

Step 1: Map User Density Distributions

The raw Tencent user density (RTUD) dataset, which consists of points with user counts, is processed. Abnormal user counts (e.g., several times larger than neighboring points for similar buildings) are manually removed. User density distributions are then generated using the ArcMap kernel density tool for each two-hour interval over a workday (5 June 2020) and a non-work day (6 June 2020). This step transforms discrete point data into continuous density surfaces, representing the intensity of user activity across the study area at different times.
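As a rough illustration of how weighted points become a density surface, the sketch below evaluates a quartic (biweight) kernel at one location. This is an assumption for illustration: it does not claim to reproduce the ArcMap kernel density tool or its default search radius, and the point list, radius, and function name are hypothetical.

```python
import math


def quartic_kernel_density(points, x, y, radius):
    """Density at (x, y) from weighted points [(px, py, user_count), ...].

    Each point contributes only within `radius`, and its weight falls
    smoothly to zero at the edge of the search radius.
    """
    total = 0.0
    for px, py, count in points:
        d = math.hypot(px - x, py - y)
        if d < radius:
            total += (3.0 / math.pi) * count * (1.0 - (d / radius) ** 2) ** 2
    return total / radius ** 2


# Hypothetical RTUD-like sample: (x, y, user count) in projected metres.
rtud_points = [(0.0, 0.0, 10), (30.0, 40.0, 4)]
density_here = quartic_kernel_density(rtud_points, 0.0, 0.0, 100.0)
density_far = quartic_kernel_density(rtud_points, 500.0, 500.0, 100.0)
```

Evaluating the function on a regular grid, once per two-hour snapshot, would give one surface per time interval, which is the form the later steps rely on.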

Step 2: Select Building Samples and Compute Their Average User Density

After mapping the user density, building samples are selected for different building types (e.g., residential, office, commercial) within the study area. The functional information for these samples is acquired from Points of Interest (POIs) and street views in Baidu Maps, and verified using Google Earth and surveys. For each selected sample building, its average user density is computed across the specified time intervals using the following equation:

$$D_{k,w} = \frac{1}{m_k} \sum_{1}^{m_k} (N_{k,w}, t), \quad t \in [0, 24]$$

Where:

  • $D_{k,w}$ represents the average user density for the $k$-th type of function (e.g., residential, commercial, office).
  • $k$ denotes the specific type of function.
  • $m_k$ is the total number of sample buildings identified for the $k$-th function type.
  • $N_{k,w}$ denotes the user density of a specific sample building belonging to the $k$-th function type at a given time point.
  • $t$ represents the activity times of Tencent users, ranging from 0 to 24 hours.
  • The summation $\sum_{1}^{m_k} (N_{k,w}, t)$ adds the user density values of all $m_k$ samples for the $k$-th function type at each time point $t$. Dividing by $m_k$ gives an average user density time series curve for each building type, representing its typical activity pattern over a 24-hour cycle.
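The averaging in the equation above can be sketched as follows, assuming each sample building is represented by a list of densities sampled at the same time slots (for example, twelve two-hour intervals). The data layout, function name, and sample values are hypothetical.

```python
def average_density_by_type(samples):
    """Average the density series of all sample buildings per function type.

    samples: {function_type: [series, ...]}, where every series is a list
    of user densities with one value per time slot.
    Returns {function_type: averaged series}.
    """
    templates = {}
    for function_type, series_list in samples.items():
        m_k = len(series_list)            # number of samples of this type
        n_slots = len(series_list[0])     # number of time slots
        templates[function_type] = [
            sum(series[t] for series in series_list) / m_k
            for t in range(n_slots)
        ]
    return templates


# Hypothetical sample buildings with three time slots each.
samples = {
    "residential": [[1.0, 2.0, 3.0], [3.0, 4.0, 5.0]],
    "office": [[0.0, 10.0, 0.0]],
}
templates = average_density_by_type(samples)
```

Each resulting list is one reference curve per function type, computed separately for the workday and the non-work day if both are used.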

Step 3: Infer Building Functions Using the Dynamic Time Warping (DTW) Algorithm

This step uses the Dynamic Time Warping (DTW) algorithm to infer the function of every other building (the "predicted building") by comparing its user density time series to the average user density time series of the known sample building types (derived in Step 2).

  • Standard Reference Template ($R$): The average user density sequence of each known building type (e.g., residential, commercial, office) serves as a standard reference template $R$. This is an $M$-dimensional vector: $R = \{R(1), R(2), \dots, R(m), \dots, R(M)\}$. Each component $R(m)$ represents the average user density value at a specific time point $m$.
  • Test Template ($T$): The average user density sequence of each predicted building (whose function is unknown) serves as a test template $T$. This may be an $N$-dimensional vector: $T = \{T(1), T(2), \dots, T(n), \dots, T(N)\}$.
  • DTW Application: The DTW algorithm is utilized to compare the time series of each predicted building ($T$) with the reference sequence of every sample type ($R$). DTW calculates a minimum distance by finding an optimal alignment between the two sequences, even if they are shifted or stretched in time.
  • Function Determination: The function of a predicted building is determined by the sample type (from $R$) that yields the minimum DTW distance when compared with the predicted building's time series ($T$). This implies that the predicted building's activity pattern is most similar to that specific sample type.
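The matching step described above can be sketched with the textbook dynamic-programming form of DTW. This is a minimal illustration, assuming an absolute-difference local cost and no warping window; the paper's exact DTW settings are not stated here, and the function names and sample curves are hypothetical.

```python
def dtw_distance(test, reference):
    """Classic DTW distance between two sequences of numbers."""
    n, m = len(test), len(reference)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost aligning test[:i] with reference[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(test[i - 1] - reference[j - 1])
            cost[i][j] = local + min(cost[i - 1][j],      # repeat reference point
                                     cost[i][j - 1],      # repeat test point
                                     cost[i - 1][j - 1])  # advance both
    return cost[n][m]


def infer_function(test_series, templates):
    """Return the function type whose template has the smallest DTW distance."""
    return min(templates, key=lambda k: dtw_distance(test_series, templates[k]))


# Hypothetical reference curves: evening-heavy vs. midday-heavy activity.
templates = {
    "residential": [5, 2, 1, 1, 2, 6],
    "office": [1, 3, 6, 6, 3, 1],
}
predicted = infer_function([1, 1, 3, 6, 6, 2], templates)
```

Because DTW tolerates shifts in time, a building whose midday peak arrives one slot later than the office template is still matched to the office type.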

4.2.2. Recognition of Building Groups

The second part of the methodology focuses on identifying building groups based on both geometric and functional information.

Step 4: Create Constrained Delaunay Triangulation for Each Building Block

To improve computational efficiency, the topographical map of buildings is first partitioned into several building blocks using the road network. This is because comparing every pair of buildings for proximity relationships in a large dataset is computationally expensive. Each building block then becomes an individual treatment unit.

For each building block, two types of constrained Delaunay triangulations (CDT) are created:

  1. CDT for all buildings within each block: This CDT is computed for all building polygons within an individual block. This initial triangulation is primarily used to detect adjacency relationships among buildings.

  2. Paired building triangles: For any two buildings found to be adjacent based on the first CDT, a second type of CDT is computed specifically for that pair of adjacent buildings. These paired building triangles are then used to derive detailed spatial indices (described in Step 5).

Before creating any CDT, extra points are added to the line segments of building polygons and roads at regular intervals. This preprocessing step helps avoid producing narrow triangles, which can lead to instability or inaccuracies in subsequent calculations.
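The densification step can be sketched as below, assuming a building outline is given as a closed ring of vertices without the repeated closing vertex. The spacing value and function name are hypothetical; the paper does not state the interval it uses.

```python
import math


def densify_ring(ring, spacing):
    """Insert extra vertices so that no ring edge is longer than `spacing`.

    ring: list of (x, y) vertices of a closed polygon, first vertex not repeated.
    Returns a new vertex list in the same order.
    """
    densified = []
    n = len(ring)
    for i in range(n):
        (x0, y0), (x1, y1) = ring[i], ring[(i + 1) % n]
        length = math.hypot(x1 - x0, y1 - y0)
        pieces = max(1, math.ceil(length / spacing))
        # Emit the start vertex and the interior points of this edge;
        # the end vertex is emitted as the start of the next edge.
        for k in range(pieces):
            f = k / pieces
            densified.append((x0 + f * (x1 - x0), y0 + f * (y1 - y0)))
    return densified


# Hypothetical 10 m square footprint, densified to at most 5 m per segment.
square = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
dense_square = densify_ring(square, 5.0)
```

The densified vertices would then be passed, together with the polygon edges as constraints, to whichever CDT implementation is used.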

Step 5: Compute Index Values Based on Constrained Delaunay Triangulation

Once the CDTs are created, several spatial indices are computed for adjacent buildings. These indices quantify different aspects of their spatial relationship and are stored in matrices (except for path angles).

  1. Proximity Relationship Matrix ($R$): This Boolean matrix indicates whether two buildings are topologically adjacent: $R = R_{i,j}$, where:

    • $i = 1:n$ and $j = 1:n$ denote the buildings within a block.
    • $R_{i,j}$ is a Boolean variable.
    • $R_{i,j} = 1$ if building $i$ and building $j$ are adjacent (i.e., they share common triangles in the CDT computed for all buildings within the block).
    • $R_{i,j} = 0$ if building $i$ and building $j$ are not adjacent.
  2. Length of the Skeleton Line ($L$): The skeleton line between two adjacent buildings is formed by connecting the middle points of the sides of triangles that link the two buildings in the paired building triangles CDT: $L = L_{i,j} = \sum l_{i,j,k}$, where:

    • $L_{i,j}$ denotes the total length of the skeleton line between adjacent buildings $i$ and $j$.
    • $l_{i,j,k}$ denotes the distance between the two middle points of the sides of triangle $k$ that link two adjacent buildings $i$ and $j$.
    • If buildings $i$ and $j$ are not adjacent (as per $R_{i,j} = 0$), then $L_{i,j} = 0$.
  3. Mean Distance of Adjacent Buildings ($D$): This metric represents the average distance between two adjacent buildings, weighted by the skeleton line segments: $D = D_{i,j} = \frac{\sum h_{i,j,k} \times l_{i,j,k}}{\sum l_{i,j,k}}$, where:

    • $D_{i,j}$ denotes the mean distance between adjacent buildings $i$ and $j$.
    • $h_{i,j,k}$ denotes the height of triangle $k$ (from the paired building triangles CDT) with its base falling within either adjacent building polygon.
      • If the triangle is acute or right-angled, $h_{i,j,k}$ is the height from the side shared with the buildings.
      • If the triangle is obtuse, $h_{i,j,k}$ is defined as the shortest side of the triangle that links the two buildings.
    • $l_{i,j,k}$ is the distance between the two middle points of the sides of triangle $k$ that link two adjacent buildings, as obtained from Equation (3).
    • If buildings $i$ and $j$ are not adjacent (as per $R_{i,j} = 0$), then $D_{i,j} = \infty$ (infinity).
  4. Spatial Continuity Index (SCI): This index quantifies the spatial continuity or alignment between two adjacent buildings. It is calculated as the ratio of the skeleton line length to their mean distance: $SCI = SCI_{i,j} = \frac{L_{i,j}}{D_{i,j}}$, where:

    • $SCI_{i,j}$ is the spatial continuity between buildings $i$ and $j$.
    • $L_{i,j}$ is the length of the skeleton line between buildings $i$ and $j$, as described in Equation (3).
    • $D_{i,j}$ is the mean distance between buildings $i$ and $j$, as described in Equation (4). A higher SCI value generally indicates better spatial continuity.
  5. Azimuth Angles of Adjacent Buildings: These angles quantify the relative orientation of two adjacent buildings. The calculation involves three steps based on the paired building triangles CDT (referencing [15]):

    • First, the azimuth angles of all individual triangles linking the two buildings are determined.
    • Second, the mean azimuth angle of the two adjacent buildings is computed from these individual triangle angles.
    • Third, a final azimuth angle representing the general orientation between the two buildings is derived. This angle is important for identifying linear patterns.
  6. Path Angle ($\theta$): This angle is calculated for a middle building $i$ within a potential linear sequence of buildings (e.g., building $i-1$, building $i$, building $i+1$). It measures the angle formed by the direction vector from building $i-1$ to building $i$ and the direction vector from building $i$ to building $i+1$: $\theta = \theta_{i,j} = \frac{\sum \alpha_{i,j,k} \times l_{i,j,k}}{\sum l_{i,j,k}}$, where:

    • $\theta_{i,j}$ represents the path angle at building $i$ in relation to its neighbors $j$.
    • $\alpha_{i,j,k}$ denotes the azimuth angle of a triangle $k$ with its base falling within either adjacent building. This refers to the local orientation derived from the CDT for a specific triangle linking buildings $i$ and $j$.
    • $l_{i,j,k}$ denotes the distance between the two middle points of the two sides of triangle $k$ that link two adjacent buildings, as obtained from Equation (3). This essentially weights the azimuth angles by the length of the skeleton line segments.
    • A path angle closer to 0 indicates a stronger linear pattern. This index is used during the tracing process for graph segmentation.
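The index formulas above reduce to a few weighted sums once the per-triangle quantities are known. The sketch below assumes those quantities (skeleton segment length $l$, height $h$, azimuth $\alpha$) have already been extracted from the paired building triangles; the tuple layout, function name, and sample numbers are hypothetical, and the plain weighted mean of angles ignores wrap-around at 360 degrees.

```python
import math


def pair_indices(triangles):
    """Compute L, D, SCI and the length-weighted angle for one building pair.

    triangles: list of (l, h, alpha) tuples, one per triangle linking the
    two buildings. An empty list is treated as a non-adjacent pair.
    """
    L = sum(l for l, _, _ in triangles)
    if L == 0:
        # Non-adjacent pair: L = 0 and D = infinity, as in the definitions;
        # SCI = 0 then follows from L / D.
        return {"L": 0.0, "D": math.inf, "SCI": 0.0, "theta": None}
    D = sum(h * l for l, h, _ in triangles) / L
    theta = sum(alpha * l for l, _, alpha in triangles) / L
    # Assumption: buildings that touch (D == 0) get infinite continuity.
    sci = L / D if D > 0 else math.inf
    return {"L": L, "D": D, "SCI": sci, "theta": theta}


# Hypothetical pair linked by two triangles: (l, h, alpha in degrees).
indices = pair_indices([(2.0, 4.0, 30.0), (2.0, 6.0, 50.0)])
```

Running this for every adjacent pair in a block fills the $L$, $D$, and SCI matrices entry by entry; long facing walls at a short gap produce a large SCI.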

Step 6: Graph Creation and Segmentation

This is the final step for extracting building groups.

  • Graph Creation: Each building block is modeled as a graph.

    • Nodes: Represent individual buildings.
    • Edges: Represent adjacent relationships between two buildings. A crucial constraint here is that an edge is only created if the two adjacent buildings have the same inferred function (from Step 3). This integrates the semantic information.
    • Edge Weights: The edges are weighted using the index values derived in Step 5 (e.g., SCI, distance, azimuth angle).
  • Graph Segmentation: A graph segmentation approach is proposed to extract building groups from the created graph. The process involves:

    1. Linear Pattern Identification: Linear patterns are identified first due to their high homogeneity (i.e., buildings are aligned and similarly spaced). The method proposed in [15] is applied here, likely using the SCI and azimuth angle indices.
    2. Edge Removal based on Distance Homogeneity: After initial linear patterns are identified, edges are removed if they connect buildings that are too far apart or exhibit significant variations in distance within a potential group.
      • The process starts by finding the pair of nodes (buildings) connected by an edge with the smallest distance weight.
      • Then, all neighboring nodes of these two buildings are identified.
      • The standard deviation of distances among these neighboring nodes (within the potential group) is calculated.
      • If this standard deviation exceeds a given threshold (e.g., 0.2), the edge with the maximum distance among the neighbors is deleted.
      • The standard deviation is then recalculated to check if further deletions are needed.
      • This process continues iteratively.
      • After completing this for the first pair, the next pair of nodes with the second shortest distance is found, and the same operation is performed.
      • Edges that do not meet the distance homogeneity requirements are ultimately deleted, leading to the final segmentation of the graph into distinct building groups.
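The graph creation rule and the distance-homogeneity pruning can be sketched as follows. This is a simplified reading of the steps above, not the authors' implementation: the linear-pattern stage from [15] is omitted, the "neighboring" edges of a seed are taken to be all edges touching either endpoint, and the building labels, distances, and threshold are hypothetical toy values.

```python
from statistics import pstdev


def build_edges(adjacent, function_of):
    """Keep an edge only if both adjacent buildings share the same function."""
    return {frozenset(pair): dist for pair, dist in adjacent.items()
            if function_of[pair[0]] == function_of[pair[1]]}


def prune_edges(edges, threshold):
    """Delete the longest local edge until local distances are homogeneous."""
    edges = dict(edges)
    for seed in sorted(edges, key=edges.get):        # shortest edge first
        while seed in edges:
            local = [e for e in edges if e & seed]   # edges touching the seed
            if len(local) < 2:
                break
            if pstdev([edges[e] for e in local]) <= threshold:
                break
            del edges[max(local, key=edges.get)]
    return edges


def groups(nodes, edges):
    """Connected components of the pruned graph, as sorted lists."""
    parent = {n: n for n in nodes}

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for edge in edges:
        a, b = tuple(edge)
        parent[find(a)] = find(b)
    components = {}
    for n in nodes:
        components.setdefault(find(n), set()).add(n)
    return sorted(sorted(c) for c in components.values())


# Toy block: A, B, C, E residential; D is an office right next to C.
adjacent = {("A", "B"): 10.0, ("B", "C"): 10.1, ("C", "D"): 5.0, ("C", "E"): 40.0}
function_of = {"A": "res", "B": "res", "C": "res", "D": "office", "E": "res"}
with_function = groups(function_of,
                       prune_edges(build_edges(adjacent, function_of), 0.2))
same_function = dict.fromkeys(function_of, "any")   # baseline: function ignored
without_function = groups(function_of,
                          prune_edges(build_edges(adjacent, same_function), 0.2))
```

On this toy block the function constraint keeps the residential row A, B, C together and isolates the office D, while the function-blind variant pairs C with D because they are the closest buildings. The numbers are invented to show the mechanism and say nothing about the paper's results.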

4.2.3. Assessment

Step 7: Accuracy Assessment with Reference Data

To evaluate the effectiveness of the proposed approach, an expert evaluation is conducted.

  • Building Function Assessment: The results of building functions detected using the DTW algorithm (from Step 3) are compared with Baidu Map POIs.
  • Building Group Recognition Assessment: The reference data for building group recognition is identified manually by experts based on Baidu Maps and Google Earth images.
  • Cases for Group Assessment: Four different cases are considered when assessing building group recognition results:
    • Correct patterns: Modeled patterns are consistent with reference patterns.
    • Inclusion patterns: One modeled pattern contains multiple reference patterns (suggests under-segmentation).
    • Within patterns: One reference pattern contains multiple modeled patterns (suggests over-segmentation).
    • Overlap patterns: A modeled pattern partially overlaps a reference pattern.
  • Metrics: Two metrics are used to assess accuracy:
    • Correctness: The ratio of correct patterns to the total extracted patterns.
    • Completeness: The ratio of correct patterns to the total reference patterns.
  • Comparative Study: To understand the robustness and contribution of building function information, the proposed method is compared against a standard CTD method (baseline). This baseline method follows Steps 4 to 6 but without considering building functions. Specifically, in its graph creation step, edges are created between adjacent buildings regardless of their function.
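
The four assessment cases can be made concrete by treating each group as a set of building IDs. The sketch below is an illustrative reading of the definitions above, not the authors' evaluation code: the function name, the set-based tests, and the extra "unmatched" label are assumptions.

```python
def classify_group(modeled, reference_groups):
    """Label one modeled group against the reference groups (hypothetical helper)."""
    modeled = frozenset(modeled)
    refs = [frozenset(r) for r in reference_groups]
    if modeled in refs:
        return "correct"  # identical to a reference pattern
    if sum(1 for r in refs if r <= modeled) >= 2:
        return "inclusion"  # swallows several reference patterns (under-segmentation)
    if any(modeled < r for r in refs):
        return "within"  # proper part of one reference pattern (over-segmentation)
    if any(modeled & r for r in refs):
        return "overlap"  # shares buildings with a reference pattern only partially
    return "unmatched"  # assumption: label for groups that touch no reference pattern


reference = [{1, 2, 3}, {4, 5}, {6, 7, 8}]
labels = [
    classify_group({1, 2, 3}, reference),
    classify_group({1, 2, 3, 4, 5}, reference),
    classify_group({6, 7}, reference),
    classify_group({3, 4}, reference),
]
```

Only groups labeled "correct" count toward the two metrics.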

5. Experimental Setup

5.1. Datasets

The case study area is located in Chengdu, China, which is the largest city in southwestern China.

  • Building Footprints: Provided by the Province Urban Planning and Design Survey Research Institutes of Sichuan Province, China.

    • Format: ESRI shapefile.
    • Scale: Includes 546 buildings.
    • Characteristics: These buildings are partitioned into several blocks by road networks and are typically grouped into district units on electronic maps.
  • Road Networks: Also provided by the Province Urban Planning and Design Survey Research Institutes of Sichuan Province, China. An additional road network dataset downloaded from AutoNavi map was used for discussing generalization results.

  • Real-time Tencent User Density (RTUD):

    • Source: Captured from the Tencent website (http://ur.tencent.com) using a web crawler, recorded every two hours.
    • Content: Records location information of smart terminal devices using Tencent products (QQ, WeChat, Tencent Maps, other LBS mobile applications). Each point contains a user count.
    • Temporal Coverage: Data from a workday (5 June 2020) and a non-work day (6 June 2020) were selected to capture different activity patterns.
  • Points of Interest (POIs): Used in conjunction with Tencent user density data to infer building functions and as a reference for building function accuracy assessment.

  • Baidu Maps and Google Earth Images: Used for ground truthing, verifying experimental results, and manually identifying reference data for building group recognition.

    These datasets were chosen because they provide both the necessary geometric information (building footprints, road networks) and semantic information (Tencent user density, POIs, manual verification) needed to implement and validate the proposed method that combines both aspects. The Tencent user density data is particularly effective for inferring building functions due to its ability to capture dynamic population activity patterns.

The following figure (Figure 1 from the original paper) shows the location of the study area and the experimental data.

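Figure 1. Location of the study area and the experimental data are shown for (a) Sichuan Province, China, (b) the capital city of Sichuan Province, and (c) the experimental data. The image is a schematic from the paper showing the location of the study area and the experimental data, including (a) a map of Sichuan Province, China, (b) a map of Chengdu, the capital of Sichuan Province, and (c) the detailed distribution of the building test data in Chengdu.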

5.2. Evaluation Metrics

The paper uses two primary metrics to assess the accuracy of building pattern recognition: Correctness and Completeness. These metrics are widely used in pattern recognition research [22,32].

5.2.1. Correctness

  • Conceptual Definition: Correctness measures the proportion of the extracted (modeled) building groups that are truly correct when compared to the reference (ground truth) patterns. It indicates how reliable the detected groups are. A high correctness value means that most of the groups identified by the method are valid.
  • Mathematical Formula: $\text{Correctness} = \frac{\text{Number of Correct Patterns}}{\text{Total Number of Modeled Groups}} \times 100\%$
  • Symbol Explanation:
    • Number of Correct Patterns: The count of building groups identified by the proposed method that are consistent with the reference patterns.
    • Total Number of Modeled Groups: The total count of building groups extracted by the proposed method.

5.2.2. Completeness

  • Conceptual Definition: Completeness measures the proportion of the reference (ground truth) building groups that were successfully identified by the proposed method. It indicates how many of the actual groups in the study area the method managed to capture. A high completeness value means that the method did not miss many real groups.
  • Mathematical Formula: $\text{Completeness} = \frac{\text{Number of Correct Patterns}}{\text{Total Number of Reference Groups}} \times 100\%$
  • Symbol Explanation:
    • Number of Correct Patterns: The count of building groups identified by the proposed method that are consistent with the reference patterns.
    • Total Number of Reference Groups: The total count of building groups present in the manually identified ground truth data.
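
As a quick check of the two formulas, the snippet below evaluates them on the overall totals that Table 2 of the paper reports for the proposed method (80 correct groups, 98 modeled groups, 94 reference groups). The function names are illustrative only.

```python
def correctness(n_correct, n_modeled):
    """Share of modeled groups that match a reference group, in percent."""
    return 100.0 * n_correct / n_modeled


def completeness(n_correct, n_reference):
    """Share of reference groups that were recovered, in percent."""
    return 100.0 * n_correct / n_reference


overall_correctness = round(correctness(80, 98), 2)
overall_completeness = round(completeness(80, 94), 2)
```

Rounded to two decimals this gives 81.63% and 85.11%. The paper lists 85.10% for completeness, which is consistent with truncating rather than rounding the second decimal.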

5.3. Baselines

The primary baseline model used for comparison is a "standard CTD method" without building function information.

  • Description: This method essentially follows the geometric grouping steps (Steps 4 to 6) of the proposed methodology but explicitly omits the integration of building function information.
  • Key Difference: In the graph creation step (part of Step 6), the standard CTD method creates edges between adjacent buildings regardless of their function. This means it does not use function as a constraint for forming potential groups.
  • Purpose: The comparison with this baseline is designed to clearly demonstrate the robustness and superiority of the proposed method, particularly in scenarios where buildings with different functions are geographically close, leading to under-segmentation issues in purely geometric approaches. It helps quantify the contribution of semantic information to the grouping accuracy.
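
The key difference can be illustrated with a toy graph. This is a minimal sketch under stated assumptions: the adjacency pairs stand in for neighbors derived from the triangulation, and `build_edges`, `groups`, and the `use_function` flag are hypothetical names, not part of the paper.

```python
def build_edges(adjacent_pairs, function_of, use_function=True):
    """Keep an edge only if both buildings share a function (proposed method).

    With use_function=False every adjacent pair is kept, as in the baseline.
    """
    return [
        (a, b)
        for a, b in adjacent_pairs
        if not use_function or function_of[a] == function_of[b]
    ]


def groups(nodes, edges):
    """Connected components via a small union-find."""
    parent = {n: n for n in nodes}

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]  # path halving
            n = parent[n]
        return n

    for a, b in edges:
        parent[find(a)] = find(b)
    out = {}
    for n in nodes:
        out.setdefault(find(n), set()).add(n)
    return sorted(out.values(), key=min)


function_of = {1: "residential", 2: "residential", 3: "office"}
adjacent = [(1, 2), (2, 3)]  # building 3 sits close to a residential pair

proposed = groups(function_of, build_edges(adjacent, function_of, True))
baseline = groups(function_of, build_edges(adjacent, function_of, False))
```

Here the function-aware graph separates the office building from the residential pair, while the baseline merges all three into one group, which is the under-segmentation case the comparison targets.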

Implementation Details:

  • Hardware: Personal computer with an Intel (R) Core (TM) i7-7700 CPU and 8 GB of memory.
  • Software: All algorithms were implemented using C# on Microsoft Windows 10 (x64).
  • Libraries: Component libraries and tool libraries of ArcGIS Engine 10.1 were used for development.

6. Results & Analysis

6.1. Core Results Analysis

The experimental results demonstrate the effectiveness of the proposed method in recognizing building group patterns by integrating building function and geometric information, especially when compared to a purely geometric-based approach.

The following figure (Figure 5 from the original paper) compares the building group recognition results of the proposed method, the standard CTD method, and the reference map.

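Figure 6. The mapping results of Tencent user density for different times in our study area. The redder the color, the higher the density of Tencent users. Hour/date: 7/5 represents 7 o'clock on 5 Ju… The image is a chart showing the distribution of Tencent user density in the study area at different times; colors from green to red indicate density from low to high, blue marks building outlines, and time labels such as 7/5 denote 7:00 on 5 June.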

Visually, Figure 5 shows that the proposed method (a) produces building groups that appear more coherent and aligned with functional zones, particularly in residential areas. Buildings are reasonably grouped, and residential buildings are distinctly separated from other functional types. In contrast, the standard CTD method (b) exhibits significant deviations from the reference patterns (c). Many buildings of different functional types are grouped together because of their spatial proximity, leading to under-segmentation. The blue hatched patterns indicating correctly grouped buildings are much more prevalent in the proposed method's output.

6.2. Data Presentation (Tables)

The following are the results from Table 2 of the original paper:

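Proposed method:

| Block ID | 0 | 1 | 2 | 3 | 4 | 5 | 6 | Sum |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Number of reference groups | 26 | 15 | 27 | 7 | 9 | 4 | 6 | 94 |
| Number of modeled groups | 25 | 17 | 29 | 7 | 9 | 5 | 6 | 98 |
| Number of correct groups | 22 | 14 | 23 | 7 | 5 | 3 | 6 | 80 |
| Correctness (%) | 88.00 | 82.35 | 79.31 | 100 | 55.56 | 60 | 100 | 81.63 |
| Completeness (%) | 84.62 | 93.33 | 85.18 | 100 | 55.56 | 75.00 | 100 | 85.10 |

Standard CTD method:

| Block ID | 0 | 1 | 2 | 3 | 4 | 5 | 6 | Sum |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Number of reference groups | 26 | 15 | 27 | 7 | 9 | 4 | 6 | 94 |
| Number of modeled groups | 17 | 15 | 19 | 5 | 9 | 5 | 6 | 76 |
| Number of correct groups | 10 | 9 | 10 | 3 | 4 | 3 | 6 | 45 |
| Correctness (%) | 58.82 | 60.00 | 52.63 | 60.00 | 44.44 | 60.00 | 100 | 59.21 |
| Completeness (%) | 38.46 | 60.00 | 37.03 | 42.85 | 44.44 | 75.00 | 100 | 47.87 |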

Analysis of Table 2:

  • Overall Performance: The proposed method achieves an overall correctness of 81.63% and completeness of 85.10%. This indicates a strong agreement with the reference data.
  • Comparison with Standard CTD: The standard CTD method (without functional information) performs significantly worse, with an overall correctness of 59.21% and completeness of 47.87%. This clearly validates the importance of integrating building function information.
  • Block-specific Performance (Proposed Method):
    • High accuracy in blocks 0, 1, 3, and 6 (correctness and completeness all above 82%, with blocks 3 and 6 reaching 100% on both). Block 2 is close behind, with 79.31% correctness and 85.18% completeness.
    • Poorer performance in blocks 4 and 5 (correctness 55.56% and 60%, completeness 55.56% and 75%). The authors attribute these errors primarily to over-segmentation, where the distance between buildings within the same group was larger than that between groups, suggesting that distance alone is not always the dominant factor in grouping, or that parameter calibration for distance thresholds might need refinement in certain contexts.
  • Block-specific Performance (Standard CTD Method): This method performs poorly across most blocks, except for block 6 where it achieves 100% correctness and completeness (likely a simple, isolated block). The low scores indicate a tendency for under-segmentation, grouping buildings with different functions together due to proximity.

6.3. Building Function Recognition Results

The paper first details the process and results of building function inference, which is a prerequisite for the grouping method.

The following figure (Figure 6 from the original paper) shows the mapping results of Tencent user density for different times in the study area.

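Figure 7. The temporal changes in average user density over time. The image is a chart showing how user density changes over time for different building types, including the average density for residential, commercial, and office buildings.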

Figure 6 illustrates the dynamic nature of Tencent user density over different times and days (workday vs. non-work day). The varying redness (higher density) across the maps visually confirms that population activities shift spatially over time, providing strong evidence that building functions can indeed be inferred from these temporal patterns.

The following figure (Figure 7 from the original paper) shows the temporal changes in average user density over time for different building types.

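Figure 8. Results of building function recognition are shown for (a) DTW method, and (b) reference data. The image is the comparison diagram from Figure 8 of the paper, showing the distribution of building functions recognized by (a) the DTW method and (b) the reference data, with different colors distinguishing residential, commercial, and office buildings and with building blocks and their IDs labeled.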

Figure 7 presents the temporal changes in average Tencent user density for residential, commercial, and office building types.

  • Commercial Buildings: Show obvious periodic oscillations, with high activity during the day and almost no users late at night. This distinct pattern makes commercial building recognition highly accurate.
  • Residential Buildings: Exhibit slight differences between weekdays and rest days, generally showing higher activity in evenings and weekends.
  • Office Buildings: Display considerable variation between the two days, with high activity on workdays and significantly less on non-work days. These distinct temporal curves are the basis for the DTW algorithm to differentiate building functions.
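
To show how such curves can be told apart, the sketch below implements a textbook dynamic time warping distance and assigns a building to the function whose template curve is nearest. It is an illustration under stated assumptions: the nearest-template rule, the absolute-difference cost, and all numeric values are made up for the example and are not taken from the paper.

```python
def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) DTW with absolute difference as local cost."""
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible warping steps.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def classify(series, templates):
    """Return the label of the template with the smallest DTW distance."""
    return min(templates, key=lambda label: dtw_distance(series, templates[label]))


# Hypothetical template curves: busy by day vs. busy in the evening and at night.
templates = {
    "commercial": [0, 0, 1, 5, 5, 1, 0],
    "residential": [3, 3, 1, 1, 1, 3, 3],
}
# A daytime peak that starts earlier and lasts longer than the commercial template.
observed = [0, 1, 5, 5, 5, 1, 0, 0]
label = classify(observed, templates)
```

Because DTW allows the time axis to stretch, the shifted daytime peak still matches the commercial template at zero cost, whereas a point-by-point comparison would penalize the shift.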

The following figure (Figure 8 from the original paper) shows the results of building function recognition using the DTW method compared to reference data.

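Figure 9. The remaining triangles during the segmentation procedure (step 6) are shown for (a) the proposed method, and (b) the standard CTD method. Buildings within the red outlines indicate misclas… The image is the schematic of Figure 9 in the paper, showing the distribution of the remaining triangles in segmentation step 6 for (a) the proposed method and (b) the standard CTD method. Building groups outlined in red mark misclassified areas, and the insets give enlarged comparisons of the two methods.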

Figure 8 shows the building function recognition results. The DTW method achieved an overall recognition accuracy of 87.91%.

  • Accuracy by Type:
    • Residential: 91.70% accuracy.
    • Commercial: 96.77% accuracy (highest).
    • Office: 47.82% accuracy (lowest).
  • Reasons for Accuracy:
    • The high accuracy in commercial building recognition is attributed to their more obvious user density curves (as seen in Figure 7).
    • The lowest accuracy in office building recognition is because their activity curves are often very similar to residential buildings, leading to misclassifications.
    • Another factor for office building misclassification is that some residential buildings in blocks 0 and 2 were actually being used as office buildings, making it genuinely difficult to distinguish them even for experts. The COVID-19 pandemic, leading to more work-from-home scenarios, could have further blurred these patterns.

6.4. Discussion of Grouping Results with Functional Information

Returning to Figure 5 and considering the function recognition results:

  • Commercial Building Grouping: The proposed method produces only one erroneous commercial group, whereas the standard CTD method groups three commercial groups incorrectly.

  • Office Building Grouping: The proposed method correctly groups all office buildings. The standard CTD method, however, incorrectly groups office and residential buildings together, highlighting its inability to distinguish functions based solely on geometry.

  • Misclassified Residential Buildings: Neither method correctly groups residential buildings that were functionally misclassified as office buildings. This indicates that the accuracy of the function inference step directly impacts the grouping accuracy.

    The following figure (Figure 9 from the original paper) illustrates the remaining triangles during the segmentation procedure (step 6) for both methods.

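    Figure 10. The generalized results based on building group patterns recognized using the two methods are shown for (a) the proposed method, and (b) the standard CTD method. Incorrect results marked w… The image is a chart comparing the generalization results of the two building group pattern recognition methods in Figure 10. The left panel shows the proposed method and the right panel the standard CTD method; incorrect results circled in different colors correspond to the inclusion, within, and overlap patterns.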

Figure 9 provides insights into the graph segmentation process. Green triangles indicate retained proximity relationships.

  • Proposed Method (a):
    • The figure shows two main reasons for over-segmentation errors:
      1. Great variation in distance between buildings within the same intended group (Figure 9a(A)). Some internal distances are larger than distances between groups (Figure 9a(B,C)).
      2. Abnormal distances can lead to a low continuity value (SCI) (Figure 9a(A)), causing incorrect segmentation.
  • Standard CTD Method (b):
    • This method suffers from more errors, primarily under-segmentation, because it lacks the functional constraint.
    • When buildings with different functions are close to each other, the standard CTD method groups all of them together (Figure 9b(B,C)).
    • Without semantic information, it struggles to logically divide buildings, such as school buildings, into functionally distinct groups (Figure 9b(A)).

6.5. Impact on Map Generalization

Map generalization is a key application. The following figure (Figure 10 from the original paper) presents the generalization results derived from the building groups recognized by both methods, superimposed on the AutoNavi map road network.

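Figure 2. Examples of constrained Delaunay triangulation are shown for (a) triangulation computed for all buildings within each individual block, and (b) triangulation computed for pairs of adjacent… The image is the schematic of Figure 2, showing two examples of constrained Delaunay triangulation: (a) triangulation of all buildings within each individual block and (b) triangulation for pairs of adjacent buildings; buildings are distinguished in gray and orange, and triangles are drawn as wireframes.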

Figure 10 demonstrates how building grouping quality directly impacts map generalization results and user experience (e.g., navigation).

  • Under-segmentation (Red Circles): Inclusion patterns (where one modeled group contains multiple reference groups) lead to under-segmentation. In the generalized map, these appear as large, undifferentiated built-up areas. This forces users to make more detours during navigation, as internal roads or paths within these functionally diverse areas are obscured. The standard CTD method frequently produces these errors.

  • Over-segmentation (Green Circles): Within patterns (where one reference group contains multiple modeled groups) lead to over-segmentation. While occurring, particularly in residential communities for the proposed method, the impact on navigation is not as severe because these are typically internal divisions within homogeneous areas.

  • Overlap Patterns (Blue Circles): Incorrect generalized results from overlap patterns can cause navigation errors. The generalized features might create gaps that resemble driving roads but are actually walkways within residential compounds, leading to incorrect routing.

    Overall, the generalization results derived from the proposed method (incorporating function) are more in line with daily map use needs because they provide a more accurate spatial division of urban buildings, reflecting their functional organization.

7. Conclusion & Reflections

7.1. Conclusion Summary

This paper successfully addresses the limitations of traditional building grouping methods that rely solely on geometric characteristics by proposing a novel approach that integrates building function information. The methodology involves two main stages: first, inferring building functions using the dynamic time warping (DTW) algorithm applied to Tencent user density data and Points of Interest (POIs). Second, recognizing building groups through a graph-based segmentation strategy that leverages various spatial indices derived from constrained Delaunay triangulations (CDTs), with the crucial constraint that only buildings of the same inferred function can be grouped together.

The case study in Chengdu, China, demonstrated the effectiveness of this integrated approach, yielding an overall correctness of 81.63% and an overall completeness of 85.10% for the study area. A comparative analysis with a standard CTD method (without functional information), which reached 59.21% correctness and 47.87% completeness, clearly highlighted the superiority of the proposed method, especially in situations where functionally different buildings are geometrically close. The functional constraint effectively prevents under-segmentation, which is a common issue for purely geometric methods. Furthermore, the generalization results derived from the proposed method are shown to be more realistic and useful for daily map applications, as they provide a more accurate and functionally coherent spatial division of urban buildings.

7.2. Limitations & Future Work

The authors acknowledge several limitations and suggest future research directions:

  • Additional Semantic Information: The current method primarily uses building function. Future work could explore integrating more semantic information, such as the height of buildings, which could further refine grouping decisions (e.g., differentiating high-rise office blocks from low-rise commercial buildings).
  • Automatic Parameter Calibration: The segmentation strategy (Step 6) involves parameters like the path angle threshold and the standard deviation threshold for distance homogeneity. The paper indicates a need for automatically calibrating these parameters, rather than relying on manual tuning, to enhance the method's robustness and ease of use across different datasets and contexts.

7.3. Personal Insights & Critique

This paper makes a significant contribution by bridging the gap between geometric and semantic information in building group recognition. The use of Tencent user density data as a proxy for building function is particularly insightful, demonstrating how geospatial big data can enhance traditional GIS analysis. The DTW algorithm is well-suited for this task, effectively handling the temporal variations in user activity.

Inspirations and Applications:

  • Smart City Planning: The method's ability to accurately delineate functional zones could be invaluable for urban planners to understand how different areas are used and to inform decisions regarding zoning, infrastructure development, and service provision.
  • Location-Based Services (LBS): Improved building grouping could lead to more intelligent LBS, such as better recommendations for points of interest based on the functional context of a user's location, or more intuitive navigation instructions.
  • Automated Cartography: The direct application to map generalization highlights its potential for creating more intelligent and context-aware automated mapping systems, reducing the need for manual intervention and improving the quality of derived maps.
  • Beyond Buildings: The core idea of integrating functional semantics derived from dynamic user data with geometric analysis could be applied to other geographic features, such as parks, transportation hubs, or natural areas, to understand their usage patterns and group them accordingly for various analyses.

Potential Issues, Unverified Assumptions, or Areas for Improvement:

  • Data Availability and Bias: The reliance on Tencent user density data implies a dependency on a specific commercial data source. The generalizability of this approach might be limited in regions where such detailed big data is not available or where user demographics within the data provider's ecosystem do not accurately reflect the overall population. There could also be biases in the data (e.g., certain age groups or socioeconomic classes might be underrepresented).

  • Definition of "Function": The paper simplifies building function into broad categories (residential, commercial, office). Many buildings are mixed-use, especially in urban centers. How such complex functional types are handled or could be disaggregated into multiple functions per building is not fully explored and could pose a challenge.

  • Sensitivity to DTW Parameters: While DTW is robust, its performance can sometimes be sensitive to parameters or the quality of the time series data. Noise or sparse data in user density could affect function inference accuracy.

  • Over-segmentation in Certain Blocks: The observation that over-segmentation occurred in blocks 4 and 5 due to distance variations within intended groups suggests that the geometric weighting or segmentation strategy might still need refinement. A more adaptive approach that considers the local context for distance thresholds could be beneficial.

  • Scalability for Very Large Areas: While the building block partitioning helps with efficiency, for extremely large metropolitan areas with millions of buildings, the computational intensity of CDT generation and graph segmentation might still be a concern. Further optimization or hierarchical approaches could be explored.

  • "Correctness" of Manual Reference Data: While expert evaluation and manual identification of reference data are standard, human interpretation of building groups can still vary. The implicit assumption is that the manual reference map represents an objective "truth," but some level of subjectivity might exist.

    Overall, this paper presents a compelling and well-executed methodology that pushes the boundaries of building pattern recognition by effectively integrating semantic knowledge. It offers a clear path towards generating more intelligent and user-centric maps.
