AiPaper
Paper status: completed

Extracting building patterns with multilevel graph partition and building grouping

Published:11/11/2016
Original Link
Price: 0.10
2 readers
This analysis is AI-generated and may not be fully accurate. Please refer to the original paper.

TL;DR Summary

This study presents a multilevel graph partitioning method for extracting building patterns from remote sensing, enabling accurate identification of urban layouts like grid, cluster, and radial arrangements for urban planning and disaster management.

Abstract

See discussions, stats, and author profiles for this publication at: https://www.researchgate.net/publication/309861890 Extracting building patterns with multilevel graph partition and building grouping Article in ISPRS Journal of Photogrammetry and Remote Sensing · November 2016 DOI: 10.1016/j.isprsjprs.2016.10.001 CITATIONS 62 READS 1,102 4 authors , including: Shihong Du Peking University 149 PUBLICATIONS 5,961 CITATIONS SEE PROFILE Mi Shu Peking University 5 PUBLICATIONS 193 CITATIONS SEE PROFILE All content following this page was uploaded by Shihong Du on 30 September 2017. The user has requested enhancement of the downloaded file.

Mind Map

In-depth Reading

English Analysis

1. Bibliographic Information

1.1. Title

Extracting building patterns with multilevel graph partition and building grouping

1.2. Authors

  • Shihong Du: Institute of Remote Sensing and GIS, Peking University, Beijing 100871, China.
  • Liqun Luo: Unit 61175 of PLA, Nanjing 210049, China.
  • Kai Cao: Department of Geography, National University of Singapore, Singapore.
  • Mi Shu: Institute of Remote Sensing and GIS, Peking University, Beijing 100871, China.

1.3. Journal/Conference

ISPRS Journal of Photogrammetry and Remote Sensing. This is a highly reputable journal in the fields of photogrammetry, remote sensing, and spatial information sciences, known for publishing cutting-edge research and impactful methodologies. Its influence is significant among researchers and practitioners in geographic information science and related disciplines.

1.4. Publication Year

November 2016

1.5. Abstract

This paper introduces a novel method for extracting building patterns from remote sensing imagery. The core challenge addressed is the identification and classification of building arrangements in urban environments, which is vital for applications like urban planning, disaster response, and infrastructure management. The authors propose an approach that models and analyzes spatial relationships among buildings using graph partitioning techniques. Specifically, a multilevel graph partitioning algorithm is employed to recursively decompose a building footprint graph into meaningful subgraphs. These subgraphs are subsequently grouped based on their spatial and structural characteristics. The primary goal is to capture the inherent regularity and structural order in building layouts, enabling the recognition of common urban patterns such as grid-like, cluster, or radial arrangements. The method's effectiveness is validated across various urban scenes, demonstrating its capability to accurately segment buildings into coherent groups and precisely extract higher-level pattern types. The paper contributes a scalable and efficient algorithm for urban pattern extraction, offering potential applications in automated urban analysis and geographic information systems.

/files/papers/691089cb5d12d02a6339cf17/paper.pdf (This link refers to a local file path provided by the user, indicating the PDF content was directly supplied for analysis rather than an external web link. The paper is officially published as per the journal information.)

2. Executive Summary

2.1. Background & Motivation

The core problem the paper aims to solve is the accurate and efficient recognition and extraction of building patterns from a large number of buildings in remote sensing imagery. Building patterns refer to the arrangements and forms that groups of buildings exhibit collectively in space.

This problem is crucial for several reasons:

  • Urban Planning and Management: Understanding building arrangements is fundamental for evaluating landscape configuration, estimating urban population, and analyzing urban structures and functions (e.g., distinguishing residential, industrial, or commercial areas).
  • Map Generalization: Building patterns are critical configurations that must be preserved when spatial scales decrease during map generalization processes.
  • Disaster Response and Infrastructure Management: Rapid identification of urban patterns can support emergency services and resource allocation.
  • Gaps in Prior Research:
    • Incomplete Typology: Existing studies often have limited typologies of building patterns, overlooking some common patterns (e.g., parallel and perpendicular patterns) and failing to address the associations among different pattern types.

    • Ineffective Extraction Methods: Previous building clustering methods often relied on simple similarity measurements or ineffective clustering algorithms, leading to unsatisfactory results. Pattern analysis methods were often tailored to specific patterns and lacked integration with clustering approaches.

    • Subjectivity in Weighting: Many methods rely on manually selected weights for similarity measurements, which is subjective and time-consuming.

      The paper's entry point or innovative idea is to propose a complete typology of building patterns and an integrated strategy that combines the strengths of building clustering and pattern recognition methods. It addresses the subjectivity of weight determination through an automated approach.

2.2. Main Contributions / Findings

The paper makes three primary contributions:

  1. Extended Typology of Building Patterns: It considers and handles a broader range of building patterns than previous studies, including collinear (straight and oblique), curvilinear, parallel and perpendicular groups, and grid patterns. This more complete typology better reflects urban functions.

  2. Integrated Multilevel Graph Partitioning for Clustering: It designs and integrates four similarity criteria (area, shape, visual distance, and orientation) into a multilevel graph partition method for robust building clustering. Importantly, it uses the Relief-F algorithm to automatically identify the optimal weights for these similarity measurements, replacing subjective manual selection. This leads to high-quality building clusters.

  3. Systematic Extraction Strategy for Diverse Patterns: It proposes four systematic strategies to extract the different types of patterns (collinear, curvilinear, parallel and perpendicular groups, and grid patterns) in an integrated manner, unlike previous approaches that handled them separately.

    The key conclusions and findings are:

  • The proposed integrated strategy, combining multilevel graph partition for clustering and a graph-based grouping for pattern recognition, can produce high-quality patterns with over 90% accuracy.
  • The Relief-F algorithm is effective in automatically determining optimal weights for similarity measurements, reducing subjectivity and improving clustering quality.
  • The F-histogram model is demonstrated to be more reliable and applicable for representing relative orientation between buildings compared to the centroid model and Voronoi graph model, yielding higher accuracy in pattern extraction.
  • The method is shown to be scalable and efficient on various urban scenes, making it suitable for automated urban analysis and Geographic Information Systems (GIS).

3. Prerequisite Knowledge & Related Work

3.1. Foundational Concepts

3.1.1. Building Patterns

Building patterns refer to the collective spatial arrangements and forms exhibited by groups of buildings. These patterns are visually recognizable and can be semantically named, such as grid-like, linear, or cluster arrangements. They provide insights into urban structure and function.

3.1.2. Graph Theory

Graph theory is a branch of mathematics dealing with graphs, which are mathematical structures used to model pairwise relations between objects. A graph consists of vertices (or nodes) and edges (or links) that connect pairs of vertices.

  • Vertex (V): Represents an individual building.
  • Edge (E): Represents a spatial relationship or connection between two buildings.
  • Weighted Graph: A graph where each edge has a numerical value (a weight) associated with it, representing the strength or cost of the connection. In this paper, edge weights represent similarity between buildings.

3.1.3. Graph Partitioning

Graph partitioning is the process of dividing a graph into smaller subgraphs such that a certain objective function is optimized (e.g., minimizing edge cuts between subgraphs, or maximizing similarity within subgraphs). This is used to group related entities.

3.1.4. Multilevel Graph Partitioning

A multilevel graph partitioning framework is an advanced technique used for efficiently partitioning large graphs. It operates in three main phases:

  1. Coarsening Phase: The original (finest) graph is successively transformed into a sequence of smaller, coarser graphs. This is typically done by collapsing highly connected nodes into single super-nodes.
  2. Partitioning Phase: The smallest (coarsest) graph is partitioned. This initial partition is computationally cheaper.
  3. Refinement Phase: The partition is then iteratively refined by moving nodes between partitions (or undoing coarsening) to improve the objective function at each level, working back up to the original graph. This approach balances computational efficiency with solution quality.

3.1.5. Delaunay Triangulation

Delaunay triangulation is a fundamental geometric construction. For a set of points in a plane, a Delaunay triangulation is a triangulation such that no point in the set is inside the circumcircle of any triangle in the triangulation. It has the property of maximizing the minimum angle of the triangles, making triangles "fat" and avoiding "sliver" triangles. In this paper, it's used to establish neighboring relations between buildings.

3.1.6. Smallest Bounding Rectangle (SBR)

The Smallest Bounding Rectangle (SBR) of an object is the rectangle with the minimum area that completely encloses the object. It is often used to approximate the object's shape and derive properties like length, width, and orientation.

3.1.7. Relief-F Algorithm

Relief-F is a feature selection algorithm, an extension of the original Relief algorithm. Its primary purpose is to estimate the quality or relevance of features (attributes) in a dataset. It works by randomly sampling an instance, then finding its nearest neighbors from the same class (hits) and from different classes (misses). It updates a weight for each feature based on how well it distinguishes between the instance and its neighbors. Relief-F extends Relief to handle multi-class problems and incomplete data. It assigns higher weights to features that are more relevant for distinguishing between instances of different classes.

3.1.8. F-histogram

The F-histogram (Fuzzy Histogram) is a model for representing relative direction between areal objects (like buildings). Unlike simpler models, it considers anisotropy (directional dependence) and the effects of distance and shape. It describes the relative direction from object A to object B as a combination of tuples (θi,FAB(θi))(\theta_i, \mathcal{F}_{AB}(\theta_i)), where θi\theta_i is a direction angle and FAB(θi)\mathcal{F}_{AB}(\theta_i) represents the "force" or membership degree of B belonging to the direction θi\theta_i of A. The value FAB(θi)\mathcal{F}_{AB}(\theta_i) is typically higher for closer objects.

3.1.9. Centroid Model

The centroid model for relative orientation defines the azimuth between two objects as the angle of the line connecting their centroids (geometric centers) to a reference axis (e.g., horizontal axis). While simple, it does not account for the shape or size of the objects, or the influence of distance in a cognitively meaningful way.

3.1.10. Voronoi Graph

A Voronoi diagram (or Voronoi tessellation) partitions a plane into regions based on distances to a set of points (sites). Each region consists of all points closer to one site than to any other. The Voronoi graph is the dual graph of the Delaunay triangulation. It can be used to model spatial relationships, but in the context of relative direction, it may not effectively capture the influence of distance or the nuances of areal objects.

3.2. Previous Works

The paper categorizes existing work into two main issues: cognition and typology of building patterns, and representation and extraction of building patterns.

3.2.1. Cognition and Typology of Building Patterns

  • Function-based: Buildings categorized as multi-family, single-family, non-residential (Meinel et al., 2009). This focuses on individual building function, not spatial patterns.
  • Map Generalization Typology: Linear, circular, star, and irregular patterns for building typification generalization (Anders, 2006). This serves generalization but is not comprehensive for all urban patterns.
  • Special Structures: Patterns like Stair-Pattern, T-Pattern, L-Pattern, E-Pattern, Z-Pattern, and H-Pattern (Yang, 2008). These are very specific and don't cover broader urban layouts.
  • Multilevel Categorization: Zhang et al. (2012) proposed a three-level hierarchy: building clusters (top), linear and non-linear (middle), and further refined linear into collinear, curvilinear, align-along-road, and non-linear into grid and non-structured (bottom). Later, collinear and curvilinear were complemented by alignment-of-center and alignment-of-side (Zhang et al., 2013a).
    • Limitation: The paper argues these typologies are still incomplete (missing patterns like parallel and perpendicular) and overlook associations between pattern types.

3.2.2. Representation and Extraction of Building Patterns

This area is divided into building clustering and pattern analysis.

3.2.2.1. Building Clustering

  • Methodology: Often uses a graph to represent relations, then partitions it into clusters based on building similarities.
  • Similarity Measurements:
    • Local density (Anders et al., 1999).
    • Neighboring vectors between centroids, distance, and directions of vectors (Boffet and Serra, 2001).
    • Multi-parameters: minimum distance, visual ranges, area ratio, boundary ratio, smallest bounding rectangles, and direction Voronoi graph (Yan et al., 2008).
  • Clustering Algorithms:
    • Relative relation graph (Anders et al., 1999).
    • Minimal spanning tree (MST) (Regnauld, 2001; Zhang et al., 2012).
    • Combinations of graph theory, Delaunay triangulation, Voronoi graph, urban morphology, Gestalt theory (Li et al., 2004).
  • Limitations: The paper states these methods often produce unsatisfactory results due to simple similarity measurements (Boffet and Serra, 2001) and ineffective clustering algorithms (Regnauld, 2001; Li et al., 2004; Yan et al., 2008).

3.2.2.2. Pattern Analysis

  • Methodology: Designed to extract specific patterns using parameters (e.g., proximity, orientation, path) to measure homogeneity and arrangements, and graph-based algorithms.
  • Examples:
    • Linear templates for linear patterns (Christophe and Ruas, 2002).
    • Delaunay triangulation and MST for align-along-road and unstructured patterns (Zhang et al., 2012).
    • Specialized algorithms for alignment-of-center and alignment-of-side patterns (Zhang et al., 2013a).
  • Algorithm Comparison: Cetinkaya et al. (2015) compared MST, ASCDT (Adaptive Spatial Clustering based on Delaunay Triangulation), DBSCAN (Density Based Spatial Clustering), and CHAMELEON, suggesting DBSCAN and ASCDT are more suitable for grouping buildings.
  • Limitations: Existing approaches are typically tailored to extract special patterns rather than a complete typology and are often separated from building clustering methods.

3.3. Technological Evolution

The evolution of urban pattern extraction has moved from simple geometric feature analysis and manual rule-based systems to more sophisticated computational geometry (like Delaunay triangulation) and graph-based clustering methods. Early methods focused on single-feature similarities or specific pattern types, often relying on subjective parameter tuning. The trend is towards multi-parameter similarity, automatic weight determination, and integrated frameworks that can handle a broader range of patterns and their interrelationships. The shift from basic geometric models (like centroid or Voronoi) to more cognitively plausible ones (F-histogram) for orientation is also a key development. This paper builds upon these advancements by integrating robust clustering with comprehensive pattern recognition, and by introducing automated weight learning, pushing towards more scalable and reliable urban analysis.

3.4. Differentiation Analysis

Compared to the main methods in related work, this paper's approach offers several innovations and differentiations:

  • Comprehensive Typology: Unlike prior work that focused on a limited number of patterns (e.g., linear, circular, specific H/Z patterns) or separated linear/non-linear, this paper introduces a more complete typology including collinear (straight, oblique), curvilinear, parallel and perpendicular groups, and grid patterns, and crucially, considers their inter-associations (e.g., collinear patterns forming parallel groups, which then form grid patterns).
  • Integrated Strategy: It combines building clustering (bottom-up) with pattern recognition (top-down) in an integrated framework. Existing methods often treat these as separate stages or focus on only one. This integration leverages the strengths of both: clustering provides homogeneous groups, and pattern recognition then extracts specific arrangements within those groups.
  • Automated Weight Determination: For similarity measurements used in clustering, previous studies often relied on manual selection or trial-and-error for weights. This paper innovatively uses the Relief-F algorithm to automatically optimize these weights, making the process more objective, efficient, and data-driven.
  • Advanced Graph Partitioning: It employs multilevel graph partition, which is known for its better performance and efficiency in clustering large graphs compared to simpler methods like minimal spanning tree or basic graph partitioning algorithms used in prior work.
  • Improved Orientation Modeling: The paper demonstrates and validates that the F-histogram model provides a more reliable and applicable representation of relative orientation for buildings compared to widely used models like the centroid model and Voronoi graph, especially for areal objects where shape, size, and distance influences are crucial.

4. Methodology

4.1. Principles

The core idea behind the proposed method is to leverage the hierarchical nature of urban building patterns by combining a bottom-up clustering approach with a top-down pattern recognition approach. The theoretical basis rests on the idea that buildings belonging to the same patterns share visual and geometric similarities and exhibit consistent spatial relationships (proximity, continuity, directionality).

  1. Bottom-up Clustering: Buildings are first grouped into homogeneous clusters based on their intrinsic geometric properties (area, shape) and spatial relationships (distance) using multilevel graph partition. This step ensures that buildings within a cluster are similar to each other, forming a foundation for pattern extraction. The weights for these similarities are automatically determined using the Relief-F algorithm, reducing subjectivity.

  2. Top-down Pattern Recognition: Within these pre-clustered groups, specific building patterns (collinear, curvilinear, parallel, perpendicular, grid) are then extracted. This step applies rules based on continuity and directionality specific to each pattern type, guided by established cognitive principles for urban forms. The F-histogram model is used for accurately capturing relative orientations.

    The proposed framework (Figure 2) illustrates this combined strategy:

    该图像是论文中用于展示Dataset 3数据集的建筑物平面图示意图,描绘了多个建筑物的空间分布和布局,反映出城市区域内建筑的排列特征。 该图像是论文中用于展示Dataset 3数据集的建筑物平面图示意图,描绘了多个建筑物的空间分布和布局,反映出城市区域内建筑的排列特征。

Fig. 2. The proposed framework for extracting building patterns.

4.2. Core Methodology In-depth (Layer by Layer)

The framework consists of three main parts:

  1. Clustering buildings (bottom-up process).
  2. Extracting collinear patterns.
  3. Extracting grid patterns (which builds upon collinear patterns and parallel/perpendicular groups).

4.2.1. Clustering Buildings

This process involves four steps: building a neighborhood graph, defining similarity measurements, identifying weights for these similarities, and finally clustering using multilevel graph partition.

4.2.1.1. Building the Neighborhood Graph

To represent spatial relationships between buildings, a neighborhood graph is constructed.

  • Interpolation: To ensure a smooth and continuous Delaunay triangulation, additional points with approximately equal distances are interpolated along the contours of buildings. The interpolation intervals are determined by the minimum distance between points on all building contours.

  • Triangulation Construction: With these interpolated contours, Delaunay triangulation is constructed for all points.

  • Improved Triangulation: Not all triangles generated by Delaunay triangulation are relevant for modeling neighboring relationships between buildings. The paper filters out irrelevant triangles:

    • Triangles entirely inside building contours.
    • Triangles connecting only with the study area's boundary.
    • Triangles connecting with both the study area's boundary and a single building. Only triangles that solely connect two neighboring buildings are preserved, leading to an improved triangulation. The following figure (Figure 3 from the original paper) illustrates this process:

    Fig. 13. Extracted collinear patterns in Dataset 3. 该图像是图表,展示了论文中数据集3的提取共线建筑模式。图中用不同颜色和边界表示建筑物及其空间关系,蓝色连接线突出显示共线排列的建筑群落结构。

    Fig. 3. Neighboring relations between buildings.

    • In the figure, (a) shows the original Delaunay triangulation where triangles inside buildings or connected to the boundary are visible. (b) shows the improved triangulation where only triangles connecting two distinct buildings remain.
  • Neighborhood Graph Formation: If two buildings are connected by an edge in this improved triangulation, they are considered neighbors. This forms the neighborhood graph G=(V,E)G = (V, E), where VV is the set of buildings (nodes) and EE is the set of edges representing neighboring relationships.

4.2.1.2. Defining Similarity Measurements

These measurements quantify how similar neighboring buildings are in terms of their spatial positions and geometric attributes. They are calculated only for buildings connected by an edge in the neighborhood graph GG.

  • Area Similarity: Measures the similarity in the sizes of two buildings. For two buildings AA and BB, if Area(A) and Area(B) are their respective areas, and Area(A)Area(B)Area(A) \leqslant Area(B), then the area similarity EArea(A,B)E_{Area}(A, B) is given by: $ E_{Area}(A, B) = \frac{Area(A)}{Area(B)} $ where Area(A) is the area of building AA, and Area(B) is the area of building BB. The value ranges from (0, 1], with a larger value indicating greater similarity.

  • Shape Similarity: Measured by rectangularity and length-width ratio.

    • Rectangularity Similarity: Rectangularity is defined as the ratio of a building's area to the area of its smallest bounding rectangle (SBR). The following figure (Figure 4 from the original paper) illustrates the SBR:

      Fig. 14. Extracted grid patterns in Dataset 3. 该图像是论文中图14的示意图,展示了数据集3中提取的网格状建筑模式,图中用蓝色标注出具有典型网格结构的建筑群,反映了多层次图划分方法在识别城市建筑布局中的应用效果。

      F 4. The mallest bounding recangleo a bulig. If Rec(A) and Rec(B) are the rectangularities of buildings AA and BB, and Rec(A)Rec(B)Rec(A) \leqslant Rec(B), then their similarity ERec(A,B)E_{Rec}(A, B) is: $ E_{Rec}(A, B) = \frac{Rec(A)}{Rec(B)} $ where Rec(A) is the rectangularity of building AA, and Rec(B) is the rectangularity of building BB.

    • Similarity of the Length-Width Ratio: For a building AA, its length-width ratio LTW(A) is the ratio of the length to the width of its SBR. If LTW(A)LTW(B)LTW(A) \leqslant LTW(B), the length-width ratio similarity ELTW(A,B)E_{LTW}(A, B) is: $ E_{LTW}(A, B) = \frac{LTW(A)}{LTW(B)} $ where LTW(A) is the length-width ratio of building AA, and LTW(B) is the length-width ratio of building BB.

    • Overall Shape Similarity: The overall shape similarity EShape(A,B)E_{Shape}(A, B) for buildings AA and BB is a weighted average of their rectangularity and length-width ratio similarities: $ E_{Shape}(A, B) = w_{Rec} \times E_{Rec}(A, B) + w_{LTW} \times E_{LTW}(A, B) $ where wRecw_{Rec} and wLTWw_{LTW} are the weights for rectangularity similarity and length-width ratio similarity, respectively. These weights sum to 1.0.

  • Distance Similarity: Uses visual distance, which is considered closer to human cognition and is measured using triangles with strong connections. For buildings AA and BB, if T(A, B) is the collection of triangles representing strong connections between them, their distance similarity EDis(A,B)E_{Dis}(A, B) is: $ E_{Dis}(A, B) = \frac{\sum_{t_i \in T(A, B)} d_i \times w_i}{\sum_{t_i \in T(A, B)} w_i} $ where:

    • tit_i: an individual triangle in the collection T(A, B).

    • wiw_i: the length between the two midpoints of the two sides of a triangle tit_i linking buildings AA and BB. This acts as a weight for the local distance.

    • did_i: the local distance between buildings. If the triangle tit_i is acute or right-angled, did_i is the height from the side shared with buildings. If the triangle tit_i is obtuse, did_i is the shortest side of the triangle connecting the two buildings. The following figure (Figure 5 from the original paper) illustrates the visual distance concept:

      Fig. 15. Index distributions of collinear patterns. 该图像是一个柱状图,展示了图15中共线图案的索引分布情况,横坐标为不同索引区间,纵坐标为对应长度的频数或计数。图中数据集中反映了共线图案的频率分布特征。

    Fig. 5. The visual distance based on triangles.

  • Total Similarity: The overall similarity between neighboring buildings is the weighted sum of the normalized area, shape, and distance similarities. $ E_{Total}(A, B) = \sum w_i \cdot E_i(A, B) $ where:

    • ETotal(A,B)E_{Total}(A, B): the overall similarity between buildings AA and BB.
    • i=1,2,3i = 1, 2, 3: refer to area, shape, and distance similarities, respectively.
    • Ei(A,B)E_i(A, B): the individual similarity measurement (e.g., EAreaE_{Area}, EShapeE_{Shape}, EDisE_{Dis}).
    • wiw_i: the weight for each similarity component, with the sum of all wiw_i being 1.0. All individual similarity values Ei(A,B)E_i(A,B) range from (0, 1], with larger values indicating greater similarity.

4.2.1.3. Identifying the Weights of All Similarities

Determining the weights (wiw_i in Eq. (4) and Eq. (6)) is crucial. Instead of subjective manual selection, the paper employs the Relief-F algorithm for automatic optimization.

  • Relief-F Algorithm: This algorithm estimates the quality of features by assigning an initial weight to each feature and then iteratively updating these weights. For a set of buildings S={S1,S2,,SN}\mathcal{S} = \{S_1, S_2, \ldots, S_N\}, where each building SiS_i has features {Si1,Si2,,SiM}\{S_{i1}, S_{i2}, \ldots, S_{iM}\} and belongs to one of PP clusters {Class1,Class2,,ClassP}\{Class_1, Class_2, \ldots, Class_P\}: The weight wjw_j for the jj-th feature of building SiS_i is updated using: $ w_j^i = w_j^{i-1} + \frac{1}{k \cdot N} [dis_{miss}(S_i, k) - dis_{hit}(S_i, k)] $ where:
    • wjiw_j^i: the weight of the jj-th feature after the ii-th iteration.

    • kk: the number of nearest neighbors considered.

    • NN: the total number of buildings.

    • dishit(Si,k)dis_{hit}(S_i, k): the distinction (distance) between SiS_i and its kk nearest neighbors belonging to the same cluster.

    • dismiss(Si,k)dis_{miss}(S_i, k): the distinction (distance) between SiS_i and its kk nearest neighbors belonging to different clusters.

      These distinction terms are defined as: $ dis_{hit}(S_i, k) = \sum_{m=1}^{k} \frac{|S_{ij} - S_{hitmj}|}{max(S_{*j}) - min(S_{*j})} $ $ dis_{miss}(S_i, k) = \sum_{C \neq Class(S_i)}^{k} \left[ \frac{P(C)}{1 - P(Class(S_i))} \sum_{m=1}^{k} \frac{|S_{ij} - S_{missmj}|}{max(S_{*j}) - min(S_{*j})} \right] $ where:

    • SijS_{ij}: the value of the jj-th feature for building SiS_i.

    • ShitmjS_{hitmj}: the value of the jj-th feature for the mm-th hit (nearest neighbor from the same class).

    • SmissmjS_{missmj}: the value of the jj-th feature for the mm-th miss (nearest neighbor from a different class).

    • max(Sj)max(S_{*j}) and min(Sj)min(S_{*j}): the largest and smallest values, respectively, of the jj-th feature across all buildings. This normalizes the feature differences.

    • P(C): the probability of occurrence of cluster CC (ratio of buildings in CC to total buildings).

    • Class(Si)Class(S_i): the cluster to which building SiS_i belongs.

4.2.1.4. Clustering Buildings with the Multilevel Graph Partition

The multilevel graph partitioning framework (Karypis and Kumar, 1998) is used to cluster buildings into homogeneous groups. This framework consists of four steps:

  1. Graph Construction: A weighted graph is constructed. Each building becomes a node. An edge connects two neighboring buildings (as defined in Section 4.2.1.1). The weight of each edge is set to the total similarity (ETotal(A,B)E_{Total}(A, B) from Eq. (6)), where the weights in Eq. (6) are determined by the Relief-F algorithm. This initial graph is the finest graph.
  2. Graph Coarsening: The finest graph is recursively transformed into a sequence of smaller, coarser graphs. This is achieved by collapsing nodes and their connecting edges. At each level, neighboring nodes with maximum similarity are merged into a single node at the next coarser level. This creates a hierarchy of graphs from fine to coarse.
  3. Graph Partition: This phase partitions the coarsest graph. In this study, lacking prior knowledge to guide partitioning at this stage, each node in the coarsest graph is initially considered as a single cluster.
  4. Graph Refinement: This step optimizes the initial clustering results obtained from the coarsening and partitioning phases. It works by iteratively moving nodes (buildings) between neighboring clusters. The goal is to make the similarity within clusters maximum and the similarity between clusters minimum. This ensures that the final clusters are as coherent as possible.

4.2.2. Extracting Collinear Building Patterns

Collinear patterns are fundamental and serve as a basis for other patterns. Their extraction relies on several criteria:

  • Similarity: Ensured by the initial multilevel graph partition, where buildings in a cluster already share high similarity in area, shape, and distance.
  • Proximity: Modeled by the neighborhood graph, where only spatially neighboring buildings are considered.
  • Continuity and Directionality: These are the primary criteria specifically evaluated for collinear patterns.

4.2.2.1. Definition and Extraction Rules

A pattern is defined by integrating the overall similarity (from clustering) with continuity and directionality specific to the pattern type. For buildings A, B, C, they form a collinear pattern if they satisfy the following conditions, expressed in Eq. (10): $ \left{ \begin{array}{ll} (E_{Total}(A, B) > \varepsilon_{Sim}) \wedge (E_{Total}(B, C) > \varepsilon_{Sim}) \ (E_{Con}(A, B) > \varepsilon_{Con}) \wedge (E_{Con}(B, C) > \varepsilon_{Con}) \end{array} \right. $ where:

  • εSim\varepsilon_{Sim}: threshold for overall similarity (ETotalE_{Total}).

  • εCon\varepsilon_{Con}: threshold for continuity and directionality (EConE_{Con}).

  • ETotal(A,B)E_{Total}(A, B) and ETotal(B,C)E_{Total}(B, C): overall similarities (from Eq. (6)).

  • ECon(A,B)E_{Con}(A, B) and ECon(B,C)E_{Con}(B, C): measures of continuity and directionality, which are specified for collinear patterns by Eq. (11).

    For collinear patterns, the continuity and directionality criteria (EConE_{Con}) are further detailed by Eq. (11): $ \left{ \begin{array}{ll} (E_{Dir}(A, B) < \varepsilon_{Dir}) \wedge (E_{Dir}(B, C) < \varepsilon_{Dir}) \

|\pi - E_{Ori}(A, B, C)| < \varepsilon_{Ori}^{Min} \

(E_{Pro}(A, B) \geqslant \varepsilon_{Pro}) \wedge (E_{Pro}(B, C) \geqslant \varepsilon_{Pro}) \end{array} \right. $ where:

  • EDir(A,B)E_{Dir}(A, B) and EDir(B,C)E_{Dir}(B, C): angle differences of the main directions of two buildings. This ensures buildings are aligned.

  • εDir\varepsilon_{Dir}: threshold for angle difference of main directions.

  • EOri(A,B,C)E_{Ori}(A, B, C): azimuth angle formed by the three buildings. This measures how straight the alignment is. A value close to π\pi indicates collinearity.

  • εOriMin\varepsilon_{Ori}^{Min}: minimum threshold for the azimuth angle.

  • EPro(A,B)E_{Pro}(A, B) and EPro(B,C)E_{Pro}(B, C): projection overlapping degrees between buildings. This differentiates straight patterns (requiring overlap) from oblique patterns (no overlap required).

  • εPro\varepsilon_{Pro}: threshold for projection overlapping degree. The following figure (Figure 6 from the original paper) illustrates some of these geometric parameters:

    该图像是一个柱状图,展示了Dataset 3中不同区间的数据分布情况,横轴为区间范围,纵轴为对应频数。主要数值集中在0.0-0.1区间,其他区间数据较少。 该图像是一个柱状图,展示了Dataset 3中不同区间的数据分布情况,横轴为区间范围,纵轴为对应频数。主要数值集中在0.0-0.1区间,其他区间数据较少。

Fig. 6. Illustration of geometric parameters for collinear patterns. If buildings A, B, and C satisfy Eq. (11), they form a collinear pattern, Coll(A, B, C). Larger patterns are grown by repeatedly applying these rules.

4.2.2.2. Extraction Parameters

The three geometric parameters in Eq. (11) are defined as follows:

  • Projection Overlapping (EProE_{Pro}): Measures the overlap between projections of two buildings along a specific direction. If building CC is the reference, L1L_1 is the maximum overlapping length of BB's and CC's projections in the main direction of CC, and L2L_2 is the total length of the two projections. $ E_{Pro}(B, C) = \frac{L_1}{L_2} $ where L1L_1 is the maximum overlapping length of projections, and L2L_2 is the total length of the two projections.

  • Angle Difference of Main Directions (EDirE_{Dir}): The main direction VDir(A)V_{Dir}(A) of a building AA is the angle from the long axis of its SBR(A) to the horizontal axis. The angle difference is the absolute difference between main directions. $ E_{Dir}(A, B) = |V_{Dir}(A) - V_{Dir}(B)| $ where VDir(A)V_{Dir}(A) is the main direction of building AA, and VDir(B)V_{Dir}(B) is the main direction of building BB. To handle periodicity, main directions are mapped to [0,π)[0, \pi) and their differences to [0,π/2][0, \pi/2].

  • Azimuth Angle (EOriE_{Ori}): Measures the relative orientation between buildings. The paper compares centroid model, Voronoi graph, and F-histogram. F-histogram is chosen for its ability to account for size, distance, and rotation. F-histogram describes relative direction from building A to B as (θi,FAB(θi))(\theta_i, \mathcal{F}_{AB}(\theta_i)), where FAB(θi)\mathcal{F}_{AB}(\theta_i) is the "force" of A on B in direction θi\theta_i, representing the weight of B belonging to A's direction θi\theta_i. Distance influences FAB(θi)\mathcal{F}_{AB}(\theta_i). The following figure (Figure 7 from the original paper) illustrates F-histograms:

    该图像是一个图表,显示了Dataset 2中不同点数量对应的平均指数和准确率变化趋势,横轴为点数,纵轴左侧是平均指数,右侧是准确率百分比。 该图像是一个图表,显示了Dataset 2中不同点数量对应的平均指数和准确率变化趋势,横轴为点数,纵轴左侧是平均指数,右侧是准确率百分比。

    Fig. 7. F-histogram. Two methods are proposed to derive azimuth from F-histogram:

    • Maximum-weight azimuth: EOri(A,B)=argmaxθi(FAB(θi))E_{Ori}(A, B) = \mathrm{argmax}_{\theta_i} (\mathcal{F}_{AB}(\theta_i)). This selects the direction with the highest "force".

    • Weighted-average azimuth: EOri(A,B)=FAB(θi)θiE_{Ori}(A, B) = \sum \mathcal{F}_{AB}(\theta_i) \cdot \theta_i. This calculates a weighted average of all directions.

      The azimuth angle for multiple buildings (e.g., A, B, C) refers to the angle θ\theta (in Figure 8) that reflects their collinearity. The closer to π\pi, the better the collinear pattern. Let EOri(A,B)E_{Ori}(A, B) be the azimuth between AA and BB, and EOri(B,C)E_{Ori}(B, C) be the azimuth between BB and CC. The azimuth angle EOri(A,B,C)E_{Ori}(A, B, C) is defined by two cases: $ E_{Ori}(A, B, C) = \left{ \begin{array}{ll}

    |E_{Ori}(A, B) - E_{Ori}(B, C)| & \text{if } E_{Ori}(A, B) \geqslant E_{Ori}(B, C) \

    \pi - |E_{Ori}(A, B) - E_{Ori}(B, C)| & \text{if } E_{Ori}(A, B) < E_{Ori}(B, C) \end{array} \right. $ The following figure (Figure 8 from the original paper) illustrates the azimuth angle of multiple buildings:

    Fig. 16. Index distributions of grid patterns. 该图像是图16,展示了两组数据(Dataset 2和Dataset 3)中网格模式指数的分布柱状图,反映不同指数区间内的数值频率差异。

    Fig. 8. The azimuth angle of multiple buildings. θ\theta refers to the azimuth angle, while EMPY δ\delta is the bisector direction angle of the azimuth angle.

4.2.2.3. Extraction Methods (Algorithm 1)

Algorithm 1 outlines the extraction of collinear building patterns:

Algorithm 1: Extracting collinear building patterns
Input: building clusters; constraints
Output: collection of collinear building patterns

1: Initializing directed edges according to the projection overlapping degree;
   // This step categorizes neighboring buildings into straight-line and oblique-line groupings
   // based on whether their projection overlapping degree meets a certain threshold.

2: Extracting the initial collinear building patterns;
   For each directed edge (υ0,υ1)(\upsilon_0, \upsilon_1) from the groupings:
     Set υ0\upsilon_0 as the begin node, υ1\upsilon_1 as the end node.
     Loop, continuously searching for an adjacent node υ2\upsilon_2 of υ1\upsilon_1:
       if EDir(υ0,υ1)E_{Dir}(\upsilon_0, \upsilon_1) and EDir(υ1,υ2)E_{Dir}(\upsilon_1, \upsilon_2) satisfy the angle difference rule of main directions then
         if EOri(υ0,υ1,υ2)E_{Ori}(\upsilon_0, \upsilon_1, \upsilon_2) satisfy the azimuth angle rule then
           add υ2\upsilon_2 to the building pattern and form a new directed edge (υ1,υ2)(\upsilon_1, \upsilon_2);
           // The pattern grows by adding υ2\upsilon_2 if it maintains collinearity and main direction alignment.
         else
           break; // The current sequence does not form a collinear pattern.
       else
         break; // The current sequence does not form a collinear pattern due to main direction misalignment.
     endloop;
   // This step grows patterns by adding buildings that satisfy the collinearity rules.
   // This is done for both straight-line and oblique-line groupings.

3: Acquiring the final collinear patterns by post-processing.
   // This step resolves overlapping patterns by retaining only the larger ones with more buildings.
   // For oblique-line patterns, those with fewer than three buildings might be deleted (a heuristic).

4.2.3. Extracting Curvilinear Patterns

Curvilinear patterns are extracted similarly to collinear patterns but with adjusted rules to account for their curved nature.

4.2.3.1. Extraction Rules

The main directions and azimuths in curvilinear patterns can vary more significantly than in collinear patterns. The extraction considers permissible ranges for azimuth angles and direction deviations. For buildings A, B, C, they form a curvilinear pattern if they satisfy Eq. (16): $ \left{ \begin{array}{ll} (| \pi - E_{Ori}(A, B, C) | \leqslant \varepsilon_{Ori}^{Min}) \wedge (| E_{Dir}^{N}(A, B, C) - V_{Dir}(B) | \leqslant \varepsilon_{Dir}^{Max}) \ (\varepsilon_{Ori}^{Min} < | \pi - E_{Ori}(A, B, C) | \leqslant \varepsilon_{Ori}^{Max}) \wedge (| E_{Dir}^{N}(A, B, C) - V_{Dir}(B) | \leqslant \varepsilon_{Dir}^{Var}) \end{array} \right. $ where:

  • VDir(B)V_{Dir}(B): main direction of building BB.
  • EOri(A,B,C)E_{Ori}(A, B, C): azimuth angle of the three buildings A, B, C.
  • EDirN(A,B,C)E_{Dir}^{N}(A, B, C): the bisector direction angle of EOri(A,B,C)E_{Ori}(A, B, C).
  • [εOriMin,εOriMax][\varepsilon_{Ori}^{Min}, \varepsilon_{Ori}^{Max}]: permissible range for the supplementary angle of the azimuth angle.
  • [εDirVar,εDirMax][\varepsilon_{Dir}^{Var}, \varepsilon_{Dir}^{Max}]: permissible range for the direction deviation. The paper notes cognitive thresholds: supplementary angle of azimuth from 4040^{\circ} to 6060^{\circ}, and direction deviation ±15\pm 15^{\circ}. Specifically, when the supplementary angle is 15\leqslant 15^{\circ}, the direction deviation threshold is 1515^{\circ}; it decreases linearly as the supplementary angle increases, reaching 0 when the supplementary angle is 2525^{\circ}.

4.2.3.2. Extraction Parameters

The bisector direction angle of the azimuth angle, EDirN(A,B,C)E_{Dir}^{N}(A, B, C), is defined as the angle from the bisector direction (denoted by δ\delta in Figure 8) to the horizontal axis. Using EOri(A,B)E_{Ori}(A, B) (azimuth between AA and BB) and EOri(B,C)E_{Ori}(B, C) (azimuth between BB and CC): $ E_{Dir}^{N}(A, B, C) = \left{ \begin{array}{ll} (E_{Ori}(A, B) + E_{Ori}(B, C)) / 2 & \text{if case 1 (e.g., in Figure 8) } \ (\pi - |E_{Ori}(A, B) - E_{Ori}(B, C)|) / 2 + E_{Ori}(A, B) & \text{if case 2 (e.g., in Figure 8)} \end{array} \right. $ The two conditions correspond to the two cases illustrated in Figure 8, where the calculation depends on the relative values of EOri(A,B)E_{Ori}(A, B) and EOri(B,C)E_{Ori}(B, C).

4.2.3.3. Extraction Methods

The extraction method for curvilinear patterns is analogous to Algorithm 1 for collinear patterns, applying the specific rules and parameters for curvilinear arrangements.

4.2.4. Evaluating Building Patterns

A quality index is assigned to extracted patterns based on human cognitive psychology.

4.2.4.1. Evaluating Collinear Patterns

Four principles guide evaluation: homogeneity, number of buildings, main direction, and azimuth angle. Better patterns have higher homogeneity and number of buildings, and smaller average angle difference of main directions and standard deviation of azimuth angles. The quality index Qcoll(k)Q_{coll}(k) for a collinear pattern Coll(k) is defined as: $ Q_{coll}(k) = \frac{Hom_{coll}(k)}{Num_{coll}(k)} + \frac{Mean_{coll}^{Dir}(k) + Std_{coll}^{Ori}(k)}{90} $ where:

  • Homcoll(k)Hom_{coll}(k): homogeneity of pattern Coll(k). This is the weighted average of area, shape, and distance homogeneities.

  • Numcoll(k)Num_{coll}(k): number of buildings in pattern Coll(k).

  • MeancollDir(k)Mean_{coll}^{Dir}(k): average angle difference of main directions in the pattern.

  • StdcollOri(k)Std_{coll}^{Ori}(k): standard deviation of the azimuth angles of buildings in the pattern.

  • 90: a normalization factor.

    The homogeneity Hom(k) is calculated as: $ Hom(k) = \sum w_i \cdot hom_i(k) $ where wiw_i is the weight for each homogeneity component (identified by Relief-F). And individual homogeneity homi(k)hom_i(k) is: $ hom_i(k) = \left{ \begin{array}{ll} Std_s(k) / Mean_s(k) & \text{for area or distance similarity (s)} \ Std_t(k) & \text{for rectangularity and length-width ratio (t)} \end{array} \right. $ where Stds(k)Std_s(k) and Means(k)Mean_s(k) are the standard deviation and mean for similarity ss, and Stdt(k)Std_t(k) is the standard deviation for similarity tt. A smaller Qcoll(k)Q_{coll}(k) (normalized to be less than 3.0) indicates a better quality pattern.

4.2.4.2. Evaluating Curvilinear Patterns

Three factors are considered: homogeneity, number of buildings, and direction deviation. Better patterns have higher homogeneity and number of buildings, and smaller direction deviation. The quality index QCurv(k)Q_{Curv}(k) for a curvilinear pattern Curv(k) is defined as: $ Q_{Curv}(k) = \frac{Hom_{Curv}(k)}{Num_{Curv}(k)} + \frac{Std_{Curv}^{Dir}(k)}{90} $ where:

  • HomCurv(k)Hom_{Curv}(k): homogeneity of curvilinear pattern Curv(k), calculated using Eqs. (19) and (20).
  • NumCurv(k)Num_{Curv}(k): number of buildings in pattern Curv(k).
  • StdCurvDir(k)Std_{Curv}^{Dir}(k): standard deviation of direction deviations within the pattern. A smaller QCurv(k)Q_{Curv}(k) (normalized to be less than 2.0) indicates a better quality pattern.

4.2.5. Extraction of Grid Building Patterns

Grid patterns represent a higher-level urban structure.

4.2.5.1. Definition of Grid Patterns

Grid patterns can be regular (all collinear patterns perpendicular) or irregular (at least one perpendicular relationship). They consist of two groups of collinear patterns. Let ParaA={ColAi}i=1MPara_A = \{Col_A^i\}_{i=1}^M and ParaB={ColBj}j=1NPara_B = \{Col_B^j\}_{j=1}^N be two groups of collinear patterns. Let ColDiriCol_{Dir}^i and ColDirjCol_{Dir}^j be their global orientations. Let TProxi,jT_{Prox}\langle i, j \rangle be the neighboring relation and TIntei,jT_{Inte}\langle i, j \rangle be the topological intersection between patterns. Two groups ParaAPara_A and ParaBPara_B form a grid pattern, Grid(i, j), if they satisfy Eq. (22): $ \left{ \begin{array}{ll} (T_{Prox}\langle i, i+1 \rangle = 1) \land (|Col_{Dir}^i - Col_{Dir}^{i+1}| < \varepsilon_{Dir}^{Para}) \ (T_{Prox}\langle j, j+1 \rangle = 1) \land (|Col_{Dir}^j - Col_{Dir}^{j+1}| < \varepsilon_{Dir}^{Para}) \ (T_{Inte}\langle i, j \rangle = 1) \land (|Col_{Dir}^i - Col_{Dir}^j - \pi / 2| < \varepsilon_{Dir}^{Vert}) \end{array} \right. $ where:

  • εDirPara\varepsilon_{Dir}^{Para}: threshold for parallel patterns (i.e., angle difference between parallel collinear patterns).
  • εDirVert\varepsilon_{Dir}^{Vert}: threshold for perpendicular patterns (i.e., angle difference from 9090^{\circ} between perpendicular collinear patterns). The first two conditions define parallel groups of collinear patterns that are spatially adjacent and have similar orientations. The third condition checks for perpendicularity and topological intersection between these two parallel groups. The following figure (Figure 9 from the original paper) illustrates grid patterns:

该图像是一个折线图,展示了Dataset 2中不同数据点下指标平均值和准确率的变化趋势。图中用折线表示指标平均值和准确率两组数据,横轴为数据点,纵轴分别对应指标值和准确率百分比。 该图像是一个折线图,展示了Dataset 2中不同数据点下指标平均值和准确率的变化趋势。图中用折线表示指标平均值和准确率两组数据,横轴为数据点,纵轴分别对应指标值和准确率百分比。

Fig. 9. Grid building patterns.

4.2.5.2. Extraction of Grid Patterns

The extraction proceeds in three steps: parallel groups, perpendicular groups, and grid patterns.

  • Extracting Parallel Groups of Collinear Patterns: Two collinear patterns ColACol_A and ColBCol_B are parallel, Para(ColA,ColB)Para(Col_A, Col_B), if they satisfy Eq. (23): $ \left{ \begin{array}{ll} T_{Prox}\langle A, B \rangle = 1 \

    |Col_{Dir}^A - Col_{Dir}^B| < \varepsilon_{Dir}^{Para}

    \end{array} \right. $ where:

    • TProxA,B=1T_{Prox}\langle A, B \rangle = 1: indicates that ColACol_A and ColBCol_B are topologically adjacent.

    • ColDirAColDirB<εDirPara|Col_{Dir}^A - Col_{Dir}^B| < \varepsilon_{Dir}^{Para}: indicates that their global orientations are similar (within threshold εDirPara\varepsilon_{Dir}^{Para}). This groups multiple collinear patterns into a ParaGroup.

    • Topological Adjacency Relations: Two collinear patterns ColkiCol_k^i and ColkjCol_k^j in the same cluster are spatially neighboring if at least one building in ColkiCol_k^i is spatially neighboring to a building in ColkjCol_k^j (based on the building neighborhood graph G). This forms a collinear-pattern neighborhood graph GkCol=(VkCol,EkCol)G_k^{Col} = (V_k^{Col}, E_k^{Col}).

    • Global Orientation of Collinear Patterns (ColDirCol_{Dir}): This describes the overall orientation trend of a collinear pattern, measured by the average azimuth angle of all neighboring buildings within the pattern. For Coll(V1,V2,,VN)Coll(V_1, V_2, \ldots, V_N), with EOri(Vi,Vj)E_{Ori}(V_i, V_j) as azimuth between neighboring buildings, its global orientation ColDirCol_{Dir} is: $ Col_{Dir} = \frac{\sum E_{Ori}(V_i, V_j)}{N} $ where EOri(Vi,Vj)E_{Ori}(V_i, V_j) must be normalized. If EOri(Vi,Vj)[πεOriMin,π)E_{Ori}(V_i, V_j) \in [\pi - \varepsilon_{Ori}^{Min}, \pi), it is transformed using: $ E_{Ori}(V_i, V_j) = \pi - E_{Ori}(V_i, V_j) $

  • Extracting Perpendicular Groups of Collinear Patterns: Two collinear patterns ColACol_A and ColBCol_B form a perpendicular group, Vert(ParaA,ParaB)Vert(Para_A, Para_B), if they satisfy Eq. (26): $ \left{ \begin{array}{ll} T_{Inte}\langle A, B \rangle = 1 \

    |Para_{Dir}^A - Para_{Dir}^B - \pi / 2| < \varepsilon_{Dir}^{Vert}

    \end{array} \right. $ where:

    • TInteA,B=1T_{Inte}\langle A, B \rangle = 1: indicates topological intersection.

    • ParaDirAParaDirBπ/2<εDirVert|Para_{Dir}^A - Para_{Dir}^B - \pi/2| < \varepsilon_{Dir}^{Vert}: indicates their global orientations are approximately perpendicular (within threshold εDirVert\varepsilon_{Dir}^{Vert}).

    • Topological Intersection Relation: Collinear patterns ColkiCol_k^i and ColkjCol_k^j topologically intersect if they share common buildings. Two groups of collinear patterns ParaAPara_A and ParaBPara_B intersect if at least one pair of patterns from each group intersects.

    • Global Orientation of Parallel Patterns (ParaDirPara_{Dir}): This measures the overall trend of all parallel patterns in a group, calculated as the average of their individual global orientations. For Para(Col1,Col2,,ColN)Para(Col_1, Col_2, \ldots, Col_N), its global orientation ParaDirPara_{Dir} is: $ Para_{Dir} = \frac{\sum Col_{Dir}^i}{N} $ where ColDiriCol_{Dir}^i is the global orientation of pattern ColiCol_i.

  • Extracting Grid Patterns: For perpendicular groups ParaAPara_A and ParaBPara_B, they form a grid pattern, Grid(i, j), if they satisfy Eq. (28): $ \left{ \begin{array}{ll} T_{Inte}\langle i, j \rangle = 1 \ Degree_{V_i} \geqslant 2 \end{array} \right. $ where:

    • TIntei,j=1T_{Inte}\langle i, j \rangle = 1: indicates topological intersection between the groups.
    • DegreeVi2Degree_{V_i} \geqslant 2: ensures connectivity within the grid. The connectivity (DegreeViDegree_{V_i}) of a building ViV_i in a perpendicular group refers to the number of its adjacent buildings within that pattern. This condition implies that buildings within a grid should have multiple connections to form a robust grid structure.

4.2.5.3. Evaluating Grid Patterns

The quality of grid patterns is evaluated based on homogeneity, number of buildings, and global orientation. Specifically, a smaller standard deviation of global orientations of collinear patterns within a grid pattern indicates better quality. For a grid pattern Grid(i, j), the quality index QGrid(i,j)Q_{Grid}(i, j) is defined as: $ Q_{Grid}(i, j) = \frac{Hom_{Grid}(i, j)}{Num_{Grid}(i, j)} + \frac{Std_{Grid}^{Dir}(i) + Std_{Grid}^{Dir}(j)}{90} $ where:

  • HomGrid(i,j)Hom_{Grid}(i, j): homogeneity of the grid pattern, calculated using Eqs. (19) and (20).
  • NumGrid(i,j)Num_{Grid}(i, j): number of buildings in the grid pattern.
  • StdGridDir(i)Std_{Grid}^{Dir}(i) and StdGridDir(j)Std_{Grid}^{Dir}(j): standard deviations of the global orientations of collinear patterns in the two perpendicular groups ii and jj. A smaller QGrid(i,j)Q_{Grid}(i, j) (normalized to be less than 3.0) indicates a better quality grid pattern.

5. Experimental Setup

5.1. Datasets

The experiments used building data at a scale of 1:10,000. Due to the wide coverage and dispersed distribution of buildings in the full dataset, three specific areas with dense buildings and regular arrangements were selected for analysis.

The following figure (Figure 10 from the original paper) shows the experimental datasets:

Fig. 2. The proposed framework for extracting building patterns. 该图像是论文中图2,展示了提出的建筑模式提取框架的流程示意图,包含自底向上的建筑聚类、多层次图划分,以及自顶向下的共线、网格模式提取步骤。

Fig. 3. Neighboring relations between buildings. 该图像是论文中图3的示意图,展示了两种不同的Delaunay三角剖分方式。左图为原始剖分,右图为改进后的剖分,通过调整连接关系以更好地反映建筑邻接关系。

该图像是一个示意图,展示了多边形区域A及其空间扩展区域`SBR(A)`的关系,图中用虚线框出扩展区域,形象说明了建筑分组中的空间邻接概念。 该图像是一个示意图,展示了多边形区域A及其空间扩展区域SBR(A)的关系,图中用虚线框出扩展区域,形象说明了建筑分组中的空间邻接概念。

The image shows Dataset 1 (top left), Dataset 2 (middle), and Dataset 3 (bottom).

  • Dataset 1: Contains 64 buildings. This dataset was primarily used to analyze the influences of different orientation models.
  • Dataset 2: Contains 139 buildings. This dataset, along with Dataset 3, was used to verify the applicability and reliability of the proposed methods, including quality evaluation, accuracy assessment, and parameter sensitivity analysis.
  • Dataset 3: Contains 2141 buildings. This is the largest dataset, located in an urban area of Beijing City, China. It is characterized by dense distribution and great diversity in building types, sizes, and shapes (e.g., commercial, cultural, educational, industrial, informal settlements, developing zones). This diversity makes it a robust test for the methods' applicability and reliability.

5.2. Evaluation Metrics

The accuracy of the extracted building patterns was evaluated using two standard measurements: correctness and completeness. The ground truth for pattern recognition was established by visual and manual interpretation by 10 engineers and 10 students with cartographic backgrounds.

  • Correctness:
    • Conceptual Definition: Correctness quantifies the proportion of the extracted patterns that are actually true patterns according to the ground truth. It measures how many of the identified patterns are genuinely correct.
    • Mathematical Formula: $ Correctness = \frac{TP}{TP + FP} $
    • Symbol Explanation:
      • TP (True Positives): The number of patterns correctly extracted (i.e., identified patterns that match a ground truth pattern).
      • FP (False Positives): The number of patterns incorrectly extracted (i.e., identified patterns that do not match any ground truth pattern).
  • Completeness:
    • Conceptual Definition: Completeness quantifies the proportion of all true patterns in the dataset that were successfully extracted by the method. It measures how many of the actual patterns in the environment were found.
    • Mathematical Formula: $ Completeness = \frac{TP}{TP + FN} $
    • Symbol Explanation:
      • TP (True Positives): The number of patterns correctly extracted.
      • FN (False Negatives): The number of patterns that exist in the ground truth but were not extracted by the method.

5.3. Baselines

For the comparison of orientation models in extracting building patterns, the proposed F-histogram model was compared against two widely used models:

  • Centroid Model: A simpler model that defines orientation based on the line connecting the centroids of two buildings.
  • Voronoi Graph Model: A model that derives orientation relations based on the Voronoi graph of building data.

6. Results & Analysis

6.1. Core Results Analysis

6.1.1. Identified Weights and Thresholds

The Relief-F algorithm was used to automatically determine the weights for the area, rectangularity, length-width ratio, and distance similarity components, as well as the overall similarity threshold, for each of the three datasets.

The following are the results from Table 1 of the original paper:

Data Weights The threshold of overall similarity
Area Rectangularity Length-width ratio Distance
Dataset 1 0.320 0.247 0.309 0.124 0.850
Dataset 2 0.246 0.327 0.329 0.098 0.850
Dataset 3 0.230 0.332 0.217 0.221 0.850

Analysis:

  • The Relief-F algorithm produced different weights for each dataset, indicating its adaptability to the specific characteristics of the input data. For instance, in Dataset 1, area and length-width ratio had higher weights, while in Datasets 2 and 3, rectangularity had the highest weight. This suggests that the relative importance of shape attributes varies depending on the urban morphology of the dataset.
  • The threshold of overall similarity was consistently set at 0.850 across all datasets. This implies a high degree of required similarity for buildings to be grouped into the same cluster.

6.1.2. Influences of Orientation Models

Dataset 1 (64 buildings) was used to compare the centroid model, Voronoi graph model, and F-histogram for pattern extraction.

  • Clustering Results: Using the weights from Table 1, 14 clusters were obtained for Dataset 1. These clusters were deemed high quality due to visual homogeneity and high correctness/completeness. The following figure (Figure 11 from the original paper) shows the clusters and extracted patterns for Dataset 1:

    Fig. 5. The visual distance based on triangles. 该图像是论文中的示意图,展示了基于三角形的视觉距离计算方法,图中标注了距离 did_i 和权重 wiw_i,用于表征建筑物空间关系。

    The image shows (a) clusters, (b) centroid model, (c) Voronoi graph model, (d) F-histogram maximum weight azimuth, (e) F-histogram weighted average azimuth, and (f) grid patterns.

  • Pattern Extraction Comparison: The extracted patterns based on different orientation models are shown in Figure 11b-e. The following are the results from Table 2 of the original paper:

    Orientation model Correctness (%) Completeness (%)
    Centroid model 100.0 91.3
    Voronoi graph model 95.2 87.0
    F-histogram (maximum weight azimuth) 100.0 95.7
    F-histogram (weighted average azimuth) 100.0 100.0

Analysis:

  • Both the centroid model and F-histogram (especially its variants) generally produced better results than the Voronoi graph model. The Voronoi graph model showed lower correctness (95.2%) and completeness (87.0%).
  • The centroid model achieved 100% correctness but slightly lower completeness (91.3%). However, the paper notes that the centroid model is primarily suitable for regular-shaped buildings, limiting its general applicability.
  • The F-histogram variants demonstrated the best performance. The maximum weight azimuth variant achieved 100% correctness and 95.7% completeness, while the weighted average azimuth variant achieved 100.0% correctness and 100.0% completeness. This strongly validates the F-histogram as the most reliable and applicable model for measuring orientation relations.
  • Grid Patterns: From the extracted collinear patterns, three grid patterns were successfully produced (Figure 11f) by setting global orientation thresholds. These grids were described as regular and high quality, further demonstrating the method's capability.

6.1.3. Applicability and Reliability (Datasets 2 and 3)

Datasets 2 (139 buildings) and 3 (2141 buildings) were used to test the method's general applicability and reliability on more diverse and larger urban scenes.

  • For Dataset 2, 37 clusters, 55 collinear patterns (Figure 12a, not explicitly provided but implied by context), and 5 grid patterns (Figure 12b, not explicitly provided but implied by context) were extracted. The following figure (Figure 13 from the original paper) shows the extracted collinear patterns in Dataset 3:

    Fig. 8. The azimuth angle of multiple buildings. \(\\theta\) refers to the azimuth angle, while EMPY \(\\delta\) is the bisector direction angle of the azimuth angle. 该图像是示意图,展示了多栋建筑物的方位角关系。图中用 heta 表示方位角,EMPY oldsymbol{oldsymbol{oldsymbol{oldsymbol{oldsymbol{oldsymbol{oldsymbol{oldsymbol{oldsymbol{oldsymbol{oldsymbol{oldsymbol{oldsymbol{oldsymbol{oldsymbol{oldsymbol{oldsymbol{oldsymbol{oldsymbol{oldsymbol{ heta}}}}}}}}}}}}}}}}}}} 表示方位角的平分角,描述建筑群的空间方向特征。

    Fig. 13. Extracted collinear patterns in Dataset 3.

  • For Dataset 3, 811 clusters, 619 collinear patterns (Figure 13), and 116 grid patterns (Figure 14) were extracted. The following figure (Figure 14 from the original paper) shows the extracted grid patterns in Dataset 3:

    Fig. 7. F-histogram. 该图像是图表,包含两组建筑示意及对应的F-直方图,展示了建筑A和建筑B在不同排列方式下的角度分布,左图建筑较为平行,右图建筑呈L形排列,直方图分别显示了建筑方向的主要角度。

    Fig. 14. Extracted grid patterns in Dataset 3.

6.1.4. Quality Evaluation

The quality indices (QcollQ_{coll} and QCurvQ_{Curv}) were calculated for the patterns in Datasets 2 and 3.

The following are the results from Table 3 of the original paper:

Data Quality indices of collinear patterns Quality indices of grid patterns
Maximum Minimum Average Maximum Minimum Average
Dataset 2 0.141 0.003 0.044 0.093 0.035 0.067
Dataset 3 1.082 0.001 0.073 0.135 0.010 0.053

Analysis:

  • For collinear patterns, Dataset 2 showed a maximum index of 0.141 and an average of 0.044. Dataset 3 had a higher maximum (1.082) but a similar average (0.073). The high maximum in Dataset 3 could be due to a few outliers or less optimal patterns.

  • For grid patterns, both datasets showed relatively low maximums (0.093 and 0.135) and low averages (0.067 and 0.053), indicating overall good quality.

  • Index Distributions: Histograms of quality indices (Figure 15 for collinear, Figure 16 for grid) show that the majority of patterns for both types fall within the [0, 0.1] interval, which is considered satisfactory. Few patterns fall into the [0.2, 1.0] (unacceptable) range. The following figure (Figure 15 of the original paper) shows the index distributions of collinear patterns:

    Fig. 9. Grid building patterns. 该图像是论文中图9的示意图,展示了两种建筑布局模式:左侧为规则网格状排列,右侧为不规则排列。图中矩形代表建筑,用颜色区分不同类型。

    Fig. 15. Index distributions of collinear patterns. The following figure (Figure 16 of the original paper) shows the index distributions of grid patterns:

    该图像是建筑群布局的示意图,展示了利用多层次图划分方法提取出的建筑分布形态,反映了建筑的空间关系和排列规律。 该图像是建筑群布局的示意图,展示了利用多层次图划分方法提取出的建筑分布形态,反映了建筑的空间关系和排列规律。

    Fig. 16. Index distributions of grid patterns.

  • Excellent Rate: The paper defines an excellent rate as the ratio of satisfactory and normal patterns (index < 0.2) to the total. This rate is consistently high, indicating that the vast majority of extracted patterns are of acceptable quality.

6.1.5. Accuracy Assessment

The overall accuracy of the extracted patterns is presented.

The following are the results from Table 4 of the original paper:

Data Collinear patterns Grid patterns
Correctness (%) Completeness (%) Correctness (%) Completeness (%)
Dataset 2 90.9 92.6 100.0 83.3
Dataset 3 92.8 95.6 93.9 95.1

Analysis:

  • The method achieved high accuracies across both datasets and pattern types. For collinear patterns, correctness was 90.9% (Dataset 2) and 92.8% (Dataset 3), with completeness at 92.6% and 95.6% respectively.
  • For grid patterns, correctness was 100.0% (Dataset 2) and 93.9% (Dataset 3), with completeness at 83.3% and 95.1% respectively. The lower completeness for grid patterns in Dataset 2 (83.3%) might suggest some grid structures were missed, possibly due to strict thresholds or complex arrangements.
  • Overall, the results demonstrate that the proposed methods can produce satisfactory results and are reliable and applicable even for large and diverse datasets like Dataset 3.

6.1.6. Parameter Sensitivity Analysis

The influence of two key parameters for collinear pattern extraction was analyzed: the angle difference of main directions (EDirE_{Dir}) and the azimuth angle (EOriE_{Ori}). The total number of patterns, the excellent rate, and the average quality indices were used as reference metrics.

  • Influence of Angle Difference of Main Directions (EDirE_{Dir}):

    • Thresholds for projection overlapping and azimuth angle were fixed (0.5 and 15 respectively).

    • The threshold for EDirE_{Dir} was varied from 6, 8, 10, 12 to 14. The following figure (Figure 17 of the original paper) shows the parameter sensitivity analysis for angle difference of main directions:

      该图像是多模型建筑群空间结构示意图,展示了建筑簇(a)、中心模型(b)、沃罗诺依图模型(c)、F-直方图最大权值方向(d)、加权平均方向(e)及网格模式(f)的对比,体现建筑模式的多层次分析。 该图像是多模型建筑群空间结构示意图,展示了建筑簇(a)、中心模型(b)、沃罗诺依图模型(c)、F-直方图最大权值方向(d)、加权平均方向(e)及网格模式(f)的对比,体现建筑模式的多层次分析。

    The image shows the trends of average indices and excellent rates for Dataset 2 and Dataset 3 as the angle difference threshold increases. Analysis: As the threshold for EDirE_{Dir} increased: * The average quality indices increased (became worse). * The excellent rate decreased. * The number of extracted collinear patterns increased. This indicates a negative correlation between a larger angle difference threshold and pattern quality/accuracy. A higher threshold relaxes constraints, leading to more patterns (including potentially incorrect ones) and reduced quality.

  • Influence of Azimuth Angle (EOriE_{Ori}):

    • The threshold for EOriE_{Ori} was varied from 11, 13, 15, 17 to 19. The following figure (Figure 18 of the original paper) shows the parameter sensitivity analysis for the azimuth angle:

      该图像是论文中展示的示意图,比较了建筑排列的两种模式:a部分为共线模式,b部分为网格模式,并用虚线和红色框标示了建筑群的边界和典型区域,体现了多层次图划分和建筑分组的应用。 该图像是论文中展示的示意图,比较了建筑排列的两种模式:a部分为共线模式,b部分为网格模式,并用虚线和红色框标示了建筑群的边界和典型区域,体现了多层次图划分和建筑分组的应用。

    The image shows the trends of average indices and excellent rates for Dataset 2 and Dataset 3 as the azimuth angle threshold increases. Analysis: As the threshold for EOriE_{Ori} increased: * The average quality indices tended to increase (became worse) for both datasets. * The excellent rate remained unchanged for Dataset 2 but decreased for Dataset 3. * The number of extracted collinear patterns dropped. This seemingly counterintuitive result is explained by the paper: a larger threshold for the azimuth angle means more buildings satisfy the conditions of collinear patterns, leading to the merging of smaller patterns into larger ones. This effectively reduces the count of individual patterns, although the quality might decrease. This also shows a negative correlation between a larger azimuth angle threshold and accuracy.

6.2. Discussion

The paper effectively addresses the limitations of existing studies by providing an integrated solution.

  • Integrated Approach: Unlike previous methods that focused either on building clustering (like MST, DBSCAN, ASCDT) without extracting meaningful patterns, or on pattern analysis for specific patterns (linear templates, align-along-road), this study combines both. The multilevel graph partition efficiently generates high-quality clusters, which then feed into a graph-based grouping method for diverse pattern extraction. This integration leads to high accuracy (over 90%).
  • Comprehensive Pattern Typology: The approach handles a broader range of patterns (collinear, curvilinear, parallel, perpendicular, grid) in a systematic way, contrasting with studies limited to specific patterns (H-pattern, Z-pattern, etc.) or generic linear/non-linear types.
  • Automated Weight Optimization: The use of the Relief-F algorithm to automatically determine similarity weights is a significant improvement over manual selection, which is subjective and time-consuming. While acknowledging that optimal weights might be local rather than global for very large datasets, the data-driven nature of Relief-F is a clear advantage.
  • Superior Orientation Model: The empirical comparison demonstrated that F-histogram is superior to the centroid and Voronoi graph models for representing orientation relations, leading to more accurate pattern extraction. This is crucial for capturing the nuanced spatial relationships between areal objects.
  • Parameter Sensitivity: The analysis of angle difference and azimuth angle thresholds highlights their critical impact on the number and quality of extracted patterns. The need for users to specify these values based on their requirements is acknowledged, and future work is suggested for automatic parameter calibration.

7. Conclusion & Reflections

7.1. Conclusion Summary

This study successfully developed and validated an integrated strategy for extracting building patterns from remote sensing imagery. The methodology innovatively combines multilevel graph partition for robust, bottom-up building clustering with a top-down, graph-based grouping method for detailed pattern recognition. This approach allows for the extraction of a comprehensive typology of patterns, including collinear (straight and oblique), curvilinear, parallel and perpendicular groups, and grid patterns. A key advancement is the use of the Relief-F algorithm to automatically determine optimal weights for similarity measurements, making the clustering process more objective and efficient. Furthermore, the F-histogram model was empirically shown to be more effective than traditional centroid and Voronoi graph models for representing building orientation relationships. Experimental results across various urban scenes demonstrated the strategy's ability to produce high-quality patterns with over 90% accuracy, proving its scalability, efficiency, and reliability for automated urban analysis and geographic information systems.

7.2. Limitations & Future Work

The authors pointed out the following limitations and future research directions:

  • Parameter Calibration: While the influence of some key parameters (e.g., angle difference of main directions, azimuth angle) was analyzed, other parameters were fixed manually. Future work needs to explore how these parameters interact and develop methods to automatically calibrate all parameters used in the strategy.
  • Adaptive Weights: Although Relief-F automatically learns weights, different local data regions might benefit from locally adaptive weights rather than global ones. The paper suggests further quantitative evaluation on whether globally identified weights or locally learned weights are better, as there's no principled way to know the answer currently.
  • Application to Diverse Fields: The paper concludes by stating that future work will focus on applying the extracted patterns to diverse fields, such as analyzing urban landscape configurations and functions and assisting the interpretation of remote sensing images.

7.3. Personal Insights & Critique

This paper presents a robust and well-structured approach to a challenging problem in urban analysis. The integration of multilevel graph partitioning for clustering with specific pattern recognition rules is a powerful combination, effectively bridging the gap between low-level geometric similarities and high-level semantic patterns. The automatic weight determination using Relief-F is a particularly strong contribution, addressing a common subjectivity in similar studies. The empirical validation of F-histogram over other orientation models also provides valuable practical guidance for future research.

Critique and Potential Improvements:

  • Computational Cost: While multilevel graph partitioning is efficient, the overall computational cost for very large-scale urban areas with millions of buildings could still be substantial, especially considering the Delaunay triangulation and Relief-F computations. Exploring more scalable or parallelized implementations could be beneficial.

  • Pattern Overlap and Hierarchy: Although the paper mentions post-processing for overlapping collinear patterns, the complexity of urban landscapes often involves nested or partially overlapping patterns (e.g., a grid pattern containing smaller linear alignments that are themselves part of a larger, irregular cluster). A more explicit framework for handling such complex hierarchical or overlapping relationships could enhance the method's robustness.

  • Semantics and Context: The method is largely geometric. While it aims to infer urban functions, it doesn't explicitly incorporate external semantic information (e.g., land-use maps, road networks, building type labels from other sources). Integrating such contextual data could further improve the accuracy and semantic richness of the extracted patterns, especially for identifying align-along-road patterns which are only mentioned in related work but not explicitly extracted by this method.

  • Curvilinear Pattern Definition: The definition of curvilinear patterns relies on ranges of angles and deviations. While practical, the current definition might struggle with highly irregular or complex curvilinear forms that deviate significantly from a smooth curve. More advanced curve fitting or shape analysis techniques might be needed for such cases.

  • Interpretability of Relief-F Weights: While Relief-F provides weights, the interpretability of how these weights truly map to human cognitive importance could be further explored. Do the learned weights align with intuitive understanding of area, shape, or distance importance in pattern perception?

    Overall, the paper provides a strong foundation for automated urban pattern extraction, and its methodological rigor and comprehensive evaluation make it a valuable contribution to the field. Its methods could be transferable to other domains requiring spatial pattern recognition from geometric primitives, such as identifying geological formations, agricultural field patterns, or even abstract data visualizations.

Similar papers

Recommended via semantic vector search.

No similar papers found yet.