VBASE: Unifying Online Vector Similarity Search and Relational Queries via Relaxed Monotonicity
TL;DR Summary
VBASE unifies approximate vector similarity search and relational queries via relaxed monotonicity, overcoming TopK index limits and boosting complex query performance by up to 1000x while preserving semantics.
Abstract
Published in the Proceedings of the 17th USENIX Symposium on Operating Systems Design and Implementation (OSDI '23), July 10–12, 2023, Boston, MA, USA. Authors: Qianxi Zhang, Shuotao Xu, Qi Chen, and Guoxin Sui (Microsoft Research Asia); Jiadong Xie (Microsoft Research Asia and East China Normal University); Zhizhen Cai and Yaoqi Chen (Microsoft Research Asia and University of Science and Technology of China); Yinxuan He (Microsoft Research Asia and Renmin University of China); Yuqing Yang, Fan Yang, Mao Yang, and Lidong Zhou (Microsoft Research Asia). https://www.usenix.org/conference/osdi23/presentation/zhang-qianxi
1. Bibliographic Information
1.1. Title
VBASE: Unifying Online Vector Similarity Search and Relational Queries via Relaxed Monotonicity
The title clearly states the paper's core contribution: a system named VBASE that unifies two distinct query types—vector similarity search and traditional relational queries. The key mechanism for this unification is a novel concept called "Relaxed Monotonicity."
1.2. Authors
Qianxi Zhang, Shuotao Xu, Qi Chen, Guoxin Sui, Jiadong Xie, Zhizhen Cai, Yaoqi Chen, Yinxuan He, Yuqing Yang, Fan Yang, Mao Yang, and Lidong Zhou.
The authors are primarily from Microsoft Research Asia, with collaborators from East China Normal University, University of Science and Technology of China, and Renmin University of China. This team combines expertise in database systems, large-scale systems, and machine learning, which is fitting for a paper that bridges these domains.
1.3. Journal/Conference
17th USENIX Symposium on Operating Systems Design and Implementation (OSDI '23).
OSDI is a premier, top-tier conference in the field of computer systems research, covering areas like operating systems, distributed systems, and the intersection of systems with other fields like machine learning and databases. Publication at OSDI signifies that the work is considered highly innovative, rigorous, and impactful by the systems community.
1.4. Publication Year
2023 (OSDI '23 was held July 10–12, 2023).
1.5. Abstract
The abstract introduces the growing need to integrate approximate vector similarity search with relational databases for complex online services. It identifies a key challenge: high-dimensional vector indices lack monotonicity, a property fundamental to conventional database indices (like B-trees). This forces existing systems to use a suboptimal, two-step process: first, fetch a large set of TopK candidate vectors, and then apply relational operators on this temporary, sorted set. The core problem is the difficulty in predicting the right candidate size K′.
This paper presents VBASE, a system that overcomes this issue by identifying a unifying property called relaxed monotonicity, which is common to both vector and scalar indices. This property allows VBASE to bypass the rigid TopK-only interface and create a unified query engine. The authors claim this engine is significantly more efficient while provably maintaining the semantic equivalence of TopK-based solutions. The experimental results highlight VBASE's superior performance, showing up to three orders-of-magnitude speedup on complex queries and enabling new types of analytical queries (like vector joins) with a 7,000x speedup over exact methods.
1.6. Original Source Link
- Official Link: https://www.usenix.org/system/files/osdi23-zhang-qianxi_1.pdf
- Publication Status: Officially published in the proceedings of OSDI '23.
2. Executive Summary
2.1. Background & Motivation
The core problem addressed by this paper lies at the intersection of two powerful but fundamentally different data processing paradigms: relational databases and vector search systems.
- The Rise of Vector Data: Modern AI models (embeddings) can convert complex data like images, text, and audio into high-dimensional numerical vectors. Searching for "similar" items now means finding vectors that are "closest" in a high-dimensional space. This "vector similarity search" is crucial for applications like recommendation systems, semantic search, and AI-powered chatbots.
- The Need for Hybrid Queries: Applications increasingly need to combine vector similarity search with traditional database queries. For example: "Find the 10 dresses most visually similar to this image, but only show those under $50 and available in my size." This query involves a vector search (visual similarity) and relational filters (price < 50, size = 'M').
- The Fundamental Mismatch (The "Monotonicity" Gap):
  - Relational Databases are built on indices like B-trees, which have a property called monotonicity. This means data is sorted, so if you're looking for the top 10 cheapest items, you can scan the index in ascending price order and stop once you've found 10. The index guarantees you won't find a cheaper item later.
  - Vector Search Systems use specialized indices (like HNSW graphs or IVF clusters) for Approximate Nearest Neighbor Search (ANNS). Due to the "curse of dimensionality," these indices cannot guarantee monotonicity. When traversing a vector index, the distance to the target vector can fluctuate unpredictably. You might find a close vector, then a far one, then an even closer one. These indices are only optimized to find the approximate TopK nearest neighbors.
- The Inefficient Workaround: To bridge this gap, existing "vector databases" use a clumsy, two-stage workaround (see the sketch after this list):
  - Stage 1 (Vector Search): For a query like "find 10 similar items with price < 50," they first ask the vector index for a large number of candidates, TopK′, where K′ is a big number (e.g., 1000). This is a guess, hoping that at least 10 of these 1000 items will satisfy the price filter.
  - Stage 2 (Relational Processing): They take these 1000 vectors, sort them by distance, and then apply the price filter to get the final results.
  This approach is highly inefficient. If K′ is too small, you might not get 10 results. If K′ is too large, you waste a lot of time fetching and processing unnecessary vectors. Predicting the "optimal" K′ is nearly impossible.
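To make the workaround concrete, here is a minimal Python sketch of the TopK-then-filter strategy. The `vector_index.topk` call and the `metadata` price lookup are hypothetical stand-ins for a real vector index and metadata store, not code from the paper:

```python
def topk_then_filter(vector_index, metadata, query_vec, k=10, k_prime=1000):
    """Two-stage workaround: over-fetch K' candidates, then filter.

    If k_prime is too small we return fewer than k results; if too
    large we waste time fetching and re-ranking useless candidates.
    """
    # Stage 1: ask the vector index for K' >> K candidates (a guess).
    candidates = vector_index.topk(query_vec, k_prime)  # [(id, distance), ...]

    # Stage 2: sort by distance, apply the relational filter, keep K.
    candidates.sort(key=lambda pair: pair[1])
    results = [(vid, d) for vid, d in candidates if metadata[vid]["price"] < 50]
    return results[:k]  # may hold fewer than k items if k_prime was too small
```

Every choice of `k_prime` here is a gamble, which is exactly the guesswork VBASE removes.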
2.2. Main Contributions / Findings
VBASE introduces a more elegant and efficient solution by fundamentally rethinking the integration.
- Identified "Relaxed Monotonicity": The authors' key insight is that while vector indices are not strictly monotonic, well-designed ones exhibit a predictable behavior: after an initial search phase of finding the approximate neighborhood of a query vector, the traversal consistently moves away from it. They formalize this common behavior as Relaxed Monotonicity. This property serves as a generalized form of monotonicity that applies to both vector and traditional indices.
- A Unified Query Engine: Based on relaxed monotonicity, VBASE builds a unified query execution engine. Instead of relying on a TopK interface, it exposes the internal traversal of vector indices through a standard database Next() iterator interface. This allows the database engine to process vectors one by one, just like it would with a B-tree. It can apply filters on the fly and use the relaxed monotonicity property as a signal to terminate the search dynamically and efficiently, without pre-guessing a K′.
- Provable Equivalence: The paper proves that this new execution model produces results that are equivalent to a TopK-based solution that magically knows the "optimal" K′. This means VBASE achieves the performance of a perfectly tuned system without the need for prediction or trial-and-error.
- Superior Performance and New Capabilities:
  - VBASE is shown to be up to three orders of magnitude (1000x) faster than state-of-the-art systems on complex hybrid queries.
  - It enables entirely new query types that were previously impractical, such as performing a JOIN operation based on vector similarity. For this, it achieves a 7,000x speedup over brute-force methods with over 99.9% accuracy.
3. Prerequisite Knowledge & Related Work
3.1. Foundational Concepts
- High-Dimensional Vectors (Embeddings): In machine learning, an embedding is a way to represent complex, non-numeric data (like words, sentences, images) as dense numerical vectors in a multi-dimensional space. For example, an image can be converted into a 1024-dimensional vector. The key idea is that semantically similar items (e.g., two pictures of cats) will have vectors that are close to each other in this space.
- Vector Similarity Search: This is the task of finding vectors in a dataset that are "closest" to a given query vector. Closeness is measured by a distance function. Common metrics include (see the snippet after this list):
  - Euclidean Distance (L2 Distance): The straight-line distance between two points (vectors). Lower is better.
  - Inner Product: A measure of projection. For normalized vectors, a higher inner product means greater similarity.
  - Cosine Similarity: The cosine of the angle between two vectors. A value of 1 means identical direction (most similar), while -1 means opposite.
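As a quick illustration, all three metrics can be computed in a few lines of NumPy (an illustrative snippet, assuming NumPy is available; not from the paper):

```python
import numpy as np

a = np.array([0.1, 0.7, 0.2])
b = np.array([0.3, 0.6, 0.1])

l2 = np.linalg.norm(a - b)   # Euclidean distance: lower = more similar
ip = float(np.dot(a, b))     # inner product: higher = more similar
cos = ip / (np.linalg.norm(a) * np.linalg.norm(b))  # cosine similarity in [-1, 1]
```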
- Curse of Dimensionality: As the number of dimensions in a space increases, the volume of the space grows exponentially. This makes many classical data structures and algorithms (like k-d trees) that work well in low dimensions become extremely inefficient. In high-dimensional spaces, the concept of "neighborhood" becomes less meaningful, and finding the exact nearest neighbors often requires a full scan of the dataset, which is too slow for online services.
- Approximate Nearest Neighbor Search (ANNS): To overcome the curse of dimensionality, ANNS algorithms were developed. They trade perfect accuracy for massive speed improvements. Instead of guaranteeing the discovery of the absolute nearest neighbors, they aim to find vectors that are among the true nearest neighbors with high probability. The quality of an ANNS algorithm is often measured by recall, the percentage of true nearest neighbors found.
- Monotonicity: This is a property of sorted data structures. If you are traversing an index (like a B-tree sorted by price) in a specific order, you have a guarantee about the data you will see next. For example, when scanning items by increasing price, the price of each subsequent item will be greater than or equal to the previous one. This allows for efficient query termination. For a TopK query, you can stop after finding K items. For a range query (e.g., price < 100), you can stop as soon as you see an item with a price of 100 or more.
- Volcano/Iterator Model: This is a standard query execution model in most relational database systems. A query plan is represented as a tree of operators (e.g., Scan, Filter, Join). To get results, the root operator calls Next() on its child operator(s), which in turn call Next() on their children, and so on, down to the Scan operators at the leaves, which read data from tables or indices. Data flows up the tree one tuple (row) at a time. Each operator implements three basic functions: Open() (initialization), Next() (get the next tuple), and Close() (cleanup). A minimal sketch of this model follows.
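A minimal Python sketch of the iterator model, with illustrative class names (this mirrors the general Volcano pattern, not PostgreSQL's actual C implementation):

```python
class Operator:
    """Base interface every operator in the plan tree implements."""
    def open(self): ...
    def next(self):  # returns one tuple, or None when exhausted
        ...
    def close(self): ...

class Filter(Operator):
    def __init__(self, child, predicate):
        self.child, self.predicate = child, predicate

    def open(self):
        self.child.open()

    def next(self):
        # Pull tuples from the child until one passes the predicate.
        while (row := self.child.next()) is not None:
            if self.predicate(row):
                return row
        return None

    def close(self):
        self.child.close()
```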
3.2. Previous Works
The paper positions itself against two main categories of existing systems:
- ANNS Libraries (e.g., FAISS, HNSW):
  - FAISS (Facebook AI Similarity Search): A library for efficient similarity search. It includes many ANNS index types, one of the most popular being IVFFlat. This is a partition-based index that first clusters all vectors into partitions (like buckets) and then, during a query, only searches the partitions closest to the query vector.
  - HNSW (Hierarchical Navigable Small World graphs): A graph-based ANNS index. It represents vectors as nodes in a multi-layered graph. Search starts on a sparse, high-level graph for coarse navigation and progressively moves to denser, lower-level graphs to refine the search.
  - Limitation: These are libraries, not full-fledged databases. They typically expose only a TopK(query_vector, K) interface, which is insufficient for complex relational queries (see the usage sketch below).
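For reference, this is what the TopK-only interface looks like in standard FAISS usage (dimensions and data here are illustrative; `IndexFlatL2` is an exact index, but IVFFlat and HNSW indices expose the same `search` entry point):

```python
import faiss
import numpy as np

dim = 64
xb = np.random.rand(10_000, dim).astype("float32")  # database vectors
xq = np.random.rand(1, dim).astype("float32")       # one query vector

index = faiss.IndexFlatL2(dim)
index.add(xb)
distances, ids = index.search(xq, 10)  # the only entry point: TopK(query, K)
```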
- Vector Databases (e.g., Milvus, PASE, Elasticsearch): These systems attempt to integrate vector search into a database-like framework.
  - Milvus: A purpose-built vector data management system. For complex queries (e.g., multi-vector search), the paper notes that Milvus resorts to a trial-and-error approach. It starts with a small K′, and if it doesn't get enough results after filtering, it doubles K′ and re-runs the entire TopK query, leading to massive inefficiency.
  - PASE (PostgreSQL Ultra-high-dimensional Approximate nearest neighbor Search Extension): An extension for PostgreSQL that integrates ANNS. To handle filtered search, PASE fetches a conservatively large number of candidates (K′) from the vector index and then applies filters. This avoids trial-and-error but often fetches and processes far more data than necessary.
  - AnalyticDB-V & Elasticsearch: These systems follow a similar TopK-then-filter strategy. They create a temporary, monotonic index on the results to perform subsequent operations.
3.3. Technological Evolution
The field has evolved from having two separate worlds—relational databases for structured data and specialized libraries for vector search—to a new stage where integration is a primary goal.
- Stage 1: Separation. Developers had to manage two systems: a database (like PostgreSQL) for metadata and a vector search index (like FAISS) for embeddings. The application logic had to query both and join the results manually.
- Stage 2: Coarse-Grained Integration. Systems like Milvus and PASE emerged. They package a database and a vector index together. However, the integration is "shallow" or "coarse-grained." The database engine treats the vector index as a black box that only speaks TopK. This leads to the suboptimal TopK-then-filter workaround.
- Stage 3 (VBASE's Contribution): Deep Integration. VBASE proposes a "deep" integration. Instead of using the TopK black box, it opens it up and integrates the fundamental index traversal logic directly into the database's own execution model (the Volcano/Iterator model). This is made possible by the Relaxed Monotonicity abstraction.
3.4. Differentiation Analysis
The core difference between VBASE and all prior vector databases is how and where the unification happens.
- Prior Systems (Milvus, PASE, etc.): Unify at the query result level. They run a TopK vector query first, get a static list of results, and then apply relational logic to this list. The vector index traversal and the relational processing are two separate, sequential phases.
- VBASE: Unifies at the index traversal level. It treats vector index traversal as just another iterator that can be composed with other database operators like Filter and Join. The search termination is dynamic, based on the Relaxed Monotonicity property, allowing the system to stop as soon as enough valid results are found. This avoids fetching a large, arbitrary number of candidates, leading to significant performance gains.
4. Methodology
The methodology of VBASE is centered around the formalization of an observed property in vector indices and its application in a unified query engine.
4.1. Principles
The guiding principle of VBASE is that while high-dimensional vector index traversals are not strictly monotonic, they are not completely random either. The authors observed a common two-phase pattern in state-of-the-art ANNS indices like FAISS IVFFlat and HNSW.
The following chart from the paper illustrates this pattern:
Figure 1 (from the paper): Traversal patterns of two vector indices, (a) FAISS IVFFlat and (b) HNSW. The x-axis is the traversal step count and the y-axis is the distance to the query vector, showing how the distance evolves during the search.
- Phase 1: Approach Phase. The search starts and quickly navigates towards the region of the query vector. During this phase, the distance to the query vector can oscillate significantly, but the general trend is to find closer and closer neighbors.
- Phase 2: Departing Phase. After exploring the nearest region, the search starts to move progressively away. In this phase, it is highly unlikely that a vector closer than the ones already found will be discovered.

The key idea of VBASE is to formally define this transition into Phase 2. If the system can reliably detect that it has entered the departing phase, it can terminate the query early once enough results have been collected, even without a strict monotonicity guarantee. This formalized property is named Relaxed Monotonicity.
4.2. Core Methodology In-depth
4.2.1. The Formal Definition of Relaxed Monotonicity
To formalize the two-phase pattern, the authors introduce a model based on the intuition of a "neighbor sphere" around the query vector q, as illustrated below.
Figure 2 (from the paper): The intuition behind relaxed monotonicity. The neighborhood of the query vector q is a sphere of radius R_q containing the E vectors nearest to q found so far; a window over the most recent w steps of the traversal has median distance M_q^s to q.
The goal is to determine when the index traversal has moved outside this sphere and is unlikely to return. This requires defining two key quantities: the radius of the sphere and the current position of the traversal.
- Radius of the Neighbor Sphere (R_q): This radius is defined by the E nearest neighbors to the query vector that have been discovered so far during the traversal, up to step s-1. E is a parameter that must be at least as large as the desired number of results K for a TopK query. R_q is the distance of the farthest of these E neighbors.
  The formal definition is given by Equation 1: $ R_q = Max(TopE(\{Distance(q, \nu_j) \mid j \in [1, s-1]\})) $
  - Explanation of Symbols:
    - q: The query vector.
    - ν_j: The vector visited at traversal step j.
    - Distance(q, ν_j): The distance between the query vector and the visited vector.
    - s-1: The number of steps taken so far.
    - TopE: A function that returns the set of the E smallest distances from the set provided.
    - Max: A function that returns the maximum value of the TopE set. In essence, R_q is the distance to the E-th closest vector found so far.
- Current Traversal Position's Distance (M_q^s): Simply using the distance of the single vector at the current step can be noisy due to outliers. To get a more stable measure of the traversal's current region, the authors define M_q^s as the median distance of all vectors visited in the most recent w steps (a "traversal window").
  The formal definition is given by Equation 2: $ M_q^s = Median(\{Distance(q, \nu_i) \mid i \in [s-w+1, s]\}) $
  - Explanation of Symbols:
    - w: The size of the traversal window, a hyperparameter.
    - s: The current traversal step.
    - Median: The median function, used to provide a robust estimate of the central tendency of distances in the current window, ignoring outliers.
- The Definition of Relaxed Monotonicity: An index traversal is said to satisfy relaxed monotonicity if there exists a point in time (step s) after which the traversal's current position (M_q^t) consistently remains outside the neighbor sphere (radius R_q).
  This is formally stated in Definition 1 (Equation 3): $ \exists s,\ M_q^t \geq R_q,\ \forall t \geq s $
  - Explanation: This means "there exists a step s such that, for all future steps t (where t ≥ s), the median distance of the current window is greater than or equal to the distance of the E-th nearest neighbor found so far." When this condition is met, the system has entered the "departing phase" and can consider stopping the search. The parameters E and w capture the internal characteristics of different index types and can be tuned; for example, for HNSW, E corresponds to the search parameter ef and w is 1, while for IVFFlat, E is the query's K and w is the number of vectors in the clusters being scanned. A sketch of this check appears after this list.
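The definitions above translate directly into a small termination check. The following is a minimal sketch, assuming the traversal supplies a running list of distances and a window of the most recent w distances (function and variable names are ours, not VBASE's):

```python
import heapq
import statistics

def relaxed_monotonicity_reached(distances_so_far, window, E):
    """Check Definition 1 at the current traversal step.

    distances_so_far: all Distance(q, v_j) seen so far, in traversal order.
    window:           the last w distances (the traversal window).
    E:                neighbor-sphere size (E >= K for a TopK query).
    """
    if len(distances_so_far) < E or not window:
        return False
    # R_q: distance of the E-th closest vector found so far (Eq. 1).
    r_q = max(heapq.nsmallest(E, distances_so_far))
    # M_q^s: median distance over the last w steps (Eq. 2).
    m_q = statistics.median(window)
    return m_q >= r_q  # traversal has left the neighbor sphere
```

In practice a system would evaluate this predicate at every step and treat the first step where it holds as entry into the departing phase.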
4.2.2. Unified Query Execution Engine
With relaxed monotonicity as the unifying principle, VBASE builds its engine by making minimal but powerful changes to a traditional database engine (PostgreSQL).
- Iterator Model for Vector Indices: VBASE modifies vector index libraries like HNSW and FAISS. Instead of their native TopK interface, it exposes their internal traversal logic through the standard database iterator interface: Open(), Next(), and Close(). A call to Next() on a vector index now returns the next candidate vector according to the index's internal traversal algorithm.
- Generalized Termination Condition: The query execution logic is extended. A query operator (like an index scan) will terminate only when two conditions are met:
  1. The original query condition is satisfied (e.g., for a LIMIT K query, K results have been found).
  2. The relaxed monotonicity check passes (i.e., M_q^s ≥ R_q). This check provides the confidence to stop, as it is now unlikely that better results will be found.
  This unified engine allows for powerful optimizations. For a filtered TopK query, the filter can be applied to each vector as it is returned by the Next() call. The system keeps track of how many vectors have passed the filter and terminates only when the required number of filtered results has been found AND the relaxed monotonicity check passes. This dynamically finds the minimal number of vectors that need to be visited, as the sketch below illustrates.
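Combining the iterator interface with the termination condition yields a compact filtered-TopK loop. This sketch reuses `relaxed_monotonicity_reached` from above; `index_iter` stands in for the Next()-style traversal and `passes_filter` for the relational predicate (all names illustrative):

```python
from collections import deque

def filtered_topk(index_iter, passes_filter, k, E, w):
    """Stream candidates, filter on the fly, stop via relaxed monotonicity."""
    seen, window, hits = [], deque(maxlen=w), []
    for vid, dist in index_iter:  # each iteration mimics Next() on the index
        seen.append(dist)
        window.append(dist)
        if passes_filter(vid):
            hits.append((dist, vid))
        # Terminate only when BOTH hold: K filtered results are in hand,
        # and the traversal has entered the departing phase.
        if len(hits) >= k and relaxed_monotonicity_reached(seen, window, E):
            break
    hits.sort()
    return [vid for _, vid in hits[:k]]
```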
4.2.3. Result Equivalence Proof
The paper proves that this dynamic approach is not only faster but also produces results equivalent to those of an ideal TopK-based system. Here is a breakdown of the proof for a filtered TopK query, as illustrated in the figure below.
Figure 3 (from the paper): Result equivalence between a TopK-interface-based system and VBASE, comparing their index traversal and query processing flows and highlighting how VBASE replaces the traditional TopK interface with relaxed monotonicity.
- TopK-based System (Left):
  - It must first guess a candidate size K′. It traverses the index, fetching K′ vectors, to satisfy this TopK(K′) request.
  - The resulting vectors are sorted.
  - A Filter is applied, and finally the result is limited to the desired K items. The final result is r_1. The process is formulated as:
  $ r_1 = Limit_K(Filter(Limit_{K'}(Sort(R_1)))) $
  For this to work correctly and optimally, the system must choose the "optimal K-tilde" ($\tilde{K}$), which is the smallest K′ that yields exactly K results after filtering. Assuming this optimal $\tilde{K}$ is known, the equation simplifies because the outer Limit_K will be satisfied exactly.
- VBASE System (Right):
  - VBASE traverses the index using its Next() interface, producing a stream of vectors R_2.
  - It applies the Filter to each vector on the fly.
  - The vectors that pass the filter are collected, sorted, and limited to K. The final result is r_2. The process is formulated as:
  $ r_2 = Limit_K(Sort(Filter(R_2))) $
- The Equivalence: The paper argues that since both systems use the same underlying vector index and the same termination logic (relaxed monotonicity), they traverse the exact same set of vectors to find the required results; therefore R_1 = R_2. Since the Sort and Filter operators are commutative in this context, the final result sets r_1 and r_2 are identical. VBASE is superior because it discovers the size of this set ($\tilde{K}$) dynamically, whereas the TopK-based system must guess it (often sub-optimally). A toy check below illustrates the commutativity.
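The commutativity of Sort and Filter over the same traversed set is easy to verify on toy data; the following check (illustrative data, not the paper's proof) shows the two pipelines agree:

```python
R = [(7, 0.12, True), (3, 0.05, False), (9, 0.08, True), (1, 0.20, True)]
K = 2  # each tuple: (id, distance, passes_filter)

# TopK-style pipeline: sort first, then filter, then limit.
r1 = [t for t in sorted(R, key=lambda t: t[1]) if t[2]][:K]
# VBASE-style pipeline: filter on the fly, then sort, then limit.
r2 = sorted([t for t in R if t[2]], key=lambda t: t[1])[:K]

assert r1 == r2  # identical results over the same traversed set R
```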
4.2.4. VBASE Implementation Details
- Query Planning and Cost Estimation: VBASE extends PostgreSQL's cost-based optimizer to handle vector queries.
  - Vector Computation Cost: Acknowledges that vector distance calculation is expensive and proportional to the dimension:
  $ t_\nu(dim) = t \cdot c \cdot dim $
  where t is a base cost, c is a coefficient for SIMD optimization, and dim is the vector dimension.
  - Selectivity Estimation: For filters, it uses a sampling-based method to estimate the selectivity (the fraction of rows that will pass the filter) for a given query, which is more accurate than a fixed default.
  - Index Scan Cost: It provides cost models for both partition-based indices (like IVFFlat, Eq. 14) and graph-based indices (like HNSW, Eq. 15), considering factors like the number of centroids, partitions to scan, and traversal steps. This allows the optimizer to make informed choices, for example, deciding whether to use a B-tree index on a scalar attribute or a vector index. A sketch of the vector-computation term follows below.
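As a rough illustration of how such a linear cost term might look inside an optimizer, here is a sketch assuming the model t_ν(dim) = t · c · dim; the constants and the surrounding scan-cost function are invented for illustration and are not the paper's actual values:

```python
def vector_distance_cost(dim, t=1.0, c=0.25):
    """Per-comparison cost, linear in vector dimension.

    t: base cost of one scalar operation; c: SIMD discount coefficient.
    """
    return t * c * dim

def index_scan_cost(num_candidates, dim):
    # Total vector-computation cost of scanning num_candidates vectors;
    # a real model also adds I/O and traversal terms (Eqs. 14-15).
    return num_candidates * vector_distance_cost(dim)
```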
- Multi-Column Scan Optimization: For queries involving multiple vector columns, a naive round-robin traversal of indices can be inefficient. VBASE implements a more intelligent Non-uniform Rank Aggregation (NRA) algorithm.
  Figure (from the paper): The multi-column traversal optimization, showing the retrieval flow across multiple vector indices and the data flow among a global priority queue, a traversal decision module, and a termination check.
  The traversal decision module balances exploitation (favoring the index that has recently produced better scores) and exploration (ensuring all indices get a chance to be scanned, to avoid local optima). It allocates more traversal steps to indices that have a better global average score, using the formula:
  $ W_i = \left\lceil n_2 \times \frac{1/avg_i}{\sum_{j=1}^{m} 1/avg_j} \right\rceil $
  where W_i is the number of extra steps for index i, avg_i is its historical average score, and n_2 is a hyperparameter. This ensures that higher-quality indices (with lower average distance/score) are explored more frequently. A sketch of this allocation follows.
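The allocation formula is straightforward to implement; a minimal sketch, where `avgs` holds each index's historical average score and `n2` is the exploration budget (names are ours, not VBASE's):

```python
import math

def allocate_steps(avgs, n2):
    """Give more traversal steps to indices with better (lower) avg score.

    Implements W_i = ceil(n2 * (1/avg_i) / sum_j (1/avg_j)).
    """
    inv = [1.0 / a for a in avgs]
    total = sum(inv)
    return [math.ceil(n2 * x / total) for x in inv]

# Example: index 0 has a much better average distance than index 1,
# so it receives most of the extra traversal steps.
print(allocate_steps([0.1, 0.5], n2=10))  # -> [9, 2]
```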
5. Experimental Setup
5.1. Datasets
The authors created a hybrid vector-scalar dataset by extending the Recipe1M dataset, which contains over 1 million cooking recipes with images, ingredients, and instructions.
They organized the data into two main tables:
- Recipe Table: Contains 330,922 recipes.
  - Schema: The following is the schema from Table 2 of the original paper:

    | Column Name | Data Type | Example |
    |---|---|---|
    | recipe_id | identifier | 1 |
    | images | list of strings | ["data/images/1/0.jpg", ...] |
    | description | text | [ingredients] + [instruction] |
    | images_embedding | vector | [0.0421, 0.0296, ..., 0.0273] |
    | description_embedding | vector | [0.0056, 0.0487, ..., 0.0034] |
    | popularity | integer | 300 |

  - The images_embedding and description_embedding are 1024-dimensional vectors. The popularity score is a randomly generated integer for scalar filtering experiments.
- Tag Table: Contains 10,000 tagged recipes used for JOIN queries.
  - Schema: The following is the schema from Table 3 of the original paper:

    | Column Name | Data Type | Example |
    |---|---|---|
    | id | identifier | 1 |
    | tag_name | text | "salad" |
    | tag_vector | vector | [0.0137, 0.0421, ..., 0.0183] |
The benchmark consists of 8 SQL queries (Q1-Q8) designed to test various scenarios, from simple TopK to filtered TopK, multi-column search, and vector joins.
5.2. Evaluation Metrics
The experiments were evaluated on two primary metrics: performance (latency) and accuracy (recall).
- Latency: Measured as the time taken to execute a query. The paper reports the average, median, and 99th percentile latency in milliseconds (ms) to provide a comprehensive view of performance, including typical and worst-case scenarios.
- Recall:
  - Conceptual Definition: Recall measures the accuracy of an approximate search. It quantifies what fraction of the true top-K nearest neighbors were successfully found by the approximate algorithm. A recall of 1.0 (or 100%) means the approximate search found all the same results as an exhaustive, exact search.
  - Mathematical Formula: $ Recall@K = \frac{|\text{Approximate Top-K} \cap \text{Ground Truth Top-K}|}{|\text{Ground Truth Top-K}|} $
  - Symbol Explanation:
    - Approximate Top-K: The set of K results returned by the ANNS system.
    - Ground Truth Top-K: The set of the true K nearest neighbors, typically found by running a brute-force (exact) search.
    - |·|: The size of a set.
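Recall@K reduces to a set intersection, as in this minimal sketch:

```python
def recall_at_k(approx_ids, truth_ids):
    """Fraction of the true top-K neighbors recovered by the ANNS result."""
    approx, truth = set(approx_ids), set(truth_ids)
    return len(approx & truth) / len(truth)

print(recall_at_k([1, 2, 3, 5], [1, 2, 3, 4]))  # -> 0.75
```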
5.3. Baselines
VBASE was compared against a set of state-of-the-art systems representing different approaches:
- PostgreSQL (v13): A powerful open-source relational database. It serves as the baseline for exact search via sequential scan. It provides the ground truth for recall calculations but is expected to be very slow.
- PASE: A PostgreSQL extension that integrates ANNS using the TopK-then-filter approach with a large, fixed candidate set. It represents the "conservative guess" strategy.
- Milvus: A popular open-source vector database. For complex queries, it represents the "trial-and-error" strategy, re-running queries with a larger K′ if needed.
- Elasticsearch (Open Distro v1.13): A distributed search engine that also supports vector search via HNSW. It represents another TopK-then-filter implementation.

All systems using approximate search were configured to use the same HNSW vector index with identical parameters, to ensure a fair comparison of the query execution strategies rather than the underlying index quality.
6. Results & Analysis
The experimental results strongly support the paper's claims, demonstrating VBASE's superior performance and versatility.
6.1. Core Results Analysis
The following is the overall results summary from Table 4 of the original paper (latency in ms, reported as average / median / 99th percentile; "-" marks unsupported or unreported cases):

Q1: Single-Vector TopK · Q2: Single-Vector TopK + Numeric Filter · Q3: Single-Vector TopK + String Filter · Q4: Multi-Column TopK

| System | Q1 Recall | Q1 Latency | Q2 Recall | Q2 Latency | Q3 Recall | Q3 Latency | Q4 Recall | Q4 Latency |
|---|---|---|---|---|---|---|---|---|
| PostgreSQL | 1 | 2,980.1 / 3,021.7 / 3,133.6 | 1 | 1,108.3 / 1,124.1 / 2,286.2 | 1 | 4,322.2 / 3,529.3 / 9,953.0 | 1 | 5,610.0 / 5,604.7 / 5,769.8 |
| PASE | 0.9949 | 4.8 / 3.5 / 5.1 | 0.9987 | 29.3 / 28.7 / 61.7 | 0.9982 | 13.2 / 10.7 / 17.9 | - | - |
| Milvus | 0.9949 | 9.4 / 9.0 / 12.7 | 0.9919 | 33.7 / 23.9 / 121.4 | - | - | 0.9041 | 6,696.4 / 8,349.3 / 9,299.0 |
| Elasticsearch | 0.9949 | 43.1 / 41.8 / 48.9 | 0.5010 | 97.9 / 98.1 / 118.1 | 0.8378 | 79.9 / 90.0 / 100.9 | - | - |
| VBASE | 0.9949 | 4.9 / 3.9 / 5.3 | 0.9989 | 11.7 / 6.3 / 51.7 | 0.9983 | 7.9 / 6.7 / 10.4 | 0.9696 | 19.8 / 18.4 / 46.4 |

Q5: Multi-Column TopK + Numeric Filter · Q6: Multi-Column TopK + String Filter · Q7: Vector Range Filter · Q8: Join

| System | Q5 Recall | Q5 Latency | Q6 Recall | Q6 Latency | Q7 Recall | Q7 Latency | Q8 Recall | Q8 Latency |
|---|---|---|---|---|---|---|---|---|
| PostgreSQL | 1 | 1,192.9 / 1,234.4 / 2,343.6 | 1 | 6,543.2 / 5,996.3 / 16,734.6 | 1 | 8,244.9 / 8,212.6 / 8,641.6 | 1 | 129,051,273.9 / - / - |
| PASE | - | - | - | - | - | - | - | - |
| Milvus | 0.9691 | 12,637.9 / 5,617.4 / 36,887.9 | - | - | - | - | - | - |
| Elasticsearch | - | - | - | - | - | - | - | - |
| VBASE | 0.9805 | 35.8 / 24.9 / 160.7 | 0.9626 | 21.6 / 18.3 / 64.8 | 0.9840 | 10.8 / 2.2 / 168.9 | 0.9992 | 16,335.9 / - / - |
- Q1 (Single-Vector TopK): All approximate systems achieve similar recall, as they use the same underlying HNSW index. VBASE's latency (4.9 ms avg) is comparable to PASE's (4.8 ms avg), showing that its more flexible engine introduces minimal overhead for simple queries.
- Q2-Q3 (TopK + Scalar Filter): This is where VBASE's advantage becomes clear.
  - VBASE achieves high recall (0.9989) with low latency (11.7 ms avg for Q2).
  - PASE also has high recall but is much slower (29.3 ms avg) because its fixed, large K′ forces it to fetch and process many unnecessary vectors.
  - Elasticsearch is fast but has very poor recall (0.5010) because its default candidate size is too small, failing to find enough candidates that satisfy the filter.
  - This confirms that VBASE's ability to dynamically determine the search scope is key to achieving both high accuracy and high performance.
- Q4-Q6 (Multi-Column TopK): VBASE's superiority is even more dramatic here.
  - For Q4, VBASE's average latency is 19.8 ms. Milvus takes 6,696.4 ms (over 300x slower) while also having lower recall.
  - The reason is Milvus's trial-and-error approach: it repeatedly executes TopK queries with increasing K′, accumulating huge costs. VBASE, in contrast, traverses both indices concurrently and efficiently using its optimized NRA algorithm.
- Q7 (Vector Range Filter): Only VBASE can execute this query natively and efficiently; the other systems do not support it directly. VBASE's performance is excellent (10.8 ms avg), demonstrating the flexibility of its engine beyond TopK queries.
- Q8 (Join): This is the most impressive result. PostgreSQL (the exact baseline) takes over 129 million ms (about 35 hours) to perform the brute-force nested-loop join. VBASE completes the same join in just 16,335.9 ms (about 16 seconds), a speedup of nearly 8,000x, while maintaining an extremely high recall of 0.9992. This enables a class of analytical queries that were previously infeasible.
6.2. Ablation Studies / Parameter Analysis
- Cost Estimation & Query Planning: The experiments in Figure 8 show that VBASE's cost model is accurate enough to make the correct query plan choices. For a query with both a vector filter and a scalar filter, VBASE correctly chooses the B-tree index when scalar selectivity is low and the vector index when vector selectivity is low. A system like PASE with a naive cost model consistently makes the wrong choice, leading to much higher latency.
  Figure 8 (from the paper): Query execution time under different estimation methods; in (a) the scalar filter selectivity is fixed at 0.13, and in (b) the range filter selectivity is fixed at 0.90, comparing Ground Truth, VBASE, and default estimation.
Multi-Column Traversal Strategy: Table 7 analyzes
VBASE's optimized NRA algorithm against round-robin and greedy strategies. The results show that when vector weights are imbalanced (e.g., 1:10), the greedy approach is fast. However, when weights are balanced (1:1), greedy gets stuck in local optima and performs poorly on recall.VBASE's adaptive strategy consistently achieves high recall across all weight distributions while maintaining a performance edge over the simple round-robin approach. The following are the results from Table 7 of the original paper:Weight1 Algorithm NumOfScans2 Latency Recall 1 : 1 Round-Robin Greedy3 VBASE 651.93 699.02 638.56 20.91 21.70 20.56 0.9715 0.9313 0.9705 1 : 2 Round-Robin Greedy3 VBASE 617.22 612.51 593.99 20.25 19.94 19.78 0.9802 0.9655 0.9818 1 :5 Round-Robin Greedy3 VBASE 463.39 372.96 409.31 16.93 14.90 15.69 0.9946 0.9949 0.9961 1 :10 Round-Robin Greedy3 VBASE 363.47 274.86 311.66 14.81 12.69 13.97 0.9981 0.9985 0.9987 -
Generality with other Index Types: The evaluation with
SPANN(a disk-based, partition-style index) in Table 8 demonstrates that theVBASEframework and the concept of relaxed monotonicity are not limited to in-memory graph indices like HNSW. It can be successfully applied to different index families, showing the generality of the approach.
7. Conclusion & Reflections
7.1. Conclusion Summary
The paper presents VBASE, a system that successfully unifies online vector similarity search and relational queries within a single database engine. The key innovation is the identification and formalization of Relaxed Monotonicity, a property that abstracts the common traversal behavior of high-dimensional vector indices. By leveraging this property, VBASE builds a unified query engine based on the standard database iterator model, bypassing the inefficient and rigid TopK-only interfaces of existing vector search systems.
This deep integration allows VBASE to dynamically and efficiently execute complex hybrid queries, achieving performance that is up to three orders of magnitude better than state-of-the-art vector databases. Furthermore, VBASE unlocks new analytical capabilities, such as high-speed approximate vector joins, which were previously impractical. The work provides a principled and highly effective foundation for building next-generation databases in the era of AI.
7.2. Limitations & Future Work
The paper itself does not explicitly list its limitations, but some can be inferred:
- Parameter Tuning: The Relaxed Monotonicity definition relies on the parameters E (neighborhood size) and w (window size). The paper states these can be mapped from index-specific hyperparameters (like ef in HNSW), but it is not fully clear how sensitive performance and accuracy are to these settings, or how they could be automatically tuned for an arbitrary index or dataset.
- Focus on Single-Node: The evaluation and implementation are based on a single-node PostgreSQL instance. While this is a critical first step, many real-world applications require distributed execution. Extending the VBASE model to a distributed database environment would be a significant next step.
- Write and Update Performance: The paper focuses exclusively on query (read) performance. The performance of building, updating, and deleting from the integrated indices, especially in a transactional context, is not discussed but is crucial for real-world online services.
7.3. Personal Insights & Critique
This paper offers a powerful and elegant solution to a very practical and increasingly important problem. My key takeaways are:
- The Power of Abstraction: The most brilliant aspect of this work is the "aha!" moment of identifying Relaxed Monotonicity. Instead of treating vector indices as alien structures, the authors found a common language to describe them alongside traditional indices. This is a classic example of great systems research: finding the right abstraction to unify seemingly disparate components.
- Deep Integration over Shallow Integration: The paper serves as a strong argument for deep, principled integration over shallow, ad-hoc workarounds. The TopK-then-filter approach was a pragmatic but clumsy patch. VBASE's approach of moving the integration point down to the index traversal level is fundamentally more correct and, as the results show, vastly more efficient.
- Enabling New Possibilities: The most exciting outcome is not just making existing queries faster, but enabling entirely new ones. The roughly 7,000x speedup on vector joins is transformative. It moves vector similarity from a simple "lookup" tool to a powerful primitive for large-scale data analytics, opening doors for tasks like massive document clustering, product-to-product recommendation analysis, and large-scale deduplication.
- A Blueprint for Future Systems: The philosophy of VBASE could be applied to integrate other "messy," non-monotonic, or approximate data structures into the rigorous world of relational databases. It provides a blueprint for building systems that are both flexible enough to handle modern data types and principled enough to deliver high performance and predictable behavior.

In critique, while the concept is powerful, its real-world deployment would depend on the robustness of the Relaxed Monotonicity heuristic across a wider variety of datasets and query distributions than those tested. The stability of the "departing phase" might vary, and edge cases could potentially impact accuracy. However, as a research contribution, the paper is exceptionally strong, providing both a novel theoretical concept and a compelling practical demonstration of its value.