kade/mappy

Fork 0

Benchmark Analysis #1

New issue

Open

opened 2026-04-21 20:17:49 +02:00 by kade · 0 comments

kade commented

2026-04-21 20:17:49 +02:00

Owner

Benchmark Analysis

This analysis examines Mappy's benchmarking infrastructure and empirical performance data.

Benchmark Setup

Mappy uses Criterion.rs for benchmarking. The benchmark suite includes:

Benchmark File	Purpose
basic_quotient_filter_benchmarks.rs	Core quotient filter operations
quotient_filter_benchmarks.rs	Quotient filter features
quotient_filter_benchmarks_slot_finding.rs	Slot finding performance
simple_quotient_filter_benchmarks.rs	Simple quotient filter operations
maplet_vs_std.rs	Comparison with std collections
storage_benchmarks.rs	Storage backend performance

Empirical Data

Insert Performance

Size	Time (mean)	Throughput	Outliers
1,000	55.555 µs	18.000 Melem/s	6.00%
10,000	555.09 µs	18.015 Melem/s	11.00%
100,000	12.848 ms	7.783 Melem/s	0%

Insert performance scales linearly with dataset size. Throughput decreases at larger sizes due to cache effects and collision handling.

Query Performance

Size	Time (mean)	Throughput	Outliers
1,000	20.840 µs	47.985 Melem/s	8.00%
10,000	246.45 µs	40.576 Melem/s	1.00%
100,000	8.8069 ms	11.355 Melem/s	14.00%

Query performance is faster than insert operations (2-3x). Throughput degradation at 100k items suggests cache pressure.

Delete Performance

Size	Time (mean)	Throughput	Outliers
1,000	110.93 µs	9.014 Melem/s	3.00%
10,000	1.1630 ms	8.598 Melem/s	10.00%
100,000	35.773 ms	2.795 Melem/s	5.00%

Delete operations are slowest, requiring tombstone management and run reconstruction.

Hash Function Comparison

Hash Function	Time (mean)	Relative Speed	Outliers
AHash	834.37 µs	1.00x (baseline)	7.00%
TwoX	939.38 µs	0.89x	2.00%
Fnv	742.28 µs	1.12x (fastest)	4.00%

Fnv performs best in this benchmark, though AHash is recommended for general use.

Slot Finding Performance

Size	Time (mean)	Throughput	Outliers
1,000	16.401 µs	60.973 Melem/s	9.00%
10,000	179.97 µs	55.564 Melem/s	9.00%
100,000	4.3894 ms	22.782 Melem/s	0%

Slot finding is fast, benefiting from the quotient filter structure.

Multiset Operations

Size	Time (mean)	Throughput	Outliers
1,000	41.814 µs	23.915 Melem/s	7.00%
10,000	478.81 µs	20.885 Melem/s	3.00%
100,000	5.2887 ms	18.908 Melem/s	7.00%

Multiset operations show stable throughput across sizes.

Memory Usage

Size	Time (mean)
1,000	460.61 ns
10,000	3.8064 µs
100,000	36.554 µs
1,000,000	398.03 µs

Memory allocation time scales sub-linearly.

Mathematical Analysis

Space Complexity

S(n) = n \cdot (O(\log(1/\varepsilon)) + v)

Where:

n = number of items
\varepsilon = false-positive rate
v = value size in bits

Time Complexity

Insert: T_{insert}(n) = O(1) average case, O(\log n) worst case
Query: T_{query}(n) = O(1) average case, with \Pr[\text{false positive}] \leq \varepsilon
Delete: T_{delete}(n) = O(1 + \alpha) where \alpha is fraction of deleted entries

graph LR
    A[Quotient Filter Operations] --> B[Insert: 18M ops/s]
    A --> C[Query: 41M ops/s]
    A --> D[Delete: 9M ops/s]
    A --> E[Slot Finding: 56M ops/s]

Benchmarking Practices

Based on Rust API Guidelines and Criterion.rs documentation:

Criterion.rs is the de facto standard for Rust benchmarking
Micro-benchmarks operate at unit test level
Macro-benchmarks operate at integration test level
Benchmark at the lowest level of abstraction for maintainability
Include macro-benchmarks for critical path validation

Assessment

Strengths:

Uses Criterion.rs (industry standard)
Comprehensive micro-benchmarks for core operations
Multiple data sizes for scalability analysis
Statistical analysis with confidence intervals
Outlier detection and reporting

Areas for Improvement:

Limited macro-benchmarks (end-to-end scenarios)
No continuous benchmarking in CI
Outlier rates are high (6-14% in some benchmarks)
No performance regression detection
Missing comparative benchmarks with competitors

Recommendations

Immediate

Reduce outliers by increasing warm-up time and using CPU frequency scaling governors
Add CI integration with baseline tracking
Set up performance regression detection using Bencher or similar tool

Long-term

Add macro-benchmarks for end-to-end workflows
Benchmark against Bloom filters, Cuckoo filters, and Redis
Add performance characteristics to API docs
Create performance tuning guide

References

Rust API Guidelines: https://rust-lang.github.io/api-guidelines/
Criterion.rs Documentation: https://docs.rs/criterion/
Bencher: https://bencher.dev/learn/benchmarking/rust/criterion/

# Benchmark Analysis This analysis examines Mappy's benchmarking infrastructure and empirical performance data. ## Benchmark Setup Mappy uses Criterion.rs for benchmarking. The benchmark suite includes: | Benchmark File | Purpose | |---------------|---------| | basic_quotient_filter_benchmarks.rs | Core quotient filter operations | | quotient_filter_benchmarks.rs | Quotient filter features | | quotient_filter_benchmarks_slot_finding.rs | Slot finding performance | | simple_quotient_filter_benchmarks.rs | Simple quotient filter operations | | maplet_vs_std.rs | Comparison with std collections | | storage_benchmarks.rs | Storage backend performance | ## Empirical Data ### Insert Performance | Size | Time (mean) | Throughput | Outliers | |------|-------------|------------|----------| | 1,000 | 55.555 µs | 18.000 Melem/s | 6.00% | | 10,000 | 555.09 µs | 18.015 Melem/s | 11.00% | | 100,000 | 12.848 ms | 7.783 Melem/s | 0% | Insert performance scales linearly with dataset size. Throughput decreases at larger sizes due to cache effects and collision handling. ### Query Performance | Size | Time (mean) | Throughput | Outliers | |------|-------------|------------|----------| | 1,000 | 20.840 µs | 47.985 Melem/s | 8.00% | | 10,000 | 246.45 µs | 40.576 Melem/s | 1.00% | | 100,000 | 8.8069 ms | 11.355 Melem/s | 14.00% | Query performance is faster than insert operations (2-3x). Throughput degradation at 100k items suggests cache pressure. ### Delete Performance | Size | Time (mean) | Throughput | Outliers | |------|-------------|------------|----------| | 1,000 | 110.93 µs | 9.014 Melem/s | 3.00% | | 10,000 | 1.1630 ms | 8.598 Melem/s | 10.00% | | 100,000 | 35.773 ms | 2.795 Melem/s | 5.00% | Delete operations are slowest, requiring tombstone management and run reconstruction. ### Hash Function Comparison | Hash Function | Time (mean) | Relative Speed | Outliers | |---------------|-------------|----------------|----------| | AHash | 834.37 µs | 1.00x (baseline) | 7.00% | | TwoX | 939.38 µs | 0.89x | 2.00% | | Fnv | 742.28 µs | 1.12x (fastest) | 4.00% | Fnv performs best in this benchmark, though AHash is recommended for general use. ### Slot Finding Performance | Size | Time (mean) | Throughput | Outliers | |------|-------------|------------|----------| | 1,000 | 16.401 µs | 60.973 Melem/s | 9.00% | | 10,000 | 179.97 µs | 55.564 Melem/s | 9.00% | | 100,000 | 4.3894 ms | 22.782 Melem/s | 0% | Slot finding is fast, benefiting from the quotient filter structure. ### Multiset Operations | Size | Time (mean) | Throughput | Outliers | |------|-------------|------------|----------| | 1,000 | 41.814 µs | 23.915 Melem/s | 7.00% | | 10,000 | 478.81 µs | 20.885 Melem/s | 3.00% | | 100,000 | 5.2887 ms | 18.908 Melem/s | 7.00% | Multiset operations show stable throughput across sizes. ### Memory Usage | Size | Time (mean) | |------|-------------| | 1,000 | 460.61 ns | | 10,000 | 3.8064 µs | | 100,000 | 36.554 µs | | 1,000,000 | 398.03 µs | Memory allocation time scales sub-linearly. ## Mathematical Analysis ### Space Complexity $$S(n) = n \cdot (O(\log(1/\varepsilon)) + v)$$ Where: - $n$ = number of items - $\varepsilon$ = false-positive rate - $v$ = value size in bits ### Time Complexity Insert: $T_{insert}(n) = O(1)$ average case, $O(\log n)$ worst case Query: $T_{query}(n) = O(1)$ average case, with $\Pr[\text{false positive}] \leq \varepsilon$ Delete: $T_{delete}(n) = O(1 + \alpha)$ where $\alpha$ is fraction of deleted entries ```mermaid graph LR A[Quotient Filter Operations] --> B[Insert: 18M ops/s] A --> C[Query: 41M ops/s] A --> D[Delete: 9M ops/s] A --> E[Slot Finding: 56M ops/s] ``` ## Benchmarking Practices Based on Rust API Guidelines and Criterion.rs documentation: - Criterion.rs is the de facto standard for Rust benchmarking - Micro-benchmarks operate at unit test level - Macro-benchmarks operate at integration test level - Benchmark at the lowest level of abstraction for maintainability - Include macro-benchmarks for critical path validation ## Assessment **Strengths:** - Uses Criterion.rs (industry standard) - Comprehensive micro-benchmarks for core operations - Multiple data sizes for scalability analysis - Statistical analysis with confidence intervals - Outlier detection and reporting **Areas for Improvement:** - Limited macro-benchmarks (end-to-end scenarios) - No continuous benchmarking in CI - Outlier rates are high (6-14% in some benchmarks) - No performance regression detection - Missing comparative benchmarks with competitors ## Recommendations ### Immediate 1. Reduce outliers by increasing warm-up time and using CPU frequency scaling governors 2. Add CI integration with baseline tracking 3. Set up performance regression detection using Bencher or similar tool ### Long-term 1. Add macro-benchmarks for end-to-end workflows 2. Benchmark against Bloom filters, Cuckoo filters, and Redis 3. Add performance characteristics to API docs 4. Create performance tuning guide ## References - Rust API Guidelines: https://rust-lang.github.io/api-guidelines/ - Criterion.rs Documentation: https://docs.rs/criterion/ - Bencher: https://bencher.dev/learn/benchmarking/rust/criterion/

No labels

No milestone

No project

No assignees

1 participant

Notifications

Due date

The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference

kade/mappy#1

No description provided.

Rows
Columns