Swytch Documentation

Benchmarks

Swytch is designed for production workloads where performance and durability both matter. This page presents benchmark results comparing Swytch to Redis under various conditions.

Test Environment

All benchmarks were run on:

  • CPU: AMD Ryzen 5 3600 (6 cores / 12 threads)
  • RAM: 64GB DDR4
  • Storage: Samsung NVMe in RAID0
  • OS: Ubuntu 24.04
  • Connection: Unix socket (lowest latency)

Where Does the Speed Come From?

Swytch’s performance advantage comes from a lock-free architecture designed from the ground up for multicore systems:

  • Lock-free transactional index: Concurrent reads and writes proceed without blocking each other. No thread waits for another to release a lock.

  • Novel eviction algorithm: A self-tuning, lock-free eviction system that adapts to your workload in real-time. This is an area of active research—expect further improvements as we refine the algorithm.

  • FASTER-inspired storage: The persistent storage layer uses techniques from Microsoft’s FASTER research project, enabling lock-free append-only writes with indexed lookups that don’t block the hot path.

Redis processes commands on a single thread. Swytch processes commands in parallel across all available cores while maintaining the same consistency guarantees. The result: throughput that scales with core count instead of stalling on a single-threaded bottleneck.
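To make the FASTER-inspired layout concrete, here is a minimal single-threaded sketch of the idea: values are appended to a log that is never overwritten, and an in-memory index points each key at its latest record. All names are illustrative; the real engine makes the index update a lock-free atomic publish and batches fsyncs.

```python
import os
import tempfile

class AppendOnlyStore:
    """Toy FASTER-style layout: values go to an append-only log file and an
    in-memory index maps each key to its latest record's (offset, length).
    Illustrative only -- not Swytch's actual storage engine."""

    def __init__(self, path):
        self.log = open(path, "ab+")
        self.index = {}

    def set(self, key, value):
        data = value.encode()
        offset = self.log.seek(0, os.SEEK_END)   # append, never overwrite
        self.log.write(data)
        self.log.flush()                         # a real engine fsyncs here
        self.index[key] = (offset, len(data))    # point readers at new record

    def get(self, key):
        loc = self.index.get(key)
        if loc is None:
            return None
        offset, length = loc
        self.log.seek(offset)
        return self.log.read(length).decode()

path = os.path.join(tempfile.mkdtemp(), "swytch-demo.log")
store = AppendOnlyStore(path)
store.set("a", "1")
store.set("a", "2")      # stale record stays in the log; the index moves on
print(store.get("a"))    # -> 2
```

Because writes only ever append and readers follow the index, a lookup never blocks the write path; old records are garbage to be compacted later.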

Synthetic Benchmarks

redis-benchmark (Single Operations)

Using redis-benchmark with 4 threads, 100 clients, 500K operations per command, 16-byte values, Unix socket:

Command          Swytch           Redis            Ratio
PING             199,760 ops/s    90,876 ops/s     2.2x
SET              181,620 ops/s    83,292 ops/s     2.2x
GET              199,760 ops/s    86,926 ops/s     2.3x
INCR             166,556 ops/s    83,292 ops/s     2.0x
LPUSH            181,686 ops/s    83,306 ops/s     2.2x
LPOP             181,620 ops/s    83,306 ops/s     2.2x
SADD             166,611 ops/s    90,876 ops/s     1.8x
HSET             153,799 ops/s    83,278 ops/s     1.8x
LRANGE_100       133,191 ops/s    64,483 ops/s     2.1x
LRANGE_600       47,519 ops/s     28,128 ops/s     1.7x
MSET (10 keys)   111,062 ops/s    64,483 ops/s     1.7x

Swytch achieves 2x+ throughput on most operations while providing full per-operation durability. Redis was configured with appendfsync everysec (1 second of potential data loss).

memtier_benchmark (Realistic Workloads)

High-throughput pipeline test (4 threads, 10 clients, pipeline 50, 10M write operations):

System    Throughput       p50 Latency
Swytch    643,836 ops/s    2.34ms
Redis     622,854 ops/s    3.18ms

Large value test (4 threads, 20 clients, 4KB values, 1:10 write:read ratio, rate-limited):

Metric        Swytch           Redis           Ratio
Throughput    203,266 ops/s    86,238 ops/s    2.4x
GET p50       0.35ms           0.90ms          2.6x
GET p99       1.00ms           1.76ms          1.8x

Zipf 0.99 distribution (4 threads, 50 clients, 256-byte values, 1:10 write:read ratio):

Metric        Swytch           Redis           Ratio
Throughput    213,912 ops/s    89,235 ops/s    2.4x
p50           0.91ms           2.21ms          2.4x
p99           1.53ms           4.38ms          2.9x

High-concurrency pipeline (8 threads, 100 clients, pipeline 10, Zipf 1.1, 128-byte values):

Metric        Swytch             Redis            Ratio
Throughput    1,473,367 ops/s    580,041 ops/s    2.5x
p50           4.51ms             13.63ms          3.0x
p99           19.20ms            26.88ms          1.4x
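The Zipf workloads above concentrate traffic on a few hot keys, which is what makes them a harder test than uniform access. A small sketch of the key distribution memtier_benchmark's --key-zipf-exp option produces (the implementation here is illustrative, not memtier's):

```python
import bisect
import random

def make_zipf_sampler(n_keys, exponent, rng):
    """Return a sampler over keys 1..n_keys with P(k) proportional to
    1 / k**exponent -- the skew behind memtier's --key-zipf-exp flag."""
    weights = [k ** -exponent for k in range(1, n_keys + 1)]
    total = sum(weights)
    cdf, acc = [], 0.0
    for w in weights:
        acc += w / total
        cdf.append(acc)
    # invert the CDF with a binary search
    return lambda: bisect.bisect_left(cdf, rng.random()) + 1

rng = random.Random(42)
sample = make_zipf_sampler(10_000, 1.1, rng)
top1_hits = sum(sample() <= 100 for _ in range(100_000))
print(f"top 1% of keys got {100 * top1_hits / 100_000:.1f}% of accesses")
```

At exponent 1.1, roughly two thirds of all accesses land on the hottest 1% of keys, so cache behavior on the hot set dominates the tail numbers.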

Production Trace Replay

We replayed production cache traces from published academic datasets to measure real-world hit rates and backend impact. These traces capture actual access patterns from hyperscale deployments.

Disk-Tiered Storage

Swytch can use disk as extended storage, not just for durability. While Redis evicts data when RAM fills up, Swytch transparently tiers cold data to disk and serves it with minimal latency penalty. This is ideal for key-value store workloads where you want all data accessible, or highly cacheable workloads with predictable access patterns.
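The tiering behavior described above can be sketched in a few lines: demote cold entries to a slower tier instead of dropping them, and promote them back on access. This is a toy single-threaded model (a dict stands in for the NVMe tier); Swytch's actual implementation is lock-free and disk-backed.

```python
from collections import OrderedDict

class TieredCache:
    """Toy tiered store: hot keys in a small RAM map, cold keys demoted to
    a slower tier instead of being evicted outright. Illustrative only."""

    def __init__(self, ram_capacity):
        self.ram = OrderedDict()   # recency order, hottest last
        self.disk = {}             # stand-in for the NVMe tier
        self.ram_capacity = ram_capacity

    def set(self, key, value):
        self.ram[key] = value
        self.ram.move_to_end(key)
        while len(self.ram) > self.ram_capacity:
            cold_key, cold_val = self.ram.popitem(last=False)
            self.disk[cold_key] = cold_val   # demote, don't drop

    def get(self, key):
        if key in self.ram:
            self.ram.move_to_end(key)        # refresh recency
            return self.ram[key]
        if key in self.disk:
            value = self.disk.pop(key)
            self.set(key, value)             # promote back to RAM
            return value
        return None                          # a true miss

cache = TieredCache(ram_capacity=2)
for k in ("a", "b", "c"):
    cache.set(k, k.upper())
print(cache.get("a"))   # demoted to the slow tier, but still served: A
```

In a RAM-only cache the demoted key would be a miss and a backend round-trip; here it is merely a slower hit, which is the entire difference behind the hit-rate numbers below.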

Alibaba Block Storage Trace (10MB RAM, 8.6M operations, appendfsync everysec):

Metric              Swytch       Redis
Hit Rate            99.69%       33.06%
Hits                8,581,109    2,845,487
Misses              26,317       5,761,939
Throughput          14,253/s     12,951/s
Avg GET Latency     70.2µs       77.2µs
Backend Reduction   99.5%        baseline

Redis is RAM-only: when the 10MB limit is reached, it evicts data aggressively, resulting in a 33% hit rate. Swytch keeps hot data in RAM and tiers the rest to NVMe, achieving near-perfect hit rates with no meaningful latency penalty (70µs avg).
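Assuming each miss translates to one backend request, the backend-reduction figure in the table falls directly out of the miss counts:

```python
# Miss counts from the Alibaba trace table above; every miss is assumed
# to cost one backend read.
swytch_misses = 26_317
redis_misses = 5_761_939

reduction = 1 - swytch_misses / redis_misses
print(f"backend reduction vs Redis: {reduction:.1%}")  # -> 99.5%
```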

When to use this mode:

  • Key-value store replacing a database, where all keys should remain accessible
  • Workloads with predictable, cacheable access patterns
  • Situations where disk is cheap but cache misses are expensive

When NOT to use this mode:

  • Traditional cache-aside patterns where misses are expected and acceptable
  • Workloads with unbounded key growth (disk isn’t infinite either)

For pure caching workloads, see the memory-constrained benchmarks below where both systems operate under the same RAM limits.

Memory-Constrained Caching (Eviction Algorithm Comparison)

When both systems are constrained to the same RAM limit—a fair apples-to-apples comparison—Swytch’s adaptive eviction algorithm outperforms Redis LRU.

Alibaba Block Storage (18MB cache limit, 48-hour trace):

Metric              Swytch    Redis
Hit Rate            82.13%    73.31%
Backend Reduction   33%       baseline

Swytch maintains a 9 percentage point advantage in hit rate under identical memory constraints. The algorithm accounts for access frequency, recency, and object size—not just recency like Redis LRU.
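Swytch's exact eviction formula is not published here, but the effect of scoring on frequency, recency, and size together can be illustrated with a GDSF-style score, a common size-aware approach. Everything below is an illustrative assumption, not Swytch's algorithm:

```python
def eviction_score(freq, last_access, size_bytes, now):
    """Illustrative GDSF-style score: keep objects that are accessed often,
    accessed recently, and cheap to keep (small). Lower score means a
    better eviction candidate. NOT Swytch's actual formula."""
    age = max(now - last_access, 1e-9)
    return freq / (age * size_bytes)

# Two candidates with equal frequency and recency: the larger object
# scores lower, so a size-aware policy evicts it first -- plain LRU,
# which only looks at recency, would treat them identically.
small = eviction_score(freq=5, last_access=0.0, size_bytes=256, now=10.0)
large = eviction_score(freq=5, last_access=0.0, size_bytes=65_536, now=10.0)
print(small > large)  # -> True
```

Under a byte-denominated memory limit, evicting one large cold object frees room for many small hot ones, which is where the hit-rate gap over LRU comes from.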

Adequate Memory Scenarios

When memory is sufficient for the working set, both systems achieve similar hit rates, but Swytch maintains its throughput advantage.

Alibaba Block Storage (40GB cache, 48-hour trace, appendfsync everysec):

Metric            Swytch      Redis
Hit Rate          99.69%      99.69%
Throughput        21,186/s    19,493/s
Avg GET Latency   47.2µs      51.3µs

Twitter Cluster (40GB cache, 20-minute trace, appendfsync everysec):

Metric            Swytch      Redis
Hit Rate          86.94%      86.95%
Throughput        29,793/s    30,909/s
Avg GET Latency   33.6µs      32.4µs

Hit rates are effectively identical. Choose Swytch for the durability guarantees without sacrificing performance.

Variable Object Sizes

Tencent Photo CDN (40GB cache, 5.5M operations, appendfsync everysec):

Metric              Swytch      Redis
Hit Rate            41.94%      39.14%
Backend Reduction   4.6%        baseline
RPS Saved           128 req/s   baseline

With highly variable object sizes (typical of CDN workloads), Swytch’s size-aware eviction provides a modest but consistent advantage.

Latency Distribution

Swytch consistently delivers more requests in the sub-100µs bucket:

GET Latency (Alibaba trace, adequate memory):

Bucket    Swytch    Redis
<100µs    98.7%     96.0%
<500µs    1.2%      4.0%
<1ms      0.0%      0.0%

GET Latency (Alibaba trace, memory-constrained 18MB):

Bucket    Swytch    Redis
<100µs    95.4%     97.8%
<500µs    4.4%      2.2%
<1ms      0.0%      0.0%

Under memory pressure, Swytch trades slightly more latency variance for dramatically better hit rates—a worthwhile tradeoff when each miss costs a database round-trip.
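That tradeoff can be put in numbers with an expected-access-time calculation. The hit latencies below are rough figures taken from the latency buckets, and the 1 ms database round-trip per miss is an illustrative assumption:

```python
def expected_latency(hit_rate, hit_latency_ms, miss_penalty_ms):
    """Mean access time when every cache miss costs a backend round-trip."""
    return hit_rate * hit_latency_ms + (1 - hit_rate) * miss_penalty_ms

# 18MB memory-constrained run; 1 ms miss penalty is an assumption.
swytch = expected_latency(0.8213, 0.100, 1.0)  # slightly slower hits...
redis = expected_latency(0.7331, 0.080, 1.0)   # ...but far more misses
print(f"Swytch {swytch:.2f} ms vs Redis {redis:.2f} ms expected per GET")
```

Even with slower hits, the higher hit rate wins as soon as a miss costs an order of magnitude more than a hit, which it almost always does.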

Tiered Storage Performance

Swytch’s tiered storage provides full durability (10 ms max data loss) with minimal performance impact.

Test: memtier_benchmark, 4 threads, 50 clients, 256-byte values, Unix socket

Throughput

Workload       Write-Through    Ghost Mode
100% writes    247,000 ops/s    397,000 ops/s
100% reads     418,000 ops/s    —
50/50 mixed    336,000 ops/s    —

Latency

Workload       Mode             p50      p99      p99.9
100% writes    Write-through    0.52ms   3.36ms   6.50ms
100% writes    Ghost            0.43ms   2.19ms   5.41ms
100% reads     Write-through    0.42ms   1.77ms   4.51ms
50/50 mixed    Write-through    0.44ms   2.98ms   5.44ms

Write-through mode (full durability) adds minimal latency overhead. Ghost mode (write-back) offers higher write throughput when eventual persistence is acceptable.
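The difference between the two modes reduces to when durability happens relative to the acknowledgment. A toy contrast (write-through here persists inline as a simplification; Swytch's write-through actually bounds loss to 10ms rather than syncing every operation):

```python
class DurableStore:
    """Toy contrast between write-through (persist before acking) and
    ghost/write-back (ack from RAM, persist later). Not Swytch's code."""

    def __init__(self, write_through):
        self.ram = {}
        self.disk = {}
        self.dirty = set()
        self.write_through = write_through

    def set(self, key, value):
        self.ram[key] = value
        if self.write_through:
            self.disk[key] = value   # durable before the client sees OK
        else:
            self.dirty.add(key)      # durability deferred to flush()

    def flush(self):
        """Background persistence pass for ghost/write-back mode."""
        for key in self.dirty:
            self.disk[key] = self.ram[key]
        self.dirty.clear()

wt = DurableStore(write_through=True)
wb = DurableStore(write_through=False)
wt.set("k", "v")
wb.set("k", "v")
print(wt.disk.get("k"), wb.disk.get("k"))  # -> v None (until wb.flush())
```

Ghost mode's extra write throughput is exactly the cost of the deferred persistence step: a crash between set() and flush() loses the dirty keys, which write-through mode bounds tightly.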

Summary

Scenario                     Swytch Advantage
Single-op throughput         1.7-2.3x faster
Pipeline throughput          2.5x faster at high concurrency
Disk-tiered storage          Near-perfect hit rates with NVMe backend
Memory-constrained caching   9 percentage points better hit rate
Durability                   10ms vs 1000ms max data loss

Swytch delivers higher throughput, lower latency, and better cache efficiency under memory pressure—all while providing stronger durability guarantees than Redis. For workloads that benefit from disk-tiered storage, Swytch can serve as a high-performance key-value store with near-perfect data availability.

Reproducing These Benchmarks

redis-benchmark

# Single operations (matches our test parameters)
redis-benchmark -s /path/to/socket \
  -t ping_inline,ping_mbulk,set,get,incr,lpush,rpush,lpop,rpop,sadd,hset,spop,lrange_100,lrange_300,lrange_500,lrange_600,mset \
  --csv -d 16 --threads 4 -c 100 -n 500000

memtier_benchmark

# High-throughput pipeline (write-heavy)
memtier_benchmark --protocol=redis -S /path/to/socket \
  -t 4 -c 10 --pipeline=50 \
  --key-minimum=1 --key-maximum=10000000 \
  --key-pattern=P:P --ratio=1:0 -n allkeys \
  --hide-histogram

# Large values with rate limiting
memtier_benchmark --protocol=redis -S /path/to/socket \
  -t 4 -c 20 --pipeline=1 --rate-limiting=50000 \
  --key-minimum=1 --key-maximum=2000000 \
  --ratio=1:10 -n 200000 --data-size=4096 \
  --hide-histogram

# Zipf distribution (hot keys)
memtier_benchmark --protocol=redis -S /path/to/socket \
  --threads=4 --clients=50 --requests=100000 \
  --ratio=1:10 --key-pattern=Z:Z \
  --key-zipf-exp=0.99 --key-maximum=100000 \
  --data-size=256 --hide-histogram

# High-concurrency pipeline
memtier_benchmark --protocol=redis -S /path/to/socket \
  --threads=8 --clients=100 --requests=100000 \
  --ratio=1:20 --key-pattern=Z:Z \
  --key-zipf-exp=1.1 --key-maximum=50000 \
  --data-size=128 --pipeline=10 --hide-histogram

Production Trace Replay

Our trace-bench tool replays real production traces against both Redis and Swytch:

# Memory-constrained with persistence
./trace-bench --real --real-vsize \
  --swytch-path ./swytch \
  --time-limit 48h \
  --gb 0.010 --ram 1 --cpus 4 \
  --noscale \
  --trace alibabaBlock_277.oracleGeneral.zst \
  --persistent-everysec

# Adequate memory
./trace-bench --real --real-vsize \
  --swytch-path ./swytch \
  --time-limit 48h \
  --gb 40 --ram 50 --cpus 16 \
  --noscale \
  --trace alibabaBlock_277.oracleGeneral.zst

Trace files are available from the CacheMon cache_dataset project.