Designing for Partitions

Swytch is a distributed cache, which means you have to think about what happens when nodes can’t reach each other. This page covers how Swytch handles partitions today, the fundamental race at the heart of partition detection, a specific failure mode worth knowing about, and the design patterns that keep partitions from becoming incidents.

What partitions look like in Swytch today

By default, Swytch runs in safe mode: MULTI/EXEC transactions against data whose subscribers can’t all be reached return an error to the application, and the partition shows up as write-unavailability for those transactions on the affected data. Non-transactional writes (SET, INCR, LPUSH, HSET, XADD, and the rest of the standard Redis commands) keep accepting writes on both sides of the partition and reconcile deterministically on the DAG when the network heals. Reads still serve from local state on every side, because subscription is already synchronous.

This is the conservative choice for transactions specifically. It’s the behavior a reader familiar with Raft-based systems, primary-replica Postgres, or Redis Cluster would recognize: when transactions can’t safely commit across a partition, they fail rather than risk silent divergence. Where Swytch differs is that this conservatism applies only to MULTI/EXEC; the rest of the cache keeps writing, which is exactly what cache workloads need from a partition-tolerant cache.

The less-conservative choice (keeping writes available through the partition, letting the two sides diverge, reconciling on heal) is holographic divergence. The implementation exists; the user-facing configuration does not yet. More on that below.

The partition detection race

Every distributed system has the same fundamental problem: detecting a partition means detecting the absence of expected messages, and the absence of messages is indistinguishable from a node that’s slow, a link that’s congested, or a peer that’s genuinely unreachable. You cannot tell these apart from the information available to the detector. You can only set a threshold at which “not yet” becomes “probably won’t.”

Different systems pick different thresholds:

Raft-based systems (CockroachDB, etcd, Consul) use an election timeout, typically 150–300ms base, randomized to avoid split-vote storms. If a follower hasn’t heard a heartbeat by that threshold, it starts an election.
Primary-replica PostgreSQL and MySQL setups don’t detect partitions themselves; external tools (Patroni, pg_auto_failover, orchestrator) watch replication lag and decide when to fail over, typically at the seconds-to-tens-of-seconds scale.
Spanner uses TrueTime’s bounded clock uncertainty, collapsing a lot of the race into clock-interval math; typical uncertainty is under 10ms on Google’s hardware.
Dynamo-style systems (Cassandra, DynamoDB, Riak) use gossip-based failure detection with configurable suspicion thresholds, typically 5–10 seconds.
Redis Cluster defaults to a 15-second cluster-node-timeout before flagging a node as failing.

Swytch’s threshold is roughly 2× the round-trip time to the furthest subscriber. That could mean 1 second or 5 seconds depending on your cluster geography. The principle: the threshold needs to be high enough that “slow” doesn’t get mistaken for “unreachable,” and low enough that partitions are detected before they cause too much damage. 2× RTT to the furthest subscriber is a reasonable choice in that tradeoff space, landing on the tighter end compared to most of the systems above.
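As a rough illustration of the arithmetic, the threshold can be computed from per-subscriber RTT measurements. This is a minimal sketch, not Swytch's implementation: the function names, the region labels, and the RTT figures are all assumptions made up for the example.

```python
from typing import Mapping


def suspicion_threshold(rtt_seconds: Mapping[str, float], multiplier: float = 2.0) -> float:
    """Return multiplier x the RTT to the furthest (slowest) subscriber.

    Hypothetical helper: the 2x default mirrors the rule of thumb in the text.
    """
    if not rtt_seconds:
        raise ValueError("need at least one subscriber RTT")
    return multiplier * max(rtt_seconds.values())


def is_suspected(seconds_since_last_message: float, threshold: float) -> bool:
    """A peer becomes suspect once its silence outlasts the threshold."""
    return seconds_since_last_message > threshold


# Illustrative RTTs in seconds; the furthest subscriber dominates.
rtts = {"us-east": 0.02, "eu-west": 0.09, "ap-south": 0.25}
threshold = suspicion_threshold(rtts)
print(threshold)  # 0.5
```

A peer that has been silent for 0.4 seconds is still "just slow" under this threshold; at 0.6 seconds it is treated as unreachable. Nothing in the measurement itself tells those two cases apart, which is the race the next section describes.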

None of these thresholds are magic. They’re all bets about how long “just slow” is allowed to look like “unreachable” before the system decides. A partition can always land inside the window and produce a race with causality. Swytch’s handling of that race is what the rest of this page is about.

Inside the race window: a specific failure mode

The commit path in Swytch is straightforward:

1. Preflight. The committing node checks that all subscribers are reachable and that no competing local commit is in flight. If both pass, proceed.
2. Announce. The node broadcasts the commit envelope. At this point, the commit is in the DAG; there’s no back-and-forth confirmation, no two-phase handshake.
3. Listen for 1 RTT. During this window, a competing envelope from a peer could still arrive. If one does, the node handles the conflict. If none arrives by the end of the window, the commit stands.
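The three steps can be sketched as a single function. This is a toy model under stated assumptions, not Swytch's code: `Node`, `Outcome`, and the three callables (`all_reachable`, `broadcast`, `listen_one_rtt`) are hypothetical stand-ins for the real networking, and conflict handling itself is not modeled.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, List, Optional


class Outcome(Enum):
    COMMITTED = "committed"
    CONFLICT = "conflict"
    REJECTED = "rejected"


@dataclass
class Node:
    name: str
    dag: List[str] = field(default_factory=list)  # toy DAG: just a list of envelopes
    in_flight: bool = False


def commit(
    node: Node,
    envelope: str,
    all_reachable: Callable[[], bool],
    broadcast: Callable[[str], None],
    listen_one_rtt: Callable[[], Optional[str]],
) -> Outcome:
    # Step 1, preflight: every subscriber reachable, no competing local commit.
    if not all_reachable() or node.in_flight:
        return Outcome.REJECTED
    node.in_flight = True
    try:
        # Step 2, announce: the envelope is in the DAG from here on.
        node.dag.append(envelope)
        broadcast(envelope)
        # Step 3, listen for 1 RTT: a competing envelope may still arrive.
        competing = listen_one_rtt()
        if competing is not None:
            return Outcome.CONFLICT  # resolution is out of scope for this sketch
        return Outcome.COMMITTED
    finally:
        node.in_flight = False


sent: List[str] = []
node = Node("A")
result = commit(node, "env-1", lambda: True, sent.append, lambda: None)
```

Note that the envelope is appended and broadcast before the listen window opens. There is no step after the announcement at which the envelope could be withdrawn.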

Simple enough. The failure mode appears when a partition lands in step 3.

Here’s the sequence:

1. Node A completes preflight successfully. All subscribers were reachable a moment ago.
2. Node A announces the commit. The envelope propagates to every peer Node A can currently reach. Those peers record the commit in their DAG.
3. A partition occurs, isolating Node A from some of its peers.
4. Node A waits through its 1 RTT listen window. No competing envelope arrives (the partition is silencing the peers who would otherwise have sent one).
5. After the suspicion threshold passes, Node A detects the partition and returns ABORT to the application.
6. Meanwhile, on the other side of the partition, a different node may commit something that conflicts with A’s envelope, because that side doesn’t know A’s envelope committed.

When the partition heals, both sides walk the merged DAG and find two committed envelopes touching the same data. Both are valid by their own side’s history. The application on Node A was told the commit aborted, but every peer that received the announcement before the partition has the envelope recorded as committed.
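The end state of that sequence can be reproduced in a toy model. This is a sketch under simplifying assumptions: `Envelope`, `find_conflicts`, the node names, and the key `cart:42` are hypothetical, and the "DAG" is reduced to a set of envelopes with no causal ordering, so any two envelopes from different origins that share a key count as conflicting.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import FrozenSet, List, Set, Tuple


@dataclass(frozen=True)
class Envelope:
    origin: str
    keys: FrozenSet[str]


def find_conflicts(merged: Set[Envelope]) -> List[Tuple[Envelope, Envelope]]:
    """Pairs of envelopes from different origins that touch a shared key."""
    ordered = sorted(merged, key=lambda e: e.origin)
    return [
        (a, b)
        for a, b in combinations(ordered, 2)
        if a.origin != b.origin and a.keys & b.keys
    ]


env_a = Envelope("A", frozenset({"cart:42"}))
env_c = Envelope("C", frozenset({"cart:42"}))

# Before the partition: A's announcement reaches B but not C.
dag_b: Set[Envelope] = {env_a}
dag_c: Set[Envelope] = set()

# A later detects the partition and tells its application the commit aborted.
app_on_a_saw = "ABORT"

# During the partition: C commits a conflicting envelope on its side.
dag_c.add(env_c)

# After heal: both sides walk the merged DAG.
merged = dag_b | dag_c
conflicts = find_conflicts(merged)
```

After the merge, `conflicts` holds the pair `(env_a, env_c)`, and `env_a` is present in the merged set even though the application on Node A was told `ABORT`.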

This race is not unique to Swytch. It’s the same race every distributed system has, just in different shapes. A Raft-based system can return an error to a client when leadership is lost mid-write, but the entry may still be in the old leader’s log; whether it survives depends on the next election. A primary-replica system can acknowledge a write that never reaches the replica before failover, leaving the write’s fate up to the failover script. The shape differs; the race doesn’t go away.

The race window in Swytch is narrow (the 1 RTT listen window plus the suspicion threshold), and it only fires on MULTI/EXEC transactions that race against a partition landing in that specific window. Non-transactional writes (SET, INCR, LPUSH, HSET, XADD, and the other standard Redis commands not wrapped in MULTI/EXEC) don’t go through preflight; they propagate and merge deterministically on the DAG regardless of partition. So the footgun above is specifically a MULTI/EXEC-meets-partition phenomenon, not a general property of writes against Swytch. Cache workloads that don’t use MULTI/EXEC aren’t exposed to it at all.

Detection and recovery, today

When divergence occurs in the current implementation (from the race above or any other cause), the only signal is log output. Both sides of the now-healed partition log that divergence has been detected, with enough detail to identify which keys are affected. There is no programmatic detection API. There is no recovery tooling.

Recovery, right now, is: stop the cluster, then repopulate the affected keys from whatever upstream source of truth your application has. For most cache workloads, this is straightforward (the cache is a derivative of the system of record, so repopulating is a cache warm-up). For workloads where Swytch is the only source of state for the affected keys, this is harder, and you need to think about it before deploying.
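The repopulation step might look something like the sketch below. Everything here is an assumption for illustration: `repopulate` is a hypothetical application-side helper, a plain dict stands in for the cache, and the list of affected keys is taken as given (extracting it from the divergence logs is left to you, since the log format is not specified here).

```python
from typing import Callable, Dict, Iterable, List, Optional


def repopulate(
    affected_keys: Iterable[str],
    load_from_source: Callable[[str], Optional[str]],
    cache: Dict[str, str],
) -> List[str]:
    """Overwrite each affected key from the system of record.

    Returns the keys that have no upstream value. Those are left untouched
    and reported, because the cache may hold the only copy.
    """
    missing: List[str] = []
    for key in affected_keys:
        value = load_from_source(key)
        if value is None:
            missing.append(key)
        else:
            cache[key] = value
    return missing


# Stand-ins: a dict as the system of record, a dict as the cache.
source = {"user:1": "alice", "user:2": "bob"}
cache = {"user:1": "stale", "user:2": "stale", "draft:9": "only-copy"}
missing = repopulate(["user:1", "user:2", "draft:9"], source.get, cache)
```

Here `missing` ends up as `["draft:9"]`: the one key with no upstream source, which is exactly the case that needs a plan before deploying.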

Swytch Cloud will provide the recovery path. Cloud holds the authoritative causal log across regions, can arbitrate between divergent branches, and gives you tools for reconciliation. Until Cloud ships, the recovery path for the specific failure case above is application-level.

Holographic mode as a future design axis

Safe mode is conservative and conventional: partitions cost transactional write availability on affected keys, the same tradeoff users of Raft-based systems already live with. Holographic mode is the other direction: both sides of a partition keep writing, diverge cleanly, and reconcile when the network heals.

The use case for holographic mode is offline-capable nodes. Field equipment on remote sites. Edge devices on unreliable links. Anywhere the network going down is a scheduled part of the workflow rather than an incident. In those shapes, “writes stall until the network returns” is unacceptable; holographic divergence lets the disconnected side keep working and hands you a reconciliation problem afterward. For those workloads, that’s the better tradeoff.

Holographic mode is implemented but not user-configurable in the current release. If your workload requires offline operation and you want to talk about running holographic mode, email us at holographic@getswytch.com. The capability is real; the self-service configuration for it is a future deliverable.

When holographic mode does become generally available, designing for it will mean designing for the reconciliation. What does “both sides made a valid decision” look like for your data? What does the application do when it discovers two valid histories? Those are questions the application has to answer; Swytch can show you the divergence precisely, but deciding which branch is canonical is a business problem.
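One way to make those answers concrete is to write them down as per-key-family rules in application code. The sketch below is purely hypothetical: it is not a Swytch API, and the key prefixes (`stock:`, `tags:`) and the rules attached to them are invented examples of business decisions.

```python
from typing import Callable, Dict, Optional

Resolver = Callable[[str, str], str]


def pick_resolver(key: str, resolvers: Dict[str, Resolver]) -> Optional[Resolver]:
    """Longest matching key prefix wins; None means no rule is defined."""
    matches = [prefix for prefix in resolvers if key.startswith(prefix)]
    return resolvers[max(matches, key=len)] if matches else None


def reconcile(key: str, value_a: str, value_b: str, resolvers: Dict[str, Resolver]) -> str:
    """Pick a canonical value for a key that diverged into two valid branches."""
    if value_a == value_b:
        return value_a
    resolver = pick_resolver(key, resolvers)
    if resolver is None:
        # No business rule: refuse to guess, surface it for a human.
        raise LookupError(f"no reconciliation rule for {key!r}")
    return resolver(value_a, value_b)


# Example policies (assumptions): be pessimistic about inventory, union tag sets.
resolvers: Dict[str, Resolver] = {
    "stock:": lambda a, b: str(min(int(a), int(b))),
    "tags:": lambda a, b: ",".join(sorted(set(a.split(",")) | set(b.split(",")))),
}

stock = reconcile("stock:sku-9", "7", "5", resolvers)
tags = reconcile("tags:post:1", "a,b", "b,c", resolvers)
```

With these rules `stock` resolves to `"5"` and `tags` to `"a,b,c"`. A key with no matching rule raises instead of silently picking a side, which keeps the business decision visible.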

Designing around contention

Whether you’re in safe mode today or holographic mode later, the best mitigation for partition-related drama is to arrange your writes so that contested data has a natural owner. This is a design discipline, not a Swytch feature:

Per-region ownership. Data that primarily belongs to one geographic region (a user’s profile if your users don’t roam, regional operational data) gets written from that region. Other regions subscribe and read, but don’t write. Partitions separate the writer from the readers, which costs read freshness but can’t cause divergence.
Per-tenant ownership. A multi-tenant system where each tenant has a home region writes tenant data from that region. Partitions between tenants are harmless; partitions within a tenant’s region are handled by whatever your single-region write path does.
Per-device ownership. Devices that generate their own data (IoT telemetry, field devices, mobile clients) write their own records. A partition between a device and the central cluster just means the device queues locally and syncs when the link returns — no conflict because no two writers ever touched the same data.

The principle: minimize the amount of data where two different writers might collide across a partition. Safe mode then has fewer opportunities to fire the race described above, and holographic mode (when you get it) has fewer conflicts to reconcile. This is true of every distributed system; Swytch just makes the reconciliation structure explicit.
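Per-tenant ownership, for instance, can be enforced with a small guard in the application's write path. This is a minimal sketch under assumed conventions: the `tenant:<id>:<rest>` key scheme, the region names, and both helper functions are hypothetical and live in your application, not in Swytch.

```python
from typing import Mapping


def owner_region(key: str, tenant_home: Mapping[str, str]) -> str:
    """Map a key of the form 'tenant:<id>:<rest>' to its tenant's home region."""
    parts = key.split(":", 2)
    if len(parts) < 3 or parts[0] != "tenant":
        raise ValueError(f"key {key!r} does not follow the tenant:<id>:<rest> scheme")
    return tenant_home[parts[1]]


def may_write(key: str, local_region: str, tenant_home: Mapping[str, str]) -> bool:
    """Only the owning region writes; every other region subscribes and reads."""
    return owner_region(key, tenant_home) == local_region


# Illustrative tenant-to-region assignment.
tenant_home = {"acme": "eu-west", "globex": "us-east"}

allowed = may_write("tenant:acme:settings", "eu-west", tenant_home)
denied = may_write("tenant:acme:settings", "us-east", tenant_home)
```

A write path that checks `may_write` first sends each tenant's writes through exactly one region, so a partition between regions leaves no key with two writers.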