When to use it
Use this flow when you need explicit control over what moves and when:
- Add a new empty shard and fill it with data from existing shards.
- Remove a shard — first move all its data elsewhere, then delete it.
- Quickly fix a hotspot (overloaded data range) without waiting for the balancer.
- Move a specific tenant’s data to a separate shard.
Prerequisites
Before starting, make sure that:
- The cluster is deployed with a coordinator (or via spqrinfra). Data rebalancing is not available in bare-router deployments that run without a coordinator.
- The destination shard is registered in the cluster and visible via SHOW shards. See Deployment overview and Coordinator for how shards are added.
- Both spqrguard and postgres_fdw extensions are installed on the source and destination shards. spqrguard is added to shared_preload_libraries in postgresql.conf (a minimal setup sketch follows below).
spqrguard is optional but strongly recommended. It prevents direct inserts into shards that bypass SPQR, protecting against data corruption.
1. Connect to the coordinator
Rebalancing commands are executed against the coordinator’s administrative console, not the router. Use psql over the PostgreSQL protocol as described in
How to connect.
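For example (the host, port, user, and console database name below are placeholders; take the real values from How to connect for your deployment):

```shell
# Placeholders only: substitute your coordinator address and credentials.
psql "host=coordinator.example.net port=7002 user=admin dbname=spqr-console"
```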
In high-availability setups, SPQR runs multiple coordinator instances, but only
one of them is active at a time (see
Coordinator configuration).
2. Inspect the current topology
Check the shards and the current key range layout:
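These are the console statements referenced throughout this guide; the exact output columns vary by SPQR version:

```sql
-- List all shards registered in the cluster.
SHOW shards;

-- List key ranges with their ids, bounds, and the shard each one routes to.
SHOW key_ranges;
```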
3. Plan the splits
You can redistribute an entire key range in one go; there is no technical limitation preventing it. In practice, however, it is common to split the source key range into several smaller pieces first and move them one by one. This gives you more control: you can monitor each piece independently, stop at any point, and limit the impact on the running application. When in doubt, prefer smaller pieces; you can always UNITE them later.
4. Split the source key range
SPLIT KEY RANGE turns one
key range into two, splitting it at the given bound. Both resulting ranges
continue to point to the same shard until you explicitly redistribute one of
them.
Example — splitting krid1 (covering 0..335000 on sh1) into four pieces:
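The following is a sketch; it assumes the SPLIT KEY RANGE <new id> FROM <source id> BY <bound> form and that the newly created range takes the upper part of the source. The ids krid2..krid4 and the bounds are illustrative.

```sql
-- Assuming each SPLIT creates the new range from the given bound upward
-- (verify with SHOW key_ranges after each step):
SPLIT KEY RANGE krid2 FROM krid1 BY 85000;
SPLIT KEY RANGE krid3 FROM krid2 BY 170000;
SPLIT KEY RANGE krid4 FROM krid3 BY 255000;
-- Expected result: krid1 0..85000, krid2 85000..170000,
-- krid3 170000..255000, krid4 255000..335000, all still on sh1.
```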
Check the resulting layout with SHOW key_ranges.
5. Redistribute key ranges to the destination shard
Move the new key ranges with REDISTRIBUTE KEY RANGE.
It migrates both the metadata and the actual data.
Internally the coordinator splits the source key range into batches of
BATCH SIZE rows and moves them one by one as separate move tasks, all
tracked under a single task group. While a batch is in flight, the rows it
covers are briefly unavailable for writes.
NOWAIT returns control immediately and lets you observe the task group
asynchronously; without it, the session blocks until the task group finishes.
Use the CHECK modifier if you want to validate the operation without
performing the move.
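As a sketch (the TO <shard> clause and the exact placement of the modifiers are assumptions here; krid2, sh2, and the batch size are illustrative):

```sql
-- Dry run: validate the operation without moving any data.
REDISTRIBUTE KEY RANGE krid2 TO sh2 CHECK;

-- Move krid2 to sh2 in batches of 500 rows and return immediately;
-- follow progress with SHOW task_group.
REDISTRIBUTE KEY RANGE krid2 TO sh2 BATCH SIZE 500 NOWAIT;
```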
Choosing a batch size
BATCH SIZE controls how many rows each move task transfers at a time. It is
a tradeoff:
- Smaller batches — shorter per-batch unavailability, smoother impact on the application, more total batches and therefore longer total wall-clock time.
- Larger batches — fewer round trips and faster overall, but each batch locks a larger slice of the key range for longer.
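As a rough illustration (the row counts are hypothetical): moving a key range holding about 1,000,000 rows with BATCH SIZE 1000 produces roughly 1000 move tasks, each briefly locking only 1000 rows, whereas BATCH SIZE 100000 produces just 10 tasks but makes 100000 rows unavailable for writes during each one.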
6. Monitor the move
While a redistribute is running, watch:
- SHOW task_group on the coordinator (see the sketch after this list): the current state of every task group and its move tasks. Possible values for state are PLANNED, RUNNING and ERROR; a task group disappears from the listing once it has finished successfully.
- Application error rate and latency, in particular client-visible 5xx errors and write latency on the affected key range.
- Shard load — CPU, disk and replication lag on both source and destination shards.
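For the first item, on the coordinator console:

```sql
-- Lists task groups and their move tasks; watch the state column
-- (PLANNED, RUNNING, ERROR).
SHOW task_group;
```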
7. React to problems
If something goes wrong during the rebalancing, follow this escalation order.
Stop the problematic task group
If one redistribute starts misbehaving (for example, replication lag on the destination shard is growing), stop just that task group and leave the others running:
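A sketch only: the statement name shown here (STOP TASK GROUP) is an assumption, and the identifier comes from SHOW task_group; check the coordinator console reference for your SPQR version before relying on it.

```sql
-- ASSUMPTION: the statement name and argument form may differ in your version.
-- <task group id> is the identifier reported by SHOW task_group.
STOP TASK GROUP <task group id>;
```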
Retry if the issue was transient
If a task group ended up in an error state due to a transient issue (network blip, restarted shard, etc.), retry it:
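RETRY TASK GROUP is the statement referenced later in this guide; the identifier-argument form shown here is an assumption, with the id taken from SHOW task_group:

```sql
-- Re-run the failed move tasks of one task group once the transient
-- problem is resolved; the id comes from SHOW task_group.
RETRY TASK GROUP <task group id>;
```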
Recovering from partial failure
If a task group fails mid-flight:
- Run SHOW task_group to identify the failed task group and its error.
- Run SHOW key_ranges to see which key ranges are locked and where the data currently lives.
- Investigate why the task failed (check coordinator and shard logs).
- Fix the underlying issue and RETRY TASK GROUP to let it finish; this is the safest way to unlock the affected key ranges.
8. Post-rebalancing cleanup
Once all redistributes have finished, verify with SHOW key_ranges that the
layout matches what you expect and spot-check that data actually lives on the
right shards. If two adjacent key ranges now point to the same shard, you can
merge them with
UNITE KEY RANGE.
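A sketch, assuming the UNITE KEY RANGE <id> WITH <id> form and reusing the illustrative ids from the split example:

```sql
-- Merge adjacent krid1 and krid2 (both must route to the same shard).
UNITE KEY RANGE krid1 WITH krid2;
```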