
Schema Design for Sync

How to structure your Postgres schema so that clients can write offline, replicate safely, and evolve without breakage.

Learning Objectives

By the end of this module you will be able to:

  • Explain why server-generated sequential IDs break offline writes and select an appropriate client-generated alternative (UUID, UUIDv7, ULID).
  • Design a soft-delete pattern using tombstone columns that is compatible with sync engine replication.
  • Plan a schema evolution strategy that prevents data loss when multiple client versions coexist.
  • Identify which Postgres constraints are safe under partial replication and which require application-level enforcement.

Core Concepts

The fundamental constraint: clients must own their IDs

A local-first application must be able to create records while disconnected. That single requirement invalidates the most common pattern in server-side Postgres development: the SERIAL or IDENTITY column.

A SERIAL primary key is a server-managed sequence. Every INSERT must round-trip to the server to obtain the next value. If the client is offline, that round-trip is impossible. The client cannot buffer the record, cannot reference it from other records, and cannot continue working. Server sequences are not just inconvenient — they are architecturally incompatible with offline-first workflows.

The replacement is client-generated IDs: values the client can produce locally, without any coordination, that will still be globally unique when eventually synced to the server.

Drop SERIAL, keep UUID

If your existing Postgres schema uses SERIAL, BIGSERIAL, or GENERATED ALWAYS AS IDENTITY, those columns must be replaced before introducing a sync layer. There is no workaround.

UUID, UUIDv7, and ULID

Three options dominate the space. They share the property of being generatable on the client, but differ meaningfully in performance characteristics on the server.

UUID v4 is the baseline. It is natively supported in every client environment, trivial to generate in JavaScript (crypto.randomUUID()), and round-trips cleanly through Postgres's native uuid type. The cost is index fragmentation: because v4 values are random, each new insert lands at a random position in the B-tree index, causing page splits and scattered writes.

UUIDv7 solves this. Its 48 most-significant bits are a Unix epoch timestamp, so new inserts sort to the end of the index rather than scattering. The result is append-like behavior, similar to what you had with sequences, without the server dependency. Postgres 18 adds a native uuidv7() function, making this the recommended default for new schemas targeting modern Postgres. For Postgres 17 and earlier, extensions like pg_uuidv7 provide equivalent functionality.
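
As a concrete sketch, a UUIDv7 value can be generated entirely on the client. The helper below is illustrative (not a specific library's API) and follows the RFC 9562 layout: a 48-bit big-endian millisecond timestamp, then version and variant bits, then randomness.

```typescript
import { randomBytes } from "node:crypto";

// Illustrative client-side UUIDv7 generator (RFC 9562 layout).
function uuidv7(now: number = Date.now()): string {
  const bytes = randomBytes(16);
  // Bytes 0..5: big-endian 48-bit Unix-epoch millisecond timestamp.
  for (let i = 0; i < 6; i++) {
    bytes[5 - i] = Math.floor(now / 2 ** (8 * i)) & 0xff;
  }
  bytes[6] = (bytes[6] & 0x0f) | 0x70; // version 7
  bytes[8] = (bytes[8] & 0x3f) | 0x80; // RFC 4122 variant (10xx)
  const hex = bytes.toString("hex");
  return [
    hex.slice(0, 8),
    hex.slice(8, 12),
    hex.slice(12, 16),
    hex.slice(16, 20),
    hex.slice(20),
  ].join("-");
}
```

Because the timestamp occupies the most-significant bits, plain string comparison of values generated in different milliseconds reflects creation order, which is what produces the append-like index behavior.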

ULID is a third option with a different encoding: 26-character base32 strings, lexicographically sortable by time prefix. The pgx_ulid extension stores them in binary, yielding roughly 30% faster ID generation than both UUIDv4 and UUIDv7 with more compact storage. The trade-off is an explicit Postgres extension dependency and slightly more ecosystem friction. ULID is worth considering for time-series or event-sourcing data where chronological ordering is load-bearing.

Fig 1

  Strategy   Client generation      Index behavior         Dependency
  UUIDv4     crypto.randomUUID()    Random (page splits)   Native
  UUIDv7     library or PG18        Append-like            Native (PG18) / extension
  ULID       ulid() library         Append-like            Extension required

Client-generated ID strategies compared

Soft deletes and tombstones

In a conventional server-only application, deleting a row with DELETE FROM items WHERE id = $1 is the obvious approach. In a local-first system, a hard delete creates an immediate problem: how does a client that was offline when the delete happened learn that the record should disappear? There is no row to replicate — the data is simply gone.

The solution is soft deletes: instead of removing the row, you mark it as deleted. The row persists in the database, carries a deleted_at timestamp (or deleted boolean), and the sync engine can replicate that change to all clients. When clients see deleted_at IS NOT NULL, they know the record has been removed.

This means deletion becomes a data update, not a destructive operation. Late-arriving deletes from disconnected clients can be sequenced correctly: the timestamp tells you when the delete happened, which is sufficient to resolve ordering. Without tombstones, a deletion leaves no record behind and cannot be replayed or ordered in a distributed system.

The dominant pattern uses a deleted_at TIMESTAMPTZ column. It encodes both the fact and the time of deletion in a single column, which is useful for auditing and for CRDT-style last-write-wins resolution.
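
To illustrate how a tombstone participates in last-write-wins resolution, here is a minimal merge sketch. The field names are illustrative, not any specific sync engine's schema:

```typescript
// Last-write-wins merge of two replicas of the same row. The tombstone
// timestamp participates like any other column write, so deletion needs
// no special-casing.
interface Row {
  id: string;
  title: string;
  updated_at: string;        // ISO 8601, UTC
  deleted_at: string | null; // tombstone: null means active
}

function lwwMerge(a: Row, b: Row): Row {
  // ISO 8601 UTC timestamps compare correctly as plain strings.
  return a.updated_at >= b.updated_at ? a : b;
}
```

An offline delete stamped at 11:00 beats an edit synced at 10:00, and vice versa; the delete is resolved exactly like any other write.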

Partial indexes keep soft deletes performant

The obvious concern with soft deletes is that your tables grow indefinitely and your indexes include records no application query will ever return. Partial indexes address this directly.

A partial index only indexes rows that match a given predicate. For soft-delete tables:

CREATE INDEX items_active_idx ON items (user_id, updated_at)
  WHERE deleted_at IS NULL;

This index only tracks non-deleted rows. It does not grow as you accumulate tombstones, and queries filtered on WHERE deleted_at IS NULL will use it efficiently. Partial indexes are natively supported in PostgreSQL with no extension dependencies.

You can also apply partial unique constraints — for example, enforcing that email is unique among non-deleted users:

CREATE UNIQUE INDEX users_email_active_unique ON users (email)
  WHERE deleted_at IS NULL;

This correctly allows a deleted user's email to be reused by a new account.

Schema evolution: the distributed client problem

In server-side development, deploying a schema migration is a discrete event. You run the migration, restart the server, done. In a local-first system, that model breaks because clients do not update simultaneously.

Consider what happens when you add a new column priority SMALLINT to a tasks table. You deploy the server migration and release a new app version. Some users update immediately. Others are offline, or simply haven't upgraded yet. Now your system is simultaneously serving:

  • New clients that read and write priority
  • Old clients that don't know priority exists

The dangerous case: an old client reads a task created by a new client (which has priority = 3), modifies the task's title, then writes it back. If the old client's deserialization drops unknown fields, the priority value is silently deleted when the old client writes back. This is data loss from a schema migration, not a bug in the application logic.

Backward-compatible migrations are necessary but not sufficient. You must also ensure that older clients do not overwrite and destroy fields they don't know about.
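
One defensive client-side pattern is to merge known edits over the stored record rather than serializing a fixed shape, so unknown columns survive the round trip. A minimal sketch, with hypothetical field names:

```typescript
// Forward-compatible write-back: spread the stored record first so any
// columns this client version doesn't model (e.g. a later `priority`
// field) are carried through unchanged.
interface KnownTask {
  id: string;
  title: string;
}

function writeBack(
  stored: Record<string, unknown>,
  edits: Partial<KnownTask>,
): Record<string, unknown> {
  return { ...stored, ...edits };
}
```

An old client editing only title on a record that carries priority: 3 now emits a record that still carries priority: 3.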

The industry-standard solution is the write-both/read-new pattern: during a migration transition window, new versions write data to both old and new columns (ensuring old clients can still read), while the application reads from the new column. Only after all clients have upgraded can you stop writing the old column and eventually drop it. This requires planning migrations across at least two deployment phases.
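
The write-both/read-new pattern can be sketched for a hypothetical rename of a name column to title (names illustrative):

```typescript
// Transition-window helpers for a hypothetical `name` -> `title` rename.
interface TaskRow {
  name?: string;  // old column, still written so old clients keep working
  title?: string; // new column, the one current code reads
}

// During the transition window, new code writes both columns.
function writeTitle(row: TaskRow, value: string): TaskRow {
  return { ...row, name: value, title: value };
}

// New code reads the new column, falling back for rows last written by
// clients that predate the migration.
function readTitle(row: TaskRow): string {
  return row.title ?? row.name ?? "";
}
```

Only once telemetry shows the old clients are gone does writeTitle stop populating name, after which the column can be dropped.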

How sync engines decouple server and client schema

PowerSync uses a specific mechanism that reduces this migration surface area. Rather than replicating structured rows, it replicates Postgres data as schemaless JSON. The client-side SQLite database receives JSON blobs, with client-side views providing structured access.

The practical effect: a server-side schema change does not automatically require a client-side SQLite migration. Sync Rules can apply transformations to maintain compatibility with older clients. New columns added to the server schema do not appear on old clients until those clients update their view definitions. This provides a degree of schema independence that traditional replication lacks — though it shifts some complexity into Sync Rules authoring.

ElectricSQL takes a different approach, keeping the client schema closer to the server schema, which means tighter coupling but less transformation logic to maintain.

Foreign keys under partial replication

The final structural constraint is foreign keys. On a full-data server, tasks.project_id REFERENCES projects(id) is enforced at the database level and guaranteed at all times. Under partial replication, a client may have tasks but not the referenced project rows — for example, if the sync rules filter projects by workspace membership but the user has a task in a project they're no longer part of.

PostgreSQL's logical replication does not synchronize foreign key constraints to subscriber replicas. Enforcing them at the client SQLite level would cause legitimate inserts to fail. The correct position is to treat foreign key integrity as an application-layer concern rather than a database constraint in the sync context:

  • Keep REFERENCES constraints on the Postgres server where they belong.
  • Do not enforce foreign key constraints in client-side SQLite.
  • Design sync rules to include parent records when child records are included, but accept that this is a best-effort guarantee, not an absolute one.

When a referenced record hasn't yet replicated locally, the application must handle the dangling reference gracefully — show the child record, display a placeholder for the missing parent, and resolve correctly once the parent arrives.
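
Handling a dangling reference can be as simple as a null-safe lookup with a placeholder that resolves once the parent replicates. A sketch, with illustrative names:

```typescript
interface Project {
  id: string;
  name: string;
}

interface Task {
  id: string;
  title: string;
  project_id: string | null;
}

// Render a parent label without assuming the parent row has replicated.
function projectLabel(task: Task, projects: Map<string, Project>): string {
  if (task.project_id === null) return "No project";
  // Under partial replication the parent may simply not be local yet.
  return projects.get(task.project_id)?.name ?? "Loading project…";
}
```

Once the parent row arrives, re-rendering the same task resolves the placeholder with no further handling.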

Deferrable constraints and optimistic mutations

One Postgres feature that remains useful in local-first write paths is deferrable constraints. PostgreSQL allows UNIQUE, PRIMARY KEY, REFERENCES, and EXCLUDE constraints to be checked at commit time rather than per-statement. NOT NULL and CHECK constraints are always immediate.

This matters when a sync write involves multiple interdependent rows. Declaring a constraint as DEFERRABLE INITIALLY DEFERRED allows you to insert rows in any order within a transaction and have the database validate only at commit. If the final state is valid, the transaction succeeds. This can simplify server-side write path logic when applying batched mutations from clients.

Key Principles

  1. Client-generated IDs are non-negotiable. Any column using SERIAL, BIGSERIAL, or IDENTITY must be replaced before introducing a sync layer. Prefer UUIDv7 for new schemas on Postgres 18+, UUIDv4 for maximum compatibility.

  2. Deletion is a write, not an erasure. Model deletes as deleted_at updates. Use a partial index on WHERE deleted_at IS NULL to prevent index bloat and maintain query performance.

  3. Assume multiple schema versions are always live. Design every migration as if old clients will survive it. Add columns with defaults; never remove columns until you have confirmed all clients have migrated.

  4. Foreign key integrity is the server's job, not the client's. Do not enforce foreign key constraints in the client SQLite database. Design sync rules to replicate parent records with children, and write application code that handles dangling references gracefully.

  5. Decouple at the right layer. Sync engines like PowerSync that use schemaless replication give you more freedom to evolve the server schema independently. Understand what your chosen sync engine replicates and what it does not.

Worked Example

Migrating a task table from server-first to sync-ready

Suppose you have a standard Nuxt/Postgres task management table:

CREATE TABLE tasks (
  id        SERIAL PRIMARY KEY,
  title     TEXT NOT NULL,
  project_id INTEGER REFERENCES projects(id),
  created_at TIMESTAMPTZ DEFAULT now()
);

This schema has three problems for local-first:

  1. SERIAL primary key — blocks offline creation.
  2. Hard delete assumption — no mechanism to propagate deletions.
  3. project_id REFERENCES projects(id) — will cause integrity failures under partial replication.

Here is the sync-ready equivalent:

CREATE TABLE tasks (
  id         UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  title      TEXT NOT NULL,
  project_id UUID,           -- reference without FK constraint
  deleted_at TIMESTAMPTZ,
  created_at TIMESTAMPTZ DEFAULT now(),
  updated_at TIMESTAMPTZ DEFAULT now()
);

-- Index for active tasks per project
CREATE INDEX tasks_active_by_project ON tasks (project_id, updated_at)
  WHERE deleted_at IS NULL;

-- Unique title per project, among active tasks only
CREATE UNIQUE INDEX tasks_unique_title_per_project ON tasks (project_id, title)
  WHERE deleted_at IS NULL;

Key decisions:

  • UUID primary key, client-generated via crypto.randomUUID() in JavaScript (or gen_random_uuid() as a server fallback).
  • project_id UUID with no REFERENCES clause — referential integrity enforced at the application layer.
  • deleted_at TIMESTAMPTZ for soft deletes — NULL means active, a timestamp means deleted.
  • updated_at timestamp to enable last-write-wins conflict resolution in the sync engine.
  • Partial indexes scoped to WHERE deleted_at IS NULL to keep them lean.

Adding a field during a live rollout

One month later, you need to add a priority field. The safe migration sequence:

Phase 1 — Deploy server migration with default:

ALTER TABLE tasks ADD COLUMN priority SMALLINT DEFAULT 0 NOT NULL;

Phase 2 — Deploy new app version that writes priority. Old clients read back 0 (the default) for tasks they load. New clients read the actual value.

Phase 3 (after all clients updated) — You can now treat the default as a legacy sentinel and remove it from the business logic if needed.

Never drop columns while old clients exist

Dropping a column while old clients are still in use will cause those clients to encounter unexpected errors or silently discard data. Wait until your analytics confirm the old version has negligible active usage.

Common Misconceptions

"I can keep SERIAL and just assign a temporary local ID." Temporary local IDs create a mapping problem: every reference from other tables, every sync record, every client-side relationship points to the temporary ID. When the server assigns the real ID, all those references must be rewritten atomically. This is far harder than it sounds and produces a class of bugs (dangling references, duplicate data, sync loops) that UUID-from-the-start eliminates entirely.

"Soft deletes will bloat my tables." Only if you don't use partial indexes. With WHERE deleted_at IS NULL partial indexes, queries against active records see no degradation. The deleted rows exist in the table but are excluded from index scans. This is the standard production pattern and is well-documented in Postgres literature.

"A backward-compatible migration means I'm safe." Adding a column with a default is backward-compatible from the database's perspective. It does not protect you if an older client deserializes the record, strips the new field because it's unknown, and writes the record back. Backward compatibility at the schema level must be paired with forward-compatible client serialization logic.

"My sync engine handles schema evolution automatically." Sync engines like PowerSync can reduce the migration surface by using schemaless replication, but they do not eliminate the problem. You still need to manage client-side view definitions, ensure old clients don't corrupt data they don't understand, and coordinate the transition window.

Boundary Conditions

UUIDv7 requires Postgres 18 or an extension. If your deployment runs Postgres 16 or 17, you need pg_uuidv7 or a client-side library that generates compliant UUIDv7 values. The ordering benefits are identical; only the native availability differs.

Soft deletes do not prevent table growth. Tombstones accumulate. In applications with high deletion rates or regulatory data retention requirements, you will need a separate archival strategy to eventually move old tombstones to cold storage. Partial indexes keep query performance stable, but they do not reclaim disk space.

Schema decoupling has limits with ElectricSQL. Unlike PowerSync's schemaless approach, ElectricSQL maintains a schema that mirrors the server schema more closely. The migration window problem applies more directly: breaking schema changes on the server can impact clients more immediately. Always consult your sync engine's documentation for its specific migration model.

Deferrable constraints only defer within a transaction. A transaction with deferred constraints still must produce a valid final state at commit. Deferral does not allow permanently invalid data to be committed — it only allows intermediate invalidity within a single transaction boundary.

Partial replication makes all referential integrity eventual. Even with the best sync rule design, there will be windows where a client holds a child record without its parent. Write application code that handles this gracefully (null-safe parent access, loading indicators, graceful degradation) rather than assuming the parent will always be present.

Key Takeaways

  1. Server sequences (SERIAL, IDENTITY) cannot be used in local-first schemas. Replace them with client-generated UUIDs (or ULIDs). UUIDv7 is the best default for new schemas on Postgres 18+.
  2. Soft deletes via deleted_at are the correct deletion primitive for sync. They turn destructive operations into ordinary data updates that replicate cleanly. Always pair them with a partial index on WHERE deleted_at IS NULL.
  3. Schema migration in a local-first system is a multi-version problem. Multiple client versions are always in flight. Use write-both/read-new patterns during transition windows, add defaults to new columns, and never drop columns until old versions are gone.
  4. Foreign key constraints belong on the server, not the client. Partial replication means the parent may not be locally present. Enforce referential integrity at the application layer and design for graceful handling of dangling references.
  5. Sync engine architecture shapes your migration strategy. Schemaless replication (PowerSync) reduces client-side migration surface; schema-coupled replication (ElectricSQL) requires tighter coordination. Understand your tool's model before designing your schema.

Further Exploration

Client-Generated IDs

Soft Deletes & Indexes

Schema Evolution & Replication