PostgreSQL Database Design Rules: Complete Guide with Examples
Ever had to fix a production problem at 3 a.m.? The real cause is often a database design choice made long ago. Those “we’ll fix it later” and “the app will handle it” decisions can blow up months later, causing downtime and corrupted data.
These 25 rules come from building systems that handle serious amounts of data. Think of it as a guide from the trenches, not a lecture. Each rule comes with a good and a bad example, so you can see the difference at a glance.
I. Basic Schema Stuff
1) Always use a surrogate primary key
Important: Really important.
Real-world keys (like email or username) can change. If they do, you’ll have to update millions of rows and every related foreign key. Surrogate keys (BIGSERIAL / UUID) are stable and won’t cause headaches.
-- ✅ Good:
CREATE TABLE users (
id BIGSERIAL PRIMARY KEY,
email TEXT NOT NULL UNIQUE,
created_at TIMESTAMPTZ DEFAULT NOW()
);
-- ❌ Bad:
CREATE TABLE users (
email TEXT PRIMARY KEY -- This will cause problems down the road!
);
2) Every table needs created_at and updated_at
Important: Super important.
Without these, you can’t easily track what happened when. They’re crucial for debugging, audits, and moving data. Use TIMESTAMPTZ, not basic timestamps.
-- ✅ Good:
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
-- ❌ Bad:
-- No timestamp columns at all
-- Or TIMESTAMP without a timezone
3) Use TIMESTAMPTZ instead of TIMESTAMP
Important: Pretty important.
TIMESTAMP drops timezone info. Once you have servers or users in different timezones, your data gets confusing. TIMESTAMPTZ stores everything as UTC and converts on input and output, so timezones are handled correctly behind the scenes.
-- ✅ Good:
event_time TIMESTAMPTZ NOT NULL
-- ❌ Bad:
event_time TIMESTAMP -- Timezone info is lost!
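The same pitfall exists on the application side: a timezone-naive timestamp is ambiguous in exactly the way plain TIMESTAMP is. A small Python sketch of the difference (the datetime values are made up for illustration):

```python
from datetime import datetime, timezone

# Aware: carries an explicit offset -- the application-side analogue of TIMESTAMPTZ.
aware = datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc)

# Naive: no timezone attached -- the analogue of plain TIMESTAMP.
# Noon... but noon where?
naive = datetime(2025, 1, 1, 12, 0)

# Python refuses to even compare the two, because the ambiguity is real:
try:
    naive < aware
    mixable = True
except TypeError:
    mixable = False  # comparing naive and aware datetimes raises TypeError
```

Storing TIMESTAMPTZ and keeping application datetimes timezone-aware avoids this whole class of bugs.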
4) Use TEXT instead of VARCHAR(n)
Important: Okay, this one’s good to know.
In PostgreSQL, TEXT and VARCHAR perform the same. VARCHAR(n) just adds a length check that you’ll probably have to change later. If you need limits, use constraints.
-- ✅ Good:
name TEXT NOT NULL,
CONSTRAINT chk_name_len CHECK (length(name) <= 255)
-- ❌ Bad:
name VARCHAR(255) -- You'll probably need to change this later.
5) Use BIGINT / BIGSERIAL for IDs, not INT
Important: This could save you from a major problem.
INT only goes up to about 2.1 billion. Some companies have actually hit this limit! BIGINT gives you way more room to grow.
-- ✅ Good:
id BIGSERIAL PRIMARY KEY
-- ❌ Bad:
id SERIAL PRIMARY KEY -- A potential limit in the future
II. Relationships and Keys
6) Always use foreign keys
Important: Really important.
Without them, you’ll end up with orphaned data. Saying the app will handle it isn’t good enough. Bugs happen!
-- ✅ Good:
user_id BIGINT NOT NULL
REFERENCES users(id)
ON DELETE CASCADE
-- ❌ Bad:
user_id BIGINT -- The app will check it. (It won't always.)
7) Think carefully about ON DELETE
Important: This can prevent data loss or broken features.
The default is NO ACTION, which blocks deletions (like RESTRICT, but checked at the end of the statement). CASCADE deletes related data. SET NULL keeps the row but clears the link. Pick the wrong one, and you’ll either block operations or delete important data.
-- ✅ Good:
REFERENCES orders(id) ON DELETE CASCADE -- Delete order items when the order is removed.
REFERENCES users(id) ON DELETE SET NULL -- Keep the order, but clear the user if the user is deleted.
REFERENCES accounts(id) -- Don't allow deleting accounts that have orders.
-- ❌ Bad:
-- Not thinking about what happens when you delete something.
8) Use junction tables for many-to-many relationships
Important: Seriously, do this.
Don’t store M:N relations as comma-separated stuff. Junction tables let you index, join, and add constraints.
-- ✅ Good:
CREATE TABLE user_roles (
user_id BIGINT REFERENCES users(id),
role_id BIGINT REFERENCES roles(id),
granted_at TIMESTAMPTZ DEFAULT NOW(),
PRIMARY KEY (user_id, role_id)
);
-- ❌ Bad:
role_ids INTEGER[] -- Hard to join, no integrity
roles TEXT -- 'admin,editor' (yuck)
9) Index every foreign key column
Important: Helps performance a lot.
PostgreSQL doesn’t automatically make indexes for foreign keys. Without one, deleting or updating a parent row can be super slow.
-- ✅ Good:
CREATE INDEX idx_orders_user_id
ON orders(user_id);
-- ❌ Bad:
-- Foreign key without an index = slow queries
10) Use soft deletes for important data
Important: You’ll thank yourself later.
Hard deletes are forever. Add a deleted_at column and filter by it. Use partial indexes to keep things fast.
-- ✅ Good:
deleted_at TIMESTAMPTZ DEFAULT NULL;
CREATE INDEX idx_users_active
ON users(email)
WHERE deleted_at IS NULL;
-- ❌ Bad:
DELETE FROM users WHERE id = 42;
-- Gone forever! And no record of it.
III. Keeping Your Data Clean
11) Normalize first; denormalize only when needed
Start with a good, normalized design. Denormalizing (adding redundant data) can speed up reads, but it makes writes harder and can lead to inconsistent data. Document why you did it.
-- ✅ Good:
orders.user_id → users.id → users.email
-- Only denormalize after measuring:
-- Copied email into orders to speed billing report:
-- 800ms → 12ms on 50M rows
-- ❌ Bad:
-- Copying `user_name`, `user_email` everywhere because it's easier.
12) Default to NOT NULL; only use NULL on purpose
NULL causes tricky logic. NULL is not equal to anything, not even itself: comparisons with NULL return NULL, and aggregates silently skip NULL values. Every nullable column adds extra complexity.
-- ✅ Good:
status TEXT NOT NULL DEFAULT 'pending',
deleted_at TIMESTAMPTZ -- NULL means not deleted
-- ❌ Bad:
name TEXT, -- Nullable by accident
price NUMERIC -- Is NULL the same as 0? Who knows?
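These quirks are standard SQL, so they can be demonstrated with Python’s built-in sqlite3 module standing in for PostgreSQL (the NULL semantics shown behave the same in both):

```python
import sqlite3

db = sqlite3.connect(":memory:")

# NULL = NULL is NULL (surfaced as None in Python), not TRUE:
null_eq = db.execute("SELECT NULL = NULL").fetchone()[0]

db.execute("CREATE TABLE items (price REAL)")
db.executemany("INSERT INTO items VALUES (?)", [(10.0,), (None,), (20.0,)])

# Aggregates silently skip NULLs: 3 rows, but COUNT(price) sees only 2,
# and AVG divides by 2, not 3.
total_rows = db.execute("SELECT COUNT(*) FROM items").fetchone()[0]
priced_rows = db.execute("SELECT COUNT(price) FROM items").fetchone()[0]
avg_price = db.execute("SELECT AVG(price) FROM items").fetchone()[0]
```

That AVG result (15.0 instead of 10.0) is exactly the kind of silent surprise an accidentally nullable column produces.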
13) Use CHECK constraints for validation
App validation can be bypassed. Database constraints are the last line of defense.
-- ✅ Good:
CONSTRAINT chk_price_positive CHECK (price > 0),
CONSTRAINT chk_status_valid CHECK (status IN ('active','inactive','suspended'))
-- ❌ Bad:
-- The API validates it.
-- Meanwhile, the database has `price = -500` and `status = 'yolo'`
14) Use NUMERIC for money; never FLOAT/DOUBLE
Floating-point math isn’t exact. 0.1 + 0.2 comes out as 0.30000000000000004. Use NUMERIC(precision, scale) or store cents as BIGINT.
-- ✅ Good:
price NUMERIC(12,2) NOT NULL,
balance NUMERIC(15,2) NOT NULL
-- Or: price_cents BIGINT NOT NULL
-- ❌ Bad:
price FLOAT
price DOUBLE PRECISION
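The rounding error is easy to see in any language with binary floats; a quick Python check, with Decimal playing the role NUMERIC plays in the database:

```python
from decimal import Decimal

# Binary floats cannot represent 0.1 exactly, so errors creep in:
float_sum = 0.1 + 0.2          # 0.30000000000000004, not 0.3

# Decimal is exact for decimal fractions, like NUMERIC in PostgreSQL:
exact_sum = Decimal("0.1") + Decimal("0.2")

# Or sidestep fractions entirely and store integer cents (the BIGINT variant):
cents = 10 + 20
```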
15) Be careful with ENUMs – Use CHECK or lookup tables if needed
PostgreSQL ENUMs are hard to change. You can add values, but removing one isn’t supported without recreating the type. CHECK constraints or reference tables are more flexible.
-- ✅ Good:
status TEXT NOT NULL
CHECK (status IN ('draft','published'));
-- Or lookup table:
-- status_code TEXT REFERENCES statuses(code)
-- ❌ Bad:
CREATE TYPE status AS ENUM ('draft','published');
IV. Making Things Fast
16) Index your filters, joins, and sorts
Without indexes, PostgreSQL scans the whole table. On millions of rows, that’s the difference between milliseconds and minutes.
-- ✅ Good:
CREATE INDEX idx_orders_user_status
ON orders(user_id, status)
WHERE deleted_at IS NULL;
-- ❌ Bad:
-- We'll add indexes when it gets slow. (It will be slow, at 3 a.m.)
17) Use partial indexes when you only need a slice
If most rows are completed and you only query pending, a full index wastes space. Partial indexes are smaller and faster.
-- ✅ Good:
CREATE INDEX idx_orders_pending
ON orders(created_at)
WHERE status = 'pending';
-- ❌ Bad:
CREATE INDEX idx_orders_created
ON orders(created_at);
18) EXPLAIN ANALYZE before you deploy
Don’t guess about performance. EXPLAIN ANALYZE shows the query plan, row counts, and time. A Seq Scan on a big table is a bad sign.
-- ✅ Good:
EXPLAIN ANALYZE
SELECT * FROM orders
WHERE user_id = 42 AND status = 'pending';
-- ❌ Bad:
-- Deploying because it worked on my machine with 10 rows.
19) Use connection pooling (PgBouncer)
Each PostgreSQL connection has real overhead: a dedicated backend process and its memory. Too many direct connections can exhaust the server. PgBouncer shares a small pool of server connections among many clients and prevents that.
-- ✅ Good:
App → PgBouncer (6432) → PostgreSQL
pool_mode = transaction
max_client_conn = 1000
default_pool_size = 20
-- ❌ Bad:
App → PostgreSQL directly (500+ connections)
-- And then the server runs out of memory.
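PgBouncer itself is configuration rather than code, but the core idea — many clients sharing a small, fixed set of server connections — fits in a few lines. This is an illustrative toy, not how PgBouncer works internally; `make_conn` here stands in for opening a real PostgreSQL connection:

```python
import queue
from contextlib import contextmanager

class TinyPool:
    """Toy connection pool: many clients share `size` connections."""

    def __init__(self, make_conn, size):
        self._idle = queue.Queue()
        for _ in range(size):
            self._idle.put(make_conn())

    @contextmanager
    def connection(self):
        conn = self._idle.get()      # blocks if every connection is busy
        try:
            yield conn
        finally:
            self._idle.put(conn)     # returned to the pool, never closed

# Stand-in for a real factory such as psycopg.connect(...):
opened = []
pool = TinyPool(lambda: opened.append(1) or object(), size=3)

# 100 "clients" run, yet only 3 connections were ever opened:
for _ in range(100):
    with pool.connection():
        pass
```

The real thing adds timeouts, health checks, and transaction-level pooling, but the memory math is the same: 100 clients, 3 backends.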
V. Changes and Keeping Things Running
20) Plan column changes carefully
ALTER TABLE can lock tables for a while. The safe way: add column → fill in data → switch → remove old column.
-- ✅ Good:
ALTER TABLE users ADD COLUMN name_new TEXT; -- fast
-- Then fill in the data in small batches.
-- Update the app to use `name_new`.
-- Remove the old column later.
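The batched backfill step can be sketched with Python’s sqlite3 standing in for PostgreSQL — the pattern (small UPDATE batches with commits between them) is the same, though in production you would also pause between batches:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
db.executemany("INSERT INTO users (name) VALUES (?)",
               [(f"user{i}",) for i in range(1000)])

# Step 1: adding a nullable column is fast -- no table rewrite:
db.execute("ALTER TABLE users ADD COLUMN name_new TEXT")

# Step 2: backfill in small batches instead of one giant locking UPDATE:
BATCH = 100
while True:
    cur = db.execute(
        "UPDATE users SET name_new = name "
        "WHERE id IN (SELECT id FROM users WHERE name_new IS NULL LIMIT ?)",
        (BATCH,))
    db.commit()                      # release locks between batches
    if cur.rowcount == 0:
        break                        # nothing left to backfill

remaining = db.execute(
    "SELECT COUNT(*) FROM users WHERE name_new IS NULL").fetchone()[0]
```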
-- ❌ Bad:
ALTER TABLE users RENAME COLUMN name TO full_name;
-- The app breaks right away.
21) UUID v7 for distributed systems; BIGSERIAL for single-node
BIGSERIAL is small and fast. UUIDs are good when coordinating is hard (microservices, multi-region, offline data). UUIDv7 is better than v4 for sorting and indexing.
-- ✅ Good:
-- Single PostgreSQL :
id BIGSERIAL PRIMARY KEY
-- Distributed/microservices:
id UUID PRIMARY KEY DEFAULT gen_random_uuid(); -- v4
-- Or generate UUIDv7 in the application code.
-- ❌ Bad:
-- UUID v4 as the main key on big tables:
-- Random inserts = slow performance
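Since gen_random_uuid() only produces v4, UUIDv7 values are usually generated in the application. Here is a minimal sketch of the RFC 9562 layout (48-bit millisecond timestamp, then version and variant bits, then random bits); a real system would use a maintained library rather than this hand-rolled version:

```python
import os
import time
import uuid

def uuid7() -> uuid.UUID:
    """Sketch of a UUIDv7: time-ordered, so B-tree inserts stay local."""
    ts_ms = time.time_ns() // 1_000_000              # 48-bit Unix ms timestamp
    raw = bytearray(ts_ms.to_bytes(6, "big") + os.urandom(10))
    raw[6] = (raw[6] & 0x0F) | 0x70                  # set version = 7
    raw[8] = (raw[8] & 0x3F) | 0x80                  # set RFC variant
    return uuid.UUID(bytes=bytes(raw))

a = uuid7()
time.sleep(0.002)                                    # move to a later millisecond
b = uuid7()
# Unlike random v4, later IDs sort after earlier ones: a < b holds.
```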
22) Wrap operations in transactions
Without transactions, each statement commits on its own. If step 2 fails, your data is left half-updated.
-- ✅ Good:
BEGIN;
UPDATE accounts SET balance = balance - 100 WHERE id = 1;
UPDATE accounts SET balance = balance + 100 WHERE id = 2;
COMMIT;
-- ❌ Bad:
UPDATE accounts SET balance = balance - 100 WHERE id = 1;
-- If it crashes here = money gone
UPDATE accounts SET balance = balance + 100 WHERE id = 2;
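The atomicity guarantee can be demonstrated with Python’s sqlite3 standing in for PostgreSQL — the rollback behavior shown is the same, and the account data is made up:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER)")
db.executemany("INSERT INTO accounts VALUES (?, ?)", [(1, 500), (2, 500)])
db.commit()

try:
    db.execute("UPDATE accounts SET balance = balance - 100 WHERE id = 1")
    raise RuntimeError("simulated crash before the matching deposit")
    # The second UPDATE (balance + 100 WHERE id = 2) never runs.
except RuntimeError:
    db.rollback()        # the withdrawal from account 1 is undone too

balances = [row[0] for row in
            db.execute("SELECT balance FROM accounts ORDER BY id")]
# Both accounts are back at 500 -- no money vanished.
```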
23) Partition big tables (100M+ rows)
Partition by time range or by tenant. Queries that touch one partition don’t have to read the others, and maintenance becomes easier.
-- ✅ Good:
CREATE TABLE events (
id BIGSERIAL,
created_at TIMESTAMPTZ NOT NULL,
payload JSONB
) PARTITION BY RANGE (created_at);
CREATE TABLE events_2025_01
PARTITION OF events
FOR VALUES FROM ('2025-01-01')
TO ('2025-02-01');
-- ❌ Bad:
-- 500M rows in one table
-- Maintenance takes forever.
-- Every query is slow.
24) Store JSON as JSONB, not JSON or TEXT
JSONB is binary, indexable, and supports containment operators. JSON is stored as text and re-parsed on every access. TEXT gives you no structure at all.
-- ✅ Good:
metadata JSONB NOT NULL DEFAULT '{}';
CREATE INDEX idx_meta_gin
ON products USING GIN(metadata);
-- ❌ Bad:
metadata JSON -- Gets re-parsed
metadata TEXT -- No check
25) Use Row-Level Security (RLS) for multi-tenant things
A single missed WHERE clause can leak another tenant’s data. RLS enforces tenant isolation inside the database, so every query is filtered automatically.
-- ✅ Good:
ALTER TABLE documents ENABLE ROW LEVEL SECURITY;
CREATE POLICY tenant_isolation ON documents
USING (tenant_id = current_setting('app.tenant_id'));
-- ❌ Bad:
-- `WHERE tenant_id = ?` in every query
-- If you miss it, you'll have leaks.
Naming Tips
Be consistent with names!
- Tables: plural, snake_case → users, order_items
- Primary keys: always id, as BIGSERIAL or UUID
- Foreign keys: {singular_table}_id → user_id, order_id
- Indexes: idx_{table}_{columns} → idx_users_email
- Constraints: chk_{table}_{desc} / uq_{table}_{cols}
- Timestamps: created_at + updated_at everywhere; always TIMESTAMPTZ
Final Words
These rules are based on experience. The bad examples have caused real incidents.
Start by checking your design against the first five rules. You’ll probably find something worth fixing.
Save this guide and share it.
FAQ
1. Why shouldn’t I use email as a primary key in PostgreSQL?
Because real-world values like emails can change. Updating them means touching related foreign keys and potentially millions of rows. A surrogate key like BIGSERIAL or UUID avoids that risk.
2. Is TIMESTAMPTZ really better than TIMESTAMP?
Yes. TIMESTAMPTZ safely handles time zones and prevents data confusion when your app or servers run in different regions. TIMESTAMP can silently cause inconsistent time data.
3. When should I use BIGSERIAL vs UUID for IDs?
Use BIGSERIAL for single-node PostgreSQL setups where performance and simplicity matter. Use UUID (preferably v7) in distributed or microservice systems where coordination is harder.
4. Do I really need foreign keys if my app validates everything?
Yes. Application validation can fail or be bypassed. Foreign keys protect your data at the database level and prevent orphaned or inconsistent records.
5. When should I partition a PostgreSQL table?
Consider partitioning when a table reaches 100M+ rows or when queries consistently filter by time or tenant. Partitioning improves performance and simplifies maintenance.
Source: Umidjon Tojiboyev