Understanding PostgreSQL TOAST: The 8KB Block Constraint and Oversized Data Storage

If you've worked with PostgreSQL for any length of time, you've probably encountered scenarios where you need to store large text fields, JSON documents, or binary data. Behind the scenes, PostgreSQL uses a sophisticated mechanism called TOAST (The Oversized-Attribute Storage Technique) to handle values that don't fit within its fundamental storage unit: the 8KB block.

Understanding TOAST isn't just academic—it directly impacts query performance, storage efficiency, and how you should design your schema for large data types.

The 8KB Block: PostgreSQL's Fundamental Constraint

PostgreSQL organizes all table data into fixed-size pages (also called blocks), which are 8KB (8192 bytes) by default. This is a compile-time constant that defines how PostgreSQL reads and writes data to disk.

Why 8KB?

The 8KB block size is a carefully chosen compromise:

Small enough to minimize I/O overhead when accessing individual rows
Large enough to store multiple rows per block for better disk utilization
Aligned with OS page sizes for efficient memory management
Matches common filesystem block sizes for optimized disk operations

The Problem: What Happens When Data Exceeds 8KB?

A single PostgreSQL row must fit within a single 8KB page. But what happens when you try to insert:

A large TEXT column with a 50KB document?
A BYTEA field storing a 5MB image?
A JSONB column containing a 100KB configuration object?

Without TOAST, PostgreSQL would either:

Fail the insert (unacceptable)
Require variable-sized pages (complex, slow)
Chain pages together (fragmentation nightmare)

Enter TOAST.

What is TOAST?

TOAST (The Oversized-Attribute Storage Technique) is PostgreSQL's solution for storing values that exceed the block size. It's an automatic, transparent mechanism that:

Compresses large values to try fitting them in the main table
Moves oversized values to a separate TOAST table if compression isn't enough
Chunks extremely large values into multiple TOAST rows
Maintains references in the main table pointing to TOAST storage

TOAST Architecture

Every table with potentially large columns gets a companion TOAST table automatically created:

-- Main table
CREATE TABLE documents (
    id SERIAL PRIMARY KEY,
    title TEXT,
    content TEXT,  -- Can be large, eligible for TOAST
    metadata JSONB -- Can be large, eligible for TOAST
);

-- PostgreSQL automatically creates:
-- pg_toast.pg_toast_<oid> (the TOAST table)
-- pg_toast.pg_toast_<oid>_index (index for TOAST chunks)

The TOAST table is hidden from normal queries but stores oversized attribute values in chunks.

When Does TOAST Activate?

TOAST doesn't activate for every value—only when necessary. Here's the decision process:

1. Compression First (TOAST Strategy: EXTENDED)

PostgreSQL attempts to compress the value using LZ compression. If the compressed value fits within the page, no TOAST storage is needed.

-- Example: Large text that compresses well
INSERT INTO documents (content)
VALUES (repeat('PostgreSQL is awesome! ', 5000));
-- This might compress enough to stay in the main table

2. Out-of-line Storage (When Compression Fails)

If compression doesn't reduce the value below ~2KB (the threshold), PostgreSQL moves the value to the TOAST table.

3. The 2KB Threshold

PostgreSQL uses a 2KB (2048 bytes) heuristic:

Values < 2KB: Stay inline in the main table
Values ≥ 2KB: Eligible for compression or TOASTing
After compression, if still > ~2KB → moved to TOAST table

4. Maximum Row Size

Even with TOAST, PostgreSQL enforces limits:

Main table row header: ~24 bytes
Inline data (before TOAST): ~2KB per column
Total uncompressed row: Must fit calculation before TOAST activation

TOAST Storage Strategies

PostgreSQL supports four TOAST strategies per column:

1. PLAIN (No TOAST)

ALTER TABLE documents
ALTER COLUMN id SET STORAGE PLAIN;

No compression, no out-of-line storage
Only for fixed-size types (INTEGER, BIGINT, etc.)
Cannot be changed for variable-length types

2. EXTENDED (Default for Most Types)

ALTER TABLE documents
ALTER COLUMN content SET STORAGE EXTENDED;

Compression first, then out-of-line storage if needed
Default for TEXT, BYTEA, JSONB
Best balance of performance and storage

3. EXTERNAL

ALTER TABLE documents
ALTER COLUMN content SET STORAGE EXTERNAL;

Out-of-line storage without compression
Useful when data is already compressed (JPEG, GZIP files)
Faster access since no decompression needed

4. MAIN

ALTER TABLE documents
ALTER COLUMN content SET STORAGE MAIN;

Compression allowed, but prefers inline storage
Avoids TOAST table unless absolutely necessary
Can hurt performance if large values stay inline

How TOAST Works: Step-by-Step

Let's trace what happens when you insert a large value:

Example: Inserting a 50KB Document

CREATE TABLE articles (
    id SERIAL PRIMARY KEY,
    title VARCHAR(200),
    body TEXT  -- EXTENDED storage by default
);

INSERT INTO articles (title, body)
VALUES ('Large Article', repeat('Lorem ipsum...', 10000));
-- Generates ~50KB of text

Step 1: Initial Size Check

PostgreSQL sees body column is 50KB
Exceeds the 2KB inline threshold
TOAST activation triggered

Step 2: Compression Attempt

Applies LZ compression to the 50KB value
Assume compression reduces it to 15KB (still too large)

Step 3: Move to TOAST Table

Splits the 15KB compressed value into 2KB chunks
Stores chunks in pg_toast.pg_toast_<oid>:

Chunk 1: [2KB of compressed data]
Chunk 2: [2KB of compressed data]
Chunk 3: [2KB of compressed data]
...
Chunk 8: [Remaining compressed data]

Step 4: Store Reference in Main Table

Main table articles stores only a TOAST pointer (~18 bytes)
Pointer contains: TOAST table OID, chunk ID, length

Reading TOAST Data

When you query the row:

SELECT body FROM articles WHERE id = 1;

PostgreSQL:

Reads the main table row (~200 bytes)
Detects TOAST pointer in body column
Follows pointer to TOAST table
Reads all 8 chunks sequentially
Reassembles and decompresses
Returns the original 50KB value

Performance implication: TOAST reads require additional I/O operations.

Performance Implications

1. Sequential Scans Benefit

SELECT id, title FROM articles;  -- Fast
-- Only reads main table, no TOAST access needed

When you don't access TOASTed columns, PostgreSQL skips TOAST reads entirely.

2. Selective Column Access

SELECT id, title, body FROM articles WHERE id = 1;  -- Slower
-- Must fetch and decompress TOAST data

Fetching TOASTed columns adds overhead.

3. Index-Only Scans Break

CREATE INDEX idx_title ON articles(title);

SELECT title FROM articles WHERE title LIKE 'Large%';
-- If TOAST columns exist, may not be index-only scan

TOAST columns in the table can prevent index-only scans.

4. UPDATE Performance

UPDATE articles SET title = 'Updated' WHERE id = 1;
-- Even if not touching `body`, PostgreSQL may rewrite TOAST data

Updates can trigger TOAST rewrites if the row is moved to a new page.

Best Practices for Working with TOAST

1. Separate Large Columns into Different Tables

-- Instead of:
CREATE TABLE articles (
    id SERIAL PRIMARY KEY,
    title TEXT,
    body TEXT,  -- Large
    summary TEXT
);

-- Consider:
CREATE TABLE articles (
    id SERIAL PRIMARY KEY,
    title TEXT,
    summary TEXT
);

CREATE TABLE article_bodies (
    article_id INTEGER PRIMARY KEY REFERENCES articles(id),
    body TEXT  -- TOAST isolated
);

Benefit: Queries on articles avoid TOAST overhead unless explicitly joined.

2. Use EXTERNAL for Pre-Compressed Data

CREATE TABLE images (
    id SERIAL PRIMARY KEY,
    filename TEXT,
    data BYTEA  -- Stores JPEG/PNG
);

ALTER TABLE images ALTER COLUMN data SET STORAGE EXTERNAL;
-- Avoids double compression

3. Monitor TOAST Table Growth

-- Check TOAST table size
SELECT
    t.schemaname,
    t.tablename,
    pg_size_pretty(pg_total_relation_size(format('%I.%I', t.schemaname, t.tablename)::regclass)) AS total_size,
    pg_size_pretty(pg_relation_size(format('%I.%I', t.schemaname, t.tablename)::regclass))       AS table_size,
    pg_size_pretty(pg_total_relation_size(c.reltoastrelid))                                       AS toast_size
FROM pg_tables t
JOIN pg_class c
  ON c.relname = t.tablename
JOIN pg_namespace n
  ON n.oid = c.relnamespace
 AND n.nspname = t.schemaname
WHERE t.schemaname = 'public'
  AND c.reltoastrelid <> 0
ORDER BY pg_total_relation_size(c.reltoastrelid) DESC;

4. Limit Column Size When Possible

-- Instead of TEXT (unlimited):
ALTER TABLE articles ADD COLUMN summary VARCHAR(500);
-- Guarantees inline storage if < 2KB

5. Consider JSONB Compression

CREATE TABLE configs (
    id SERIAL PRIMARY KEY,
    settings JSONB  -- EXTENDED by default, compresses well
);

-- JSONB benefits from TOAST compression for large documents

Common Pitfalls

1. Assuming Small TOAST Values Are Cached

TOAST values are not cached separately—they're part of the buffer pool but require full chunk retrieval.

2. Forgetting TOAST on Partitioned Tables

Each partition gets its own TOAST table. Monitor all partitions.

3. Not Accounting for TOAST in Backup Sizes

TOAST tables can be larger than the main table. Always check total relation size.

Conclusion

PostgreSQL's TOAST mechanism elegantly solves the 8KB block size limitation by:

Compressing large values first
Chunking and storing oversized data out-of-line
Maintaining transparent access through pointer indirection

Understanding TOAST helps you:

Design better schemas (separate large columns)
Write faster queries (avoid TOASTed columns when not needed)
Monitor storage growth (track TOAST table sizes)
Optimize updates (minimize TOAST rewrites)

The next time you insert a large value into PostgreSQL, you'll know exactly what's happening under the hood—and how to design around it for optimal performance.

Want to dive deeper into PostgreSQL internals? Check out PostgreSQL Internals Mastery Volume I for comprehensive coverage of storage architecture, memory management, and advanced optimization techniques.

Have questions about TOAST or PostgreSQL performance? Reach out on X or Mastodon.