PostgreSQL Storage I/O Transformation Hooks: Decoupling Core Storage from Encryption Logic
Introduction
A pgsql-hackers thread proposes a new hook-based protocol for data transformation in PostgreSQL storage paths, with Transparent Data Encryption (TDE) as the reference use case. The core idea is to let extensions handle cryptographic transforms while PostgreSQL core keeps ownership of storage orchestration, checksums, and durability semantics.
Technical Analysis
What the patch set introduces
The patch series adds five new hook points in core I/O and WAL paths:
- mdread_post_hook
- mdwrite_pre_hook
- mdextend_pre_hook
- xlog_insert_pre_hook
- xlog_decode_pre_hook
It also introduces a page-level transform marker in pd_flags and a WAL marker (XLR_BLOCK_ID_TRANSFORMED = 251), so that replay can fail fast when a transformed payload appears but the required extension is not loaded.
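As a rough illustration of the fail-fast idea, the sketch below models the two markers in standalone C. Only XLR_BLOCK_ID_TRANSFORMED = 251 comes from the thread. The PD_PAGE_TRANSFORMED name and its bit value, the toy header struct, and both function names are assumptions made for this example and are not the patch's actual definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* WAL marker value quoted in the thread. */
#define XLR_BLOCK_ID_TRANSFORMED 251

/* Hypothetical bit for the page-level marker; the patch defines the real one. */
#define PD_PAGE_TRANSFORMED 0x0008

/* Toy stand-in for the part of a page header that carries pd_flags. */
typedef struct ToyPageHeader
{
    uint16_t pd_checksum;
    uint16_t pd_flags;
} ToyPageHeader;

/* True when the page image is marked as transformed (for example, encrypted). */
static bool
toy_page_is_transformed(const ToyPageHeader *hdr)
{
    assert(hdr != NULL);
    return (hdr->pd_flags & PD_PAGE_TRANSFORMED) != 0;
}

/*
 * Decide whether replay may continue for a record component with the given
 * block ID. Returns false when a transformed payload is seen and no decode
 * hook is installed; a real caller would raise an error at that point.
 */
static bool
toy_wal_block_can_decode(uint8_t block_id, bool decode_hook_installed)
{
    if (block_id == XLR_BLOCK_ID_TRANSFORMED && !decode_hook_installed)
        return false;
    return true;
}
```

The point of the explicit marker is that a server without the extension rejects the record outright instead of interpreting transformed bytes as ordinary WAL data.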
Patch evolution (v1 to v4)
- v1: Initial infrastructure (0001) and contrib/test_tde reference extension (0002).
- v2: Build-system integration updates for test_tde (Meson and contrib wiring).
- v3: Compile fix for mdread_post_hook call site ((void **) cast for pointer-type compatibility).
- v4: Added 0003 test portability fix for tablespace regression test behavior across environments.
SQL examples
These examples come from contrib/test_tde/sql/basic.sql in the posted patches. They are representative of the design and require a patched build with shared_preload_libraries = 'test_tde'; they are not available in released PostgreSQL versions.
```sql
SHOW test_tde.key;

CREATE TABLE test_encrypt (
    id int,
    secret_data text,
    secret_number int
);

INSERT INTO test_encrypt VALUES
    (1, 'sensitive data 1', 12345),
    (2, 'sensitive data 2', 67890),
    (3, NULL, 11111);

SELECT COUNT(*) FROM test_encrypt;
SELECT secret_data FROM test_encrypt WHERE secret_number = 12345;

CREATE TABLE test_set_tablespace (id int, data text);
INSERT INTO test_set_tablespace
    SELECT g, 'data ' || g FROM generate_series(1, 50) g;
ALTER TABLE test_set_tablespace SET TABLESPACE regress_tde_tblspc;
SELECT COUNT(*) FROM test_set_tablespace;
```
Community Insights
Reviewer feedback focused less on cryptography itself and more on architecture boundaries:
- Whether WAL and buffer-manager hooks should be split into separate discussions.
- Whether these new hooks overlap with ongoing extensible-SMGR work that already enables storage-layer interception.
- Whether benchmark data is sufficient for real hook-enabled workloads, especially inside critical sections.
The author argued that narrow transform hooks reduce maintenance risk versus replacing broader storage implementations, while reviewers emphasized API overlap and long-term maintainability concerns.
Technical Details
The reference extension (test_tde) demonstrates several safety rules:
- Verify checksums on encrypted page images before decrypting.
- Clear the transform marker and recompute the checksum for the plaintext buffer after the reverse transform.
- Pre-allocate and manage encryption contexts in a memory context allowed in critical sections.
- Chain hooks to preserve composability with other extensions.
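The read-path rules above (verify, then decrypt, then clear the marker and recompute the checksum, with hook chaining) can be sketched in standalone C as follows. This is a toy model and not the test_tde code: the page struct, the hook signature, the additive checksum, the XOR "cipher", and every toy_* name are assumptions made for illustration. Calling the previous hook first is also just one possible ordering. The pre-allocation rule for critical sections is not modeled here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SIZE        64
#define TOY_FLAG_TRANSFORMED 0x0008  /* hypothetical marker bit */
#define TOY_KEY              0x5A    /* placeholder key for the XOR stand-in */

typedef struct ToyPage
{
    uint16_t checksum;
    uint16_t flags;
    uint8_t  data[TOY_PAGE_SIZE];
} ToyPage;

/* Hypothetical hook signature; the patch's real hook types differ. */
typedef bool (*toy_read_post_hook_type) (ToyPage *page);

static toy_read_post_hook_type toy_read_post_hook = NULL;
static toy_read_post_hook_type prev_read_post_hook = NULL;

/* Toy additive checksum over flags and data; not PostgreSQL's page checksum. */
static uint16_t
toy_checksum(const ToyPage *page)
{
    uint32_t sum = page->flags;

    for (size_t i = 0; i < TOY_PAGE_SIZE; i++)
        sum += page->data[i];
    return (uint16_t) sum;
}

/* XOR placeholder standing in for a real cipher. */
static void
toy_xor(ToyPage *page, uint8_t key)
{
    for (size_t i = 0; i < TOY_PAGE_SIZE; i++)
        page->data[i] ^= key;
}

/* Write side: transform, set the marker, checksum the transformed image. */
static void
toy_tde_write_pre(ToyPage *page)
{
    assert(page != NULL);
    toy_xor(page, TOY_KEY);
    page->flags |= TOY_FLAG_TRANSFORMED;
    page->checksum = toy_checksum(page);
}

/* Read side: returns false if the stored image fails verification. */
static bool
toy_tde_read_post(ToyPage *page)
{
    assert(page != NULL);

    /* Chain to any previously installed hook so other extensions still run. */
    if (prev_read_post_hook != NULL && !prev_read_post_hook(page))
        return false;

    if ((page->flags & TOY_FLAG_TRANSFORMED) == 0)
        return true;            /* plaintext page, nothing to reverse */

    /* 1. Verify the checksum of the encrypted image before decrypting. */
    if (toy_checksum(page) != page->checksum)
        return false;

    /* 2. Reverse the transform. */
    toy_xor(page, TOY_KEY);

    /* 3. Clear the marker and recompute the checksum for the plaintext. */
    page->flags &= (uint16_t) ~TOY_FLAG_TRANSFORMED;
    page->checksum = toy_checksum(page);
    return true;
}

/* Install the hook while remembering whatever was registered before. */
static void
toy_tde_install(void)
{
    prev_read_post_hook = toy_read_post_hook;
    toy_read_post_hook = toy_tde_read_post;
}
```

Verifying before decrypting means on-disk corruption is detected against the bytes that were actually stored, rather than against a decrypted image where the damage may be harder to recognize.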
On the WAL side, the transform marker and decode hook path are intended to prevent accidental interpretation of transformed records when required transform logic is absent.
Current Status
The thread reached a fourth revision (v4) with iterative fixes, but discussion remained in RFC/review territory. Core design questions (relationship to SMGR extensibility and the scope split between WAL and heap/buffer hooks) were still active, so this is best understood as an evolving infrastructure proposal rather than a committed PostgreSQL feature.
Conclusion
This thread is a useful example of PostgreSQL’s extension philosophy under pressure from security requirements: keep the core generic, expose carefully bounded hooks, and require explicit protocol markers to preserve safety. The technical direction is promising, but integration with broader storage extensibility efforts and performance validation will likely determine whether the approach lands upstream.