pg2sqlite: ALTER TABLE & PRIMARY KEY Support
Requirements & Design Specification
This document defines requirements and implementation design for adding ALTER TABLE support to pg2sqlite-rs, enabling proper conversion of PostgreSQL identity columns and primary keys to SQLite INTEGER PRIMARY KEY AUTOINCREMENT.
1. Problem Statement
Current Behavior (v0.1.3)
pg2sqlite currently only parses CREATE TABLE statements from PostgreSQL DDL. It ignores:
ALTER TABLE ... ALTER COLUMN ... ADD GENERATED BY DEFAULT AS IDENTITYALTER TABLE ONLY ... ADD CONSTRAINT ... PRIMARY KEY
Why This Matters
PostgreSQL's pg_dump (used with --schema-only --no-owner --no-privileges) outputs DDL in a specific order where identity and primary key definitions are separate ALTER TABLE statements, not inline in CREATE TABLE:
1. CREATE TYPE (enums)
2. CREATE TABLE (columns only - NO primary key, NO identity)
3. COMMENT ON (comments)
4. ALTER TABLE ... ADD GENERATED BY DEFAULT AS IDENTITY (auto-increment)
5. ALTER TABLE ONLY ... ADD CONSTRAINT ... PRIMARY KEY (primary keys)
6. CREATE [UNIQUE] INDEX (indexes)This is fundamental to pg_dump format - there is no pg_dump option to inline primary keys or identity into CREATE TABLE.
Result
Without parsing these ALTER TABLE statements, pg2sqlite converts:
-- PostgreSQL (from pg_dump)
CREATE TABLE seed (
id smallint NOT NULL, -- no PK, no identity info here
coin coin_btc_bch NOT NULL,
...
);
ALTER TABLE public.seed ALTER COLUMN id ADD GENERATED BY DEFAULT AS IDENTITY (...);
ALTER TABLE ONLY public.seed ADD CONSTRAINT seed_pkey PRIMARY KEY (id);To:
-- Current SQLite output (BROKEN)
CREATE TABLE seed (
id INTEGER NOT NULL, -- cannot auto-generate id on INSERT
coin TEXT NOT NULL,
...
);Expected:
-- Desired SQLite output
CREATE TABLE seed (
id INTEGER PRIMARY KEY AUTOINCREMENT, -- auto-generates id
coin TEXT NOT NULL,
...
);Real-World Impact
This causes NOT NULL constraint failed: seed.id errors when the application inserts rows without explicit id values (all E2E tests for multisig patterns fail).
2. Requirements
REQ-1: Parse ALTER TABLE ... IDENTITY
Description: Parse ALTER TABLE <schema>.<table> ALTER COLUMN <column> ADD GENERATED BY DEFAULT AS IDENTITY statements from pg_dump output.
Input format (multi-line block, ends with );):
ALTER TABLE public.address ALTER COLUMN id ADD GENERATED BY DEFAULT AS IDENTITY (
SEQUENCE NAME public.address_id_seq
START WITH 1
INCREMENT BY 1
NO MINVALUE
NO MAXVALUE
CACHE 1
);Expected behavior: Mark the column <table>.<column> as having identity (auto-increment) semantics.
Variants to handle:
GENERATED BY DEFAULT AS IDENTITY(most common from Atlas HCL)GENERATED ALWAYS AS IDENTITY(less common but valid PostgreSQL)
REQ-2: Parse ALTER TABLE ... PRIMARY KEY
Description: Parse ALTER TABLE ONLY <schema>.<table> ADD CONSTRAINT <name> PRIMARY KEY (<columns>) statements.
Input format (2-line block):
ALTER TABLE ONLY public.seed
ADD CONSTRAINT seed_pkey PRIMARY KEY (id);Expected behavior: Mark the column(s) as primary key for the corresponding table.
Scope:
- Single-column primary keys on
idcolumns (most common case) - Multi-column primary keys (e.g., composite keys) - store as table-level constraint
REQ-3: Emit INTEGER PRIMARY KEY AUTOINCREMENT
Description: When a column has both identity (REQ-1) and primary key (REQ-2), emit INTEGER PRIMARY KEY AUTOINCREMENT in the SQLite output.
Conversion rules:
| PostgreSQL Column | Has Identity? | Has PK? | SQLite Output |
|---|---|---|---|
id smallint NOT NULL | Yes | Yes | id INTEGER PRIMARY KEY AUTOINCREMENT |
id bigint NOT NULL | Yes | Yes | id INTEGER PRIMARY KEY AUTOINCREMENT |
id integer NOT NULL | Yes | Yes | id INTEGER PRIMARY KEY AUTOINCREMENT |
id bigint NOT NULL | No | Yes | id INTEGER NOT NULL PRIMARY KEY |
id bigint NOT NULL | Yes | No | id INTEGER NOT NULL (with warning) |
id bigint NOT NULL | No | No | id INTEGER NOT NULL (current behavior) |
Important: When INTEGER PRIMARY KEY AUTOINCREMENT is used, NOT NULL must be removed (it's implicit in SQLite's INTEGER PRIMARY KEY).
REQ-4: Schema Filtering for ALTER TABLE
Description: Apply the existing --schema filter to ALTER TABLE statements, just as it applies to CREATE TABLE.
ALTER TABLE public.seed ...should be included when--schema public(default)ALTER TABLE atlas_schema_revisions.atlas_schema_revisions ...should be excluded when--schema public
REQ-5: ALTER TABLE Statements Should NOT Appear in Output
Description: The ALTER TABLE statements themselves should NOT be emitted in the SQLite output. Their information should be merged into the corresponding CREATE TABLE output.
SQLite does not support:
ALTER TABLE ... ADD GENERATED BY DEFAULT AS IDENTITYALTER TABLE ... ADD CONSTRAINT ... PRIMARY KEY
3. Design
3.1 Two-Pass Architecture
Change from single-pass to two-pass processing:
Pass 1: Parse all statements and collect metadata
- Parse
CREATE TABLEstatements (existing) - Parse
ALTER TABLE ... IDENTITYstatements → build identity map:{table.column → true} - Parse
ALTER TABLE ... PRIMARY KEYstatements → build PK map:{table → [columns]}
Pass 2: Emit SQLite DDL
- For each
CREATE TABLE, check if any column has both identity and PK - If yes, emit
INTEGER PRIMARY KEY AUTOINCREMENTfor that column - Skip emitting
ALTER TABLEstatements entirely
3.2 Data Structures
IdentityInfo:
table_name: String
column_name: String
PrimaryKeyInfo:
table_name: String
constraint_name: String
columns: Vec<String>
TableMetadata:
identity_columns: HashMap<(table, column), IdentityInfo>
primary_keys: HashMap<table, PrimaryKeyInfo>3.3 Parsing Strategy
For ALTER TABLE ... IDENTITY:
Pattern: ALTER TABLE [schema.]<table> ALTER COLUMN <column> ADD GENERATED (BY DEFAULT|ALWAYS) AS IDENTITY (
... (multi-line sequence options)
);- Extract
table_nameandcolumn_name - Consume until closing
); - The SEQUENCE details (START WITH, INCREMENT BY, etc.) can be ignored for SQLite
For ALTER TABLE ONLY ... PRIMARY KEY:
Pattern: ALTER TABLE ONLY [schema.]<table>
ADD CONSTRAINT <name> PRIMARY KEY (<col1>[, <col2>, ...]);- Extract
table_nameand column list - Multi-column PKs: only convert to
INTEGER PRIMARY KEY AUTOINCREMENTif single-column on an integer type
3.4 Column Emission Logic
fn emit_column(col, table_metadata):
is_identity = table_metadata.identity_columns.contains(table, col.name)
is_pk = table_metadata.primary_keys[table].columns == [col.name] // single-column PK
if is_identity AND is_pk AND is_integer_type(col):
emit "INTEGER PRIMARY KEY AUTOINCREMENT"
// skip NOT NULL (implicit)
// skip DEFAULT (incompatible with AUTOINCREMENT)
else if is_pk AND NOT is_identity:
emit normal_type + " NOT NULL PRIMARY KEY"
else:
emit normal conversion (existing behavior)3.5 Statement Processing Order
pg_dump guarantees this order:
CREATE TYPE(enums)CREATE TABLECOMMENT ONALTER TABLE ... IDENTITYALTER TABLE ... PRIMARY KEYCREATE INDEX
With a two-pass approach, order doesn't matter - Pass 1 collects all metadata, Pass 2 emits.
If single-pass is preferred (for performance), buffer CREATE TABLE output and apply metadata retroactively when encountering ALTER TABLE statements. However, two-pass is simpler and the input files are small (< 2000 lines).
4. Input/Output Examples
Example 1: Single table with identity + PK
Input (from updated extract script):
CREATE TABLE seed (
id smallint NOT NULL,
coin coin_btc_bch NOT NULL,
seed character varying(255) NOT NULL,
updated_at timestamp with time zone DEFAULT now()
);
ALTER TABLE public.seed ALTER COLUMN id ADD GENERATED BY DEFAULT AS IDENTITY (
SEQUENCE NAME public.seed_id_seq
START WITH 1
INCREMENT BY 1
NO MINVALUE
NO MAXVALUE
CACHE 1
);
ALTER TABLE ONLY public.seed
ADD CONSTRAINT seed_pkey PRIMARY KEY (id);Expected output:
CREATE TABLE seed (
id INTEGER PRIMARY KEY AUTOINCREMENT,
coin TEXT NOT NULL,
seed TEXT NOT NULL,
updated_at TEXT DEFAULT (CURRENT_TIMESTAMP)
);Example 2: Table with composite primary key
Input:
CREATE TABLE composite_example (
col_a integer NOT NULL,
col_b integer NOT NULL,
data text
);
ALTER TABLE ONLY public.composite_example
ADD CONSTRAINT composite_example_pkey PRIMARY KEY (col_a, col_b);Expected output (no AUTOINCREMENT for composite PKs):
CREATE TABLE composite_example (
col_a INTEGER NOT NULL,
col_b INTEGER NOT NULL,
data TEXT,
PRIMARY KEY (col_a, col_b)
);Example 3: Multiple tables from real pg_dump
Input (keygen dump, abbreviated):
CREATE TABLE btc_account_key (
id bigint NOT NULL,
coin coin_btc_bch NOT NULL,
...
);
ALTER TABLE public.btc_account_key ALTER COLUMN id ADD GENERATED BY DEFAULT AS IDENTITY (...);
ALTER TABLE ONLY public.btc_account_key
ADD CONSTRAINT btc_account_key_pkey PRIMARY KEY (id);
CREATE TABLE seed (
id smallint NOT NULL,
coin coin_all NOT NULL,
...
);
ALTER TABLE public.seed ALTER COLUMN id ADD GENERATED BY DEFAULT AS IDENTITY (...);
ALTER TABLE ONLY public.seed
ADD CONSTRAINT seed_pkey PRIMARY KEY (id);Expected output:
CREATE TABLE btc_account_key (
id INTEGER PRIMARY KEY AUTOINCREMENT,
coin TEXT NOT NULL,
...
);
CREATE TABLE seed (
id INTEGER PRIMARY KEY AUTOINCREMENT,
coin TEXT NOT NULL,
...
);5. Test Cases
Unit Tests
| Test | Input | Expected |
|---|---|---|
| Identity + PK on id | id bigint NOT NULL + IDENTITY + PK | id INTEGER PRIMARY KEY AUTOINCREMENT |
| Identity + PK on smallint id | id smallint NOT NULL + IDENTITY + PK | id INTEGER PRIMARY KEY AUTOINCREMENT |
| PK only (no identity) | id bigint NOT NULL + PK (no IDENTITY) | id INTEGER NOT NULL PRIMARY KEY |
| Identity only (no PK) | id bigint NOT NULL + IDENTITY (no PK) | id INTEGER NOT NULL + warning |
| No identity, no PK | id bigint NOT NULL | id INTEGER NOT NULL |
| Composite PK | Two columns in PK | Table-level PRIMARY KEY (col_a, col_b) |
| Schema filtering | ALTER TABLE other_schema.t ... | Ignored when --schema public |
| Excluded table | ALTER TABLE public.atlas_schema_revisions ... | Filtered by schema |
GENERATED ALWAYS | GENERATED ALWAYS AS IDENTITY | Same as BY DEFAULT |
Integration Tests
Use the three real dump files from this project:
dump_watch.postgres.sql→ all tables getINTEGER PRIMARY KEY AUTOINCREMENTdump_keygen.postgres.sql→ all tables getINTEGER PRIMARY KEY AUTOINCREMENTdump_sign.postgres.sql→auth_account_keygetsINTEGER PRIMARY KEY AUTOINCREMENT
Regression Tests
Ensure existing CREATE TABLE-only input (without ALTER TABLE) still works correctly (backward compatibility).
6. Upstream Pipeline Context
This change fits into the following code generation pipeline:
Atlas HCL (source of truth)
↓ atlas migrate diff/apply
PostgreSQL DB
↓ pg_dump --schema-only
Raw dump SQL (CREATE TABLE + ALTER TABLE + CREATE INDEX + ...)
↓ extract-sqlc-schema-postgres.sh (updated to include ALTER TABLE)
Extracted SQL (CREATE TABLE + ALTER TABLE IDENTITY + ALTER TABLE PK)
↓ pg2sqlite (this change)
SQLite schema SQL (CREATE TABLE with INTEGER PRIMARY KEY AUTOINCREMENT)
↓ sqlc generate
Go code (database/sqlite/sqlcgen/)The extract script has already been updated to pass through ALTER TABLE ... IDENTITY and ALTER TABLE ... PRIMARY KEY blocks.
7. Out of Scope
The following are NOT required for this change:
CREATE INDEX/CREATE UNIQUE INDEXconversion (can be a separate feature)ALTER TABLE ... ADD FOREIGN KEY(handled by--enable-foreign-keysflag)ALTER TABLE ... ADD COLUMN(DDL migration, not schema dump)COMMENT ONstatements (not relevant for SQLite)CREATE SEQUENCEstatements (identity sequences are implicit in SQLite)