Skip to content

pg2sqlite: ALTER TABLE & PRIMARY KEY Support

Requirements & Design Specification

This document defines requirements and implementation design for adding ALTER TABLE support to pg2sqlite-rs, enabling proper conversion of PostgreSQL identity columns and primary keys to SQLite INTEGER PRIMARY KEY AUTOINCREMENT.


1. Problem Statement

Current Behavior (v0.1.3)

pg2sqlite currently only parses CREATE TABLE statements from PostgreSQL DDL. It ignores:

  • ALTER TABLE ... ALTER COLUMN ... ADD GENERATED BY DEFAULT AS IDENTITY
  • ALTER TABLE ONLY ... ADD CONSTRAINT ... PRIMARY KEY

Why This Matters

PostgreSQL's pg_dump (used with --schema-only --no-owner --no-privileges) outputs DDL in a specific order where identity and primary key definitions are separate ALTER TABLE statements, not inline in CREATE TABLE:

1. CREATE TYPE (enums)
2. CREATE TABLE (columns only - NO primary key, NO identity)
3. COMMENT ON (comments)
4. ALTER TABLE ... ADD GENERATED BY DEFAULT AS IDENTITY (auto-increment)
5. ALTER TABLE ONLY ... ADD CONSTRAINT ... PRIMARY KEY (primary keys)
6. CREATE [UNIQUE] INDEX (indexes)

This is fundamental to pg_dump format - there is no pg_dump option to inline primary keys or identity into CREATE TABLE.

Result

Without parsing these ALTER TABLE statements, pg2sqlite converts:

sql
-- PostgreSQL (from pg_dump)
CREATE TABLE seed (
    id smallint NOT NULL,    -- no PK, no identity info here
    coin coin_btc_bch NOT NULL,
    ...
);
ALTER TABLE public.seed ALTER COLUMN id ADD GENERATED BY DEFAULT AS IDENTITY (...);
ALTER TABLE ONLY public.seed ADD CONSTRAINT seed_pkey PRIMARY KEY (id);

To:

sql
-- Current SQLite output (BROKEN)
CREATE TABLE seed (
  id INTEGER NOT NULL,       -- cannot auto-generate id on INSERT
  coin TEXT NOT NULL,
  ...
);

Expected:

sql
-- Desired SQLite output
CREATE TABLE seed (
  id INTEGER PRIMARY KEY AUTOINCREMENT,  -- auto-generates id
  coin TEXT NOT NULL,
  ...
);

Real-World Impact

This causes NOT NULL constraint failed: seed.id errors when the application inserts rows without explicit id values (all E2E tests for multisig patterns fail).


2. Requirements

REQ-1: Parse ALTER TABLE ... IDENTITY

Description: Parse ALTER TABLE <schema>.<table> ALTER COLUMN <column> ADD GENERATED BY DEFAULT AS IDENTITY statements from pg_dump output.

Input format (multi-line block, ends with );):

sql
ALTER TABLE public.address ALTER COLUMN id ADD GENERATED BY DEFAULT AS IDENTITY (
    SEQUENCE NAME public.address_id_seq
    START WITH 1
    INCREMENT BY 1
    NO MINVALUE
    NO MAXVALUE
    CACHE 1
);

Expected behavior: Mark the column <table>.<column> as having identity (auto-increment) semantics.

Variants to handle:

  • GENERATED BY DEFAULT AS IDENTITY (most common from Atlas HCL)
  • GENERATED ALWAYS AS IDENTITY (less common but valid PostgreSQL)

REQ-2: Parse ALTER TABLE ... PRIMARY KEY

Description: Parse ALTER TABLE ONLY <schema>.<table> ADD CONSTRAINT <name> PRIMARY KEY (<columns>) statements.

Input format (2-line block):

sql
ALTER TABLE ONLY public.seed
    ADD CONSTRAINT seed_pkey PRIMARY KEY (id);

Expected behavior: Mark the column(s) as primary key for the corresponding table.

Scope:

  • Single-column primary keys on id columns (most common case)
  • Multi-column primary keys (e.g., composite keys) - store as table-level constraint

REQ-3: Emit INTEGER PRIMARY KEY AUTOINCREMENT

Description: When a column has both identity (REQ-1) and primary key (REQ-2), emit INTEGER PRIMARY KEY AUTOINCREMENT in the SQLite output.

Conversion rules:

PostgreSQL ColumnHas Identity?Has PK?SQLite Output
id smallint NOT NULLYesYesid INTEGER PRIMARY KEY AUTOINCREMENT
id bigint NOT NULLYesYesid INTEGER PRIMARY KEY AUTOINCREMENT
id integer NOT NULLYesYesid INTEGER PRIMARY KEY AUTOINCREMENT
id bigint NOT NULLNoYesid INTEGER NOT NULL PRIMARY KEY
id bigint NOT NULLYesNoid INTEGER NOT NULL (with warning)
id bigint NOT NULLNoNoid INTEGER NOT NULL (current behavior)

Important: When INTEGER PRIMARY KEY AUTOINCREMENT is used, NOT NULL must be removed (it's implicit in SQLite's INTEGER PRIMARY KEY).

REQ-4: Schema Filtering for ALTER TABLE

Description: Apply the existing --schema filter to ALTER TABLE statements, just as it applies to CREATE TABLE.

  • ALTER TABLE public.seed ... should be included when --schema public (default)
  • ALTER TABLE atlas_schema_revisions.atlas_schema_revisions ... should be excluded when --schema public

REQ-5: ALTER TABLE Statements Should NOT Appear in Output

Description: The ALTER TABLE statements themselves should NOT be emitted in the SQLite output. Their information should be merged into the corresponding CREATE TABLE output.

SQLite does not support:

  • ALTER TABLE ... ADD GENERATED BY DEFAULT AS IDENTITY
  • ALTER TABLE ... ADD CONSTRAINT ... PRIMARY KEY

3. Design

3.1 Two-Pass Architecture

Change from single-pass to two-pass processing:

Pass 1: Parse all statements and collect metadata

  1. Parse CREATE TABLE statements (existing)
  2. Parse ALTER TABLE ... IDENTITY statements → build identity map: {table.column → true}
  3. Parse ALTER TABLE ... PRIMARY KEY statements → build PK map: {table → [columns]}

Pass 2: Emit SQLite DDL

  1. For each CREATE TABLE, check if any column has both identity and PK
  2. If yes, emit INTEGER PRIMARY KEY AUTOINCREMENT for that column
  3. Skip emitting ALTER TABLE statements entirely

3.2 Data Structures

IdentityInfo:
  table_name: String
  column_name: String

PrimaryKeyInfo:
  table_name: String
  constraint_name: String
  columns: Vec<String>

TableMetadata:
  identity_columns: HashMap<(table, column), IdentityInfo>
  primary_keys: HashMap<table, PrimaryKeyInfo>

3.3 Parsing Strategy

For ALTER TABLE ... IDENTITY:

Pattern: ALTER TABLE [schema.]<table> ALTER COLUMN <column> ADD GENERATED (BY DEFAULT|ALWAYS) AS IDENTITY (
         ... (multi-line sequence options)
         );
  • Extract table_name and column_name
  • Consume until closing );
  • The SEQUENCE details (START WITH, INCREMENT BY, etc.) can be ignored for SQLite

For ALTER TABLE ONLY ... PRIMARY KEY:

Pattern: ALTER TABLE ONLY [schema.]<table>
             ADD CONSTRAINT <name> PRIMARY KEY (<col1>[, <col2>, ...]);
  • Extract table_name and column list
  • Multi-column PKs: only convert to INTEGER PRIMARY KEY AUTOINCREMENT if single-column on an integer type

3.4 Column Emission Logic

fn emit_column(col, table_metadata):
    is_identity = table_metadata.identity_columns.contains(table, col.name)
    is_pk = table_metadata.primary_keys[table].columns == [col.name]  // single-column PK

    if is_identity AND is_pk AND is_integer_type(col):
        emit "INTEGER PRIMARY KEY AUTOINCREMENT"
        // skip NOT NULL (implicit)
        // skip DEFAULT (incompatible with AUTOINCREMENT)
    else if is_pk AND NOT is_identity:
        emit normal_type + " NOT NULL PRIMARY KEY"
    else:
        emit normal conversion (existing behavior)

3.5 Statement Processing Order

pg_dump guarantees this order:

  1. CREATE TYPE (enums)
  2. CREATE TABLE
  3. COMMENT ON
  4. ALTER TABLE ... IDENTITY
  5. ALTER TABLE ... PRIMARY KEY
  6. CREATE INDEX

With a two-pass approach, order doesn't matter - Pass 1 collects all metadata, Pass 2 emits.

If single-pass is preferred (for performance), buffer CREATE TABLE output and apply metadata retroactively when encountering ALTER TABLE statements. However, two-pass is simpler and the input files are small (< 2000 lines).


4. Input/Output Examples

Example 1: Single table with identity + PK

Input (from updated extract script):

sql
CREATE TABLE seed (
    id smallint NOT NULL,
    coin coin_btc_bch NOT NULL,
    seed character varying(255) NOT NULL,
    updated_at timestamp with time zone DEFAULT now()
);

ALTER TABLE public.seed ALTER COLUMN id ADD GENERATED BY DEFAULT AS IDENTITY (
    SEQUENCE NAME public.seed_id_seq
    START WITH 1
    INCREMENT BY 1
    NO MINVALUE
    NO MAXVALUE
    CACHE 1
);

ALTER TABLE ONLY public.seed
    ADD CONSTRAINT seed_pkey PRIMARY KEY (id);

Expected output:

sql
CREATE TABLE seed (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  coin TEXT NOT NULL,
  seed TEXT NOT NULL,
  updated_at TEXT DEFAULT (CURRENT_TIMESTAMP)
);

Example 2: Table with composite primary key

Input:

sql
CREATE TABLE composite_example (
    col_a integer NOT NULL,
    col_b integer NOT NULL,
    data text
);

ALTER TABLE ONLY public.composite_example
    ADD CONSTRAINT composite_example_pkey PRIMARY KEY (col_a, col_b);

Expected output (no AUTOINCREMENT for composite PKs):

sql
CREATE TABLE composite_example (
  col_a INTEGER NOT NULL,
  col_b INTEGER NOT NULL,
  data TEXT,
  PRIMARY KEY (col_a, col_b)
);

Example 3: Multiple tables from real pg_dump

Input (keygen dump, abbreviated):

sql
CREATE TABLE btc_account_key (
    id bigint NOT NULL,
    coin coin_btc_bch NOT NULL,
    ...
);

ALTER TABLE public.btc_account_key ALTER COLUMN id ADD GENERATED BY DEFAULT AS IDENTITY (...);

ALTER TABLE ONLY public.btc_account_key
    ADD CONSTRAINT btc_account_key_pkey PRIMARY KEY (id);

CREATE TABLE seed (
    id smallint NOT NULL,
    coin coin_all NOT NULL,
    ...
);

ALTER TABLE public.seed ALTER COLUMN id ADD GENERATED BY DEFAULT AS IDENTITY (...);

ALTER TABLE ONLY public.seed
    ADD CONSTRAINT seed_pkey PRIMARY KEY (id);

Expected output:

sql
CREATE TABLE btc_account_key (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  coin TEXT NOT NULL,
  ...
);

CREATE TABLE seed (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  coin TEXT NOT NULL,
  ...
);

5. Test Cases

Unit Tests

TestInputExpected
Identity + PK on idid bigint NOT NULL + IDENTITY + PKid INTEGER PRIMARY KEY AUTOINCREMENT
Identity + PK on smallint idid smallint NOT NULL + IDENTITY + PKid INTEGER PRIMARY KEY AUTOINCREMENT
PK only (no identity)id bigint NOT NULL + PK (no IDENTITY)id INTEGER NOT NULL PRIMARY KEY
Identity only (no PK)id bigint NOT NULL + IDENTITY (no PK)id INTEGER NOT NULL + warning
No identity, no PKid bigint NOT NULLid INTEGER NOT NULL
Composite PKTwo columns in PKTable-level PRIMARY KEY (col_a, col_b)
Schema filteringALTER TABLE other_schema.t ...Ignored when --schema public
Excluded tableALTER TABLE public.atlas_schema_revisions ...Filtered by schema
GENERATED ALWAYSGENERATED ALWAYS AS IDENTITYSame as BY DEFAULT

Integration Tests

Use the three real dump files from this project:

  • dump_watch.postgres.sql → all tables get INTEGER PRIMARY KEY AUTOINCREMENT
  • dump_keygen.postgres.sql → all tables get INTEGER PRIMARY KEY AUTOINCREMENT
  • dump_sign.postgres.sqlauth_account_key gets INTEGER PRIMARY KEY AUTOINCREMENT

Regression Tests

Ensure existing CREATE TABLE-only input (without ALTER TABLE) still works correctly (backward compatibility).


6. Upstream Pipeline Context

This change fits into the following code generation pipeline:

Atlas HCL (source of truth)
    ↓ atlas migrate diff/apply
PostgreSQL DB
    ↓ pg_dump --schema-only
Raw dump SQL (CREATE TABLE + ALTER TABLE + CREATE INDEX + ...)
    ↓ extract-sqlc-schema-postgres.sh (updated to include ALTER TABLE)
Extracted SQL (CREATE TABLE + ALTER TABLE IDENTITY + ALTER TABLE PK)
    ↓ pg2sqlite (this change)
SQLite schema SQL (CREATE TABLE with INTEGER PRIMARY KEY AUTOINCREMENT)
    ↓ sqlc generate
Go code (database/sqlite/sqlcgen/)

The extract script has already been updated to pass through ALTER TABLE ... IDENTITY and ALTER TABLE ... PRIMARY KEY blocks.


7. Out of Scope

The following are NOT required for this change:

  • CREATE INDEX / CREATE UNIQUE INDEX conversion (can be a separate feature)
  • ALTER TABLE ... ADD FOREIGN KEY (handled by --enable-foreign-keys flag)
  • ALTER TABLE ... ADD COLUMN (DDL migration, not schema dump)
  • COMMENT ON statements (not relevant for SQLite)
  • CREATE SEQUENCE statements (identity sequences are implicit in SQLite)