Database Schema¶

Database schema definitions and table structures for the Prompt Versioner SQLite database.

Overview¶

The database schema defines the structure for storing prompts, versions, metrics, annotations, and related data. The schema uses SQLite with foreign key constraints and optimized indexes for performance.

Schema Module¶

Schema Definitions¶

from prompt_versioner.storage.schema import SCHEMA_DEFINITIONS, INDEXES, SCHEMA_VERSION

The schema module provides: - Table Definitions: DDL statements for all database tables - Index Definitions: Performance-optimized indexes - Schema Versioning: Migration support through version tracking

Database Tables¶

prompt_versions¶

Main table storing prompt versions and their content.

CREATE TABLE IF NOT EXISTS prompt_versions (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    name TEXT NOT NULL,
    version TEXT NOT NULL,
    system_prompt TEXT NOT NULL,
    user_prompt TEXT NOT NULL,
    metadata TEXT,
    git_commit TEXT,
    timestamp TEXT NOT NULL,
    created_by TEXT,
    tags TEXT,
    UNIQUE(name, version)
)

Columns:

Column	Type	Constraints	Description
`id`	INTEGER	PRIMARY KEY AUTOINCREMENT	Unique prompt version ID
`name`	TEXT	NOT NULL	Prompt name identifier
`version`	TEXT	NOT NULL	Semantic version (e.g., "1.2.0")
`system_prompt`	TEXT	NOT NULL	System prompt content
`user_prompt`	TEXT	NOT NULL	User prompt template
`metadata`	TEXT	-	JSON metadata (optional)
`git_commit`	TEXT	-	Git commit hash (optional)
`timestamp`	TEXT	NOT NULL	ISO timestamp of creation
`created_by`	TEXT	-	Author identifier (optional)
`tags`	TEXT	-	Comma-separated tags (deprecated, use version_tags)

Constraints: - UNIQUE(name, version): Prevents duplicate versions for same prompt

Example Data:

INSERT INTO prompt_versions (
    name, version, system_prompt, user_prompt, metadata, timestamp
) VALUES (
    'code_reviewer',
    '1.2.0',
    'You are an expert code reviewer.',
    'Review this code: {code}',
    '{"author": "john.doe", "purpose": "code review"}',
    '2025-01-15T10:30:00Z'
);

prompt_metrics¶

Performance metrics and usage data for prompt versions.

CREATE TABLE IF NOT EXISTS prompt_metrics (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    version_id INTEGER NOT NULL,
    model_name TEXT,
    input_tokens INTEGER,
    output_tokens INTEGER,
    total_tokens INTEGER,
    cost_eur REAL,
    latency_ms REAL,
    quality_score REAL,
    accuracy REAL,
    temperature REAL,
    top_p REAL,
    max_tokens INTEGER,
    success BOOLEAN,
    error_message TEXT,
    timestamp TEXT NOT NULL,
    metadata TEXT,
    FOREIGN KEY (version_id) REFERENCES prompt_versions(id) ON DELETE CASCADE
)

Columns:

Column	Type	Constraints	Description
`id`	INTEGER	PRIMARY KEY AUTOINCREMENT	Unique metric record ID
`version_id`	INTEGER	NOT NULL, FK	Reference to prompt_versions.id
`model_name`	TEXT	-	AI model used (e.g., "gpt-4o")
`input_tokens`	INTEGER	-	Number of input tokens
`output_tokens`	INTEGER	-	Number of output tokens
`total_tokens`	INTEGER	-	Total tokens (calculated)
`cost_eur`	REAL	-	Cost in euros
`latency_ms`	REAL	-	Response latency in milliseconds
`quality_score`	REAL	-	Quality assessment (0-1)
`accuracy`	REAL	-	Accuracy measurement (0-1)
`temperature`	REAL	-	Model temperature parameter
`top_p`	REAL	-	Model top_p parameter
`max_tokens`	INTEGER	-	Model max_tokens parameter
`success`	BOOLEAN	-	Whether request succeeded
`error_message`	TEXT	-	Error details if failed
`timestamp`	TEXT	NOT NULL	ISO timestamp of metric collection
`metadata`	TEXT	-	Additional JSON metadata

Relationships: - FOREIGN KEY (version_id) REFERENCES prompt_versions(id) ON DELETE CASCADE

Example Data:

INSERT INTO prompt_metrics (
    version_id, model_name, input_tokens, output_tokens,
    cost_eur, latency_ms, quality_score, success, timestamp
) VALUES (
    1, 'gpt-4o-mini', 150, 200, 0.004, 1250.5, 0.87, 1, '2025-01-15T10:35:00Z'
);

annotations¶

Team annotations and comments on prompt versions.

CREATE TABLE IF NOT EXISTS annotations (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    version_id INTEGER NOT NULL,
    author TEXT NOT NULL,
    text TEXT NOT NULL,
    timestamp TEXT NOT NULL,
    annotation_type TEXT DEFAULT 'comment',
    resolved BOOLEAN DEFAULT 0,
    FOREIGN KEY (version_id) REFERENCES prompt_versions(id) ON DELETE CASCADE
)

Columns:

Column	Type	Constraints	Description
`id`	INTEGER	PRIMARY KEY AUTOINCREMENT	Unique annotation ID
`version_id`	INTEGER	NOT NULL, FK	Reference to prompt_versions.id
`author`	TEXT	NOT NULL	Annotation author
`text`	TEXT	NOT NULL	Annotation content
`timestamp`	TEXT	NOT NULL	ISO timestamp of creation
`annotation_type`	TEXT	DEFAULT 'comment'	Type: comment, review, approval
`resolved`	BOOLEAN	DEFAULT 0	Whether annotation is resolved

Annotation Types: - comment: General comment or note - review: Code/prompt review feedback - approval: Approval for production deployment - issue: Issue or problem identification

Example Data:

INSERT INTO annotations (
    version_id, author, text, annotation_type, timestamp
) VALUES (
    1, 'team_lead', 'Approved for production deployment', 'approval', '2025-01-15T11:00:00Z'
);

version_tags¶

Tag system for organizing and categorizing prompt versions.

CREATE TABLE IF NOT EXISTS version_tags (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    version_id INTEGER NOT NULL,
    tag TEXT NOT NULL,
    FOREIGN KEY (version_id) REFERENCES prompt_versions(id) ON DELETE CASCADE,
    UNIQUE(version_id, tag)
)

Columns:

Column	Type	Constraints	Description
`id`	INTEGER	PRIMARY KEY AUTOINCREMENT	Unique tag association ID
`version_id`	INTEGER	NOT NULL, FK	Reference to prompt_versions.id
`tag`	TEXT	NOT NULL	Tag name

Constraints: - UNIQUE(version_id, tag): Prevents duplicate tags on same version

Common Tags: - production: Production-ready versions - experimental: Experimental versions - deprecated: Deprecated versions - stable: Stable releases - hotfix: Emergency fixes

Example Data:

INSERT INTO version_tags (version_id, tag) VALUES
    (1, 'production'),
    (1, 'stable'),
    (2, 'experimental');

Database Indexes¶

Performance-optimized indexes for common query patterns:

INDEXES = [
    "CREATE INDEX IF NOT EXISTS idx_name_version ON prompt_versions(name, version)",
    "CREATE INDEX IF NOT EXISTS idx_timestamp ON prompt_versions(timestamp DESC)",
    "CREATE INDEX IF NOT EXISTS idx_name ON prompt_versions(name)",
    "CREATE INDEX IF NOT EXISTS idx_metrics_version ON prompt_metrics(version_id)",
    "CREATE INDEX IF NOT EXISTS idx_metrics_timestamp ON prompt_metrics(timestamp DESC)",
    "CREATE INDEX IF NOT EXISTS idx_annotations_version ON annotations(version_id)",
    "CREATE INDEX IF NOT EXISTS idx_tags_version ON version_tags(version_id)",
    "CREATE INDEX IF NOT EXISTS idx_tags_tag ON version_tags(tag)",
]

Index Purpose¶

Index	Purpose	Query Optimization
`idx_name_version`	Composite index	Fast lookup by name + version
`idx_timestamp`	Temporal ordering	Recent versions first
`idx_name`	Name-based queries	List versions by prompt name
`idx_metrics_version`	Metrics lookup	Get metrics for specific version
`idx_metrics_timestamp`	Temporal metrics	Recent metrics analysis
`idx_annotations_version`	Annotation lookup	Get annotations for version
`idx_tags_version`	Tag queries	Find tags for version
`idx_tags_tag`	Tag filtering	Find versions with specific tag

Query Performance Examples¶

-- ✅ Optimized: Uses idx_name_version
SELECT * FROM prompt_versions WHERE name = 'code_reviewer' AND version = '1.2.0';

-- ✅ Optimized: Uses idx_timestamp
SELECT * FROM prompt_versions ORDER BY timestamp DESC LIMIT 10;

-- ✅ Optimized: Uses idx_metrics_version
SELECT AVG(quality_score) FROM prompt_metrics WHERE version_id = 1;

-- ✅ Optimized: Uses idx_tags_tag
SELECT pv.* FROM prompt_versions pv
JOIN version_tags vt ON pv.id = vt.version_id
WHERE vt.tag = 'production';

Schema Versioning¶

Current Schema Version¶

SCHEMA_VERSION = 1

The schema version enables database migrations and compatibility checking:

def check_schema_version(db):
    """Check if database schema matches current version."""
    current_version = get_user_version(db)
    if current_version != SCHEMA_VERSION:
        raise SchemaVersionMismatch(
            f"Database schema v{current_version} != required v{SCHEMA_VERSION}"
        )

Migration Support¶

Future schema changes will increment SCHEMA_VERSION and provide migration scripts:

# Future migration example
MIGRATIONS = {
    1: [
        "ALTER TABLE prompt_versions ADD COLUMN new_field TEXT",
        "CREATE INDEX idx_new_field ON prompt_versions(new_field)"
    ],
    2: [
        "CREATE TABLE new_table (...)",
        "INSERT INTO new_table SELECT ... FROM old_table"
    ]
}

Relationships and Constraints¶

Entity Relationship Diagram¶

prompt_versions (1) ──── (N) prompt_metrics
       │
       │ (1)
       │
       └── (N) annotations
       │
       │ (1)
       │
       └── (N) version_tags

Foreign Key Behavior¶

All foreign keys use ON DELETE CASCADE for automatic cleanup:

-- When a prompt version is deleted:
DELETE FROM prompt_versions WHERE id = 1;

-- Automatically deletes:
-- - All metrics for that version
-- - All annotations for that version
-- - All tags for that version

Data Integrity Rules¶

Unique Versions: Each prompt can have only one version with a given number
Required Fields: Core fields (name, version, prompts, timestamp) are mandatory
Referential Integrity: Metrics, annotations, and tags must reference valid versions
Cascade Deletion: Related data is automatically cleaned up

Usage Examples¶

Creating Tables¶

from prompt_versioner.storage.database import DatabaseManager
from prompt_versioner.storage.schema import SCHEMA_DEFINITIONS, INDEXES

db = DatabaseManager()

# Tables are created automatically during initialization
# But you can create manually if needed:
with db.get_connection() as conn:
    for table_name, table_sql in SCHEMA_DEFINITIONS.items():
        conn.execute(table_sql)

    for index_sql in INDEXES:
        conn.execute(index_sql)

# Get version with all related data
def get_version_with_details(db, version_id):
    with db.get_connection() as conn:
        # Main version data
        version = conn.execute(
            "SELECT * FROM prompt_versions WHERE id = ?",
            (version_id,)
        ).fetchone()

        # Metrics summary
        metrics = conn.execute("""
            SELECT COUNT(*) as count, AVG(quality_score) as avg_quality,
                   AVG(latency_ms) as avg_latency, AVG(cost_eur) as avg_cost
            FROM prompt_metrics WHERE version_id = ?
        """, (version_id,)).fetchone()

        # Annotations
        annotations = conn.execute(
            "SELECT * FROM annotations WHERE version_id = ? ORDER BY timestamp",
            (version_id,)
        ).fetchall()

        # Tags
        tags = conn.execute(
            "SELECT tag FROM version_tags WHERE version_id = ?",
            (version_id,)
        ).fetchall()

        return {
            'version': dict(version),
            'metrics': dict(metrics),
            'annotations': [dict(a) for a in annotations],
            'tags': [t['tag'] for t in tags]
        }

Complex Queries¶

# Find best performing versions by tag
def get_top_versions_by_tag(db, tag, limit=10):
    query = """
        SELECT pv.name, pv.version, AVG(pm.quality_score) as avg_quality
        FROM prompt_versions pv
        JOIN version_tags vt ON pv.id = vt.version_id
        JOIN prompt_metrics pm ON pv.id = pm.version_id
        WHERE vt.tag = ? AND pm.quality_score IS NOT NULL
        GROUP BY pv.id
        ORDER BY avg_quality DESC
        LIMIT ?
    """
    return db.execute(query, (tag, limit), fetch="all")

Database Schema¶

Overview¶

Schema Module¶

Schema Definitions¶

Database Tables¶

prompt_versions¶

prompt_metrics¶

annotations¶

version_tags¶

Database Indexes¶

Index Purpose¶

Query Performance Examples¶

Schema Versioning¶

Current Schema Version¶

Migration Support¶

Relationships and Constraints¶

Entity Relationship Diagram¶

Foreign Key Behavior¶

Data Integrity Rules¶

Usage Examples¶

Creating Tables¶

Querying Related Data¶

Complex Queries¶

See Also¶