Documentation - HkDevs CodeForge Database Studio

Phase 5: Smart Seeding

Smart Seeding generates realistic test data from schema analysis, maintains basic referential integrity for foreign keys, and inserts records in bulk using customizable templates.

Overview

Intelligent Analysis

Analyze your database schema and automatically generate contextually appropriate test data with realistic patterns.

• Automatic field type detection based on column names
• Smart data generation for emails, names, addresses
• Data type-specific value generation
• Custom field mapping suggestions

Basic Relationships

Handle basic foreign key relationships by discovering existing records in related tables.

• Foreign key pattern detection (_id columns)
• Automatic related table discovery
• Random selection from existing records
• Basic referential integrity maintenance

Available Smart Seeding Features

Table Analysis

Smart column detection and type analysis

Data Preview

Real-time preview before insertion

Template System

Reusable generation templates

Faker Integration

Built-in Faker library support

Seeding Templates

Template System

Create reusable data generation templates with field mappings and relationship configurations.

Auto Templates

Automatically generated from schema analysis

• Field type detection based on names
• Basic data type mapping
• Foreign key detection (_id pattern)
• Simple field suggestions

Manual Templates

Created and configured through Filament interface

• Custom field mapping repeater
• Relationship configuration
• Constraint definitions
• Reusable across tables

Available Commands:

# Generate data with auto template
php artisan codeforge:generate-data users --count=50

# Preview without inserting
php artisan codeforge:generate-data users --count=5 --preview

# Save as template
php artisan codeforge:generate-data users --count=10 --save-template=user_template

Template Configuration

Configure field mappings and data generation options through the Filament interface.

Field Type Support

Supported field types for data generation:

• auto_increment

• uuid

• string/text

• email

• name

• phone

• address

• number/integer

• decimal/float

• boolean

• date

• datetime

• json

• enum

• foreign_key

Smart Field Detection

Automatic field mapping based on column names:

• email columns → email type
• *_name columns → name type
• phone* columns → phone type
• *_at columns → datetime type
• is_*, has_*, can_* → boolean type
• *_id columns → foreign_key type
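
The naming rules above can be sketched as an ordered list of patterns. This is a minimal illustration, not CodeForge's actual detection code: the function name guessFieldType(), the exact-match rule for "email", and the fallback to "string" are all assumptions made for the example.

```php
<?php
// Hypothetical helper: guesses a field type from a column name using the
// naming patterns listed above. Not the package's real implementation.
function guessFieldType(string $column): string
{
    $column = strtolower($column);

    // Ordered rules: the first matching pattern wins.
    $rules = [
        '/^email$/'        => 'email',       // assumption: exact column name
        '/_name$/'         => 'name',
        '/^phone/'         => 'phone',
        '/_at$/'           => 'datetime',
        '/^(is|has|can)_/' => 'boolean',
        '/_id$/'           => 'foreign_key',
    ];

    foreach ($rules as $pattern => $type) {
        if (preg_match($pattern, $column) === 1) {
            return $type;
        }
    }

    // Fallback when no naming pattern applies (an assumption of this sketch).
    return 'string';
}

$columns = ['email', 'first_name', 'phone_number', 'created_at', 'is_active', 'user_id', 'notes'];
foreach ($columns as $column) {
    echo $column, ' => ', guessFieldType($column), PHP_EOL;
}
```

Because the first match wins, the order of the rules matters for names that fit more than one pattern.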

Field Options

Each field type supports configurable options through key-value pairs.

Template Management

Manage your data generation templates through the Filament admin interface.

Template Features

• Name and description
• Target table association
• Field mapping repeater
• Relationship configuration
• Constraint definitions
• Default record count
• Active/inactive status

Template Actions

• Preview generated data
• Generate data with confirmation
• Duplicate templates
• Export/Import templates
• Template usage statistics

Personal Data

• first_name, last_name → Names
• email → Valid email addresses
• phone → Phone numbers
• birth_date → Birth dates with realistic ages

Address Fields

• address → Street addresses
• city → City names
• postal_code → ZIP codes
• country → Country codes

Business Data

• company_name → Company names
• price, cost → Monetary values
• description → Lorem text
• status → Enum values

Technical Fields

• url, website → URLs
• ip_address → IP addresses
• uuid → UUIDs
• slug → URL-friendly strings

Date & Time

• created_at → Recent dates
• updated_at → Update dates
• published_at → Publication dates
• expires_at → Future dates

Content Fields

• title → Titles
• content, body → Paragraphs
• summary → Short text
• tags → Tag lists

Relationship-Aware Generation

Basic Relationship Handling

Handle foreign key relationships by discovering and using existing records from related tables.

Current Support:

• Basic belongsTo relationships
• Foreign key pattern detection (_id)
• Random existing record selection
• Related table discovery

How It Works:

• Detects columns ending with "_id"
• Finds the related table by removing "_id" and adding "s" (for example, user_id → users)
• Selects random existing ID from related table
• Maintains referential integrity
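
The steps above can be sketched in a few lines of plain PHP. The helper names guessRelatedTable() and pickExistingId() are hypothetical, and the list of existing IDs is passed in as an array here rather than queried from the database.

```php
<?php
// Hypothetical helpers illustrating the "_id" convention described above.

// Returns the guessed related table, or null if the column is not "*_id".
function guessRelatedTable(string $column): ?string
{
    if (strlen($column) <= 3 || substr($column, -3) !== '_id') {
        return null;
    }

    // Remove the "_id" suffix and add "s": user_id -> users.
    return substr($column, 0, -3) . 's';
}

// Picks a random ID from the IDs that already exist in the related table.
// Returns null when the related table is empty.
function pickExistingId(array $existingIds)
{
    if ($existingIds === []) {
        return null; // no parent rows yet: seed the parent table first
    }

    return $existingIds[array_rand($existingIds)];
}

echo guessRelatedTable('user_id'), PHP_EOL;
echo pickExistingId([3, 7, 9]), PHP_EOL;
```

If the rule is applied literally, a column such as category_id maps to "categorys" rather than "categories", so columns with irregular plurals are good candidates for an explicit relationship in a template.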

Relationship Configuration

Configure relationships through templates with column and table mappings.

Template Relationships

Define relationships in templates with column and related table specification.

Column: user_id

Related Table: users

Type: belongs_to

Auto Discovery

Automatic relationship discovery during table analysis for common patterns.

Current Limitations

The relationship system currently supports basic foreign key relationships. Advanced features are not yet implemented:

Not Yet Supported:

• Complex many-to-many relationships
• Polymorphic relationships
• Dependency order resolution
• Cascade seeding across tables
• Custom relationship distributions

Workarounds:

• Seed parent tables first manually
• Use foreign_key field type
• Configure relationships in templates
• Preview data before insertion

Safety Features

• Foreign key constraint checking
• Transaction-based seeding
• Rollback on failure
• Orphaned record prevention

Faker Integration

Built-in Faker Support

Integrated Faker library provides realistic data generation for various field types.

Personal Data

• Names (first, last, full)
• Email addresses
• Phone numbers
• Addresses (street, city, state)

Data Types

• Text (words, sentences, paragraphs)
• Numbers (integers, decimals, ranges)
• Dates and times
• Boolean values with probability

Specialized

• UUIDs
• JSON structures
• Enum selections
• Custom patterns

Available Field Types

Each field type uses appropriate Faker methods for realistic data generation.

String Types

• company() - Company names
• sentence() - Sentence text
• paragraph() - Paragraph text
• slug() - URL-friendly slugs
• word() - Single words

Date/Time

• dateTimeBetween() - Date ranges
• Configurable min/max dates
• Custom format support

Numeric Types

• numberBetween() - Integer ranges
• randomFloat() - Decimal numbers
• boolean() - True/false values
• Configurable probability

Special Types

• uuid() - UUID generation
• randomElement() - Enum choices
• JSON structure generation

Configuration Options

Field types support various options for customizing generated data.

String Options

• length: Maximum string length
• string_type: company, sentence, paragraph, etc.
• pattern: Custom patterns with placeholders

Numeric Options

• min/max: Value ranges
• decimals: Decimal places for floats
• true_probability: Percentage chance that a boolean value is true
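
To show how the numeric options could shape the generated values, here is a plain-PHP sketch that uses only built-in random functions in place of Faker. The option names (min, max, decimals, true_probability) come from the list above; the function names and the default values are assumptions, and true_probability is treated as a percentage from 0 to 100.

```php
<?php
// Plain-PHP stand-ins for the numeric options above. The functions and their
// defaults are hypothetical; only the option names come from the docs.

function randomInteger(array $options): int
{
    return random_int($options['min'] ?? 0, $options['max'] ?? 100);
}

function randomDecimal(array $options): float
{
    $min = $options['min'] ?? 0;
    $max = $options['max'] ?? 100;
    $decimals = $options['decimals'] ?? 2;

    // Scale a random fraction in [0, 1] into [min, max], then round.
    $value = $min + (mt_rand() / mt_getrandmax()) * ($max - $min);

    return round($value, $decimals);
}

function randomBoolean(array $options): bool
{
    // true_probability = 0 never yields true; 100 always yields true.
    return random_int(1, 100) <= ($options['true_probability'] ?? 50);
}

echo randomInteger(['min' => 18, 'max' => 65]), PHP_EOL;
echo randomDecimal(['min' => 1.5, 'max' => 9.5, 'decimals' => 2]), PHP_EOL;
var_dump(randomBoolean(['true_probability' => 80]));
```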

Bulk Seeding Operations

Basic Bulk Generation

Generate multiple records efficiently using database bulk inserts.

Current Features:

• Configurable record count (1-10,000)
• Preview before generation
• Template-based generation
• Error handling and reporting
• Success/failure tracking

UI Features:

• Smart Data Seeder page
• Real-time data preview
• Confirmation modals
• Progress notifications
• Template management

Available Commands:

# Generate 1000 records
php artisan codeforge:generate-data users --count=1000

# Preview generated records without inserting
php artisan codeforge:generate-data products --count=100 --preview

Current Limitations

The bulk seeding system has some limitations that may be addressed in future versions:

Not Yet Implemented:

• Advanced batch processing with chunking
• Background queue processing
• Memory optimization for large datasets
• Progress bars for long operations
• Database connection pooling
• Parallel processing
• Transaction batching

Current Workarounds:

• Use smaller batch sizes (max 10,000)
• Run multiple smaller commands
• Monitor memory usage manually
• Use database tools for very large datasets
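
The "run multiple smaller commands" workaround can be scripted. The sketch below splits a target count into runs that stay within the 10,000-record limit and prints one command per run; splitIntoBatches() is a hypothetical helper, and the printed commands use only the documented --count option.

```php
<?php
// Hypothetical helper: splits a large record count into batch sizes that stay
// within the documented 10,000-record limit.
function splitIntoBatches(int $total, int $maxPerBatch = 10000): array
{
    if ($total < 0 || $maxPerBatch < 1) {
        throw new InvalidArgumentException('Total must be >= 0 and batch size >= 1.');
    }

    $batches = [];
    for ($remaining = $total; $remaining > 0; $remaining -= $maxPerBatch) {
        $batches[] = min($remaining, $maxPerBatch);
    }

    return $batches;
}

// Print one command per batch for a 25,000-row target.
foreach (splitIntoBatches(25000) as $count) {
    echo "php artisan codeforge:generate-data users --count={$count}", PHP_EOL;
}
```

For 25,000 rows this produces three runs of 10,000, 10,000, and 5,000 records.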

Progress Tracking

Monitor generation progress with real-time updates and estimated completion times.

Error Recovery

Automatic retry mechanisms and checkpoint recovery for failed operations.

Memory Management

Advanced memory management techniques to handle datasets larger than available RAM.

Configuration Example:

'performance' => [
    'chunk_size' => 1000,
    'memory_limit' => '512MB',
    'use_bulk_inserts' => true,
    'disable_model_events' => true,
    'connection_timeout' => 300,
]

Memory Optimization

• Configurable chunk sizes
• Memory usage monitoring
• Garbage collection triggers
• Resource cleanup

Database Optimization

• Connection pooling
• Query optimization
• Index management
• Transaction batching

Customization Options

Environment-Specific Configuration

Configure different seeding strategies per environment, such as local development, testing, and staging (production-like) setups.

Environment Configuration:

'environments' => [
    'local' => [
        'default_count' => 50,
        'enable_relationships' => true,
        'use_realistic_data' => true,
    ],
    'testing' => [
        'default_count' => 10,
        'enable_relationships' => false,
        'use_simple_data' => true,
    ],
    'staging' => [
        'default_count' => 1000,
        'enable_relationships' => true,
        'anonymize_data' => true,
    ]
]
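
As a rough illustration of how such a configuration might be consumed, the sketch below looks up the settings for a given environment name and falls back to conservative defaults for environments that are not listed. The resolveSeedingOptions() function and the fallback values are assumptions for this example; in a Laravel application the environment name would typically come from app()->environment().

```php
<?php
// The 'environments' array mirrors the configuration example above.
// resolveSeedingOptions() and its fallback defaults are hypothetical.
$environments = [
    'local' => [
        'default_count' => 50,
        'enable_relationships' => true,
        'use_realistic_data' => true,
    ],
    'testing' => [
        'default_count' => 10,
        'enable_relationships' => false,
        'use_simple_data' => true,
    ],
    'staging' => [
        'default_count' => 1000,
        'enable_relationships' => true,
        'anonymize_data' => true,
    ],
];

function resolveSeedingOptions(array $environments, string $environment): array
{
    // Conservative defaults for environments without an entry (assumption).
    $defaults = ['default_count' => 10, 'enable_relationships' => false];

    // Environment-specific keys override the defaults.
    return array_merge($defaults, $environments[$environment] ?? []);
}

print_r(resolveSeedingOptions($environments, 'staging'));
print_r(resolveSeedingOptions($environments, 'production'));
```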

Development

• Rich, realistic data
• Full relationships
• Moderate volumes
• Fast generation

Testing

• Minimal, consistent data
• Predictable patterns
• Fast execution
• Isolated datasets

Production-Like

• Large volumes
• Anonymized data
• Performance testing
• Realistic load

Custom Seeder Generation

Generate Laravel seeder classes with intelligent defaults and customizable options.

Seeder Class Generation

Create Laravel seeder classes with factory integration and custom logic.

php artisan db-manager:generate-seeder UserSeeder --model=User --count=100

Factory Integration

Seamlessly integrate with Laravel model factories and factory states for consistent data generation.

Safety Features

Built-in safety features like environment checks, transaction handling, and foreign key management.

Data Constraints & Validation

Apply business rules and constraints to ensure generated data meets your requirements.

Constraint Types:

• Unique value constraints
• Range and boundary limits
• Pattern matching rules
• Cross-field dependencies
• Business logic validation

Validation Features:

• Pre-generation validation
• Post-generation verification
• Constraint violation handling
• Data quality metrics
• Compliance reporting
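
Two of these ideas, unique value constraints and post-generation range checks, can be sketched in plain PHP. Both functions are hypothetical illustrations rather than part of the package, and generateUnique() assumes the generated values are strings or integers because it uses them as array keys.

```php
<?php
// Hypothetical sketch of a unique value constraint and a range verification.

// Calls $generator until it has produced $count distinct values or gives up.
function generateUnique(callable $generator, int $count, int $maxAttempts = 1000): array
{
    $seen = [];
    for ($attempt = 0; count($seen) < $count && $attempt < $maxAttempts; $attempt++) {
        $seen[$generator()] = true; // array keys deduplicate the values
    }

    if (count($seen) < $count) {
        throw new RuntimeException("Could not generate {$count} unique values.");
    }

    return array_keys($seen);
}

// Returns the indexes of rows whose $field lies outside [$min, $max].
function findRangeViolations(array $rows, string $field, $min, $max): array
{
    $violations = [];
    foreach ($rows as $index => $row) {
        if ($row[$field] < $min || $row[$field] > $max) {
            $violations[] = $index;
        }
    }

    return $violations;
}

$emails = generateUnique(fn () => 'user' . random_int(1, 500) . '@example.test', 20);
echo count($emails), ' unique emails generated', PHP_EOL;

$rows = [['price' => 5], ['price' => 150], ['price' => -1]];
print_r(findRangeViolations($rows, 'price', 0, 100));
```

Capping the number of attempts keeps generation from looping forever when the requested count exceeds the number of distinct values the generator can produce.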

Best Practices

Development Workflow

Recommended workflow for using the smart seeding features effectively.

Analyze Your Tables

Use the Smart Data Seeder page to analyze table structure and see suggested field mappings.

Preview Before Generating

Always use the preview feature to verify data looks correct before inserting records.

Save Useful Templates

Create and save templates for tables you generate data for frequently.

Use Command Line for Automation

Use the codeforge:generate-data command for scripting and automation needs.

Current Recommendations

Based on the current implementation, here are recommended practices.

For Small Datasets (< 1,000):

• Use the Smart Data Seeder UI
• Enable all field types and relationships
• Preview data for verification
• Save successful configurations as templates

For Larger Datasets (1,000+):

• Use command line for better performance
• Start with smaller counts for testing
• Monitor memory usage
• Consider running in smaller batches

Working with Relationships

Current best practices for handling related data.

Seed Parent Tables First

Since automatic dependency resolution isn't implemented, manually seed parent tables before child tables that reference them.
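
Working out a safe manual order is a small topological sort. The sketch below is a hypothetical helper, not a CodeForge feature: it takes a hand-written map of each table to the parent tables it references and returns an order in which every parent comes before its children.

```php
<?php
// Hypothetical helper: orders tables so that every parent comes before the
// tables that reference it. $dependencies maps table => list of parent tables.
function seedingOrder(array $dependencies): array
{
    $ordered = [];
    $state = []; // table => 'visiting' | 'done'

    $visit = function (string $table) use (&$visit, &$ordered, &$state, $dependencies): void {
        if (($state[$table] ?? null) === 'done') {
            return;
        }
        if (($state[$table] ?? null) === 'visiting') {
            throw new RuntimeException("Circular reference involving {$table}.");
        }

        $state[$table] = 'visiting';
        foreach ($dependencies[$table] ?? [] as $parent) {
            $visit($parent); // parents are placed before the current table
        }
        $state[$table] = 'done';
        $ordered[] = $table;
    };

    foreach (array_keys($dependencies) as $table) {
        $visit($table);
    }

    return $ordered;
}

$order = seedingOrder([
    'comments' => ['posts', 'users'],
    'posts'    => ['users'],
    'users'    => [],
]);
echo implode(' -> ', $order), PHP_EOL; // users -> posts -> comments
```

Each table in the resulting list can then be seeded in turn with codeforge:generate-data. Tables that reference themselves or each other are reported as circular and need to be handled by hand.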

Use Foreign Key Field Type

For foreign key columns, use the foreign_key field type to automatically select from existing records.

Configure Relationships in Templates

Define relationships explicitly in templates to ensure proper foreign key population.

Quick Start Guide

Getting Started:

1. Navigate to Smart Data Seeder page
2. Select a table from the dropdown
3. Set record count (start with 10)
4. Click "Generate Preview"
5. Review the preview data
6. Click "Generate & Insert Data"

Command Line:

1. php artisan codeforge:generate-data users --count=10 --preview
2. Review the preview output
3. php artisan codeforge:generate-data users --count=10
4. To save the configuration as a template, append --save-template=my_template to the command