Phase 5: Smart Seeding
An intelligent data seeding system that generates realistic test data from schema analysis, maintains basic referential integrity, and supports bulk operations with customizable templates.
Overview
Intelligent Analysis
Analyze your database schema and automatically generate contextually appropriate test data with realistic patterns.
- • Automatic field type detection based on column names
- • Smart data generation for emails, names, addresses
- • Data type-specific value generation
- • Custom field mapping suggestions
Basic Relationships
Handle basic foreign key relationships by discovering existing records in related tables.
- • Foreign key pattern detection (_id columns)
- • Automatic related table discovery
- • Random selection from existing records
- • Basic referential integrity maintenance
Available Smart Seeding Features
Table Analysis
Smart column detection and type analysis
Data Preview
Real-time preview before insertion
Template System
Reusable generation templates
Faker Integration
Built-in Faker library support
Seeding Templates
Template System
Create reusable data generation templates with field mappings and relationship configurations.
Auto Templates
Automatically generated from schema analysis
- • Field type detection based on names
- • Basic data type mapping
- • Foreign key detection (_id pattern)
- • Simple field suggestions
Manual Templates
Created and configured through Filament interface
- • Custom field mapping repeater
- • Relationship configuration
- • Constraint definitions
- • Reusable across tables
Available Commands:
# Generate data with auto template
php artisan codeforge:generate-data users --count=50
# Preview without inserting
php artisan codeforge:generate-data users --count=5 --preview
# Save as template
php artisan codeforge:generate-data users --count=10 --save-template=user_template
Template Configuration
Configure field mappings and data generation options through the Filament interface.
Field Type Support
Supported field types for data generation:
Smart Field Detection
Automatic field mapping based on column names:
- • email columns → email type
- • *_name columns → name type
- • phone* columns → phone type
- • *_at columns → datetime type
- • is_*, has_*, can_* columns → boolean type
- • *_id columns → foreign_key type
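The name-based mapping above can be pictured as an ordered list of patterns. The following is a minimal sketch rather than the package's actual code: the function name `detectFieldType`, the exact-match reading of the `email` rule, and the `string` fallback are all assumptions made for illustration.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: guess a field type from a column name using the
 * patterns listed above. The first matching rule wins.
 */
function detectFieldType(string $column): string
{
    // Pattern => field type, checked in order.
    $rules = [
        '/^email$/'        => 'email',       // assumed to be an exact match
        '/_name$/'         => 'name',
        '/^phone/'         => 'phone',
        '/_at$/'           => 'datetime',
        '/^(is|has|can)_/' => 'boolean',
        '/_id$/'           => 'foreign_key',
    ];

    foreach ($rules as $pattern => $type) {
        if (preg_match($pattern, $column) === 1) {
            return $type;
        }
    }

    return 'string'; // Assumed fallback; the documentation does not name one.
}

foreach (['email', 'first_name', 'phone_number', 'created_at', 'is_active', 'user_id', 'title'] as $column) {
    echo $column, ' => ', detectFieldType($column), PHP_EOL;
}
```

Because rules are checked in order, a column that could match two patterns gets the type of the earlier rule, so the ordering is itself a design decision.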
Field Options
Each field type supports configurable options through key-value pairs.
Template Management
Manage your data generation templates through the Filament admin interface.
Template Features
- • Name and description
- • Target table association
- • Field mapping repeater
- • Relationship configuration
- • Constraint definitions
- • Default record count
- • Active/inactive status
Template Actions
- • Preview generated data
- • Generate data with confirmation
- • Duplicate templates
- • Export/Import templates
- • Template usage statistics
Personal Data
- • first_name, last_name → Names
- • email → Valid email addresses
- • phone → Phone numbers
- • birth_date → Birth dates giving realistic ages
Address Fields
- • address → Street addresses
- • city → City names
- • postal_code → ZIP codes
- • country → Country codes
Business Data
- • company_name → Company names
- • price, cost → Monetary values
- • description → Lorem text
- • status → Enum values
Technical Fields
- • url, website → URLs
- • ip_address → IP addresses
- • uuid → UUIDs
- • slug → URL-friendly strings
Date & Time
- • created_at → Recent dates
- • updated_at → Update dates
- • published_at → Publication dates
- • expires_at → Future dates
Content Fields
- • title → Titles
- • content, body → Paragraphs
- • summary → Short text
- • tags → Tag lists
Relationship-Aware Generation
Basic Relationship Handling
Handle foreign key relationships by discovering and using existing records from related tables.
Current Support:
- Basic belongsTo relationships
- Foreign key pattern detection (_id)
- Random existing record selection
- Related table discovery
How It Works:
- • Detects columns ending with "_id"
- • Finds related table by removing "_id" and adding "s" (for example, user_id → users)
- • Selects random existing ID from related table
- • Maintains referential integrity
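The steps above can be sketched in a few lines of PHP 8. This is an illustration under stated assumptions, not the package's implementation: `relatedTableFor` and `pickForeignKey` are hypothetical names, and the `$existingIds` array stands in for a query against the related table.

```php
<?php
declare(strict_types=1);

/** Derive the related table name: strip "_id", then append "s". */
function relatedTableFor(string $column): ?string
{
    if (!str_ends_with($column, '_id')) {
        return null; // Not a foreign key under the naming rule.
    }

    return substr($column, 0, -3) . 's';
}

/**
 * Pick a random existing ID for a foreign key column.
 * $existingIds maps table name => list of IDs (a stand-in for a database lookup).
 */
function pickForeignKey(string $column, array $existingIds): ?int
{
    $table = relatedTableFor($column);
    if ($table === null || empty($existingIds[$table])) {
        return null; // No related table or no parent rows to reference yet.
    }

    $ids = $existingIds[$table];

    return $ids[array_rand($ids)];
}

$existingIds = ['users' => [3, 7, 9]];
echo relatedTableFor('user_id'), PHP_EOL;                        // users
var_dump(pickForeignKey('post_id', $existingIds));               // NULL, no posts seeded
```

A plain "add s" rule turns `category_id` into `categorys`, which is one reason explicit relationship configuration in templates is useful for irregular table names.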
Relationship Configuration
Configure relationships through templates with column and table mappings.
Template Relationships
Define relationships in templates with column and related table specification.
Auto Discovery
Automatic relationship discovery during table analysis for common patterns.
Current Limitations
The relationship system currently supports basic foreign key relationships. Advanced features are not yet implemented:
Not Yet Supported:
- • Complex many-to-many relationships
- • Polymorphic relationships
- • Dependency order resolution
- • Cascade seeding across tables
- • Custom relationship distributions
Workarounds:
- • Seed parent tables first manually
- • Use foreign_key field type
- • Configure relationships in templates
- • Preview data before insertion
Safety Features
- • Foreign key constraint checking
- • Transaction-based seeding
- • Rollback on failure
- • Orphaned record prevention
Faker Integration
Built-in Faker Support
Integrated Faker library provides realistic data generation for various field types.
Personal Data
- • Names (first, last, full)
- • Email addresses
- • Phone numbers
- • Addresses (street, city, state)
Data Types
- • Text (words, sentences, paragraphs)
- • Numbers (integers, decimals, ranges)
- • Dates and times
- • Boolean values with probability
Specialized
- • UUIDs
- • JSON structures
- • Enum selections
- • Custom patterns
Available Field Types
Each field type uses appropriate Faker methods for realistic data generation.
String Types
- • company() - Company names
- • sentence() - Sentence text
- • paragraph() - Paragraph text
- • slug() - URL-friendly slugs
- • word() - Single words
Date/Time
- • dateTimeBetween() - Date ranges
- • Configurable min/max dates
- • Custom format support
Numeric Types
- • numberBetween() - Integer ranges
- • randomFloat() - Decimal numbers
- • boolean() - True/false values
- • Configurable probability
Special Types
- • uuid() - UUID generation
- • randomElement() - Enum choices
- • JSON structure generation
Configuration Options
Field types support various options for customizing generated data.
String Options
- • length: Maximum string length
- • string_type: company, sentence, paragraph, etc.
- • pattern: Custom patterns with placeholders
Numeric Options
- • min/max: Value ranges
- • decimals: Decimal places for floats
- • true_probability: Boolean true percentage
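To make the numeric options concrete, here is a small PHP 8 sketch of how `min`, `max`, `decimals`, and `true_probability` might be applied. The function `generateValue`, its type names, and its default values are hypothetical, and `true_probability` is assumed to be a percentage from 0 to 100.

```php
<?php
declare(strict_types=1);

/** Hypothetical generator: builds one value from a field type and its options. */
function generateValue(string $type, array $options = []): int|float|bool
{
    switch ($type) {
        case 'integer':
            return random_int($options['min'] ?? 0, $options['max'] ?? 100);

        case 'float':
            $min = (float) ($options['min'] ?? 0);
            $max = (float) ($options['max'] ?? 100);
            // Scale a random fraction in [0, 1] into the [min, max] range.
            $value = $min + (mt_rand() / mt_getrandmax()) * ($max - $min);

            return round($value, $options['decimals'] ?? 2);

        case 'boolean':
            // Assumption: true_probability is a percentage (0-100).
            return random_int(1, 100) <= ($options['true_probability'] ?? 50);

        default:
            throw new InvalidArgumentException("Unsupported type: {$type}");
    }
}

var_dump(generateValue('integer', ['min' => 18, 'max' => 65]));
var_dump(generateValue('float', ['min' => 1, 'max' => 500, 'decimals' => 2]));
var_dump(generateValue('boolean', ['true_probability' => 80]));
```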
Bulk Seeding Operations
Basic Bulk Generation
Generate multiple records efficiently using database bulk inserts.
Current Features:
- • Configurable record count (1-10,000)
- • Preview before generation
- • Template-based generation
- • Error handling and reporting
- • Success/failure tracking
UI Features:
- • Smart Data Seeder page
- • Real-time data preview
- • Confirmation modals
- • Progress notifications
- • Template management
Available Commands:
# Generate 1000 records
php artisan codeforge:generate-data users --count=1000
# Preview 5 records first
php artisan codeforge:generate-data products --count=100 --preview
Current Limitations
The bulk seeding system has some limitations that may be addressed in future versions:
Not Yet Implemented:
- • Advanced batch processing with chunking
- • Background queue processing
- • Memory optimization for large datasets
- • Progress bars for long operations
- • Database connection pooling
- • Parallel processing
- • Transaction batching
Current Workarounds:
- • Use smaller batch sizes (max 10,000)
- • Run multiple smaller commands
- • Monitor memory usage manually
- • Use database tools for very large datasets
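The "run multiple smaller commands" workaround can be scripted. The sketch below only prints the commands rather than running them; `splitIntoBatches` is a hypothetical helper, and the 10,000 default mirrors the maximum record count stated above.

```php
<?php
declare(strict_types=1);

/** Split a total record count into batch sizes no larger than $maxBatch. */
function splitIntoBatches(int $total, int $maxBatch = 10000): array
{
    if ($total < 1 || $maxBatch < 1) {
        throw new InvalidArgumentException('Counts must be positive.');
    }

    // Full-size batches first, then one smaller batch for any remainder.
    $batches = array_fill(0, intdiv($total, $maxBatch), $maxBatch);
    if ($total % $maxBatch !== 0) {
        $batches[] = $total % $maxBatch;
    }

    return $batches;
}

// Print one command per batch for a 25,000-record run.
foreach (splitIntoBatches(25000) as $count) {
    echo "php artisan codeforge:generate-data users --count={$count}", PHP_EOL;
}
```

For 25,000 records this prints three commands with counts of 10000, 10000, and 5000.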
Progress Tracking
Monitor generation progress with real-time updates and estimated completion times.
Error Recovery
Automatic retry mechanisms and checkpoint recovery for failed operations.
Memory Management
Advanced memory management techniques to handle datasets larger than available RAM.
Configuration Example:
'performance' => [
    'chunk_size' => 1000,
    'memory_limit' => '512MB',
    'use_bulk_inserts' => true,
    'disable_model_events' => true,
    'connection_timeout' => 300,
]
Memory Optimization
- • Configurable chunk sizes
- • Memory usage monitoring
- • Garbage collection triggers
- • Resource cleanup
Database Optimization
- • Connection pooling
- • Query optimization
- • Index management
- • Transaction batching
Customization Options
Environment-Specific Configuration
Configure different seeding strategies for development, testing, staging, and production environments.
Environment Configuration:
'environments' => [
    'local' => [
        'default_count' => 50,
        'enable_relationships' => true,
        'use_realistic_data' => true,
    ],
    'testing' => [
        'default_count' => 10,
        'enable_relationships' => false,
        'use_simple_data' => true,
    ],
    'staging' => [
        'default_count' => 1000,
        'enable_relationships' => true,
        'anonymize_data' => true,
    ],
]
Development
- • Rich, realistic data
- • Full relationships
- • Moderate volumes
- • Fast generation
Testing
- • Minimal, consistent data
- • Predictable patterns
- • Fast execution
- • Isolated datasets
Production-Like
- • Large volumes
- • Anonymized data
- • Performance testing
- • Realistic load
Custom Seeder Generation
Generate Laravel seeder classes with intelligent defaults and customizable options.
Seeder Class Generation
Create Laravel seeder classes with factory integration and custom logic.
php artisan db-manager:generate-seeder UserSeeder --model=User --count=100
Factory Integration
Seamlessly integrate with Laravel model factories and factory states for consistent data generation.
Safety Features
Built-in safety features like environment checks, transaction handling, and foreign key management.
Data Constraints & Validation
Apply business rules and constraints to ensure generated data meets your requirements.
Constraint Types:
- Unique value constraints
- Range and boundary limits
- Pattern matching rules
- Cross-field dependencies
- Business logic validation
Validation Features:
- Pre-generation validation
- Post-generation verification
- Constraint violation handling
- Data quality metrics
- Compliance reporting
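As one illustration of a unique value constraint, generated values can be deduplicated with a bounded number of retries. This is a generic sketch and not the package's mechanism; `generateUnique`, its parameters, and the `SKU-` pattern are made up for the example.

```php
<?php
declare(strict_types=1);

/**
 * Generate $count unique values by calling $generator and discarding repeats.
 * Gives up after $maxAttempts calls so an impossible request cannot loop forever.
 */
function generateUnique(callable $generator, int $count, int $maxAttempts = 1000): array
{
    $seen = [];
    for ($attempt = 0; $attempt < $maxAttempts && count($seen) < $count; $attempt++) {
        $seen[$generator()] = true; // Array keys deduplicate the values.
    }

    if (count($seen) < $count) {
        throw new RuntimeException(
            'Only ' . count($seen) . " unique values after {$maxAttempts} attempts"
        );
    }

    return array_keys($seen);
}

// 20 distinct product codes drawn from 900 possibilities.
$codes = generateUnique(fn (): string => 'SKU-' . random_int(100, 999), 20);
echo implode(', ', $codes), PHP_EOL;
```

The attempt limit turns an unsatisfiable constraint, such as asking for more unique values than the pattern can produce, into an explicit error instead of an endless loop.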
Best Practices
Development Workflow
Recommended workflow for using the smart seeding features effectively.
Analyze Your Tables
Use the Smart Data Seeder page to analyze table structure and see suggested field mappings.
Preview Before Generating
Always use the preview feature to verify data looks correct before inserting records.
Save Useful Templates
Create and save templates for tables you generate data for frequently.
Use Command Line for Automation
Use the codeforge:generate-data command for scripting and automation needs.
Current Recommendations
Based on the current implementation, here are recommended practices.
For Small Datasets (< 1,000):
- • Use the Smart Data Seeder UI
- • Enable all field types and relationships
- • Preview data for verification
- • Save successful configurations as templates
For Larger Datasets (1,000+):
- • Use command line for better performance
- • Start with smaller counts for testing
- • Monitor memory usage
- • Consider running in smaller batches
Working with Relationships
Current best practices for handling related data.
Seed Parent Tables First
Since automatic dependency resolution isn't implemented, manually seed parent tables before child tables that reference them.
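When many tables are involved, the manual order can be worked out with a small depth-first topological sort. This is a helper sketch and not a feature of the package: `seedingOrder` and the example dependency map are hypothetical, and the map of which table references which must be written by hand.

```php
<?php
declare(strict_types=1);

/**
 * Order tables so that every parent comes before its children.
 * $dependsOn maps a table to the tables its foreign keys point at.
 */
function seedingOrder(array $dependsOn): array
{
    $ordered = [];
    $state = []; // table => 'visiting' | 'done'

    $visit = function (string $table) use (&$visit, &$ordered, &$state, $dependsOn): void {
        if (($state[$table] ?? null) === 'done') {
            return;
        }
        if (($state[$table] ?? null) === 'visiting') {
            throw new RuntimeException("Circular dependency at {$table}");
        }

        $state[$table] = 'visiting';
        foreach ($dependsOn[$table] ?? [] as $parent) {
            $visit($parent); // Parents are placed before the current table.
        }
        $state[$table] = 'done';
        $ordered[] = $table;
    };

    foreach (array_keys($dependsOn) as $table) {
        $visit($table);
    }

    return $ordered;
}

$order = seedingOrder([
    'comments' => ['posts', 'users'],
    'posts'    => ['users'],
    'users'    => [],
]);

echo implode(' -> ', $order), PHP_EOL; // users -> posts -> comments
```

Running `codeforge:generate-data` once per table in the printed order ensures that parent rows exist before child rows reference them.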
Use Foreign Key Field Type
For foreign key columns, use the foreign_key field type to automatically select from existing records.
Configure Relationships in Templates
Define relationships explicitly in templates to ensure proper foreign key population.
Quick Start Guide
Getting Started:
- 1. Navigate to Smart Data Seeder page
- 2. Select a table from the dropdown
- 3. Set record count (start with 10)
- 4. Click "Generate Preview"
- 5. Review the preview data
- 6. Click "Generate & Insert Data"
Command Line:
- 1. php artisan codeforge:generate-data users --count=10 --preview
- 2. Review the preview output
- 3. php artisan codeforge:generate-data users --count=10
- 4. To save the configuration as a template, add --save-template=my_template to the command