ClickHouse - Data Warehouse
Hosted ClickHouse service for storing and processing analytical data from both Pixlr and Designs.ai.
Overview
We use a hosted ClickHouse service to avoid infrastructure management overhead while maintaining high performance for analytical queries.
Why Hosted?
- Limited Headcount: The team does not have the capacity to operate ClickHouse infrastructure itself
- Auto-scaling: Automatically scales up and down based on usage
- Cost Optimization: The service can be shut down when idle to save costs
- Managed Service: Updates, backups, and monitoring handled by provider
Data Sources
Raw Data Tables
- Pixlr Data: Users, sessions, images, subscriptions, events
- Designs.ai Data: Users, projects, subscriptions, usage logs, billing
- Ingestion: Loaded by Airbyte (self-hosted on AWS)
Processed Data Tables
- Intermediate Tables: Created by the SQLMesh CI/CD pipeline
- Aggregated Tables: Pre-computed metrics for fast queries
- Data Models: Business logic and transformations
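To make the idea of a pre-computed aggregate concrete, here is a minimal sketch in plain Python of the kind of rollup an aggregated table holds: one row per day and product instead of one row per event. The event rows, field names, and the `daily_active_users` helper are all hypothetical; in practice the transformation would live in a SQLMesh model and run inside ClickHouse.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical raw event rows; the field names are assumptions for illustration.
raw_events = [
    {"product": "pixlr", "user_id": 1, "ts": "2024-05-01T09:15:00"},
    {"product": "pixlr", "user_id": 1, "ts": "2024-05-01T17:40:00"},
    {"product": "pixlr", "user_id": 2, "ts": "2024-05-01T11:02:00"},
    {"product": "designs.ai", "user_id": 7, "ts": "2024-05-02T08:30:00"},
]


def daily_active_users(events):
    """Collapse raw events into one row per (day, product) with a distinct-user count."""
    users = defaultdict(set)
    for event in events:
        day = datetime.fromisoformat(event["ts"]).date().isoformat()
        users[(day, event["product"])].add(event["user_id"])
    return {key: len(ids) for key, ids in sorted(users.items())}


aggregated = daily_active_users(raw_events)
```

A dashboard that reads `aggregated` touches one row per day and product, which is why queries against aggregated tables stay fast as the raw tables grow.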
Integration Points
Pixlr Admin Area
- Status: The existing admin area connects directly to ClickHouse
- Method: ClickHouse client library (SDK) or HTTP interface
- Purpose: Real-time analytics and reporting
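As a rough illustration of the HTTP option, the sketch below builds a request for the ClickHouse HTTP interface using only the Python standard library. The host, user, password, and table name are placeholders rather than the real values, and the sketch assumes the hosted service exposes the standard HTTPS endpoint; the actual admin area may use a client library instead.

```python
import urllib.parse
import urllib.request


def build_query_request(base_url, user, password, sql):
    """Build (but do not send) a POST request for the ClickHouse HTTP interface."""
    # default_format asks the server to return one JSON object per result row.
    params = urllib.parse.urlencode({"default_format": "JSONEachRow"})
    return urllib.request.Request(
        url=f"{base_url}/?{params}",
        data=sql.encode("utf-8"),  # the SQL text travels in the request body
        headers={
            "X-ClickHouse-User": user,
            "X-ClickHouse-Key": password,
        },
        method="POST",
    )


request = build_query_request(
    "https://clickhouse.example.com:8443",  # placeholder host, not the real endpoint
    "admin_area_reader",  # hypothetical user name
    "change-me",  # placeholder; load real credentials from a secret store
    "SELECT count() FROM analytics.pixlr_sessions",  # hypothetical table
)
# To actually run the query: urllib.request.urlopen(request).read()
```

Keeping request construction separate from sending makes the function easy to test without network access.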
Designs.ai Admin Area
- Status: Planned for future implementation
- Method: ClickHouse client library (SDK) or HTTP interface
- Purpose: Analytics capabilities similar to those of the Pixlr admin area
BI Team (Caleb)
- Tools: Various BI tools connect to ClickHouse
- Access: Read-only access for reporting and analysis
- Purpose: Business intelligence and data analysis
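One possible way to enforce read-only access is a ClickHouse role that holds only `SELECT` privileges. The sketch below generates the corresponding SQL statements; the role name `bi_readonly` and the database names are hypothetical, and the hosted provider may expose its own console or API for managing access instead.

```python
def read_only_grants(role, databases):
    """Return ClickHouse SQL that creates a role limited to SELECT on the given databases."""
    statements = [f"CREATE ROLE IF NOT EXISTS {role}"]
    for database in databases:
        # SELECT only: no INSERT, ALTER, or DROP privileges are granted.
        statements.append(f"GRANT SELECT ON {database}.* TO {role}")
    return statements


# Hypothetical role and database names, for illustration only.
bi_statements = read_only_grants("bi_readonly", ["pixlr", "designs_ai"])
```

Granting the role to each BI user, rather than granting privileges to users one by one, keeps access reviewable in a single place.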
Key Benefits
- No Infrastructure Management: Focus on data, not servers
- Automatic Scaling: Handles varying workloads efficiently
- Cost-Effective: Pay only for what you use
- High Performance: Optimized for analytical queries
- Reliability: Managed service with high availability