Posts
All the articles I've posted.
- 8 MIN READ•Apr 29, 2026
Hash, Sort-Merge, Broadcast: How Distributed Joins Work
Distributed joins move data across the network using shuffle, broadcast, or co-location strategies. Here is how each works and when engines choose which.
distributed join algorithms, shuffle join, broadcast join
- 7 MIN READ•Apr 29, 2026
When Catalogs Are Embedded in Storage
S3 Tables and MinIO AI Stor embed the Iceberg catalog directly in the storage layer. Here is when embedded catalogs make sense and when they do not.
embedded Iceberg catalog, S3 Tables, MinIO AI Stor
- 7 MIN READ•Apr 29, 2026
Partitioning, Sharding, and Data Distribution Strategies
Hash partitioning distributes data evenly. Range partitioning enables fast range scans. Both create tradeoffs. Here is how databases divide data across storage and nodes.
data partitioning, database sharding, partition pruning