Infrastructure
Blog Posts
PostgreSQL Internals Series, Part 6: Write-Ahead Logging & Replication
How PostgreSQL guarantees durability with WAL, recovers from crashes, and replicates to standbys using streaming replication and logical replication slots.
Kubernetes Series, Part 0: Overview
What Kubernetes is, what problems it solves compared with bare metal and Docker, and a roadmap for running data workloads on K8s.
Kubernetes Series, Part 1: Core Concepts
Pods, Deployments, Services, ConfigMaps, and Namespaces — the essential vocabulary every K8s user must know.
Kubernetes Series, Part 2: Storage and Configuration
PersistentVolumes, PersistentVolumeClaims, StorageClasses, Secrets, and ConfigMaps — how stateful data workloads survive pod restarts.
Kubernetes Series, Part 3: Workload Patterns for Data Engineering
StatefulSets, Jobs, CronJobs, and DaemonSets — the right workload type for each data engineering use case.
Kubernetes Series, Part 4: Running Spark on Kubernetes
Submitting Spark jobs natively to K8s, the Spark Operator, executor resource sizing, and shuffle storage.
Kubernetes Series, Part 5: Running Flink and Kafka on Kubernetes
Deploying Flink with the Flink Kubernetes Operator and Kafka with Strimzi — the streaming stack on K8s.
Kubernetes Series, Part 6: Production Operations
Resource quotas, autoscaling (HPA/KEDA), monitoring with Prometheus and Grafana, and cluster cost management for data platforms.