Posts

From Kappa Architecture to Streamhouse: Making the Lakehouse Real-Time

Tech Radar Data 2024 by Theodo.com

Uber Data Infra Scale Numbers

Monitoring Airflow with Prometheus/Grafana

PySpark Kafka Stream: How to test it?

Spark partitioning vs bucketing: partitionBy vs bucketBy

Spark partitioning

Avro vs Parquet overview

Deploy serverless Spark jobs to AWS using GitHub Actions

Data Modeling - Why Data Engineers Need To Understand It - An Introduction To Data Engineering

Migrate a Parquet data lake to Delta Lake

Data architecture book review: Deciphering Data Architectures