Spark shuffle storage options

 

 


Apache Spark executors need storage space for several operations, most notably for the shuffle data produced by wide operations such as sorting, grouping, and aggregation. Wide operations are transformations that combine data from different partitions, which usually means moving data across the cluster. During the map phase, executors write their output to shuffle storage, and reducers then read it in the following stage.
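To make the write-then-read flow concrete, here is a minimal sketch in plain Python. It is a toy model and not Spark's actual implementation or file format: the function names (`map_phase`, `reduce_phase`, `word_count`), the `shuffle_<map>_<reducer>.json` file naming, and the use of a temporary directory as "shuffle storage" are all assumptions made for illustration.

```python
import json
import tempfile
import zlib
from collections import defaultdict
from pathlib import Path


def map_phase(map_id, records, num_reducers, shuffle_dir):
    """Hypothetical map task: write one shuffle file per reducer."""
    buckets = defaultdict(list)
    for key, value in records:
        # Hash partitioning: a given key always maps to the same reducer.
        reducer_id = zlib.crc32(key.encode("utf-8")) % num_reducers
        buckets[reducer_id].append((key, value))
    for reducer_id in range(num_reducers):
        path = shuffle_dir / f"shuffle_{map_id}_{reducer_id}.json"
        # An empty list is written when no keys hash to this reducer.
        path.write_text(json.dumps(buckets[reducer_id]))


def reduce_phase(reducer_id, num_maps, shuffle_dir):
    """Hypothetical reduce task: read this reducer's file from every map task."""
    totals = defaultdict(int)
    for map_id in range(num_maps):
        path = shuffle_dir / f"shuffle_{map_id}_{reducer_id}.json"
        for key, value in json.loads(path.read_text()):
            totals[key] += value
    return dict(totals)


def word_count(partitions, num_reducers=2):
    """Run the map phase for every partition, then the reduce phase."""
    with tempfile.TemporaryDirectory() as tmp:
        shuffle_dir = Path(tmp)  # stands in for executor shuffle storage
        for map_id, records in enumerate(partitions):
            map_phase(map_id, records, num_reducers, shuffle_dir)
        result = {}
        for reducer_id in range(num_reducers):
            result.update(reduce_phase(reducer_id, len(partitions), shuffle_dir))
        return result


if __name__ == "__main__":
    partitions = [[("a", 1), ("b", 1)], [("a", 1), ("c", 1)]]
    print(word_count(partitions))
```

In this sketch every map task writes `num_reducers` files and every reducer reads one file per map task, so the space and I/O demands on shuffle storage grow with both counts. That is why the choice of storage backing this directory matters for wide operations.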

 https://iomete.com/resources/k8s/spark-executor-shuffle-storage-options

 

Credit: https://iomete.com/
