Spark shuffle storage options

Apache Spark executors need local storage for several operations, most notably for the shuffle data produced by wide operations such as sorting, grouping, and aggregation. Wide operations are transformations that combine data from different partitions, which usually means moving data across the cluster. During the map phase, executors write data to shuffle storage, and reducers then read it.
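To make the map-write and reduce-read pattern concrete, here is a minimal sketch in plain Python. It is not Spark's implementation: the function names (`map_phase`, `reduce_phase`, `run_shuffle`), the JSON file layout, and the toy partitioner are all assumptions made for illustration. It only shows why a grouped aggregation needs scratch storage between the two phases.

```python
import json
import tempfile
from collections import defaultdict
from pathlib import Path


def map_phase(partition_id, records, num_reducers, shuffle_dir):
    """Write one shuffle file per target reducer for a single input partition."""
    buckets = defaultdict(list)
    for key, value in records:
        # Toy deterministic partitioner (assumption): non-empty string keys,
        # routed by the code point of their first character.
        buckets[ord(key[0]) % num_reducers].append([key, value])
    for reducer_id, rows in buckets.items():
        # Hypothetical file naming, chosen only for this sketch.
        path = Path(shuffle_dir) / f"map{partition_id}_reduce{reducer_id}.json"
        path.write_text(json.dumps(rows))


def reduce_phase(reducer_id, shuffle_dir):
    """Read every map output addressed to this reducer and sum values per key."""
    totals = defaultdict(int)
    pattern = f"map*_reduce{reducer_id}.json"
    for path in sorted(Path(shuffle_dir).glob(pattern)):
        for key, value in json.loads(path.read_text()):
            totals[key] += value
    return dict(totals)


def run_shuffle(partitions, num_reducers):
    """Run all map tasks, then all reduce tasks, over a temporary shuffle directory."""
    with tempfile.TemporaryDirectory() as shuffle_dir:
        for partition_id, records in enumerate(partitions):
            map_phase(partition_id, records, num_reducers, shuffle_dir)
        result = {}
        for reducer_id in range(num_reducers):
            result.update(reduce_phase(reducer_id, shuffle_dir))
        return result


if __name__ == "__main__":
    partitions = [[("a", 1), ("b", 2)], [("a", 3), ("c", 4)]]
    print(run_shuffle(partitions, num_reducers=2))
```

The key `"a"` appears in both input partitions, so its values can only be combined after both map tasks have written their output. The temporary directory stands in for the executor's shuffle storage.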

https://iomete.com/resources/k8s/spark-executor-shuffle-storage-options

Credit: https://iomete.com/

La donnée intelligente
