A Simple Method to Choose the Number of Partitions in Spark
By the end of this article, you will be able to analyze your Spark job and tell whether your configuration settings suit your Spark environment and whether you are using all of your resources.
Whenever you work on a Spark job, you should keep two goals in mind:
- Avoid spill, which happens when a task's data does not fit in memory and Spark has to write it to disk.
- Maximize parallelism by using all the available cores.
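
One way to combine these two goals is sketched below. This is only an illustration under stated assumptions, not a definitive rule: the function name `suggest_partition_count` is hypothetical, and the default target of 200 MiB per partition is an assumed starting point that you should tune for your own cluster. The idea is to pick enough partitions to keep each one small enough to avoid spill, then round up to a multiple of the core count so that no cores sit idle in the last wave of tasks.

```python
def suggest_partition_count(shuffle_input_bytes: int,
                            total_cores: int,
                            target_partition_bytes: int = 200 * 1024**2) -> int:
    """Suggest a partition count (hypothetical helper, not a Spark API).

    target_partition_bytes is an ASSUMED per-partition size that is
    small enough to stay in memory; adjust it for your executors.
    """
    if shuffle_input_bytes <= 0 or total_cores <= 0 or target_partition_bytes <= 0:
        raise ValueError("all arguments must be positive")

    # Goal 1 (avoid spill): enough partitions that each one is at most
    # the target size. -(-a // b) is integer ceiling division.
    by_size = -(-shuffle_input_bytes // target_partition_bytes)

    # Goal 2 (use all cores): round up to a whole number of "waves",
    # where each wave runs one task per core.
    waves = max(1, -(-by_size // total_cores))
    return waves * total_cores


# Example: 54 GiB of shuffle data on a cluster with 96 cores.
print(suggest_partition_count(54 * 1024**3, 96))
```

The result could then be used as the value of `spark.sql.shuffle.partitions`, or passed to `repartition`, and checked against the spill metrics in the Spark UI.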
Credit: Tharun Kumar Sekar
Photo by Fran Couto