Apache Spark: The Flavors of CI/CD
Credits: Kesav Thakur
Understand what happens under the hood: what are the options for achieving continuous delivery of a Spark-based ETL job?
In this article, I will do my best to cover two topics from every if/else perspective:
The first is, of course, Apache Spark (Java, Scala, PySpark, sparklyr), whether on EMR or Databricks.
The second is Continuous Integration and Delivery: the pipeline options available using Jenkins jobs, Docker/Kubernetes, and Airflow with EMR/Databricks.
If you are still reading, thanks for your interest; I will try to make this comprehensive for you. Let's get started with a design into which I will fit a Spark-based ETL job that needs a CI/CD pipeline.
Read more: Apache Spark : CI/CD