Tag: SparkRDD


RDD transformations are Spark operations that, when executed on an RDD, produce one or more new RDDs. Because RDDs are immutable, a transformation always creates a new RDD rather than updating an existing one, and this chain of derived RDDs forms the RDD lineage. RDD Read more…
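
To make the idea of immutability and lineage concrete, here is a minimal pure-Python sketch. It is a toy model, not Spark's API: the `ToyRDD` class and its `lineage` method are hypothetical names invented for illustration, and unlike real Spark transformations (which are lazy) this sketch computes results eagerly.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass(frozen=True)
class ToyRDD:
    """Hypothetical stand-in for an RDD; not part of Spark."""
    data: Tuple
    op: str = "source"
    parent: Optional["ToyRDD"] = None

    def map(self, f: Callable) -> "ToyRDD":
        # Build a new ToyRDD; the current one is left untouched.
        return ToyRDD(tuple(f(x) for x in self.data), "map", self)

    def filter(self, pred: Callable) -> "ToyRDD":
        # Again a new object, with a link back to its parent.
        return ToyRDD(tuple(x for x in self.data if pred(x)), "filter", self)

    def lineage(self) -> list:
        # Walk the parent links back to the source, oldest step first.
        node, ops = self, []
        while node is not None:
            ops.append(node.op)
            node = node.parent
        return ops[::-1]


numbers = ToyRDD((1, 2, 3, 4))
doubled = numbers.map(lambda x: x * 2)
large = doubled.filter(lambda x: x > 4)

print(numbers.data)     # (1, 2, 3, 4), the original is unchanged
print(large.data)       # (6, 8)
print(large.lineage())  # ['source', 'map', 'filter']
```

Each call returns a fresh object that remembers its parent, which is roughly how a lineage lets the result be traced back to its source.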


In this post we will learn about sparkContext.parallelize and see how to use it to create a Spark RDD. A Resilient Distributed Dataset (RDD) is the fundamental data structure of Spark: an immutable, distributed collection of objects. Each dataset in an RDD is divided into logical Read more…
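
As a rough illustration of what turning a local collection into a distributed one involves, the sketch below cuts a list into contiguous slices in plain Python. The helper name `split_into_partitions` is hypothetical and the slicing rule is an assumption chosen for simplicity; it is not Spark's implementation. In PySpark itself the corresponding call is `sc.parallelize(data, numSlices)` on an existing `SparkContext`.

```python
from typing import List, Sequence


def split_into_partitions(items: Sequence, num_partitions: int) -> List[list]:
    """Hypothetical helper: cut items into num_partitions contiguous slices."""
    if num_partitions < 1:
        raise ValueError("num_partitions must be at least 1")
    n = len(items)
    # Integer arithmetic keeps the slice sizes within one element of each other.
    return [
        list(items[i * n // num_partitions:(i + 1) * n // num_partitions])
        for i in range(num_partitions)
    ]


partitions = split_into_partitions(list(range(1, 11)), 3)
print(partitions)  # [[1, 2, 3], [4, 5, 6], [7, 8, 9, 10]]
```

Every element lands in exactly one slice and the original order is preserved, so concatenating the slices gives back the input list.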