Spark Download and Test


Prerequisite

Step 1:

Download Spark from the following URL:

https://spark.apache.org/downloads.html

Here we need to select the Spark version (the version we are going to work with) and the package type (the build that is compatible with our Hadoop version).

Click on the Download Spark hyperlink. You will then be redirected to a URL like the following (here for Spark 2.4.6 pre-built for Hadoop 2.7):

https://www.apache.org/dyn/closer.lua/spark/spark-2.4.6/spark-2.4.6-bin-hadoop2.7.tgz

That page lists several download URLs. Select the first one. If it does not work (the file cannot be downloaded), you can use any other link provided on that page.
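The same download can be sketched from the terminal. This is only a rough example: the variable names and the download_spark function are made up for illustration, and it assumes the release is also available from the Apache archive host (archive.apache.org), which is not one of the mirror links mentioned above.

```shell
# Hypothetical variable names; the values match the example URL from Step 1.
SPARK_VERSION="2.4.6"
HADOOP_PACKAGE="hadoop2.7"
SPARK_ARCHIVE="spark-${SPARK_VERSION}-bin-${HADOOP_PACKAGE}.tgz"

# Assumption: the Apache archive host keeps a copy of the release under this path.
SPARK_URL="https://archive.apache.org/dist/spark/spark-${SPARK_VERSION}/${SPARK_ARCHIVE}"

# Download the archive into the given folder (default: ~/Downloads).
# wget -c resumes a partial download, -P sets the target folder.
download_spark() {
    target_dir="${1:-$HOME/Downloads}"
    mkdir -p "$target_dir"
    wget -c -P "$target_dir" "$SPARK_URL"
}

# Show what would be downloaded; nothing is fetched until download_spark is called.
echo "Archive: $SPARK_ARCHIVE"
echo "URL:     $SPARK_URL"
```

Calling download_spark with no argument would place the file in the Downloads folder, which is where Step 2 expects it.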

Step 2:

Normally, downloaded files are saved in the Downloads folder in Ubuntu.

Go to that folder and type the following:

tar -xvf spark-2.4.6-bin-hadoop2.7.tgz (here, the name of the downloaded .tgz file)

Now we have the Spark software on our system.
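The extraction step can also be wrapped in a small helper that checks the archive exists before unpacking it. This is a minimal sketch; extract_spark is a hypothetical name, not something that ships with Spark.

```shell
# Hypothetical helper: unpack a Spark archive into a destination folder.
# $1 = path to the downloaded .tgz file, $2 = destination (default: current folder)
extract_spark() {
    archive="$1"
    dest="${2:-.}"
    # Stop early with an error if the archive is not where we expect it.
    if [ ! -f "$archive" ]; then
        echo "archive not found: $archive" >&2
        return 1
    fi
    # -x extract, -v list files while extracting, -f archive file, -C target folder
    tar -xvf "$archive" -C "$dest"
}
```

For example, extract_spark ~/Downloads/spark-2.4.6-bin-hadoop2.7.tgz ~/Downloads should leave a spark-2.4.6-bin-hadoop2.7 folder next to the archive.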

Step 3: Test

Now go to the extracted Spark folder and type the following in the terminal:

bin/spark-shell

The above command starts the Spark shell in the terminal.

When the shell is running, two objects are available. The first is the SparkContext, available as sc, which is used to create RDDs (the core abstraction in Spark).

The second is the SparkSession, available as spark, which is used to run SQL-style queries in Spark.
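To check that both sc and spark work, a few Scala lines can be fed to the shell on standard input. This is a sketch under a few assumptions: the scratch file name is hypothetical, the Scala lines are just one simple way to exercise the two objects, and the script has to be run from the extracted Spark folder.

```shell
# Hypothetical scratch file holding the Scala lines for the Spark shell.
SMOKE_FILE="${TMPDIR:-/tmp}/spark_smoke.scala"

cat > "$SMOKE_FILE" <<'EOF'
// sc: create an RDD from a local range and add up its elements
val rdd = sc.parallelize(1 to 10)
println("rdd total = " + rdd.reduce(_ + _))
// spark: run a SQL query through the SparkSession
spark.sql("SELECT 1 + 1 AS two").show()
EOF

# Feed the file to spark-shell; the shell exits when the input ends.
# The run is skipped if we are not inside the extracted Spark folder.
if [ -x bin/spark-shell ]; then
    bin/spark-shell < "$SMOKE_FILE"
else
    echo "bin/spark-shell not found; run this from the extracted Spark folder" >&2
fi
```

The same Scala lines can also be typed one by one at the interactive prompt, which is the easier way to experiment.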

Step 4: Quit

To exit the Spark shell, type :q or press Ctrl+D.
