This post explains how to set up and run Spark jobs on a Hadoop YARN cluster, and runs an example Spark job on YARN. Prerequisites: If you don’t have Hadoop and YARN installed, please install and set up a Hadoop cluster and set up YARN on Read more…


This post explains how to set up the YARN master on a Hadoop 3.1 cluster and run a MapReduce program. Before you proceed with this document, please make sure you have a Hadoop 3.1 cluster up and running. If you do not have a setup, Read more…


This document explains, step by step, how to install Apache Hadoop (version 3.1.1) as a cluster with one master node (namenode) and 3 worker nodes (datanodes) on Ubuntu. Below are the 4 nodes and their IP addresses that I will be referring to here. 192.168.1.100    Read more…


In this post, we explain SparkSession. Since Spark 2.0, SparkSession has been the entry point to Spark programming with RDD, DataFrame, and Dataset. Prior to 2.0, SparkContext was the entry point. Here, I will Read more…


Spark's default language is Scala. SparkContext (JavaSparkContext for Java) is the entry point to Spark and PySpark for programming with RDDs and for connecting to a Spark cluster. In this article, you will learn how to create it using examples. What Read more…


In this blog, we discuss how to install Oozie on a Hadoop 2.x cluster. First, we need to download the oozie-4.1.0 tar file from the link below: Oozie-4.1.0 tar file. By default, it will be downloaded to the Downloads folder. We need to Read more…


You can use the following method: copy the datanode clusterID (in this example, CID-8bf63244-0510-4db6-a949-8f74b50f2be9) and run the following command under the HADOOP_HOME/bin directory; this formats the namenode with the datanode's cluster ID. Or you can do as follows: bin/stop-all.sh Read more…


nagaraju@nagaraju:/usr/local/softwares/hadoop-2.10.0$ sbin/stop-all.sh
This script is Deprecated. Instead use stop-dfs.sh and stop-yarn.sh
Stopping namenodes on [localhost]
localhost: chown: changing ownership of ‘/usr/local/softwares/hadoop-2.10.0/logs’: Operation not permitted
localhost: no namenode to stop
localhost: chown: changing ownership of ‘/usr/local/softwares/hadoop-2.10.0/logs’: Operation not permitted
localhost: no Read more…


Solr -> solr-7.0.7 works with Java 1.8 only … I have tested it, and everything is working fine.