Let us now learn how to build a Hadoop application using SBT. SBT is an open-source build tool for Scala and Java projects, similar to Java’s Maven or Ant. Its main features are native support for compiling Scala code and integration with many Scala test frameworks.
Installing SBT on Linux
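To install SBT on Debian-based Linux systems, use the following commands. The repositories used in this section were hosted on Bintray when this was written; Bintray has since been shut down, so check the SBT documentation for the current repository location if the download fails.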
echo "deb https://dl.bintray.com/sbt/debian /" | sudo tee -a /etc/apt/sources.list.d/sbt.list
sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv 2EE0EA64E40A89B84B2DF73499E82A75642AC823
sudo apt-get update
sudo apt-get install sbt
To install SBT on Linux RPM-based distributions, please use the following commands:
curl https://bintray.com/sbt/rpm/rpm | sudo tee /etc/yum.repos.d/bintray-sbt-rpm.repo
sudo yum install sbt
If you want to use SBT with Eclipse, you need the SBT Eclipse plug-in (sbteclipse). The steps to set it up are as follows.
After installing SBT, move into the .sbt directory, which is in your home directory by default, for example with cd ~/.sbt. (On SBT 0.13 and later, global plug-ins are read from a version-specific subdirectory such as ~/.sbt/0.13/plugins, so work inside ~/.sbt/0.13 in that case.)
Here you need to create a directory for SBT plug-ins. Create it using the following command:
mkdir plugins
Inside the plugins directory, you need to create a file called plugins.sbt. Create and open the file using the following command:
gedit plugins/plugins.sbt
Here in this file add the following lines:
addSbtPlugin("com.typesafe.sbteclipse" % "sbteclipse-plugin" % "4.0.0")

addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.13.0")

addSbtPlugin("org.xerial.sbt" % "sbt-pack" % "0.5.1")
*Note: An empty line between the entries is necessary on older SBT releases (before 0.13.7); newer releases accept the file with or without it.
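If you prefer not to use a graphical editor, the same file can be written straight from the shell. This is only a sketch: SBT_DIR is a hypothetical variable introduced here for illustration, and it defaults to the ~/.sbt location used in this post. Note that it overwrites any existing plugins.sbt in that directory.

```shell
# Assumption: SBT_DIR is a hypothetical variable, not something SBT itself reads.
# Set it to ~/.sbt/0.13 if your SBT version uses a version-specific directory.
SBT_DIR="${SBT_DIR:-$HOME/.sbt}"

# Create the plug-ins directory if it does not exist yet.
mkdir -p "$SBT_DIR/plugins"

# Write the three plug-in entries, separated by blank lines.
# Warning: this replaces an existing plugins.sbt in that directory.
cat > "$SBT_DIR/plugins/plugins.sbt" <<'EOF'
addSbtPlugin("com.typesafe.sbteclipse" % "sbteclipse-plugin" % "4.0.0")

addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.13.0")

addSbtPlugin("org.xerial.sbt" % "sbt-pack" % "0.5.1")
EOF

# Print the file to confirm what was written.
cat "$SBT_DIR/plugins/plugins.sbt"
```

The quoted 'EOF' delimiter keeps the shell from interpreting the % signs and quotes inside the file.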
Now save and close the file. Open the sbt console by simply typing sbt in the command line.
Here in the sbt console, if you type eclipse, it should show you the success message as shown in the above screenshot. The Eclipse plug-in creates a project definition, which consists of a .project file and a .classpath file.
Building Hadoop-Java Application using SBT
First, you need to create a project folder. Your code must go inside the src/main/java directory of that folder; otherwise, SBT will not be able to find your main class.
Here, we have created a project folder called Hadoop_app_sbt and, inside src/main/java, pasted the code of the Hadoop word count program.
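The word count source itself is not reproduced in this post, so the following is only a sketch of how the layout described above could be created from the shell. It assumes the classic WordCount example from the Apache Hadoop MapReduce tutorial, kept in the default package so that the class name is simply WordCount; the program actually used here may differ.

```shell
# Create the project folder with the source layout SBT expects for Java code.
mkdir -p Hadoop_app_sbt/src/main/java

# Assumption: this is the standard Apache Hadoop tutorial WordCount,
# not necessarily the exact code used in this post.
cat > Hadoop_app_sbt/src/main/java/WordCount.java <<'EOF'
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

  // Emits (word, 1) for every whitespace-separated token in a line.
  public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
    private static final IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();

    @Override
    public void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, ONE);
      }
    }
  }

  // Sums the counts for each word.
  public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    private final IntWritable result = new IntWritable();

    @Override
    public void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  // args[0] is the HDFS input path, args[1] the HDFS output path.
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = Job.getInstance(conf, "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class);
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
EOF

# Show the files that now make up the project.
find Hadoop_app_sbt -type f
```

Keeping the class in the default package means the main class can be referred to by its bare name, WordCount, both in the build definition and on the hadoop command line.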
Inside the root directory, i.e., Hadoop_app_sbt, you need to create one more file called build.sbt. In this file you provide all the settings and library dependencies of your project.
name := "WordCount"

version := "1.0"

scalaVersion := "2.10.4"

scalacOptions += "-target:jvm-1.8"

publishMavenStyle := true

//dependencies for Hadoop program
libraryDependencies += "org.apache.hadoop" % "hadoop-mapreduce-client-core" % "2.7.1"

libraryDependencies += "org.apache.hadoop" % "hadoop-common" % "2.7.1"

javacOptions in (Compile, compile) ++= Seq("-source", "1.8", "-target", "1.8", "-g:lines")

mainClass in (Compile,run) := Some("WordCount") //specifying fully qualified path of main class

crossPaths := false

autoScalaLibrary := false //runs a pure java program

resolvers += Resolver.file("Frozen IVY2 Cache Dependences", file("/home/luis/.ivy2/cache")) (Resolver.ivyStylePatterns) ivys "/home/luis/.ivy2/cache/[organisation]/[module]/ivy-[revision].xml" artifacts "/home/luis/.ivy2/cache/[organisation]/[module]/[type]s/[module]-[revision].[type]"
Now, we need to compile the code to check whether it has any errors. Move into the root directory where build.sbt is present and run the command sbt compile.
If your project does not have any errors, you will see a success message at the end of the output.
During compilation, SBT also downloads all the library files that are required for your project. You can now import this project into Eclipse.
For that, first run the command sbt eclipse. This generates the project files that Eclipse requires.
Then import the project into Eclipse using the following procedure.
Open Eclipse -> Click on File -> Click on Import -> Click on General -> Click on Existing Projects into Workspace -> Select root directory -> Click on Browse -> Select the project folder which you have created -> Click on OK
After selecting the root directory, in Eclipse select all projects as shown in the following screenshot:
You can see your imported project in the list of Eclipse projects, and you can modify it there as needed.
Now for building the JAR file, you need to use the command sbt package. After successfully packaging the class files as a JAR, you can see the jar file in the target directory.
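Before submitting the job, it can be worth confirming that the JAR exists and contains the compiled classes. The following is a minimal sketch; the JAR path is taken from the hadoop jar command used in this post, and it assumes the JDK's jar tool is on your PATH.

```shell
# Path to the packaged JAR, relative to the parent of the project folder.
JAR="Hadoop_app_sbt/target/wordcount-1.0.jar"

if [ -f "$JAR" ]; then
  # List the archive contents; the WordCount class files should appear here.
  jar tf "$JAR"
else
  echo "JAR not found at $JAR; run 'sbt package' in the project root first."
fi
```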
Now you can run the JAR file normally as a Hadoop JAR file as shown in the following screenshot.
hadoop jar Hadoop_app_sbt/target/wordcount-1.0.jar WordCount /word_count_input /word_count_sbt_output
In the above screenshot, you can see the output of our word count program which is packaged using SBT.
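To look at the result from the command line rather than in a screenshot, the output directory in HDFS can be listed and printed. This is a sketch under two assumptions: the hadoop client is on your PATH, and the job wrote its results to files named part-r-*, which is the usual naming for reducer output.

```shell
# Output directory passed to the word count job in this post.
OUTPUT_DIR="/word_count_sbt_output"

if command -v hadoop >/dev/null 2>&1; then
  # List the files the job wrote to HDFS.
  hadoop fs -ls "$OUTPUT_DIR"
  # Print the word counts; the glob is quoted so that HDFS expands it,
  # not the local shell.
  hadoop fs -cat "$OUTPUT_DIR/part-r-*"
else
  echo "hadoop command not found; run this on a machine with a Hadoop client."
fi
```

Keep in mind that a MapReduce job refuses to start if its output directory already exists, so remove /word_count_sbt_output (for example with hadoop fs -rm -r) or choose a new output path before running the job again.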
We hope this blog helped you understand how to build a Hadoop application using SBT. Keep visiting our site www.acadgild.com for more updates on Big Data and other technologies.