Android: List External Storage Files

This article explains how to list files from the external storage (SD Card) in Android. Though you can list files recursively using a simple method, the new Runtime Permission Model introduced in Android 6 makes it a little difficult. Let's dive into the code and see how we can list all the files recursively.

As I mentioned earlier, I am using Kotlin for Android development since it is the future of Android. If you are using Java, port the code into your class method by method. Note that Android Studio's convert-on-paste feature works from Java to Kotlin, not the other way around, so the Kotlin code has to be translated to Java by hand.
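
For readers working in Java, the recursive walk itself can be sketched in plain Java as follows. This is a minimal sketch rather than the article's Kotlin code: the class name FileLister and the method listFiles are hypothetical, and the runtime permission request is left out. On Android, the root would be the external storage directory, which can only be read once the storage permission has been granted.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not taken from the original article.
final class FileLister {

    // Collects every regular file under root, descending into subdirectories.
    static List<File> listFiles(File root) {
        List<File> result = new ArrayList<>();
        collect(root, result);
        return result;
    }

    private static void collect(File dir, List<File> result) {
        // listFiles() returns null when dir is not a directory or cannot be
        // read, for example when the storage permission has not been granted.
        File[] children = dir.listFiles();
        if (children == null) {
            return;
        }
        for (File child : children) {
            if (child.isDirectory()) {
                collect(child, result);
            } else {
                result.add(child);
            }
        }
    }

    public static void main(String[] args) {
        // Assumed root for the demo: the first argument or the current directory.
        File root = new File(args.length > 0 ? args[0] : ".");
        for (File file : listFiles(root)) {
            System.out.println(file.getPath());
        }
    }
}
```

The null check matters more than it looks: an unreadable directory is skipped quietly instead of crashing the whole walk.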

ANTLR Hello World! - Arithmetic Expression Parser

Ever wondered how all these programming languages understand what you write? This article reveals the truth: Language Parsing. It is often referred to as parsing, syntax analysis, or syntactic analysis. Regardless of the term, it is the process of analyzing a string of symbols, either in natural language, computer languages or data structures, conforming to the rules of a formal grammar. The following diagram depicts the language parsing process:

[Figure: Language Parser]

As you can see, the Language Parser (which is part of the compiler) takes the source code as input, validates it against the Language Grammar, and produces an Abstract Syntax Tree (AST), which represents the source code as a tree structure.

ANTLR (ANother Tool for Language Recognition) is a tool to define such a grammar and to build a parser automatically from it. It also provides two high-level design patterns to analyze the AST: Visitor and Listener. ANTLR is used by several languages and frameworks, including Ballerina, Siddhi, and Presto SQL. This article introduces ANTLR through a hello world application that evaluates basic mathematical expressions given as strings.
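
To make the parsing idea concrete before bringing in ANTLR, here is a minimal hand-written sketch in Java. It is not ANTLR-generated code. It assumes a small grammar with numbers, parentheses, and the operators + - * /, shown as a comment above each method, and it evaluates the expression while parsing instead of building an AST. The class name ExpressionEvaluator and the sample input are illustrative assumptions.

```java
// Hypothetical recursive-descent evaluator; one method per grammar rule.
final class ExpressionEvaluator {
    private final String input;
    private int pos = 0;

    private ExpressionEvaluator(String input) {
        this.input = input;
    }

    static double evaluate(String text) {
        ExpressionEvaluator parser = new ExpressionEvaluator(text);
        double value = parser.expression();
        parser.skipSpaces();
        if (parser.pos != parser.input.length()) {
            throw new IllegalArgumentException("Unexpected input at position " + parser.pos);
        }
        return value;
    }

    // expression : term (('+' | '-') term)*
    private double expression() {
        double value = term();
        while (true) {
            if (accept('+')) {
                value += term();
            } else if (accept('-')) {
                value -= term();
            } else {
                return value;
            }
        }
    }

    // term : factor (('*' | '/') factor)*
    private double term() {
        double value = factor();
        while (true) {
            if (accept('*')) {
                value *= factor();
            } else if (accept('/')) {
                value /= factor();
            } else {
                return value;
            }
        }
    }

    // factor : NUMBER | '(' expression ')'
    private double factor() {
        if (accept('(')) {
            double value = expression();
            if (!accept(')')) {
                throw new IllegalArgumentException("Expected ')' at position " + pos);
            }
            return value;
        }
        skipSpaces();
        int start = pos;
        while (pos < input.length()
                && (Character.isDigit(input.charAt(pos)) || input.charAt(pos) == '.')) {
            pos++;
        }
        if (start == pos) {
            throw new IllegalArgumentException("Expected a number at position " + pos);
        }
        return Double.parseDouble(input.substring(start, pos));
    }

    // Consumes the expected character if it comes next, ignoring spaces.
    private boolean accept(char expected) {
        skipSpaces();
        if (pos < input.length() && input.charAt(pos) == expected) {
            pos++;
            return true;
        }
        return false;
    }

    private void skipSpaces() {
        while (pos < input.length() && Character.isWhitespace(input.charAt(pos))) {
            pos++;
        }
    }

    public static void main(String[] args) {
        System.out.println(evaluate("2 + 3 * 5")); // prints 17.0
    }
}
```

Operator precedence falls out of the rule nesting: term sits below expression, so multiplication binds tighter than addition. A parser generator such as ANTLR derives this kind of code from the grammar, so you do not have to write it by hand.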

Install the latest Oracle JDK on Linux

Even though OpenJDK is available in Linux repositories, some applications strictly require the Oracle Java Development Kit. This article shows you how to manually install Oracle JDK 12 on your Linux system. In the provided commands, <version> stands for the version you downloaded; replace the version-specific paths and file names accordingly.
Oracle provides deb and rpm installers
If your Linux distribution uses the DEB package format, like Debian, you can download the jdk-<version>_linux-x64_bin.deb file and install it using the following command:
sudo dpkg -i jdk-<version>_linux-x64_bin.deb
If your Linux distribution uses the RPM package format, like CentOS, you can download the jdk-<version>_linux-x64_bin.rpm file and install it using the following command:
sudo rpm -ivh jdk-<version>_linux-x64_bin.rpm

However, this article explains the manual installation method which is applicable for all Linux distributions out there. Personally, I prefer the manual installation because I have more control over the changes made in the system.

Install the latest Eclipse on Linux

This article shows you how to install the latest version of Eclipse on Linux. There are other ways to install Eclipse that use scripts to automate the installation. However, I prefer the manual installation method explained in this article so that you know where your files go. Later, if you want to remove Eclipse, it takes just two commands, as explained at the end of the article.

If you do not have Java on your system, install Java first.

Step 1:
Download the desired version of Eclipse from the official site.

Step 2:
Open the Terminal (Ctrl + Alt + T) and enter the following command to change the directory.
cd /opt

Step 3:
Enter the command given below to extract Eclipse from the ~/Downloads directory. If your downloaded file is in another directory, replace the last parameter with the actual file path.
sudo tar -xvzf ~/Downloads/eclipse-jee-2019-03-R-linux-gtk-x86_64.tar.gz

Step 4:
Open another Terminal (Ctrl + Alt + T) and enter the following command to create a shortcut file for Eclipse.
gedit eclipse.desktop

Step 5:
In the gedit window that opens, paste the following text.
[Desktop Entry]
Comment=Integrated Development Environment

Step 6:
Save the file and close gedit.

Step 7:
Enter the following command in the terminal to install the shortcut.
sudo desktop-file-install eclipse.desktop

Now search for Eclipse in the dashboard and open it.

Upgrade Eclipse

If you have already installed Eclipse using the above method and would like to upgrade to the latest version, just remove Eclipse from the /opt directory and follow Steps 1 to 3 of the installation process.
sudo rm -rf /opt/eclipse

Remove Eclipse

Removing an Eclipse installation made as described in this article takes just two commands.

Step 1:
First, remove the menu entry you created in Step 7.
sudo rm /usr/share/applications/

Step 2:
Delete the /opt/eclipse folder.
sudo rm -rf /opt/eclipse

Javalin: A Tiny but Mighty Framework

Two years ago, I wrote the article Microservices in a minute, using the open source framework MSF4J. Today I came across Javalin: another lightweight framework for developing web applications with little or no effort. We already have plenty of web frameworks, including the shining star Spring. What makes Javalin different is its simplicity. In addition, it can be used as a microservice framework or as a tiny web framework to serve a web application with static files. In the Javalin developers' words:

Javalin’s main goals are simplicity, a great developer experience, and first-class interoperability between Kotlin and Java.

Comparing Javalin with Spring is like comparing a shaving blade with a Wenger 16999 Swiss Army Knife Giant, but it does what it is supposed to do. If you want to quickly add a REST endpoint for a quick demo or if you just need a simple web framework without any additional gimmicks like Dependency Injection or Object Relational Mapping, consider Javalin. It is easy to learn and lighter to run.

In this article, you will see how to use Javalin as a web framework to serve a contact-us page and how to build a CRUD microservice using Javalin.


Serve TensorFlow Models in Java

TensorFlow is a famous machine learning framework from Google and a must-know tool for machine learning engineers. Even though Python is recommended for building TensorFlow models, Google offers a Java API to use TensorFlow in Java. Still, Python is the easiest language for building TensorFlow models, even for Java developers (learn Python, my friend). However, enterprise applications developed in Java may require the artificial intelligence offered by a trained TensorFlow model. In this article, you will learn how to load and use a simple TensorFlow model exported from Python.

Spark 06: Broadcast Variables

If you read the Spark 04: Key-Value RDD and Average Movie Ratings article, you might wonder what to do with popular movie IDs printed at the end. A data analyst cannot ask his/her users to manually check those IDs in a CSV file to find the movie name. In this article, you will learn how to map those movie IDs to movie names using Apache Spark's variable broadcasting.

If you want to share read-only data that fits into memory with every worker in your Spark cluster, broadcast that data. The broadcast variable is distributed only once and cached on every worker node so that it can be reused any number of times. More about broadcasting will be covered later in this article after the code example.

Apache Maven for Beginners

Apache Maven is a build tool widely used by Java developers to manage project dependencies, control the build process, and automate tests. Maven makes life easier, especially when building a complex Java project. However, beginners often stay away from it, as I did years ago, because they find it complex to learn and use. This article introduces Maven to beginners in a simple way. You will see how to use Apache Maven to manage your project dependencies, with a simple Java project as an example. The article is structured into two main topics: Apache Maven in Eclipse and Apache Maven in IntelliJ IDEA. Of course, you can use Apache Maven without an IDE; however, I stick with IDEs to keep it simple for beginners. Other applications of Apache Maven, such as build management and test automation, will be covered in another article.
Let's begin with manual dependency management using a simple calculator application. Suppose you want to develop a calculator that receives a simple arithmetic expression like "2 + 3 * 5" as input and prints the result to the console. Evaluating such a String input by ourselves is a complex task. Fortunately, there is a library, exp4j, which can evaluate a String expression and return the result.

Spark 05: List Action Movies with Spark flatMap

Welcome to the fifth article in the series of Apache Spark tutorials. In this article, you will learn how to apply the flatMap transformation. After an introduction to flatMap, a sample Spark application is developed to list all action movies from the MovieLens dataset.

In the previous articles, we used the map transformation, which turns each entity into exactly one other entity; the transformation is one-to-one. For example, if you have a String RDD named lines, applying lines.map(x => x.toUpperCase) creates a new String RDD with the same number of records, each converted to uppercase.
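
The one-to-one nature of map, and how flatMap differs from it, can be sketched with plain Java streams. This is an analogy and not Spark code: java.util.stream stands in for the RDD API, and the class name, method names, and sample lines are made up for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical demo class; Java streams are used as a stand-in for RDDs.
final class MapVersusFlatMap {

    // map: exactly one output element per input element.
    static List<String> toUpperCase(List<String> lines) {
        return lines.stream()
                .map(String::toUpperCase)
                .collect(Collectors.toList());
    }

    // flatMap: each input element may produce zero or more output elements,
    // and the results are flattened into a single sequence.
    static List<String> splitIntoWords(List<String> lines) {
        return lines.stream()
                .flatMap(line -> Arrays.stream(line.split(" ")))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("hello world", "apache spark");
        System.out.println(toUpperCase(lines));    // [HELLO WORLD, APACHE SPARK]
        System.out.println(splitIntoWords(lines)); // [hello, world, apache, spark]
    }
}
```

Two input lines stay two records under map but become four records under flatMap, which is the difference that matters when one record has to be expanded into several.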

Install Ballerina on Linux

Ballerina is a new open source, JVM-based language designed specifically for integration by WSO2, the world's #1 open source integration vendor. In this article, you will see how to manually install Ballerina on Linux systems. Visit the official website and download the installer for your system. There are installers for Windows, Mac, Debian-based Linux, and Fedora-based Linux. I prefer to install Ballerina manually because the manual method works on all Linux distributions.

Spark 04: Key-Value RDD and Average Movie Ratings

In the first article of this series, Spark 01: Movie Rating Counter, we created three RDDs (data, filteredData and ratingData), each of which contains a single datatype. For example, data and filteredData were String RDDs and ratingData was a Float RDD. However, it is common to use an RDD that stores complex datatypes, especially Key-Value pairs, depending on the requirement. In this article, we will use a Key-Value RDD to calculate the average rating of each movie in our MovieLens dataset. If you don't have the MovieLens dataset, please visit the Spark 01: Movie Rating Counter article to set up your environment.

As you already know, the ratings.csv file has the fields movieId and rating. A given movie may get different ratings from different users. To get the average rating of each movie, we need to add up all the ratings of that movie and divide the sum by the number of ratings.
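
The sum-and-divide approach described above can be sketched in plain Java without Spark. The Rating class, the averageByMovie method, and the sample ratings are assumptions made for illustration; a Spark application would instead read (movieId, rating) pairs from ratings.csv into a Key-Value RDD.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical plain-Java illustration of averaging values per key.
final class AverageRatings {

    static final class Rating {
        final int movieId;
        final double rating;

        Rating(int movieId, double rating) {
            this.movieId = movieId;
            this.rating = rating;
        }
    }

    static Map<Integer, Double> averageByMovie(List<Rating> ratings) {
        // Step 1: accumulate {sum, count} for every movie ID.
        Map<Integer, double[]> totals = new LinkedHashMap<>();
        for (Rating r : ratings) {
            double[] total = totals.computeIfAbsent(r.movieId, id -> new double[2]);
            total[0] += r.rating;
            total[1] += 1;
        }
        // Step 2: divide each sum by its count.
        Map<Integer, Double> averages = new LinkedHashMap<>();
        for (Map.Entry<Integer, double[]> entry : totals.entrySet()) {
            double[] total = entry.getValue();
            averages.put(entry.getKey(), total[0] / total[1]);
        }
        return averages;
    }

    public static void main(String[] args) {
        List<Rating> ratings = Arrays.asList(
                new Rating(1, 4.0), new Rating(1, 5.0), new Rating(2, 3.0));
        System.out.println(averageByMovie(ratings)); // {1=4.5, 2=3.0}
    }
}
```

Keeping the sum and the count together per key is the important part: averages cannot be merged directly, but (sum, count) pairs can, which is why the same shape carries over to a distributed setting.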

Spark 03: Understanding Resilient Distributed Dataset

You are not qualified as an Apache Spark developer until you know what a Resilient Distributed Dataset (RDD) is. It is the fundamental technique for representing data in Spark's memory. There are advanced data representations like DataFrame built on top of RDD. However, it is always better to start with the most basic dataset: RDD. An RDD is nothing more than a data structure with some special properties or features.

We all know that Apache Spark is a distributed general-purpose cluster-computing framework. There are some common problems faced in a distributed environment including but not limited to:
  1. Remote access of data is expensive
  2. High chance of failure
  3. Runtime errors are expensive and hard to track
  4. Wasting computing power is way too expensive
RDD is designed to address the abovementioned problems. In the following section, you will see the properties of RDD and how it solves these problems.

Spark 02: Scala Cheat Sheet for Java Developers

This article introduces Scala to Java developers who don't know Scala. I assume that you already know Java (preferably Java 8) so that you can compare the features of Scala with Java. Please note that this article is not an end-to-end Scala tutorial. I cover only the fundamentals of Scala that are used in my Apache Spark tutorials.

First of all, remember that Scala is a JVM-based language that runs on top of your regular Java Virtual Machine. The whole purpose of Scala is to provide a convenient functional programming language (Java 8 did not exist at that time). However, since Scala is built on top of Java, you can access Java libraries and APIs from your Scala code.

To play with Scala, please set up Scala on IntelliJ IDEA or install a command-line Scala version on your system.