Skip to main content

Posts Tagged ‘Scala’

On Scala’s parenthesis convention for no-arg functions

One might be confused or even angered when they learn about Scala’s convention regarding parenthesis usage for no-arg functions. The convention is this: given a no-arg function, you use put parentheses next to the function call only if the function has side effects. So, you would invoke a function named printCurrentState by writing printCurrentState(), since printCurrentState […]

A Spark Example to MapReduce System Log File

In some aspects, the Spark engine is similar to Hadoop because both of them will do Map & Reduce over multiple nodes. The important concept in Spark is RDD (Resilient Distributed Datasets), by which we could operate over array, dataset and the text files. This example gives you some ideas on how to do map/reduce […]

How to Configure Eclipse for Spark Application in the Cluster

Spark provides several ways for developer and data scientists to load, aggregate and compute data and return a result. Many Java or Scala developers would prefer to write their own application codes (aka Driver program) instead of inputting a command into the built-in spark shell or python interface. Below are some steps for how to quickly configure […]

How to Setup Local Standalone Spark Node

From my previous post, we may know that Spark as a big data technology is becoming popular, powerful and used by many organizations and individuals. The Spark project was written in Scala, which is a purely object-oriented and functioning language. So, what can a Java developer do if he or she wants to learn about […]

A little stuffed animal called Hadoop

A little stuffed animal called Hadoop

Doug Cutting – Hadoop creator – is reported to have explained how the name for his Big Data technology came about: “The name my kid gave a stuffed yellow elephant. Short, relatively easy to spell and pronounce, meaningless, and not used elsewhere: those are my naming criteria.” The term, of course, evolved over time and […]