Apache Spark Articles - Perficient Blogs

Posts Tagged ‘Apache Spark’

Big Data Bootcamp by the Beach: An introduction

This is a little story about nothing ventured, nothing gained. One day, I got a LinkedIn message asking if I would like to teach a Big Data Bootcamp at an event for the Universidad Abierta Para Adultos in Santiago de Caballeros, República Dominicana. Luis didn’t know me; he just saw my profile and saw that I’ve been […]

Hangzhou Spark Meetup 2016

Last weekend there was a meetup in Hangzhou for the Spark community, and about 100 Spark users and committers attended. It was great to meet so many Spark developers, users, and data scientists, and to learn about recent Spark community updates, roadmaps, and real use cases. The event organizer delivered the first presentation […]

How to Load Oracle Data into SparkR Dataframe

From Spark 1.4 onward, Spark provides several ways for users to load external data sources such as an RDBMS, JSON, Parquet, and Hive files into SparkR. When we talk about SparkR, we need to know something about R. The local data frame is a popular concept and data structure in R […]

A Spark Example to MapReduce System Log File

In some respects, the Spark engine is similar to Hadoop: both perform map and reduce operations across multiple nodes. The central concept in Spark is the RDD (Resilient Distributed Dataset), through which we can operate on arrays, datasets, and text files. This example gives you some ideas on how to do map/reduce […]

How to Configure Eclipse for Spark Application in the Cluster

Spark provides several ways for developers and data scientists to load, aggregate, and compute data and return a result. Many Java or Scala developers prefer to write their own application code (the driver program) instead of entering commands into the built-in Spark shell or Python interface. Below are some steps for how to quickly configure […]

How to Setup Local Standalone Spark Node

As my previous post described, Spark is a big data technology that is becoming popular and powerful, and it is used by many organizations and individuals. The Spark project is written in Scala, which is both a purely object-oriented and a functional language. So, what can a Java developer do if he or she wants to learn about […]

Hangzhou Apache Spark Meetup

Similar to the Hadoop project, Apache Spark is a fast-evolving, in-memory engine for large-scale data processing. In recent years in particular, Spark has been widely adopted by many organizations, and its community has many active contributors. Perficient China GDC colleagues attended a recent Spark technology meetup in Hangzhou. During the meetup […]
