Posts Tagged ‘Hadoop’

Take advantage of windows in your Spark data science pipeline

Windows let you perform calculations across a time frame around the current record in your Spark data science pipeline. They are SQL functions that give you access to data before and after the current record. They can be broken down into ranking and analytic functions, and aggregate functions can also be used over a window. Spark provides the […]

OLAP and Hadoop: The 4 Differences You Should Know

OLAP and Hadoop are not the same. OLAP is a technology for multi-dimensional analytics such as reporting and data mining, and it has been around since 1970. Hadoop is a technology for massive computation on large data sets, and it has been around since 2002. They can be used together, but there are differences when choosing between using Hadoop/MapReduce data […]

Big Data Bootcamp by the Beach: An introduction

This is a little story about nothing ventured, nothing gained. One day, I got a LinkedIn message asking if I would like to teach a Big Data Bootcamp at an event for the Universidad Abierta Para Adultos in Santiago de los Caballeros, República Dominicana. Luis didn’t know me; he just saw my profile and saw that I’ve been […]

How to Load Log Data into HDFS using Flume

Data acquisition is a very important part of building a big data ecosystem. It lets you extract data from various sources, such as files, databases, streams, and web pages. If you are just setting up a local environment rather than working in a real business scenario, you can handle data acquisition by making use […]

2 Choices for Big Data Analysis on AWS: Amazon EMR or Hadoop on EC2

What are the key differentiators when choosing a Hadoop distribution for Big Data analysis on AWS? We have two choices: Amazon EMR or a third-party Hadoop distribution (e.g., core Apache Hadoop, Cloudera, MapR). Yes, cost is important. But aside from cost, other things to look for include ease of operation, control, management, performance, and features. 1. Cost […]

The Year in Review | Top 10 EIS Posts of 2015

It’s been a busy year in the Enterprise Information Systems space. With over 75 posts this year, our in-house experts found themselves face to face with big changes and an abundance of great information to share. We sifted through that content and present to you the Top 10 EIS posts of 2015.   Ten | […]

Time Well Spent in 2015

The end of 2015 is fast approaching, with December looming just a week away. For most people, December is packed with the hustle and bustle of last-minute gift shopping, or end-of-year projections and budgets for 2016. Often in the sway of all this activity, many are so focused on the approaching New Year that they […]

Dorothy in the Land of Big Data

Big Data is one of the enabling technologies for companies to digitally transform their operations, their customer interactions, or both. However, the open source world can be complicated, especially in the red-hot Big Data arena. There are myriad technologies; some compete with one another, others overlap, some are complementary, and worst of all, […]

Hadoop, Spark, Cassandra, Oh My!

Previously, I reviewed why Spark will not by itself replace Hadoop, but Spark combined with other data storage and resource management technologies creates other options for managing Big Data. Today we will investigate how an enterprise should proceed in this new, “Hadoop is not the only option” world. Hadoop, Spark, Cassandra, Oh My! Open source Hadoop and […]

IBM’s Spark Investment is Evidence Big Data is Dead

Right after I posted my blog on Spark and Hadoop, I came across this article. IBM made a big announcement that they are putting their weight behind Spark. They are committing more than 3,500 developers and programmers to help move Spark forward. This, combined with significant support from the Big 3 Hadoop distributors (HortonWorks, Cloudera, […]

Webinar: Big Data & Microsoft, Key to Your Digital Transformation

Companies undergoing digital transformation are creating organizational change through technologies and systems that enable them to work in ways that are in sync with the evolution of consumer demands and the state of today’s marketplace. In addition, more companies are relying on more and more data to help make business decisions. And when it comes […]

Analytical Talent Gap

As new companies embark on digital transformation leveraging Big Data, key concerns and challenges are amplified, especially in the near term, before the supply of technology and talent adjusts to demand. In the earlier post Big Data Challenges, the top 3 concerns were: identifying the business value/monetizing the Big Data; setting up the […]
