
Posts Tagged ‘Hadoop’

Webinar: Big Data & Microsoft, Key to Your Digital Transformation

Companies undergoing digital transformation are creating organizational change through technologies and systems that enable them to work in ways that are in sync with the evolution of consumer demands and the state of today’s marketplace. In addition, more companies are relying on more and more data to help make business decisions. And when it comes […]

Analytical Talent Gap

As new companies embark on digital transformation leveraging Big Data, key concerns and challenges are amplified, especially in the near term, before the supply of technology and talent adjusts to demand. In the earlier post Big Data Challenges, the top 3 concerns were: identifying the business value/monetizing the Big Data; setting up the […]

Big Data Changes Everything – Has Your Governance Changed?

A few years ago, Big Data/Hadoop systems were generally a side project for either storing bulk data or for analytics. But now, as companies have pursued a data unification strategy leveraging the Next Generation Data Architecture, Big Data and Hadoop systems are becoming a strategic necessity in the modern enterprise. Big Data and Hadoop are technologies […]

Hadoop’s Ever-Increasing Role

With the advent of Splice Machine and the release of Hive 0.14, we are seeing Hadoop’s role in the data center continue to grow. Both of these technologies support limited transactions against data stored in HDFS. Now, I would not suggest moving your mission-critical ERP systems to Hive or Splice Machine, but the support of […]

Defining Big Data Prototypes – part 2

In part 1 of this series, we discussed some of the most common assumptions associated with Big Data Proof of Concept (POC) projects. Today, we’re going to begin exploring the next stage in Big Data POC definition – “The What.” The ‘What’ for Big Data has gotten much more complicated in recent years; and now […]

One Cluster To Rule Them All!

In the Hadoop space we have a number of terms for the Hadoop File System used for data management. Data Lake is probably the most popular. I have heard it called a Data Refinery, as well as some other not-so-mentionable names. The one that has stuck with me is the Data […]

The Modern Data Warehouse Will Augment Hadoop

The data warehouse has been a part of the EIM vernacular for nearly 20 years. The vision of a single source of the truth and a single repository for reporting and analysis are two objectives that have resulted in a never-ending journey. The data warehouse has never had enough data and the quality required for […]

Data Staging and Hadoop

Traditionally, in our information architectures we have a number of staging or intermediate data storage areas/systems. These have taken different forms over the years: publish directories on source systems, staging areas in data warehouses, data vaults, or, most commonly, data file hubs. In general, these data file staging solutions have suffered from two […]

Get R Running over YARN-based MapReduce

Among mathematical and statistical languages and tools such as SAS, SPSS, and Matlab, the R language is a pretty good tool that provides the environment and essential packages for statistical computing and graphics. It is free, and it offers an open environment and the means for users to develop custom packages. In addition to […]

A little stuffed animal called Hadoop

Doug Cutting – Hadoop creator – is reported to have explained how the name for his Big Data technology came about: “The name my kid gave a stuffed yellow elephant. Short, relatively easy to spell and pronounce, meaningless, and not used elsewhere: those are my naming criteria.” The term, of course, evolved over time and […]

Webinar Recap: The Modern Data Warehouse – A Hybrid Story

Last week, we held a webinar, The Modern Data Warehouse – A Hybrid Story. As the world of data evolves ever so quickly, it transforms the industry and creates a need for new approaches to business intelligence. Data warehousing technology that worked well for years, serving its purpose to manage and understand business driven data, […]

SAP HANA and Hadoop – complementary or competitive?

In my last blog post, we learned about SAP HANA… or as I called it, “a database on steroids”. Here is what SAP’s former CTO and Executive Board Member, Vishal Sikka, told InformationWeek: “Hana is a full, ACID-compliant database, and not just a cache or accelerator. All the operations happen in memory, but every transaction […]
