Posts Tagged ‘YARN’

Key Components/Calculations for Spark Memory Management

Different organizations have different needs for cluster memory management. For that reason, there is no single set of recommendations for resource allocation. Instead, the allocation can be calculated from the available cluster resources. In this blog post, I will discuss best practices for YARN resource management with the optimum distribution of Memory, Executors, and Cores for […]

Top 5 Lessons of Day 1 at Hadoop Summit #HS16SJ

Perficient is at the Hadoop Summit in San Jose, CA, and we’re tracking the best of the conference. Here are the top 5 lessons from day 1: Apache Atlas for managing your business catalog is almost ready for prime time! It is not, however, ready to be a full-fledged Records Management solution (no policy management, […]

Get R Running over YARN-based MapReduce

Among mathematical and statistical languages and tools such as SAS, SPSS, and Matlab, the R language is a pretty good tool that provides the environment and essential packages for statistical computing and graphics. It is free, and it offers an open environment and the means for users to develop custom packages. In addition to […]

A little stuffed animal called Hadoop

Doug Cutting, the creator of Hadoop, is reported to have explained how the name for his Big Data technology came about: “The name my kid gave a stuffed yellow elephant. Short, relatively easy to spell and pronounce, meaningless, and not used elsewhere: those are my naming criteria.” The term, of course, evolved over time and […]