So, the already convoluted Open Source Hadoop ecosystem just got a little more complicated with a Kudu joining the Elephant at #StrataHadoop. Advocates of Fast Analytics on Fast Data at Scale also just got more excited about the potential of fast writes, fast updates, fast reads, fast everything, all with Kudu! Cloudera’s Kudu is designed to fill major gaps in Hadoop’s storage layer, especially with regard to Fast Analytics, but it is not meant to replace or disrupt (just yet!) HBase or HDFS. Instead, Kudu is meant to complement those storage engines and run alongside them, because some applications may still get more benefit out of HDFS or HBase.
Before the official release of this news, VentureBeat speculated about Kudu’s possible implications for the Big Data industry. It “could present a new threat to data warehouses from Teradata and IBM’s PureData … It may also be used as a highly scalable in-memory database that can handle massively parallel processing (MPP) workloads, not unlike HP’s Vertica and VoltDB.”
Whatever the long-term implications of Kudu, the above scenarios are not going to play out any time soon. Maturity is still what most enterprises crave in this rather diverse Open Source ecosystem, and Kudu, despite all the excitement around it, has a long way to go on that front.