Traditionally, our information architectures have included a number of staging or intermediate data storage areas and systems. These have taken different forms over the years: publish directories on source systems, staging areas in data warehouses, data vaults, or, most commonly, data file hubs. In general, these data file staging solutions have suffered from two […]
Posts Tagged ‘Enterprise Data and Analytics’
Seven Deadly Sins of Database Design
This is a summary of an article from Database Trends and Applications (dbta.com). The author addresses fundamental mistakes that we make, or live with, in our database systems. 1. Poor or missing documentation for databases in PRODUCTION: We may have descriptive table names and columns to begin with, but as workforce turns […]
Looping through files in a folder using ODI
On a recent project, I was faced with a requirement to scan the contents of a folder and load all the files into their respective staging tables. There were multiple file types – Customer file, Store file, Products file, Sales file, etc. Every day, we received zero to many files for each type of file. […]
Data Science = Synergistic Teamwork
Data science is a discipline combining elements from various fields such as mathematics, machine learning, statistics, computer programming, data warehousing, pattern recognition, uncertainty modeling, computer science, high performance computing, visualization and others. According to Cathy O’Neil and Rachel Schutt, two luminaries in the field of Data Science, there are about seven disciplines that even data scientists in training […]
Disruptive Scalability
The personal computer, the internet, digital music players (think iPods), smartphones, and tablets are just a few of the disruptive technologies that have become commonplace in our lifetime. What is consistent about these technology disruptions is that they have all changed the way we work, live, and play. Whole industries have grown up around these technologies. […]
Thoughts on Oracle Database In-Memory Option
Last month Oracle announced the Oracle In-Memory database option. The overall message is that once installed, you can turn this “option” on and Oracle will become an in-memory database. I do not think it will be that simple. However, I believe Oracle is on the correct track with this capability. There are two main messages with […]
Evaluating In-Memory DBs
This month Oracle is releasing its new in-memory database. Essentially, it is an option that leverages and extends the existing RDBMS code base. Now, with Microsoft’s recent entry, all four of the mega-vendors (IBM, SAP, Microsoft, and Oracle) have in-memory database products. Which one is the best fit for a company will depend on a […]
A little stuffed animal called Hadoop
Doug Cutting – Hadoop creator – is reported to have explained how the name for his Big Data technology came about: “The name my kid gave a stuffed yellow elephant. Short, relatively easy to spell and pronounce, meaningless, and not used elsewhere: those are my naming criteria.” The term, of course, evolved over time and […]
Web analytics and Enterprise data…
I was looking at the market share of Google Analytics (GA) and it is definitely on the rise. So I was curious to see the capabilities and what this tool can do. Of course it is a great campaign management tool. It’s been a while since I worked on campaign management. I wanted to know […]
SAP HANA and Hadoop – complementary or competitive?
In my last blog post, we learned about SAP HANA… or as I called it, “a database on steroids”. Here is what SAP’s former CTO and Executive Board Member, Vishal Sikka, told InformationWeek: “Hana is a full, ACID-compliant database, and not just a cache or accelerator. All the operations happen in memory, but every transaction […]
Is IT ready for Innovation in Information Management?
Information Technology (IT) has come a long way from being a delivery organization to being part of business innovation strategy, though a lot has to change in the coming years. Depending on the industry and the company culture, IT organizations will mostly fall in the operational spectrum and a lot of progressive ones are […]
SAP HANA – A ‘Big Data’ Enabler
Some interesting facts and figures for your consideration:
90% – of stored data in the world today was created in the past 2 years
50% – annual data growth rate
34,000 – tweets sent each minute
9,000,000 – daily Amazon orders
7,000,000,000 – daily Google Page Views
2.5 Exabyte – amount of data created every day (an Exabyte […]