In Part 1, we saw an overview of Data Governance and the initiatives firms need to take to incorporate it. Let’s now look in a bit more detail at Data Quality Management, a key step in Data Governance towards ensuring data quality. Why is Data Quality Management necessary? Data Quality Management is […]
Blogs from this Author
Data Governance – a must-have to ensure data quality – Part 1
While one of my earlier posts on Quality Data being a pre-requisite for every BI technique is still generating both positive and negative responses, I felt it would be apt to delve into Data Governance and see why incorporating it is necessary to achieve & maintain better data quality. First, let’s have a […]
Leveraging NoSQL & Business Intelligence
I just happened to hear about the NoSQL Now Conference taking place in San Jose through this weekend and thought it would be interesting to explore a bit about bridging NoSQL and BI. What is NoSQL? NoSQL (‘Not Only SQL’) can be defined as the next-gen databases which differ from the traditional ones in being […]
Data Scientist – Will this be the dream job in the near future? – Part 2
In Part 1 we saw an overview of Data Science and how a Data Scientist comes into the picture. Let’s now look into some of the challenges firms face in finding these skilled data scientists and what measures they could take to overcome them. Challenges & Initiatives to consider It should be no surprise that […]
Data Scientist – Will this be the dream job in the near future? – Part 1
According to Fortune, Data Scientist is “The hot new gig in tech”. Indeed, Data Scientist is increasingly seen as one of the in-demand career options. With the increasing trend of firms like Facebook & Amazon depending more and more on data science to have that vital competitive edge, the value of […]
A Data Mining Approach to Spam Detection in Social Bookmarking Sites – Part 3
In Part 2 of this series we saw the details of the approach we employed to predict spammers using Neural Networks and Text Mining. In this post, we’re going to look at some of the complexities involved in this approach and finally wrap it up by looking at some of the alternative approaches and […]
Quality Data – a key pre-requisite for any BI technique
A recent post on the HBR blogs stated that ‘Success comes from better data, not better data analysis’. http://blogs.hbr.org/cs/2011/08/success_comes_from_better_data.html While this sounds cliche, it is a fact, and we tend to ignore the value of quality data. Nowadays firms invest in hiring some of the best analysts in the industry with the […]
A Data Mining Approach to Spam Detection in Social Bookmarking Sites – Part 2
In Part 1, we saw a brief introduction to Social Bookmarking Sites and the task. Let us now look into the approach we employ here to predict spammers. THE APPROACH Data Extraction & Data Cleaning The dataset provided consists of bookmarks, tags & user ids, and is in the form of a […]
A Data Mining Approach to Spam Detection in Social Bookmarking Sites – Part 1
With the growing popularity of social bookmarking sites, spammers typically use these kinds of services as a playground for their activities. As we all know, one of the main disadvantages of Social Bookmarking Systems is Spam. Spammers use these systems to pursue two goals: Place links in the sites to […]