Big Data and Social Business

Leon Katsnelson gave a presentation on Big Data and Social Business. His general theme is that we are seeing a LOT of data. This huge volume can fundamentally change how we address the business.

In today’s transactions, we want more information about each one:

What was the transaction?
Who bought the widget?
Where did they get the information to come buy the widget?
Why did they buy the widget?
What more did they buy?

These questions are very different from the operationally oriented questions asked a while ago.

Nice Quote: Data is the product

Think about LinkedIn, Yahoo!, Twitter, Experian, Facebook, and Zynga.

Each web page on Yahoo! is customized. That’s what helps draw the audience. They can do this because they harness all the data. Yahoo! was the pioneer of Hadoop.
Zynga. Over 90% of their business comes from Facebook. They are a gaming company. Ken Rudin, VP of Analytics, said, “We’re an analytics company masquerading as a game company.”

232 million people play their games
A small percentage of users will buy virtual products and pay Zynga money for that.
Electronic Arts is now worth less than Zynga. That’s a huge disruption to the gaming industry

Apple sells a lot of stuff, but iTunes and the App Store hold the largest collection of customers in the world. They have all the credit card data. This includes music, subscriptions, apps, movies, etc. They hold your data.

Illustrations of big numbers and data

US tax revenue of $2,170,000,000,000
Federal budget of $3,820,000,000,000
Current deficit of $1,650,000,000,000
National debt of $14,271,000,000,000
Budget cuts of just $38,500,000,000
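
To put these figures in proportion, here is a quick back-of-the-envelope check in Python. It uses only the numbers listed above; the variable names are my own, and the percentages are simple ratios rather than anything presented in the talk.

```python
# Figures from the list above, in US dollars.
tax_revenue = 2_170_000_000_000
federal_budget = 3_820_000_000_000
deficit = 1_650_000_000_000
national_debt = 14_271_000_000_000
budget_cuts = 38_500_000_000

# The deficit is the gap between the budget and the revenue.
computed_deficit = federal_budget - tax_revenue

# How large are the cuts relative to the deficit and to the debt?
cuts_vs_deficit_pct = 100 * budget_cuts / deficit
cuts_vs_debt_pct = 100 * budget_cuts / national_debt

print(f"Computed deficit: ${computed_deficit:,}")
print(f"Cuts as a share of the deficit: {cuts_vs_deficit_pct:.2f}%")
print(f"Cuts as a share of the debt: {cuts_vs_debt_pct:.2f}%")
```

The computed deficit matches the listed one, and the cuts come to a little over 2% of the deficit.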

Other stats:

Facebook is expected to have 1 billion users by August 2012. That’s 1/7 of the population of the world. It generates 12TB of data a day.

Pets have Facebook pages. 14% of pet owners do this.

Google+ is expected to reach 400+ million users by the end of 2012
LinkedIn has 130 million members today
Twitter has 100 million active users. It also generates 12+ TB of tweet data a day

What is this all about? Answer: it’s about getting a 360 degree view of a customer

Who
What social network
What competitors do they use
Are they profitable
Is she a shopping maven
Does she influence others?

Quote: Twitter is not a technology. It’s a conversation

Quote: Posting press releases on Twitter is a dumb idea

How cool can it be? Check out the advanced search. But once you do a search, the amount of data is just plain huge. You can’t deal with it all individually.

Other Data

Think about the data in sensors. At Lotusphere, our RFID tags are scanned at every room we enter. 5,000+ users * 15 rooms a day * 5 days equals a lot of data.

Now think about engines on a commercial airliner. They generate a huge amount of data: 1TB every 30 minutes. That’s terabytes of data generated, but it is erased after landing. Why? They can’t handle that amount of data.

Now think about electric meters that are sampled 4 times an hour
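
To get a feel for these sensor volumes, here is a rough calculation in Python. It uses only the figures quoted above, treats “5,000+” as exactly 5,000 (so the scan count is a lower bound), and assumes 1TB means 10**12 bytes. The variable names are my own.

```python
# Lotusphere RFID scans: "5,000+" attendees treated as 5,000 (a lower bound).
attendees = 5_000
rooms_per_day = 15
days = 5
rfid_scans = attendees * rooms_per_day * days

# Airliner engines: 1TB every 30 minutes (1TB assumed to be 10**12 bytes).
bytes_per_tb = 10**12
engine_bytes_per_second = bytes_per_tb / (30 * 60)
engine_mb_per_second = engine_bytes_per_second / 10**6

# Electric meters: sampled 4 times an hour.
meter_samples_per_day = 4 * 24
meter_samples_per_year = meter_samples_per_day * 365

print(f"RFID scans over the conference: {rfid_scans:,}")
print(f"Engine data rate: about {engine_mb_per_second:.0f} MB per second")
print(f"Readings per meter: {meter_samples_per_day} a day, {meter_samples_per_year:,} a year")
```

One meter is not much data on its own; the volume comes from multiplying those readings by every meter on the grid.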

Quote: Data generated by machines and sensors will exceed that of humans

This all adds up to information overload without insight. That makes Big Data a big problem and a big opportunity.

IBM Big Data Platform

So yes, we knew it was coming. IBM does have a tool to handle these levels of data. It’s called the MPP data warehouse.

Think

Netezza 1000
IBM InfoSphere BigInsights for Hadoop
InfoSphere Streams for streaming data or quickly moving data
InfoSphere Information Server to consolidate and integrate the data you have

What does a data platform do?

Analyze a variety of information
Analyze information in motion
Analyze extreme volumes of information
Discover and experiment. Do ad hoc analysis, data discovery, and experimentation. Experimentation is key because you don’t know what to do with that data until you experiment.
Manage and Plan. Enforce data structure, integrity, and control to ensure consistency for repeatable queries

A lot of research has gone into this. Think about what IBM has done with Watson. That’s text analytics.

What are uses for big data? It’s log analysis and storage, smart grid, fraud detection, 360 degree view of customer, email and call center transcript analysis

A few more words about Hadoop

IBM embraced and extended it. It’s a nice way to embrace. It’s not forked, not ported. It’s extended and then contributed back to open source. The same analytics are used for data in motion and data at rest. The two teams share a fair amount of information and code.

Case study

UOIT is capturing preemie sensor data. It used to be captured every 30-60 minutes and discarded after 72 hours. Their system captures this information, analyzes it, and makes the nurses aware of changes to a preemie’s health. It resulted in a 20% drop in mortality rates.

Case Study

Sprint processes call detail records (CDRs). It knows whether it dropped your call and whether it needs to contact you about it.
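
As a rough illustration of the idea, and not a description of Sprint’s actual system, here is a minimal Python sketch that scans call records for dropped calls and lists the subscribers worth contacting. The record fields, the "dropped" marker, and the threshold are all hypothetical; real CDR layouts differ by carrier.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class CallRecord:
    # Hypothetical CDR fields chosen for illustration only.
    subscriber: str
    duration_seconds: int
    termination: str  # assumed values: "normal" or "dropped"


def subscribers_to_contact(records, min_dropped=2):
    """Return, in sorted order, subscribers with at least min_dropped dropped calls."""
    dropped = Counter(r.subscriber for r in records if r.termination == "dropped")
    return sorted(name for name, count in dropped.items() if count >= min_dropped)


records = [
    CallRecord("alice", 120, "normal"),
    CallRecord("alice", 15, "dropped"),
    CallRecord("bob", 300, "normal"),
    CallRecord("alice", 42, "dropped"),
    CallRecord("carol", 8, "dropped"),
]

print(subscribers_to_contact(records))
```

At carrier scale the same counting would be spread across a cluster, but the logic per record stays this simple.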

Cloud use of Hadoop?

IBM can give you a 100 node cluster for $34/hr. They work with Amazon, cloud.com, IBM SmartCloud, Rackspace, etc.
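
To make that price concrete, here is a small Python calculation. The $34/hr and 100 node figures come from the talk; the 8-hour job is a hypothetical workload of my own, and I am assuming the hourly price is billed flat for the whole cluster.

```python
cluster_nodes = 100
cluster_price_per_hour = 34.00  # USD per hour for the whole cluster, from the talk

# Effective price for one node for one hour.
price_per_node_hour = cluster_price_per_hour / cluster_nodes

# Hypothetical workload: an 8-hour analysis job on the full cluster.
job_hours = 8
job_cost = cluster_price_per_hour * job_hours

print(f"Price per node-hour: ${price_per_node_hour:.2f}")
print(f"Cost of an {job_hours}-hour job: ${job_cost:.2f}")
```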

How is Big Data different from data warehouses?

First and foremost, it’s about taking unstructured data instead of structured data. It’s about taking a more holistic approach. It’s a spreadsheet metaphor.

Text analytics can be a big problem. For example, is Spam the horrible food or the horrible email? They have text analytics tools and are investing a lot to improve them.
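
To show why the Spam question is hard, here is a toy Python sketch that guesses the meaning from the surrounding words. This is my own illustration and not how IBM’s text analytics tools work; the cue word lists are made up, and real tools rely on far richer language models.

```python
import re

# Made-up cue words for each sense of "spam"; purely illustrative.
FOOD_CUES = {"canned", "meat", "fried", "eat", "sandwich", "recipe"}
EMAIL_CUES = {"inbox", "filter", "email", "unsubscribe", "sender", "folder"}


def classify_spam_mention(text):
    """Guess whether 'spam' means the food or the email, based on nearby cue words."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    food_hits = len(words & FOOD_CUES)
    email_hits = len(words & EMAIL_CUES)
    if food_hits > email_hits:
        return "food"
    if email_hits > food_hits:
        return "email"
    return "unknown"


print(classify_spam_mention("I fried some Spam for a sandwich."))
print(classify_spam_mention("My inbox filter missed this spam email."))
print(classify_spam_mention("I hate spam."))
```

The last sentence has no cue words at all, which is exactly the kind of case that makes text analytics a big problem.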

Thoughts on “Big Data and Social Business”

Dave Jones January 18, 2012 at 12:27 am

At least two of the companies you mention above (LinkedIn and Zynga) are Splunk customers. Splunk has been doing this kind of stuff for a while now. It’d be interesting to see how they compare to IBM MPP Data Warehouse.
Michael Porter Post author January 18, 2012 at 6:17 am

I was talking to someone last night about the fact that Splunk has some big data characteristics, especially in the fact that it too analyzes unstructured data. Given that IBM is making a bet on Hadoop with Big Data, perhaps the better question is how Splunk compares to Hadoop. Thoughts on that?
Dave Jones March 29, 2012 at 2:33 pm

Short answer is that Splunk and Hadoop are different, and Splunk is even embracing Hadoop. Check this out:
http://blogs.splunk.com/2011/12/05/introducing-shep/



by Michael Porter on January 16th, 2012
