In a recent post (SPSS – Making good on an investment), I spoke about a study on the ROI of SPSS. This time, I would like to call out a study performed by Ventana Research of San Ramon, Calif. in March 2012. This study reports on organizations already using predictive analytics:
March 7, 2012 – “Predictive Analytics: Improving Performance by Making the Future More Visible,” finds that in organizations that have already deployed predictive analytics:
“Predictive analytics are changing the way businesses operate” – David Menninger, VP & Research Director of Ventana Research.
Corporate executives, don’t think that Predictive Analytics is a “nice to have” – it’s now a “must have”.
Research (for example, “The Real ROI from SPSS”, NUCLEUS Research 2005) shows that IBM SPSS is a safe investment and indicates that 94 percent of SPSS customers achieve a positive return, with an average payback period of less than a year! These returns come by way of reduced costs, increased productivity, increased employee and customer satisfaction, and greater visibility.
For the study, NUCLEUS Research invited 61 organizations into a sample that included public and private companies and organizations in transportation, retail, health care, education, professional and business services, and technology. According to their findings:
“The majority of companies provided details of their deployments on condition of anonymity. Nucleus analysts contacted companies and investigated various aspects of their SPSS deployments that would impact ROI, including what technology they were using, why and when they selected SPSS, the deployment process and budget, expected and achieved benefits, expected and incurred costs, training, consulting, deployment challenges, and other issues associated with the deployment.”
“Ninety-four percent of customers had achieved a positive ROI from their SPSS deployment with an average payback period of 10.7 months. Eighty-one percent of SPSS projects were deployed on time; 75 percent were deployed on or under budget.”
Ninety-four percent? I’ll take that!
That NUCLEUS study not only outlined some of the important benefits of using SPSS, but also described the average costs of SPSS projects.
The benefits included:
Reduced Costs
A number of customers were able to identify cost savings areas using SPSS to measure performance. In some cases, even a small percentage in savings identified through predictive modeling delivered significant returns.
Increased Productivity or Reduced Head Count
Tools that reduced the time needed to analyze information and the ability for business users to do their own reporting and analysis without involving programming specialists were key drivers for increased productivity from SPSS.
Improved Employee and Customer Satisfaction
A number of customers also noted that the ease of use and speed of analysis with SPSS tools enabled analysts to complete their jobs with less frustration.
Increased Visibility
The democratization of access to information in many cases enabled SPSS users to rapidly deliver information to decision makers for key strategic planning and other activities. Auditing capabilities also enabled users to track results and support regulatory and other requirements for record keeping.
The costs were outlined as line items for software, professional services, hardware, personnel time, training and maintenance.
In Conclusion
Most, if not all, organizations have already made the investment in data, so it would be foolish not to use that data. According to the aforementioned research report, IBM SPSS is a safe bet for doing so.
Recently while waiting for a flight, I had the opportunity to review “Driving Business Insight with Effective BI Strategy” (April 30, 2012) from Forrester, a global research and advisory firm.
The article reminds us that a business (any business) cannot survive today by relying on ERP systems and spreadsheets. In fact, it goes further and proposes that even “BI Empowered” organizations need to examine what they have currently implemented and accept that it may be outdated.
In building a case for BI, Forrester points out that all CEOs will have a business strategy that identifies and measures both the effectiveness and the efficiency of an organization, and that this will directly drive its BI strategy.
Business Intelligence Defined
So, what is BI? Forrester explains that BI is:
“a set of methodologies, processes, architectures and technologies – supported by organizational structures, roles and responsibilities – that transform raw data into meaningful and useful information”…
(I think they nailed that!)
The “Business Intelligence Playbook” Forrester has put together is commendable:
Discover -> Plan -> Act -> Optimize
During Discovery, an organization must look at:
During the Planning, an organization must establish a business plan and then use that to drive a BI plan. This will become the “roadmap” used in its pursuit of BI.
The Act phase is the execution of the established plan, which includes staffing and training, policy definition and implementation, and the building (or buying) of BI tools, applications and solutions.
Finally, during Optimization, Forrester reminds us that a BI strategy must continue to evolve – that it’s “a journey, not a destination”.
This article is well worth your time. Well done!
Data entry isn’t something most would put on a list of BI tools. But here’s my thought: I have yet to see a BI system of any size that doesn’t have some need for data entry, whether maintaining a small reference table, pulling the system’s configuration levers, or correcting erroneous inbound data.
Usually most of these get handled by a DBA running a SQL script and voila! But there are all kinds of issues with that from a system security and stability standpoint. There’s no record of the change (excepting database logs that may or may not be parsable), business and quality rules are bypassed, possibly leading to a corrupt state, and the DBA has been granted write-level access to the production system (which I understand is OK in many places). On the upside, it’s cheap, flexible, and isn’t another tool to maintain.
But what if there were an easy, generalized way to make the necessary edits without bypassing the rules, checks, and audit trails?
Cognos TM1 uses a very efficient data compression algorithm that allows large datasets to fit in relatively small amounts of RAM, and it calculates values only when they are needed, resulting in improved performance and reduced storage requirements.
Here is the sequence of events:
It is important to know:
Like consolidations within dimensions, TM1 rules are calculated on demand. But unlike plain consolidations, rule-calculated cells give TM1’s sparse consolidation algorithm no way to determine in advance which results will be empty without additional information. In fact, when consolidation occurs in TM1 cubes that have rules defined, the sparse consolidation algorithm is turned off. This is done to avoid incorrect results from the TM1 rules. With the algorithm turned off, every single cell in the cube is checked for a value during a consolidation, and this can slow down cubes that are very large or sparse.
General rules of thumb:
Sparsity
During consolidations, TM1 uses a sparse consolidation algorithm to skip over cells that contain zero or are empty. This algorithm speeds up consolidation calculations in cubes that are highly sparse. A sparse cube is a cube in which the number of populated cells as a percentage of total cells is low (http://publib.boulder.ibm.com). Cognos TM1 provides feeders to identify which cube cells need to have rules evaluated and which can be skipped. Effective use of feeders is essential for keeping your rules efficient and avoiding combinatorial explosion.
Combinatorial explosion!
In mathematics, a combinatorial explosion describes the effect of functions that grow very rapidly as a result of combinatorial considerations.
Wikipedia
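To put a number on that, here is a minimal Python sketch (not TM1 code) of how the count of possible cells in a cube grows as dimensions are added. The dimension names and sizes are made up purely for illustration.

```python
from math import prod

def total_cells(dimension_sizes):
    """Possible cell intersections in a cube: the product of its dimension sizes."""
    return prod(dimension_sizes)

# Hypothetical dimension sizes, chosen only for illustration.
cube = {"Region": 50, "Product": 200, "Month": 12, "Version": 3, "Measure": 20}

sizes = list(cube.values())
for n in range(1, len(sizes) + 1):
    # Each added dimension multiplies the number of possible cells.
    print(n, "dimension(s):", total_cells(sizes[:n]), "possible cells")
```

Five modest dimensions already give millions of possible intersections, which is why skipping the empty ones matters so much.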
A simple example that illustrates the power of the sparse consolidation algorithm is the consolidation value (8433) at the intersection of total regions and surfboard lengths. To calculate the total the naive way, you must add up every possible cell, of which there might be 119. However, if you add only the cells with non-zero values, the number of components to add may drop significantly.
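The idea can be sketched in a few lines of Python. This is only an illustration of the principle, not how TM1 implements it internally, and the regions, lengths and values are hypothetical rather than the figures quoted above.

```python
# Hypothetical sparse slice: only populated cells are stored, keyed by (region, length).
populated = {
    ("East", "6.6"): 3000.0,
    ("West", "6.6"): 2500.0,
    ("West", "7.0"): 1200.0,
}
regions = ["North", "South", "East", "West"]
lengths = ["6.0", "6.6", "7.0"]

def dense_total(cells, regions, lengths):
    """Visit every possible intersection, whether it holds a value or not."""
    total, visited = 0.0, 0
    for region in regions:
        for length in lengths:
            visited += 1
            total += cells.get((region, length), 0.0)
    return total, visited

def sparse_total(cells):
    """Visit only the cells that actually hold a value."""
    return sum(cells.values()), len(cells)

print(dense_total(populated, regions, lengths))
print(sparse_total(populated))
```

Both functions arrive at the same total, but the sparse version touches 3 cells instead of 12. The gap widens quickly as dimensions are added.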
Dissection of Cognos TM1 Rules
Cognos TM1 rules are nothing more than calculation statements. These calculation statements define how to compute values in the cells of the cube to which the rules are assigned. Usually, every rule you create will require one or more corresponding feeders. Feeder statements, when used correctly, ensure that the correct values are calculated by your rules and can significantly improve their performance. If you use feeder statements in a rule (and you should), the rule must also contain a SKIPCHECK declaration and a FEEDERS declaration. The SKIPCHECK declaration must immediately precede any calculation statements in the rule, while the FEEDERS declaration must precede the feeder statements.
Rules – they are only “calculation statements”…
A calculation statement consists of the following:
Next time, a deep dive into Cognos TM1 rules!
Antoninus: Are you afraid to die, Spartacus?
Spartacus: No more than I was to be born.
To understand what is meant by dummy coding, you need to understand two forms of data:
Qualitative or Quantitative?
“Qualitative data describes items in terms of some quality or categorization while Quantitative data are described in terms of quantity (and in which a range of numerical values are used without implying that a particular numerical value refers to a particular distinct category).” To better understand the difference, remember that qualitative data is more of an observation, while quantitative data is measurable.
So!
If we consider a morning latte example, we might note the following:
Qualitative Examples
Quantitative Examples
Statistical analysis often includes variables in which the numbers represent qualitative categories (such as gender, ethnicity or political affiliation).
Including these variables in an analytical model requires special steps to ensure the results can be interpreted properly. These steps involve coding a categorical variable into multiple dichotomous variables, each of which takes the value 1 or 0.
For clarity, a dichotomous variable is defined as a variable that splits or groups data into two distinct categories. An example would be employed and unemployed.
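Outside of SPSS, the same recoding can be sketched in plain Python. This is a minimal illustration under my own assumptions: the `dummy_code` helper and the third category (“retired”) are hypothetical, added only to show one categorical variable becoming several 0/1 variables.

```python
def dummy_code(values, reference=None):
    """Turn one categorical column into 0/1 indicator columns.

    If `reference` is given, that category gets no column of its own;
    it is represented by all indicators being 0.
    """
    categories = sorted(set(values))
    if reference is not None:
        categories.remove(reference)
    return {cat: [1 if v == cat else 0 for v in values] for cat in categories}

# Hypothetical survey responses.
status = ["employed", "unemployed", "employed", "retired"]
coded = dummy_code(status, reference="unemployed")

for name, column in coded.items():
    print(name, column)
```

Dropping one category as the reference is a common choice for regression-style models, since the remaining indicators already identify it.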
This process is known as “dummy coding.” IBM SPSS makes dummy coding straightforward. Let’s walk through the steps!
Done Deal!
Recently I was in a conversation where a PM declared “Agile’s just waterfall really fast – we can do that no problem!” Uh oh.
Like (most) everything, delivery methodologies are subject to fashion and trend, and Agile/Scrum/Kanban and the like are en vogue. Collectively, I’ll refer to these highly cyclic methodologies as “iterative” or (little a) agile development. My interest being BI, I’ll take a little time discussing how these iterative delivery methods impact your BI delivery processes.
Generally, iterative development does a number of things to your teams. When operating effectively, it (among other things):
In this post I’d like to talk about attributes. Let me begin:
In addition to having a type (numeric, consolidation or string), elements can have attributes defined and assigned to them. What is an attribute, you ask? Well, if elements identify data in your cube, then you can think of element attributes as describing the elements themselves. It’s that simple.
For example, let’s say that some of your users would like to display an account using the account name followed by the account number. Other users would like to display only the account name. You can define an alias attribute for each of these requirements. In fact, you can define as many alias attributes as you need.
I can have an account number of “01-0000-00001” and define multiple aliases so that the user can view the element as any of the following:
Some interesting uses for attributes include:
An alias attribute may also be used to present data in different languages!
You can also select elements by attribute value in the Subset Editor and display element names in TM1 windows using their aliases.
Creating an Attribute Is Easy!
To create attributes and assign attribute values, you use the TM1 Attributes Editor.
When adding attributes using the Attributes Editor, you will notice that it defaults to an attribute type of string, so be careful to select the desired type before proceeding. In most cases, I’ve used TurboIntegrator processes to add, update, and delete my dimension attributes. You use the following functions:
• AttrInsert: To add a new attribute
• AttrPutN or AttrPutS: To update a numeric or string attribute value, respectively
• AttrDelete: To remove an existing attribute
A key point to know is that if you try to view or update a dimension’s attributes and the dimension has a large number of elements, you will receive a message when you open the Attributes Editor, as shown in the following screenshot:
You can potentially lock your TM1 session for a long time.
The alternative is to access the attributes of this dimension through the attributes cube instead, as it is much faster:
This is another important point. The }ElementAttributes cubes are known as Cognos TM1 control cubes. These cubes are automatically generated by TM1. As a clever developer you can either create a new (lookup) cube or use a control cube to look up data, depending upon your needs.
Some key attribute terminology and concepts you should be able to recognize are descriptive attributes, alias attributes, and when to use an attribute versus an additional element. Let’s briefly touch on these.
Descriptive attributes!
Descriptive attributes are simply data that describe the data. For example, consider some attributes for selecting a surfboard:
Alias attributes!
These attributes provide alternative names for elements:
It can be tempting to add many attributes to describe your elements, and in most cases this is fine, since you can filter your data by attribute value. However, you should consider how your data is going to be used and presented. Sometimes it is more appropriate to create elements rather than attributes, and sometimes even additional dimensions.
For example, board length is an attribute of surfboard models. The 6.6 boards often outsell the other length boards. If you create one element per board and another dimension with elements for each length, you can use TM1 to track surfboard sales by the length of the board. If you combine sales into a single board length, you might lose valuable detail.
Display format attributes!
An astute use for element attributes is formatting what is displayed in the Cube Viewer. When you create a dimension, an attribute named Format is created for you automatically by TM1. This attribute can be used to set a display format for each individual numeric element.
Display format attributes can also be set programmatically with a TurboIntegrator process using the AttrPutS function. Just remember to add the c: to the format string to indicate that it is a custom format:
AttrPutS('c:###,###.00', myDimensionName, myElementName, 'Format');
Referring to the IBM Cognos documentation, you see that the numeric data can be displayed in the following formats:
The Cube Viewer will display the format to use:
The current view formatting is used.
Well, I hope this post is useful to you.
“If you’re doing something for the right reasons, nothing can stop you” – Duncan Me.
A Chi-Square Challenge (or Test) procedure organizes your data pond variables into groups and computes a chi-square statistic. Here is the specific definition:
“The chi-square (chi, the Greek letter pronounced “kye”) statistic is a statistical technique used to determine if a “distribution of observed frequencies” differs from the “theoretical expected frequencies”…
Okay, that is a pretty clever explanation; however, it really just means “does what I see match what I thought I’d see?”
For example, the chi-square test could be used to determine whether a box of crayons contains equal quantities of blue, brown, green, orange, red, and yellow.
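To make the crayon example concrete, here is a minimal Python sketch of the statistic itself, computed by hand as the sum of (observed - expected)² / expected over the categories. The crayon counts are hypothetical, and the helper name is mine, not anything from SPSS.

```python
def chi_square(observed, expected):
    """Goodness-of-fit statistic: sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical crayon counts for one box.
observed = {"blue": 12, "brown": 8, "green": 10, "orange": 9, "red": 11, "yellow": 10}

n = sum(observed.values())
# Under the "equal quantities" hypothesis, every color is expected n / 6 times.
expected = [n / len(observed)] * len(observed)

stat = chi_square(observed.values(), expected)
df = len(observed) - 1  # degrees of freedom: number of categories minus one

print("chi-square =", stat, "with", df, "degrees of freedom")
```

The statistic is then compared against a chi-square distribution with the given degrees of freedom. With 5 degrees of freedom, the critical value at the 0.05 level is about 11.07, so a small statistic like the one here gives no reason to doubt that the colors are evenly represented.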
Using IBM SPSS you can run your chi-square test by selecting from the menus:
Analyze > Nonparametric Tests > Legacy Dialogs > Chi-Square…
From there, you can select one or more test variables. Each variable will produce a separate test.
Using a previous post as an example, I might want to evaluate the results of a poll conducted on marriage, gender and an individual’s overall satisfaction with their life. Using SPSS I can determine (for example) that there are 132 total “observations”: 121 male and 12 female. Does my expected ratio of male vs. female align with the actual? Does my assumption that married females are significantly more satisfied with their lives than males hold up? And so on…
Once again, SPSS makes statistical analysis easy.
I spend a lot of time working with BI teams assessing how they’re doing and what they should do next. We often use a BI maturity model as a framework for these assessments. These models are great accelerators and help the teams understand where they fall between “getting started” and “guru”, but they’re not great (by themselves) at helping a development manager figure out what specific things to address next.
Development managers are (nearly) always faced with limited resources and must aggressively prioritize how to spend those resources: on people (FTEs and consulting/contract), on tools, and on “soft” expenses such as training and conferences. They also must determine how to split effort among new projects, maintenance work, and refactoring.