Perficient Business Intelligence Solutions Blog

Predictive Analytic Evangelism

Research

In a recent post (SPSS – Making good on an investment), I spoke about a study on the ROI of SPSS. This time, I would like to call out a study performed by Ventana Research of San Ramon, Calif. in March 2012. This study reports on organizations already using predictive analytics:

March 7, 2012 – “Predictive Analytics: Improving Performance by Making the Future More Visible,” finds that in organizations that have already deployed predictive analytics:

  • About two-thirds are satisfied (45 percent) or very satisfied (21 percent) with their use of predictive analytics.
  • Three-quarters (74 percent) of respondents largely trust predictive analytics, while another 10 percent trust them totally.
  • Social media networks, pricing across the supply chain, and customer service were the top three planned areas of focus for predictive analytics.

“Predictive analytics are changing the way businesses operate” – David Menninger, VP & Research Director of Ventana Research.

 

Corporate executives, don’t think that Predictive Analytics is a “nice to have” – it’s now a “must have”.


SPSS – Making good on an investment

Research

Research (for example, “The Real ROI from SPSS”, NUCLEUS Research 2005) shows that IBM SPSS is a safe investment and indicates that 94 percent of SPSS customers achieve a positive return in less than a year! These returns come by way of reduced costs, increased productivity, increased employee and customer satisfaction, and greater visibility.

For the study, NUCLEUS Research invited 61 organizations into a sample that included public and private companies and organizations in transportation, retail, health care, education, professional and business services, and technology. According to their findings:

“The majority of companies provided details of their deployments on condition of anonymity. Nucleus analysts contacted companies and investigated various aspects of their SPSS deployments that would impact ROI, including what technology they were using, why and when they selected SPSS, the deployment process and budget, expected and achieved benefits, expected and incurred costs, training, consulting, deployment challenges, and other issues associated with the deployment.”

“Ninety-four percent of customers had achieved a positive ROI from their SPSS deployment with an average payback period of 10.7 months. Eighty-one percent of SPSS projects were deployed on time; 75 percent were deployed on or under budget.”

 

Ninety-four percent? I’ll take that!

 

That NUCLEUS study not only outlined some of the important benefits of using SPSS, but also described the average costs of SPSS projects.

 

The benefits included:

 

Reduced Costs

A number of customers were able to identify cost savings areas using SPSS to measure performance. In some cases, even a small percentage in savings identified through predictive modeling delivered significant returns.

 

Increased Productivity or Reduced Head Count

Tools that reduced the time needed to analyze information and the ability for business users to do their own reporting and analysis without involving programming specialists were key drivers for increased productivity from SPSS.

 

Improved Employee and Customer Satisfaction

A number of customers also noted that the ease of use and speed of analysis with SPSS tools enabled analysts to complete their jobs with less frustration.

 

Increased Visibility

The democratization of access to information in many cases enabled SPSS users to rapidly deliver information to decision makers for key strategic planning and other activities. Auditing capabilities also enabled users to track results and support regulatory and other requirements for record keeping.

 

The costs were outlined as including line items for software, professional services, hardware, personnel time, training and maintenance.

 

In Conclusion

Most, if not all, organizations have already made the investment in data, so it would be foolish not to use that data. According to the aforementioned research report, IBM SPSS is a safe bet for doing so.


BI Strategy Basics

Recently while waiting for a flight, I had the opportunity to review “Driving Business Insight with Effective BI Strategy” (April 30, 2012) from Forrester, a global research and advisory firm.

The article reminds us that a business (any business) cannot survive today by relying on ERP systems and spreadsheets. In fact, it goes further and proposes that even “BI Empowered” organizations need to examine what they have implemented and accept that it may be outdated.

In building a case for BI, Forrester points out that every CEO will have a business strategy that identifies and measures both the effectiveness and the efficiency of the organization, and that this strategy will directly drive its BI strategy.

Business Intelligence Defined

So, what is BI? Forrester explains that BI is:

“a set of methodologies, processes, architectures and technologies – supported by organizational structures, roles and responsibilities – that transform raw data into meaningful and useful information”…

(I think they nailed that!)

The “Business Intelligence Playbook” Forrester has put together is commendable:

Discover -> Plan -> Act -> Optimize

During Discovery, an organization must look at:

  • Where is BI technology today?  Where is it going?
  • How might this trend impact my organization?
  • Where exactly is my organization with respect to BI?

During Planning, an organization must establish a business plan and then use it to drive a BI plan. This becomes the “roadmap” used in its pursuit of BI.

The Act phase is the execution of the established plan. This includes staffing and training, policy definition and implementation, and the building (or buying) of BI tools, applications and solutions.

Finally, during Optimization, Forrester reminds us that a BI strategy must continue to evolve – that it’s “a journey, not a destination”.

This article is well worth your time. Well done!

 


BI Tools – Data Entry

Data entry isn’t something most would put on a list of BI tools.  But here’s my thought: I have yet to see a BI system of any size that doesn’t have some need for data entry of some kind, whether maintaining a small reference table or pulling the system’s configuration levers or correcting erroneous inbound data of some kind.

Usually most of these get handled by a DBA running a SQL script and voila!  But there are all kinds of issues with that from a system security and stability standpoint.  There’s no record of the change (excepting database logs that may or may not be parsable), business and quality rules are bypassed, possibly leading to a corrupt state, and the DBA has been granted write-level access to the production system (which I understand is OK in many places).  On the upside, it’s cheap, flexible, and isn’t another tool to maintain.

But what if there was an easy, generalized way to make the necessary edits without bypassing the rules, checks and audit trails?

Read the rest of this post »


Cognos TM1 Rules – how do they really work?

Cognos TM1 uses a very efficient data compression algorithm that allows large datasets to fit in relatively small amounts of RAM. In the same spirit, TM1 calculates rule values only when they are requested, which improves performance and reduces storage requirements.

Here is the sequence of events:

  1. A value is requested from (a location in) a cube.
  2. TM1 Server checks if the location corresponds to the area definition of any calculation statements associated with the cube.
  3. If the location does correspond to a statement, TM1 evaluates the formula portion of the calculation statement.
  4. TM1 returns the calculated value to the relevant area.

It is important to know:

  • Values are calculated only once for a cell.
  • Rules are evaluated in order, and the first rule that applies to a cell wins.
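
The sequence of events above can be sketched in a few lines of Python. This is only an illustration of the idea, not how TM1 is implemented: the `Rule` class, the `get_value` function and the cube-as-dictionary layout are hypothetical names invented for the sketch.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Cell = Tuple[str, str]  # (region, measure): a hypothetical two-dimensional cube


@dataclass
class Rule:
    # Area definition: returns True when the rule covers the requested cell.
    area: Callable[[Cell], bool]
    # Formula: computes the cell value from the stored data.
    formula: Callable[[Dict[Cell, float], Cell], float]


def get_value(data: Dict[Cell, float], rules: List[Rule], cell: Cell) -> float:
    """Return the value of a cell, evaluating rules only when it is requested."""
    for rule in rules:                       # rules are checked in order
        if rule.area(cell):                  # does the area definition match?
            return rule.formula(data, cell)  # the first matching rule wins
    return data.get(cell, 0.0)               # no rule applies: stored value


data = {("East", "Units"): 10.0, ("East", "Price"): 2.5}
rules = [
    # Covers every "Sales" cell: Sales = Units * Price.
    Rule(lambda c: c[1] == "Sales",
         lambda d, c: d.get((c[0], "Units"), 0.0) * d.get((c[0], "Price"), 0.0)),
    # Also covers "Sales", but is never reached because the rule above wins.
    Rule(lambda c: c[1] == "Sales", lambda d, c: -1.0),
]

print(get_value(data, rules, ("East", "Sales")))  # 25.0
print(get_value(data, rules, ("East", "Units")))  # 10.0
```

Nothing is computed until `get_value` is called, and the second rule never fires for a “Sales” cell because the first one already applied.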

Like consolidations within dimensions, TM1 rules are calculated on demand. But once rules are involved, TM1’s sparse consolidation algorithm cannot determine in advance which results will be empty without additional information. In fact, when consolidation occurs in TM1 cubes that have rules defined, the sparse consolidation algorithm is turned off. This is done to avoid incorrect results being generated by the TM1 rules. With the algorithm turned off, every single cell in the cube is checked for a value during a consolidation, and this can slow down cubes that are very large or sparse.

General rules of thumb:

  • Remember that, if most values in your cube are zeros, this is an indication that the cube is relatively sparse.
  • Multidimensional cubes are almost always sparse.
  • The more dimensions a cube has, the greater the degree of sparsity.
  • Conceptually, there is a distinction between a zero and a value that is missing (or non-applicable).
  • In TM1, however, values can only be real numbers, so the value zero is used to represent zero, no (or missing) value, and even non-applicable values.
  • The impact of sparsity on calculations can be tremendous.

Sparsity

During consolidations, TM1 uses a sparse consolidation algorithm to skip over cells that contain zero or are empty. This algorithm speeds up consolidation calculations in cubes that are highly sparse. A sparse cube is a cube in which the number of populated cells as a percentage of total cells is low (http://publib.boulder.ibm.com).

Cognos TM1 provides feeders to identify which cube cells need to have rules evaluated and which can be skipped. The effective use of Cognos TM1 feeders is essential for making your rules efficient and able to avoid combinatorial explosion.

Combinatorial explosion!

In mathematics, a combinatorial explosion describes the effect of functions that grow very rapidly as a result of combinatorial considerations.

Wikipedia

A simple example that illustrates the power of the sparse consolidation algorithm is the consolidated value (8433) at the intersection of total regions and surfboard lengths. To calculate the total the long way, you must add up every possible contributing cell, of which there might be 119. However, if you add only the cells with non-zero values, the number of components to add may drop significantly.
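
To make the difference concrete, here is a minimal Python sketch of a dense versus a sparse consolidation. The regions, board lengths and sales figures are made up for illustration (they are not the numbers quoted above), and the dictionary-based storage is an assumption of the sketch rather than a description of TM1 internals.

```python
from itertools import product

regions = ["North", "South", "East", "West"]
lengths = ["6.0", "6.6", "7.0", "9.0"]

# Only populated cells are stored; every other cell is implicitly zero.
sales = {("North", "6.6"): 120.0, ("West", "9.0"): 80.0}


def dense_total(cells, regions, lengths):
    """Visit every possible cell, populated or not."""
    visited = 0
    total = 0.0
    for key in product(regions, lengths):
        visited += 1
        total += cells.get(key, 0.0)
    return total, visited


def sparse_total(cells):
    """Visit only the populated cells."""
    return sum(cells.values()), len(cells)


print(dense_total(sales, regions, lengths))  # (200.0, 16)
print(sparse_total(sales))                   # (200.0, 2)
```

Both approaches return the same total, but the sparse version touches 2 cells instead of 16. With more dimensions the gap grows very quickly.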

Dissection of Cognos TM1 Rules

Cognos TM1 rules are nothing more than calculation statements. These calculation statements define how to compute values in the cells of the cube to which the rules are assigned. Usually, every rule you create will require one or more corresponding feeders. Feeder statements, when used correctly, ensure that the correct values are calculated by your rules and can significantly improve their performance. If you use feeder statements in a rule (and you should), the rule must also contain a SKIPCHECK declaration and a FEEDERS declaration. The SKIPCHECK declaration must immediately precede any calculation statements in the rule, while the FEEDERS declaration must precede the feeder statements.

Rules – they are only “calculation statements”…

A calculation statement consists of the following:

  • Area definition
  • Leaf, consolidation, or string qualifier
  • Formula
  • Terminator
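
As a rough illustration of those four parts, the Python sketch below splits a single statement into area definition, qualifier, formula and terminator. The sample statement and its element names ('Sales', 'Units', 'Price') are hypothetical, and the regular expression is a simplification for this post: it handles one optional qualifier (N, C or S) per statement and is not a full parser for the rules language.

```python
import re

# area definition, "=", optional qualifier (N:, C: or S:), formula, ";" terminator
STATEMENT = re.compile(
    r"^\s*(?P<area>\[[^\]]*\])\s*=\s*"
    r"(?:(?P<qualifier>[NCS]):)?\s*"
    r"(?P<formula>.+?)\s*"
    r"(?P<terminator>;)\s*$",
    re.DOTALL,
)


def dissect(statement: str) -> dict:
    """Split one calculation statement into its four parts."""
    match = STATEMENT.match(statement)
    if match is None:
        raise ValueError("not a calculation statement: " + statement)
    return match.groupdict()


parts = dissect("['Sales'] = N: ['Units'] * ['Price'];")
for name, value in parts.items():
    print(name, "->", value)
```

Here the area is `['Sales']`, the qualifier `N` restricts the statement to leaf (numeric) cells, the formula is `['Units'] * ['Price']`, and the semicolon terminates the statement.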

Next time, a deep dive into Cognos TM1 rules!

Antoninus: Are you afraid to die, Spartacus?

Spartacus: No more than I was to be born.


Dummy Coding with IBM SPSS


To understand what is meant by dummy coding, you need to understand 2 forms of data:

Qualitative or Quantitative?

“Qualitative data describes items in terms of some quality or categorization while Quantitative data are described in terms of quantity (and in which a range of numerical values are used without implying that a particular numerical value refers to a particular distinct category).” To keep the difference straight, remember that qualitative data is observed, while quantitative data is measured.

 

Your Morning Latte…

So!

If we consider a morning latte example, we might note the following:

 

 

Qualitative Examples

  • robust aroma
  • frothy appearance
  • strong taste
  • burgundy cup

Quantitative Examples

  • 12 ounces of latte
  • serving temperature of 150°F
  • serving cup 7 inches in height
  • cost of $4.95

Statistical analysis often includes variables in which the numbers represent qualitative categories (such as gender, ethnicity or political affiliation).

Including these variables in an analytical model requires special steps to ensure the results can be interpreted properly. These steps involve coding a categorical variable into multiple dichotomous variables, each of which takes the value “1” or “0”.

For clarity, a dichotomous variable is a variable that splits or groups data into two distinct categories. An example would be employed and unemployed.

This process is known as “dummy coding.” IBM SPSS makes dummy coding a straightforward practice. Let’s walk through the steps!

  1. Select the categorical variable that you want to dummy code. Note the number of categories, remembering that dummy coding transforms a variable with n categories into n-1 dichotomous variables. For example, a categorical variable on political affiliation with three categories (Democrat, Republican and Independent) would be dummy coded into two dichotomous variables, such as Democrat and Republican. A person who identifies as one of these would be coded “1” in the corresponding variable. A person with a zero in both variables would be counted as Independent.
  2. Click the “Transform” menu at the top of the SPSS data sheet, then select “Recode into Different Variables,” because you will transform the categorical variable into one or more dichotomous (dummy) variables. This opens a window that displays the variables in your data set. Select the variable you want to recode, and then click the arrow, which moves the variable name into the box labeled “Numeric Variable.”
  3. Click the “Output Variable” name box and type a name for your new dichotomous variable. Click “Change.” Click “Old and New Values,” which opens a new display showing old and new values for the variable you want to transform.
  4. Recode the values of the variable by coding one category as “1” and the others as zero. Under “Old Value,” enter the category value to be recoded. Under “New Value,” type “1,” then click “Add.” On the “Old Value” side, select the “All Other Values” button, type “0” as the new value, and click “Add” again. For example, a political affiliation variable that codes Democrat as “1,” Republican as “2” and Independent as “3” could be recoded into the dichotomous variable Republican, with all “2s” recoded as “1” and other values coded as zero.
  5. Click “Continue” after entering the old and new values for your dummy codes, then click “OK.” SPSS will then recode the categorical variable as you have specified.
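
For readers who think better in code, the same recoding can be sketched in plain Python. This is not SPSS syntax; the `dummy_code` function and its `reference` parameter are hypothetical names used only to show how n categories become n-1 columns of zeros and ones.

```python
def dummy_code(values, reference):
    """Turn a categorical column into n-1 columns of 0/1 indicators.

    `reference` is the category left without its own column; a row with
    zeros in every column belongs to it.
    """
    categories = sorted(set(values))
    if reference not in categories:
        raise ValueError("reference category not found in data")
    kept = [c for c in categories if c != reference]
    return {c: [1 if v == c else 0 for v in values] for c in kept}


affiliation = ["Democrat", "Republican", "Independent", "Republican"]
dummies = dummy_code(affiliation, reference="Independent")
print(dummies)
# {'Democrat': [1, 0, 0, 0], 'Republican': [0, 1, 0, 1]}
```

The third respondent has a zero in both columns, which is how the Independent category is represented without a column of its own.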

 

Done Deal!


Iterative BI – What’s the Difference?

Recently I was in a conversation where a PM declared “Agile’s just waterfall really fast – we can do that no problem!”  Uh oh.

Like (most) everything, delivery methodologies are subject to fashion and trend, and Agile/Scrum/Kanban and the like are en vogue.  Collectively, I’ll refer to these highly cyclic methodologies as “iterative” or (little a) agile development.  My interest being BI, I’ll take a little time discussing how these iterative delivery methods impact your BI delivery processes.

Generally, iterative development does a number of things to your teams.  When operating effectively, it (among other things):

  • Brings your users much closer to the development process.
  • Multiplies the number of builds/deployments you do by a factor of LOTS (probably 10-20).
  • Multiplies the number of tests (esp. regressions) required.
  • Makes juggling project tasks more complex by putting many more “balls” in the air.
  • Eliminates the formality (and safety) of predefined scope and quality gates.

Read the rest of this post »


Cognos TM1 Attributes – what are they and what can they do for me?

In this post I’d like to talk about attributes. Let me begin:

In addition to having a type (numeric, consolidation or string), elements can have attributes defined and assigned to them. What is an attribute, you ask? Well, if elements identify data in your cube, then you can think of element attributes as describing the elements themselves. It’s that simple.

For example, let’s say that some of your users would like to display an account using the account name followed by the account number. Other users would like to display only the account name. You can define an alias attribute for each of these requirements. In fact, you can define as many alias attributes as you need.

 

I can have an account number of “01-0000-00001” and define multiple aliases so that the user can view the element as any of the following:

  • “01-0000-00001”
  • “01-0000-00001 – Long Surfboard”
  • “Long Surfboard”
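
Conceptually, an alias is just another label stored against the same element. The Python sketch below models that idea with a dictionary. The alias names (“Name”, “CodeAndName”) and the `display_name` function are invented for illustration and are not part of TM1.

```python
# Hypothetical element with two alias attributes.
element_aliases = {
    "01-0000-00001": {
        "Name": "Long Surfboard",
        "CodeAndName": "01-0000-00001 – Long Surfboard",
    }
}


def display_name(element, alias=None):
    """Return the alias value if requested and defined, else the element name."""
    if alias is None:
        return element
    return element_aliases.get(element, {}).get(alias, element)


print(display_name("01-0000-00001"))                 # 01-0000-00001
print(display_name("01-0000-00001", "CodeAndName"))  # 01-0000-00001 – Long Surfboard
print(display_name("01-0000-00001", "Name"))         # Long Surfboard
```

Each user picks the alias they prefer, while the underlying element (and the data stored against it) stays the same.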

Some interesting uses for attributes include:

  • To define features of elements. For example, an employee may have attributes that include “title”, “hire date”, or “department”
  • To provide alternative or “friendly” names, or aliases. For example, an accounting code of “02-0000-00001” may have an alias of “salary and wages”
  • To control the display format for the numeric data.

An alias attribute may also be used to present data in different languages!

You can also select elements by attribute value in the Subset Editor and display element names in TM1 windows using their aliases.

Creating an Attribute Is Easy!

To create attributes and assign attribute values, you use the TM1 Attributes Editor.

When adding attributes using the Attributes Editor, you will notice that it “defaults” to an attribute type of string, so be careful to select the desired type before proceeding. In most cases, I’ve used TurboIntegrator processes as the tool used to add, update, and delete my dimension attributes. You use the following programming functions:

AttrInsert: To add a new attribute

AttrPutN or AttrPutS: To update an attribute value, which can be numeric or string

AttrDelete: To remove an existing attribute

A key point to know: if you try to view or update a dimension’s attributes and the dimension has a large number of elements, TM1 displays a warning message when you open the Attributes Editor:

 

Do not continue!

You can potentially lock your TM1 session for a long time.

The alternative is to access the attributes of this dimension through the attributes cube instead, as it is much faster:

  1.  Select View | Display Control Objects.
  2.  Open the cube called }ElementAttributes_dimension.
  3.  Modify the required fields like in any cube!

This is another important point. The }ElementAttributes cubes are known as Cognos TM1 control cubes. These cubes are automatically generated by TM1. As a clever developer, you can either create a new (lookup) cube or use a control cube to look up data, depending upon your needs.

Some key attribute terminology and concepts you should be able to recognize are descriptive attributes, alias attributes, and when to use an attribute versus an additional element. Let’s briefly touch on these.

Descriptive attributes!

Descriptive attributes are simply attributes that describe the data; they are data about the data. For example, consider some attributes for selecting a surfboard:

Alias attributes!

These attributes provide alternative names for elements:

It can be tempting to add many attributes to describe your elements, and in most cases this is fine because you can filter your data by attribute value. However, you should consider how your data is going to be used and presented. Sometimes it is more appropriate to create elements rather than attributes, and sometimes even additional dimensions.

For example, board length is an attribute of surfboard models. The 6.6 boards often outsell the other length boards. If you create one element per board and another dimension with elements for each length, you can use TM1 to track surfboard sales by the length of the board. If you combine sales into a single board length, you might lose valuable detail.

Display format attributes!

An astute use for element attributes is formatting what is displayed in the Cube Viewer. When you create a dimension, an attribute named Format is created for you automatically by TM1. This attribute can be used to set a display format for each individual numeric element.

Display format attributes can also be set programmatically with a TurboIntegrator process using the AttrPutS function. Just remember to add the c: prefix to the format string to indicate that it is a custom format:

AttrPutS('c:###,###.00', myDimensionName, myElementName, 'Format');

Referring to the IBM Cognos documentation, you see that the numeric data can be displayed in the following formats:

The Cube Viewer determines which format to use in the following order:

  1. Elements in the column dimension are checked for display formats.
  2. If none is found, elements in the row dimension are checked for display formats.
  3. If none is found, elements in the title dimension are checked for display formats (left to right).

If no display format is found, the current view formatting is used.
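
The precedence described above amounts to a simple “first format found wins” lookup. Here is a minimal Python sketch of it. The `resolve_format` function, the `formats` dictionary and the element names are hypothetical, and the sketch follows the order as listed in this post rather than claiming to mirror the Cube Viewer’s internals.

```python
def resolve_format(formats, column_elements, row_elements, title_elements, view_format):
    """Pick the display format for a cell: columns, then rows, then titles."""
    # Title elements are walked left to right, as listed in this post.
    for element in [*column_elements, *row_elements, *title_elements]:
        fmt = formats.get(element)
        if fmt:
            return fmt
    # Nothing found on any element: fall back to the view's own formatting.
    return view_format


# Hypothetical Format attribute values, keyed by element name.
formats = {"Units": "c:###,###", "Price": "c:###,###.00"}

print(resolve_format(formats, ["Jan"], ["Price"], ["Units"], "c:0.00"))   # c:###,###.00
print(resolve_format(formats, ["Jan"], ["East"], ["Actual"], "c:0.00"))   # c:0.00
```

In the first call the row element “Price” supplies the format before the title element “Units” is ever consulted. In the second call no element has a format, so the view formatting is used.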

Well, I hope this post is useful to you.

 

“If you’re doing something for the right reasons, nothing can stop you” – Duncan Me.


Chi-Squared Challenging using SPSS


A Chi-Square Challenge (or Test) procedure organizes your variables into groups and computes a chi-square statistic. Here is the specific definition:

“The chi-square (chi, the Greek letter pronounced “kye”) statistic is a statistical technique used to determine if a “distribution of observed frequencies” differs from the “theoretical expected frequencies”…”

Okay, that is a pretty clever explanation; however, it really just means “does what I see match what I thought I’d see?”

For example, the chi-square test could be used to determine whether a box of crayons contains equal quantities of blue, brown, green, orange, red, and yellow.
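
To show what is being computed, here is a small Python sketch of the chi-square statistic for the crayon example. The crayon counts are invented for illustration, and the `chi_square` function is a hand-rolled version of the standard formula (the sum of (observed - expected)² / expected), not SPSS output.

```python
def chi_square(observed, expected):
    """Pearson chi-square statistic: sum of (O - E)^2 / E over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts for blue, brown, green, orange, red, yellow.
observed = [12, 8, 10, 9, 11, 10]
total = sum(observed)                               # 60 crayons in the box
expected = [total / len(observed)] * len(observed)  # 10 of each if colors are equal

statistic = chi_square(observed, expected)
degrees_of_freedom = len(observed) - 1

print(round(statistic, 3), degrees_of_freedom)  # 1.0 5
```

With 5 degrees of freedom, the usual 5 percent critical value is about 11.07, so a statistic of 1.0 gives no reason to doubt that the colors come in equal quantities.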

Using IBM SPSS, you can obtain your chi-square test by selecting from the menus:

Analyze > Nonparametric Tests > Legacy Dialogs > Chi-Square…

From there, you can select one or more test variables. Each variable will produce a separate test.

Using a previous blog post as an example, I might want to evaluate the results of a poll on marriage, gender and an individual’s overall satisfaction with their life. Using SPSS I can determine (for example) that there are 132 total “observations”: 121 male and 12 female. Does my expected ratio of male v. female align with the actual?  Does my assumption that married females are significantly more satisfied with their lives than males hold up? And so on…

Once again, SPSS makes statistical analysis easy.

 

 


BI Maturity – Now What?

I spend a lot of time working with BI teams assessing how they’re doing and what they should do next.  We often use a BI maturity model as a framework for these assessments.  They’re great accelerators and help the teams to understand where they fall between “getting started” and “guru”, but they’re not great (by themselves) at helping development managers figure out what specific things they need to address next.

Development managers are (nearly) always faced with limited resources and must aggressively prioritize how to spend those resources – on people (FTEs and consulting/contract), on tools, and on “soft” expenses such as training and conferences.  They also are faced with determining how to split effort among new projects, maintenance work and refactoring.

Read the rest of this post »