Posts Tagged ‘SPSS Statistics’

TM1 vs. SPSS Modeler Comparison Continues – Setting to Flags

Consider the scenario where you have to convert information held in a “categorical field” of a transactional source into a “collection of flag fields”. For example, suppose you have a file of transactions that includes (among other fields) a customer identifier (“who”) and a product identifier (“what”). This file of transactional data indicates […]

IBM SPSS Modeler and Duplicate Data

Transactional datasets (especially those originating from databases) may contain duplicate records that must be removed before any modeling can begin. Duplicate records occur in two situations: datasets ARE erroneous (causing the same record to appear multiple times), or datasets ARE NOT erroneous (but records appear multiple times because information is collected at different moments […]

Missing Data – “Nothing from nothing” – leaves something?

In TM1, missing data usually means that there is a defect in the logic of your ETL script or you need to check your SQL. In SPSS Modeler, missing values arise for a variety of reasons and they must be considered carefully. You might expect that missing values imply errors or should those records be […]

Data Indiscretions

Data loaded into a TM1 or SPSS model will, in most cases, include files consisting of thousands (or hundreds of thousands) of records. It is not reasonable, given the number of fields and records in files of this size, for you to visually inspect all fields in every record (of every file) for missing or […]

Importing Data into SPSS Modeler for the TM1 Developer

If you have a TM1 background, it is a quick step to using SPSS Modeler if you look for similarities in how the tools handle certain tasks, such as importing data. With TM1, source data is transformed and loaded into cube structures for consolidation, modeling and reporting using its ETL tool, TurboIntegrator. In SPSS […]

CFO Performance Insight – Déjà vu?

Recently, I attended the IBM Vision conference in Orlando. At the conference, I watched a presentation on what our friends at IBM are calling one of their “signature solutions”: CFO Performance Insight. This reminded me of various blog posts of mine, such as: Reengineering the Forecasting Process with Predictive Models (Nov 2nd 2012) and Forecasting […]

IBM SPSS Syntax for File Operations

My start-up predictive analytics organization “Predictive Performers” wants to do some internal planning. We receive extract files from an accounting service each month that provide the total hours billed for each of our consultants, along with each consultant’s hourly rate. The files are saved to a folder on our network: The files also break out […]

IBM Vision 2013 – 2 thumbs Up!

I just returned from the IBM Vision Conference in Orlando, Florida. I attended a session in every available timeslot from Monday morning to Wednesday afternoon and it was worth every single minute of my time! Although there were too many sessions and presenters to mention, here are my “top picks”: Designing Solutions with IBM Cognos […]

IBM SPSS Statistics Syntax Best Practice

I recently audited the IBM course IBM SPSS Statistics Syntax I – ILO 0L406. In that course, you are introduced to the scripting language that IBM SPSS Statistics offers. It’s well worth your time. SPSS Syntax is a scripting language composed of a library of functions that can be used to modify, manage and analyze […]

Metadata Attributes

IBM SPSS Statistics offers many ways to help save time when analyzing data, particularly if you are continually performing the same types of analysis on similar sets or pools of data. TIME SAVERS “Metadata Attributes” – Data attributes have properties associated with them, and these properties are defined in metadata. During data analysis, documentation […]