Discovering relationships within data (between fields) is an important part of any data mining project (in the CRISP-DM methodology, this is described as part of the “Data Understanding” stage). This “relationship discovery” is part of developing a predictive model but is also helpful in answering specific business questions, perhaps even what originally motivated […]
Posts Tagged ‘data collection’
All about CLEM
SPSS CLEM is the Control Language for Expression Manipulation, which is used to build expressions within SPSS Modeler streams. CLEM is used in a number of SPSS “nodes” (the Select and Derive nodes among them); check the product documentation for the full list. CLEM expressions are constructed from: Values, […]
Sampling Your Data
Another interesting feature of SPSS Modeler is its built-in ability to sample data. It is pretty typical to have (in one or more files) hundreds of thousands of records to process, and using complete sets of data during testing can take a huge amount of your time and is inefficient in terms of computer processing […]
TM1 vs. SPSS Modeler Comparison Continues – Setting to Flags
Consider the scenario where you have to convert information held in a “categorical field” into a “collection of flag fields” – found in a transactional source. For example, suppose you have a file of transactions that includes (among other fields) a customer identifier (“who”) and a product identifier (“what”). This file of transactional data indicates […]
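To make the scenario above concrete, here is a minimal sketch in plain Python of turning one categorical field into a set of flag fields, one row per customer. It is an illustration only, not how SPSS Modeler or TM1 implements the operation; the function name `set_to_flags`, the field names `customer` and `product`, and the sample rows are all hypothetical.

```python
from collections import defaultdict


def set_to_flags(transactions, id_field, category_field):
    """Return one row per id, with a True/False flag for every category value."""
    # Every distinct value of the categorical field becomes its own flag field.
    categories = sorted({row[category_field] for row in transactions})

    # Collect the set of category values seen for each id ("who" bought "what").
    seen = defaultdict(set)
    for row in transactions:
        seen[row[id_field]].add(row[category_field])

    # Build one output row per id; a flag is True if that value was seen.
    return [
        {id_field: key, **{f"{category_field}_{c}": c in values for c in categories}}
        for key, values in seen.items()
    ]


# Hypothetical transactional data: one row per purchase.
transactions = [
    {"customer": "C1", "product": "apples"},
    {"customer": "C1", "product": "bread"},
    {"customer": "C2", "product": "bread"},
]

flags = set_to_flags(transactions, "customer", "product")
for row in flags:
    print(row)
```

Each customer ends up with a single row, and each product value becomes its own flag field, so the result is easier to feed into a model than the original one-row-per-purchase layout.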
Performance Testing TM1Web Applications with HP LoadRunner
If you’ve ever attempted to run a performance test on a TM1 application, you know that there is no effective way to manually create sufficient load on a model. Getting “real users” to execute application operations – over and over again – is nearly impossible. Thankfully, an automated testing product can solve for […]
A Simple Analytical Architectural Strategy
Over the last month I’ve been taking a tactical view of analytics by focusing on some of the specific features of IBM SPSS Statistics, so today I have decided to think a bit more “strategically”. If your organization wants to begin leveraging survey-response type data, for example, what might be a reasonable approach? If I […]
SPSS Virtual Files
The power of SPSS allows the data scientist or predictive modeler to consume large data volumes. This data may come in smaller, manageable subsets or possibly in huge “data ponds”. Depending upon the procedures you will be performing in your analysis, SPSS may reread the entire data set for each procedure. Of course, procedures that change […]