If you have a TM1 background, it is a quick step to using SPSS Modeler, especially if you look for similarities in how the two tools handle certain tasks, for example, importing data.
With TM1, source data is transformed and loaded into cube structures for consolidation, modeling, and reporting using its ETL tool, TurboIntegrator. In SPSS Modeler, source data is loaded and transformed through a “stream of nodes” for modeling and reporting.
Here is a closer look:
Sourcing Data
Cognos TM1 uses TurboIntegrator as its data import mechanism. TurboIntegrator (referred to as “TI”) is a programming or scripting tool that allows you to automate data importation into a TM1 application. Scripts built with TI can be saved, edited and, through the use of chores, set up to run at regular intervals.
Through the use of TI’s Data Source feature, you can import data or information external to TM1, such as:
- Industry standard comma-delimited (CSV) text files including ASCII files.
- Information stored in relational databases (accessible through an ODBC data source).
- Other OLAP cubes, including Cognos TM1 cubes and views.
- Microsoft® Analysis Services.
- SAP via RFC.
Rather than using scripts, SPSS Modeler utilizes data import nodes, which are all found on the SPSS Sources palette.
SPSS Modeler source nodes allow you to read data from:
- Industry standard comma-delimited (CSV) text files including ASCII files.
- XML files (wish TM1 did this!)
- Statistics Files
- SAS
- Excel
- Databases (DB2®, Oracle®, SQL Server®, and a variety of other databases), supported via ODBC
- Other OLAP cubes, including Cognos TM1 cubes and views.
Components of a Script
To create a new TI script (using TM1 Server Explorer), you just right-click on “Processes” and select “Create New Process”. (TM1 then opens the “TurboIntegrator” dialog showing a new, “blank” process ready for you to modify.)
Within any TI process, there are five visible “tabs”:
- The “Data Source” tab,
- The “Variables” tab,
- The “Maps” tab (this tab is initially greyed out),
- The “Advanced” tab and
- The “Schedule” tab.
Components of a Node
In SPSS, to create a node, you simply “drag” the appropriate node from the sources palette into your current “stream”.
Data source nodes also have the concept of “tabs” – and each source node will have four:
- The Data Tab,
- The Filter Tab,
- The Type Tab, and
- The Annotations Tab.
Configuring for Import
With TM1, you use the Data Source tab to identify the source from which you want to import data to TM1. The fields and options available on the Data Source tab will then vary according to the Datasource “type” that you select.
In SPSS, you use the File tab to set the options for the data import; the dialog will then be specific to the type of data being imported.
Data Structuring
Database table columns, or “file fields,” are handled on the TM1 TurboIntegrator “Variables” tab. If the process has a data source defined, each column in that data source becomes a variable on the Variables tab. Based upon the selected data source, TM1 attempts to set up the variables for you, assigning a name, a storage type, and a “purpose” or “content” to each variable (each of which you can override).
Similarly, Modeler also requires a “rectangular” data structure: records (rows of the data table) and fields (columns of the data table), which are handled on the Data tab. Again, based upon the data source, options on the Data tab allow you to override the specified storage type for fields as they are imported (or created).
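As a rough illustration of what both tools do when they assign storage types to incoming columns, here is a minimal Python sketch. The sample data and the `infer_storage_type` helper are hypothetical stand-ins, not part of either product:

```python
import csv
import io

def infer_storage_type(values):
    """Guess a storage type for a column, loosely analogous to how TM1
    and Modeler assign types to imported fields (illustrative only)."""
    try:
        for v in values:
            float(v)
        return "numeric"
    except ValueError:
        return "string"

# Hypothetical sample standing in for an imported CSV file.
sample = "Region,Month,Sales\nNorth,Jan,1200\nSouth,Feb,950\n"
rows = list(csv.reader(io.StringIO(sample)))
header, data = rows[0], rows[1:]

# Transpose the records into columns and type each one.
types = {name: infer_storage_type(col)
         for name, col in zip(header, zip(*data))}
print(types)  # {'Region': 'string', 'Month': 'string', 'Sales': 'numeric'}
```

In both tools, of course, this inference happens for you; the point is that each column of the rectangular source becomes a typed variable (TM1) or field (Modeler) that you can then override.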
Logic and Processing
To “process” imported data, TurboIntegrator processes include several “procedure tabs”: Prolog, Metadata, Data, and Epilog (sub-tabs of the “Advanced” tab). When you run a TurboIntegrator process, the procedures are executed in sequence (Prolog, Metadata, Data, and Epilog) and let you apply data transformation (or other) logic to each record of data in the source, using lines of script built from various predefined functions.
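That execution order can be sketched in Python as a simplified analogy; this is not TM1's actual engine, and the record layout and callbacks are invented for illustration:

```python
def run_ti_process(records, prolog, metadata, data, epilog):
    """Sketch of TurboIntegrator's execution order (not the real engine):
    Prolog runs once, Metadata runs once per record in a first pass,
    Data runs once per record in a second pass, Epilog runs once."""
    prolog()
    for rec in records:                        # first pass: metadata
        metadata(rec)
    results = [data(rec) for rec in records]   # second pass: data load
    epilog()
    return results

# Hypothetical transformation: uppercase a region code in each record.
source = [{"region": "north", "sales": 1200},
          {"region": "south", "sales": 950}]
out = run_ti_process(
    source,
    prolog=lambda: None,
    metadata=lambda rec: None,
    data=lambda rec: {**rec, "region": rec["region"].upper()},
    epilog=lambda: None,
)
print(out[0]["region"])  # NORTH
```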
In SPSS, the Filter tab can be used to rename or exclude fields (from the source file) and, using the Type tab, you can specify field properties and logic such as:
- Usage typing,
- Options for handling missing values and system nulls,
- Setting the role of a field,
- Specifying values for a field, as well as options used to automatically read values from the dataset, and
- Specifying field and value labels.
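As a loose analogy for what the missing-value options on the Type tab do declaratively, here is a small Python sketch. The missing-value markers and fill value are assumptions chosen for illustration (“$null$” mirrors how Modeler displays system nulls):

```python
def clean_numeric_field(values, missing_markers=("", "$null$"), fill=0.0):
    """Replace missing or system-null entries with a fill value and
    convert the rest to numbers (an illustrative stand-in for the
    declarative missing-value handling on Modeler's Type tab)."""
    return [fill if v in missing_markers else float(v) for v in values]

print(clean_numeric_field(["1.5", "", "3.0", "$null$"]))
# [1.5, 0.0, 3.0, 0.0]
```

In Modeler itself you would set these choices per field in the Type tab dialog rather than writing code.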
Using the Data
On the Data tab of the TI process, data is then written to cube cells (using the CellPutN function), where it is available for further analysis.
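To picture what a CellPutN call on the Data tab accomplishes, here is a toy Python analogy that treats a cube as a dictionary keyed by element tuples. The function and the cube itself are stand-ins for illustration, not TM1 code:

```python
# A toy "cube": a dict keyed by a tuple of element names, one per
# dimension, mimicking what a Data-tab line like
# CellPutN(value, 'Sales', region, month) accomplishes in TM1.
cube = {}

def cell_put_n(value, cube, *elements):
    """Write a numeric value to the cell addressed by the element
    tuple (a stand-in for TM1's CellPutN)."""
    cube[elements] = value

cell_put_n(1200, cube, "North", "Jan")
cell_put_n(950, cube, "South", "Feb")
print(cube[("North", "Jan")])  # 1200
```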
In SPSS Modeler, the source node you used to import your data is “connected” to one (or more) nodes for additional processing or analysis.
Keep in mind that in TM1, the TurboIntegrator processes you create can be saved and rerun for new data files. In SPSS Modeler, you create and save streams (that begin with your source node) that can be rerun for new data.
Conclusion
When learning a new tool, equate what you learn to what you already knew – there will be plenty of similarities!