Multiple Data Sources and Predictive Modeling

by Jim Miller on March 16th, 2012

Knowledge Bottleneck?

When building a predictive model, a larger number of examples, or “cases”, generally leads to a better model. Typically, these cases are spread across multiple data files (or data sources) that must be “stitched together”.

Accessing each data source, analyzing the cases it contains, and formatting and moving that information into a single “knowledge base” can be a labor-intensive process. This is referred to as the “knowledge bottleneck problem”.
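
To make the “stitching” step concrete, here is a minimal sketch in Python (standard library only). It is not SPSS syntax and not the author's method; the sample data, the case_id key, and the helper names read_cases and stitch are hypothetical, chosen only to show cases from two sources being combined into one knowledge base.

```python
import csv
import io

# Hypothetical sample data standing in for two separate data files.
CUSTOMERS_CSV = """case_id,age,region
1,34,East
2,51,West
3,29,East
"""

PURCHASES_CSV = """case_id,total_spend
1,120.50
3,75.00
4,310.25
"""


def read_cases(text, key="case_id"):
    """Read CSV text into a dict mapping the key column to its row."""
    return {row[key]: row for row in csv.DictReader(io.StringIO(text))}


def stitch(*sources):
    """Combine sources into one record per case, keeping every case seen."""
    knowledge_base = {}
    for source in sources:
        for case_id, row in source.items():
            # Add this source's variables to the case's combined record.
            knowledge_base.setdefault(case_id, {}).update(row)
    return knowledge_base


knowledge_base = stitch(read_cases(CUSTOMERS_CSV), read_cases(PURCHASES_CSV))
for case_id in sorted(knowledge_base):
    print(knowledge_base[case_id])
```

Cases that appear in only one source (here, cases 2 and 4) are kept with whatever variables are available. Whether to keep or drop such partial cases is a modeling decision this sketch does not make for you.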


IBM SPSS Statistics allows the data technician to work with more than one data source at the same time. You simply open each data source in a new SPSS Data Editor window:

When you first open a data source, it automatically becomes the “active dataset”.
You can change the active dataset simply by clicking anywhere in the Data Editor window of the data source that you want to use or by selecting the Data Editor window for that data source from the Window menu.
At least one Data Editor window must be open during a session. When you close the last open Data Editor window, SPSS Statistics automatically shuts down (prompting you to save changes first).

Accessing multiple data sources all at once allows you to:

Switch back and forth between open data files.
Compare the contents of different data files.
Copy and paste data between data files.
Create multiple subsets of cases and/or variables for analysis.
Merge multiple data sources from various data formats (for example, spreadsheet, database, text data) without saving each data source first.
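
The last two items, merging sources of different formats and creating subsets of cases, can be sketched outside SPSS as well. The Python example below (standard library only) is an illustration under stated assumptions, not SPSS functionality: the survey text, the in-memory accounts table, and the column names are all hypothetical. It merges a text source with a database source without writing either to disk first, then takes a subset of the merged cases.

```python
import csv
import io
import sqlite3

# Hypothetical text source: survey responses as CSV.
SURVEY_CSV = """case_id,satisfaction
1,4
2,2
3,5
"""

# Hypothetical database source, built in memory for the example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (case_id TEXT PRIMARY KEY, segment TEXT)")
conn.executemany(
    "INSERT INTO accounts VALUES (?, ?)",
    [("1", "retail"), ("2", "wholesale"), ("3", "retail")],
)

# Load the text source into a dict keyed by case_id.
survey = {row["case_id"]: row for row in csv.DictReader(io.StringIO(SURVEY_CSV))}

# Merge: add the database column to the matching survey case.
merged = []
query = "SELECT case_id, segment FROM accounts ORDER BY case_id"
for case_id, segment in conn.execute(query):
    if case_id in survey:
        merged.append({**survey[case_id], "segment": segment})
conn.close()

# Subset of cases: keep only the retail accounts.
retail_cases = [case for case in merged if case["segment"] == "retail"]
print(retail_cases)
```

Neither source is saved to an intermediate file before the merge, which mirrors the convenience described in the list above.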

Conclusion

Understanding your data is key to predictive modeling, and that requires rigorous data analysis. IBM SPSS Statistics supports this effort by letting you work with several data sources in a single session.


Jim Miller

Mr. Miller is an IBM certified and accomplished Senior Project Leader and Application/System Architect-Developer with over 30 years of extensive applications and system design and development experience. His current role is National FPM Practice Leader. His experience includes BI, Web architecture and design, systems analysis, GUI design and testing, database modeling, and the analysis, design, and development of Client/Server, Web and Mainframe applications and systems utilizing: Applix TM1 (including TM1 rules, TI, TM1Web and Planning Manager), dynaSight - ArcPlan, ASP, DHTML, XML, IIS, MS Visual Basic and VBA, Visual Studio, PERL, Websuite, MS SQL Server, ORACLE, SYBASE SQL Server, etc. His responsibilities have included all aspects of Windows and SQL solution development and design including: analysis; GUI (and Web site) design; data modeling; table, screen/form and script development; SQL (and remote stored procedures and triggers) development and testing; test preparation and management; and training of programming staff. Other experience includes development of ETL infrastructure such as data transfer automation between mainframe (DB2, Lawson, Great Plains, etc.) systems and client/server SQL Server and Web-based applications, and integration of enterprise applications and data sources. In addition, Mr. Miller has acted as Internet Applications Development Manager responsible for the design, development, QA and delivery of multiple Web sites including online trading applications, warehouse process control and scheduling systems, and administrative and control applications. Mr. Miller also was responsible for the design, development and administration of a Web-based financial reporting system for a 450 million dollar organization, reporting directly to the CFO and his executive team. Mr. Miller has also been responsible for managing and directing multiple resources in various management roles including project and team leader, lead developer and applications development director. Specialties include: Cognos/TM1 Design and Development, Cognos Planning, IBM SPSS and Modeler, OLAP, Visual Basic, SQL Server, Forecasting and Planning, International Application Development, Business Intelligence, Project Development. Certifications: IBM Certified Developer - Cognos TM1 (perfect score 100% on exam); IBM Certified Business Analyst - Cognos TM1.
