
Posts Tagged ‘statistics’

Ranking your Cases: IBM SPSS Statistics

Ranking A ranking is a relationship between a set of items such that, for any two items, the first is either “ranked higher than”, “ranked lower than” or “ranked equal to” the second. – Wikipedia   Ranking in SPSS Statistics IBM SPSS Statistics ranks cases in your data pond by automatically defining new variables to […]

Understanding the SPSS Crosstabs Procedure

The SPSS Statistics Crosstabs procedure forms two-way and multi-way tables (and provides a variety of tests and measures of association for the two-way tables). The structure of the table and whether categories are ordered determine what test or measure to use. Crosstabs’ statistics and measures of association are computed for two-way tables only. If you […]

SPSS Codebook

A codebook is a type of document used for gathering and storing codes. Originally codebooks were often literally books, but today codebook is a byword for the complete record of a series of codes, regardless of physical format. – Wikipedia The codebook command was introduced in IBM SPSS Statistics version 17. It provides information about the […]

SPSS Virtual Files

The power of SPSS allows the data scientist or predictive modeler to consume large data volumes. This data may come in smaller, manageable subsets or possibly huge “data ponds”. Depending upon the procedures you will be performing in your analysis, SPSS may reread the entire data set for each procedure. Of course, procedures that change […]

Automated Data Preparation (ADP) IBM SPSS Statistics Base

Automated Data Preparation (ADP) The seasoned data scientist knows that probably the single most important step in creating a predictive model is pinpointing the appropriate “data pond” and ensuring that it is properly “prepared”. I’ve written about the many “out of the box” tools that SPSS users can use to manage data, such as […]

IBM SPSS Add-On Modules

Serious Analytical Architect? Any serious analytic architect will need to at least be aware of the individual SPSS products offered and have at least a basic understanding of what each of them can do. Here is the list: IBM Showcase Report Writer Quickly create professional-looking, presentation-quality reports using intuitive, word processor-like page layout and […]

IBM SPSS Time

IBM SPSS defines each variable with a “TYPE”. By default, all variables in SPSS are assumed to be numeric until you change them. SPSS V20 currently supports the following variable types: Numeric and String (the most common), Comma, Dot, Scientific Notation, Date, Dollar, Custom Currency and Restricted Numeric. What day is it? Today I want […]

Predictive Model Engineering

Organizations interested in using analytics to predict outcomes will score data pools by applying an appropriate predictive model. Pre-built predictive models are becoming increasingly available in the marketplace. Data scientists who are knowledge experts in particular areas are developing models that have increasingly better success rates. However, the best approach may be for an […]

IBM SPSS Statistics – Data Management Toolset

IBM SPSS Statistics – Data Management Toolset (DMS) In a recent blog post I listed some of the more helpful “data management tools” offered within IBM SPSS Statistics version 20 (Case Summaries, Replace Missing Values, Transform and Compute, Recode, Select Cases, Sort Cases and Merge Files) and would like to review them today. These tools […]

SPSS Collaboration and Deployment Services

Last time I mentioned IBM SPSS Collaboration and Deployment Services and promised to talk more about it, so here we go: Analytical Assets Organizations positioning themselves to take full advantage of analytics will look to separate the effort of developing analytical assets from actually using them, dividing the work between “creators” and “consumers”. Generally speaking, an […]

Basic Data Analysis and IBM SPSS

The basic steps in data analysis might be simplified into (1) Identifying data, (2) Selecting an analysis and summarization method and (3) Presenting the results. Over the next couple of weeks I will look at using IBM SPSS version 20 to accomplish these tasks. Today, I want to focus on loading a data […]

Interoperability and PMML

If you work within the rapidly expanding analytics space, you will need to think about defining and sharing statistical models between applications. PMML (or Predictive Model Markup Language) is an XML-based language developed by the Data Mining Group (DMG) for this purpose. I’d like to pass on some of the essentials: The Basics PMML provides […]
