Over time, I’ve written about keeping your TM1 model design “architecturally pure”. What this means is that you should strive to keep a model’s “areas of functionality” distinct within your design.
I believe that all TM1 applications, for example, are made up of only four distinct “areas of functionality”: absorption (of key information from external data sources), configuration (of assumptions about the absorbed data), calculation (where the specific “magic” happens, i.e. business logic is applied to the source data using the set assumptions) and consumption (of the information processed by the application, now ready to be reported on).
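To make the idea concrete, the four areas can be sketched as distinct stages of a pipeline. This is purely illustrative; the function names and data are hypothetical, not TM1 objects or syntax:

```python
# A hypothetical sketch of the four functional areas as distinct pipeline
# stages. Names and data are illustrative only, not actual TM1 syntax.

def absorb():
    """Absorption: pull key information from an external data source."""
    return [{"account": "sales", "period": "2024-01", "amount": 100.0}]

def configure():
    """Configuration: assumptions about the absorbed data."""
    return {"growth_rate": 0.05}

def calculate(source, assumptions):
    """Calculation: apply business logic to the source data using the assumptions."""
    return [
        {**row, "projected": row["amount"] * (1 + assumptions["growth_rate"])}
        for row in source
    ]

def consume(results):
    """Consumption: processed information, ready to be reported on."""
    return [f"{r['account']} {r['period']}: {r['projected']:.2f}" for r in results]

report = consume(calculate(absorb(), configure()))
print(report)  # ['sales 2024-01: 105.00']
```

The point of the sketch is that each stage has one job and a clean hand-off, so any one of them can be replaced or scaled without disturbing the others.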
Keeping functional areas distinct has many advantages.
Resist the Urge
There is always a tendency to “jump in” and “do it all” using a single tool or technology or, in the case of Cognos TM1, a few enormous cubes. Today, with every release of software, there are new “package connectors” that allow you to directly connect (even external) system components. In addition, you may “understand the mechanics” of how a certain technology works, which will allow you to “build” something, but without comprehensive knowledge of architectural concepts you may end up with something that does not scale, has unacceptable performance or is costly to sustain.
Some final thoughts:
Always remember, just because you “can” doesn’t mean you “should”.
Most organizations today have had successes implementing technology and they are happy to tell you about it. From a tactical perspective, they understand how to install, configure and use whatever software you are interested in. They are “practitioners”. But how many can bring a “strategic vision” to a project, or to your organization in general?
An “enterprise” or “strategic” vision is based upon an “evolutionary roadmap” that starts with the initial “evaluation and implementation” (of a technology or tool), continues with “building and using” and leads finally (hopefully) to the organization, optimization and management of all of the earned knowledge (with the tool or technology). You should expect that whoever you partner with can explain what their practice vision or methodology is or, at the least, speak to the “phases” of the evolution process:
The discovery and evaluation that takes place with any new tool or technology is the first phase of a practice’s evolution. A practice should be able to explain how testing is accomplished and what it covers. How did they determine whether the tool or technology to be used will meet or exceed your organization’s needs? Once a decision is made, are they practiced at the installation, configuration and everything else that may be involved in deploying the new tool or technology for use?
Once deployed, and “building and using” components with that tool or technology begins, the efficiency with which these components are developed, as well as their quality, will depend upon the level of experience (with the technology) that a practice possesses. Typically, “building and using” is repeated with each successful “build”, so how many times has the practice successfully used this technology? By human nature, once a solution is “built” and seems correct and valuable, it will be saved and used again. Hopefully, this solution will have been shared as a “knowledge object” across the practice. Although most may actually reach this phase, it is not uncommon to find:
At some point, usually while (or after) a certain number of solutions have been developed, a practice will “mature its development or delivery process” to the point that it begins investing time, and perhaps dedicating resources, to organizing, managing and optimizing its developed components (i.e. “organizational knowledge management”, sometimes known as IP or intellectual property).
You should expect a practice to have a recognized practice leader and a “governing committee” to help identify and manage knowledge developed by the practice and:
As I’ve mentioned, a practice needs to take a strategic or enterprise approach to how it develops and delivers, and to do this it must develop its “vision”. A vision will ensure that the practice is leveraging its resources (and methodologies) to achieve the highest rate of success today and over time. This is not simply “administrating the environment” or “managing the projects”; it involves structured thought, best practices and a continued commitment to improvement. What is your vision?
Tests are commonly categorized by where they are added in the software development process, or by how specific the test is.
Testing levels are classified by the test’s objectives. The common levels of testing are: unit, integration, system, acceptance, performance and regression.
Unit (or module) testing refers to testing that verifies a specific “section of code”, usually at a very basic level.
Integration testing focuses on validating the linkage between components within a solution.
Interface testing emphasizes the information that is passed between components in a solution (not to be confused with integration testing, which focuses on the actual component linkage).
System testing (often referred to as your “end-to-end” testing) refers to a completely integrated solution test – verifying that the solution really does meet requirements.
Acceptance testing is (perhaps) the last step, phase or “level” in your testing effort. This is when your solution is actually “distributed” to your user community for (hopefully) acceptance.
The objective of performance testing is to determine how well a particular component, or the entire solution, will perform given expected workloads.
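Of these levels, unit testing is the easiest to show in a few lines. The following sketch verifies one small “section of code” in isolation; the function under test is hypothetical, invented purely for illustration:

```python
import unittest

# A minimal sketch of unit (module) testing: one small "section of code"
# verified in isolation. apply_markup is a hypothetical function.

def apply_markup(price, markup_pct):
    """Return price increased by a percentage markup, rounded to cents."""
    if price < 0:
        raise ValueError("price must be non-negative")
    return round(price * (1 + markup_pct / 100), 2)

class TestApplyMarkup(unittest.TestCase):
    def test_basic_markup(self):
        self.assertEqual(apply_markup(100.0, 10), 110.0)

    def test_zero_markup(self):
        self.assertEqual(apply_markup(50.0, 0), 50.0)

    def test_negative_price_rejected(self):
        # Unit tests should also cover the error path, not just the happy path.
        with self.assertRaises(ValueError):
            apply_markup(-1.0, 10)

if __name__ == "__main__":
    unittest.main()
```

Integration and system tests then build upward from units like this one, exercising the linkages and the end-to-end flow rather than a single function.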
“True to form”, Splunk offers a downloadable “test kit” to help with your Splunk performance testing and tuning. “Splunkit” is an extendable app that streamlines an organization’s practice of performance testing by:
Splunkit is configurable – you can set the speed at which data is generated or use your own custom data. You can also set the number of simulated users and their specific usage patterns. Splunkit will work without a complicated setup – even with complicated directory structures or deployment configurations.
I always wondered about the security and outages of cloud applications and how that impacted the decisions of the Enterprise in using Cloud Services. I wonder if anyone thinks about it when signing up for a service. Typically it is low-cost and a small group is signing up for an application, so no one has to worry about it.
I was listening to the Gartner Webinar on this topic recently, and one thing stuck out which I had never thought about: what if the provider goes out of business? Obviously it does happen, even if not very often. Most of the applications in the cloud, especially the latest ones, hold quite a bit of enterprise-level data. Does anyone think about their cloud provider going out of business? I am sure we assume that disaster recovery is the service provider’s issue. Most of the cloud providers do have it, and the chances of outages in the cloud are comparable to those of enterprise applications. But if the provider goes out of business, what kind of timeline does one get for securing the data, and if the data is lost, what is the impact and remedy?
I remember my start-up days when, as a CTO, I backed up our code and deposited it in a bank vault as part of a source-code escrow obligation. The idea is that customers can keep the software running until they transition out. The customer may never change the code, but they asked for that assurance. All the enterprises I worked for had a Disaster Recovery (DR) strategy, and I remember the drills we had to go through during my DBA days (backup/restore, etc.).
The question to ask here is: does your DR strategy include cloud applications, to mitigate the impact of partial or permanent data loss and service interruptions?
Some of the more sophisticated companies do secure their cloud data through backups and web services that integrate it into their enterprise data. But I have not seen any contingent operating procedure for outages and service interruptions for cloud applications. We take it for granted that when it is in the cloud, it is always available.
If the application holds important data, it is imperative to have a plan for outages, security breaches and permanent data loss. If you haven’t classified your cloud data by sensitivity and established comprehensive risk-mitigation policies based on that sensitivity, the time to do so is now.
If you are interested in the Gartner Webinar here is one:
Presented by: Jay Heiser
Myths & Realities of Self-Service BI
The popularity of data visualization tools and cloud BI offerings are new forces to reckon with. I find it interesting to see how the perception of these tools compares with their usage in reality. Traditionally, IT likes control and centralized management, for the obvious reasons of accountability and quality of information. However, self-service BI tools and cloud offerings are accelerating departmental BI development. Some of the misconceptions from the early hype cycle are wearing off and the realities are becoming clearer.
Let’s look at some of the myths and realities…
Myth 1: Self-Service BI means anyone can create reports!
Self-Service BI is pitched as the solution for faster access to data. BI product vendors think anyone can develop reports and use them, but the truth is, analysts want to analyze, not create reports or dashboards. What they need is an easy way to analyze the data, ideally visually. Self-service does not mean everyone is on their own to create reports.
Myth 2: Self-Service BI means it is end of the traditional BI!
Almost every major BI vendor and data-management software player offers a visualization/in-memory tool alongside its traditional BI tools. Every tool has advantages and disadvantages based on its capabilities and usage. Forging a framework for data access, sharing and securing data appropriately is the key to leveraging these new technologies. IT can also learn from departmental successes, primarily their ability to create solutions in their own space and use the tool to further their cause, and apply those techniques in the traditional BI space as well.
Myth 3: Self-Service is new!
Well, Excel has always been the king of self-service BI. It was there before “Self-Service BI”, it is here now, and it will be here for the foreseeable future as well. So understanding self-service BI usage and its limits will help IT, and the entire organization, use this spectrum of tools efficiently.
Self-service has its place, and its limitations. It is great for data discovery, and who could do data discovery better than the business folks? Self-service BI is all about getting the data sooner rather than later to the business power user, not necessarily the end user. Use data discovery to validate the benefit, and integrate into the EDW or corporate centralized data once the application is proven.
In a nutshell, self-service BI is here to stay, as it always has been, but the key is to create a balanced governance structure to manage quality, reliability and security.
I heard about Bluemix recently and decided to give it a try. It is amazing that the DevOps environment is free to try out (30 days), which makes perfect sense for start-ups and individual developers. It also makes sense for corporations that want to check out a new technology and don’t want to wait for infrastructure setup. Signing up is very easy, with no credit card required.
Oh, did I mention it is from IBM? There is a plethora of tools available to try out. I like the Project Management Dashboard; it is ideal for collaboration, communication and keeping track of project progress.
Ok – it is a platform as a service from IBM.
It was interesting to see the number of tools and gadgets readily available. I did go through the webinars and such, but nothing beats trying it out. I was curious about iOS applications, so I created the Mobile environment and was able to try out a sample (polling) app; I could copy the sample code and start modifying it for my own use.
A word of caution: sign-up is a multi-step process. For each step you are going to get an email, and you need to create the environment. I don’t know why I have to look at email when I am already logged in – possibly a security feature.
I would like to see a cost comparison with the other PaaS offerings out there. I liked the simplicity and should compare this to Amazon or Azure – that will be my next item. Pricing can quickly hit the threshold, and it is all OpEx, which will be an issue for some companies. But if it cuts down POC time, or offers a “try before you buy” option instead of waiting for environment creation, it is ideal.
URL for Bluemix: bluemix.net
I’d love to find out if anyone has had a good or bad experience using it.
Traditionally, our information architectures have included a number of staging or intermediate data storage areas/systems. These have taken different forms over the years: publish directories on source systems, staging areas in data warehouses, data vaults, or most commonly, data file hubs. In general, these data file staging solutions have suffered from two limitations: high storage cost and, as a consequence, limited retention.
Hadoop reduces the cost per terabyte of storage by two orders of magnitude: data that consumed $100 worth of storage now costs $1 to store on a Hadoop system. This radical cost reduction enables enterprises to replace sourcing hubs with data lakes based on Hadoop, where a data lake can now house years of data vs. only a few months.
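The “two orders of magnitude” claim is simple arithmetic, and it translates directly into a longer retention window at the same budget. The dollar figures below are the illustrative ones from the text, not vendor prices:

```python
# The "two orders of magnitude" claim as arithmetic. The per-terabyte dollar
# figures are the illustrative ones from the text, not a vendor price list.

legacy_per_tb = 100.0   # $100 to store a unit of data on a traditional system
hadoop_per_tb = 1.0     # $1 to store the same data on Hadoop

reduction = legacy_per_tb / hadoop_per_tb
print(reduction)  # 100.0, i.e. 10**2: two orders of magnitude

# At a fixed storage budget, the retention window grows by the same factor,
# which is how "a few months" of history becomes years.
months_retained_legacy = 3
years_retained_hadoop = months_retained_legacy * reduction / 12
print(years_retained_hadoop)  # 25.0
```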
Next, once in the Hadoop filesystem (HDFS), the data can be published either directly to tools that consume HDFS data or to Hive (or another SQL-like interface). This enables end users to leverage analytical, data discovery and visualization tools to derive value from the data within Hadoop.
The simple fact that these data lakes can now retain historical data and provide scalable access for analytics also has a profound effect on the data warehouse; that effect will be the subject of my next few blogs.
This is a summary of an article from Database Trends and Applications (dbta.com). The author addresses fundamental mistakes that we make, or live with, in our database systems.
1. Poor or missing documentation for databases in PRODUCTION
We may have descriptive table and column names to begin with but, as the workforce turns over and a database grows, we can lose essential knowledge about the system.
A suggested approach is to maintain the data model in a central repository, and to run validation and quality metrics against it regularly to enhance the quality of the models over time.
2. Little or no normalization
All information in one table may be easier for data access but may not be the best option in terms of design. Understand normalization:
• 1st Normal Form – eliminate duplicate columns and repeating values in columns.
• 2nd Normal Form – remove subsets of data that apply to multiple rows and place them in separate tables, linked by keys.
• 3rd Normal Form – remove columns that are not dependent on the primary key, so every attribute depends on the key alone.
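A tiny sketch makes the trade-off concrete. The tables and names below are invented for illustration: a denormalized record set repeats customer details on every order, while the normalized version stores them once, so a change is a single update:

```python
# Illustrative sketch of normalization. Table and column names are made up.

# Denormalized: customer details repeat on every order row, so "all in one
# table" is easy to read but redundant and error-prone to update.
orders_flat = [
    {"order_id": 1, "customer": "Acme", "customer_city": "Boston", "item": "widget"},
    {"order_id": 2, "customer": "Acme", "customer_city": "Boston", "item": "gadget"},
]

# Normalized: the repeating customer data moves to its own table, and each
# order row keeps only a key pointing at it.
customers = {"C1": {"name": "Acme", "city": "Boston"}}
orders = [
    {"order_id": 1, "customer_id": "C1", "item": "widget"},
    {"order_id": 2, "customer_id": "C1", "item": "gadget"},
]

# A city change is now one update instead of one per order row, and the
# orders stay consistent when we "join" them back together.
customers["C1"]["city"] = "Chicago"
joined = [{**o, **customers[o["customer_id"]]} for o in orders]
print(joined[0]["city"], joined[1]["city"])  # Chicago Chicago
```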
3. Not treating the data model like a living breathing organism
Many people start with a good model when designing a database, and then throw it away as soon as the application is in production. The model should be updated as often as any new changes are applied on the database to communicate these changes effectively.
4. Improper storage of reference data
Store reference data in the model, or have a reference in the model that points to the reference data. Reference data is typically stored in several places or, worse, in application code, making it very difficult to change this information when the need arises.
5. Not using foreign keys or check constraints
Data quality is greatly increased by having referential integrity and validation checks defined right at the database level.
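As a quick sketch of this point, here is a foreign key and a CHECK constraint in SQLite (table names are hypothetical). The database itself rejects both an orphaned reference and an invalid value, without any application code:

```python
import sqlite3

# Sketch of database-level integrity: a foreign key enforces referential
# integrity, a CHECK constraint validates values. Tables are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite disables FK checks by default
conn.execute("CREATE TABLE dept (dept_id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("""
    CREATE TABLE employee (
        emp_id  INTEGER PRIMARY KEY,
        dept_id INTEGER NOT NULL REFERENCES dept(dept_id),
        salary  REAL CHECK (salary > 0)
    )""")
conn.execute("INSERT INTO dept VALUES (1, 'Engineering')")
conn.execute("INSERT INTO employee VALUES (10, 1, 85000)")  # valid row

# Both bad rows below are rejected by the database itself:
for bad_row in [(11, 99, 85000),   # dept 99 does not exist: FK violation
                (12, 1, -5)]:      # negative salary: CHECK violation
    try:
        conn.execute("INSERT INTO employee VALUES (?, ?, ?)", bad_row)
    except sqlite3.IntegrityError as err:
        print("rejected:", err)
```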
6. Not using domains and naming standards
Domains allow you to create reusable attributes so that users don’t have to recreate them each time they are needed. Naming standards increase the readability of a database and make it easier for new users to adapt to it. It is recommended to use proper names rather than abbreviations that a user has to decipher.
7. Not choosing primary keys properly
Choose a primary key wisely, because it is painful to try to correct it down the line. A simple principle is suggested when picking a primary key: SUM (Static, Unique, Minimal). So a social security number may not be the best primary key in some cases, because it may not always be unique and not everyone has one.
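One common way to satisfy SUM is a surrogate integer key, which SQLite hands out automatically. The sketch below (hypothetical table) shows rows keyed cleanly even when the “natural” candidate, the SSN, is missing:

```python
import sqlite3

# Sketch of the SUM principle (Static, Unique, Minimal): a surrogate integer
# key instead of a natural key like an SSN, which may be absent or reused.
# The table and data are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE person (
        person_id INTEGER PRIMARY KEY,   -- static, unique, minimal surrogate
        ssn       TEXT,                  -- nullable: not everyone has one
        name      TEXT NOT NULL
    )""")
conn.execute("INSERT INTO person (ssn, name) VALUES (NULL, 'Visitor A')")
conn.execute("INSERT INTO person (ssn, name) VALUES (NULL, 'Visitor B')")
rows = conn.execute("SELECT person_id, name FROM person ORDER BY person_id").fetchall()
print(rows)  # [(1, 'Visitor A'), (2, 'Visitor B')]
```

Both visitors lack an SSN, yet each still gets a stable, unique key that will never need to change.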
Happy database modeling!
The OpenPages GRC platform includes 5 main “operational modules”. These modules are each designed to address specific organizational needs around Governance, Risk, and Compliance.
Operational Risk Management module “ORM”
The Operational Risk Management module is a document and process management tool which includes a monitoring and decision support system enabling an organization to analyze, manage, and mitigate risk simply and efficiently. The module automates the process of identifying, measuring, and monitoring operational risk by combining all risk data (such as risk and control self-assessments, loss events, scenario analysis, external losses, and key risk indicators (KRI)), into a single place.
Financial Controls Management module “FCM”
The Financial Controls Management module reduces the time and resource costs associated with compliance for financial reporting regulations. This module combines document and process management with awesome interactive reporting capabilities in a flexible, adaptable, easy-to-use environment, enabling users to easily perform all the activities necessary for complying with financial reporting regulations.
Policy and Compliance Management module “PCM”
The Policy and Compliance Management module is an enterprise-level compliance management solution that reduces the cost and complexity of compliance with multiple regulatory mandates and corporate policies. This module enables companies to manage and monitor compliance activities through a full set of integrated functionality:
IBM OpenPages IT Governance module “ITG”
This module aligns IT services, risks, and policies with corporate business initiatives, strategies, and operational standards, allowing the management of internal IT controls and risk according to the business processes they support. In addition, this module unites “silos” of IT risk and compliance, delivering visibility, better decision support, and ultimately enhanced performance.
IBM OpenPages Internal Audit Management module “IAM”
This module provides internal auditors with a view into an organization’s governance, risk, and compliance, affording the chance to supplement and coexist with broader risk and compliance management activities throughout the organization.
Together, the IBM OpenPages GRC Platform modules (“ORM”, “FCM”, “PCM”, “ITG” and “IAM”) deliver a superior solution for Governance, Risk, and Compliance. More to come!
When preparing to deploy the OpenPages platform, you’ll need to follow these steps:
Depending upon your needs, you may find that you’ll want to use separate servers for your application, database and reporting servers. In addition, you may want to add additional application or reporting servers to your topology.
After the topology is determined, you can use the following information to prepare your environment. I recommend clean installs, meaning starting with fresh or new machines (VMs are just fine: “The VMware performance on a virtualized system is comparable to native hardware. You can use the OpenPages hardware requirements for sizing VM environments” – IBM).
(Note – this applies if you’ve chosen to go Oracle rather than DB2):
MS Windows Servers
All servers that will be part of the OpenPages environment must have the following installed before proceeding:
The Database Server
In addition to the above “all servers” software, your database server will require the following software:
Again, in addition to the above “all servers” software, the server that hosts the OpenPages application modules should have the following software installed:
o IBM WebSphere Application Server ND 126.96.36.199 and any higher Fix Pack. Note: the minimum requirement is WebSphere 188.8.131.52.
o Oracle WebLogic Server 10.3.2 and any higher Patch Set. Note: the minimum requirement is Oracle WebLogic Server 10.3.2. This is a prerequisite only if your OpenPages product does not include Oracle WebLogic Server.
The server that you intend to host the OpenPages CommandCenter must have the following software installed (in addition to the above “all servers” software):
During the OpenPages Installation Process
As part of the OpenPages installation, the following is installed automatically:
For Oracle WebLogic Server & IBM WebSphere Application Server environments:
If your OpenPages product includes the Oracle WebLogic Server:
If your OpenPages product includes the Oracle Database: