Perficient Business Intelligence Solutions Blog


Posts Tagged ‘bi’

Qlik leadership – vision, guts and glory… hopefully

“It is not the strongest of the species that survives, nor the most intelligent that survives. It is the one that is most adaptable to change.” – Supposedly Darwin from ‘Origin of Species’… or NOT

According to the most recent report from Gartner, no single vendor is fully addressing the critical market space of “governed data discovery” – that is, meeting both business users’ requirements for ease of use and enterprises’ IT-driven requirements for governance. So, who will be the most adaptable to change and embrace the challenges of an ever-changing and increasingly demanding BI and Analytics market?

This year, Qlik plans to release a completely re-architected product – QlikView.Next – that will provide a new user experience, known as ‘Natural Analytics’. The lofty goal of QlikView.Next and this Natural Analytics approach is to provide both business-user-oriented and IT-friendly capabilities. According to Gartner, this approach has ‘the potential to make Qlik a differentiated and viable enterprise-standard alternative to the incumbent BI players.’

Will QlikView.Next be able to deliver the combination of business-user and IT capabilities that is currently lacking in the market? Will Qlik be able to reinvent itself with Natural Analytics and deliver the “governed data discovery” solution that the market so desperately needs? Only time will tell; however, Qlik is definitely showing all the traits of a real leader in the BI and Analytics space – once again setting the bar pretty high. The vision and the guts are definitely there and accounted for. Will glory follow? That will depend on execution and delivery.

However, QlikView.Next is more than a year behind its scheduled release… so, for now, we’ll have to rely on past behavior. Back in 2006, when Qlik carved out its space on Gartner’s Magic Quadrant for Analytics and BI platforms (BusinessWire article), it positioned itself in the ‘Visionaries’ quadrant, and it has been delivering on that vision ever since. For about eight years, Qlik has been executing on its vision for Business Intelligence – user-driven BI, or Business Discovery. Given this track record, I have reason to believe that Qlik will be able to deliver on its vision once again.

I also believe that leadership is all about having a vision, along with the guts and ability to execute on that vision. That is probably one of the reasons why Gartner came up with quadrants that organize vendors along two dimensions – ‘Completeness of Vision’ and ‘Ability to Execute’. For the past few years, thanks to its ability to execute and deliver on its vision, QlikView has worked its way into the Leaders quadrant and secured its position (GMQ 2014) – demonstrating excellence in both vision and execution. So, how is Qlik planning to execute on its vision over the next few months – what’s .Next?

Well, there are several features worth mentioning, but we’ll only be able to review a few of them here.


Tag Splunk, you’re it!

Splunk does a wonderful job of searching through all of the data you’ve indexed, based upon your search command pipeline. There are times, though, when you may want to add intelligence to the search that Splunk cannot derive on its own – perhaps information specific to your organizational structure, like host names or server names. Rather than typing this information into the search pipeline each time, you can create a knowledge object in the form of a Splunk search tag.

Search Tags

To help you search more efficiently for particular groups of event data, you can assign one or more tags to any field/value combination (including event type, host, source, or source type) and then base your searches on those tags.

Tagging field value pairs

You can use Splunk Web to create your tags directly from your search results. As an example, I’ve indexed multiple Cognos TM1 server logs into my Splunk server. These logs are generated by many different TM1 Admin servers but are all indexed by one Splunk server. If I’d like the ability to search a particular server source without having to qualify it in each of my searches, I can create a tag for that server.

In the search results, I can select any event that has the field/value pair I want to tag, then:

1. Click on the arrow next to that event:

[screenshot: tag1]

2. Under Actions, click on the arrow next to that field value:

[screenshot: tag2]

3. Now select Edit Tags:

[screenshot: tag3]

4. Create your tag and click Save:

[screenshot: tag4]

In my example, I created a tag named “TM1-2” that specifies a particular TM1 server source. In the future, I can then use that tag to further narrow my search and isolate events that occurred only in that server log:

[screenshot: tag5]

tag=TM1-2 product x plan

You can use the tag to narrow down the search (like in my example above) by using the following syntax:

tag=<tagname>

Or, you can even further narrow down your search by associating your tag to a specific field using the following syntax:

tag::<field>=<tagname>
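
For instance, assuming the TM1-2 tag from my example is attached to the source field, the tag simply becomes another search term that can be combined with keywords and commands (the error keyword and the stats clause here are purely illustrative):

tag::source=TM1-2 error | stats count by host

This would count the tagged server’s error events by host.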

Use wildcards to search for tags

As a Splunk Master, you can “get wild” and use the asterisk (*) as a wildcard when searching using your Tags. For example, if you have multiple event-type tags for various types of TM1 servers, such as TM1-1 and TM1-99, you can search for all of them with:

tag::eventtype=TM1-*

If you wanted to find all hosts whose tags contain “22”, you can search for the tag:

tag::host=*22*

Here is an interesting example that I have yet to utilize (although you’ll find it in several places in the Splunk documentation): if you wanted to find events whose event types have no tags associated with them, you can search for the Boolean expression:

NOT tag::eventtype=*

Wildcards in general

Wildcard support makes searching very flexible; however, it is important to understand that the more flexible (or less specific) your Splunk searches are, the less efficient they become. Take care when using wildcards within your searches.
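
To make that concrete, a hedged before-and-after sketch (the index name is hypothetical): the first search matches every event that carries any host tag, while the second constrains the index, the tag pattern and a keyword, so far fewer events have to be retrieved and scanned:

tag::host=*

index=tm1 tag::host=TM1-* error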

Splunk On! 

 

QlikView… QlikTech… Qlik…

Several years ago, when I started using QlikView (QlikTech’s flagship product), I had a strong preference for more traditional BI tools and platforms, mostly because I thought that QlikView was just a visualization tool. But after some first-hand experience with the tool, any bias I had quickly dissipated, and I’ve been a QlikView fan – fulfilling the role of Senior QlikView Architect on full-lifecycle projects – for a while now.

Today, Qlik Technologies (also known as QlikTech or simply Qlik) is the third-fastest-growing tech company in the US (according to a Forbes article), but my personal journey with QlikView – and probably QlikTech’s journey as well – has not always been easy: a paradigm shift in the way we look at BI is required. Most importantly, I came to understand, along with many others, that this isn’t a matter of QlikView or SAP BI, of QlikView’s agile approach to BI or traditional BI – it is NOT a matter of ORs, but rather a matter of ANDs.

It is a matter of striking the right balance with the right technology mix and doing what is best for your organization, setting aside personal preferences. At times QlikView may be all that is needed; in other cases, the right technology mix is a must. At times ‘self-service’ and ‘agile’ BI is the answer… and at times it isn’t. Ultimately, it all revolves around the real needs of your organization and creating the right partnerships.

So far, QlikTech has been able to create a pretty healthy ecosystem with many technology partners, from a wide variety of industries and with a global reach. QlikTech has been able to evolve over time and has continued to understand, act on and metabolize the needs of the market, along with the needs of end-users and IT – I wonder what’s next.

That’s one of the reasons why Qlik has been able to trail-blaze a new approach to BI: user-driven BI, i.e. Business Discovery. According to Gartner, ‘Qlik’s QlikView product has become a market leader with its capabilities in data discovery, a segment of the BI platform market that it pioneered.’

Gartner defines QlikView as ‘a self-contained BI platform, based on an in-memory associative search engine and a growing set of information access and query connectors, with a set of tightly integrated BI capabilities’. This is a great definition that highlights a few key points of this tool.

In coming blogs, we’ll explore some additional traits of QlikTech and its flagship product QlikView, such as:

  • An ecosystem of partnerships – QlikTech has been able to create partnerships with several Technology Partners and set in place a worldwide community of devotees and gurus

  • Mobility – QlikView was recently named ‘Hot Vendor’ for mobile Business Intelligence and ranks highest in customer assurance (see WSJ article here) with one of the best TCO and ROI

  • Cloud – QlikView has been selected as a cloud-based solution by several companies and it has also created strong partnerships with leading technologies in Cloud Computing, such as Amazon EC2 and Microsoft Azure

  • Security – provided at the document, row and field levels, as well as at the system level utilizing industry standard technologies such as encryption, access control mechanisms, and authentication methods

  • Social Business Discovery – Co-create, co-author and share apps in real time, share analysis with bookmarks, discuss and record observations in context

  • Big Data – Qlik has established partnerships with Cloudera and Hortonworks. In addition, according to the Wall Street Journal, QlikView ranks number one in BI and Analytics offering in Healthcare (see WSJ article here), mostly in connection with healthcare providers seeking “alternatives to traditional software solutions that take too long to solve their Big Data problems”

 

In future posts, I am going to examine and dissect each of these traits and more! I am also going to make sure we have some reality checks set in place in order to draw the line between fact and fiction.

What other agile BI or visualization topics would you like to read about or what questions do you have? Please leave comments and we’ll get started.

Searching with Splunk

It would be remiss to write a blog post on Splunk searching without at least mentioning the version 6.0 Search dashboard.

The Search dashboard

If you take a look at the Splunk Search dashboard (and you should), you can break it down into four areas:

  • Search Bar. The search bar is a long textbox that you can enter your searches into when you use Splunk Web.
  • Range Picker. Using the (time) range picker you set the period over which to apply your search. You are provided with a good supply of preset time ranges that you can select from, but you can also enter a custom time range.
  • How-To. This is a Splunk panel that contains links you can use to access the Search Tutorial and the Search Manual.
  • What-To. This is another Splunk panel that displays a summary of the data that is installed on this Splunk instance.

[screenshot: as1]

The New Search Dashboard

After you run a new search, you’re taken to the New Search page. The search bar and time range picker are still available in this view, but the dashboard updates with many more elements, including search action buttons, a search mode selector, counts of events, a job status bar, and results tabs for Events, Statistics, and Visualizations.

Generally Speaking

All searches in Splunk take advantage of the indexes that were set up on the data you are searching. Indexes exist in every database, and Splunk is no exception. Splunk’s indexes organize words or phrases in the data over time. Successful Splunk searches (those that yield results) return records (events) that meet your search criteria. The more matches Splunk finds in your data (the more events it returns), the greater the impact on overall search performance, so it is important to be as specific in your searches as you can.

Before I “jump in”, the following are a few things worth alerting you to (with a quick illustration after the list):

  • Search terms are case insensitive.
  • Search terms are additive.
  • Only the time frame specified is queried.
  • Search terms are words, not parts of words.
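
Here is that quick illustration (the terms are hypothetical): searching for fail returns events containing the word fail but not the word failed, so to catch both you would use a wildcard – and because terms are additive, the second search below returns only events that contain both a fail* term and the word error:

fail*

fail* error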

Splunk Quick Reference Guide

To all of us future Splunk Masters: Splunk has a Quick Reference Guide (updated for version 6.0) available for download in PDF format from the company website:

www.splunk.com/web_assets/pdfs/secure/Splunk_Quick_Reference_Guide.pdf.

I recommend having a look!

To Master Splunk, you need to master Splunk’s search language, which includes an almost endless array of commands, arguments and functions. To help with this, Splunk offers its searching assistant.

The Splunk searching assistant uses “typeahead” to “suggest” search commands and arguments as you are typing into the search bar. These suggestions are based on the content of the datasource you are searching and are updated as you continue to type. In addition, the searching assistant will also display the number of matches for the search term, giving you an idea of how many search results Splunk will return.

The image below shows the Splunk searching assistant in action. I’ve typed “TM1” into the search bar and Splunk has displayed every occurrence of these letters it found within my datasource (various Cognos TM1 server logs) along with a “hit count”:

[screenshot: as2]

The search assistant uses Python to perform a reverse-url-lookup to return description and syntax information as you type. You can control the behavior of the searching assistant with UI settings in the Search-Bar module, but it is recommended that you keep the default settings and use it as a reference.

Some Basic Optimization

Searching in Splunk can be done from Splunk Web, from the command line interface (CLI) or via the REST API. When searching using the web interface you can (and should) optimize the search by setting the search mode (Fast, Verbose or Smart).

Depending on the search mode, Splunk automatically discovers and extracts fields other than the default fields, returns results as an events list or a table, and runs the calculations required to generate the event timeline. This “additional work” can affect performance, so the recommended approach is to use Fast mode while you conduct your initial search discovery (with the help of the searching assistant), and then move to either Verbose or Smart mode, depending upon your specific requirements and the outcome of your discovery searching.
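
As a rough sketch of that workflow (the sourcetype and search term are hypothetical): run a broad search in Fast mode first just to see what is coming back, then tighten the terms and the time range before switching to Smart or Verbose mode for full field discovery:

sourcetype=tm1* | head 100

sourcetype=tm1* error earliest=-24h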

Time-out

I should probably stop here (before this post gets any longer) – but stay tuned; my next post is already written and “full of Splunk” …

Thank you for downloading Splunk Enterprise. Get started now…

[screenshot: install_image_002]

Once you have found your way to the (Splunk.com) website and downloaded your installation file, you can initiate the installation process. At this point you will also have received the “Thank You for Downloading” welcome email.

More than just a sales promotion, this email gives you valuable information about the limitations of your free Splunk Enterprise license, as well as links to help you get started quickly, including:

  • Online Tutorials
  • Free live training with Splunkers
  • Educational videos
  • Etc.

The (MS Windows) Installation

On MS Windows, once your download is complete, you are prompted to Run.

[screenshot: install_image_003]


Give Me Splunk!

So you are ready to Splunk and you want to get started? Well…

Taking the First Step

Your first step, before you download any installation packages, is to review the Splunk Software License Agreement, which you can find at splunk.com/view/SP-CASSSFA (and if you don’t check it there, the Splunk install drops a copy for you in the installation folder – in both .RTF and .TXT formats). Although you can download a free, full-featured copy of Splunk Enterprise, the agreement governs its installation and use, and it is incumbent upon you to at least be aware of the rules.

Next, as with any software installation, you must make time to review your hardware to make sure that you can run Splunk in a way that meets your expected objectives. Although Splunk is a highly optimized application, a good recommendation is this: if you are planning to evaluate Splunk for eventual production deployment, use hardware typical of the environment you intend to deploy to. In fact, the hardware you use for your evaluation should meet or exceed the recommended hardware capacity specifications for the tool and (your) intentions (check the Splunk.com website or talk to a Splunk professional to be sure what these are).

Disk Space Needs

Beyond the physical footprint of the Splunk software (which is minimal), you will need some Splunk “operational space”. When you read data into Splunk, it creates a compressed, indexed version of that “raw data”, and this file is typically about 10% of the size of the original data. In addition, Splunk creates index files that “point” to the compressed file. These associated “index files” can range in size – from approximately 10% to 110% of the rawdata file – based on the number of unique terms in the data. Rather than get into sizing specifics here, just note that if your goal is “education and exploration”, go ahead and install Splunk on your local machine or laptop – it’ll be just fine.
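
As a rough, back-of-the-envelope illustration of those numbers (the volumes are hypothetical): indexing 10 GB of raw log data would typically produce a compressed rawdata file of roughly 1 GB, plus index files of somewhere between 1 GB and 11 GB – so plan on roughly 2 to 12 GB of Splunk storage for that data set.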

Go Physical or Virtual?

Most organizations today run a combination of both physical and virtual machines. Without getting into specifics here, it is safe to say that Splunk runs well on both; however (as with most software), it is important that you understand the needs of the software and be sure that your machine(s) are configured appropriately. The Splunk documentation reports:

“If you run Splunk in a virtual machine (VM) on any platform, performance does degrade. This is because virtualization works by abstracting the hardware on a system into resource pools from which VMs defined on the system draw as needed. Splunk needs sustained access to a number of resources, particularly disk I/O, for indexing operations. Running Splunk in a VM or alongside other VMs can cause reduced indexing performance”.

Let’s get the software!

Splunk Enterprise (version 6.0.2 as of this writing) can run on both MS Windows and Linux, but for this discussion I’m going to focus only on the Windows version. Splunk is available in both 32- and 64-bit architectures, and it is always advisable to check the product details to see which version is correct for your needs.

Assuming that you are installing for the first time (not upgrading), you can download the installation file (.msi for Windows) from the company website (www.splunk.com). I recommend that you read through the release notes for the version that you intend to install before downloading. Release notes list the known issues along with potential workarounds, and being familiar with this information can save you plenty of time later.

[Note: If you are upgrading Splunk Enterprise, you need to visit the Splunk website for specific instructions before proceeding.]

Get a Splunk.com Account

To actually download (any) version of Splunk, you need to have a Splunk account (and user name). Earlier, I mentioned the idea of setting up an account that you can use for educational purposes and support. If you have visited the website and established your account, you are ready; if not, you need to set one up now.

  1. Visit Splunk.com.
  2. Click on “Sign Up”.

Once you have an account, you can click on the big, green button labeled “Free Download”. From there, you will be directed to the “Download Splunk Enterprise” page, where you can click on the link of the Splunk version you want to install.

From there, you will be redirected to the “Thank You for downloading…” page and be prompted to save the download to your location:

[screenshot: installSplunk1]

And you are on your way!

Check back and I’ll walk you through a typical MS Windows install (along with some helpful hints that I learned during my journey to Splunk Nirvana)!

 

Where and How to Learn Splunk

“Never become so much of an expert that you stop gaining expertise.” – Denis Waitley

In all professions, and especially in information technology (IT), success and marketability depend upon an individual’s propensity for continued learning. With Splunk, there exist a number of options for increasing your knowledge and expertise. The following are just a few. We’ll start with the obvious choices:

  • Certifications,
  • Formal training,
  • Product documentation and
  • The company’s website.

Certifications

Similar to most mainstream technologies, Splunk offers various certifications; as of this writing, Splunk categorizes them into the following generalized areas:

The Knowledge Manager

A Splunk Knowledge Manager creates and/or manages knowledge objects that are used in a particular Splunk project, across an organization or within a practice. Splunk knowledge objects include saved searches, event types, transactions, tags, field extractions and transformations, lookups, workflows, commands and views. A knowledge manager will not only have a thorough understanding of Splunk – the interface, general use of search and pivot, and so on – but will also possess the “big picture” view required to extend the Splunk environment through the management of the Splunk knowledge object library.

The Administrator

A Splunk Administrator is required to support the day-to-day “care and feeding” of a Splunk installation. This requires “hands-on” knowledge of best practices and configuration details, as well as the ability to create and manage Splunk knowledge objects in a distributed deployment environment.

The Architect

The Splunk Architect role combines knowledge management expertise and administration know-how with the ability to design and develop Splunk Apps. Architects must also be able to focus on larger deployments, learning best practices for planning, data collection, sizing and documenting in a distributed environment.


Cognos TM1 Performance Review – on a budget!

Often I am asked to conduct a “performance review” of implemented Cognos TM1 applications “rather quickly” when, realistically, a detailed architectural review must be extensive and takes some time. Generally, if there is a limited amount of time, you can use the following suggestions as appropriate areas to focus on (until a formal review is possible):

  1. Locking and Concurrency. Ensure that there are no restrictions or limitations that may prevent users from performing tasks in one part of the application because other users are “busy” exercising the application somewhere else or, specifically, limit the exposure that would allow one user or feature to adversely affect another.
  2. Batch (TurboIntegrator) Processing Time. This is the time it takes for critical TurboIntegrator scripts to complete.
  3. Application Size. The overall size of the application should be reviewed to determine if it is “of a reasonable size”, based upon the current and future expectations of the application (there are a lot of factors that will cause memory consumption to be less than optimal but, given time, a cursory check is recommended to identify any potential offenders).

Specific Review Areas

Based upon timing and the above outlined objectives, you might proceed by looking at the following:

  • The TM1 server configuration settings
  • The number of and the dimensionality of the implemented cubes – specifically the cubes currently consuming the largest amounts of memory.
  • Implemented security (at a high level)
  • All TurboIntegrator processes that are considered to be part of critical application processing.

Start by

Taking a quick look at the server configuration settings (tm1s.cfg) and following up on any non-default settings – who changed them and why? Do they make sense? Were they validated as having the expected effects?

Cubes and dimensions should be reviewed. Are there excessive views or subsets? Are there any cubes with an extraordinary number of dimensions? Has the dimension order been optimized? Are any dimensions particularly large? And so on.

Security can be complex and compound, or simple and straightforward. Given limited time, I usually check the number of roles (groups) vs. the number of users (clients) – hint: never have more groups than clients – and things like naming conventions and how security is maintained. In addition, I am always uneasy when I see cell-level security implemented.

More considerations:

As part of a “quick review” I recommend leveraging a file-search tool such as grep (or similar) to examine all of the application’s TurboIntegrator script files for a specific function or logic pattern that you may want to look at more closely:
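
For example, a minimal sketch (the data-directory path is hypothetical; TM1 stores each TurboIntegrator process as a .pro file in the server’s data directory) that lists every process calling SaveDataAll:

grep -il "SaveDataAll" /tm1/data/*.pro

Repeat the same scan for any other function you want to audit, such as ViewDestroy or CellsUpdateable.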

Saving Changed Data Methodology

Quickly scan the TI processes – do any call the SaveDataAll function to force in-memory changes to be committed to disk? SaveDataAll commits in-memory changes for all cubes in the instance or server. This creates a period of time during which the TM1 server is locked to all users. In addition, depending on the volume of changes made since the last save, the time the function requires to complete will vary and will increase processing time. Rather than using the SaveDataAll function, CubeSaveData should be used to serialize, or commit, in-memory changes for (only) a specific cube:

CubeSaveData(Cube);
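
For example (the cube name is hypothetical), a nightly load process could commit just the cube it touched from its Epilog:

CubeSaveData('SalesPlan');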

Logging Options

TM1 by default logs all changes to all cubes as transactions. Transactional logging may be used to recover in the event of a server crash or other abnormal error or shutdown. A typical application does not need to log all transactions for all cubes. Transaction logging impacts overall server performance and increases processing time when changes are being made in a “batch”. The best-practice recommendation is to turn off logging for all cubes that do not require TM1 to recover lost changes. In other situations, it may be preferable to leave cube logging on for a particular cube but temporarily turn off logging at the start of a (TurboIntegrator) process and then turn logging back on after the process completes successfully:

CubeGetLogChanges(CubeName);

CubeSetLogChanges(Cube, LogChanges);
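
A minimal Prolog/Epilog sketch of that pattern, assuming a hypothetical cube named SalesPlan:

# Prolog: remember the current setting, then switch logging off for the batch load
vLogging = CubeGetLogChanges('SalesPlan');
CubeSetLogChanges('SalesPlan', 0);

# Epilog: restore the original setting once the process completes successfully
CubeSetLogChanges('SalesPlan', vLogging);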

Subset and View Maintenance

It is a best-practice recommendation to avoid using the ViewDestroy and SubsetDestroy functions. These functions are memory intensive, cause potential locking/rollback situations and impact performance. The appropriate approach is to use ViewExists and SubsetExists and, if the view or subset already exists, update it as required for the processing effort.

An additional good practice is to modify the view and subsets in the Epilog section, inserting a single leaf element into all subsets in the view to reduce its overall size in case a user accidentally opens these “not for user” views.
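
A sketch of the “reuse rather than destroy” pattern (the cube, dimension, view and subset names are hypothetical):

IF(SubsetExists('Period', 'zLoadPeriods') = 0);
  SubsetCreate('Period', 'zLoadPeriods');
ENDIF;
IF(ViewExists('SalesPlan', 'zLoadView') = 0);
  ViewCreate('SalesPlan', 'zLoadView');
ENDIF;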

CellsUpdateable

CellsUpdateable is a TM1 function that lets you determine whether a particular cube cell can be written to. The function is useful, but it impacts performance since it uses the same logic and internal resources as performing an actual cell write (CellPutN or CellPutS). It is usually used as a defensive measure to avoid write errors or to simplify processing logic flow. The best-practice recommendation is to restrict or filter the data view being processed (eliminating the need for recurring CellsUpdateable calls) if possible. This approach also decreases the volume or size of the data transactions to be processed.
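
Where the check is genuinely needed, a minimal guard might look like this (in TurboIntegrator the function is spelled CellIsUpdateable; the cube and variable names are hypothetical):

IF(CellIsUpdateable('SalesPlan', vYear, vMonth, vMeasure) = 1);
  CellPutN(vValue, 'SalesPlan', vYear, vMonth, vMeasure);
ENDIF;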

Span of Data

Even the best TM1 model designs have limitations on the volume of data that can be loaded into a cube. An optimal approach is to limit a cube to 3 years of “active” data with prior years available in archival cubes. This will reduce the size of the entire TM1 application and improve performance. Older data can still be available for processing (sourced from 1 or more archival cubes). Keep the “active” cube loaded with only those years of data that have the highest percentage chance of being required for a “typical” business process. Additionally, it is recommended that a formal process for both archival and removal of years of data be provided.

View Caching

TM1 caches views of data in memory for faster access (increasing performance). Once data changes, views become invalid and must be re-cached for viewing or processing. It may be beneficial to use the TM1 function ViewConstruct to “force” TM1 to pre-cache updated views before processing them. This function is useful for pre-calculating and storing large views so they can be quickly accessed after a data load or update:

ViewConstruct(CubeName, ViewName);

Real-time (RT) to Near-real-time (NRT)

Generally speaking, consolidation is one of TM1’s greatest features. The TM1 engine will perform in-memory consolidations and drill-downs faster than any rule or TI process.

All data that is not loaded from external sources should be maintained using the most efficient means: a consolidation, a TurboIntegrator process or a rule. From a performance and resource-usage perspective, consolidations are the fastest and require the least memory, while rules are the slowest and require the most memory. Simply put, all data that cannot be calculated by a TM1 consolidation should be seriously evaluated to determine its change tempo (slow moving or fast moving). For example, data that changes little, changes only during a set period of time, or changes “on demand” is a good candidate for TI processing (near-real-time maintenance) rather than a rule. If at all possible, moving rule-calculation logic into a TurboIntegrator process should be the method used to maintain as much data as possible. This will reduce the size of the overall TM1 application and improve overall processing performance as well.

Architectural Approach

It is an architectural best practice to organize application logical components as separate or distinct from one another. This is known as “encapsulation” of business purpose. Each component should be purpose based and be optimized to best solve for its individual purpose or need. For example, the part of the application that performs calculating and processing of information should be separated from the part (of the application) that supports the consumption of or reporting on of information.

Applications whose architecture does not separate by purpose are more difficult (more costly) to maintain and typically develop performance issues over time.

Architectural components can be separated within a single TM1 server instance or across multiple server instances. The “multiple servers” approach can be ideal in some cases and may:

  • Improve processing time as the processing instance would be a “system only” instance optimized for batch processing.
  • Eliminate locking situations as no client/user would have access to the “system only” instance.
  • Reduce the overall size of the “client/user” instance.
  • Improve maintainability by reducing the amount of code per instance.
  • Support scalability – the “system instance” can serve numerous future servers that may require profile and work plan generation.

Performance Profile

Profiling is the extrapolation of information about something, based on known qualities (baselines), to determine its behavior (performance) patterns. In practice, performance profiling means determining the average time and/or resources required to perform a particular task within an application. During performance testing, profiles are collected for selected application events that have established baselines. It is absolutely critical to follow the same procedure that was used to establish the event baseline when creating its profile. Application profiles are extremely valuable in:

  • Validation of application architecture
  • Application optimization and tuning
  • Predicting cost of application service

It is strongly recommended that a performance/stress test be performed, with the goal of establishing an application profile, as part of any “application performance” review.

Conclusion

Overall, most TM1 applications will have various “to-dos”, such as fine-tuning of specific feeders and other overall optimizations, which may already be in process or scheduled to occur as part of the project plan. If an extended performance review is not feasible at this time, it is recommended that each of these suggestions at least be reviewed and discussed to determine its individual feasibility, expected level of effort to implement and effect on the overall design approach (keeping in mind these are only high-level – but important – suggestions).

Good Luck!

A Splunk Decision Support System

The importance of making credible decisions can be the difference between profit or loss, or even survival or extinction.

Decision Support Systems (or DSSs) serve the key decision makers of an organization – helping them to effectively assess predictors (which can be rapidly changing and not easily specified in advance) and make the best decisions, reducing risk.

The advantages of successfully implemented decision support systems are many, and include:

  • Saving time and increasing productivity
  • Improving efficiency
  • Boosting communications
  • Reducing costs
  • Supporting learning
  • Enhancing control
  • Identifying and understanding trends and/or patterns
  • Gaining operational intelligence (OI)
  • Gauging results of services by channel or demographic
  • Reconciling fees against actual use
  • Finding the heaviest users (or abusers)
  • And more…

SPLUNK as the DSS

Can Splunk be considered a true real-time decision support system? The answer is of course, “Yes!”

Splunk does this by providing features and functionalities that provide the ability to:

  • answer questions based upon both structured and unstructured data,
  • support managers at all levels (as well as individuals and groups),
  • be adaptable, flexible, interactive and easy to learn and use,
  • provide efficient and quick responses,
  • allow scheduled-control of developed processes,
  • support easy development by all levels of end users,
  • provide universal access to all types of data,
  • offer both standalone and web-based integrations,
  • connect real-time service data with details about that data collected in an organization’s master data or other data, and
  • More…

Splunk the Product

Splunk runs from either a standard command line or a totally web-based interface (which means that no thick-client application needs to be installed) and performs large-scale, high-speed indexing on both historical and real-time data.

To index, Splunk does not require a “re-store” of any of the original data; instead it stores a compressed copy of the original data (along with its indexing information), allowing you to delete or otherwise move (or remove) the original data. Splunk then uses this “searchable repository” to efficiently graph, report, alert, dashboard and visualize in detail.

It just Works

After installation, Splunk is ready to be used. There are no additional “integration steps” required for Splunk to handle data from particular products. To date, Splunk simply works on almost any kind of data or data source you may have access to, but should you actually require some assistance, there is a Splunk professional services team that can answer your questions or even deliver specific integration services.

Wrap-up

The Big Data market as measured by vendor revenue derived from sales of related hardware, software and services reached $18.6 billion in calendar year 2013. That represents a growth rate of 58% over the previous year (according to Wikibon data).

Also (according to Wikibon), Splunk had over $283 million of that “big data revenue” and has an even brighter outlook for this year. More to come for sure…

The Splunk Evolution

The term “Big Data” is used to describe information so large and complex that it becomes almost impossible to process using traditional methods. Because of the volume and/or unstructured nature of this data, making it useful – or turning it into what the industry is calling “operational intelligence” (OI) – is extremely difficult.

According to information provided by International Data Corporation (IDC), unstructured data (generated by machines) may account for more than 90% of the data in today’s organizations.

This type of data (usually found in enormous and ever-increasing volumes) records some sort of activity, behavior, or measurement of performance. Today, organizations are missing the opportunities that big data can provide because they are focused on structured data, using traditional tools for business intelligence (BI) and data warehousing (DW).

Using these mainstream methods – such as relational or multi-dimensional databases – in an attempt to understand big data is problematic (to say the least!). Attempting to use these tools for big data solution development requires serious experience and very complex solutions, and even then, in practice, they do not allow enough flexibility to “ask any question” or get those questions answered in real time – which is now the expectation, not a “nice to have” feature.

Splunk – “solution accelerator”

Splunk started by focusing on the information technology department – supporting the monitoring of servers, messaging queues, websites, etc. – but is now recognized for its ability to help with the specific challenges (and opportunities) of effectively organizing and managing massive amounts of any kind of machine-generated big data.

Getting “right down to it”, Splunk reads (almost) any data – even real-time data – into its internal repository, quickly indexes it and makes it available for immediate analysis and reporting.

Typical query languages depend on schemas. A (database) schema is how the data is to be “placed together”, or structured. This structure is based upon knowledge of the possible applications that will consume the data, the facts or type of information that will be loaded into the database, or the (identified) interests of the possible end users. Splunk uses a “NoSQL” approach that is reportedly based on UNIX concepts and does not require any predefined schema.

Correlating Information

Using Splunk Search, it is possible to easily identify relationships and patterns in your data and data sources (a couple of quick sketches follow the list) based upon:

  • Time, proximity or distance
  • Transactions, either a single transaction or a series
  • Sub-searches (searches that take the results of one search and use them as input to, or to shape, other searches)
  • Lookups to external data and data sources
  • SQL-like “joins”,
  • Etc.
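
For instance, two hedged sketches (the sourcetypes and field names are hypothetical): the first groups related web events into transactions by client IP, and the second uses a subsearch to restrict the outer search to the sources returned by the inner one:

sourcetype=access_combined | transaction clientip maxspan=5m

sourcetype=tm1* error [ search sourcetype=tm1* "shutdown" | fields source ]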

Keep in mind that the powerful capabilities built into Splunk do not stop with flexible searching and correlating. With Splunk, users can also quickly create reports and dashboards with charts, histograms, trend lines and many other visualizations, without the cost associated with structuring or modeling the data first.

Conclusion

Splunk has been emerging as a definitive leader for collecting, analyzing and visualizing machine-generated big data. Its universal method of organizing and extracting insights from massive amounts of data, from virtually any source, has opened up – and will continue to open up – new opportunities in unconventional areas. Bottom line – you’ll be seeing much more from Splunk!