Perficient Enterprise Information Solutions Blog


Reference Architecture – Cloud MDM for Salesforce


In my earlier post Why MDM should be part of CRM Strategy I discussed the importance of having an MDM strategy along with the CRM initiative. The proliferation of cloud CRM solutions like Salesforce is a blessing, but it can also become an IT nightmare if it is not managed properly. In many companies the Salesforce implementation is left to the Business's discretion, with minimal IT interaction.

In one of our client interactions, the IT leader was in fact trying his best not to address the Master Data issue and simply to implement what the Business was asking for, without addressing the Master Data implications. A CRM implementation or enhancement is an opportunity for IT to get engaged and put its Master Data strategy in place. See my post on Why Master Data is different from CRM?

Though Master Data has been a familiar term, many companies are far from implementing it. Mostly, lack of ownership and cost-justification barriers hold up progress. In this blog I want to show the reference architecture for the Cloud MDM solution from Informatica, which can be an alternative to traditional MDM on the cloud (see my earlier post). Total cost of ownership will be much less than for traditional MDM.

[Figure: Cloud MDM reference architecture]

Typically the customer records are created in Salesforce and in key application systems such as ERP or online store (eCommerce) sites. Cloud MDM, which is typically installed in one of the Salesforce instances (the Master), can be integrated with subscribing and authoring applications synchronously or asynchronously, depending on the timing requirements. All the de-duplication and stewardship activities are performed using the Cloud MDM tool. Informatica Cloud MDM is a native Salesforce application which is deployed in the client’s Salesforce instance. This architecture shows the expanded Cloud MDM solution integrating with enterprise applications. It supports only the Customer Master.
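To make the de-duplication step concrete, here is a minimal sketch in plain Python. It is not Informatica Cloud MDM's matching logic or API; the record layout, the `is_probable_duplicate` rule and the 0.85 threshold are all assumptions chosen for illustration. It flags two customer records as likely duplicates when their emails match or their normalized names are very similar.

```python
from difflib import SequenceMatcher


def normalize(name: str) -> str:
    """Lowercase, drop punctuation and collapse whitespace before matching."""
    cleaned = "".join(ch for ch in name.lower() if ch.isalnum() or ch.isspace())
    return " ".join(cleaned.split())


def is_probable_duplicate(a: dict, b: dict, threshold: float = 0.85) -> bool:
    """Hypothetical match rule: same email, or very similar normalized names."""
    email_a = a.get("email", "").lower()
    email_b = b.get("email", "").lower()
    if email_a and email_a == email_b:
        return True
    score = SequenceMatcher(None, normalize(a["name"]), normalize(b["name"])).ratio()
    return score >= threshold


def find_duplicate_pairs(records: list) -> list:
    """Compare every pair of records and return the ids of probable duplicates."""
    pairs = []
    for i, left in enumerate(records):
        for right in records[i + 1:]:
            if is_probable_duplicate(left, right):
                pairs.append((left["id"], right["id"]))
    return pairs


# Illustrative records from three hypothetical authoring systems.
records = [
    {"id": "SF-1", "name": "Acme Corp.", "email": "ap@acme.example"},
    {"id": "ERP-7", "name": "ACME Corp", "email": ""},
    {"id": "WEB-3", "name": "Globex Inc", "email": "info@globex.example"},
]

duplicate_pairs = find_duplicate_pairs(records)
```

In a real deployment the candidate pairs would go to a data steward for review rather than being merged automatically, and pairwise comparison would be replaced by something that scales better than comparing every record with every other.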

Using this Architecture most of the Customer Master Data can be managed. This can also be a stepping stone for a full-blown future multi-domain MDM initiative.

See also our recent webinar on Cloud MDM, which specifically addresses Salesforce-based Customer Master Data management, and the following related blog posts: Cloud MDM – Are we getting closer?, Reference Architecture for Public Cloud Platform, Customer Analytics and Master Data, Master Data – Why is it different from CRM?, and Why MDM should be part of CRM Strategy?

The Industrialization of Advanced Analytics

Gartner recently released its predictions on this topic in a report entitled, “Predicts 2015: A Step Change in the Industrialization of Advanced Analytics”. This has very interesting and important implications for all companies aspiring to become more of a digital business. The report states that failure to do so impacts mission-critical activities such as acquiring new customers, doing more cross-selling and predicting failures or demand.

Specifically, business, technology and BI leaders must consider:

  • Developing new use cases using data as a hypothesis generator, data-driven innovation and new approaches to governance.
  • Emergence of analytics marketplaces, which Gartner predicts will be more commonly offered in a Platform as a Service model (PaaS) by 25% of solution vendors by 2016
  • Solutions based on the following parameters: optimum scalability, ease of deployment, micro-collaboration and macro-collaboration and mechanisms for data optimization
  • Convergence of data discovery and predictive analytics tools
  • Expanding technologies advancing analytics solutions: cloud computing, parallel processing and in-memory computing
  • “Ensemble-learning” and “deep learning”. The former defined as synergistically combining predictive models through machine-learning algorithms to derive a more valuable single output from the ensemble. In comparison, deep learning achieves higher levels of classification and prediction accuracy through the development of additional processing layers in neural networks.
  • Data lakes (raw, largely unfiltered data) vs data warehouses and solutions for enabling exploration of the former and improving business optimization for the latter
  • Tools that bring data science and analytics to “citizen data scientists”, who’ll soon outnumber skilled data scientists 5-to-1
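As a small illustration of the ensemble idea in the list above, the sketch below combines three predictive rules by simple majority vote. This is a deliberately simpler technique than the machine-learning combiners the report describes, and the three rules and the customer fields are hypothetical.

```python
from collections import Counter


def majority_vote(models, x):
    """Return the prediction that most of the models agree on."""
    votes = [model(x) for model in models]
    return Counter(votes).most_common(1)[0][0]


# Three hypothetical "will this customer buy again?" rules.
models = [
    lambda c: c["spend"] > 100,
    lambda c: c["visits"] > 5,
    lambda c: c["tenure_months"] > 12,
]

customer = {"spend": 150, "visits": 2, "tenure_months": 24}
prediction = majority_vote(models, customer)
```

Here two of the three rules vote yes, so the ensemble predicts yes even though one individual rule disagrees.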

Leaders in the emerging analytics marketplace include:

  • Microsoft with its Azure Machine Learning offering
  • IBM with its Bluemix offering

Finally, strategy and process improvement, while being fundamental and foundational, aren’t enough. The volume and complexity of big data along with the convergence between data science and analytics requires technology-enabled business solutions to transform companies into effective digital businesses. Perficient’s broad portfolio of services, intellectual capital and strategic vendor partnerships with emerging and leading big data, analytics and BI solution providers can help.

Cloud MDM – Reference Architecture for Public Cloud Platform

Moving the MDM deployment to a cloud platform involves several considerations ranging from technical gotchas to getting the internal buy-in. Getting the buy-in is a different topic of discussion – see my earlier posts on Data Governance.

Moving to the cloud eliminates some of the administrative and server-farm maintenance tasks, but it does not eliminate the overall responsibility. The apprehension about moving critical enterprise data such as Customer Master or Product Master to the cloud can be daunting. Product Master would be an easier concept to sell than Customer Master, though in reality a lot of customer information already resides on the cloud (think Salesforce).


Key Considerations (Product Master Data)

  • A Product MDM tool on the cloud can consolidate, enrich, improve quality and publish to consuming applications, but catalog management may very well remain within the firewalls of the Enterprise.
  • Provisioning has to be well thought out, and synchronizing the Enterprise user accounts with the cloud can be a challenge.
  • Avoid the IT-only approach and involve Data Governance to make sure business benefits are realized.
  • Synchronization of published data with the applications, and its timeliness, have to be addressed.

This architecture depicts three types of applications.

  1. Sources (applications) which create or modify Master Data and also receive Master Data, with the ability to synchronize (upsert) the data from the Master Data Hub.
  2. A second type, which can be a source of Master Data and consumes the latest Master Data for reference but has no need for synchronization.
  3. A third category of applications which are purely subscribing (consuming) applications, and which may include external partners.
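The three application types can be sketched as roles around a hub. The code below is a minimal illustration, not any vendor's API: `MasterDataHub`, `AppRole` and the in-memory `local_store` are hypothetical names, and it assumes that type 1 and type 3 applications receive pushed copies while type 2 applications simply look up the latest record when they need it.

```python
from dataclasses import dataclass, field
from enum import Enum


class AppRole(Enum):
    SYNCHRONIZED_SOURCE = 1  # authors master data and receives upserts
    REFERENCE_SOURCE = 2     # authors master data, reads the hub on demand
    SUBSCRIBER = 3           # consumes only (for example an external partner)


@dataclass
class Application:
    name: str
    role: AppRole
    local_store: dict = field(default_factory=dict)


class MasterDataHub:
    def __init__(self):
        self.golden = {}  # key -> consolidated master record
        self.apps = []

    def register(self, app: Application) -> None:
        self.apps.append(app)

    def submit(self, app: Application, key: str, record: dict) -> None:
        """Accept a create/modify from an authoring application and publish it."""
        if app.role is AppRole.SUBSCRIBER:
            raise PermissionError(f"{app.name} is a subscriber and cannot author data")
        self.golden[key] = {**self.golden.get(key, {}), **record}
        self.publish(key)

    def publish(self, key: str) -> None:
        """Upsert the latest record into every application that keeps a copy."""
        for app in self.apps:
            if app.role in (AppRole.SYNCHRONIZED_SOURCE, AppRole.SUBSCRIBER):
                app.local_store[key] = dict(self.golden[key])

    def lookup(self, key: str):
        """Reference read used by applications that do not synchronize."""
        return self.golden.get(key)


hub = MasterDataHub()
erp = Application("ERP", AppRole.SYNCHRONIZED_SOURCE)
legacy = Application("LegacyCatalog", AppRole.REFERENCE_SOURCE)
portal = Application("PartnerPortal", AppRole.SUBSCRIBER)
for application in (erp, legacy, portal):
    hub.register(application)

hub.submit(erp, "P-100", {"name": "Widget"})
hub.submit(legacy, "P-100", {"color": "blue"})
```

After these two submissions the ERP and the partner portal both hold the merged record, while the legacy catalog holds no local copy and would call `hub.lookup("P-100")` when it needs the current values.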

Providing the Product Master on the cloud makes sense especially for publishing to customer-facing portals and to third-party subscribers like partners.

Cloud MDM – are we getting closer?

As Master Data Management matures and becomes a standard application within the enterprise architecture landscape, variations of cloud offerings are slowly emerging. Many companies are not even aware of these solution offerings. You can broadly classify the types of cloud offerings into 3 categories. There are key advantages and disadvantages in pursuing these architectures.

The concern in deploying cloud MDM is always security and the apprehension about leaving critical data on the cloud. However, there are specific cases where it makes perfect sense to leave it on the cloud.


Specific Cloud offerings

There are specific cloud MDM solutions which can be classified as add-on products to an existing cloud application like Salesforce. For example, Informatica offers a cloud MDM product which runs on Salesforce, and other new offerings in this space are also in the works. Extending the capability of these offerings to include enterprise applications is a possibility. These tend to be single-domain offerings, such as customer master. Security concerns are minimal to none because the product runs natively on the Salesforce platform and all the provisioning is done there. The idea causes less apprehension because it is nothing new, just an add-on to Salesforce. Risk and the stability of the product will be less of a concern, yet they have to be addressed internally.

Public Cloud (SaaS model)

There are products which offer public (shared) cloud MDM with certain limitations. This works well for prototyping before moving to other options. The downside is that you are sharing the environment with other companies. If the data is not critical and internal concerns can be managed, this can be an option as well. Security is something one has to address, as the environment is shared between multiple customers. Internal selling will be a little difficult for this offering. There will definitely be apprehension because of security, and the idea of keeping key data in an environment shared with other clients is hard to overcome as well. Product risk and limitations such as the number of records are also considerations.

Public Cloud (Standard MDM hosted)

Another variation of this cloud offering is to have the MDM software deployed on a cloud platform such as Amazon. You own the software, but it is deployed on the cloud platform. This saves the hurdle of managing the platform and avoids the infrastructure bottleneck. All the advantages and disadvantages of standard MDM apply to this architecture. A more detailed reference architecture will follow in upcoming blogs.

MDM SaaS model

Several companies offer an MDM tool as SaaS, much like Salesforce. The costs and benefits have to be understood and assessed in detail. Information about the product may not be readily available. This will be great if it fits all the needs of the organization, but these tools tend to solve only certain aspects, mostly the creation of Master Data.

There is a host of Cloud MDM options available, and they are definitely beneficial in terms of time to market and as a stepping stone to Enterprise MDM. Depending on the company culture and the current state, these are viable alternatives to traditional MDM.

See also the webinar on Cloud MDM:  Creating an Effective MDM strategy for Salesforce


Key strategies for Data Quality

We have witnessed in numerous client engagements that Data Quality (DQ) is a never-ending battle, and in many companies IT is busy fixing and re-fixing the data rather than developing solutions and managing the applications. Data quality is not confined to IT; it is an effort which involves all the users of the data, especially the Business.

Building a company-wide initiative can be a hard sell, as the scope is enormous and complex. However, applying some proven key strategies will help create awareness and gain support for DQ. The key idea is not to fix the same problem over and over, but to understand and communicate the bigger picture so that the problem is solved as IT and Business information management matures.


If it is not measured, you will never know what you are dealing with. Setting up key quality measures and documenting impacts is the best way to get support. As part of any key initiative, include DQ measures which can be gathered and reported periodically. Having the key information metrics is a sure way of getting the needed attention. Some of the measures could be as simple as:

  • Down time caused by quality issues
  • Man hours invested in repeat problems
  • Ownership of the Quality issues


Any data project should consider the trust aspects of the data. Introducing a data certification process for adding new data, especially large batch data, will bring tremendous improvements in business participation and overall quality. Key quality information such as missing/null values, wrong information (invalid values), rejections and warnings should be measured, communicated and remedied within the expected timeframe. Many times we find the fix is in the upstream systems, which eliminates current and future data issues.

Creating a data certification process is the best way to engage the business and gain their trust. The key idea is to make the Business, not IT, responsible for the data.
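The measures mentioned above, missing/null values and invalid values, are easy to compute for an incoming batch. The following is a minimal sketch under stated assumptions: the column name, the list of valid country codes and the 95% certification threshold are hypothetical and would be agreed with the business in practice.

```python
def profile_column(rows, column, is_valid):
    """Count missing and invalid values for one column of a batch."""
    total = len(rows)
    missing = 0
    invalid = 0
    for row in rows:
        value = row.get(column)
        if value is None or value == "":
            missing += 1
        elif not is_valid(value):
            invalid += 1
    clean = total - missing - invalid
    pct_clean = round(100.0 * clean / total, 1) if total else 0.0
    return {"total": total, "missing": missing, "invalid": invalid, "pct_clean": pct_clean}


def certify(report, min_pct_clean=95.0):
    """Hypothetical certification rule: the batch passes if enough values are clean."""
    return report["pct_clean"] >= min_pct_clean


VALID_COUNTRIES = {"US", "CA", "MX"}  # assumed reference list

batch = [
    {"customer": "A", "country": "US"},
    {"customer": "B", "country": None},
    {"customer": "C", "country": ""},
    {"customer": "D", "country": "XX"},
    {"customer": "E", "country": "CA"},
]

report = profile_column(batch, "country", lambda value: value in VALID_COUNTRIES)
certified = certify(report)
```

Reporting a small dictionary like `report` for every batch, and tracking it over time, gives the business owner something concrete to sign off on or reject.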

DQ as part of SDLC

Data Quality should be part of the SDLC to guarantee acceptable quality. Development projects should include time for reporting the key quality measures. This will greatly improve the trustworthiness of the data, which in turn speeds adoption of the new application.

Leverage Governance   

Governance is the best way to gain support for quality initiatives. Most of the time, when it comes to cutting cost, quality-related development time is cut because it is perceived as nice to have. Data Governance (DG) can mandate these requirements and get the necessary support. Make sure there is a process for getting on to the DG agenda, and leverage it to bring about the DQ transformation.

In short, making incremental changes to existing processes to include quality as a key component will help build the case for the broader process changes and tools needed to manage overall quality.

Link to earlier DQ blog:

What is the worth of Data Quality to organizations?

Lambda Architecture for Big Data – Quick peek…

In the Big Data world, the Lambda architecture created by Nathan Marz is a standard technique applied to many predictive analytics problems. This architecture combines streaming data and batch data, merging past information with current changes to produce a comprehensive platform for a predictive framework.


Lambda Architecture

At a very high, generic level, the architecture has three components.

  • Batch layer, which holds all the processed batch data from the past.
  • Speed layer, a real-time feed of similar or the same information.
  • Serving layer, which holds the batch views relevant to the queries needed by the predictive analytics.

Lambda architecture addresses the issue that the intended output can change because of code changes. In other words, code can be enhanced for better data processing because the original input data is kept intact, or read-only. Whether Lambda architecture is an exception to the CAP theorem, as some claim, is debatable.

In reality, programming for batch and for the stream typically needs two different code bases. This is an issue because business logic and other enhancements have to be made in two different places. Creating a single API for both batch and real-time data can be one way to hide the complexity from the higher-level code, but the fact remains that there are two different branches of processing at the lower level.
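The interplay of the three layers can be shown with a toy page-view counter. This is a minimal in-memory sketch, not a production design: the class names, the `on_event` and `run_batch` helpers and the event shape are assumptions, and it ignores events that arrive while a batch run is in progress. Note that the counting logic appears twice, once in the batch layer and once in the speed layer, which is exactly the duplication described above.

```python
from collections import Counter


class BatchLayer:
    """Keeps the raw events append-only and recomputes views from scratch."""

    def __init__(self):
        self._events = []  # immutable input: only ever appended to

    def append(self, event):
        self._events.append(event)

    def recompute_view(self):
        return Counter(event["page"] for event in self._events)


class SpeedLayer:
    """Maintains an incremental view of events not yet covered by a batch run."""

    def __init__(self):
        self.view = Counter()

    def ingest(self, event):
        self.view[event["page"]] += 1

    def reset(self):
        self.view.clear()


class ServingLayer:
    """Holds the latest batch view and merges it with the real-time view."""

    def __init__(self):
        self.batch_view = Counter()

    def load(self, view):
        self.batch_view = view

    def query(self, page, speed):
        return self.batch_view[page] + speed.view[page]


batch, speed, serving = BatchLayer(), SpeedLayer(), ServingLayer()


def on_event(event):
    """Every incoming event goes to both the batch and the speed layer."""
    batch.append(event)
    speed.ingest(event)


def run_batch():
    """Periodic batch run: rebuild the view, publish it, drop the real-time delta."""
    serving.load(batch.recompute_view())
    speed.reset()


for page in ("home", "home", "pricing"):
    on_event({"page": page})
before_batch = serving.query("home", speed)  # answered by the speed layer only

run_batch()
on_event({"page": "home"})
after_batch = serving.query("home", speed)   # batch view plus real-time delta
```

Because the raw events are never modified, a bug fix in `recompute_view` takes effect on all historical data at the next batch run.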

Extended Lambda Architecture

Assuming you are satisfied with the limitations of Lambda architecture, most predictive analytics needs past data along with other data captured within the enterprise. Including this key data will enhance the overall quality and provide the best available data for the predictive engine.

As the industry matures, these techniques will become more robust and will provide the best available data faster than ever. Just as we now take star schemas and their variations as a given for Data Warehousing, Lambda architecture and its variations will be prevalent in the near future as well.



Bootstrapping Data Governance – Part II

As I mentioned in my earlier post, building the vision for Data Governance happens through multiple meetings, interactions and discussions at various levels. Depending on the company culture and type of industry, progress may end up being faster or slower. Leadership in building the awareness and linking the business issues which can be solved through DG are key strategies to gain support for setting up DG organization.



Vision

Here are some of the key activities that need to happen in building the vision, even if you are getting external help:

  • Identify key initiatives and link them to benefits of DG effectiveness
  • Use the strategic planning meetings / Road Map opportunities to include DG track
  • Name someone capable as Data Steward even if it is a part-time role
  • Gather and Highlight the Data Quality issues and the business impacts


Preparation

Once you have established support from the Executives, start building the business case using the major initiatives to pay for it. Though establishing DG seems trivial, it has several levels of complexity. Part of the preparation is knowing the data issues, remedies and approaches clearly before launching DG. Key SMEs and Stewards have to work on collecting the information even if external help is brought in. Ultimately, insiders will know what works within the organization.

  • Identifying and getting the support of key SMEs and stakeholders
  • Current business process and pitfalls (use Business SMEs and leaders)
  • Estimating how much SME time and involvement is needed for DG, including for socializing the idea
  • Knowing your key supporters and potential resistance
  • Keeping as much information ready as possible before even engaging external help
  • Create a plan to keep DG operating independently, with the Business in the driving seat
  • IT should be a servant leader: take the lead in doing the grunt work and help DG make the right decisions
  • Build or gather the material for the business case using the key initiatives and imperatives which can fund the DG effort
  • Start with a specific agenda of goals and expand once success and participation rates meet expectations
  • Ultimately the execution is the key – will discuss that in the following posts.

Support & Sponsorship

Once sponsorship and support are secured, swift execution and follow-through are a must. Bringing in external help at this point will be very beneficial. Doing the ground work early on and preparing the needed artifacts, along with any relevant information about DG and quality in general, will greatly cut down the time needed for current-state and future-state development, the road map, and applying best practices.

In summary, early ground work is crucial for developing the road map with key initiatives for short-term wins and quick ROI. Seasoned managers know the value of ground work, and they don’t waste time while the planning is in progress. The technology (tools, platform) business case should also be built at this point, along with establishing DG. Folding the tool expenses into the key initiative is always a winning strategy.

Bootstrapping Data Governance – Part I

A lot has been said and written about Data Governance (DG) and the importance of having one. However, creating an effective DG is still a mystery for many companies. Based on our experience, the majority of companies in the early stages of DG fall into one of these categories:


  1. Had too many false starts
  2. Not much impact and the DG lost much of the support
  3. No clue, not even attempted

Why is it so difficult to set up a reasonably functioning Data Governance?

The typical scenario is that IT leads the Data Governance initiative, as part of an overhaul of IT or as part of a new initiative like rebuilding the Data Warehouse or launching a Master Data Management program. Too often companies establish DG with limited vision and narrow scope, and with minimal business involvement. The problem areas and possible pitfalls companies run into can be broadly classified under three major areas for the DG establishment phase, viz., Vision, Preparation, and Sponsorship & Support.


Vision

Getting the Executive buy-in and setting the Data Governance vision is a process of evolution. Typically this takes 3 – 12 months of pre-work through casual meetings and by including the DG topic in strategy meeting agendas for discussion. Awareness through common education, by attending industry seminars and conferences, is another dimension of setting the vision. If the DG concept has been discussed and socialized for some time, then leveraging the common understanding to launch the program is the next step.


Preparation

Being prepared is the best way to avoid false starts. The opportunity to launch DG often arrives when you are least prepared. It is not easy to devote time to DG incubation when you have burning issues around you. But those burning issues, especially catastrophic events, may escalate the urgency for DG and gain unprecedented Executive support, or even a mandate from the top. At that point you are definitely stuck if you are not prepared.

Sponsorship & Support

Once you get the go-ahead, approaching DG without a holistic vision and a complete picture will water down the momentum, and slowly the support will start to disappear. Keeping the executive team committed to DG means producing meaningful results and engaging the business from planning through execution of DG.


DG Establishment is followed by the organization’s ability to successfully execute the DG mandates. Again putting together a solid approach of people, process and technology will guarantee the success of DG.

In the next segment let’s look at nimble and effective strategies to keep the DG a successful organization from establishment through execution.

Strategic Trends for 2015

We are almost at the end of 2014. Time to check out the 2015 trends and compare with what has been the focus in 2014. Looking at the top 10 trends in Information Management, some things have changed and some have moved up or down the list.

However, the same old challenges pretty much remain. We saw a significant emphasis on Data Visualization and Big Data push in 2014 and this trend will continue.

Big Data remains in the top 10 in some shape or form. Virtualization and cloud management are getting complex and are something organizations have to deal with. Hybrid cloud in particular is becoming a part of the Enterprise Architecture fabric.

The common theme in all these trends is complexity and the security and governance aspects. Data sources, creation and management have changed more in the last 5 years than ever before. Enterprise data is no longer confined within the firewalls and corporate data centers. Data centers continue to evolve, and applications continue to reside outside the norm. Ownership, responsibility, quality and trustworthiness are becoming really complex. Knowing what to trust, and filtering the noise from the real information, is becoming part art and part science.

The new era of data centers includes cloud infrastructure (public and private), traditional enterprise data centers, cloud applications and accessibility through a variety of devices, including personal devices. Forging a security framework and governing the data becomes a lot more critical and urgent.

Having a disciplined Governance organization with the agility to respond and manage business information becomes a critical component of successful Information Management. As complexity, vulnerability and risk increase, forming and managing the policies to secure corporate data is vital. Governing the information goes beyond the responsibility of Information Technology. Gone are the days when the Business could hand over a wish list and IT would build an application. Business and IT have to work closely to create Governance policies and procedures to tackle this paradigm shift.


Hadoop’s Ever-Increasing Role

With the advent of Splice Machine and the release of Hive 0.14 we are seeing Hadoop’s role in the data center continue to grow. Both of these technologies support limited transactions against data stored in HDFS.

Now, I would not suggest moving your mission-critical ERP systems to Hive or Splice Machine, but the support of transactions is opening up Hadoop to support more use cases, especially those use cases supported by RDBMS-based data warehouses. With transaction support there is a more elegant way to handle slowly changing dimensions of all types in Hadoop, now that records can be easily updated. Fact tables with late-arriving information can be updated in place. With transactional support, Master Data can be supported more efficiently. The writing is on the wall: more and more of the functionality that has historically been provided by the data warehouse is now moving to the Hadoop cluster.
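To show why record updates matter for slowly changing dimensions, here is a minimal Type 2 sketch in plain Python. It is not Hive or Splice Machine code; the row layout and the `apply_scd2` helper are hypothetical. The point is the single in-place update that closes the current row, which is the step that needs update support in the underlying store.

```python
from datetime import date


def apply_scd2(dimension, key, new_attrs, effective):
    """Type 2 change: close the current row if attributes changed, then add a new one."""
    current = next(
        (row for row in dimension if row["key"] == key and row["end_date"] is None),
        None,
    )
    if current is not None:
        if current["attrs"] == new_attrs:
            return dimension  # nothing changed, keep the current row open
        current["end_date"] = effective  # the in-place update of an existing record
    dimension.append(
        {"key": key, "attrs": dict(new_attrs), "start_date": effective, "end_date": None}
    )
    return dimension


customer_dim = []
apply_scd2(customer_dim, "C1", {"city": "Austin"}, date(2014, 1, 1))
apply_scd2(customer_dim, "C1", {"city": "Dallas"}, date(2014, 11, 1))
apply_scd2(customer_dim, "C1", {"city": "Dallas"}, date(2014, 12, 1))  # no change
```

Without update support, the usual workaround is to rewrite the whole table or partition just to set that one end date.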

To address this ever-changing environment, enterprises must have a clear strategy for evolving their Big Data capabilities within their enterprise architecture. This Thursday, I will be hosting a webinar, “Creating the Next-Generation Big Data Architecture,” where we will discuss Hadoop’s different roles within a modern enterprise’s data architecture.