by August 19th, 2013on
In one of my earlier posts I recommended an approach for loading larger amounts of data into TM1 applications, and provided a high-level explanation of this approach, which I call the “File Route” solution.
Some More Thoughts
I like this load-strategy because, in following best practice guidelines, it leverages TM1’s proven ETL tool, TurboIntegrator to load data. TurboIntegrator (or TI) is a fast and efficient method for loading rows and rows of data and can be programmed to handle most presumed exception conditions. The following are a few more thoughts on implementing such a solution.
Generally “bulk data load” does not refer to the process of receiving formatted data files from source systems or directly querying a ledger for data, what it does refer to is the need for a simple, robust process for business users to enter lager amounts of data into TM1 without having to type in the data, record by record. For example, when making top-side adjustments to a forecast it may be considered reasonable to expect a user to key-in or edit certain cells in a cube (using a formatted cube view or WebSheet, for example) but if a user must provide actuals for 100’s of accounts and for dozens of cost centers, than data entry isn’t going to work. Here is where a “bulk load” solution comes in handy.
Think of a bulk load process as an assembly line that continuously moves from point A (the user’s desktop) to point B (a location with TM1). Users “drop” chunks of data onto a “receiving point” or “drop box” where an automated, intelligent process receives the “package”, “logs it as a request”, (does some verification) and then indicates to an individual loader process that there is work to be done. Once a loader process has processed the request (loads the data), another process validates the results of the processing request and then notifies the requestor (the user). (During each of these “steps” the process keeps a status object (perhaps a cube) updated with the latest processing information which can be viewed by the user).
More architectural thoughts to keep the following in mind when considering your file route solution:
- Users will need to adopt a somewhat standard format for submitting their data. “Somewhat” means that there can be some flexibility in the format, but where formats differ, “logical rules” need to be strictly enforced to allow program logic to identify and understand data (as the TM1 Architect, you’ll need a good understanding of all data requirements for this!)
- You will need to determine the cadence that is used by the solution to check for new data. How often? Every 30 minutes? Every hour? On demand? This will require some experimentation of average file sizes, data “overlaps”, security, etc.
- Security requirements. For example, how will you restrict users from submitting data on behalf of others? Will you need to restrict WHEN or how often a user can submit data for load? How about maximum file sizes? Etc.
- User feedback. It is absolutely imperative that you provide a near real time method for monitoring and auditing the status of load requests to each user.
- Expirations. You will need to consider the idea of “expiring” requests. . Based upon a variety of factors, will a load request ever become “stale” and not need to be loaded?
- You solution must always load to absorption or input cubes, never to any key application calculation or business process cube.
- After processing data files, always move them from the drop box folder to an offline location for archival and later (if required) audit.
- Build your solution as “generic” as possible! Do not “hard code” or “custom link” anything. The process should easily manage multiple load requests and, in a perfect world, have the ability to identify the format of the file (based upon its destination and perhaps other business logic), apply appropriate business logic and then load the data.
There is more to think about when designing and implementing a usable and scalable bulk load solution of course – but this approach works and works well. I’ve used it (or something similar to it) throughout my career with great success. Don’t believe me? Need some specifics? Give me a call.