Sometimes it is good to state the obvious, because we tend not to see what is right in front of our eyes. So I would like to start by saying that ECM (Enterprise Content Management) is not only about storing content, but also about being able to use that content efficiently. To use something, you first need to be able to find it. When talking about ECM, in a world where most content is digital, the first thing that comes to mind is full-text indexing. Full-text indexing works well in many cases, but after many years of using this technology I still see problems when applying it to engineering drawings, scanned receipts, pictures, and even some “fully OCR’d” PDFs. So what else is out there to help users find their documents?
Ever since humans started to create large repositories of knowledge (libraries), many ways to organize and classify that information have emerged. In our digital world the two major tendencies seem to be: first, complex folder structures designed for people who like to navigate to their content (the browser approach); and second, an almost unlimited number of attributes (metadata or tags) to be used with a powerful search engine (the tagging or search approach).
How do I search?
I personally love search engines (Google Search Appliance, FAST, Autonomy, and even Spotlight), but when I scan expense reports into my PC, search engines don’t work well, even when I run OCR and give the files very descriptive names (maybe because I don’t remember those names?). So, how do I find my receipts and reports during tax season? I store the scanned images in a folder structure based on the project name and the date (Taxonomy!).
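To make the project-and-date idea concrete, here is a minimal sketch in Python. The layout (root, then project, then year and month) and the names `receipt_path`, `Scans`, and `Acme Rollout` are my own assumptions for illustration, not a prescribed structure.

```python
from datetime import date
from pathlib import PurePosixPath


def receipt_path(root: str, project: str, scanned_on: date, filename: str) -> PurePosixPath:
    """Build a browsable location of the form root/project/YYYY/MM/filename."""
    # Hypothetical layout: project first, then year and month of the scan.
    year = f"{scanned_on:%Y}"
    month = f"{scanned_on:%m}"
    return PurePosixPath(root) / project / year / month / filename


# Example: where a taxi receipt scanned on 15 March 2011 would be filed.
example = receipt_path("Scans", "Acme Rollout", date(2011, 3, 15), "taxi.pdf")
print(example)
```

The point is that the location carries the two things I actually remember at tax time (which project, roughly when), so I never have to recall the file name.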
I’ve lost count of how many times I’ve been asked whether a system should be based on a flat folder structure with hundreds of attributes, or vice versa (a deep folder structure with only a handful of attributes). I believe that taxonomy and attributes (tags) should complement each other. As Johnny Gee says, the folder structure needs to be kept simple while keeping the company’s organization in mind; I also agree with him that the attributes need to be limited. I can’t give a magic number for how many is too many, but I can assure you that when things get too complicated, no one will use them.
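As a rough illustration of taxonomy and attributes working together, the sketch below narrows by folder first and then filters on a small set of tags. Everything in it (the `Document` class, the `find` function, the sample paths and the `type` tag) is hypothetical and not tied to any particular ECM product.

```python
from dataclasses import dataclass, field
from pathlib import PurePosixPath


@dataclass
class Document:
    path: PurePosixPath                        # where it lives in the folder taxonomy
    tags: dict = field(default_factory=dict)   # a deliberately small set of attributes


def find(docs, under=None, **tags):
    """Return documents inside folder `under` whose tags match every given value."""
    results = []
    for doc in docs:
        # Taxonomy step: skip anything that is not below the requested folder.
        if under is not None and PurePosixPath(under) not in doc.path.parents:
            continue
        # Attribute step: every requested tag must be present with the same value.
        if all(doc.tags.get(key) == value for key, value in tags.items()):
            results.append(doc)
    return results


docs = [
    Document(PurePosixPath("Projects/Acme/2011/taxi.pdf"), {"type": "receipt"}),
    Document(PurePosixPath("Projects/Acme/2011/plan.dwg"), {"type": "drawing"}),
    Document(PurePosixPath("Projects/Globex/2011/hotel.pdf"), {"type": "receipt"}),
]

acme_receipts = find(docs, under="Projects/Acme", type="receipt")
print([str(d.path) for d in acme_receipts])
```

Neither step does much on its own here: the folder alone still mixes receipts with drawings, and the tag alone mixes projects. Together, one shallow folder and one attribute are enough.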
Finally, it is good to remember that everyone needs to get the most out of their assets, so the tools have to be ones people can actually exploit. When all is said and done, the main reason for having an ECM system is to make content more accessible, NOT to lock it in a safe that users cannot open. So I recommend an organized folder structure that is simple enough for “tagger-people”, and an object model that “browser-people” would understand. In other words, the best way to find content is to strike a balance between a good search engine (and object model) and a good taxonomy.
What would you say is the best way to search?