I’ve been reflecting on my experience last week at IBM Think. As ever, it feels good to get back to my roots and see familiar faces and platforms. What struck me, though, was the unfamiliar. Seeing AWS, Microsoft, Salesforce, Adobe, SAP, and Oracle all manning booths at IBM’s big show was jarring; it’s almost unheard of. It’s a testament to my current rallying cry: prioritize making a diversity of platforms work better together by letting data flow in all directions with minimal effort. I see many partners focusing on this by supporting a diversity of data integration patterns, such as zero-copy and zero-ETL (a recurring theme; thank you, Salesforce). In this environment of radical collaboration, I think something really compelling might have gotten lost… a little open source project they launched called InstructLab.
IBM spent a lot of time talking about how now is the time to SCALE your investments in AI, how it’s time to get out of the lab and into production. At the same time, there was a focus on fit-for-purpose AI: using the smallest, leanest model possible to achieve the goal you set.
Think Big. Start Small. Move Fast.
I always come back to one of our favorite mantras: Think Big. Start Small. Move Fast. What that means here is that we have an opportunity to thread the needle. It’s not about going from the lab to enterprise-wide rollouts in one move. It’s about identifying the right, most valuable use cases and building tailored, highly effective solutions for them. You get lots of fast little wins that way: instead of hoping for a general 10% productivity gain across the board, you’re getting a 70+% productivity gain on specific, measurable tasks.
This is where we get back to InstructLab, a model-agnostic open source AI project created to enhance LLMs. We’ve seen over and over that general-purpose LLMs perform well on general-purpose tasks, but when you ask them to do something specialized, you get intern-in-their-first-week results. The idea behind InstructLab is to maintain a taxonomy of knowledge and task domains, choose a foundation model trained on the most relevant branches of that taxonomy, then add domain-specific tuning with a machine-amplified training data set. This opens the door to effective fine-tuning. We’ve historically advised against fine-tuning because most enterprises just don’t have enough data to move the needle and make the necessary infrastructure spend for model retraining worth it. With the InstructLab approach, we can, as we so often do in AI, borrow an idea from biology: amplification. We use an adversarial approach to amplify a not-big-enough training set by adding synthetic entries that follow the patterns in the sample.
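To make the amplification idea concrete, here’s a toy sketch in plain Python. This is not InstructLab’s actual pipeline (which uses a teacher model to generate, and a critic to filter, synthetic examples); the seed questions, template strings, and the `amplify` helper are all hypothetical, just to show how a small seed set can be expanded into a larger synthetic one that follows the seeds’ patterns:

```python
import random

# A not-big-enough seed set of domain-specific Q&A pairs
# (hypothetical examples, for illustration only).
seed_examples = [
    {"question": "What is the late fee for invoice type A?",
     "answer": "Invoice type A carries a 2% late fee."},
    {"question": "What is the late fee for invoice type B?",
     "answer": "Invoice type B carries a 5% late fee."},
]

def amplify(seed, factor=10, rng=None):
    """Naively amplify a seed set by recombining its patterns.

    InstructLab's real approach generates new synthetic examples
    with a teacher LLM and filters them adversarially; this toy
    version just rephrases the seeds with simple templates.
    """
    rng = rng or random.Random(0)  # fixed seed for reproducibility
    templates = [
        "Q: {question} A: {answer}",
        "When asked '{question}', respond: {answer}",
        "{question} -> {answer}",
    ]
    synthetic = []
    for _ in range(factor * len(seed)):
        example = rng.choice(seed)
        template = rng.choice(templates)
        synthetic.append(template.format(**example))
    return synthetic

# Amplify 2 seed examples into 20 synthetic training lines.
data = amplify(seed_examples, factor=10)
print(len(data))  # -> 20
```

The point of the sketch is the shape of the workflow, not the mechanics: a small, human-curated seed set goes in, and a much larger, pattern-faithful training set comes out, which is what makes fine-tuning economical for enterprises that lack big data sets.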
The cool thing here is that, because IBM chose the Apache 2.0 license for everything they’ve open sourced, including Granite, it’s now possible to use InstructLab to train a new model with a Granite model as the foundation, then decide whether to keep it private or open source it and share it with the world. This could be the start of a new ecosystem of trustworthy open-source models trained for very specific tasks that meet the demands of our favorite mantra.
Move Faster Today
Whether your business is just starting its AI journey or seeking to enhance its current efforts, partnering with the right service provider makes all the difference. With a team of over 300 AI professionals, Perficient has extensive knowledge and skills across various AI domains. Learn more about how Perficient can help your organization harness the power of emerging technologies. Contact us today.