Platforms and Technology Articles / Blogs / Perficient
https://blogs.perficient.com/category/services/platforms-and-technology/

3 Topics We’re Excited About at TRANSACT 2026
https://blogs.perficient.com/2026/02/26/3-topics-were-excited-about-at-transact-2026/
Fri, 27 Feb 2026

For years, digital wallets in the U.S. have been steady but unspectacular—useful for tap‑to‑pay, not exactly groundbreaking. But the energy in payments today is coming from somewhere unexpected: the crypto wallet world. Stablecoins now exceed $300 billion in circulation, and the infrastructure behind them is delivering the kind of security, interoperability, and user control traditional payments have long needed. 

That shift sets the stage for TRANSACT 2026, where Perficient’s Director of Payments, Amanda Estiverne, will moderate “Keys, Tokens & Trust: How Crypto Wallets Unlock Tomorrow’s Payments,” unpacking how these technologies can finally push digital wallets into their next era. 

“Beyond the session I’m moderating on crypto wallets—and how this technology is set to supercharge tokenization, transform digital identity, and reinvent the very idea of a mobile wallet—I’m fired up for several powerhouse conversations.” – Amanda Estiverne 

Here are three topics we’re looking forward to exploring—and why they matter now. 

  1. Security That Actually Builds Trust 

Security remains one of the biggest obstacles to broader U.S. digital wallet adoption—but it’s also the area where crypto wallets offer the clearest blueprint forward. Having spent years securing billions in digital assets in high‑risk environments, crypto wallets have refined capabilities such as multi‑signature authentication, advanced biometrics, tokenization, and decentralized key management. They show how strong security and user‑friendly design can coexist.

As regulators sharpen guidance and consumers demand more control over their data, these crypto‑born approaches are becoming increasingly relevant to mainstream payments. In her session, Amanda will explore how these wallet innovations—originally designed for digital assets—can address the core security concerns holding back U.S. mobile wallets and help transform them from simple tap‑to‑pay tools into trusted financial hubs.

“ETA Transact is the gathering place for the entire payments ecosystem. Banks, networks, fintechs, processors, and regulators all come together under one roof to explore what’s next in payments.” – Amanda Estiverne 

  2. Interoperability Across Rails and Borders

One of the most persistent challenges in payments is fragmentation—different rails, incompatible systems, and cross‑border friction that create cost and complexity for businesses and consumers alike. Crypto wallets, by contrast, were designed for interoperability from the start. A single wallet can span multiple networks, assets, and payment types without the user having to think about what’s happening behind the scenes.

It’s a timely shift: real‑time payments are scaling, embedded finance is showing up in more places than ever, and stablecoins have now crossed $300 billion in circulation. With tokenized deposits, stablecoins, and traditional rails now coexisting, payment providers need ways to make these systems work together in a unified experience.

Amanda’s session will break down how the cross‑network, cross‑border capabilities pioneered in crypto wallets can help overcome the interoperability gaps limiting today’s mobile and digital wallets—and why solving this is key to building the next generation of payments.

  3. Identity and Personalization in the AI Era

Digital wallets are quickly becoming more than a place to store cards. With AI, they can deliver smarter, more contextual experiences—from personalized rewards to anticipatory recommendations to voice‑enabled commerce. But to power these experiences responsibly, wallets need identity models that balance personalization with user privacy and control.

Crypto wallets have long used decentralized identity credentials that allow individuals to share only what’s necessary for each interaction. As AI‑driven personalization becomes the norm, that selective‑sharing model becomes even more valuable.

Amanda’s session will explore how decentralized identity frameworks emerging from the crypto space—and now reinforced by tokenization—can give digital wallets the foundation they need to support personalized, AI‑enhanced experiences while still preserving user trust.

“Agentic commerce, stablecoins and digital assets, digital identity, personalized payments, and instant payments are among the key themes shaping the conversation. The financial system is undergoing massive transformation, and these emerging areas will play a defining role in the infrastructure of tomorrow’s payments ecosystem.” – Amanda Estiverne 

Discover the Next Payment Innovation Trends 

TRANSACT 2026 is where theory meets practice: where banks, networks, fintechs, processors, and regulators pressure-test ideas and forge the partnerships that will define the next era of payments.

Amanda’s session focuses on how crypto‑wallet innovations—biometrics, tokenization, decentralized identity, and cross‑border interoperability—can help U.S. mobile wallets finally graduate from tap‑to‑pay conveniences into trusted, intelligent financial hubs.

“It’s where partnerships are forged, new ideas are pressure-tested, and the future of how money moves begins to take shape.” – Amanda Estiverne 

For payment leaders exploring what comes next, this conversation offers a grounded look at the capabilities most likely to redefine digital wallets across security, identity, interoperability, and user experience.

Attending TRANSACT 2026? Come by the Idea Zone at 1:40pm on Thursday, March 19th to hear the exclusive insights. Not attending? Contact Perficient to explore how we help payment and Fintech firms innovate and boost market position with transformative, AI-first digital experiences and efficient operations.

2026 Regulatory Reporting for Asset Managers: Navigating the New Era of Transparency
https://blogs.perficient.com/2026/02/20/2026-regulatory-reporting-for-asset-managers-navigating-the-new-era-of-transparency/
Fri, 20 Feb 2026

The regulatory landscape for asset managers is shifting beneath our feet. It’s no longer just about filing forms; it’s about data granularity, frequency, and the speed at which you can deliver it. As we move into 2026, the Securities and Exchange Commission (SEC) has made its intentions clear: they want more data, they want it faster, and they want it to be more transparent than ever before.

For financial services executives and compliance professionals, this isn’t just a compliance headache—it’s a data infrastructure challenge. The days of manual spreadsheets and last-minute scrambles are over. The new requirements demand a level of agility and precision that legacy systems simply cannot support. If you’re still relying on manual processes to meet these evolving standards, you’re not just risking non-compliance; you’re risking your firm’s operational resilience.

The Shifting Landscape: More Data, More Often

The theme for 2026 is “more.” More frequent filings, more detailed disclosures, and more scrutiny. The SEC’s push for modernization is driven by a desire to better monitor systemic risk and protect investors, but for asset managers, it translates to a significant operational burden.

Take Form N-PORT, for example. What was once a quarterly obligation with a 60-day lag is transitioning to a monthly filing requirement due within 30 days of month-end. This tripling of filing frequency doesn’t just mean three times the work; it means your data governance and reporting engines must be “always-on,” capable of aggregating and validating portfolio data on a continuous cycle.

The “Big Three” for 2026: Form PF, 13F, and N-PORT

While there are numerous reports to manage, three stand out as critical focus areas for 2026: Form PF, Form 13F, and Form N-PORT. Each has undergone significant changes or is subject to new scrutiny that demands your attention.

Form PF: The Private Fund Data Deep Dive

The amendments to Form PF, adopted in February 2024, represent a sea change for private fund advisers. With a compliance date of October 1, 2026, these changes require more granular reporting on fund structures, exposures, and performance. Large hedge fund advisers must now report within 60 days of quarter-end, and the scope of data required—from detailed asset class breakdowns to counterparty exposures—has expanded significantly. This isn’t just another new report. It’s a comprehensive audit of your fund’s risk profile, delivered quarterly.

Form 13F: The Institutional Standard

For institutional investment managers exercising discretion over $100 million or more in 13(f) securities, Form 13F remains a cornerstone of transparency. Filed quarterly within 45 days of quarter-end, this report now requires the companion filing of Form N-PX to disclose proxy votes on executive compensation. This linkage between holdings and voting records adds a new layer of complexity, requiring firms to seamlessly integrate data from their portfolio management and proxy voting systems.

Form N-PORT: The Monthly Sprint

A shift to monthly N-PORT filings is a game-changer for registered investment companies. The requirement to file within 30 days of month-end means that your month-end close process must be tighter than ever. Any delays in data reconciliation or validation will eat directly into your filing window, leaving little margin for error.
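As a rough illustration (not compliance guidance), the deadline math can be sketched in a few lines of JavaScript. The function name and signature here are assumptions for the example; actual deadlines depend on the final rule text and business-day conventions:

```javascript
// Hypothetical helper: compute a filing deadline N calendar days after month-end.
// Uses UTC throughout to avoid timezone drift.
function filingDeadline(year, month /* 1-12 */, lagDays) {
  // Day 0 of the following month is the last day of `month`.
  const deadline = new Date(Date.UTC(year, month, 0));
  deadline.setUTCDate(deadline.getUTCDate() + lagDays);
  return deadline.toISOString().slice(0, 10);
}

// N-PORT: 30 days after the January 2026 month-end (Jan 31).
console.log(filingDeadline(2026, 1, 30)); // "2026-03-02"
// Form PF (large hedge fund advisers): 60 days after Q1 quarter-end (Mar 31).
console.log(filingDeadline(2026, 3, 60)); // "2026-05-30"
```

Note how the 30-day lag can land the N-PORT deadline only a day or two into the second following month, which is exactly why the month-end close has so little slack.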

The Operational Burden: Hidden Costs of Manual Processes

It’s easy to underestimate the time and effort required to produce these reports. A “simple” quarterly update can easily consume a week or more of a compliance officer’s time when you factor in data gathering, reconciliation, and review.

For a large hedge fund adviser, we at Perficient have seen a full Form PF filing take two weeks or more of dedicated effort from multiple teams. When you multiply this across all your reporting obligations, the cost of manual processing becomes staggering. And that’s before you consider the opportunity cost: time your team spends wrangling data is time they aren’t spending on strategic initiatives or risk management.

The Solution: Automation and Cloud Migration

The only viable path forward is automation. To meet the demands of 2026, asset managers must treat regulatory reporting as a data engineering problem, not just a compliance task. This means moving away from siloed spreadsheets and towards a centralized, cloud-native data platform.

By migrating your data infrastructure to the cloud, you gain the scalability and flexibility needed to handle large datasets and complex calculations. Automated data pipelines can ingest, validate, and format your data in real-time, reducing the “production time” from weeks to hours. This isn’t just about efficiency; it’s about accuracy and peace of mind. When your data is governed and your processes are automated, you can file with confidence, knowing that your numbers are right.

Key Regulatory Reports at a Glance

To help you navigate the 2026 reporting calendar, we’ve compiled a summary of the key reports, their purpose, and what it takes to get them across the finish line.

[Table: SEC Forms Asset Managers Must File]

Your Next Move

If your firm would like assistance designing or adopting regulatory reporting processes, or migrating your data infrastructure to the cloud, with a consulting partner that has deep industry expertise, reach out to us here.

Simplifying API Testing: GET Requests Using Karate Framework
https://blogs.perficient.com/2026/02/16/simplifying-api-testing-get-requests-using-karate-framework/
Mon, 16 Feb 2026

The GET HTTP method is commonly used to retrieve data from a server. In this blog, we’ll explore how to automate GET requests using the Karate testing framework, a powerful tool for API testing that supports both BDD-style syntax and rich validation capabilities.

We’ll cover multiple scenarios starting from a simple GET call to advanced response validation using files and assertions.

Step 1: Creating the Feature File

To begin, create a new feature file named GetApi.feature in the directory:

/src/test/java/features

Ensure the file has a valid .feature extension. We’ll use the ReqRes API, a public API that provides dummy user data.

Scenario 1: A Simple GET Request

Feature: Get Api feature

  Scenario: Get API Request
    Given url 'https://reqres.in/api/users?page=2'
    When method GET
    Then status 200
    And print response

Step-by-Step Breakdown:

  • Sends a GET request to https://reqres.in/api/users?page=2
  • Asserts the status code is 200
  • Prints the response for visibility.

Sample Response:

{
  "page": 2,
  "per_page": 6,
  "total": 12,
  "data": [
    {
      "id": 7,
      "email": "michael.lawson@reqres.in",
      "first_name": "Michael",
      "last_name": "Lawson",
      "avatar": "https://reqres.in/img/faces/7-image.jpg"
    },
    {
      "id": 8,
      "email": "lindsay.ferguson@reqres.in",
      "first_name": "Lindsay",
      "last_name": "Ferguson",
      "avatar": "https://reqres.in/img/faces/8-image.jpg"
    }
    // additional data here...
  ]
}

If the expected status is changed, for example to 201, the scenario will fail:

GetApi.feature:11 - status code was: 200, expected: 201, response time: 1786, url: https://reqres.in/api/users?page=2

Scenario 2: GET Request with Background

Feature: Get Api feature

  Background:
    Given url 'https://reqres.in/api'
    And header Accept = 'application/json'

  Scenario: Get API Request with Background 
    Given path '/users?page=2'
    When method GET
    Then status 200
    And print responseStatus

The Background section allows us to define common settings such as URLs and headers once and reuse them across multiple scenarios.

Scenario 3: GET Request with Query Parameter

Feature: Get Api feature

  Background:
    Given url 'https://reqres.in/api'
    And header Accept = 'application/json'

  Scenario: Get API Request with Query Parameter 
    Given path '/users'
    And param page = 2
    When method GET
    Then status 200
    And match header Connection == 'keep-alive'
    And print "response time: " + responseTime

This approach explicitly adds query parameters using param, improving flexibility.

Output:

[print] response time: 1319

Scenario 4: Verifying the Response with Assertions

In Karate, assertions are used to validate the behavior of APIs and other test scenarios. These assertions help verify the response values, structure, status codes, and more. Karate provides several built-in functions for assertions, making it easy to validate complex scenarios.

The match keyword is the most common assertion method in Karate. It can be used to match:

  • Simple values (strings, numbers, etc.)
  • Complex objects (arrays, nested structures)

Karate allows rich assertion syntax using the match keyword. It supports:

  • Exact Matching

  • Partial Matching

  • Fuzzy Matching

  • Boolean Assertions

Examples:

Exact Matching

Scenario: Validate exact match
  Given url 'https://reqres.in/api/users/2'
  When method GET
  Then match response.data.first_name == 'Janet'

Partial Matching

Scenario: Validate partial match
  Given url 'https://reqres.in/api/users/2'
  When method GET
  Then match response contains { "data": { "first_name": "Janet" } }

Fuzzy Matching:

Fuzzy matching ignores extra fields and only focuses on the structure or specific fields:

Scenario: Fuzzy match example
  Given url 'https://reqres.in/api/users/2'
  When method GET
  Then match response == { data: { id: '#number', email: '#string' } }

  • #number and #string are placeholders that match any numeric or string value, respectively
  • Karate also supports using assert for boolean expressions. It evaluates the expression and fails the step when the result is false.
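For instance, a boolean assert on the same endpoint might look like this (a small sketch in the same style as the scenarios above):

```gherkin
Scenario: Boolean assert example
  Given url 'https://reqres.in/api/users/2'
  When method GET
  Then assert responseStatus == 200
  And assert response.data.id == 2
```

Here responseStatus is Karate’s built-in variable holding the HTTP status code, and the assert expressions are evaluated as JavaScript.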

Combined Assertions

Feature: Get Api feature

  Background:
    Given url 'https://reqres.in/api'
    And header Accept = 'application/json'

  Scenario: Get API Request with Assertions
    Given path '/users'
    And param page = 2
    When method GET
    Then status 200
    And match response.data[0].first_name != null
    And assert response.data.length == 6
    And match response contains deep {"data":[ {"first_name": "Michael"}]}
    And match response contains deep {"support": {"text": "To keep ReqRes free, contributions towards server costs are appreciated!"}}
    And match header Content-Type == 'application/json'
    And match header Connection == 'keep-alive'

Scenario 5: Validating Responses Using External Files

File validation using the Karate framework can be achieved by verifying the content, existence, and attributes of files in various formats (e.g., JSON, XML, CSV, or even binary files). Karate provides built-in capabilities to handle file operations, such as reading files, validating file contents, and comparing files.

In this scenario:

  • We use the read function to load the expected JSON (JsonResponse.json) from a path relative to the feature file.
  • The match step is used to validate the content of the response against the expected data in the file.

Feature: Get Api feature
  
  Background: set up the base url
    Given url 'https://reqres.in'
    And header Accept = 'application/json'
  
  Scenario: Get API Request with File Validation
    Given path '/api/users?page=2'
    When method get
    Then status 200
    * def actualResponse = read('../JsonResponse.json')
    And print 'File -->', actualResponse
    And match response == actualResponse

In this post, we’ve demonstrated how to:

  • Create a basic GET request in Karate

  • Use background steps for reusability

  • Pass query parameters

  • Assert and validate API responses

  • Leverage file-based validation

The Karate framework makes it easy to write expressive, maintainable, and powerful API tests. By combining built-in features like match, assert, and read, you can create robust test suites to ensure your APIs behave as expected.

By leveraging scenarios, parameters, and assertions, we can effectively automate and validate API requests.

Building a Marketing Cloud Custom Activity Powered by MuleSoft
https://blogs.perficient.com/2026/02/12/building-a-marketing-cloud-custom-activity-powered-by-mulesoft/
Thu, 12 Feb 2026

The Why…

Salesforce Marketing Cloud Engagement is incredibly powerful at orchestrating customer journeys, but it was never designed to be a system of record. Too often, teams work around that limitation by copying large volumes of data from source systems into Marketing Cloud data extensions—sometimes nightly, sometimes hourly—just in case the data might be needed in a journey. This approach works, but it comes at a cost: increased data movement, synchronization challenges, latency, and ongoing maintenance that grows over time.

Custom Activities, which are surfaced in Journey Builder, open the door to a different model. Instead of forcing all relevant data into Marketing Cloud ahead of time, a journey can request exactly what it needs at the moment it needs it. When you pair a Custom Activity with MuleSoft, Marketing Cloud can tap into real-time, orchestrated data across your enterprise—without becoming another place where that data has to live.

Example 1: Weather

Consider a simple example like weather-based messaging. Rather than pre-loading weather data for every subscriber into a data extension, a Custom Activity can call an API at decision time, retrieve the current conditions for a customer’s location, and immediately branch the journey or personalize content based on the response. The data is used once, in context, and never stored unnecessarily inside Marketing Cloud.

Example 2: Enterprise Data

The same pattern becomes even more compelling with enterprise data. Imagine a post-purchase journey that needs to know the current status of an order, a shipment, or a service case stored in a system like Data 360. Instead of replicating that operational data into Marketing Cloud—and keeping it in sync—a Custom Activity can call MuleSoft, which in turn retrieves and aggregates the data from the appropriate back-end systems and returns only what the journey needs to proceed.

Example 3: URL Shortener for SMS (Real-Time)

While Marketing Cloud Engagement does provide its own form of URL shortener, some companies want to use Bitly. Typically, using a Bitly URL meant moving logic into Server-Side JavaScript (SSJS) so the Bitly API call could be made there before the URL was used in the text message. SSJS forces us into Automation Studio, which cannot run in real time and must be scheduled. This is very important to note: being able to make API calls within the flow of a Journey is very powerful and helps meet more real-time use cases. With these Custom Activities we can ask Mulesoft to call the Bitly API, which returns the shortened URL so it can be used in the email or SMS message.

This is where MuleSoft truly shines. It acts as a clean abstraction layer between Marketing Cloud and your enterprise landscape, handling authentication, transformation, orchestration, and governance. Marketing Cloud stays focused on customer engagement, while MuleSoft owns the complexity of integrating with source systems. The result is a more scalable, real-time, and maintainable architecture—one that reduces data duplication, respects system boundaries, and enables richer, more contextual customer experiences.

The How….

So how does this actually work in practice? In the next section, we’ll walk through how a Marketing Cloud Custom Activity can call a MuleSoft API in the middle of a Journey, receive a response in real time, and use that data to drive decisions or personalization. We’ll focus on the key building blocks—what lives in Marketing Cloud, what belongs in MuleSoft, and how the two communicate—so you can see how this pattern comes together without turning Marketing Cloud into yet another integration layer.

Part 1 – Hosted Files

Every Marketing Cloud Custom Activity starts with hosted files. These files provide the user interface and configuration that Journey Builder interacts with, making them the foundation of the entire solution. At a minimum, this includes five main files/folders.

  1. index.html – This is what you see in Journey Builder when you click on the Custom Activity to configure it.
  2. config.json – This holds the Mulesoft endpoint to call and what output arguments will be used.
  3. customactivity.js – The javascript that is running behind the index.html page.
  4. postmonger.js – More javascript to support the index.html page
  5. A folder called images must exist and a single icon.png image should exist in it.  This image is shown within Journey Builder.

[Image: Blog Ca Files]

These files tell Marketing Cloud how the activity behaves, what endpoints it uses, and how it appears to users when they drag it onto a journey. While the business logic ultimately lives elsewhere, within Mulesoft in our example, hosted files are what make the Custom Activity feel native inside Journey Builder.

In this pattern, hosted files are intentionally lightweight. Their primary responsibility is to capture configuration input from the marketer—such as which API operation to call, optional parameters, or behavior flags—and pass that information along when the journey executes. They are not responsible for complex transformations, orchestration, or direct system-to-system integrations. By keeping the hosted files focused on presentation and configuration, you reduce coupling with backend systems and make the Custom Activity easier to maintain, update, and reuse across different journeys.

A simple place to do a proof of concept is GitHub, if you want to try this yourself: create these four files and the images folder in a repo, then enable the Pages functionality in GitHub so the repo is served from a public URL. This public URL will then be used when we configure the ‘Installed Package’ in Marketing Cloud Engagement later.

In production, Custom Activity config.json and UI assets should be hosted on an enterprise‑grade HTTPS platform like Azure App Service, AWS CloudFront/S3, or Heroku—not GitHub.

One thing I had to overcome is that the config.json gets cached at the Marketing Cloud server level as talked about in this post.  So when I had to make changes to my config.json, I would create a new folder (v2 / v3) in my repository and then use that path in my Installed Package in the Component added in Journey Builder.

Part 2 – API Server – Mulesoft

This is really the beauty here.  Instead of building API calls in SSJS that are hard to debug, difficult to scale and hard to secure, we get to pass all of that off to an enterprise API platform like Mulesoft.  It really is the best of both worlds.  There are basically two main pieces on the Mulesoft side: A) Five endpoints to develop and B) security.

The Five Endpoints.

Journey Builder uses four lifecycle endpoints to manage the activity and one execute endpoint to process each contact and return outArguments used for decisioning and personalization.

The five endpoints that have to be developed in Mulesoft are…

Endpoint    Called When             Per Contact?   Returns outArguments?
/save       User saves config       ❌             ❌
/validate   User publishes          ❌             ❌
/publish    Journey goes live       ❌             ❌
/execute    Contact hits activity   ✅             ✅
/stop       Journey stops           ❌             ❌

For the save, validate, publish, and stop endpoints, Mulesoft needs to return a 200 status code and can return an empty JSON object of {} in the most basic example.

For the execute endpoint, it should also return a 200 status code and simple JSON for any outArguments, for example: { "status": "myStatus" }
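As a plain JavaScript stand-in (not an actual MuleSoft flow), the response contract for the five endpoints can be sketched like this; handleJourneyRequest is a hypothetical name used only for illustration:

```javascript
// Illustrative sketch of the response contract Journey Builder expects.
// In practice these responses are produced by MuleSoft flows, not Node code.
function handleJourneyRequest(path) {
  const lifecycle = ['/save', '/validate', '/publish', '/stop'];
  if (lifecycle.includes(path)) {
    // Lifecycle endpoints: acknowledge with 200 and an empty JSON object.
    return { statusCode: 200, body: {} };
  }
  if (path === '/execute') {
    // Per-contact endpoint: return the outArguments the journey will use.
    // 'myStatus' is a placeholder for whatever the flow actually computes.
    return { statusCode: 200, body: { status: 'myStatus' } };
  }
  return { statusCode: 404, body: { error: 'unknown endpoint' } };
}

console.log(handleJourneyRequest('/execute').body.status); // "myStatus"
```

The key asymmetry is that only /execute runs per contact and only /execute returns outArguments; the other four are one-time lifecycle acknowledgments.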

The Security.

The first piece of security is configured in the config.json file. There is a useJwt key that can be either true or false for each endpoint. If it is true, then Mulesoft will receive an encoded string signed with the JWT Signing Secret that was created in the Installed Package in Marketing Cloud. If useJwt is false, then Mulesoft will just receive the plain JSON. For production-level work we should make sure useJwt is true.
We can also use an OAuth 2.0 Bearer Token. We want to make sure that our Mulesoft endpoints only respond to calls coming from Marketing Cloud Engagement.

Part 3 – Journey Builder – Custom Activities

Once the configuration details are set up as described above, creating the Custom Activity and adding it to the Journey is pretty quick.
  1. Go to the ‘Installed Package’ in setup and create a new app following these steps.
    1. When you add your ‘Component’ to the Installed App selecting ‘Customer Updates’ in the ‘Category’ drop-down worked for me.
    2. My ‘Endpoint URL’ had a format like this:  https://myname.github.io/my_repo_name/v3/
      [Image: Blog Ca Package]
  2. Create a new Journey
  3. Your new Custom Activity will show up in the Components panel on the left-hand side.  Since we selected ‘Customer Updates’ in step 1 above, our ‘Send to Mulesoft V3a’ Custom Activity shows in that section.   The name under the icon comes from the config.json file.  The image is the icon.png from the images folder.
    [Image: Blog Jb View]
  4. Once you drag your Custom Activity onto the Journey Builder page you will be able to click on it to configure it.
  5. The user interface from the index.html will display when you click on it so you can configure your Custom Activity.  Note that this user interface could be changed to collect whatever configuration needs to be collected.
    [Image: Blog Ca Indexpage]
  6. When the ‘Done’ buttons are clicked on the page, then the javascript runs and saves the configuration details into the Journey Builder itself.  In my example the gray and blue ‘Done’ buttons are hooked to the same javascript and really do the same thing.

Part 4 – How to use the Custom Activity

outArguments

Now that we have our Custom Activity configured and in our journey, the integration with Mulesoft becomes a configuration detail, which is great for admins. In the config.json file there are two places where the outArguments are defined.
The first is in the arguments section towards the top. Here I can provide a default value for my status field, which in this case is the very intuitive “DefaultStatus”.  🙂
"arguments": {
   "execute": {
     "inArguments": [],
     "outArguments": [
       {
         "status": "DefaultStatus"
       }
     ],
     "url": "https://mymuleAPI.partofurl.usa-e1.cloudhub.io/api/marketingCloud/execute",
     "useJwt": false,
     "timeout": 60000,
     "retryCount": 3,
     "retryDelay": 3000,
     "concurrentRequests": 5
   }
 },

The second place is lower in the config.json file in the schema section and describes the actual data type for my output variable.  We can see the status variable is a ‘Text’ field, that has access = visible and direction = out.

"schema":{
      "arguments":{
          "execute":{
              "inArguments": [],
              "outArguments":[
                  {
                      "status":{
                          "dataType":"Text",
                          "isNullable":true,
                          "access":"visible",
                          "direction":"out"
                      }
                  }
              ]
          }
      }
  }

Note in the example below that I did not use typical status values like ‘Not Started’, ‘In Progress’, and ‘Done’. That would have made more sense. 🙂  Instead, I ran five records through my journey with various versions of my last name: Luschen, Luschen2, Luschen3, Luschen4, and Luschen5. Mulesoft received these different spellings in the JSON passed over, parsed them out of the incoming JSON, and injected them into the response JSON in the status field. This is what the incoming data extension looked like.

[Image: Blog De]

An important part of the JavaScript turned out to be setting the isConfigured flag to true in the customActivity.js file. This makes sure Journey Builder understands that the node has been configured when the journey is ‘Validated’ before it is ‘Activated’.

activity.metaData = activity.metaData || {};
activity.metaData.isConfigured = true;

Now that we have our ‘status’ field as an output from Mulesoft via the Custom Activity, I will describe how it can be used in either a Decision Split or some AmpScript.

Decision Split

The outArguments show up under the ‘Journey Data’ portion of the configuration screen.  Once you select the ‘status’ outArgument, you configure the rest of the decision split like any other one you have built before.
[Screenshots: Decision Split configuration]

AmpScript

These outArguments are also available as send context attributes so they are easy to use in any manner you want within your AmpScript for either email or SMS personalization.
%%[
SET @status = AttributeValue("status")
]%%
%%=v(@status)=%%

The Wrap-up…

As you let the flexibility of these Custom Activities sink in, a lot of useful patterns emerge.  The more data we can surface to our marketing team, the more dynamic, personalized and engaging the content will become.  While we all see more campaigns and use cases being developed on the new Agentforce Marketing, we all know that Marketing Cloud Engagement has some legs to it yet.  I hope this post has given you some ideas to make your Marketing team look like heroes as they use Journey Builder to its fullest potential!

I want to thank my Mulesoft experts Anusha Danda and Jana Pagadala for all of their help!

Please connect with me on LinkedIn for more conversations!  I am here to help make you a hero with your next Salesforce project.

Example Files…

Config.JSON

{  
  "workflowApiVersion": "1.1",
  "metaData": {
    "icon": "images/icon.png",
    "category": "customer",
    "isConfigured": true,
    "configOnDrop": false
  },
  "type": "REST",
  "lang": {
    "en-US": {
      "name": "Send to MuleSoft V3a",
      "description": "Calls MuleSoft to orchestrate downstream systems V3a."
    }
  },
  "arguments": {
    "execute": {
      "inArguments": [],
      "outArguments": [
        {
          "status": "DefaultStatus"
        }
      ],
      "url": "https://myMuleAPI.rajrd4-1.usa-e1.cloudhub.io/api/marketingCloud/execute",
      "useJwt": true,
      "timeout": 60000,
      "retryCount": 3,
      "retryDelay": 3000,
      "concurrentRequests": 5
    }
  },
  "configurationArguments": {
    "applicationExtensionKey": "MY_KEY_ANYTHING_I_WANT_MULESOFT_TEST",
    "save":    { "url": "https://myMuleAPI.rajrd4-1.usa-e1.cloudhub.io/api/marketingCloud/save",    "useJwt": true },
    "publish": { "url": "https://myMuleAPI.rajrd4-1.usa-e1.cloudhub.io/api/marketingCloud/publish", "useJwt": true },
    "validate":{ "url": "https://myMuleAPI.rajrd4-1.usa-e1.cloudhub.io/api/marketingCloud/validate","useJwt": true },
    "stop":    { "url": "https://myMuleAPI.rajrd4-1.usa-e1.cloudhub.io/api/marketingCloud/stop",    "useJwt": true }
  },
  "userInterfaces": {
    "configModal": { "height": 480, "width": 480 }
  },
  "schema":{
      "arguments":{
          "execute":{
              "inArguments": [],
              "outArguments":[
                  {
                      "status":{
                          "dataType":"Text",
                          "isNullable":true,
                          "access":"visible",
                          "direction":"out"
                      }
                  }
              ]
          }
      }
  }
}

Index.html

<!doctype html>
<html lang="en">
<head>
  <meta charset="utf-8" />
  <title>Terry – JB → Mule Custom Activity</title>
  <meta name="viewport" content="width=device-width, initial-scale=1" />
  <style>
    body { font-family: system-ui, -apple-system, Segoe UI, Roboto, Arial, sans-serif; margin: 24px; }
    label { display:block; margin-top: 16px; font-weight:600; }
    input, select, button { padding: 8px; font-size: 14px; }
    button { margin-top: 20px; }
    .hint { color:#666; font-size:12px; }
  </style>
</head>
<body>
  <h2>Send to MuleSoft – Custom Activity</h2>
  <p class="hint">Configure the API URL and (optionally) bind a Journey field.</p>

  <label for="apiUrl">MuleSoft API URL</label>
  <input id="apiUrl" type="url" placeholder="https://api.example.com/journey/execute" style="width:100%" />

  <label for="fieldPicker">Bind a field from Entry Source (optional)</label>
  <select id="fieldPicker">
    <option value="">— none —</option>
  </select>

  <button id="done">Done</button>

  <!-- Postmonger must be local in your repo -->
  <script src="./postmonger.js"></script>
  <!-- Your Postmonger client logic -->
  <script src="./customActivity.js?v=2026-02-02v1"></script>
</body>
</html>

 

CustomActivity.js

/* global Postmonger */
(function () {
  'use strict';

  // Create the Postmonger session (bridge to Journey Builder)
  const connection = new Postmonger.Session();

  // Journey Builder supplies this payload when we call 'ready'
  let activity = {};
  let schema = [];
  let pendingSelectedField = null;  // holds saved token until options exist

  document.addEventListener('DOMContentLoaded', () => {
    // Listen to JB lifecycle events
    connection.on('initActivity', onInitActivity);
    connection.on('requestedTokens', onTokens);
    connection.on('requestedEndpoints', onEndpoints);
    connection.on('requestedSchema', onRequestedSchema); // common pattern in field pickers
    connection.on('clickedNext', onDone);

    // Signal readiness and request useful context
    connection.trigger('ready');
    connection.trigger('requestTokens');
    connection.trigger('requestEndpoints');

    // Optionally, ask for Entry Source schema (undocumented but widely used in the field)
    connection.trigger('requestSchema');

    // Bind UI
    document.getElementById('done').addEventListener('click', onDone);
  });

  function onInitActivity (payload) {
    activity = payload || {};
    // Re-hydrate UI if the activity is being edited
    try {
      const args = (activity.arguments?.execute?.inArguments || [])[0] || {};
      if (args.apiUrl) document.getElementById('apiUrl').value = args.apiUrl;
      if (args.selectedField) document.getElementById('fieldPicker').value = args.selectedField;
      pendingSelectedField = args.selectedField;
    } catch (e) {}
  }

  function onTokens (tokens) {
    // If you ever need REST/SOAP tokens, they arrive here
    // console.log('JB tokens:', tokens);
  }

  function onEndpoints (endpoints) {
    // REST base URL for BU, if you need it
    // console.log('JB endpoints:', endpoints);
  }

  function onRequestedSchema (payload) {
    schema = payload?.schema || [];
    const select = document.getElementById('fieldPicker');

    // Keep current value if re-opening
    const current = select.value;
    // Reset options (leave the first '— none —')
    select.length = 1;

    // Populate with Entry Source keys (e.g., {{Event.APIEvent-UUID.Email}})
    schema.forEach(col => {
      const opt = document.createElement('option');
      opt.value = `{{${col.key}}}`;
      opt.textContent = col.key.split('.').pop();
      select.appendChild(opt);
    });

    if (current) select.value = current;
    if (pendingSelectedField) select.value = pendingSelectedField;
    
  }

  function onDone () {
    const apiUrl = document.getElementById('apiUrl').value?.trim() || '';
    const selectedField = document.getElementById('fieldPicker').value || '';

    // Validate minimal config
    if (!apiUrl) {
      alert('Please provide a MuleSoft API URL.');
      return;
    }
    // alert(selectedField);

    // Build inArguments that JB will POST to /execute at run time
    const inArguments = [{
      apiUrl,            // static value from UI
      selectedField      // optional mustache ref to Journey Data
    }];

    // Mutate the activity payload we received and hand back to JB
    activity.arguments = activity.arguments || {};
    activity.arguments.execute = activity.arguments.execute || {};
    activity.arguments.execute.inArguments = inArguments;

    activity.metaData = activity.metaData || {};
    activity.metaData.isConfigured = true;

    // Tell Journey Builder to save this configuration
    connection.trigger('updateActivity', activity);
  }
})();

 

]]>
https://blogs.perficient.com/2026/02/12/building-a-marketing-cloud-custom-activity-powered-by-mulesoft/feed/ 3 390190
The Missing Layer: How On-Device AI Agents Could Revolutionize Enterprise Learning https://blogs.perficient.com/2026/02/06/the-missing-layer-how-on-device-ai-agents-could-revolutionize-enterprise-learning/ https://blogs.perficient.com/2026/02/06/the-missing-layer-how-on-device-ai-agents-could-revolutionize-enterprise-learning/#comments Fri, 06 Feb 2026 13:29:58 +0000 https://blogs.perficient.com/?p=390162

A federated architecture for self-improving skills — from every employee’s laptop to the company brain.


Every enterprise has the same problem hiding in plain sight. Somewhere between the onboarding wiki that nobody reads, the Slack threads that disappear after a week, and the senior engineer who carries half the team’s knowledge in their head — institutional knowledge is dying. Not because companies don’t try to preserve it, but because the systems we’ve built to capture it are fundamentally passive. They wait for someone to write a doc. They wait for someone to search. They never learn on their own.

What if every employee’s computer had an AI agent that watched, learned, and guided — and every night, those agents pooled what they’d learned into something smarter than any of them alone?

The State of Enterprise AI Assistants: Smart But Shallow

Today’s enterprise AI tools — Google Agentspace, Microsoft Copilot, Moveworks, Atomicwork — follow the same pattern. A large language model sits in the cloud, connected to your company’s knowledge base. Employees ask questions, the model retrieves answers. It works. But it has three fundamental limitations.

First, all intelligence is centralized. The model only knows what’s been explicitly fed into the knowledge base. It doesn’t learn from the thousands of micro-interactions employees have daily — the workarounds they discover, the mistakes they make, the shortcuts they invent.

Second, there’s no feedback loop from the edge. When a new hire spends 40 minutes figuring out that the VPN must be connected before accessing the PTO portal, that hard-won knowledge dies in their browser history. The next new hire will spend the same 40 minutes. The system never improves from use.

Third, one model serves everyone the same way. A junior developer and a senior architect get the same answers, in the same depth, with the same assumptions about what they already know.

A Different Architecture: Agents That Learn at the Edge

Imagine a three-tier system where intelligence lives at every level — on the employee’s device, on the department server, and at the company core. Each tier runs a different class of model, owns a different scope of knowledge, and communicates on a defined rhythm.

Tier 1: The On-Device Agent (7B–14B Parameters)

Every employee’s workstation runs a small but capable language model — something in the 7B to 14B parameter range, like Llama 3 8B or Qwen 2.5 14B. This model is paired with two things that make it useful: skills and memory.

Skills are structured instructions — think of them as markdown playbooks that tell the agent how to guide the user through specific tasks. A “setup-dev-environment” skill walks a new developer through installing dependencies, configuring their IDE, and running the test suite. A “code-review-checklist” skill ensures PRs meet team standards. These aren’t hardcoded — they’re living documents that the agent reads and follows, and they can be updated without retraining the model.

Memory comes in two layers. Short-term memory captures the day’s interactions: what the user asked, where they got stuck, what worked, what corrections they made. This is append-only, timestamped, and stored locally. Long-term memory is a curated set of facts about the user — their role, expertise level, preferred tools, recurring tasks — that persists across sessions and personalizes every interaction.

The on-device agent is always available, even offline. It responds instantly because there’s no round-trip to a server. And critically, sensitive information — proprietary code, internal discussions, personal struggles — never leaves the machine during the workday.

Tier 2: The Department Server (40B Parameters)

Each department — Engineering, Operations, Sales — runs its own server with a more powerful model in the 40B parameter range. This server has three jobs.

Collecting learnings. On a configurable schedule — real-time, hourly, or nightly depending on the organization’s needs — each device pushes its short-term memory deltas to the department server. Not the raw conversation logs, but distilled learnings: “User discovered that the staging deploy requires flag --skip-cache after the recent infrastructure migration.” A privacy filter strips personally identifiable information before anything leaves the device.
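A toy version of the distilled, filtered payload a device might push; the field names and the single email regex are assumptions, and a real privacy filter would need to cover far more than email addresses:

```javascript
// Matches most email addresses; deliberately simple for illustration.
const EMAIL_RE = /[\w.+-]+@[\w-]+\.[\w.]+/g;

// Turn a raw local note into the small record that leaves the device:
// no conversation logs, PII redacted before anything syncs upward.
function distillLearning(rawNote, deviceId) {
  return {
    device: deviceId,
    capturedAt: new Date().toISOString(),
    learning: rawNote.replace(EMAIL_RE, '[redacted-email]'),
  };
}
```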

Semantic merging. This is where the 40B model earns its keep. When Device A reports “Docker builds fail on M-series Macs without Rosetta” and Device B reports “ARM architecture causes container build errors on Apple Silicon,” the server recognizes these as the same insight expressed differently. It merges them into a single, authoritative entry in the department’s golden copy — the canonical knowledge base for that team.

Conflict resolution with authority. Not all learnings are equal. The system uses an authority model inspired by API authentication scopes. Each device agent carries a token encoding the user’s role and trust level. A junior developer’s correction gets queued for review. A senior engineer’s correction is auto-merged. A team lead can approve or reject queued items. This prevents the golden copy from being polluted by well-intentioned but incorrect contributions while ensuring high-confidence knowledge flows freely.
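The routing rule itself is simple; here is a sketch under the assumption of three trust levels (the role names and threshold are invented for illustration):

```javascript
// Authority model: high-trust contributions merge straight into the
// golden copy, lower-trust ones wait in a review queue.
const TRUST = { junior: 1, senior: 2, lead: 3 };

function routeLearning(learning, contributor) {
  const trust = TRUST[contributor.role] || 0;
  return trust >= TRUST.senior
    ? { action: 'auto-merge', learning }
    : { action: 'queue-for-review', learning };
}
```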

After merging, the department server pushes updated skills back to all devices. Tomorrow morning, when a new hire boots up, their agent already knows about the --skip-cache flag — because someone else discovered it yesterday.

Tier 3: The Company Master Server (70B Parameters)

At the top sits the most powerful model — 70B parameters — responsible for the company-wide knowledge layer. This server doesn’t communicate with individual devices. It only syncs with department servers, exchanging golden copies on a daily or weekly cadence.

The key constraint: departments don’t share raw learnings with each other. Engineering doesn’t see Sales’ objection-handling patterns; Sales doesn’t see Engineering’s debugging workflows. This is both a privacy boundary and a relevance filter — most departmental knowledge is only useful within that department.

But the master server can synthesize cross-cutting insights that no single department would discover alone. If Engineering’s golden copy contains “API response times increased 3x after the v2.4 release” and Sales’ golden copy contains “customer complaints about dashboard loading times spiked this week,” the 70B model connects the dots. It pushes a unified advisory to both departments: Engineering gets “customer-facing impact confirmed — prioritize the performance regression,” and Sales gets “engineering is aware of the dashboard slowdown — expected resolution timeline: 48 hours.”
[Diagrams: the company master server and the components each device runs]

The Daily Rhythm

The system operates on a natural cycle:

Morning. Department servers push updated skills to all devices. Each agent loads the latest golden copy fragments relevant to its user’s role. A new developer gets the freshly refined “setup-dev-environment” skill. A senior engineer gets the latest “production-incident-response” playbook with patterns learned from last week’s outage.

Workday. Each on-device agent guides its user, answers questions, and logs everything to short-term memory. When a user corrects the agent — “No, that’s wrong, you need to run migrations before starting the server” — the agent captures the correction with the user’s authority level.

Sync interval. Based on organizational preference, devices push their learnings to the department server. This could be real-time streaming for fast-moving teams, hourly batches for a balance of freshness and bandwidth, or nightly bulk uploads for organizations prioritizing minimal disruption.

Server processing. The department’s 40B model performs semantic merging — deduplicating, resolving conflicts, filtering PII, and distilling raw observations into authoritative skill updates. High-trust contributions go straight to the golden copy. Lower-trust contributions are queued for review.

Company sync. On a separate, slower cadence, department servers exchange golden copies with the company master. The 70B model looks for cross-departmental patterns and pushes synthesized insights back down.

The Interface: A Chatbot and Coding Agent on Every Machine

The three-tier architecture is the brain. But what the employee actually interacts with is a local chatbot and coding agent running on their machine — powered by the on-device model and grounded in the golden copy that was pushed down that morning.

This isn’t a generic AI assistant. It’s an agent that knows the company’s way of doing things, because the golden copy is the company’s accumulated, distilled operational knowledge. Every answer, every suggestion, every code change it proposes is informed by the patterns, standards, and hard-won lessons that the entire department has contributed to.

For Developers: A Coding Agent That Knows Your Codebase Standards

A developer opens their IDE and the on-device coding agent is available inline — similar to how tools like GitHub Copilot or Cursor work today, but backed by the department’s golden copy rather than a generic training corpus. When the developer writes a new API endpoint, the agent doesn’t just autocomplete syntax. It suggests the error handling pattern that the team standardized last quarter. It flags that the developer is about to use a deprecated internal library that three other engineers already migrated away from. It proposes the exact test structure that passed code review most consistently, based on patterns the department server distilled from hundreds of merged PRs.

If the developer asks “how do I connect to the staging database?” the agent doesn’t give a generic PostgreSQL tutorial. It gives the team’s specific connection string format, reminds them to use the read-only replica for queries, and mentions the VPN requirement — all because those details were learned by other developers’ agents, merged into the golden copy, and pushed down as part of this morning’s skill update.

For New Hires: A Conversational Onboarding Guide

A new operations hire opens the chatbot on day one and simply asks: “What should I do first?” The agent responds with a structured onboarding path tailored to their role — not from a static wiki, but from a living skill that has been refined by the struggles and discoveries of every previous new hire. It walks them through account setup, tool installation, and first tasks step by step, answering follow-up questions in context.

When the new hire asks a question the agent can’t answer confidently, it says so — and logs the gap. That gap becomes a learning signal: if three new hires in a row ask the same unanswered question, the department server flags it as a missing skill that needs to be authored by a senior team member. The system doesn’t just answer questions. It discovers which questions should have answers but don’t yet.

For Everyone: A Knowledge Q&A Layer

Beyond coding and onboarding, the chatbot serves as a universal knowledge interface. “What’s the process for requesting a new AWS account?” “Who owns the billing microservice?” “What changed in the deployment pipeline last week?” These questions get answered instantly from the golden copy, with the confidence that the answers reflect the department’s current, collectively validated understanding — not a stale Confluence page from 2023.

The agent can also proactively surface relevant knowledge. If it detects that a developer is working on the authentication module (based on file context), it might surface a note from the golden copy: “Reminder: the auth module has a known race condition under high concurrency. See the workaround documented after the January incident.” This isn’t the agent being clever — it’s the golden copy doing its job, putting the right knowledge in front of the right person at the right time.

Why On-Device Matters

Running a model on every employee’s machine isn’t just an architectural choice — it unlocks capabilities that cloud-only systems can’t match.

Privacy by design. Code, internal communications, and personal context never leave the device during work hours. Only distilled, anonymized learnings sync to the server. This matters enormously for regulated industries and for employee trust.

Zero-latency guidance. The agent responds in milliseconds, not seconds. For a developer in flow state, the difference between an instant inline suggestion and a 2-second cloud round-trip is the difference between staying focused and being interrupted.

Personalization without centralization. The on-device agent knows this user’s preferences, skill level, and work patterns. It adapts its explanations, adjusts its depth, and remembers past conversations — all locally, without the server needing to maintain per-user state.

Offline resilience. The agent works on airplanes, in server rooms with restricted connectivity, and during cloud outages. The skills it loaded that morning are sufficient for most guidance tasks.

The Federated Learning Parallel

This architecture mirrors a well-established pattern in machine learning: federated learning. Google uses it to improve phone keyboards — each device trains locally on your typing patterns, sends only model weight updates (not your texts) to a central server, and the server aggregates improvements that benefit all users.

The difference is that traditional federated learning operates on model weights — opaque numerical tensors. This system operates on natural-language skills and memories — human-readable markdown that can be version-controlled, audited, and manually edited. An engineering manager can open the golden copy, read every skill in plain English, and decide whether a particular learning should be promoted, revised, or rejected. This transparency is critical for enterprise adoption where auditability and human oversight are non-negotiable.

There’s also a conceptual parallel to knowledge distillation in ML research, where a large “teacher” model’s knowledge is compressed into a smaller “student” model for edge deployment. Here, the 70B company model’s synthesized insights are distilled into skill updates that the 7B device models can act on — not through weight transfer, but through updated natural-language instructions.

Concrete Scenarios

New Developer Onboarding (Week 1)

Monday morning. The developer’s laptop has a 7B model loaded with the Engineering department’s latest skills. The “new-hire-onboarding” skill activates automatically.

The agent walks through environment setup step by step. At step 4, the developer hits an error: node-gyp fails on their specific macOS version. They spend 15 minutes finding the fix on Stack Overflow and tell the agent: “I needed to install Xcode Command Line Tools first — add that as a prerequisite.”

The agent logs this to short-term memory with the user’s authority level (junior). At the next sync cycle, the department server receives this learning. Since three other new hires hit the same issue last month (already in the golden copy as a known friction point), the server’s 40B model upgrades the severity and adds the prerequisite to the onboarding skill.

Tuesday morning, the next new hire’s agent already includes: “Before proceeding, verify Xcode Command Line Tools are installed: xcode-select --install.”

Cross-Department Insight Discovery

The Engineering golden copy contains: “API latency P99 increased from 200ms to 800ms after deploying service mesh v3.2.”

The Sales golden copy contains: “Three enterprise prospects paused contract negotiations citing ‘platform performance concerns’ this quarter.”

Neither department connected these. During the weekly company sync, the master 70B model identifies the correlation and pushes an advisory to both: Engineering receives a business-impact escalation, and Sales receives a technical context update with an estimated resolution timeline sourced from Engineering’s incident tracking.

Open Questions and Honest Limitations

This architecture is a synthesis of existing building blocks — on-device models, skill-based agent systems, federated sync patterns, semantic merging — assembled in a way that doesn’t exist as a product today. Several hard problems remain.

Merge quality at scale. Semantic merging works well with 10 devices. With 500, the volume of daily learnings could overwhelm even a 40B model’s ability to meaningfully synthesize. Hierarchical sub-teams within departments — team leads running intermediate merges — may be necessary.

Skill drift. If the golden copy evolves continuously, skills from six months ago might be unrecognizable. Version control and the ability to diff skill changes over time are essential. Treating the golden copy as a git repository with commit history is one approach.

Model capability at the edge. A 7B model can follow instructions and log observations, but its reasoning is limited. It might misinterpret a user’s correction or log a false insight. The authority system mitigates this — low-trust contributions get reviewed — but it doesn’t eliminate the risk.

Adoption friction. Employees need to trust that their on-device agent isn’t surveillance. The system must be transparently opt-in for the learning cycle, with clear boundaries between what stays local and what syncs. The privacy filter must be verifiable, not just promised.

Hardware cost. Running a 7B model on every employee’s laptop requires machines with sufficient RAM and ideally a capable GPU. For many knowledge workers with modern laptops, this is already feasible. For organizations with aging hardware fleets, it may require phased rollout.

What Exists Today

The building blocks are real and available now:

  • On-device models in the 7B–14B range run comfortably on Apple Silicon Macs and modern workstations using tools like Ollama, llama.cpp, and LM Studio.
  • Skill-based agent frameworks — notably the AgentSkills open standard developed by Anthropic and adopted by multiple platforms — define exactly how to package instructions as markdown files that agents can discover and follow.
  • Memory architectures with short-term daily logs and long-term curated knowledge are production-tested in platforms like OpenClaw, which uses MEMORY.md for persistent facts and memory/YYYY-MM-DD.md for daily context.
  • Self-improving agent patterns exist in the wild — OpenClaw’s community has published skills that capture corrections and learnings automatically, and the Foundry plugin demonstrates a full observe-learn-write-deploy loop on a single device.
  • Federated learning is a mature field in ML research, with frameworks like NVIDIA FLARE and Flower enabling distributed training across devices.
  • Hierarchical multi-agent architectures — supervisor agents coordinating specialist agents across departments — are in production at companies like BASF (via Databricks) and documented extensively by Microsoft and Salesforce.

What nobody has assembled is the specific combination: on-device small models learning from daily use, syncing through department servers with semantic merging and authority-based trust, rolling up to a company-wide master that discovers cross-departmental patterns — all operating on human-readable, version-controllable, natural-language skills rather than opaque model weights.

The Bet

The bet is simple. Today’s enterprise AI is a library — it holds knowledge and waits for you to ask. The architecture described here is a living organism — it learns from every employee, improves overnight, and wakes up smarter each morning.

Every company already has the knowledge it needs to onboard faster, debug quicker, and operate more efficiently. That knowledge just lives in the wrong places: in people’s heads, in forgotten Slack threads, in tribal rituals passed from senior to junior. An on-device AI agent that captures this knowledge as it’s created — and a federated system that distills it into something the whole organization can benefit from — doesn’t require any breakthrough in AI capability. It requires assembling pieces that already exist into a system that nobody has built yet.

The pieces are on the table. Someone just needs to put them together.


This post explores a conceptual architecture for federated, on-device AI agents in enterprise settings. The building blocks referenced — AgentSkills, OpenClaw, federated learning frameworks — are real, production-available technologies. The specific three-tier system described is a proposed design, not an existing product.

]]>
https://blogs.perficient.com/2026/02/06/the-missing-layer-how-on-device-ai-agents-could-revolutionize-enterprise-learning/feed/ 4 390162
Helm Unit Test: Validate Charts Before Deploy https://blogs.perficient.com/2026/02/02/helm-unit-tests-validate-charts-before-deployment/ https://blogs.perficient.com/2026/02/02/helm-unit-tests-validate-charts-before-deployment/#respond Mon, 02 Feb 2026 06:51:17 +0000 https://blogs.perficient.com/?p=390079

Why Helm UnitTest?

Untested charts risk syntax errors, wrong resources, or missing configs that surface only during installs. Unit tests render templates locally, catch issues early, and ensure values like replicas or images work as expected.

CI/CD pipelines run these tests automatically, blocking merges on failures. Using helm-unittest lowers production incidents caused by template bugs.

Install Helm-Unittest
Prerequisites: Helm 3.8+, Git, basic YAML skills.

Install the plugin:

helm plugin install https://github.com/helm-unittest/helm-unittest.git

helm unittest --help  # Verify installation

Chart Structure

Organize your chart like this:

my-app/
├── Chart.yaml
├── values.yaml
├── templates/
│   ├── deployment.yaml
│   └── service.yaml
└── tests/
    └── basic_test.yaml

Write Your First Test

Create tests/basic_test.yaml:

suite: basic deployment tests
templates:
  - deployment.yaml
tests:
  - it: should have correct kind
    asserts:
      - isKind:
          of: Deployment

  - it: sets replicas correctly
    set:
      replicaCount: 3
    asserts:
      - equal:
          path: spec.replicas
          value: 3

  - it: uses correct image
    set:
      image.repository: nginx
      image.tag: "1.21"
    asserts:
      - equal:
          path: spec.template.spec.containers[0].image
          value: nginx:1.21

This suite renders deployment.yaml, applies value overrides, and verifies structure.

Assertion       Purpose                    Example
isKind          Check resource type        of: Deployment
equal           Exact path match           path: spec.replicas, value: 2
matchSnapshot   Full template validation   {}
hasKey          Field existence            path: metadata.labels.app
matchRegex      Pattern matching           path: metadata.name, value: "^my-app-.*"
hasDocuments    Document count             count: 1
isEmpty         No documents               {}
fileExists      Template presence          path: templates/deployment.yaml

Test labels, resources, env vars, and volumes similarly.
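For example, an env-var check might look like the following. This assumes the chart templates a container env entry from a logLevel value, which is not part of the chart above:

```yaml
- it: sets the LOG_LEVEL env var
  set:
    logLevel: debug
  asserts:
    - equal:
        path: spec.template.spec.containers[0].env[0].name
        value: LOG_LEVEL
    - equal:
        path: spec.template.spec.containers[0].env[0].value
        value: debug
```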

Run Tests

From chart root:

helm unittest .                    # Default run
helm unittest . -v                 # Verbose output
helm unittest . --output-type junit > test-results.xml  # CI reports

Green output shows passes; failures list exact path mismatches.

Example verbose run:

✓ basic_test.yaml: "should have correct kind" 
✓ basic_test.yaml: "sets replicas correctly"

 

Advanced Testing

Multiple Values Files:

- it: works with prod values
  values:
    - ../../values-prod.yaml
  asserts:
    - greater:
        path: spec.template.spec.containers[0].resources.limits.memory
        value: 1Gi

Subcharts: Prefix with subchartName//.
Conditionals: Test if blocks with varied set values.

Snapshots for golden files:

- it: matches production snapshot
  set:
    env: production
  asserts:
    - matchSnapshot: {}

Update with helm unittest . -u.

CI/CD Integration

GitHub Actions workflow (.github/workflows/test.yaml):

name: Helm Test
on: [pull_request]
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
    - uses: actions/checkout@v4
    - uses: azure/setup-helm@v4
      with:
        version: v3.15.0
    - run: helm plugin install https://github.com/helm-unittest/helm-unittest
    - run: helm unittest . --output-type junit > junit.xml
    - uses: actions/upload-artifact@v4
      with:
        name: test-results
        path: junit.xml

 

Best Practices

  • Follow AAA: Arrange (set values), Act (render), Assert (check).

  • One assertion per test for clear failures.

  • Test defaults, overrides, and edge cases (replicaCount: 0).

  • Combine with helm lint and helm template --validate.

  • Version control tests/ with your chart.

  • Use descriptive it: names.

Common Pitfalls

  • Path typos: Use valid JSONPath like spec.template.spec.containers[0].name (no dot before the bracket).

  • Snapshot drift: Review changes before -u.

  • Global values: Scope with global.* paths.

Start testing your charts today; your future self will thank you.

]]>
https://blogs.perficient.com/2026/02/02/helm-unit-tests-validate-charts-before-deployment/feed/ 0 390079
Unlocking the Power of On-Device AI with Google AI Edge https://blogs.perficient.com/2026/02/01/unlocking-the-power-of-on-device-ai-with-google-ai-edge/ https://blogs.perficient.com/2026/02/01/unlocking-the-power-of-on-device-ai-with-google-ai-edge/#comments Sun, 01 Feb 2026 23:11:54 +0000 https://blogs.perficient.com/?p=389883

In the rapidly evolving world of artificial intelligence, the shift from cloud-based processing to on-device AI is transforming how we interact with technology. Google is at the forefront of this revolution with Google AI Edge, a comprehensive suite of tools designed to help developers deploy high-performance AI directly on mobile, web, and embedded devices.

This recent rollout changes the game for how developers add smart features to applications. By moving processing to the edge, everything runs directly on the device—meaning faster performance, no need for an internet connection, and significantly better privacy since sensitive data stays local.

True Cross-Platform Support

One of the standout features of this update is its flexibility. In the past, running models across different ecosystems was a headache. Google AI Edge solves this with robust cross-platform support.

A single model can now work smoothly across Android, iOS, web browsers, and even small embedded hardware. Furthermore, it supports major frameworks like JAX, Keras, PyTorch, and TensorFlow, allowing you to avoid painful conversions when switching tools.

The Google AI Edge Stack

Google AI Edge isn’t just a single tool; it’s a full ecosystem designed to bridge the gap between complex ML models and consumer hardware.

The Google AI Edge Architecture

1. LiteRT (formerly TensorFlow Lite)

Recently renamed to LiteRT, this is the backbone of on-device execution. It is a high-performance runtime that executes models quickly with hardware acceleration, optimizing performance across NPUs, GPUs, and CPUs.

2. MediaPipe

If you need speed and ease of use, MediaPipe provides “low-code” solutions for common tasks. This includes ready-made APIs for object detection, hand tracking, and text processing.

3. Gemini Nano

The crown jewel of efficient AI, Gemini Nano is Google’s most efficient model built specifically for on-device tasks. With recent updates, Gemini Nano is now available for Android testing, making it much easier to build advanced, generative AI apps.

Experience it Live: The Google AI Edge Gallery

Reading about on-device AI is one thing, but seeing it in action is another. Google has released the AI Edge Gallery, an open-source Android and iOS application that showcases what’s possible today.

The Gallery isn’t just a tech demo; it’s a playground where you can run GenAI models fully offline. Key features include:

  • Tiny Garden: An experimental mini-game where you use natural language to plant and water flowers—processed entirely offline.
  • Ask Image: Snap a photo and ask questions about it using visual question answering capabilities.
  • Audio Scribe: Real-time transcription and translation of speech.
  • Performance Metrics: For developers, the app displays real-time benchmarks like “Time To First Token” (TTFT) so you can see exactly how fast a model runs on your specific hardware.

Get Started

Developers who want quicker, smarter user experiences should definitely explore this update. Whether you are looking to integrate Gemini Nano into your app or just want to test the limits of your smartphone, Google AI Edge provides the pathway.

]]>
https://blogs.perficient.com/2026/02/01/unlocking-the-power-of-on-device-ai-with-google-ai-edge/feed/ 1 389883
Deep Thinking with AI Clusters: The Future of Distributed Intelligence https://blogs.perficient.com/2026/01/29/deep-thinking-with-ai-clusters-the-future-of-distributed-intelligence/ https://blogs.perficient.com/2026/01/29/deep-thinking-with-ai-clusters-the-future-of-distributed-intelligence/#respond Thu, 29 Jan 2026 20:49:26 +0000 https://blogs.perficient.com/?p=390033

In an era where artificial intelligence shapes every facet of our digital lives, a quiet revolution is unfolding in home labs and enterprise data centers alike. The AI Cluster paradigm represents a fundamental shift in how we approach machine intelligence—moving from centralized cloud dependency to distributed, on-premises deep thinking systems that respect privacy, reduce costs, and unlock unprecedented flexibility.

This exploration dives into the philosophy behind distributed AI inference, the tangible benefits of AI clusters, and the emerging frontier of mobile Neural Processing Units (NPUs) that promise to extend intelligent computing to the edge of our networks.

The AI Cluster dashboard provides an intuitive interface for submitting inference jobs and monitoring worker status

The Philosophy of Deep Thinking in Distributed Systems

Traditional AI deployment follows a client-server model: send your data to the cloud, receive processed results. This approach, while convenient, creates fundamental tensions with privacy, latency, and control. AI clusters invert this paradigm.

“Deep thinking isn’t just about model size—it’s about creating the conditions where complex reasoning can occur without artificial constraints imposed by network latency, privacy concerns, or API rate limits.”

An AI cluster operates on three core principles:

1. Locality of Computation

Data never leaves your network. Whether processing proprietary code, sensitive documents, or experimental research, the inference happens within your controlled environment. This isn’t just about security—it’s about creating a space for uninhibited exploration where the AI can engage with your full context.

2. Heterogeneous Resource Pooling

A cluster doesn’t discriminate between hardware. NVIDIA CUDA GPUs, Apple Silicon with Metal acceleration, and even CPU-only nodes work together. This democratizes AI access—you don’t need a $40,000 H100; your gaming PC, MacBook, and old server can contribute meaningfully.

3. Emergent Capabilities Through Distribution

When workers specialize based on their capabilities, the cluster develops emergent behaviors. Large models run on powerful nodes for complex reasoning, while smaller models handle quick queries on lighter hardware. The system self-organizes around its constraints.

Architecture of Thought: How AI Clusters Enable Deep Reasoning

The AI Cluster architecture is deceptively simple yet profoundly effective. At its heart lies a coordinator—a Flask-based API server managing job distribution via Redis queues. Workers, running on diverse hardware, poll for jobs, download cached models, execute inference, and return results.

┌─────────────────────────────────────────────────────────────┐
│                    User Request Flow                        │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│   Browser/API Client                                        │
│         │                                                   │
│         ▼                                                   │
│   ┌─────────────┐    ┌─────────────┐    ┌─────────────┐    │
│   │ Coordinator │───▶│ Redis Queue │───▶│   Workers   │    │
│   │  (Flask)    │    │  (Job Pool) │    │ (GPU/CPU)   │    │
│   └─────────────┘    └─────────────┘    └─────────────┘    │
│         │                                      │            │
│         │◀────────────────────────────────────┘            │
│         │         Results + Metrics                        │
│         ▼                                                   │
│   ┌─────────────┐                                          │
│   │  WebSocket  │ ───▶ Real-time Progress Updates          │
│   └─────────────┘                                          │
│                                                             │
└─────────────────────────────────────────────────────────────┘

What makes this architecture conducive to deep thinking?

Asynchronous Processing: Jobs enter a queue, freeing users from synchronous waiting. This enables batch processing of complex, multi-step reasoning tasks that might take minutes rather than seconds.

Context Preservation: The system supports project uploads—entire codebases can be zipped and provided as context. When the AI generates code, it does so with full awareness of existing patterns, dependencies, and architectural decisions.

Model Selection Flexibility: From 6.7 billion parameter models for quick responses to 70 billion parameter behemoths for nuanced reasoning, the cluster dynamically routes jobs to appropriate workers based on model requirements and hardware capabilities.

The Model Management interface lets you download and manage models of various sizes—from efficient 7B models to powerful 32B variants

The Tangible Benefits of Local AI Clusters

Beyond philosophical advantages, AI clusters deliver concrete benefits that compound over time:

Benefit         Cloud API Approach                          AI Cluster Approach
Cost            Per-token billing, unpredictable at scale   One-time model download, electricity only
Privacy         Data sent to third-party servers            Data never leaves your network
Availability    Dependent on internet, subject to outages   Works offline after initial setup
Rate Limits     Throttled during high demand                Limited only by your hardware
Customization   Fixed model versions, limited tuning        Choose any GGUF model, quantization level
Latency         Network round-trip overhead                 Local network speeds (sub-millisecond)

Real-World Scenario: Code Generation at Scale

Consider a development team generating AI-assisted code reviews for 1,000 pull requests monthly. With cloud APIs charging $0.01-0.03 per 1K tokens, costs quickly escalate to hundreds or thousands of dollars. An AI cluster running on existing hardware reduces this to electricity costs—often pennies per day.
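
To make that concrete, here is a back-of-the-envelope comparison. Every number below (tokens per review, per-token price, GPU wattage, electricity rate) is an illustrative assumption, not a measured figure:

```python
# Back-of-the-envelope cost comparison. All numbers are illustrative
# assumptions, not measured figures.

def cloud_cost(reviews_per_month: int, tokens_per_review: int,
               price_per_1k_tokens: float) -> float:
    """Monthly spend under per-token cloud billing."""
    return reviews_per_month * tokens_per_review / 1000 * price_per_1k_tokens

def electricity_cost(watts: float, hours_per_month: float,
                     price_per_kwh: float) -> float:
    """Monthly electricity for a local inference node."""
    return watts / 1000 * hours_per_month * price_per_kwh

# 1,000 PR reviews at an assumed ~10K tokens each, $0.02 per 1K tokens
api_bill = cloud_cost(1_000, 10_000, 0.02)    # roughly $200/month
# An assumed 300 W GPU busy 60 hours/month at $0.15 per kWh
power_bill = electricity_cost(300, 60, 0.15)  # roughly $2.70/month

print(f"Cloud API:   ${api_bill:.2f}/month")
print(f"Electricity: ${power_bill:.2f}/month")
```

Even with generous assumptions in the cloud's favor, local inference comes out two orders of magnitude cheaper at this volume.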

The Job History view tracks all completed inference tasks, showing model used, worker assignment, and execution timestamps

The Mobile NPU Frontier: Extending Intelligence to the Edge

Perhaps the most exciting development in distributed AI isn’t happening in data centers—it’s happening in your pocket. Modern smartphones contain dedicated Neural Processing Units capable of running billions of operations per second with remarkable energy efficiency.

Understanding Mobile NPUs

Mobile NPUs are specialized accelerators designed for machine learning workloads:

  • Apple Neural Engine: 16 cores delivering up to 35 TOPS (trillion operations per second) on iPhone and iPad
  • Qualcomm Hexagon NPU: Integrated into Snapdragon processors, offering up to 45 TOPS on flagship Android devices
  • Samsung Exynos NPU: Dedicated AI blocks for on-device inference
  • Google Tensor TPU: Custom silicon optimized for Pixel devices
The Workers dashboard displays connected compute nodes—here showing a Mac-mini leveraging Apple’s Neural Engine for Metal-accelerated inference

Why Mobile NPUs Matter for AI Clusters

The integration of mobile NPUs into AI cluster architectures represents a paradigm shift:

Ubiquitous Compute Availability

Every smartphone becomes a potential worker node. A team of 10 people effectively adds 10 NPU accelerators to the cluster during work hours—and these aren’t trivial resources. Modern mobile NPUs can run 3-7 billion parameter models in quantized formats.

Energy Efficiency Advantage

Mobile NPUs are engineered for battery-constrained environments. They deliver impressive performance-per-watt, often 10-100x more efficient than desktop GPUs for inference workloads. For always-on edge inference, this efficiency is transformative.

Latency at the Edge

For applications requiring immediate response—voice interfaces, real-time code suggestions, on-device translation—mobile NPUs eliminate network round-trips entirely. The AI thinks where you are, not where the server is.

Integration Pathways for Mobile NPU Workers

Integrating mobile devices into an AI cluster requires careful consideration of their unique constraints:

Mobile NPU Integration Architecture:

┌─────────────────────────────────────────────────────────────┐
│                    Coordinator Server                       │
│   ┌─────────────────────────────────────────────────────┐   │
│   │     Job Queue with Device Capability Matching       │   │
│   │                                                     │   │
│   │  [Complex Job: 70B Model] ───▶ Desktop GPU Worker  │   │
│   │  [Medium Job: 7B Model]  ───▶ MacBook Metal        │   │
│   │  [Light Job: 3B Model]   ───▶ Mobile NPU Worker    │   │
│   │  [Edge Job: 1B Model]    ───▶ Any Available NPU    │   │
│   └─────────────────────────────────────────────────────┘   │
└─────────────────────────────────────────────────────────────┘

Mobile Workers:
┌──────────────┐  ┌──────────────┐  ┌──────────────┐
│   iPhone 15   │  │  Pixel 8     │  │  Galaxy S24  │
│   Neural Eng  │  │  Tensor TPU  │  │  Exynos NPU  │
│   (15 TOPS)   │  │  (27 TOPS)   │  │  (20 TOPS)   │
└──────────────┘  └──────────────┘  └──────────────┘

The coordinator must understand device capabilities—battery level, thermal state, NPU availability, and supported model formats. Jobs are then intelligently routed:

  • Background inference: When devices are charging and idle, they can process larger batches
  • On-demand edge inference: Immediate local processing for time-sensitive requests
  • Federated processing: Distribute large jobs across multiple mobile devices for parallel execution

Deep Thinking: The Cognitive Benefits of Distributed AI

Beyond technical metrics, AI clusters enable qualitative improvements in how we interact with artificial intelligence:

Unhurried Reasoning

Cloud APIs optimize for throughput and revenue. Local clusters optimize for quality. When you’re not paying per-token, you can allow the model to “think” longer, generate multiple candidates, and self-critique. This creates space for emergent reasoning patterns that rushed inference precludes.

Contextual Continuity

With project uploads and persistent context, the AI develops a coherent understanding of your work over time. It’s not starting from zero with each request—it’s building on accumulated knowledge of your codebase, your patterns, your preferences.

Experimental Freedom

Without cost concerns, developers explore more freely. Ask the AI to generate ten different implementations. Request detailed explanations of every design decision. Iterate on prompts until they’re perfect. This experimental abundance is where breakthrough insights emerge.

“The best tool is the one you use without hesitation. When AI assistance is free and private, you integrate it into your workflow at the speed of thought.”

Building Your Own AI Cluster: Key Considerations

For those inspired to build their own distributed AI infrastructure, consider these foundational elements:

Hardware Requirements

Model Size    Minimum VRAM/RAM   Recommended Hardware
3-7B (Q4)     4-8 GB             Entry GPU, Apple M1, Mobile NPU
13-14B (Q4)   10-16 GB           RTX 3060+, Apple M1 Pro+
33-34B (Q4)   20-24 GB           RTX 3090/4090, Apple M2 Max+
70B (Q4)      40-48 GB           Multi-GPU, Apple M2 Ultra

Network Architecture

Isolate your cluster on a dedicated subnet for security. The AI Cluster architecture uses 10.10.10.0/24 by default, with API key authentication and Redis password protection. All traffic stays internal—the coordinator never exposes endpoints to the internet.

Model Selection Strategy

Choose models that match your primary use cases:

  • Code generation: DeepSeek Coder V2 (16B), Qwen 2.5 Coder (32B)
  • General reasoning: Mixtral, Llama 3
  • Quick responses: Smaller 7B models with aggressive quantization

The Future: Convergence of Cloud, Edge, and Mobile

The trajectory is clear: AI inference is becoming increasingly distributed. The future cluster won’t distinguish between a rack-mounted server and a smartphone—it will see a heterogeneous pool of capabilities, dynamically allocating workloads based on real-time conditions.

Key developments to watch:

  • Improved mobile inference frameworks: Core ML, NNAPI, and TensorFlow Lite are rapidly closing the gap with desktop frameworks
  • Federated learning integration: Clusters that not only infer but continuously improve through distributed training
  • Hybrid cloud-edge architectures: Local clusters handling sensitive/frequent workloads while burst capacity comes from cloud providers
  • Specialized edge accelerators: Dedicated NPU devices (like Coral TPU) at $50-100 price points

Conclusion: Thinking Without Boundaries

AI clusters represent more than a technical architecture—they embody a philosophy of democratized intelligence. By distributing computation across diverse hardware, keeping data private, and eliminating usage costs, we create conditions for genuine deep thinking.

The addition of mobile NPUs extends this philosophy to its logical conclusion: intelligence that follows you, processes where you are, and thinks at the speed your context demands.

Whether you’re a solo developer in a home lab or an enterprise team building internal AI infrastructure, the principles remain constant: maximize locality, embrace heterogeneity, and design for the deep thinking that emerges when artificial intelligence is liberated from artificial constraints.

Start Your Journey

The AI Cluster project is open source under AGPL-3.0, with commercial licensing available. Explore the architecture, deploy your first worker, and experience what it means to have an AI that truly works for you.

Components included: Flask coordinator, universal Python worker, React dashboard, and comprehensive documentation for Proxmox deployment.

]]>
https://blogs.perficient.com/2026/01/29/deep-thinking-with-ai-clusters-the-future-of-distributed-intelligence/feed/ 0 390033
Just what exactly is Visual Builder Studio anyway? https://blogs.perficient.com/2026/01/29/just-what-exactly-is-visual-builder-studio-anyway/ https://blogs.perficient.com/2026/01/29/just-what-exactly-is-visual-builder-studio-anyway/#respond Thu, 29 Jan 2026 15:40:45 +0000 https://blogs.perficient.com/?p=389750

If you’re in the world of Oracle Cloud, you are most likely busy planning your big switch to Redwood. While it’s easy to get excited about a new look and a plethora of AI features, I want to take some time to talk about a tool that’s new (at least to me) that comes along with Redwood. Functional users will come to know VB Studio as the new method for delivering page customizations, but I’ve learned it’s much more.

VB Studio has been around since 2020, but I only started learning about it recently. At its core, VB Studio is Oracle’s extension platform. It provides users with a safe way to customize by building around their systems instead of inside of it. Since changes to the core code are not allowed, upgrades are much less problematic and time consuming.  Let’s look at how users of different expertise might use VB Studio.

Oracle Cloud Application Developers

I wouldn’t call myself a developer, but this is the area I fit into. Moving forward, I will not be using Page Composer or HCM Experience Design Studio…and I’m pretty happy about that. Every client I work with wants customization, so having a one-stop shop with Redwood is a game-changer after years of juggling tools.

Sandboxes are gone. VB Studio uses Git repositories with branches to track and log every change. Branches let multiple people work on different features without conflict, and teams review and merge changes into the main branch in a controlled process.

And what about when these changes are ready for production? By setting up a pipeline from your development environment to your production environment, these changes can be pushed straight into production. This is huge for me! It reduces the time needed to implement new Oracle modules. It also helps with updating or changing existing systems as well. I’ve spent countless hours on video calls instructing system administrators on how to perform requested changes in their production environment because their policy did not allow me to have access. Now, I can make these changes in a development instance and push them to production. The sys admin can then view these changes and approve or reject them for production. Simple!


Low-Code Developers

 

Customizations to existing features are great, but what about building entirely new functionality and embedding it right into your system?  VB Studio simplifies building applications, letting low-code developers move quickly without getting bogged down in traditional coding. With VB Studio’s visual designer, developers can drag and drop components, arrange them the way they want, and preview changes instantly. This is exciting for me because I feel like it is accessible for someone who does very little coding. Of course, for those who need more flexibility, you can still add custom logic using familiar web technologies like JavaScript and HTML (also accessible with the help of AI). Once your app is ready, deployment is easy. This approach means quicker turnaround, less complexity, and applications that fit your business needs perfectly.

 

Experienced Programmers

Okay, now we’re getting way out of my league here, so I’ll be brief. If you really want to get your hands dirty by modifying the code of an application created by others, you can do that. If you prefer building a completely custom application using the web programming language of your choice, you can also do that. Oracle offers users a wide range of tools and stays flexible in how they use them. Organizations need tailored systems, and Oracle keeps evolving to make that possible.

 

https://www.oracle.com/application-development/visual-builder-studio/

]]>
https://blogs.perficient.com/2026/01/29/just-what-exactly-is-visual-builder-studio-anyway/feed/ 0 389750
Hybrid AI: Empowering On-Device Models with Cloud-Synced Skills https://blogs.perficient.com/2026/01/28/hybrid-ai-empowering-on-device-models-with-cloud-synced-skills/ https://blogs.perficient.com/2026/01/28/hybrid-ai-empowering-on-device-models-with-cloud-synced-skills/#respond Wed, 28 Jan 2026 22:41:31 +0000 https://blogs.perficient.com/?p=389905

Learn how to combine Firebase’s hybrid inference with dynamic “AI Skills” to build smarter, private, and faster applications.

The landscape of Artificial Intelligence is shifting rapidly from purely cloud-based monoliths to hybrid architectures. Developers today face a critical choice: run models in the cloud for maximum power, or on-device for privacy and speed? With the recent updates to Firebase AI Logic, you no longer have to choose. You can have both.

In this post, we will explore how to implement hybrid on-device inference and take it a step further by introducing the concept of “AI Skills.” We will discuss how to architect a system where your local on-device models can dynamically learn new capabilities by syncing “skills” from the cloud.

1. The Foundation: Hybrid On-Device Inference

According to Firebase’s latest documentation, hybrid inference enables apps to attempt processing locally first and fall back to the cloud only when necessary. This approach offers significant benefits:

  • Privacy: Sensitive user data stays on the device.
  • Latency: Zero network round-trips for common tasks.
  • Cost: Offloading processing to the user’s hardware reduces cloud API bills.
  • Offline Capability: AI features work even without an internet connection.

How to Implement It

Using the Firebase AI Logic SDK, you can initialize a model with a preference for on-device execution. The SDK handles the complexity of checking if a local model (like Gemini Nano in Chrome) is available.

// Initialize the model with hybrid logic
const model = getGenerativeModel(firebase, {
  model: "gemini-1.5-flash",
  // Tells the SDK to try local execution first
  inferenceMode: "PREFER_ON_DEVICE", 
});

// Run the inference
const result = await model.generateContent("Draft a polite email declining an invitation.");
console.log(result.response.text());

When the app first loads, you may need to ensure the on-device model is downloaded. The SDK provides hooks to monitor this download progress, ensuring a smooth user experience rather than a silent stall.

2. What Are “AI Skills”?

While the model provides the “brain,” it needs knowledge and tools to be effective. In the evolving world of Agentic AI, we differentiate between the Agent, Tools, and Skills.

Drawing from insights at Cirrius Solutions and Data Science Collective, here is the breakdown:

Component   Definition                                                                                                    Analogy
Agent       The reasoning engine (e.g., Gemini Nano or Flash).                                                            The Chef
Tools       Mechanisms to perform actions (API calls, calculators).                                                       The Knife & Pan
Skills      Modular, reusable knowledge packages or “playbooks” that teach the agent how to use tools or solve problems.  The Recipe

Skills vs. Tools: A Tool might be a function to `send_email()`. A Skill is the procedural knowledge (often defined in a `SKILL.md` or structured JSON) that tells the agent: “When the user asks for a refund, check the policy date first, calculate the amount, and then use the email tool to send a confirmation.”

3. Adding Skills to On-Device Models via Cloud Sync

The limitation of on-device models is often their size; they cannot “know” everything. However, by combining Hybrid Inference with AI Skills, we can create a powerful architecture where the device is the engine, but the cloud provides the fuel.

Here is a strategy to dynamically add skills to your on-device model without updating the entire app binary:

The Architecture

  1. Cloud “Skill Registry”: Host your skills (instruction sets, prompts, and lightweight tool definitions) in a real-time cloud database (like Firestore) or configuration service (Firebase Remote Config).
  2. Synchronization: When the app launches, it syncs the latest “Skills” relevant to the user’s context.
  3. Local Injection: These skills are injected into the on-device model’s system instructions or context window at runtime.

Implementation Strategy

Imagine a “Customer Support” skill. Instead of hardcoding the support rules into the app, we fetch them dynamically.

// 1. Fetch the latest 'Skill' from the Cloud (e.g., Firestore or Remote Config)
const supportSkill = await fetchSkillFromCloud("refund_policy_v2");
// supportSkill.content = "Authorized to refund if purchase < 30 days. Use tool: processRefund(id)."

// 2. Initialize the On-Device Model with this new Skill
const localModel = getGenerativeModel(firebase, {
  model: "gemini-nano",
  inferenceMode: "PREFER_ON_DEVICE",
  systemInstruction: `You are a helpful assistant. 
                      Current Skill Module: ${supportSkill.content}` 
});

// 3. Execute locally
// The on-device model now "knows" the new refund policy without an app update.
const response = await localModel.generateContent("Can I get a refund for my order from last week?");

Why This Matters

This “Cloud-Sync Skill” architecture solves the biggest problem of local AI: stale knowledge.

  • Dynamic Updates: Did your business logic change? Update the Skill in the cloud, and every on-device model updates instantly.
  • Personalization: Sync different skills for different users (e.g., “Admin Skills” vs. “User Skills”) while still keeping the heavy processing on their own device.

Conclusion

By leveraging Firebase’s Hybrid Inference, developers can finally bridge the gap between cloud capability and local privacy. But the true game-changer lies in treating your AI not just as a static model, but as an agent that can learn new Skills dynamically from the cloud.

This architecture—Local Brain, Cloud Skills—is the blueprint for the next generation of intelligent, responsive, and efficient applications.

]]>
https://blogs.perficient.com/2026/01/28/hybrid-ai-empowering-on-device-models-with-cloud-synced-skills/feed/ 0 389905
The Desktop LLM Revolution Left Mobile Behind https://blogs.perficient.com/2026/01/26/the-desktop-llm-revolution-left-mobile-behind/ https://blogs.perficient.com/2026/01/26/the-desktop-llm-revolution-left-mobile-behind/#respond Mon, 26 Jan 2026 19:44:56 +0000 https://blogs.perficient.com/?p=389927

Large Language Models have fundamentally transformed how we work on desktop computers. From simple ChatGPT conversations to sophisticated coding assistants like Claude and Cursor, from image generation to CLI-based workflows—LLMs have become indispensable productivity tools.

On desktop, LLMs integrate seamlessly into multi-window workflows. On iPhone? Not so much.

On my Mac, invoking Claude is a keyboard shortcut away. I can keep my code editor, browser, and AI assistant all visible simultaneously. The friction between thought and action approaches zero.

But on iPhone, that seamless experience crumbles.

The App-Switching Problem

iOS enforces a fundamental constraint: one app in the foreground at a time. This creates a cascade of friction every time you want to use an LLM:

  1. You’re browsing Twitter and encounter text you want translated
  2. You must leave Twitter (losing your scroll position)
  3. Find and open your LLM app
  4. Wait for it to load
  5. Type or paste your query
  6. Get your answer
  7. Switch back to Twitter
  8. Try to find where you were

This workflow is so cumbersome that many users simply don’t bother. The activation energy required to use an LLM on iPhone often exceeds the perceived benefit.

“Opening an app is the biggest barrier to using LLMs on iPhone.”

Building a System-Level LLM Experience

Rather than waiting for Apple Intelligence to mature, I built my own solution using iOS Shortcuts. The goal: make LLM access feel native to iOS, not bolted-on.

The complete workflow: Action Button → Shortcut → API → Notification → Notes

The Architecture

My system combines three key components:

  • Trigger: iPhone’s Action Button for instant, one-press access
  • Backend: Multiple LLM providers via API calls (Siliconflow’s Qwen, Nvidia’s models, Google’s Gemini Flash)
  • Output: System notifications for quick answers, with automatic saving to Bear for detailed responses

One press of the Action Button brings AI assistance without leaving your current app.
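
To make the backend step concrete, here is a minimal sketch, in Python rather than the Shortcuts editor, of the JSON payload a Shortcut's "Get Contents of URL" action would POST to an OpenAI-compatible chat endpoint. The endpoint URL, model id, system prompts, and token limit are illustrative assumptions, not values from the original setup; check your provider's documentation for the exact equivalents.

```python
import json

# Assumed OpenAI-compatible endpoint (Siliconflow exposes one; URL is illustrative).
API_URL = "https://api.siliconflow.cn/v1/chat/completions"

def build_payload(mode: str, text: str) -> dict:
    """Build the chat request for one of the three preset modes."""
    system_prompts = {
        "qa": "Answer concisely. One short paragraph maximum.",
        "translate": "Translate between English and Chinese. Output only the translation.",
        "todo": "Rewrite the spoken input as a markdown task list.",
    }
    return {
        "model": "Qwen/Qwen2.5-7B-Instruct",  # illustrative model id
        "messages": [
            {"role": "system", "content": system_prompts[mode]},
            {"role": "user", "content": text},
        ],
        "max_tokens": 512,  # keep replies short enough for a notification
    }

if __name__ == "__main__":
    print(json.dumps(build_payload("translate", "Hello, world"), indent=2))
```

In the actual Shortcut, the same structure is assembled with a Dictionary action and sent with Get Contents of URL, with the API key in an Authorization header.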

Three Core Functions

I configured three preset modes accessible through the shortcut:

Function    | Use Case                         | Output
Quick Q&A   | General questions, fact-checking | Notification popup
Translation | English ↔ Chinese conversion     | Notification + clipboard
Voice Todo  | Capture tasks via speech         | Formatted list in Bear app

Why This Works

The magic isn’t in the LLM itself—it’s in the integration points:

  • No app switching required: Shortcuts run as an overlay, preserving your current context
  • Sub-second invocation: Action Button is always accessible, even from the lock screen
  • Persistent results: Answers are automatically saved, so you never lose important responses
  • Model flexibility: Using APIs means I can switch providers based on speed, cost, or capability
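
The "persistent results" point above can be sketched as post-processing on the API reply: extract the answer, always save a copy, and decide whether it also fits in a notification banner. The function name, the 178-character banner estimate, and the OpenAI-style response shape are assumptions for illustration.

```python
import json

def handle_reply(raw_json: str, notify_limit: int = 178) -> dict:
    """Return the actions the Shortcut would take for one API reply.
    178 characters approximates what an iOS banner shows before
    truncating (an estimate, not a documented limit)."""
    answer = json.loads(raw_json)["choices"][0]["message"]["content"]
    return {
        "save_to_notes": answer,          # persistent copy, never lost
        "notify": answer[:notify_limit],  # quick glanceable version
        "truncated": len(answer) > notify_limit,
    }

# Example with a mocked OpenAI-style response body:
sample = json.dumps(
    {"choices": [{"message": {"content": "Bonjour means hello."}}]}
)
print(handle_reply(sample)["notify"])
```

In Shortcuts terms, `save_to_notes` maps to an "Append to Note" (or Bear x-callback) action and `notify` to "Show Notification".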

The Bigger Picture

Apple Intelligence promises to bring system-level AI to iOS, but its rollout has been slow and its capabilities limited. By building with Shortcuts and APIs, I’ve created a more capable system that:

  • Works today, not “sometime next year”
  • Uses state-of-the-art models (not Apple’s limited on-device options)
  • Costs pennies per query (far less than subscription apps)
  • Respects my workflow instead of demanding I adapt to it

Try It Yourself

The iOS Shortcuts app is more powerful than most users realize. Combined with free or low-cost API access from providers like Siliconflow, Groq, or Google AI Studio, you can build your own system-level AI assistant in an afternoon.

The best interface is no interface at all. When AI assistance is a single button press away—without leaving what you’re doing—you’ll actually use it.

]]>
https://blogs.perficient.com/2026/01/26/the-desktop-llm-revolution-left-mobile-behind/feed/ 0 389927
Perficient Included in the IDC Market Glance: Healthcare Ecosystem, 4Q25 https://blogs.perficient.com/2026/01/22/perficient-included-in-idc-market-glance-healthcare-ecosystem/ https://blogs.perficient.com/2026/01/22/perficient-included-in-idc-market-glance-healthcare-ecosystem/#respond Thu, 22 Jan 2026 20:09:10 +0000 https://blogs.perficient.com/?p=389743

Healthcare organizations are managing many challenges at once: consumers expect digital experiences that feel as personalized as other industries, fragmented data in silos slows strategic decision-making, and AI and advanced technologies must integrate seamlessly into existing care models. 

Meeting these demands requires more than incremental change—it calls for digital solutions that unify access to care, trusted data, and advanced technologies to deliver transformative outcomes and operational efficiency. 

IDC Market Glance: Healthcare Ecosystem, 4Q25

We’re proud to share that Perficient has been included in the “IT Services” category in the IDC Market Glance: Healthcare Ecosystem, 4Q25 report (Doc# US54010025, December 2025). This segment includes systems integration organizations providing advisory, consulting, development, and implementation services, as well as products or solutions. 

We believe this inclusion reinforces our expertise in leveraging AI, data, and technology to deliver intelligent tools and intuitive, compliant care experiences that drive measurable value across the health journey.  

We believe this commitment aligns with critical shifts IDC Market Glance highlights in its latest report, which emphasizes how healthcare organizations are activating advanced technology and AI. IDC Market Glance shares, “Health systems and payers are moving more revenue into value-based care and capitated risk, pushing tech buyers to favor solutions that improve quality metrics, lower total cost of care, and help hit incentive thresholds.” 

As the industry evolves, IDC predicts: “Technology buyers will likely favor vendors that align revenue models to customer risk arrangements, plug seamlessly into large platforms, and demonstrate human-centered design that supports clinicians rather than replacing them.” 

To us, this inclusion validates our ability to help healthcare organizations maximize technology and AI to drive transformative outcomes, power enterprise agility, and create seamless, consumer-centric experiences that build lasting trust.

Intelligent Solutions for Transformative Outcomes 

These shifts are actively transforming the healthcare ecosystem, challenging leaders to rethink how they deliver care and create value. Our partnerships with leading organizations show what’s possible: moving AI from pilot to production, building interoperable data foundations that accelerate insights, and designing human-centered solutions that empower care teams and improve the cost, quality, and equity of care. 

Easing Access to Care With a Commerce-Like Experience 

We helped Rochester Regional Health reimagine its digital front door to triage like a clinician, personalize like a concierge, and convert like a commerce platform—creating a seamless experience that improves access, trust, and outcomes. The mobile-first redesign introduced smart search, dynamic filters, and real-time booking, driving a 26% increase in appointment scheduling and saving $79K+ monthly in call center costs. As a result, this transformative work earned three industry awards, recognizing the solution’s innovation in accessibility, engagement, and measurable impact on patient care.

Consumers expect frictionless access to care, personalized experiences, and real-time engagement. Our recent Access to Care Report reveals more than 45% of consumers aged 18–64 have used digital-first care instead of their regular provider—and 92% of them believe the quality is equal or better. To deliver on consumers’ expectations, leaders need a unified digital strategy that connects systems, streamlines workflows, and gives consumers simple, reliable ways to find and schedule care.

Explore how our Access to Care research continues to earn industry awards, or learn more about our strategic position on find-care experiences. 

Empowering Care Ecosystems Through Interoperable Data Foundations 

We helped a healthcare insurance leader build a single, interoperable source of truth that turns healthcare data into a true strategic asset. Our FHIR-enabled solution ingests, normalizes, and validates data from internal and external systems and shares a consolidated, reliable dataset through API connectors, gateways, and extracts, grounded in data governance. Ultimately, this interoperable data foundation accelerates time to market, minimizes downtime through EDI and API modernization, and ensures the right data reaches the right hands at the right time to power consumer-grade experiences, while confidently meeting interoperability standards. 

Discover our platform modernization and data management capabilities.  

Accelerating Member Support With Human-Centered GenAI Innovation 

We helped a leading Blue Cross Blue Shield health insurer transform CSR support by deploying a natural language Generative AI benefits assistant powered by AWS’s AI foundation models and APIs. The intelligent assistant mines a library of ingested documents to deliver tailored, member-specific answers in real time, eliminating cumbersome manual processes and PDF downloads that previously slowed resolution times. Beyond faster answers, this human-centered solution accelerates benefits education, equips agents to provide relevant information with greater speed and accuracy, and demonstrates how generative AI can move from pilots into core infrastructure to support staff rather than replace them.

Read more about our AI expertise or explore our human-centered design services. 

Build Your Scalable, Data-Driven Future 

From insight to impact, our healthcare expertise equips leaders to modernize, personalize, and scale care. We drive resilient, AI-powered transformation to shape the experiences and engagement of healthcare consumers, streamline operations, and improve the cost, quality, and equity of care.

We have been trusted by the 10 largest health systems and the 10 largest health insurers in the U.S., and Modern Healthcare consistently ranks us as one of the largest healthcare consulting firms.

Our strategic partnerships with industry-leading technology innovators—including AWS, Microsoft, Salesforce, Adobe, and more—accelerate healthcare organizations’ ability to modernize infrastructure, integrate data, and deliver intelligent experiences. Together, we shatter boundaries so you have the AI-native solutions you need to boldly advance business.

Ready to Turn Fragmentation Into Strategic Advantage? 

We’re here to help you move beyond disconnected systems and toward a unified, data-driven future—one that delivers better experiences for patients, caregivers, and communities. Let’s connect and explore how you can lead with empathy, intelligence, and impact. 

]]>
https://blogs.perficient.com/2026/01/22/perficient-included-in-idc-market-glance-healthcare-ecosystem/feed/ 0 389743