AI Summit: Tagging Medical Records Creates Vital Data and Analytics

Doug Kemp of Innodata spoke about getting key medical data into a form that can be understood and used. As with many things, AI components can parse and understand this data, but AI can’t do it alone. In this case, getting even small things wrong in the data has huge implications when you use it.

Quote: Most medical records, even in electronic form, are not stored in a way that lets them be easily extracted from a computer.

If you start here and believe that trustworthy data is core to any AI application, then you need to address this gap first.


  • Market for AI is expected to be $191B by 2024
  • 85% of AI projects are expected to fail due to data issues

Common Challenges that sink most of these projects:

  • Scarcity of semantically enriched data
  • Lack of clean, accurate, quality data
  • Inability to interpret the right data

So how do you get to success?

You can’t completely automate this. It’s been tried.  Even small errors can prove disastrous, especially with medical records and data. Humans still have a place here.  Keep in mind that this type of data comes in many flavors including:

  • Text document
  • Images
  • Video
  • Web document
  • Audio files
  • EHR structured data
  • EHR unstructured data

The good news: using AI alone to pull this data in and understand it, Innodata has reached a 92% success rate. Humans are still needed to take it further.

Key to this is the creation of a taxonomy, or set of categories, so you can assign meaning. A lab result, for example, would get metadata mapped to a taxonomy with a range of information including thresholds, lower limit, upper limit, actual test result, etc.
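To make the lab example concrete, here is a minimal Python sketch of what a tagged lab value might look like. It is not Innodata's schema: the class name, the field names, the "chem.glucose" taxonomy code, and the sample numbers are all assumptions made up for illustration, and the limits shown are not clinical guidance.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LabResult:
    """One lab value tagged against a taxonomy entry (all names hypothetical)."""
    taxonomy_code: str   # hypothetical category identifier in the taxonomy
    test_name: str
    value: float         # actual test result
    unit: str
    lower_limit: float   # lower bound of the reference range
    upper_limit: float   # upper bound of the reference range

    def flag(self) -> str:
        """Classify the result relative to its reference range."""
        if self.value < self.lower_limit:
            return "low"
        if self.value > self.upper_limit:
            return "high"
        return "normal"


# Illustrative numbers only, not taken from the talk.
glucose = LabResult("chem.glucose", "Glucose", 132.0, "mg/dL", 70.0, 99.0)
print(glucose.flag())  # prints "high"
```

Once the limits travel with the value, anything downstream can ask whether a result is out of range without re-reading the original document.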

Mike Porter’s comments: In this particular case, note that the AI has been trained to understand over 1,000 different lab values. In other words, as all the other AI experts are saying, this is a journey. It doesn’t happen immediately. You gain great value from digitizing and then understanding this type of data. It just takes time and effort.

Things you need to think about

  • The component of time: it’s critical in analyzing the cluster of data. You can’t just capture flat data.
  • Getting the proper taxonomy that allows the AI to correctly categorize the vital data
  • Thinking through the purpose of the data investigation. This drives what and how you use the data to train the AI
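The point about time can be sketched in a few lines of Python. This is only an illustration of why a flat snapshot loses information, not anything shown in the talk: the `Reading` type, the `trend` function, and the sample values are hypothetical.

```python
from datetime import date
from typing import NamedTuple


class Reading(NamedTuple):
    """A single lab value with the date it was taken (hypothetical structure)."""
    taken_on: date
    value: float


def trend(readings: list[Reading]) -> list[tuple[date, float]]:
    """Return (date, change since the previous reading) in chronological order."""
    ordered = sorted(readings, key=lambda r: r.taken_on)
    return [
        (cur.taken_on, cur.value - prev.value)
        for prev, cur in zip(ordered, ordered[1:])
    ]


# Made-up values, deliberately out of order as they might arrive from records.
readings = [
    Reading(date(2019, 3, 1), 110.0),
    Reading(date(2019, 1, 1), 95.0),
    Reading(date(2019, 2, 1), 101.0),
]
changes = trend(readings)
for day, delta in changes:
    print(day.isoformat(), f"{delta:+.1f}")
```

Looked at one by one, the three values say little. Ordered by date, they show a steady rise, which is the kind of signal that flat data cannot capture.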

You can read more about AI Summit here. You can also read some more of the discussions happening and our recaps here.


Michael Porter

Mike Porter leads the Strategic Advisors team for Perficient. He has more than 21 years of experience helping organizations with technology and digital transformation, specifically around solving business problems related to CRM and data.
