The technology to scan paper documents and images has been in existence for years, but the ability to extract information from the content has been limited for a variety of reasons. Machine learning along with other technology advances is now opening up a new frontier.
Machine learning is a sub-discipline of artificial intelligence that has explosive potential to automate many routine tasks that humans now do. Through a process of continuous discovery and refinement, computers can “learn” by identifying patterns that appear repeatedly in a given set of data. They can also go one step further to find other common traits that aren’t explicitly stated. For example, a machine learning algorithm that knows that cats have whiskers and long tails can deduce by scanning images that they also have soft fur and pointy ears.
Machine learning will be a major theme of the upcoming Google Cloud Next ’18 conference in London. In fact, the pre-conference boot camp program features a full-day deep dive on Google’s TensorFlow machine learning framework. Customers will discuss applications of machine learning ranging from supply chain management to decoding ancient Egyptian hieroglyphics.
Iron Mountain will be there too, debuting the fruits of a Google partnership that will enable our customers to extract information locked up in documents and images and import it into a business analytics engine.
The new solution, Iron Mountain InSight™, is a content services platform delivered as a service that learns from unstructured and semi-structured information in documents and images to enable predictive analytics that drive business value. The software can extract information from paper or born-digital documents at high speed without the need for human oversight and feed it into a sophisticated analytics engine. Machine learning and AI built on Google TensorFlow help Iron Mountain InSight deliver breakthrough capabilities for our customers. Some examples include:
- A media company can use InSight to automatically generate rich metadata for each image and video asset in its library, then search that metadata to quickly find valuable assets by voice and video similarity, type, creation date, genre and more.
- An energy company can retrieve decades of production data for all its wellheads to help enable yield forecasting and maintenance scheduling.
- A mortgage company can conduct online title searches that include data from paper documents created decades ago, or speed up mortgage processing with predictive analytics and automated detection of issues that would otherwise cause delays.
- A utility can scan photos of its field equipment captured by drones to identify the assets most at risk of failure.
This kind of sophisticated predictive analysis has been impossible in the past because data analytics was limited to information stored in the rows and columns of computer databases. The cost of transcribing data from paper documents by hand was too high to be practical in most use cases. Machine learning and pattern recognition algorithms are now enabling computers to find and extract that information automatically.
Computer algorithms are very good at matching patterns. In fact, they are already better than humans at facial recognition and will soon eclipse our abilities to recognize voices as well. Automated teller machines can decode handwritten numbers on paper checks and AI-driven handwriting analysis is even being used to detect fraud and forgeries. These same technologies can also be applied to finding information in paper or born-digital documents by scanning for patterns and capturing data from recognized fields.
For example, Iron Mountain InSight can scan old bills of lading to recognize fields such as quantities, SKU numbers, prices, dates and addresses and import that data into analytical tools for historical trend analysis. It can figure out which fields in a handwritten paper form contain phone numbers and import that information into a contact database. It can effectively make a file full of paper records searchable.
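To make the idea of field recognition concrete, here is a minimal sketch in Python. It is not how Iron Mountain InSight works internally: it swaps the machine learning models for a few fixed regular expressions, and it assumes the document has already been converted to text by an OCR step. The names `FIELD_PATTERNS` and `extract_fields`, the patterns themselves and the sample text are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical, hand-written patterns for illustration only. A learned
# system would infer field formats and positions from examples instead.
FIELD_PATTERNS = {
    # US-style phone numbers such as (555) 123-4567 or 555-123-4567
    "phone": re.compile(r"(?:\(\d{3}\)\s?|\b\d{3}[-.\s])\d{3}[-.\s]\d{4}\b"),
    # Dates written as M/D/YYYY or MM/DD/YYYY
    "date": re.compile(r"\b\d{1,2}/\d{1,2}/\d{4}\b"),
    # Dollar amounts with optional thousands separators and cents
    "price": re.compile(r"\$\d+(?:,\d{3})*(?:\.\d{2})?"),
}


def extract_fields(ocr_text):
    """Return a dict mapping each field name to every match found in the text."""
    return {
        name: pattern.findall(ocr_text)
        for name, pattern in FIELD_PATTERNS.items()
    }


# Made-up text standing in for the OCR output of a scanned bill of lading.
sample = (
    "Bill of Lading  Date: 03/14/1998\n"
    "Consignee phone: (555) 123-4567\n"
    "Qty 100  Unit price $12.50  Total $1,250.00\n"
)

fields = extract_fields(sample)
print(fields)
```

Once the values are in a structure like `fields`, they can be loaded into a contact database or an analytics tool. Fixed rules like these break down quickly on handwriting and inconsistent layouts, which is where pattern recognition learned from data becomes necessary.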
But there’s more to Iron Mountain InSight than that. Sophisticated visual recognition capabilities can interpret images to determine what’s in them, including where the image was captured, the context of the scene and the identity of individuals in the frame. Imagine the value this capability could deliver to law enforcement agencies, insurance companies, event organizers, media companies, stock photo agencies and anyone whose business involves image interpretation and analysis.
Iron Mountain is in a unique position to help our customers with their digital transformation and deliver on the promise of extracting insights from their data. As the leader in records management, we’re experts in long-term document storage and retrieval. We understand the legal and regulatory needs of our customers and provide the highest levels of data protection. Iron Mountain InSight customers can be confident that the data our machine learning algorithms retrieve will be treated with the same chain-of-custody controls that we bring to our document management services. Customers who already use our document scanning services can now add intelligent data extraction and analysis for both physical and born-digital content to the mix.
The new service launched in September. If you’re in London on October 10 or 11, stop by Booth f34 at Google Cloud Next for a demonstration!