Sun Microsystems, today part of Oracle Corporation, surprised the world with TV commercials in the early Nineties claiming that “The Network is the Computer.” No one really understood it at the time. Most of us were then dealing with client/server computing (a PC program communicating with a database) or with terminal-based host applications. A few bytes were sent back and forth, but depending on the scenario, the intelligence sat either with the PC program or with the mainframe computer, and certainly not with the network.
Then came the World Wide Web. It brought a uniform way to access information based on the Hypertext Transfer Protocol (HTTP). Introduced for human-to-machine communications, HTTP soon paved the road for a concept called Service-Oriented Architecture (SOA), and on its coattails appeared protocols and architectural styles like Web Services and REST that enabled machine-to-machine communications.
SOA and its implementation protocols caught on quickly because, unlike their transaction monitor predecessors, the protocol standards were open and open-source libraries were freely available. SOA became the best-practice architecture for business systems integration. Even the eternally ego-centric Microsoft had to buy into these standards, or else the company would have lost market momentum for its server products like SharePoint. The main driver behind SOA was (and is) modularization and reuse across the enterprise software environment.
From there, it was a small step to a marketing buzzword like “Cloud”. To HTTP, it makes little difference whether the called server (better: its IP address) is on-premises or elsewhere in the world. The rest is history. Be it within a single organization or on the public web, calling different information resources through standardized APIs is the predominant information systems design model of our times. Such concepts come with different names but essentially do the same thing: services, mash-ups, portals, microservices, and many more.
What is an API?
The term Application Programming Interface or API is today mostly used in the context of remote calls over the Web (we won’t dive into other meanings in this article). So we focus on the modern remote usage of such an interface and its impact on Records and Information Management (RIM).
An API essentially consists of these three parts:
Address: Obviously, a request has to go somewhere. In the Web world, the so-called Uniform Resource Identifier (URI) determines where in the Web’s huge address space a service API is called.
Protocol: To read or write data at a certain address, clients and servers need to be able to talk to each other in an ordered way. The sequence of interactions over the network is called the protocol. The ubiquitous protocol for the Web environment is HTTP.
Document: This is the area of special interest to RIM professionals. Few realize it, but the Web reinvented the significance of the document. It might not correspond with our idea of the well-written official record; rather, it is the currency of the Web that is exchanged between parties. It is a snapshot of the server’s data (sometimes called its “state”) that is brought into a certain format and transmitted. This mechanism is referred to as the serialization of the server state.
Given a particular request, a client can expect to receive a document that contains the requested information in the requested format. The best-known document format is HTML; however, any format can potentially be exchanged as long as it is identified by a known content type, such as PDF, XML, or TIFF.
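To make the idea of serialization more tangible, here is a minimal Python sketch. It assumes a hypothetical order held in the server's memory and a hypothetical function named serialize; neither comes from any particular product. The same state is rendered as two different documents, depending on the content type the client asks for.

```python
import json
import xml.etree.ElementTree as ET


def serialize(state, content_type):
    """Render a snapshot of server state as a document in the requested format.

    `state` is a flat dictionary; the supported content types are an
    assumption made for this sketch.
    """
    if content_type == "application/json":
        return json.dumps(state, sort_keys=True).encode("utf-8")
    if content_type == "application/xml":
        root = ET.Element("order")
        for key, value in sorted(state.items()):
            # Each field of the state becomes one child element.
            ET.SubElement(root, key).text = str(value)
        return ET.tostring(root, encoding="utf-8")
    # A real server would typically answer with HTTP status 406 here.
    raise ValueError(f"unsupported content type: {content_type}")
```

The state itself never leaves the server; only the serialized document travels, and that document is the artifact a records manager can capture.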
To summarize these three terms in one sentence: a client calls an API at a certain address, following a strict communications protocol to interact; eventually, it will get a response back as a document in a format the server can deliver (hopefully the one the client requested). The API is one side of the coin, while the flip side is called the implementation. That means that the API is the entry point for client requests and is (or at least should be) independent of the actual implementation.
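The round trip summarized above can be sketched with Python's standard library alone. Everything specific here is an assumption for illustration: the /orders/42 path, the in-memory ORDERS store, and the helper fetch_order, which starts a throwaway server on the local machine and then calls it as a client would.

```python
import json
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical server state: an in-memory order store.
ORDERS = {"42": {"id": "42", "item": "filing cabinet", "quantity": 3}}


class OrderHandler(BaseHTTPRequestHandler):
    """The implementation side: hidden behind the API."""

    def do_GET(self):
        order = ORDERS.get(self.path.rsplit("/", 1)[-1])
        if order is None:
            self.send_error(404, "Unknown order")
            return
        body = json.dumps(order).encode("utf-8")  # serialize the state
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # suppress the default console logging for this sketch


def fetch_order(order_id):
    """Start a local server, call its API once, and return the document."""
    server = HTTPServer(("127.0.0.1", 0), OrderHandler)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        # Address: host, port and path together identify the resource.
        connection = HTTPConnection("127.0.0.1", server.server_port, timeout=5)
        # Protocol: an HTTP GET stating the format the client would like.
        connection.request("GET", f"/orders/{order_id}",
                           headers={"Accept": "application/json"})
        response = connection.getresponse()
        # Document: the serialized server state comes back with its content type.
        content_type = response.getheader("Content-Type")
        document = json.loads(response.read().decode("utf-8"))
        connection.close()
        return content_type, document
    finally:
        server.shutdown()
        server.server_close()
```

Calling fetch_order("42") returns the content type together with the order as a Python dictionary. The client never sees how the server stores its orders, which is the separation of API and implementation in miniature.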
What does an API-Driven World Signify for the Information Professional?
Huge amounts of information cross organizational boundaries via API calls. As the use of mobile devices and apps swells, the Internet of Things (IoT) grows, and business process integration and digital transformation forge onward, this trend will only accelerate. Other, more traditional means of communication, starting with physical paper or file transfer in the electronic world, may keep their absolute volume, while the use of APIs will continue to explode.
API endpoints on the server side are gates through which a lot of traffic passes. They are an ideal point to capture data and measure usage of an information asset. Think of an eCommerce site. Orders come through the API door, and it is relatively easy to capture all incoming orders and store them as records, or at least tag them for later in-place management. This requires effort on the implementation side, and an important concept here is called interception. Interception is often used in cases where the usage of data is not directly linked to the core mission of a transaction. Log files, for example, are often written via interceptors.
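As a rough illustration of interception, the following Python sketch wraps an order handler so that every incoming payload is copied into a record store before the actual business logic runs. The names capture_as_record, CAPTURED_RECORDS and create_order are hypothetical, and the list merely stands in for a real records repository.

```python
import copy
import functools
from datetime import datetime, timezone

# Hypothetical stand-in for a records repository.
CAPTURED_RECORDS = []


def capture_as_record(handler):
    """Interceptor: capture the incoming payload, then call the handler."""

    @functools.wraps(handler)
    def interceptor(payload):
        CAPTURED_RECORDS.append({
            "captured_at": datetime.now(timezone.utc).isoformat(),
            "endpoint": handler.__name__,
            # Deep copy, so later changes by the handler cannot alter the record.
            "payload": copy.deepcopy(payload),
        })
        return handler(payload)

    return interceptor


@capture_as_record
def create_order(payload):
    """Core business logic; it knows nothing about records capture."""
    return {"status": "accepted", "order_id": payload["order_id"]}
```

The handler stays focused on its transaction, while capture happens at the gate. Many web frameworks offer a comparable hook under names such as middleware or filters.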
Then there is a set of APIs that is geared directly towards RIM functionality. If, for instance, your system needs to offer functionality to apply and lift legal holds, make that system “API-ready”. Offering such functionality via an API, as opposed to a proprietary client, is a future-proof approach. eDiscovery systems, for instance, can then request holds directly without knowing the inner workings of your system.
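What an “API-ready” legal hold function might look like can be sketched in a few lines of Python. The class LegalHoldService and its method names are assumptions for illustration, not an existing standard; in practice each method would typically be exposed as an HTTP endpoint, and the holds would live in a durable store rather than in memory.

```python
class LegalHoldService:
    """In-memory sketch of an apply/lift legal hold API."""

    def __init__(self):
        # Maps a record id to the set of legal matters holding it.
        self._holds = {}

    def apply_hold(self, record_id, matter_id):
        """Place the record on hold for the given legal matter."""
        self._holds.setdefault(record_id, set()).add(matter_id)

    def lift_hold(self, record_id, matter_id):
        """Release one matter's hold; other matters may still hold the record."""
        matters = self._holds.get(record_id, set())
        matters.discard(matter_id)
        if not matters:
            self._holds.pop(record_id, None)

    def is_on_hold(self, record_id):
        """A record on hold must not be disposed of."""
        return record_id in self._holds
```

A caller such as an eDiscovery system only needs these three operations. Whether the holds end up as database flags, metadata tags or something else remains a private matter of the implementation.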
The importance of web-based APIs is paramount today and will only grow in the future. Mobile devices, IoT, augmented reality glasses and digital transformation initiatives, to name a few, will all communicate via the Internet and use APIs to do so. Hence, it is vital for the Information Professional to understand the mechanisms of how APIs work.
The incoming data may never get rendered as a traditional record. In particular, Records Management often still holds to the view that anything important eventually gets printed. With this perspective, you may not be able to capture the complete regulatory information, keep it under consistent lifecycle management, and satisfy legal production requests.
The next blog post will dive into the importance of metadata in this context and how it is best managed.