By David Fellows, Chief Digital Officer, Acuity Knowledge Partners
The knowledge process outsourcing (KPO) sector has seen unprecedented growth in recent years. Strong tailwinds—a dearth of financial-services talent, the Great Resignation and the shift to remote/hybrid working frameworks—support this growth.
We have seen increased demand from clients not only for research but also for more value-added tasks across the capital markets and banking value chains. The quality of deliverables is not the only yardstick now; there is more emphasis on faster turnaround, which facilitates faster go-to-market.
The volume, variety and speed at which data is produced these days present several opportunities for financial institutions. However, managing this constant flow of data is a significant challenge, one the KPO sector has historically relied on highly qualified experts to meet. Capturing, organizing and analyzing data in a market in which minutes matter, while the volume of data is both extremely large and seemingly never-ending, is now an almost impossible task for human analysts working alone.
Outsourcers are, therefore, investing heavily in data-science functions whose primary role is to build tools that augment analysts’ capabilities: first by reducing the time it takes to prepare and organize the mountains of information to be evaluated, and then by helping analysts examine this data to derive actionable insights.
This article provides insight into how machine learning (ML) has helped augment analysts’ capabilities to provide better service to clients. We also discuss some common challenges companies face when trying to integrate machine learning into routine workflows.
The first example of how machine learning has helped augment analysts’ capabilities relates to the populating of financial models, i.e., taking information from a number of public and private sources and exporting that information into a model that can be used to predict events such as credit default. The types of information available for populating models have increased, and so have the ways in which that information is presented. This creates two problems.
First, the layout can differ. Even within public-company financials, tables may be demarcated with lines or only be implied using spacing. Notes pertaining to accounting treatment may be handwritten. Date stamps and even coffee spills may cover key pieces of information. Fonts may vary. Scan quality may vary.
The second problem revolves around synonyms, i.e., different terms that mean the same thing. Both layout variation and synonyms are major challenges when it comes to populating models. Models have features with extensive descriptions, but how does one know what data should be imported to a model feature when the source uses a different descriptor than the model does?
If a synonym is misinterpreted and the wrong value is imported into a model, the results are invalidated and a lot of time is wasted. We regularly see scenarios where we are transposing 200 or more values into a model but have more than 10,000 documents to get through (each slightly different from the last), which gives an idea of the size of the challenge. For well-trained experts, identifying information tables or transposing synonyms is relatively easy (albeit time-consuming). Completing these tasks consistently well enough not to require human supervision is a significant challenge for a computer.
To solve the first problem, some outsourcers have evaluated leading computer vision models but found that, while these models offer good general performance, it is extremely hard to raise their predictive power on finance-specific documents to the point where they become viable tools.
This last mile of effort proved to be more expensive than the entire journey before it. Therefore, writing bespoke domain-specific computer vision models specifically trained on financial documents is necessary. These domain-specific models now show good predictive power when evaluating financial documents and can reduce the challenges around model drift (the degradation of the predictive power of a model over time).
Overcoming the second half of this challenge (the synonym issue) requires building domain-specific machine-learning models to identify and extract financial domain-specific entities. In our example, there could be thousands of synonyms. Once the document content has been extracted via computer vision models, the entity identification and extraction models can interpret those thousands of synonyms and map the terms accurately to the target.
It is even possible to do this between languages, for example, by reading a non-English financial document, understanding its synonyms and then correctly mapping those synonyms to their English counterparts. In practice, this process saves analysts hundreds of hours. This means not only that the important business of running the model and evaluating the results is faster, but also that many more models and many different scenarios can be run.
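To make the mapping step concrete, here is a minimal sketch in Python. It is not the trained entity models described above: it substitutes a small hand-written lookup table and fuzzy string matching from the standard library, purely to illustrate what "mapping a source descriptor to a model feature" means. The table contents, the feature names and the map_label function are all hypothetical.

```python
from difflib import get_close_matches

# Hypothetical synonym table: normalized source descriptor -> model feature.
# A production system would learn these mappings rather than hard-code them.
SYNONYMS = {
    "revenue": "total_revenue",
    "net sales": "total_revenue",
    "turnover": "total_revenue",
    "net income": "net_income",
    "net earnings": "net_income",
    "profit for the year": "net_income",
}


def normalize(label):
    """Lower-case the label, drop colons and collapse whitespace."""
    return " ".join(label.lower().replace(":", " ").split())


def map_label(label, cutoff=0.85):
    """Return the model feature for a source descriptor, or None if unknown.

    Exact matches are tried first; otherwise the closest known descriptor
    is accepted only if its similarity ratio reaches the cutoff.
    """
    key = normalize(label)
    if key in SYNONYMS:
        return SYNONYMS[key]
    close = get_close_matches(key, list(SYNONYMS), n=1, cutoff=cutoff)
    return SYNONYMS[close[0]] if close else None
```

In this sketch, a descriptor that cannot be matched returns None, so it can be routed to an analyst instead of being silently imported into the wrong feature.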
Our second example pertains to deal sourcing, typically in private equity and investment banking scenarios.
Traditionally, given a target industry and set of criteria, analysts would conduct research against a selected set of information sources, looking to shortlist companies that fit a set of deal criteria. The number of sources under inspection would be limited by the practical constraints of how much one person could look at on any given day. This is the classic “looking for a needle in a haystack” problem for which machine learning and big-data tools are ideally suited.
Instead of monitoring a small selection of data sources that are thought to be the most likely to contain relevant signals, machine-learning models can now monitor large volumes of information across many different sources. In effect, we have gone from analysts monitoring a small portion of the haystack to machine-learning models monitoring the entire haystack and pointing analysts in the direction of interesting discoveries.
Such machine-learning monitoring tools constantly ingest large amounts of data and evaluate the relevance of that data against a particular set of deal criteria. If an item is deemed relevant enough, natural language processing (NLP) can be used to summarize the information into bite-size chunks before the item is flagged for review by an expert.
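The ingest, score, summarize and flag loop can be sketched in a few lines of Python. This is an illustration under heavy assumptions: a simple keyword-overlap score stands in for a trained relevance model, and truncation stands in for NLP summarization. The Item class, the function names and the 0.5 threshold are all hypothetical.

```python
import re
from dataclasses import dataclass


@dataclass
class Item:
    source: str  # where the text came from (hypothetical field)
    text: str    # raw text of the news item, filing, etc.


def relevance(text, criteria):
    """Fraction of deal-criteria terms that appear in the text."""
    if not criteria:
        return 0.0
    words = set(re.findall(r"[a-z0-9]+", text.lower()))
    return len(words & criteria) / len(criteria)


def summarize(text, max_words=12):
    """Placeholder summary: keep only the first max_words words."""
    words = text.split()
    suffix = "..." if len(words) > max_words else ""
    return " ".join(words[:max_words]) + suffix


def flag_for_review(items, criteria, threshold=0.5):
    """Return (score, source, summary) for relevant items, best first."""
    flagged = []
    for item in items:
        score = relevance(item.text, criteria)
        if score >= threshold:
            flagged.append((score, item.source, summarize(item.text)))
    return sorted(flagged, reverse=True)
```

The point of the design is that only items clearing the threshold reach an expert, and they arrive ranked and condensed, which is what lets one analyst cover far more sources than manual monitoring allows.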
From my experience, machine learning has enabled us to increase the amount and quality of work that one of our analysts can produce. I have described only two examples; there are many more in which using the predictive power of machine learning has helped us expand the capabilities of our analysts.
As you can see, machine learning has turned out to be all about the augmentation of analysts’ capabilities. Therefore, we strongly believe that the key to successful innovation in the KPO sector will always be an amalgamation of technology and domain expertise.
Enterprises that win in the future will be the ones thinking hard about digitalization and asking the tough technology questions of their knowledge partners. And KPOs that understand their clients’ problems and fix them by integrating modern technologies with their domain expertise without stressing the clients’ front-office teams will be outright winners in the long run.