By Alexander Jones, International Banker
In a financial world that increasingly relies on data and information to obtain a marginal edge, and in which those resources take the form not only of cold, hard numbers but also of words and mountains of text, the benefits that natural language processing (NLP) offers financial firms worldwide should not be underestimated. Whether through efficiency and accuracy gains in compliance and back-office divisions or through opportunities for fund managers to explore new, untapped sources of alpha, the powerful capabilities of NLP look set to help transform the finance industry permanently.
NLP falls under the broader category of artificial intelligence (AI), whereby computers learn to read and interpret words and sentences. As such, it broadly represents the power of computer programmes to understand human language. Tackling unstructured data and extracting useful, actionable insights remains one of the biggest challenges for investment and trading professionals—whether it’s from traditional sources such as company filings and earnings calls or increasingly popular forms of alternative data that can encompass everything from social-media sentiment to geolocation data. And now, with regulatory documentation having ballooned in volume, the need to automate the process of interpreting reams of words has only intensified during the last few years.
NLP begins with the computer ingesting the natural-language source. For spoken input, this typically means using speech-recognition techniques to transcribe the audio, which is then broken down into smaller chunks such as sentences and words. These smaller units are compared against ones from previously processed speech; afterward, each unit is passed to a part-of-speech classification model that identifies its grammatical properties and context. Once the computer has interpreted the input, the result is delivered in textual form or, via text-to-speech conversion, in audible form, for use by the end-user across a range of financial applications, such as chatbots and voice assistants that analyse customer queries, or tools that examine corporate documents and news feeds.
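The early stages described above (splitting text into sentences and words, then assigning each word a part of speech) can be illustrated with a minimal Python sketch. It assumes the speech has already been transcribed to text, and it substitutes a tiny hand-written lexicon (`TOY_LEXICON`, a hypothetical stand-in) for the trained classification model a production system would use. All function names are illustrative.

```python
import re

# Hypothetical toy lexicon; a real system would use a trained tagger instead.
TOY_LEXICON = {
    "the": "DET", "a": "DET", "its": "PRON",
    "bank": "NOUN", "revenue": "NOUN", "profits": "NOUN", "guidance": "NOUN",
    "raised": "VERB", "fell": "VERB", "reported": "VERB",
    "strong": "ADJ", "weak": "ADJ",
}


def split_sentences(text):
    """Split text after '.', '!' or '?' when followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    """Break a sentence into words, numbers and punctuation marks."""
    return re.findall(r"[A-Za-z]+(?:'[a-z]+)?|\d+(?:\.\d+)?|[^\w\s]", sentence)


def tag(tokens):
    """Attach a coarse part-of-speech label to every token."""
    tagged = []
    for tok in tokens:
        lower = tok.lower()
        if lower in TOY_LEXICON:
            pos = TOY_LEXICON[lower]
        elif re.fullmatch(r"\d+(?:\.\d+)?", tok):
            pos = "NUM"
        elif re.fullmatch(r"[^\w\s]", tok):
            pos = "PUNCT"
        else:
            pos = "UNK"  # word not covered by the toy lexicon
        tagged.append((tok, pos))
    return tagged


if __name__ == "__main__":
    for sentence in split_sentences("The bank raised its guidance. Profits fell."):
        print(tag(tokenize(sentence)))
```

A lookup table cannot resolve ambiguous words (for example, "bank" as a noun or a verb), which is one reason practical systems rely on statistical or neural taggers that take the surrounding context into account.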
Indeed, corporate events such as earnings calls provide an ideal example of where NLP can add considerable value, with speech recognition serving as an important means of analysing the company’s performance. Such calls typically open with a description of the company’s overall health, followed by a presentation of the results and a question-and-answer (Q&A) session during which the company responds to questions from analysts, offering much scope for scrutiny by NLP algorithms. “What and how they ask the questions, and what and how the company answers, including their tone, are likely to reflect on the company’s stock price,” according to Kelvin Rocha, lead data scientist at Refinitiv Labs. “Profiling the tone of speech, and converting it to text to quantify it across different key topics, such as revenue, is extremely useful.”
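To make the idea of quantifying tone across key topics concrete, here is a minimal lexicon-based sketch in Python. It is not the method used by Refinitiv Labs or any other vendor; the word lists (`POSITIVE`, `NEGATIVE`), the topic keywords (`TOPICS`) and the scoring rule are all assumptions chosen for illustration, and it works on a text transcript rather than on the audio itself.

```python
import re
from collections import defaultdict

# Hypothetical word lists; real systems use much larger, finance-specific lexicons
# or trained sentiment models.
POSITIVE = {"strong", "growth", "improved", "record", "confident"}
NEGATIVE = {"weak", "decline", "headwinds", "uncertain", "loss"}
TOPICS = {
    "revenue": {"revenue", "sales"},
    "costs": {"costs", "expenses"},
}


def tone_by_topic(transcript):
    """Return a tone score in [-1, 1] for each topic mentioned in the transcript."""
    counts = defaultdict(lambda: [0, 0])  # topic -> [positive hits, negative hits]
    for sentence in re.split(r"(?<=[.!?])\s+", transcript):
        words = set(re.findall(r"[a-z]+", sentence.lower()))
        pos = len(words & POSITIVE)
        neg = len(words & NEGATIVE)
        for topic, keywords in TOPICS.items():
            if words & keywords:  # the sentence mentions this topic
                counts[topic][0] += pos
                counts[topic][1] += neg
    scores = {}
    for topic, (pos, neg) in counts.items():
        total = pos + neg
        scores[topic] = (pos - neg) / total if total else 0.0
    return scores


if __name__ == "__main__":
    sample = ("Revenue growth was strong this quarter. "
              "Costs remain a concern given headwinds. We are confident.")
    print(tone_by_topic(sample))
```

Scoring sentence by sentence keeps each tone word attached to the topic it appears alongside; a sentence that mentions no tracked topic, such as the last one in the sample, contributes nothing.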
Similarly, financial research also provides an ideal use case for NLP, especially given the copious amounts of potential information invariably contained within research documents. For analysts, making sense of all this information as quickly as possible can be critical in generating timely trading recommendations and gaining a competitive edge. And with more recent regulatory requirements mandating that investment firms pay for all research, the need to discern which particular sources of information are the most useful and relevant becomes even more important.
As such, NLP-based solutions have emerged over the last few years to improve efficiency when interpreting this voluminous research. State Street Corporation, for instance, launched Quantextual Idea Lab in 2017, which uses NLP and machine learning to organise and extract information more efficiently from various types of content. This has involved implementing research-aggregation software that tags and classifies the research. Text-summarisation algorithms then condense the content while ensuring the text’s formal tone is not compromised. Ultimately, analysts derive the relevant insights more quickly than would otherwise be possible and without losing important information.
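One simple way to picture text summarisation is extractive summarisation, in which the highest-scoring sentences are kept verbatim, so the original wording and tone survive. The Python sketch below uses word-frequency scoring; it is an illustrative technique only and is not a description of how Quantextual Idea Lab works. The stopword list and function name are assumptions.

```python
import re
from collections import Counter

# Hypothetical, deliberately short stopword list.
STOPWORDS = {"the", "a", "an", "and", "as", "on", "of", "to", "in", "was", "is", "for"}


def content_words(text):
    """Lower-cased alphabetic words with stopwords removed."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def summarise(text, max_sentences=2):
    """Keep the sentences whose words are most frequent across the whole text."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    freq = Counter(content_words(text))

    def score(sentence):
        tokens = content_words(sentence)
        # Average frequency, so long sentences are not favoured automatically.
        return sum(freq[w] for w in tokens) / len(tokens) if tokens else 0.0

    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    keep = sorted(ranked[:max_sentences])  # restore original reading order
    return " ".join(sentences[i] for i in keep)


if __name__ == "__main__":
    note = ("Margins improved on lower funding costs. "
            "The weather was pleasant. "
            "Funding costs fell as margins improved.")
    print(summarise(note, max_sentences=2))
```

Because the selected sentences are copied rather than rewritten, this approach cannot distort the author's phrasing, although it may also miss points that are spread thinly across many sentences.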
Compliance divisions in banks can also leverage NLP applications to enhance a variety of processes. For instance, named-entity recognition can be used to contextualise unstructured content by detecting and labelling concepts of interest, such as people, companies and other entities, which allows compliance officers to retrieve important information quickly and ensure regulatory requirements are met. This capability has become especially pertinent given the rapidly evolving regulatory landscape financial firms have faced during the last decade or so. With new directives coming into play regularly, documents that often run to hundreds or even thousands of pages can require considerable time and energy to decipher.
But instead of allocating such tasks to compliance professionals or paying external firms to carry out the documentation analyses, firms can increasingly use AI and NLP to extract the required information thoroughly and efficiently. NLP techniques can be applied to scan the documents, identify the key entities to which the regulations apply, extract metadata and interpret the key objectives that the text stipulates. Ultimately, this gives financial institutions a better chance of consistently remaining compliant with increasingly onerous regulatory burdens. And by automating much of the process, NLP lowers not only the opportunities for human error but also the risk of regulations being interpreted through the cognitive and emotional biases that all humans possess, compliance officers included.
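The scanning steps listed above (finding the entities a rule applies to, pulling out metadata such as dates, and isolating the sentences that impose obligations) can be sketched in a few lines of Python. This is a rule-based illustration under stated assumptions: the gazetteer of entity types, the obligation keywords and the sample text are all invented, and a production system would more likely use a trained named-entity recognition model.

```python
import re

# Hypothetical gazetteer of regulated entity types.
ENTITY_GAZETTEER = ("investment firms", "credit institutions", "trading venues")

# Modal phrases that commonly signal an obligation in regulatory drafting.
OBLIGATION_PATTERN = re.compile(
    r"\b(?:must|shall|is required to|are required to)\b", re.IGNORECASE
)

# Dates written as, for example, "1 March 2030".
DATE_PATTERN = re.compile(
    r"\b\d{1,2} (?:January|February|March|April|May|June|July|August|"
    r"September|October|November|December) \d{4}\b"
)


def analyse_regulation(text):
    """Return one record per sentence that appears to state an obligation."""
    findings = []
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        if not OBLIGATION_PATTERN.search(sentence):
            continue  # background or explanatory sentence
        lower = sentence.lower()
        findings.append({
            "obligation": sentence.strip(),
            "entities": [name for name in ENTITY_GAZETTEER if name in lower],
            "dates": DATE_PATTERN.findall(sentence),
        })
    return findings


if __name__ == "__main__":
    # Invented sample text, not taken from any real regulation.
    sample = ("Investment firms must report transactions by 1 March 2030. "
              "This section provides background.")
    for record in analyse_regulation(sample):
        print(record)
```

Keeping the original sentence in each record lets a compliance officer check the machine's reading against the source wording rather than trusting the extraction blindly.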
In terms of specific subject areas in which NLP is proving valuable, sustainable investing ranks close to the top. Among the biggest challenges in the sustainability sphere, which includes such activities as environmental, social and governance (ESG) investing, impact investing and socially responsible investing, is the persistence of “greenwashing”, a practice frequently carried out by financial firms, whereby they exaggerate their sustainability credentials and commitments in their formal disclosures but fail to carry them out in practice. As a result, companies often file substantial sustainability-related disclosures, much of which are effectively meaningless.
Noticing this unfortunate trend, Deutsche Bank found that larger-cap companies tended to receive higher ratings pertaining to ESG risks, which it attributed to their possibly having more resources at hand to write longer, albeit largely irrelevant, reports. As such, the bank created the α-DIG system to assess more accurately whether companies are aligning their businesses with sustainability practices in reality. “This uses machine learning algorithms and natural language processing techniques to infer context and understanding from company information that is increasingly subject to greenwashing,” Deutsche Bank noted.
The bank trained α-DIG’s NLP algorithm to assess company commitments and detect carbon-related discussions in sustainability reports. The result was the identification of five different topics, along with each topic’s top associated keywords. The bank then created a ranking system to identify which companies are most focused on climate mitigation and adaptation and combined the rankings with other linguistic features extracted from sustainability reports. Ultimately, the analysis exposed significant evidence of greenwashing. “It is a shame that many companies greenwash their communications, especially as this influences the ESG scores calculated by traditional data vendors,” acknowledged Deutsche Bank Research’s Andy Moniz, chief data scientist, and Spyros Mesomeris, global head of quantitative strategy. “But investors, in combination with new technology can begin to uncover the reality of a firm’s ESG position and help model many ESG investors’ ultimate goal: to predict the carbon transition risk and a company’s future carbon footprint.”
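The details of α-DIG are not given here, so the following Python sketch should be read only as a toy illustration of ranking companies by climate focus. It scores each report by the share of its words that match a hypothetical keyword list (`CLIMATE_KEYWORDS`), so that a longer report does not score higher simply for being longer. The keyword list, function names and sample reports are all assumptions.

```python
import re

# Hypothetical keyword list; a topic model would learn such terms from the data.
CLIMATE_KEYWORDS = {"carbon", "emissions", "climate", "mitigation", "adaptation", "renewable"}


def climate_focus(report):
    """Share of words in the report that are climate keywords (0.0 to 1.0)."""
    tokens = re.findall(r"[a-z]+", report.lower())
    if not tokens:
        return 0.0
    hits = sum(1 for token in tokens if token in CLIMATE_KEYWORDS)
    return hits / len(tokens)


def rank_companies(reports):
    """Company names ordered from most to least climate-focused."""
    return sorted(reports, key=lambda name: climate_focus(reports[name]), reverse=True)


if __name__ == "__main__":
    # Invented sample reports for two fictional companies.
    reports = {
        "Company A": "We cut carbon emissions and invest in renewable power.",
        "Company B": ("Our long report discusses many values and a vision for the "
                      "future of our brand and mentions climate once."),
    }
    for name in rank_companies(reports):
        print(name, round(climate_focus(reports[name]), 3))
```

A density measure of this kind is easy to game with keyword stuffing, which is presumably why a real system would combine it with other linguistic features, as the passage above describes.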
According to a research report from Quince Market Insights, the global NLP market was valued at US$9.2 billion in 2019 and is anticipated to grow at a compound annual growth rate (CAGR) of 18.4 percent from 2020 to 2028. In terms of what is achievable, therefore, NLP looks set to advance by leaps and bounds this decade. And much of this growth is likely to be driven by financial services. With NLP being increasingly applied across a range of financial use cases and a number of successful applications having already been registered, it is proving to be among the finance industry’s most transformative technologies.
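For readers who want to see what the quoted growth rate implies, the short Python sketch below compounds the 2019 valuation forward. The report's own 2028 estimate is not quoted here, so the result is only an implied figure, and it rests on the assumption that the 18.4 percent rate applies to each of the nine years from 2019 to 2028.

```python
def project_market_size(base_value, cagr, periods):
    """Compound base_value forward at a constant annual growth rate."""
    return base_value * (1 + cagr) ** periods


BASE_2019_USD_BN = 9.2   # 2019 valuation quoted above, in US$ billions
CAGR = 0.184             # 18.4 percent per year
PERIODS = 2028 - 2019    # assumption: nine compounding periods

projected_2028 = project_market_size(BASE_2019_USD_BN, CAGR, PERIODS)

if __name__ == "__main__":
    print(f"Implied 2028 market size: US${projected_2028:.1f} billion")
```

Under that nine-period assumption, the market would grow to roughly US$42 billion, more than four times its 2019 size; counting only eight periods would give a correspondingly lower figure.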