
This article is an onsite version of our Unhedged newsletter. Premium subscribers can sign up here to receive the newsletter every working day. Standard subscribers can upgrade to Premium here or explore all FT newsletters

Good morning. French markets fell a little after President Emmanuel Macron called early parliamentary elections, but the reaction was more a resigned Gallic shrug than any kind of panic. Further evidence for the Unhedged view that politics doesn’t matter much to markets in the short term (except in extreme cases). Email me: [email protected].

The robots are here

Anyone who works in the information industry – a category that includes journalists, software developers and securities analysts – should think about whether, or perhaps when, a computer will take their job away.

A large language model trained on the copy written for the Financial Times could write newsletters that sound a lot like me. Maybe the newsletters wouldn’t be entirely convincing today, but it probably won’t be long before they are. Maybe people don’t want to read newsletters written by LLMs, in which case my trip to the knacker’s yard isn’t quite booked yet. But the threat is clear.

Unhedged readers may be less interested in the future of journalism than in that of analysts and portfolio managers. That brings me to a recent paper by three University of Chicago Booth School of Business scholars, Alex Kim, Maximilian Muhn and Valeri Nikolaev (I’ll call them KMN). The paper, “Financial Statement Analysis with Large Language Models,” fed corporate financial statements to ChatGPT. With relatively little guidance, the LLM turned those statements into earnings forecasts that were more accurate than human analysts’ – and the forecasts formed the basis of model portfolios that produced large excess returns in backtests.

“We provide evidence consistent with large language models exhibiting human-like capabilities in the financial domain,” the authors concluded. “Our results suggest that LLMs have the potential to democratize financial information processing.”

KMN fed ChatGPT thousands upon thousands of balance sheets and income statements, stripped of dates and company names, from a database that spanned 1968 to 2021 and included more than 15,000 companies. Each balance sheet and associated income statement contained the usual two years of data, but was an individual input; the model was not “told” anything about the company’s longer-term history. KMN then asked the model to perform very standard financial analysis (“What has changed in the accounts since last year?”, “Calculate the liquidity ratios”, “What is the gross margin?”).

Next – and this proved crucial – KMN asked the model to write narratives explaining the results of the financial analysis. Finally, they asked the model to predict whether each company’s earnings would increase or decrease over the next year, whether the change would be small, moderate or large, and how confident it was in that prediction.
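To make that set-up concrete, here is a minimal sketch of what such a prompting pipeline might look like. The model name, the prompt wording and the output format are my own illustrative assumptions, not the authors’ actual code.

```python
# Illustrative sketch only: prompts, model name and output format are assumptions.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

PROMPT = """You are a financial analyst. Below are an anonymised balance sheet and
income statement covering two years, with no company name or dates.
1. Describe the notable changes versus the prior year.
2. Compute standard ratios (liquidity, leverage, margins).
3. Write a short narrative interpreting what these results imply.
4. Predict whether earnings will increase or decrease next year, whether the
   change will be small, moderate or large, and how confident you are.
Finish with one line in the form:
DIRECTION=<increase|decrease>; MAGNITUDE=<small|moderate|large>; CONFIDENCE=<0 to 1>
"""

def forecast_earnings(statements_text: str) -> str:
    """Send one anonymised two-year statement to the model and return its reply."""
    response = client.chat.completions.create(
        model="gpt-4",   # a GPT-4-class model; the exact version used is an assumption here
        temperature=0,   # keep the output as repeatable as possible
        messages=[{"role": "user", "content": PROMPT + "\n" + statements_text}],
    )
    return response.choices[0].message.content
```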

Earnings forecasting is not particularly easy for humans or machines, even when the forecast is binary (earnings up or down). Simplifying quite a lot, the human predictions (which are drawn from the same historical database) were correct about 57 percent of the time, measured in the middle of the following year. That’s better than ChatGPT managed before being prompted to write the narratives. With that prompt, however, the model’s accuracy rose to 60 percent. “This means that GPT significantly outperforms the average financial analyst in forecasting earnings,” KMN wrote.
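Those accuracy figures are simple hit rates on the binary up-or-down call. A sketch of the scoring, with illustrative variable names of my own, looks roughly like this:

```python
def directional_accuracy(predicted_up: list[bool], realised_up: list[bool]) -> float:
    """Share of firm-years where the predicted earnings direction matched reality."""
    hits = sum(p == r for p, r in zip(predicted_up, realised_up))
    return hits / len(predicted_up)

# Per the figures quoted above, this hit rate comes out at roughly 0.57 for the
# analyst consensus and roughly 0.60 for the prompted model in KMN's tests.
```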

Finally, KMN created long and short model portfolios based on the companies for which the model was most confident in predicting significant earnings changes. In backtests, these portfolios outperformed the broad equity market by 37 basis points per month on a cap-weighted basis and by 84 basis points per month on an equal-weighted basis (suggesting that the model adds more value with its predictions of smaller stocks’ earnings). That’s a lot of alpha.
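As a rough illustration of that portfolio construction (not the authors’ code; the column names, the 50-stock cut-off and the monthly rebalance are my assumptions), the long-short backtest logic amounts to something like this:

```python
import pandas as pd

def long_short_returns(preds: pd.DataFrame, weighting: str = "equal") -> pd.Series:
    """Monthly long-short return: long the highest-confidence 'increase' picks,
    short the highest-confidence 'decrease' picks.

    Expected columns (illustrative names): month, direction, confidence,
    next_month_return, market_cap.
    """
    monthly = {}
    for month, grp in preds.groupby("month"):
        longs = grp[grp["direction"] == "increase"].nlargest(50, "confidence")
        shorts = grp[grp["direction"] == "decrease"].nlargest(50, "confidence")
        if weighting == "cap":
            long_ret = (longs["next_month_return"] * longs["market_cap"]).sum() / longs["market_cap"].sum()
            short_ret = (shorts["next_month_return"] * shorts["market_cap"]).sum() / shorts["market_cap"].sum()
        else:
            long_ret = longs["next_month_return"].mean()
            short_ret = shorts["next_month_return"].mean()
        monthly[month] = long_ret - short_ret
    return pd.Series(monthly).sort_index()

# Comparing such a series with the market's monthly return is what yields the
# roughly 37bp (cap-weighted) and 84bp (equal-weighted) figures quoted above.
```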

I spoke with Alex Kim yesterday, and he was quick to stress that the results are preliminary. This is a proof of concept, not proof that KMN have invented a better mousetrap for stock picking. Kim was equally keen to emphasize KMN’s finding that the key to the greater forecast accuracy seems to be asking the model to write text explaining what the financial statements show. That is the “human” aspect.

The study raises many questions, especially for someone like me who hasn’t done much research on artificial intelligence. In no particular order:

  1. At one level, the KMN result does not seem surprising to me. There has been plenty of evidence over the years that earlier computer models, or even plain old linear regressions, can outperform the average analyst. The most obvious explanation is that the models and regressions simply find and follow rules consistently. They are therefore not susceptible to the biases that the richer information humans have access to (company reports, executive chatter and so on) only encourages and reinforces.

  2. What is more surprising is that an out-of-the-box LLM, given fairly simple prompts, was able to outperform humans by a significant margin (the model also beat simple statistical regressions, and performed about as well as specialized “neural network” programs trained specifically to predict earnings).

  3. Of course, all the usual caveats that apply to any social science study apply here. Many studies are run, but only a few are published. Sometimes the results do not hold up.

  4. Some of the best stock pickers specifically avoid Wall Street’s obsession with near-term earnings. Instead, they focus on the structural advantages of companies and the ways the world is changing that give some companies an advantage over others. Can ChatGPT make such “big calls” as effectively as it does short-term earnings forecasts?

  5. What is the job of a financial analyst? If an LLM can predict earnings better than its human competitors most of the time, where does the analyst’s value lie? Is it in explaining the details of a deal to the portfolio manager who makes the “big decisions”? In acting as a conduit of information connecting the company to the market? Will that still be valuable when human buy and sell decisions are a thing of the past?

  6. Perhaps AI’s ability to outperform the average analyst or stock picker won’t change anything. As Joshua Gans of the University of Toronto explained to me, the low value of the average stock picker was demonstrated years ago by the artificial intelligence known as the low-cost Vanguard index fund. What will matter is the ability of LLMs to compete with or support the smartest people in the market, many of whom already use vast amounts of computer power to do their jobs.

I am curious to hear my readers’ opinions on this topic.

A good read

More on Elon Musk’s pay package.

FT Unhedged Podcast

Can’t get enough of Unhedged? Listen to our new podcast, featuring 15 minutes of the latest market news and financial headlines twice a week. Find previous editions of the newsletter here.

Recommended newsletters for you

Swamp Notes — Expert insights into the intersection of money and power in US politics. Register here

Chris Giles on central banks — Important news and views on central bank thinking, inflation, interest rates and money. Subscribe here
