Multimodal deep learning for multi-horizon corporate revenue forecasting

dc.contributor.advisor: Yassine, Abdulsalam
dc.contributor.author: Wu, Qiping
dc.date.accessioned: 2026-04-28T13:01:39Z
dc.date.created: 2026
dc.date.issued: 2026
dc.description.abstract: Corporate revenue forecasting matters for valuation, portfolio management, and capital allocation. However, it is difficult because financial statements mainly reflect the past, while investors and firms often need forecasts ranging from the next quarter to a rolling one-year horizon. The challenge grows with the horizon, especially in fast-changing industries. This thesis addresses the problem by building a forecasting framework that starts with a broad quantitative baseline and then extends to a multimodal approach. First, the thesis develops a Temporal Fusion Transformer (TFT) baseline for next-quarter revenue forecasting across 155 continuously listed S&P 500 firms. Under a strict chronological evaluation protocol, the TFT model achieves a test Mean Absolute Percentage Error (MAPE) of 9.31%, a Root Mean Squared Error (RMSE) of 1,973 million USD, and a Mean Absolute Error (MAE) of 1,790 million USD. Controlled ablation analysis further shows that accurate short-horizon forecasting depends not only on autoregressive revenue history but also on structured firm context, including sector identity, year-over-year growth, and firm-scale variables such as total assets and equity. Second, the framework is extended from one-quarter-ahead to four-quarter-ahead forecasting. Forecast accuracy deteriorates as the horizon lengthens, with MAPE rising from 9.31% at one quarter ahead (t + 1) to 12.07% at four quarters ahead (t + 4). A comparison with an LSTM baseline under the same chronological setting further suggests that this deterioration is not specific to a single model but reflects a broader limitation of purely financial forecasting approaches. The effect is especially pronounced in technology-oriented firms, highlighting the limits of relying only on lagged financial data in non-linear growth environments.
Third, the work proposes a multimodal TFT framework that integrates earnings-call-derived textual signals into the forecasting pipeline. Focusing on the Mega-Cap 5 companies, the framework uses both Financial Bidirectional Encoder Representations from Transformers (FinBERT) and a locally deployed Llama-3 8B model to extract finance-domain sentiment and richer generative narrative features from quarterly earnings call transcripts. The results show that transcript-based narrative features improve long-horizon forecasting, with the Llama-3 representation delivering the largest gain: the financial-data-only TFT has a MAPE of 53.85%, while the FinBERT+TFT and Llama-3+TFT hybrids reduce it to 48.70% and 43.01%, respectively. Overall, this thesis presents a practically deployable multimodal forecasting framework that bridges the gap between backward-looking financial fundamentals and forward-looking managerial narratives in corporate revenue forecasting.
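For reference, the three headline error metrics reported in the abstract can be computed as in the following minimal sketch. The revenue figures below are illustrative placeholders, not data from the thesis:

```python
import math

def mape(actual, forecast):
    """Mean Absolute Percentage Error, in percent."""
    return sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual) * 100

def rmse(actual, forecast):
    """Root Mean Squared Error, in the same units as the data."""
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual))

def mae(actual, forecast):
    """Mean Absolute Error, in the same units as the data."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)

# Hypothetical quarterly revenues in millions USD (illustrative only).
actual = [12000, 13500, 11000, 14200]
forecast = [11500, 14000, 12000, 13800]

print(f"MAPE: {mape(actual, forecast):.2f}%")   # percentage error, scale-free
print(f"RMSE: {rmse(actual, forecast):.0f}M")   # penalizes large misses more
print(f"MAE:  {mae(actual, forecast):.0f}M")    # average absolute miss
```

Note that MAPE is scale-free, which is why it is the natural headline metric when comparing firms of very different sizes, whereas RMSE and MAE are reported in millions of USD.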
dc.identifier.uri: https://knowledgecommons.lakeheadu.ca/handle/2453/5592
dc.language.iso: en
dc.title: Multimodal deep learning for multi-horizon corporate revenue forecasting
dc.type: Thesis
etd.degree.discipline: Engineering : Electrical and Computer
etd.degree.grantor: Lakehead University
etd.degree.level: Master
etd.degree.name: Master of Science in Electrical and Computer Engineering

Files

Original bundle

Name: WuQ2026m-2b.pdf
Size: 8.05 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 2.23 KB
Description: Item-specific license agreed upon to submission