Measuring the communication of the central bank using large language models

While research on central bank communication is gaining prominence in the economics and political economy literature, the dictionary-based techniques it typically employs are far less efficient than supervised learning approaches. The project uses large volumes of ‘gold-standard’ training data to measure precisely how accurate certain dictionaries are and how much improvement can be expected when supervised learning is used for classification. The supervised models include both long-established bag-of-words-based solutions (e.g. Naive Bayes, Random Forest) and state-of-the-art large language models based on deep neural networks (e.g. BERT, RoBERTa).
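To make the comparison concrete, the sketch below scores the same held-out sentences with a dictionary lookup and with a bag-of-words Naive Bayes classifier written from scratch. It is only an illustration, not the project's code: the sentences, the hawkish/dovish labels, the mini-dictionary and all function names are invented for this example, and a real evaluation would use the gold-standard data and established libraries.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy data: invented sentences with hand-assigned labels,
# standing in for a gold-standard training and test set.
TRAIN = [
    ("inflation risks have increased and rates may rise", "hawkish"),
    ("the council decided to raise the policy rate", "hawkish"),
    ("tightening is warranted by strong price pressures", "hawkish"),
    ("the council decided to cut the policy rate", "dovish"),
    ("weak growth calls for further easing", "dovish"),
    ("inflation risks have decreased and rates may fall", "dovish"),
]
TEST = [
    ("price pressures call for a rate rise", "hawkish"),
    ("growth is weak so the council may cut rates", "dovish"),
]

# Hypothetical mini-dictionary: +1 for hawkish terms, -1 for dovish terms.
DICTIONARY = {"raise": 1, "rise": 1, "tightening": 1,
              "cut": -1, "fall": -1, "easing": -1}


def tokenize(text):
    """Lower-case and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def dictionary_classify(text):
    """Sum the dictionary scores of all tokens and map the sign to a label."""
    score = sum(DICTIONARY.get(tok, 0) for tok in tokenize(text))
    if score > 0:
        return "hawkish"
    if score < 0:
        return "dovish"
    return "neutral"  # no dictionary hit, or hits cancel out


def train_naive_bayes(examples):
    """Collect per-class document counts, per-class word counts and the vocabulary."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in examples:
        class_counts[label] += 1
        for tok in tokenize(text):
            word_counts[label][tok] += 1
            vocab.add(tok)
    return class_counts, word_counts, vocab


def nb_classify(model, text):
    """Multinomial Naive Bayes with add-one smoothing; unseen words are skipped."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_counts):
        score = math.log(class_counts[label] / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for tok in tokenize(text):
            if tok in vocab:
                score += math.log((word_counts[label][tok] + 1)
                                  / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def accuracy(classify, examples):
    """Share of examples for which the predicted label equals the gold label."""
    return sum(classify(text) == label for text, label in examples) / len(examples)


MODEL = train_naive_bayes(TRAIN)

if __name__ == "__main__":
    print("dictionary accuracy:", accuracy(dictionary_classify, TEST))
    print("naive bayes accuracy:", accuracy(lambda t: nb_classify(MODEL, t), TEST))
```

On a toy set this small the two approaches are not meaningfully distinguishable; the point is only that both are evaluated against the same labelled sentences, which is what allows their accuracy to be compared directly.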

The project will provide a comprehensive picture of how accurately each method captures the latent dimensions of central bank communication. This matters because accurate measurement is a prerequisite for reliable results from the econometric models built on these measures. Preliminary results suggest that there are differences in magnitude between the models, which may in turn significantly affect the results of econometric models. The final fine-tuned language model will be open access and made available in the relevant data repository.

Project participants
Tamás Barczikay
Ákos Máté

Publication
Máté Ákos, Sebők Miklós, Barczikay Tamás. The effect of central bank communication on sovereign bond yields: The case of Hungary. PLOS ONE, 16 (2), pp. 1–28, 2021.

Conference paper
Máté Ákos, Barczikay Tamás. European Central Bank communication during crises: ditching the boilerplate? 4th Annual COMPTEXT Conference, Dublin, 5–7 May 2022.

Repositories
GitHub – poltextlab/central_bank_communication: Replication materials
Script analysing the sentiment of central bank communication
R software package for the analysis of monetary sentiment