Opinion analysis of political texts

The main objective of the project is to develop sentiment and emotion analysis procedures for various types of Hungarian texts (online news portals, newspaper articles, political speeches and parliamentary speeches). From each text analysed, these procedures aim to extract the pieces of information that express an evaluation. The analysis can be conducted at different levels, depending in part on the basic unit of analysis and on whether the object or the trigger of the emotion is determined. The literature distinguishes between sentiment analysis, which works along a positive-negative-neutral scale, and emotion analysis, which is based on multiple categories. The latter provides considerably more information on the emotional charge of a given unit.

We have experimented with a number of methods to achieve this goal. We have created substantial manually annotated sentiment and emotion corpora, which we use for dictionary construction, for training machine learning models, and for testing the effectiveness of machine learning and dictionary-based analysis algorithms. As part of the project, we have created a sentiment dictionary that works along a positive-negative scale by applying a word embedding technique to texts published on online news portals. We have also developed an inductive emotion categorisation system that distinguishes between 12 different emotions in political texts. Because our system of categories can be converted to internationally used emotion categories, results obtained with it can be compared with results based on those categories.
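To illustrate what dictionary-based sentiment analysis along a positive-negative scale involves, the following is a minimal sketch, not the project's actual implementation. The lexicon entries, their weights, and the function names (`tokenize`, `score_sentence`, `label_sentence`) are hypothetical and chosen only for illustration; the project's dictionary was built with a word embedding technique and is not reproduced here.

```python
import re

# Hypothetical toy lexicon: the entries and weights are illustrative only
# and are NOT taken from the project's sentiment dictionary.
TOY_LEXICON = {
    "siker": 1.0,     # "success"
    "remek": 1.0,     # "great"
    "válság": -1.0,   # "crisis"
    "botrány": -1.0,  # "scandal"
}


def tokenize(text):
    """Lowercase the text and split it into word tokens (Unicode-aware)."""
    return re.findall(r"\w+", text.lower())


def score_sentence(text, lexicon=TOY_LEXICON):
    """Sum the polarity weights of all tokens that appear in the lexicon."""
    return sum(lexicon.get(token, 0.0) for token in tokenize(text))


def label_sentence(text, lexicon=TOY_LEXICON):
    """Map the summed score to a positive / negative / neutral label."""
    score = score_sentence(text, lexicon)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

This sketch matches exact word forms only. Because Hungarian is heavily inflected, a practical system would presumably lemmatise the text before the lookup, a step that is left out here.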

Using a double-blind coding technique, we constructed two resources: a sentiment and emotion corpus of 5,700 sentences annotated at sentence level, and a sentiment and emotion corpus of parliamentary speeches annotated at sub-clause level (HunEmPoli), in which the 39,840 identified emotions were linked to the relevant arguments. Both corpora were annotated under strict quality control and with a high level of agreement among coders. In the current project phase, we are fine-tuning the huBERT model on training data generated from the HunEmPoli corpus to obtain a sentiment and emotion analysis model. We have also created a website to visualise our parliamentary speech corpus: https://napirendek.hu/erzelmek/.
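Agreement between two coders who label the same units independently is often quantified with a chance-corrected measure such as Cohen's kappa. The text above does not state which measure the project used, so the following is only a sketch of one common option; the function name `cohens_kappa` and the example labels are assumptions made for illustration.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two coders who labelled the same sequence of units.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and
    p_e is the agreement expected by chance from each coder's label frequencies.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("Both coders must label the same non-empty set of units.")

    n = len(labels_a)
    # Observed agreement: share of units on which the two coders agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: product of the coders' marginal label proportions.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)

    if expected == 1.0:
        # Both coders used a single identical label throughout.
        return 1.0
    return (observed - expected) / (1 - expected)
```

With made-up labels, `cohens_kappa(["pos", "neg", "pos"], ["pos", "neg", "pos"])` evaluates to 1.0 (perfect agreement), while values near 0 indicate agreement no better than chance.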

 

Project participants
Csenge Guba
Orsolya Ring
Martina Katalin Szabó
Bendegúz Váradi
Veronika Vincze

 

Cooperating partners
Budapest University of Technology and Economics, Department of Telecommunications and Media Informatics, SmartLAB 
Charles University, Prague
Kempelen Institute, Bratislava
Montana Tudásmenedzsment Kft.
Vistula University, Warsaw

 

Publications
Ring Orsolya, Vincze Veronika, Guba Csenge, Üveges István. HunEmPoli: magyar nyelvű, részletesen annotált emóciókorpusz (HunEmPoli: Hungarian emotions corpus with detailed annotations). In: Berend, Gábor; Gosztolya, Gábor; Vincze, Veronika (eds.) 19th Conference of Hungarian Computational Linguistics, University of Szeged, Information Technology Institute, 2023

Szabó Martina Katalin, Vincze Veronika, Ring Orsolya, Guba Csenge. Nagyot mondó képviselők? Fokozás a politikai kommunikációban (MPs telling tall tales? Stepping up emphasis in political communication). In: Berend, Gábor; Gosztolya, Gábor; Vincze, Veronika (eds.) 18th Conference of Hungarian Computational Linguistics, University of Szeged, Information Technology Institute, 2022

Üveges István, Vincze Veronika, Ring Orsolya, Guba Csenge. Aspect-based emotion analysis of Hungarian parliamentary speeches. In: Proceedings of the 2nd Workshop on Computational Linguistics for Political Text Analysis, Potsdam, 2022

Ring Orsolya (with Erjavec, Tomaž; Ogrodniczuk, Maciej; Osenova, Petya; et al.). The ParlaMint corpora of parliamentary proceedings. Language Resources and Evaluation, 2022

 

Conference papers
Ring Orsolya, Guba Csenge, Vincze Veronika, Üveges István. HunEmPoli: Hungarian emotions corpus with detailed annotations, 19th Conference of Hungarian Computational Linguistics, Szeged, 26–27 January 2023
Üveges István, Vincze Veronika, Ring Orsolya, Guba Csenge. Aspect-based emotion analysis of Hungarian parliamentary speeches. KONVENS 2022, 2nd Workshop on Computational Linguistics for Political Text Analysis, Potsdam, 12–15 September 2022
Szabó Martina Katalin, Vincze Veronika, Ring Orsolya, Guba Csenge. MPs telling tall tales? Stepping up emphasis in political communication. 18th Conference of Hungarian Computational Linguistics, online, 27–28 January 2022

 

Repositories
GitHub - A novel cost-efficient use of BERT embeddings in 8-way emotion classification on a Hungarian media corpus
GitHub - Aspect-based emotion analysis of Hungarian parliamentary speeches
GitHub - HunEmPoli corpus
GitHub - Possibilities and limitations of a lexicon-based sentiment analysis of Hungarian political news