Creation of Hungarian benchmark corpora for the evaluation of machine learning algorithms

Creation of corpora of parliamentary speeches (1990-2020) and media texts, coded for public policy and political sentiment and annotated with NLP layers (e.g., NER).

Media: Népszabadság (2003-2014), Magyar Nemzet (2003-2014), Index (1999-2016).

Parliamentary speeches: interpellations (1990-2020), urgent questions (1994-2020), agenda speeches (1990-2020), speeches before the agenda (1990-2020).

Our aim is to create large, labelled corpora in Hungarian for developing machine learning algorithms and testing their effectiveness. In the planned project, we will create benchmark databases by further developing and expanding the existing databases of the CAP project.