Research project is (co)funded by the Slovenian Research and Innovation Agency

Project

Member of University of Ljubljana

School of Economics and Business

Code

J5-2554

Project

Quantitative and qualitative analysis of the unregulated corporate financial reporting

Period

1.9.2020 - 31.12.2023

Annual scope

1.535 research hours

Head

Igor Lončarski


Research activity

Social sciences/Economics

Research Organisation

Jožef Stefan Institute


Abstract

One of the key functions of the financial system is to facilitate the transfer of funds from savers with excess funds (typically households) to entities that require funds for capital investment (typically nonfinancial companies). The main goal of financial reporting within the financial system is to ensure that high-quality, useful information about the financial position of firms, their performance and changes in their financial position is available (IASB Framework 2015) to a wide range of users, including existing and potential investors, financial institutions, employees and the government. The central element of the formal system of financial reporting is accounting standards. The EU adopted the International Financial Reporting Standards in 2005. The quality of financial reporting became a central issue during the recent financial crisis and has received considerable attention from society at large ever since. This is reflected in the recent changes to financial and non-financial reporting and auditing regulations. The collective aim of these developments was to increase the transparency of the information that firms and their auditors communicate to users (investors, regulators, the broader public). These developments have collectively resulted in a large increase in the amount of financial and non-financial information provided by the financial reporting system. Most visibly, the annual reports of companies, which represent the main output of the financial reporting system, have grown to several hundred pages. This increase sparked concerns that the amount of data exceeds the capacity of investors and other stakeholders to extract useful information from these reports. These concerns were noted by stakeholders as well as by standard setters.
The purpose of the proposed research is to study the relationships between the characteristics of financial reports and financial indicators using state-of-the-art data collection and data analysis approaches (e.g., eye-tracking devices, deep learning techniques). Effort will also be devoted to developing the resources and methodologies that such undertakings require. The wide usefulness of these results will be strengthened by novel and promising approaches that allow such tools to be used with texts in other languages that are not represented in the resources (e.g., languages that are not well supported by state-of-the-art computational linguistic resources).

The objectives of the proposed project are: 

· To develop methodologies for analysing the relations between the characteristics of financial texts (periodic reports) and the business performance of companies and their context.

· To discover and assess the meaning and importance of patterns in the relations among the key entities mentioned in the texts of annual reports, both statically and over time.

· To study the contextualization of financial concepts with distributional semantics methods. We will observe the changes across sectors and over time relative to financial indicators, and thereby investigate the correlations between the financial and textual context of financial terms.

· To develop methods that allow multilingual analysis of financial reports and the use of a selected subset of the developed resources and methods with languages that are not well supported by state-of-the-art computational linguistic resources.

· To conduct a qualitative analysis of how users perceive annual reports and to observe their focus while reading them, in order to find how much attention they pay to different parts of annual reports and to better understand why different user groups focus on different parts.

Our research has implications for a large part of the economic system: financial markets, regulators, standard setters and the broader users of annual reports.


Researchers

SICRIS


The phases of the project and their realization

- Descriptive and predictive analytics of financial indicators (WP1)

We created three text collections that vary by geographic location (USA, UK) and theme (financial sentiment, sustainability). We used the FIBO ontology for concept classification and conducted several methodological experiments to improve sentiment detection, including the latest language models such as BERT and finBERT. We specifically focused on multi-task learning, which enhances classification across multiple dimensions, such as sentiment, relevance, and sustainability. We also examined the connections between textual characteristics and financial indicators, where we found weak correlations. Furthermore, we developed new approaches for predicting related sustainability ratings and improving results by enriching data based on prior domain knowledge.
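The link between textual characteristics and financial indicators described above can be illustrated with a simple correlation check. The following is a minimal sketch, not the project's actual pipeline: the function name `pearson`, the feature (share of positive-sentiment sentences per report) and the indicator (return on assets) are assumptions chosen for illustration, and the numbers are made-up toy values.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Hypothetical toy data: one value per annual report.
    positive_sentence_share = [0.10, 0.25, 0.15, 0.30, 0.20]
    return_on_assets = [0.04, 0.03, 0.06, 0.05, 0.02]
    r = pearson(positive_sentence_share, return_on_assets)
    print(f"Pearson r = {r:.3f}")
```

A coefficient close to zero would correspond to the kind of weak correlation reported above; in practice one would also test significance and control for sector and year.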

- Contextualisations of financial concepts with distributional semantics methods (WP3)

We are developing methods of distributional semantics for analyzing financial texts. We conducted a diachronic analysis based on clustering vector embeddings and monitoring changes over time, which we applied to American financial reports. We focused on changes in language related to sustainable development, where we observed an increasing use of specific ESG concepts in English annual reports. This part of the work also received an award at a workshop on computing social responsibility. We also performed a comparative analysis of financial themes and their distribution using quantitative indicators from the Refinitiv EIKON Datastream database, aiming to use these financial indicators in semantic analysis. We developed methods for data visualization and explanation, including building workflows on the ClowdFlows platform and using SHAP for model explanations. We enhanced this with methods that utilize the FIBO ontology for data generalization and explaining classification models.
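One common way to quantify change over time from clustered embeddings is to compare, for a given term, how its usages are distributed across clusters in two periods. The sketch below is an illustration under stated assumptions, not the project's implementation: the cluster centroids are assumed to be already computed, the function names (`cluster_distribution`, `js_divergence`) are hypothetical, and the two-dimensional vectors stand in for real contextual embeddings.

```python
import math
from collections import Counter


def nearest(vec, centroids):
    """Index of the centroid closest to vec (squared Euclidean distance)."""
    return min(
        range(len(centroids)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(vec, centroids[i])),
    )


def cluster_distribution(vectors, centroids):
    """Share of usage vectors assigned to each cluster."""
    if not vectors:
        raise ValueError("no usage vectors given")
    counts = Counter(nearest(v, centroids) for v in vectors)
    return [counts[i] / len(vectors) for i in range(len(centroids))]


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2, between 0 and 1) of two distributions."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        # Terms with zero probability in x contribute nothing.
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


if __name__ == "__main__":
    # Hypothetical centroids and usage embeddings of one term in two periods.
    centroids = [[0.0, 0.0], [1.0, 1.0]]
    usages_early = [[0.1, 0.0], [0.0, 0.2], [0.9, 1.0]]
    usages_late = [[1.0, 0.9], [0.8, 1.1], [0.1, 0.1]]
    p = cluster_distribution(usages_early, centroids)
    q = cluster_distribution(usages_late, centroids)
    print(f"usage shift (JSD) = {js_divergence(p, q):.3f}")
```

A divergence of 0 means the term is used in the same mix of contexts in both periods, while values closer to 1 indicate that its usage has moved to different clusters.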

- Qualitative analysis of perceptions of users of annual reports and observation of their focus while reading annual reports (WP5)

Within this work package, we conducted a qualitative analysis of user perceptions of annual reports. We began by observing users while they read annual reports and performed an in-depth analysis of the collected data. We used questionnaires to gather impressions and opinions and compared the findings from observations with the questionnaire data. We examined the connections between diachronic and financial analysis and user observations. Preliminary results indicate that users can predict a company's sustainability orientation relatively accurately based on a brief address by the management in the report. We also discovered the influence of personality traits, such as trust in institutions and cognitive style, on user perceptions.


Citations for bibliographic records

[1] STEPIŠNIK PERDIH, Timen, PELICON, Andraž, ŠKRLJ, Blaž, ŽNIDARŠIČ, Martin, LONČARSKI, Igor, POLLAK, Senja. Sentiment classification by incorporating background knowledge from financial ontologies. In: EL-HAJ, Mahmoud (ed.), RAYSON, Paul (ed.), ZMANDAR, Paul (ed.). Proceedings of the 4th Financial Narrative Processing Workshop, FNP 2022 Language Resources and Evaluation Conference, LREC 2022, 24 June 2022, Marseille, France : proceedings. Paris: European Language Resources Association (ELRA), 2022. pp. 17-26. ISBN 979-10-95546-74-0. http://www.lrec-conf.org/proceedings/lrec2022/workshops/FNP/2022.fnp-1.0.pdf. [COBISS.SI-ID 114648579]

 

[2] PURVER, Matthew, MARTINC, Matej, ICHEV, Riste, LONČARSKI, Igor, SITAR ŠUŠTAR, Katarina, VALENTINČIČ, Aljoša, POLLAK, Senja. Tracking changes in ESG representation : Initial investigations in UK Annual reports. In: Proc. of the First Computing Social Responsibility Workshop within the 13th Language Resources and Evaluation Conference, Marseille, France, pp. 9-14, illus. ISBN 979-10-95546-72-6. (best paper award) http://www.lrec-conf.org/proceedings/lrec2022/workshops/CSRNLP1/pdf/2022.csrnlp1-1.2.pdf. [COBISS.SI-ID 117765379]

 

[3] STEPIŠNIK PERDIH, Timen, POLLAK, Senja, ŠKRLJ, Blaž. JSI at the FinSim-2 task : ontology-augmented financial concept classification. In: LESKOVEC, Jurij (ed.), et al. The Web Conference : companion of the World Wide Web conference (WWW 2021) : [30th edition, Ljubljana, 19th - 23rd April, 2021]. New York: Association for Computing Machinery, 2021. pp. 298-301. ISBN 978-1-4503-8313-4. DOI: 10.1145/3442442.3451383. [COBISS.SI-ID 66960131]. Preprint at: https://arxiv.org/abs/2106.09230

 

[4] STEPIŠNIK PERDIH, Timen, PELICON, Andraž, ŠKRLJ, Blaž, ŽNIDARŠIČ, Martin, LONČARSKI, Igor, POLLAK, Senja. Sentiment classification by incorporating background knowledge from financial ontologies : presented at London Text Analysis Conference, 2022. [COBISS.SI-ID 133193219]

 

[5] MONTARIOL, Syrielle, MARTINC, Matej, PELICON, Andraž, POLLAK, Senja, KOLOSKI, Boshko, LONČARSKI, Igor, VALENTINČIČ, Aljoša, SITAR ŠUŠTAR, Katarina, ICHEV, Riste, ŽNIDARŠIČ, Martin. Multi-task learning for features extraction in financial annual reports. In: KOPRINSKA, Irena (ed.). Machine learning and principles and practice of knowledge discovery in databases : international workshops of ECML PKDD 2022, Grenoble, France, September 19-23, 2022 : proceedings. Part 2. Cham: Springer, 2023. pp. 7-24. Communications in computer and information science (Print), vol. 1753. ISBN 978-3-031-23632-7. ISSN 1865-0929. DOI: 10.1007/978-3-031-23633-4_1. [COBISS.SI-ID 140549635]. Open access version: https://arxiv.org/abs/2404.05281

 

[6] ŠTIHEC, Jan, POLLAK, Senja, ŽNIDARŠIČ, Martin. Preliminary experimentation with combinations and extensions of forward-looking sentence detection wordlists. In: Proceedings of the 3rd Financial Narrative Processing Workshop, FNP 2021 : 15–16 September, 2021, Lancaster, UK. Stroudsburg: Association for Computational Linguistics (ACL), 2021. pp. 26-30. https://aclanthology.org/2021.fnp-1.pdf. [COBISS.SI-ID 90604291]

 

[7] ŽNIDARŠIČ, Martin, POLLAK, Senja, PODPEČAN, Vid. Interaktivno eksperimentiranje z besednimi vložitvami v platformi ClowdFlows = Interactive Experimentation with Word Embeddings in the ClowdFlows platform. In: LUŠTREK, Mitja (ed.), GAMS, Matjaž (ed.), PILTAVER, Rok (ed.). Slovenska konferenca o umetni inteligenci = Slovenian Conference on Artificial Intelligence : Informacijska družba - IS 2022 = Information Society - IS 2022 : zbornik 25. mednarodne multikonference = proceedings of the 25th international multiconference : zvezek A = volume A : 11. oktober 2022, 11 October 2022, Ljubljana, Slovenija. Ljubljana: Institut "Jožef Stefan", 2022. pp. 47-50. Informacijska družba. ISBN 978-961-264-241-9. ISSN 2630-371X. http://library.ijs.si/Stacks/Proceedings/InformationSociety/2022/IS2022_Volume-A%20-%20SKUI.pdf. [COBISS.SI-ID 128197635]

 

[8] MARTINC, Matej, POLLAK, Senja, ROBNIK ŠIKONJA, Marko. Supervised and unsupervised neural approaches to text readability. Computational linguistics. 2021, vol. 47, no. 1, pp. 141-179. ISSN 0891-2017. DOI: 10.1162/coli_a_00398. [COBISS.SI-ID 57293315]

 

[9] ŠKRLJ, Blaž, KOLOSKI, Boshko, POLLAK, Senja. Retrieval-efficiency trade-off of unsupervised keyword extraction. In: PONCELET, Pascal (ed.), IENCO, Dino (ed.). Discovery Science : 25th International Conference, DS 2022 : Montpellier, France, October 10–12, 2022 : proceedings. Cham: Springer Nature, cop. 2022. pp. 379-393, illus. Lecture notes in computer science (Internet), 13601. ISBN 978-3-031-18840-4. ISSN 1611-3349. DOI: 10.1007/978-3-031-18840-4_27. [COBISS.SI-ID 136819971]. Open access version: https://arxiv.org/pdf/2208.07262.pdf.

 

[10] KOLOSKI, Boshko, MONTARIOL, Syrielle, PURVER, Matthew, POLLAK, Senja. Knowledge informed sustainability detection from short financial texts. In: The 4th Workshop on Financial Technology and Natural Language Processing (FinNLP) with FinSim4-ESG Shared Task : July 24, [2022], Vienna. [S. l.: s. n. 2022, pp. 73-79], https://mx.nthu.edu.tw/~chungchichen/FinNLP2022_IJCAI/12.pdf. [COBISS.SI-ID 136802307]

 

[11] ICHEV, Riste, LONČARSKI, Igor, MONTARIOL, Syrielle, POLLAK, Senja, SITAR ŠUŠTAR, Katarina, TOMAN, Aleš, VALENTINČIČ, Aljoša, ŽNIDARŠIČ, Martin. Textual analysis of corporate sustainability reporting and corporate ESG scores. In: CZUPY, Gergely János (ed.), VÍG, Attila András. 14th Annual Financial Market Liquidity Conference, Budapest, Hungary, 9th-10th November, 2023 : book of abstracts. Budapest: Corvinus University of Budapest, 2023. p. 4. https://www.uni-corvinus.hu/contents/uploads/2023/11/AFML_book_of_abstracts.c95.pdf. [COBISS.SI-ID 171865859]

 

[12] LONČARSKI, Igor, MONTARIOL, Syrielle, POLLAK, Senja, VALENTINČIČ, Aljoša, ŽNIDARŠIČ, Martin. Textual analysis of corporate sustainability reporting and corporate ESG scores. In: 2nd Conference on international finance, sustainable and climate finance and growth, Ljubljana, 18th-20th June 2023. [S. l.]: Future Finance and Economics Association, 2023. https://conference-service.com/ffea-ljubljana-2023/download/99ytg2mv/detailed_program_3.html. [COBISS.SI-ID 159089667]


