Big Data Analytics

Aims of the course

The proliferation of social media and the digitalization of every aspect of business activity have resulted in the creation of big data: large volumes of mostly unstructured data. In a parallel development, it has become possible to store huge volumes of data reliably and inexpensively, analyse them efficiently, and extract relevant information. The advent of big data is already allowing for better measurement of economic effects and outcomes and is enabling novel research designs across a range of topics. Over time, these data are likely to affect the types of questions economists pose, by allowing for more focus on population variation and the analysis of a broader range of economic activities and interactions.
In the course, we first explore and discuss key big data concepts and present the most commonly used data analytics techniques and tools. We then introduce emerging technologies for processing and analyzing large amounts of data. We present the most common methods of in-memory analytics, data mining, and text mining through cases of their use in economic and financial applications. The core part of the course is designed as a hands-on learning experience, and case studies are used to illustrate and enrich the lecture material.

Course syllabus

1. Introduction to Big Data
• Understanding Big Data
• Big Data Technologies
• Analytics Technologies
• Big Data Value and Challenges
2. Data Capturing, Preparation, and Storage
• Data Integration
• Cleansing and Transformation
• Data Mashup
• Data Storage
• Frameworks for distributed storage and processing of big data sets
3. In-Memory Data Analytics
• Associative data analytics
• Multidimensional data analytics
4. Data Mining
• Data mining vs. traditional statistics
• Data mining process
• Data Exploration
• Classification
• Regression models
• Association analysis
• Clustering
• Deep learning
• Model evaluation and feature selection
5. Text Mining
• Text mining process
• Corpus
• Web scraping
• Semantic parsing vs. Bag of words
• Transformations (tokenization, stemming, stop-word filtering, …)
• Analysis (word frequencies, clustering, categorization, …)
• Sentiment analysis
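
As a rough illustration of the text mining steps listed above, the following minimal Python sketch builds a bag-of-words representation for a small corpus: it tokenizes each document, filters stop words, and counts word frequencies. The stop-word list, the function names, and the two sample documents are hypothetical and chosen only for illustration. Stemming is left out to keep the sketch short, and this is not necessarily the tooling used in the course.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list (illustration only).
STOPWORDS = {"the", "a", "of", "and", "is", "in", "to", "are"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def remove_stopwords(tokens):
    """Drop tokens that appear in the stop-word list."""
    return [t for t in tokens if t not in STOPWORDS]


def bag_of_words(text):
    """Return word frequencies for one document, ignoring word order."""
    return Counter(remove_stopwords(tokenize(text)))


# Made-up sample corpus: a list of documents.
corpus = [
    "The price of oil is rising and the price of gas is rising",
    "Markets are calm",
]

# One bag of words (word -> frequency) per document.
bags = [bag_of_words(doc) for doc in corpus]

for bag in bags:
    print(bag.most_common(3))
```

A bag of words discards word order entirely, which is what distinguishes it from semantic parsing: the two "rising" occurrences in the first document are counted but not linked to "oil" or "gas".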

Course director(s)

  • Office Hours
  • Wednesday at 12:00 in RZ-404
  • Office Hours
  • Wednesday at 15:00 in Zoom Room