Module 4: Data Science and Machine Learning

An Introduction to Machine Learning

  • What is mathematical modeling?
  • Classic tools
  • How is machine learning different?
  • The pros and cons of delegating to a machine
  • Simple methods (reinforcement, unsupervised and supervised learning)

Data Science in Finance

  • Supervised vs unsupervised learning
  • Primer on loss functions and essentials you need
  • Learning and linear models. Multiple linear regression diagnostics
  • Generalised least squares. Mahalanobis distance method
  • Dangers of overfitting. Decomposing estimation error

Classification and Clustering

  • K-Nearest Neighbours
  • Logistic Classifier and explicit maximum likelihood method
  • Principles of Bayesian classification, lowest possible error
  • Discriminant Analysis: Linear (LDA) and Quadratic (QDA)
  • K-means clustering and hierarchical clustering

Practical Filtering Methods

  • How to specify dynamic systems (states, Markov properties)
  • Weighted least squares method
  • Time-varying regression estimation (Kalman Filtering)
  • Applications of filtering: CAPM betas, continuous-time filtering
  • Introduction to Markov Chains. Hidden Markov Models

Machine Learning & Predictive Analytics

  • Regression: linear regression, bias-variance decomposition, subset selection, shrinkage methods, regression in high dimensions
  • Support Vector Machines: classification and regression using SVMs and kernel methods
  • Dimension reduction: Principal component analysis (PCA), kernel PCA, non-negative matrix decomposition, PageRank
  • Examples (2 worked examples)

Reinforcement Learning

  • What is Reinforcement Learning?
  • Reinforcement Learning in terms of classical techniques for pricing derivatives
  • Pricing exotic options using Reinforcement Learning
  • Building advanced asset allocation
  • Trading strategies

AI Based Algo Trading Strategies Using Python

  • Basic financial data analysis with Python and pandas
  • Creating features and label data from financial time series for market prediction
  • Application of classification algorithms from machine learning to predict market movements
  • Vectorized backtesting of algorithmic trading strategies based on the predictions
  • Risk analysis for the algorithmic trading strategies

Digital Signal Processing for Finance

  • Importance of Signal Processing (SP); Characterisation and classification of signals
  • Discrete Time Signals
  • Fourier Transforms and z-Transforms
  • Continuous-time signals and sampling
  • Discrete Fourier Transforms

Co-Integration using R

  • Multivariate time series analysis
  • Financial time series: stationary and unit root
  • Vector Autoregression, a theory-free model
  • Equilibrium and Error Correction Model
  • Engle-Granger Procedure
  • Cointegrating relationships and their rank
  • Estimation of reduced rank regression: Johansen Procedure
  • Stochastic modelling of equilibrium: Ornstein-Uhlenbeck process
  • Statistical arbitrage using mean reversion

Machine Learning Lab

  • Sandbox: conda, environments, Python and R packages, MLlib. Data sources
  • Logistic regression as a classifier: loss function, transition probabilities, softmax and appropriate penalty (Ridge regression)
  • Cross-validation: sample selection and reshuffling. Precision and recall. Is the classifier random?
  • Support Vector Machines: hyperplane intuition, soft vs hard margin. Choice of kernel to tackle non-linear problems
  • Random Forest Classifiers: regression versions of Decision Trees and AdaBoost
  • Vignettes on neural nets to predict market returns, probabilistic programming, and Markov-switching GARCH

Lecture order and content may occasionally change due to circumstances beyond our control; however, this will never affect the quality of the program.