# Module 4 - Data Science & Machine Learning

Data science and machine learning are emerging as new and important areas in the industry. In this module, you'll learn the latest and most important big data and machine learning ideas and techniques used within quantitative finance.

### An Introduction to Machine Learning

• What is mathematical modeling?
• Classic tools
• How is machine learning different?
• The pros and cons of delegating to a machine
• Simple methods (reinforcement, unsupervised and supervised learning)

### Data Science in Finance

• Supervised vs unsupervised learning
• Primer on loss functions and essentials you need
• Learning and linear models. Multiple linear regression diagnostics
• Generalised least squares. Mahalanobis distance method
• Dangers of overfitting. Decomposing estimation error

### Classification and Clustering

• K-Nearest Neighbours
• Logistic Classifier and explicit maximum likelihood method
• Principles of Bayesian classification, lowest possible error
• Discriminant Analysis: Linear (LDA) and Quadratic (QDA)
• K-means clustering and hierarchical clustering

### Practical Filtering Methods

• How to specify dynamic systems (states, Markov properties)
• Weighted least squares method
• Time-varying regression estimation (Kalman Filtering)
• Applications of filtering: CAPM betas, continuous-time filtering
• Introduction to Markov Chains. Hidden Markov Models

### Machine Learning & Predictive Analytics

• Regression: linear regression, bias-variance decomposition, subset selection, shrinkage methods, regression in high dimensions
• Support Vector Machines: classification and regression using SVMs and kernel methods
• Dimension reduction: Principal component analysis (PCA), kernel PCA, non-negative matrix decomposition, PageRank
• Examples (2 worked examples)

### AI Based Algo Trading Strategies Using Python

• Basic financial data analysis with Python and pandas
• Creating features and label data from financial time series for market prediction
• Application of classification algorithms from machine learning to predict market movements
• Vectorized backtesting of algorithmic trading strategies based on the predictions
• Risk analysis for the algorithmic trading strategies

### Digital Signal Processing for Finance

• Importance of Signal Processing (SP); Characterisation and classification of signals
• Discrete Time Signals
• Fourier Transforms and z-Transforms
• Continuous-time signals and sampling
• Discrete Fourier Transforms

### Co-Integration using R

• Multivariate time series analysis
• Financial time series: stationary and unit root
• Vector Autoregression, a theory-free model
• Equilibrium and Error Correction Model
• Engle-Granger Procedure
• Cointegrating relationships and their rank
• Estimation of reduced rank regression: Johansen Procedure
• Stochastic modelling of equilibrium: Ornstein-Uhlenbeck process
• Statistical arbitrage using mean reversion

### Machine Learning Lab

• Sandbox: conda, environments, Python and R packages, MLlib. Data sources
• Logistic regression as a classifier: loss function, transition probabilities, softmax and appropriate penalty (Ridge regression)
• Cross-validation: sample selection and reshuffling. Precision and recall. Is the classifier random?
• Support Vector Machines: hyperplane intuition, soft vs hard margin. Choice of kernel to tackle non-linear problems
• Random Forest Classifiers: regression versions of Decision Trees and AdaBoost
• Vignettes on neural nets to predict market returns, probabilistic programming, and Markov-switching GARCH

Lecture order and content may occasionally change due to circumstances beyond our control; however, this will never affect the quality of the program.
