Table of Contents
Cover
CHAPTER 1: Do Algorithms Dream About Artificial Alphas?
1.1 INTRODUCTION
1.2 REPLICATION OR REINVENTION
1.3 REINVENTION WITH MACHINE LEARNING
1.4 A MATTER OF TRUST
1.5 ECONOMIC EXISTENTIALISM: A GRAND DESIGN OR
AN ACCIDENT?
1.6 WHAT IS THIS SYSTEM ANYWAY?
1.7 DYNAMIC FORECASTING AND NEW
METHODOLOGIES
1.8 FUNDAMENTAL FACTORS, FORECASTING AND
MACHINE LEARNING
1.9 CONCLUSION: LOOKING FOR NAILS
NOTES
CHAPTER 2: Taming Big Data
2.1 INTRODUCTION: ALTERNATIVE DATA – AN
OVERVIEW
2.2 DRIVERS OF ADOPTION
2.3 ALTERNATIVE DATA TYPES, FORMATS AND
UNIVERSE
2.4 HOW TO KNOW WHAT ALTERNATIVE DATA IS
USEFUL (AND WHAT ISN'T)
2.5 HOW MUCH DOES ALTERNATIVE DATA COST?
2.6 CASE STUDIES
2.7 THE BIGGEST ALTERNATIVE DATA TRENDS
2.8 CONCLUSION
REFERENCE
NOTES
CHAPTER 3: State of Machine Learning Applications in
Investment Management
3.1 INTRODUCTION
3.2 DATA, DATA, DATA EVERYWHERE
3.3 SPECTRUM OF ARTIFICIAL INTELLIGENCE
APPLICATIONS
3.4 INTERCONNECTEDNESS OF INDUSTRIES AND
ENABLERS OF ARTIFICIAL INTELLIGENCE
3.5 SCENARIOS FOR INDUSTRY DEVELOPMENTS
3.6 FOR THE FUTURE
3.7 CONCLUSION
REFERENCES
NOTES
CHAPTER 4: Implementing Alternative Data in an Investment
Process
4.1 INTRODUCTION
4.2 THE QUAKE: MOTIVATING THE SEARCH FOR
ALTERNATIVE DATA
4.3 TAKING ADVANTAGE OF THE ALTERNATIVE DATA
EXPLOSION
4.4 SELECTING A DATA SOURCE FOR EVALUATION
4.5 TECHNIQUES FOR EVALUATION
4.6 ALTERNATIVE DATA FOR FUNDAMENTAL
MANAGERS
4.7 SOME EXAMPLES
4.8 CONCLUSIONS
REFERENCES
CHAPTER 5: Using Alternative and Big Data to Trade Macro
Assets
5.1 INTRODUCTION
5.2 UNDERSTANDING GENERAL CONCEPTS WITHIN BIG
DATA AND ALTERNATIVE DATA
5.3 TRADITIONAL MODEL BUILDING APPROACHES AND
MACHINE LEARNING
5.4 BIG DATA AND ALTERNATIVE DATA: BROAD BASED
USAGE IN MACRO BASED TRADING
5.5 CASE STUDIES: DIGGING DEEPER INTO MACRO
TRADING WITH BIG DATA AND ALTERNATIVE DATA
5.6 CONCLUSION
REFERENCES
CHAPTER 6: Big Is Beautiful: How Email Receipt Data Can Help
Predict Company Sales
6.1 INTRODUCTION
6.2 QUANDL'S EMAIL RECEIPTS DATABASE
6.3 THE CHALLENGES OF WORKING WITH BIG DATA
6.4 PREDICTING COMPANY SALES
6.5 REAL TIME PREDICTIONS
6.6 A CASE STUDY: SALES
REFERENCES
NOTES
CHAPTER 7: Ensemble Learning Applied to Quant Equity:
Gradient Boosting in a Multifactor Framework
7.1 INTRODUCTION
7.2 A PRIMER ON BOOSTED TREES
7.3 DATA AND PROTOCOL
7.4 BUILDING THE MODEL

7.5 RESULTS AND DISCUSSION
7.6 CONCLUSION
REFERENCES
NOTES
CHAPTER 8: A Social Media Analysis of Corporate Culture
8.1 INTRODUCTION
8.2 LITERATURE REVIEW
8.3 DATA AND SAMPLE CONSTRUCTION
8.4 INFERRING CORPORATE CULTURE
8.5 EMPIRICAL RESULTS
8.6 CONCLUSION
REFERENCES
CHAPTER 9: Machine Learning and Event Detection for Trading
Energy Futures
9.1 INTRODUCTION
9.2 DATA DESCRIPTION
9.3 MODEL FRAMEWORK
9.4 PERFORMANCE
9.5 CONCLUSION
REFERENCES
NOTES
CHAPTER 10: Natural Language Processing of Financial News
10.1 INTRODUCTION
10.2 SOURCES OF NEWS DATA
10.3 PRACTICAL APPLICATIONS
10.4 NATURAL LANGUAGE PROCESSING
10.5 DATA AND METHODOLOGY
10.6 CONCLUSION

REFERENCES
CHAPTER 11: Support Vector Machine Based Global Tactical
Asset Allocation
11.1 INTRODUCTION
11.2 FIFTY YEARS OF GLOBAL TACTICAL ASSET
ALLOCATION
11.3 SUPPORT VECTOR MACHINE IN THE ECONOMIC
LITERATURE
11.4 A SVR BASED GTAA
11.5 CONCLUSION
REFERENCES
CHAPTER 12: Reinforcement Learning in Finance
12.1 INTRODUCTION
12.2 MARKOV DECISION PROCESSES: A GENERAL
FRAMEWORK FOR DECISION MAKING
12.3 RATIONALITY AND DECISION MAKING UNDER
UNCERTAINTY
12.4 MEAN VARIANCE EQUIVALENCE
12.5 REWARDS
12.6 PORTFOLIO VALUE VERSUS WEALTH
12.7 A DETAILED EXAMPLE
12.8 CONCLUSIONS AND FURTHER WORK
REFERENCES
CHAPTER 13: Deep Learning in Finance: Prediction of Stock
Returns with Long Short Term Memory Networks
13.1 INTRODUCTION
13.2 RELATED WORK
13.3 TIME SERIES ANALYSIS IN FINANCE

13.4 DEEP LEARNING
13.5 RECURRENT NEURAL NETWORKS
13.6 LONG SHORT TERM MEMORY NETWORKS
13.7 FINANCIAL MODEL
13.8 CONCLUSIONS
Appendix A
REFERENCES
Biography
CHAPTER 1
CHAPTER 2
CHAPTER 3
CHAPTER 4
CHAPTER 5
CHAPTER 6
CHAPTER 7
CHAPTER 8
CHAPTER 9
CHAPTER 10
CHAPTER 11
CHAPTER 12
CHAPTER 13
End User License Agreement


List of Tables
Chapter 2
Table 2.2 Key criteria for assessing alternative data
usefulness

Chapter 4
Table 4.1 Average annualized return of dollar neutral,
equally weighted portf...
Table 4.2 Do complaints count predicts returns?
Table 4.3 The average exposure to common risk factors by
quintile
Table 4.4 Regression approach to explain the cross section of
return volatili...
Table 4.5 Complaints factor: significant at the 3% or better
level every year
Chapter 7
Table 7.1 Summary and examples of features per family type
Table 7.2 Analytics
Chapter 8
Table 8.1 Descriptive statistics on the user profiles of
Glassdoor.com
Table 8.2 Summary statistics of Glassdoor.com dataset
Table 8.3 Regression of reviewers' overall star ratings
Table 8.4 Topic clusters inferred by the topic model
Table 8.5 Illustrative examples of reviewer comments
Table 8.6 Descriptive statistics of firm characteristics
Table 8.7 Regression of company characteristics for
performance orientated fi...


Table 8.8 Regression of performance orientated firms and
firm value
Table 8.9 Regression of performance orientated firms and
earnings surprises
Chapter 9

Table 9.1 Performance statistics
Table 9.2 Summary statistics for RavenPack Analytics
Table 9.3 In sample performance statistics.
Table 9.4 Out of sample performance statistics
Table 9.5 Out of sample performance statistics
Table 9.6 Performance statistics
Chapter 10
Table 10.1 Fivefold cross validated predictive performance
results for the Ne...
Chapter 11
Table 11.1 Universe traded
Chapter 13
Table 13.1 Experiment 1: comparison of performance
measured as the HR for LST...
Table 13.2 Experiment 2 (main experiment)
Table 13.3 Experiment 2 (baseline experiment)
Table 13.4 Experiment 2 (stocks used for this portfolio)
Table 13.5 Experiment 2 (results in different market regimes)
Table 13.A.1 Periods for training set, test set and live dataset
in experimen...


List of Illustrations
Chapter 2
Figure 2.1 The law of diffusion of innovation.
Figure 2.2 Spending on alternative data.
Figure 2.3 Alternative dataset types.
Figure 2.4 Breakdown of alternative data sources used by the
buy side.
Figure 2.5 Breakdown of dataset's annual price.

Figure 2.6 Neudata's rating for medical record dataset.
Figure 2.7 Neudata's rating for Indian power generation
dataset.
Figure 2.8 Neudata's rating for US earnings performance
forecast.
Figure 2.9 Neudata's rating for China manufacturing dataset.
Figure 2.10 Neudata's rating for short positions dataset.
Figure 2.11 Carillion's average net debt.
Figure 2.12 Neudata's rating for short positions dataset.
Figure 2.13 Neudata's rating for invoice dataset.
Figure 2.14 Neudata's rating for salary benchmarking dataset.
Figure 2.15 Ratio of CEO total compensation vs employee
average, 2017.
Figure 2.16 Neudata's rating for corporate governance
dataset.
Chapter 3
Figure 3.1 AI in finance classification
Figure 3.2 Deep Learning Framework Example


Figure 3.3 Equity performance and concentration in portfolio
Figure 3.4 Evolution of Quant Investing
Chapter 4
Figure 4.1 Technology Adoption Lifecycle
Figure 4.2 Cumulative residual returns to blogger
recommendations.
Figure 4.3 Annualized return by TRESS bin.
Figure 4.4 TRESS gross dollar neutral cumulative returns.
Figure 4.5 alpha DNA's Digital Bureau.
Figure 4.6 Percentage revenue beat by DRS decile.

Figure 4.7 DRS gross dollar neutral cumulative returns.
Figure 4.8 Cumulative gross local currency neutral returns.
Figure 4.9 Percentile of volatility, by complaint frequency.
Chapter 5
Figure 5.1 Structured dataset – Hedonometer Index.
Figure 5.2 Scoring of words.
Figure 5.3 Days of the week – Hedonometer Index.
Figure 5.4 Bloomberg nonfarm payrolls chart.
Figure 5.5 Fed index vs recent USD 10Y yield changes.
Figure 5.6 USD/JPY Bloomberg score.
Figure 5.7 News basket trading returns.
Figure 5.8 Regressing news volume vs implied volatility.
Figure 5.9 Plot of VIX versus IAI.
Figure 5.10 Trading S&P 500 using IAI based rule vs VIX and
long only.
Figure 5.11 Implied distribution of GBP/USD around Brexit.
Chapter 6


Figure 6.1 Domino's Pizza sales peak at weekends…
Figure 6.2 …and at lunchtime.
Figure 6.3 Most popular pizza toppings: the pepperoni effect.
Figure 6.4 Amazon customers prefer Mondays…
Figure 6.5 …and take it easy at the weekend.
Figure 6.6 How an email receipt is turned into purchase
records.
Figure 6.7 The structure of Quandl's data offering.
Figure 6.8 Sample size over time.
Figure 6.9 Geographic distribution as of April 2017.
Figure 6.10 Coverage of US population on a state by state
basis as of April 2...
Figure 6.11 How long does a user typically spend in our
sample?
Figure 6.12 Six of the most expensive purchases made on
Amazon.com.
Figure 6.13 Seasonal pattern in fundamental data: Amazon's
quarterly sales.
Figure 6.14 Seasonal patterns in big data: Amazon's weekly
sales. The sales i...
Figure 6.15 Expedia's big data bookings split has changed
significantly over ...
Figure 6.16 A timeline for quarterly sales forecasts.
Figure 6.17 Bayesian estimation of quarterly revenue growth:
An example. The ...
Figure 6.18 Negative exponential distribution.
Figure 6.19 Dividing each quarter into 13 weeks.
Figure 6.20 Seasonal patterns in big data: Amazon's weekly
sales. The sales i...


Figure 6.21 Estimated seasonal component, Q1.
Figure 6.24 Estimated seasonal component, Q4.
Figure 6.25 Sales breakdown per type, Amazon.
Figure 6.26 Sales breakdown per region, Amazon.
Figure 6.27 Contributions to sales growth in Q1.
Figure 6.30 Contributions to sales growth in Q4.
Figure 6.31 e commerce vs. headline growth.
Figure 6.32 Headline growth vs. growth in North America.
Figure 6.33 Combining big data and consensus delivers
superior forecasts of t...

Figure 6.34 Improving forecasting ability as the sample size
increases. The p...
Figure 6.35 Big data can be used to predict sales…
Figure 6.36 …and sales surprises.
Figure 6.37 In sample vs. actual sales growth.
Figure 6.38 The results are robust. The data covers the period
2014Q2–2017Q1....
Figure 6.39 Real time prediction of sales growth in 2016 Q2.
The shaded area ...
Figure 6.40 Real time prediction of sales growth in 2016 Q3.
Figure 6.42 Real time prediction of sales growth in 2017 Q1.
Chapter 7
Figure 7.1 Two symbolic trees. Variations in the dependent
variable (y) are ...
Figure 7.2 Hierarchical clustering for rank correlation
between variable. Ran...
Figure 7.3 Fivefold cross validation for tree boosted models.
We maintain all...


Figure 7.4 Confusion matrix illustration. We explain the
confusion matrix in ...
Figure 7.5 Top 20 most important variables. We show the
most important variab...
Figure 7.6 Wealth curve for decile portfolios based on
multifactor signal.
Figure 7.8 Wealth curve for decile portfolios based on the
machine learning m...
Figure 7.9 Annualized performance comparison for each
decile of each model.

Chapter 8
Figure 8.1 Illustrative examples of Glassdoor reviews.
Figure 8.2 Illustrative example of topic modelling. A topic
model assumes tha...
Chapter 9
Figure 9.1 Relative variable importance using ELNET.
Features are scaled by t...
Figure 9.2 Cumulative log returns. The red vertical line marks
the beginning ...
Figure 9.3 Out of sample information ratios. The names on
the x axes specify ...
Figure 9.4 Cumulative log returns.
Figure 9.5 Out of sample performance statistics with
Ensemble.
Chapter 10
Figure 10.1 The NLP pipeline from preprocessing to feature
representation an...
Figure 10.2 Flow of inference into decision and action.
Figure 10.3 Example receiver operator characteristics (ROC)
and precision rec...


Chapter 11
Figure 11.1 Three families of asset allocation.
Figure 11.2 The kernel trick
Figure 11.3 The kernel trick: a non separable case
Figure 11.4 SVR GTAA compared to 60% bond, 40% equity
(non compounded arithme...
Figure 11.5 SVR GTAA compared to 60% bond, 40% equity
(non compounded arithme...

Chapter 12
Figure 12.1 Interacting system: agent interacts with
environment.
Figure 12.2 Cumulative simulated out of sample P/L of
trained model. Simulate...
Chapter 13
Figure 13.1 Recurrent neural network unrolled in time.
Figure 13.2 The rectified linear unit (ReLu) and sigmoid
functions.
Figure 13.3 Memory cell or hidden unit in an LSTM recurrent
neural network.
Figure 13.4 LSTM recurrent neural network unrolled in time.
s for the cell st...


Founded in 1807, John Wiley & Sons is the oldest independent
publishing company in the United States. With offices in North
America, Europe, Australia, and Asia, Wiley is globally committed to
developing and marketing print and electronic products and services
for our customers' professional and personal knowledge and
understanding.
The Wiley Finance series contains books written specifically for
finance and investment professionals as well as sophisticated
individual investors and their financial advisors. Book topics range
from portfolio management to e commerce, risk management,
financial engineering, valuation and financial instrument analysis, as
well as much more.
For a list of available titles, visit our website at
www.WileyFinance.com.



Big Data and Machine Learning
in Quantitative Investment
TONY GUIDA


© 2019 John Wiley & Sons, Ltd
Registered office
John Wiley & Sons Ltd, The Atrium, Southern Gate, Chichester, West Sussex, PO19 8SQ,
United Kingdom
For details of our global editorial offices, for customer services and for information about how
to apply for permission to reuse the copyright material in this book please see our website at
www.wiley.com.
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system,
or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording
or otherwise, except as permitted by the UK Copyright, Designs and Patents Act 1988, without
the prior permission of the publisher.
Wiley publishes in a variety of print and electronic formats and by print on demand. Some
material included with standard print versions of this book may not be included in e books or
in print on demand. If this book refers to media such as a CD or DVD that is not included in
the version you purchased, you may download this material at .
For more information about Wiley products, visit www.wiley.com.
Designations used by companies to distinguish their products are often claimed as
trademarks. All brand names and product names used in this book are trade names, service
marks, trademarks or registered trademarks of their respective owners. The publisher is not
associated with any product or vendor mentioned in this book.
Limit of Liability/Disclaimer of Warranty: While the publisher and author have used their
best efforts in preparing this book, they make no representations or warranties with respect to
the accuracy or completeness of the contents of this book and specifically disclaim any implied
warranties of merchantability or fitness for a particular purpose. It is sold on the
understanding that the publisher is not engaged in rendering professional services and neither
the publisher nor the author shall be liable for damages arising herefrom. If professional
advice or other expert assistance is required, the services of a competent professional should
be sought.
Library of Congress Cataloging in Publication Data is Available:
ISBN 9781119522195 (hardback) ISBN 9781119522218 (ePub)
ISBN 9781119522089 (ePDF)
Cover Design: Wiley
Cover Images: © Painterr/iStock /Getty Images;
© monsitj/iStock /Getty Images


CHAPTER 1
Do Algorithms Dream About Artificial
Alphas?
Michael Kollo


1.1 INTRODUCTION
The core of most financial practice, whether drawn from equilibrium
economics, behavioural psychology, or agency models, is traditionally
formed through the marriage of elegant theory and a kind of ‘dirty’
empirical proof. As I learnt from my years on the PhD programme at
the London School of Economics, elegant theory is the hallmark of a
beautiful intellect, one that could discern the subtle tradeoffs in
agent-based models, form complex equilibrium structures and point to
the sometimes conflicting paradoxes at the heart of conventional truths.
‘Dirty’ empirical work, by contrast, is often regarded with suspicion,
yet reluctantly acknowledged as necessary to give substance and
real-world application. I recall many conversations in the windy
courtyards and narrow passageways, with brilliant PhD students
wrangling over questions of ‘but how can I find a test for my
hypothesis?’.
Many pseudo-mathematical frameworks have come and gone in
quantitative finance, usually borrowed from nearby sciences:
thermodynamics from physics, Itô's Lemma, information theory,
network theory, assorted parts from number theory, and occasionally
from less high-tech but reluctantly acknowledged social sciences like
psychology. They have come, and they have gone, absorbed (not
defeated) by the markets.
Machine learning, and extreme pattern recognition, offer a strong
focus on large-scale empirical data, transformed and analyzed at a
scale never seen before, in search of patterns that lay undetectable
to previous inspection. Interestingly, machine learning offers very
little by way of conceptual framework. In some circles, it boasts that
the absence of a conceptual framework is its strength and removes the
human bias that would otherwise limit a model. Whether you feel it is
a good tool or not, you have to respect the notion that processing is
only getting faster and more powerful. We may call it neural networks
or something else tomorrow, and we will eventually reach a point
where most if not all permutations of patterns can be discovered and
examined in close to real time, at which point the focus will be almost
exclusively on defining the objective function rather than the structure
of the framework.
The rest of this chapter is a set of observations and examples of how
machine learning could help us learn more about financial markets,
and is doing so. It is drawn not only from my experience, but from
many conversations with academics, practitioners, computer
scientists, and from volumes of books, articles, podcasts and the vast
sea of intellect that is now engaged in these topics.
It is an incredible time to be intellectually curious and quantitatively
minded, and we at best can be effective conduits for the future
generations to think about these problems in a considered and
scientific manner, even as they wield these monolithic technological
tools.


1.2 REPLICATION OR REINVENTION
The quantification of the world is again a fascination of humanity.
Quantification here is the idea that we can break down patterns that
we observe as humans into component parts and replicate them over
much larger observations, and in a much faster way. The foundations
of quantitative finance found their roots in investment principles, or
observations, made by generations and generations of astute investors,
who recognized these ideas without the help of large scale data.
The early ideas of factor investing and quantitative finance were
replications of these insights; they did not themselves invent
investment principles. The ideas of value investing (component
valuation of assets and companies) are concepts that have been
studied and understood for many generations. Quantitative finance
took these ideas, broke them down, took the observable and scalable
elements and spread them across a large number of (comparable)
companies.
The cost of achieving scale is still the complexity and nuance of
applying a specific investment insight to a specific company, but
these nuances were assumed to diversify away in a larger-scale
portfolio, and were, and still are, largely overlooked.1 The relationship
between investment insights and future returns was replicated as a
linear relationship between exposure and returns, with little attention
to non-linear dynamics or complexities; the focus was instead on
diversification and large-scale application, which were regarded as
better outcomes for modern portfolios.
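
The linear replication described above can be illustrated with a cross-sectional regression of stock returns on factor exposures. What follows is a minimal sketch on simulated data, not the author's method: the number of stocks, the three factors, the "true" factor returns, the noise level and all variable names are assumptions chosen purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

n_stocks, n_factors = 500, 3

# Hypothetical standardized exposures of each stock to each factor
# (for instance value, momentum, size); simulated, not real data.
exposures = rng.standard_normal((n_stocks, n_factors))

# Assumed "true" factor returns for a single period.
true_factor_returns = np.array([0.02, -0.01, 0.005])

# Stock returns are linear in the exposures plus stock-specific noise.
specific_noise = 0.01 * rng.standard_normal(n_stocks)
stock_returns = exposures @ true_factor_returns + specific_noise

# Ordinary least squares across stocks recovers the factor returns.
estimated_factor_returns, *_ = np.linalg.lstsq(
    exposures, stock_returns, rcond=None
)

print(np.round(estimated_factor_returns, 4))
```

Because the simulated stock-specific noise is independent across stocks, it largely averages out across the 500 names, which is the diversification argument in miniature. Any non-linear dependence of returns on exposures would be invisible to this specification.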
There was, however, a subtle recognition of co-movement and
correlation that emerged from the early factor work, and it is now at
the core of modern risk management techniques. The idea is that
stocks that have common characteristics (let's call it a quantified
investment insight) also exhibit correlation and co-dependence,
potentially on macro-style factors.
This small observation, in my opinion, is actually a reinvention of the
investment world, which up until then, and in many circles still,
thought about stocks in isolation, valuing and appraising them as if
they were standalone private equity investments. It was a reinvention
because it moved the object of focus from an individual stock to a
common ‘thread’ or factor that linked many stocks that individually
had no direct business relationship, but still had a similar
characteristic that could mean that they would be bought and sold
together. The ‘factor’ link became the objective of the investment
process, and its identification and improvement became the objective
of many investment processes; now (in the late 2010s) it is seeing
another renaissance of interest. Importantly, we began to see the
world as a series of factors, some transient, some long-standing, some
short-term and some long-term forecasting, some providing risk and to
be removed, and some providing risky returns.
Factors represented the invisible (but detectable) threads that wove
the tapestry of global financial markets. While we (quantitative
researchers) searched to discover and understand these threads, much
of the world focused on the visible world of companies, products and
periodic earnings. We painted the world as a network, where
connections and nodes were the most important, while others painted
it as a series of investment ideas and events.
The reinvention was in a shift in the object of interest, from individual
stocks to a series of network relationships, and their ebb and flow
through time. It was as subtle as it was severe, and is probably still not
fully understood.2 Good factor timing models are rare, and there is an
active debate about how to think about timing at all. Contextual factor
models are even more rare and pose especially interesting areas for
empirical and theoretical work.


1.3 REINVENTION WITH MACHINE LEARNING
Machine learning offers a similar opportunity to reinvent the way we
think about the financial markets, I think both in how we identify the
investment object and in how we think of the financial networks.
Allow me a simple analogy as a thought exercise. In handwriting or
facial recognition, we as humans look for certain patterns to help us
understand the world. On a conscious, perceptive level, we look to see
patterns in the face of a person, in their nose, their eyes and their
mouth. In this example, the objects of perception are those units, and
we appraise their similarity to others that we know. Our pattern
recognition then functions on a fairly low dimension in terms of
components. We have broken down the problem into a finite set of
grouped information (in this case, the features of the face), and we
appraise those categories. In modern machine learning techniques, the
face or a handwritten number is broken down into much smaller and
therefore more numerous components. In the case of a handwritten
number, for example, the pixels of the picture are converted to
numeric representations, and the patterns in the pixels are sought
using a deep learning algorithm.
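
To make the pixel-level representation concrete, here is a minimal sketch of the conversion step only. The 5 × 5 glyph, the `glyph_to_pixels` helper and the 0/1 encoding are hypothetical choices for illustration; a real handwriting dataset would use larger greyscale images, and the deep learning algorithm that searches for patterns is not shown.

```python
import numpy as np

# A hypothetical 5 x 5 "handwritten" digit one: '#' is ink, '.' is blank.
GLYPH = [
    "..#..",
    ".##..",
    "..#..",
    "..#..",
    ".###.",
]


def glyph_to_pixels(rows):
    """Convert rows of '#'/'.' characters into a 2D array of 1.0/0.0 pixels."""
    return np.array(
        [[1.0 if ch == "#" else 0.0 for ch in row] for row in rows]
    )


pixels = glyph_to_pixels(GLYPH)

# Flatten the grid into a single feature vector, one entry per pixel.
# A learning algorithm would see this vector rather than 'strokes' or 'loops'.
features = pixels.reshape(-1)

print(pixels.shape, features.shape)
```

The point of the sketch is the change in dimensionality: a human describes the digit with a handful of features, whereas the algorithm receives one number per pixel and has to discover the structure itself.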
We have incredible tools to take large-scale data and to look for
patterns at the sub-atomic level of our sample. In the case of human
faces or numbers, and many other things, we can find these patterns
through complex representations that are no longer intuitive or
understandable by us (consciously); they do not identify a nose, or an
eye, but look for patterns in deep folds of the information.3 Sometimes
the tools can be much more efficient and find patterns better and
quicker than we can, without our intuition being able to keep up.
Taking this analogy to finance, much of asset management concerns
itself with financial (fundamental) data, like income statements,
balance sheets, and earnings. These items effectively characterize a
company, in the same way the major patterns of a face may
characterize a person. If we take these items, of which we may have a
few hundred, and use them in a large-scale algorithm such as machine
learning, we may find that we are already constraining ourselves
heavily before we have begun.
The ‘magic’ of neural networks comes in their ability to recognize
patterns in atomic (e.g. pixel-level) information, and by feeding them
higher constructs, we may already be constraining their ability to find
new patterns, that is, patterns beyond those already identified by us in
linear frameworks. Reinvention lies in our ability to find new
constructs and more ‘atomic’ representations of investments to allow
these algorithms to better find patterns. This may mean moving away
from the reported quarterly or annual financial accounts, perhaps
using higher-frequency indicators of sales and revenue (relying on
alternative data sources), as a way to find higher-frequency and,
potentially, more connected patterns with which to forecast price
movements.
Reinvention through machine learning may also mean turning our
attention to modelling financial markets as a complex (or just
expansive) network, where the dimensionality of the problem is
potentially explosively high and prohibitive for our minds to work
with. To estimate a single dimension of a network is to effectively
estimate a covariance matrix of n × n. Once we make this system
endogenous, many of the links within the 2D matrix become a
function of other links, in which case the model is recursive, and
iterative. And this is only in two dimensions. Modelling the financial
markets like a neural network has been attempted with limited
application, and more recently the idea of supply chains is gaining
popularity as a way of detecting the fine strands between companies.
Alternative data may well open up new explicitly observable links
between companies, in terms of their business dealings, that can form
the basis of a network, but it is more likely that prices will move too
fast, and too much, to be simply determined by average supply
contracts.
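
The dimensionality point about the n × n covariance matrix can be made concrete with a short sketch. The simulated returns, the sample sizes and the helper name `distinct_covariance_entries` are assumptions for illustration; the only facts relied on are that a covariance matrix is symmetric, so it has n(n + 1)/2 distinct entries, and that NumPy's `np.cov` estimates it from a sample.

```python
import numpy as np


def distinct_covariance_entries(n_assets: int) -> int:
    """Number of distinct entries in a symmetric n x n covariance matrix."""
    return n_assets * (n_assets + 1) // 2


rng = np.random.default_rng(1)
n_periods, n_assets = 250, 50

# Simulated returns: rows are periods, columns are assets (illustrative only).
returns = 0.01 * rng.standard_normal((n_periods, n_assets))

# rowvar=False tells NumPy that each column is a variable (an asset).
cov = np.cov(returns, rowvar=False)

print(cov.shape)
for n in (50, 500, 5000):
    print(n, distinct_covariance_entries(n))
```

The number of entries to estimate grows quadratically with the number of assets while the available history grows only linearly, which is one way to see why even a single 'layer' of the network is already a demanding estimation problem.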

