Models for Unexpected Events: BREXIT

Kevin Pratt, Chief Scientist, ZZAlpha, LTD., MSCS, JD
All predictive models use historical information to make forward predictions, so how can they react to the unknown? The implicit assumption is that the future will look like the past, with relatively minor readjustment. That approach can work well when the context is fundamentally unchanged. (In statistics, that unchangedness is called a "stationarity assumption".)
In financial predictions (as in many other contexts), unexpected events occur infrequently and at irregular intervals. See, e.g., "The Black Swan: The Impact of the Highly Improbable", Nassim N. Taleb (2008).
For a financial modeler, the question is how to model for better real-time decisions in reaction to an unknown, unexpected, significant event. (This is a somewhat different question than for the black swan investor who is interested in calculating the time until occurrence, burn rate to stay in the game, and value of the ultimate capture opportunity. See e.g. "The Big Short", Michael Lewis (2010).)
The BREXIT vote illustrates the challenge. The investor looks at the graph below and asks the modeler: "Is your approach (and most current recommendation product) still effective?"
Fundamentally, the investor is asking: a) do you have a model for this new context? b) how good is it? and c) how fast can you "turn it on"? Unfortunately, most modelers would respond: "No; I don't know; and I don't know."
Single optimal models
The most common statistical model is a single, optimized regression model using a small number of "strong" historical features. The underlying assumption of such models is that there is exactly one predominant, underlying causative driver that is reflected in the features and so can be effectively inferred by the model. A BREXIT-style event immediately exposes the brittleness of such models. We have all seen models that effectively continue (after the hurricane) to direct all the cars down the freeway into the flooded tunnel while declaring, "It was the optimal routing for all traffic yesterday."
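The brittleness described above can be sketched in a few lines. This is a minimal, hypothetical illustration (synthetic data, invented coefficients, not the author's actual models): an ordinary least-squares fit on pre-event data keeps extrapolating its old relationship after the underlying regime changes sign.

```python
# Hypothetical sketch: a single regression fit on pre-event history keeps
# applying its old relationship after an unexpected regime change.
import numpy as np

rng = np.random.default_rng(0)

# Pre-event regime: the target follows y = 2*x + small noise.
x_pre = rng.normal(size=500)
y_pre = 2.0 * x_pre + rng.normal(scale=0.1, size=500)

# Fit ordinary least squares on the historical (pre-event) data only.
slope, intercept = np.polyfit(x_pre, y_pre, deg=1)

# Post-event regime: the causative relationship flips sign.
x_post = rng.normal(size=500)
y_post = -2.0 * x_post + rng.normal(scale=0.1, size=500)

# The historically "optimal" model is now badly wrong out of sample.
pre_error = np.mean((slope * x_pre + intercept - y_pre) ** 2)
post_error = np.mean((slope * x_post + intercept - y_post) ** 2)
print(f"pre-event MSE:  {pre_error:.3f}")   # small
print(f"post-event MSE: {post_error:.3f}")  # orders of magnitude larger
```

The model's in-sample error stays tiny while its post-event error explodes, yet nothing in the single-model framework flags that the context has changed.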
A second common approach uses switched multiple models, where some meta-feature(s) decides which model to apply. E.g., the credit card company first determines whether the merchant location is far from the customer's home and, if so, applies its "traveling" model instead of its usual "near home" model to authorize or decline a transaction. This approach depends on the ability to effectively "cluster" common context types into groupings so that segmented models can be built and optimized. The new and unexpected context seldom fits into an a priori cluster and so confounds the switched multiple model approach.
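The credit-card example above can be sketched as a meta-feature switch. The thresholds and model logic here are invented for illustration; the point is structural: the switch can only route to clusters that were anticipated when the segmented models were built.

```python
# Hypothetical sketch of a switched-model authorizer: a meta-feature
# (distance from home) selects which of two pre-built models to apply.
# Thresholds are illustrative assumptions, not real policy.

def near_home_model(amount):
    # Routine local spending: assumed low limit.
    return amount < 500

def traveling_model(amount):
    # Travelers legitimately spend more: assumed higher limit.
    return amount < 2000

def authorize(distance_km, amount):
    """Meta-feature switch: pick the segmented model by distance from home."""
    if distance_km < 100:
        return near_home_model(amount)
    return traveling_model(amount)

# Anticipated contexts behave as designed...
print(authorize(5, 120))      # near-home purchase: approved
print(authorize(3000, 1500))  # travel purchase: approved
# ...but a genuinely new context that fits neither a priori cluster is
# still forced through one of the two models, with no warning that the
# clustering itself no longer describes the world.
```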
The third common approach is to resurrect a "best" model from long ago whose model input values "best match" (modelers sometimes say "nearest neighbor") the values available from the current, unexpected event. The difficulty is that the typical inference, e.g., "this looks just like the beginning of the Great Depression", rests on invalid stationarity assumptions about similarity of legal regime, world trade, industrial motivations, commercial alliances, etc. Those assumptions and contexts are typically unstated and were not part of the data (or are no longer available) used in those long-ago models.
Multi-causative learning models
A more sophisticated (and more difficult to implement) approach is a multi-causative learning model. Core to a learning model is the continual updating of inferences as outcomes attributed to earlier inputs are added to the historic knowledge. Multi-causative refers to a model technology that accommodates disparate, concurrent causations for an outcome (unlike regression models), e.g., a stock price dropped because of the general market, public news, private news, AND/OR a major investor's need to obtain funds by selling.
For learning models, the challenge is to quickly and correctly incorporate the results associated with changed context inputs. Obviously, if an investment horizon (duration between purchase and sale) is, e.g., one month, the result of a purchase today (soon after the BREXIT vote) will not be known for one month, and nothing can be learned about the success of today's purchase of a specific security until then. Also obvious is that the price during that duration will bounce around (called "noise" in signal processing), and a downward (or upward) spike may give false preliminary results. The advantage of a multi-causative model is that it has many more available understandings of historical sudden, localized non-stationarities that can immediately approximate the current unexpected situation.
We call the time for a result to be incorporated into learning the "turning radius." The notion is that the first unexpected input may be a fluke, the second helps establish that a non-stationarity is persistent, the third adds some confidence, etc., and it is not until the results are known that the model can re-establish itself with great confidence. Of course, a learning model in a dynamic system such as the financial markets is constantly "turning" as input values reassemble into new patterns. In practice, we find a model that turns too quickly can generate poorer long-term results than one that learns more cautiously (even setting aside transaction costs).
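The trade-off between turning quickly and turning cautiously can be illustrated with a simple online estimator. This is not ZZAlpha's method, just a minimal sketch: an exponentially weighted running estimate whose learning rate plays the role of the "turning radius" when the underlying signal breaks.

```python
# Hypothetical sketch of "turning radius": the learning rate alpha
# controls how fast an online estimate re-establishes itself after a
# regime break. Values are illustrative.

def online_mean(observations, alpha):
    """Exponentially weighted running estimate; higher alpha turns faster."""
    estimate = observations[0]
    for obs in observations[1:]:
        estimate = (1 - alpha) * estimate + alpha * obs
    return estimate

# Regime break: the true signal jumps from 0.0 to 1.0 halfway through.
stream = [0.0] * 20 + [1.0] * 20

fast = online_mean(stream, alpha=0.5)   # turns quickly, but chases noise too
slow = online_mean(stream, alpha=0.05)  # turns cautiously, lags the break
print(f"fast learner after break: {fast:.3f}")
print(f"slow learner after break: {slow:.3f}")
```

The fast learner has nearly converged to the new level after the break, while the cautious one is still partway there; on noisy data the same fast learner would whipsaw on every spike, which is the long-term cost the text describes.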
Our ZZAlpha analytics include modeling of turning radius. During the apparently uncertain period while it is "turning", the multi-causative model applies its accumulated historical learnings about local non-stationarities. In essence, it does the best that can be immediately done.
A knee-jerk reaction to an unexpected event is to "get out", i.e., disregard models until the dust settles. The difficulty with that "intuitive" approach is that humans produce poor results when applying it in financial markets. Human-based "turning radius" is worse than that of automated learning models (humans stay "snake-bit" longer).
Solid, profitable recommendations in a turbulent world
So, in conclusion, what does a multi-causative learning model say to the questions? a) do you have a model for this new context? b) how good is it? and c) how fast can you "turn it on"?
It says it has a model prepared, it has a credible track record of how well it will likely do based on many prior localized "nearest neighbors" results, and it is already running in real-time. That is about as good as it gets in a turbulent world.
Because ZZAlpha implemented multi-causative learning models over 5 years ago, we have a massive historical store of experience and a certified track record. The future is always uncertain, but - -
Our models are ready for this BREXIT (and future ones) - - although we never saw it coming.
ConservativeStockPicks.com is an imprint of ZZAlpha Ltd.
Kevin B. Pratt has been Chief Scientist at ZZAlpha LTD. since 2010
He was also Lead, Anti-Fraud Analytics, Sr. Analytics Scientist in the Teradata Big Data and Advanced Analytics Group until June 2016.
There was and is no connection whatsoever between ZZAlpha LTD. and Teradata Corp.