Tuesday, October 10, 2006
The Case for a BlackBox Approach to Modeling the Financial Markets
It is an accepted principle that in attempting to build a model of something in the real world, one should have a deep knowledge of the subject being modeled. Domain expertise is crucial for the choice of input variables, for a clear statement of the relationships between the variables, and for quantifying the coefficients [parameters] of each equation of the model. In recent times, however, blackbox approaches to modeling have received serious study. A blackbox model is one where the choice of inputs need not be restricted to those justified by the theoretical principles of the subject being modeled. A clear statement of the relationship between input variables is not absolutely necessary; it is left to the blackbox to work out the relationship. Non-parametric machine-learning methods such as neural nets, fuzzy logic and genetic algorithms are used to classify, detect patterns or predict. What comes out of the blackbox is accepted, and there is no need to ask why.
A blackbox approach may seem frivolous, or indeed heretical, to traditional modelers. But there are some arguments to justify it. Here we will talk about its place in the modeling of financial markets, specifically the stock market. Stock markets are really complex beasts which legions of economists past and present have failed to understand fully, and certainly have not understood well enough to predict. Like the weather, the stock market is a Complex Adaptive System, in which very small differences can, through feedback loops, produce results which are completely unpredictable. All the elements of Chaos theory, with its strange attractors, self-similarity and self-organization, are at work here. Elements of pure randomness also exist, as well as deterministic elements such as auto-correlation, cycles and trends.
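The sensitivity to very small differences can be illustrated with the logistic map, a standard textbook example of a chaotic feedback loop. This is only a sketch and not a market model: the parameter r = 4.0, the starting value 0.2 and the nudge of 1e-10 are arbitrary choices made for illustration.

```python
def logistic_path(x0, steps, r=4.0):
    """Iterate the logistic map x -> r * x * (1 - x), a simple feedback loop."""
    path = [x0]
    for _ in range(steps):
        path.append(r * path[-1] * (1.0 - path[-1]))
    return path


# Two starting points that differ by one part in ten billion (illustrative values).
base = logistic_path(0.2, 60)
nudged = logistic_path(0.2 + 1e-10, 60)

# Absolute gap between the two paths at every step.
gaps = [abs(a - b) for a, b in zip(base, nudged)]

for step in (0, 10, 20, 30, 40, 50, 60):
    print(f"step {step:2d}: gap = {gaps[step]:.3e}")
```

The gap stays tiny for the first few steps and then grows until the two paths bear no resemblance to each other, even though the rule generating them is fully deterministic.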
Stock markets behave differently at different times, being dependent on the psychology of the market players [put simply, their mood and how they perceive a situation]. A rise in interest rates from 1% to 2% can have different effects from a rise from 2% to 3%, and the mood of the market players at the time of the change in rates also counts. Nor is the effect of a rise from 1% to 2% necessarily the mirror image of a fall from 2% to 1%. And at different times, a variable may or may not be a significant input to the model.
Exactly the same news that at one time is viewed as positive for the economy can be perceived at another time as negative. Sometimes high employment is viewed as good for the market because it increases the profits of companies. At other times the same employment figures might be viewed as inflationary and likely to raise the demon of stock markets: interest rates. Countless other examples show how the same glass can be seen as half-full or half-empty.
Attempts to model the ever-changing character of the market have been made using regime-switching or threshold models. But so far these have had limited success, as at times the market has changed into a totally different animal that was not envisaged by the model.
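For readers unfamiliar with threshold models, here is a minimal sketch of the idea, assuming the simplest possible case: two regimes, one lagged variable, and a threshold that is known in advance. The function names, the coefficients and the threshold are all hypothetical, and real threshold models also estimate the threshold itself and include noise terms.

```python
def fit_threshold_ar(series, threshold):
    """Fit y[t] = phi * y[t-1] separately for each regime.

    The regime is "low" when y[t-1] <= threshold, otherwise "high".
    Each phi is a least-squares slope through the origin.
    """
    sums = {"low": [0.0, 0.0], "high": [0.0, 0.0]}  # [sum(prev*curr), sum(prev*prev)]
    for prev, curr in zip(series, series[1:]):
        regime = "low" if prev <= threshold else "high"
        sums[regime][0] += prev * curr
        sums[regime][1] += prev * prev
    return {name: (sxy / sxx if sxx else 0.0) for name, (sxy, sxx) in sums.items()}


def simulate(phi_low, phi_high, threshold, start, steps):
    """Generate a noise-free series from known regime coefficients."""
    series = [start]
    for _ in range(steps):
        prev = series[-1]
        phi = phi_low if prev <= threshold else phi_high
        series.append(phi * prev)
    return series


# Hypothetical coefficients chosen so the series keeps switching regimes.
data = simulate(phi_low=-0.8, phi_high=-0.5, threshold=0.0, start=1.0, steps=20)
fitted = fit_threshold_ar(data, threshold=0.0)
print(fitted)
```

The fit recovers the two coefficients because the data contain only the two regimes the model was built for. A third kind of behavior would be forced into one of the two existing regimes, which is exactly the weakness described above.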
The main variables in a model of the U.S. stock market which could give valuations and forecast returns on a share are: (1) the macroeconomic long-term interest rate, as represented by the yield on the 30-year Treasury Bond; (2) earnings per share; (3) analysts' forecasts of earnings growth. A model based on these fundamentals has been constructed, and found to be quite robust, by Prof. Zhiwu Chen of Yale University # [see www.valuengine.com]. All that was required was a re-estimation of parameters every month or so. But in addition to the 3 main variables, Prof. Chen's model included factors such as the industry and sector the stock is in, its comparison with peers in the same industry, and many other technicals of the stock, viz. its market capitalization [size], liquidity, Beta, momentum, historical volatility etc.
# Let me add that such a model only works because the U.S. market is relatively 'efficient' in pricing shares. This is due to information of all sorts being available almost instantly via media channels, to a good regulatory environment that gives relatively high transparency, and to a sophisticated investor base. Such a fundamentals approach will be less effective in smaller, less-developed markets like China, Malaysia, Russia or Brazil. In those cases, we should recognize that a stock market has many characteristics of a stochastic process, and that intrinsic value doesn't always determine prices in the short or medium term. A blackbox approach is even more suitable for modeling a developing stock market.
But it is possible that many other variables could be found for a model of stocks in the U.S. stock market: the relationship between the stock market and bonds, commodities, the US$, other stock markets, the price of gold, the price of oil, inflation, employment statistics and so on. A blackbox model that took all these as inputs, but at regular time intervals re-evaluated their relevance and significance to the output [discarding the insignificant contributors], would be one solution. When so many variables are being used, some form of dimension reduction must be performed, whether with principal components analysis, by fuzzification, using wavelets etc. Indeed, with globalization, the need for inter-market analysis becomes more urgent. Bonds, stocks, commodities, forex: they are all tied together in a web of complexity. Also, if a sector of the economy such as housing has a large multiplier effect on employment or consumer demand, it could be a significant input variable for a model of the stock market. This is especially so for countries like the USA, with a huge population base, where domestic demand accounts for a large proportion of GDP and exports are not so significant.
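The periodic re-evaluation of inputs could be sketched roughly as follows. This is an assumption-laden illustration: it uses plain linear correlation over a recent window as a crude stand-in for "relevance", whereas a real blackbox would apply its own, probably non-linear, measure. The function names, the input names, the window and the cutoff are all hypothetical, and the numbers are made up.

```python
from math import sqrt


def pearson(xs, ys):
    """Plain Pearson correlation; returns 0.0 if either series is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def screen_inputs(candidates, target, window, cutoff):
    """Keep inputs whose |correlation| with the target over the last
    `window` observations is at least `cutoff`."""
    recent_target = target[-window:]
    kept = {}
    for name, values in candidates.items():
        score = pearson(values[-window:], recent_target)
        if abs(score) >= cutoff:
            kept[name] = score
    return kept


# Made-up numbers purely for illustration; not market data.
target = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
candidates = {
    "bond_yield": [2.0, 4.0, 6.0, 8.0, 10.0, 12.0, 14.0, 16.0],
    "unrelated": [1.0, -1.0, -1.0, 1.0, 1.0, -1.0, -1.0, 1.0],
}
kept = screen_inputs(candidates, target, window=8, cutoff=0.3)
print(sorted(kept))
```

Running the screen again each month on the latest window would let an input drop out when it stops mattering and come back in when it matters again.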
I anticipate that such a model would have a lot of built-in noise. Hence the importance of extensive data preprocessing to normalize, standardize, compress, smooth and detrend the data. A Neuro-Fuzzy model with optimization by Genetic Algorithms at the last stage would seem to be suitable. Of course, even when we use a blackbox approach, selection of variables by a domain expert is necessary. We cannot just throw in everything but the kitchen sink.
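Two of the preprocessing steps mentioned, detrending and standardizing, might look something like this. It is a minimal sketch assuming a straight-line trend and z-score scaling; other choices (differencing, min-max scaling, smoothing filters) are equally plausible. The function names and the sample series are mine, for illustration only.

```python
from statistics import fmean, pstdev


def detrend_linear(series):
    """Subtract the least-squares straight line fitted against time 0, 1, 2, ..."""
    n = len(series)
    times = range(n)
    mean_t, mean_y = fmean(times), fmean(series)
    slope = (sum((t - mean_t) * (y - mean_y) for t, y in zip(times, series))
             / sum((t - mean_t) ** 2 for t in times))
    intercept = mean_y - slope * mean_t
    return [y - (intercept + slope * t) for t, y in zip(times, series)]


def standardize(series):
    """Rescale to zero mean and unit (population) standard deviation."""
    mean, spread = fmean(series), pstdev(series)
    if spread == 0:
        return [0.0 for _ in series]
    return [(y - mean) / spread for y in series]


# Illustrative series: a steady upward drift plus a repeating wiggle.
raw = [10.0 + 0.5 * t + (1.0 if t % 2 else -1.0) for t in range(12)]
clean = standardize(detrend_linear(raw))
print([round(v, 3) for v in clean])
```

After both steps the drift is gone and the series is on a common scale, so inputs measured in very different units can be fed to the same model.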
One way to enhance the performance of the model [the purpose of the model being pattern recognition, classification and prediction] is to state the output in terms of probability, but using a leptokurtic probability distribution, with longer, fatter tails and a higher, narrower peak, such as is found in financial data.
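Whether a set of returns is leptokurtic can be checked with the excess kurtosis statistic, which is about zero for normally distributed data and positive when the tails are fatter. The sketch below uses the simple population formula; the two sample series are made up to show the contrast and are not market data.

```python
def excess_kurtosis(data):
    """Population excess kurtosis: fourth central moment / variance^2, minus 3.

    Normally distributed data score about 0; positive values indicate
    fatter tails and a sharper peak (leptokurtosis).
    """
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n
    m4 = sum((x - mean) ** 4 for x in data) / n
    return m4 / (m2 * m2) - 3.0


# Made-up "returns": mostly quiet days plus two extreme moves.
fat_tailed = [0.0] * 98 + [10.0, -10.0]
# Made-up "returns" with no extremes at all: every move is the same size.
thin_tailed = [1.0, -1.0] * 50

print(excess_kurtosis(fat_tailed))   # large and positive
print(excess_kurtosis(thin_tailed))  # negative
```

A model that states its output as a probability under a normal distribution would treat the two extreme moves in the first series as practically impossible, which is the reason for preferring a fatter-tailed distribution.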
Lastly, we have to ask why blackbox modeling of the stock market has not really caught on. Certainly the technology is there. So why are models still being painstakingly built with essentially linear and parametric tools like those used by econometricians, viz. ARIMA, ARCH, GARCH etc.? One of the reasons is that there is still reluctance to do something which is not fully understood, to have no explanation of what goes on inside the blackbox. Until someone can show that such blackbox models produce better performance, the situation is unlikely to change. Also, experience with such models [including my own limited experience, with limited access to good data] has shown that they perform better in short-term analysis. Just as with the weather forecast, when you go beyond three days many exogenous factors will have unpredictable effects on the model, or small initial differences in the relationship between variables undergo feedback and grow to unmanageable proportions after only a few days.