Monday, 9 November 2015

Be Together. Not the Same. Can maths address big data in banking?

I recently attended the Alan Turing Institute summit on Data Analytics for Credit Risk and left wondering whether statistics is well placed to address the problems of big data, and whether there are skills in the humanities that are more appropriate. It also led me to wonder whether the neo-classical synthesis's criticism of institutional economics, for not being able to deliver predictive theories, is reasonable. I end up arguing that bankers should build their businesses on the human and social sciences, not an algorithm-driven mathematical science.

While some of the presentations have been released, the ones really relevant to forming my opinion are not (yet?) available, so I will not attribute unreleased sources and will present my own inferences of what was said.

Dan Kellet mentioned that geographers sometimes made better (big data) modellers than statisticians, a remark that was, at first, simply interesting. Then another speaker highlighted how a talk at Tate Modern in Liverpool helped him understand a central problem of big data. He recounted how a photograph of a beach, or a desert, suggests that, from a distance, the sand is uniformly coloured, but close up you see that every grain has a different colour, from white to black. Another presenter had given two sequences of similar bank transactions, each showing an increase in grocery spending, a switch from economy to premium vendors, and a simultaneous increase in cash withdrawals at the weekend. However, while the changes looked similar at the start, they ended at opposite ends of the spectrum: one customer proved to be a good credit risk, the other a bad one. The explanation is that one set of transactions was indicative of the start of a long-term relationship and the other of the end of one.

The problem faced by bankers presented with data is not so much one of identifying outliers, as is the case with classical statistics. The decision whether or not to lend to a heroin addict, an outlier, or a Quaker heiress, a different outlier, is easy. What is needed is to pick out the white and black grains of sand from a distance. However, the whole purpose of classical statistics is to blend the different coloured grains of sand into a single hue, represented by a 'statistic'; the opposite of what is needed.
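The point about blending can be made concrete with a small sketch. The brightness values below are invented for illustration (0.0 for a black grain, 1.0 for a white one) and are not taken from any talk at the summit; the names `grains`, `average_hue` and `closest_gap` are my own.

```python
from statistics import mean

# Hypothetical grain brightness values: 0.0 is black, 1.0 is white.
# Half the grains are dark and half are light; none is mid-grey.
grains = [0.05, 0.10, 0.15, 0.85, 0.90, 0.95]

# The classical summary: one number standing in for the whole beach.
average_hue = mean(grains)

# What the banker wants instead: the individual grains, sorted into groups.
dark = [g for g in grains if g < 0.5]
light = [g for g in grains if g >= 0.5]

# How far is the nearest actual grain from the "typical" hue?
closest_gap = min(abs(g - average_hue) for g in grains)

print(f"average hue: {average_hue:.2f}")
print(f"dark grains: {dark}, light grains: {light}")
print(f"nearest grain is {closest_gap:.2f} away from the average")
```

The average describes a mid-grey grain that does not exist in the sample, which is the sense in which the statistic is the opposite of what is needed.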

Of course, most statisticians don't believe that the classical statistics of means, variances and p-values are going to answer the bankers' problem. The solution lies in statistical inference.

A classical statistician, employing frequentist probability, would observe the tossing of a coin and advise on the chances of it coming up heads, or observe mortality statistics and advise on an insurance premium, or advise on the ability of an opinion poll to predict an election. Statistical inference, based on subjective probability, could assess the probability that I voted a particular way in an election, or that I died because I smoked. The most topical use of Bayesian inference is in Amazon's and Google's predictive modelling to make recommendations.
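As a minimal sketch of the kind of update that subjective (Bayesian) inference performs about one individual, consider the voting example. Every number here is invented purely for illustration: a prior belief of 3/10 that I voted for a hypothetical party A, and the assumption that a particular newspaper is read by 3/5 of that party's voters but only 1/5 of everyone else. The function and variable names are my own.

```python
from fractions import Fraction

def posterior(prior, likelihood_if_true, likelihood_if_false):
    """Bayes' rule for a yes/no hypothesis given one piece of evidence."""
    numerator = prior * likelihood_if_true
    evidence = numerator + (1 - prior) * likelihood_if_false
    return numerator / evidence

# Illustrative assumptions only, not real polling or readership figures.
prior_voted_a = Fraction(3, 10)          # belief before seeing any evidence
p_reads_paper_if_a = Fraction(3, 5)      # P(reads the paper | voted A)
p_reads_paper_if_not_a = Fraction(1, 5)  # P(reads the paper | did not vote A)

# Update the belief on learning that I read the paper.
belief = posterior(prior_voted_a, p_reads_paper_if_a, p_reads_paper_if_not_a)

print(f"prior {float(prior_voted_a):.2f} -> posterior {float(belief):.4f}")
```

The frequentist question concerns the long-run behaviour of polls; this calculation is instead a statement about one person, and it is only as good as the 'colour' supplied by the evidence.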

However, I have never bought a book on the basis that Amazon recommended it to me. I am not convinced that much can be inferred from the fact that in one order I purchased Brandom's Making It Explicit and Scarry's Cars and Trucks and Things That Go. The algorithm needs 'colour' added to the data, and this is Google's and Facebook's advantage: they hold information on my interactions (that I am an academic with young children) and interests.

The challenge facing banks is summed up in Google's Android catchphrase "Be together. Not the same", which highlights the heterogeneity of people connected together in a community. While there are statistical methods that are able to cluster people, the input to these algorithms is a small collection of parameters and the clusters are coarse. I am yet to be convinced that it is worthwhile for Amazon to actually figure out what to recommend to me based on my transaction history, and the banks are already struggling with how to improve their decision making based on 'colourless' data.

The repeated references at the summit, however tangential, to the human and social sciences got me thinking. Statistical techniques, developed to reduce the distributed mass to a point and then identify ‘outliers’, are irrelevant to the challenge of interpreting huge quantities of data. The modern challenge is, given a mass of information that looks homogeneous from a distance, to identify similar entities dispersed in the mass. Hence, extracting value from ‘big data’ requires the repeated study of special cases from which general principles can be deduced, skills typically associated with the humanities. The distinction is reminiscent of the difference between Quetelet’s and Le Play’s approaches to social science in the nineteenth century.

An explosion of data collection in the years after 1820 enabled a number of people to observe that certain 'social' statistics, such as murder, suicide and marriage rates, were remarkably stable. Quetelet explained this in terms of Gaussian errors. L'homme moyen, 'the average man', was driven by 'social forces', such as egoism, social conventions, and so on, which resulted in penchants for marriage, murder and suicide, which were reflected as the means of social statistics. Deviations from the means were a consequence of the social equivalent of accidental or inconstant physical phenomena, such as friction or atmospheric variation.

These theories were popular with the public. France, like the rest of Europe, had been in political turmoil between the fall of Napoleon Bonaparte in 1815 and the creation of the Second Empire in 1852, following the 1848 Revolution. During the 1820s there was an explosion in the number of newspapers published in Paris, and these papers fed the middle classes a diet of social statistics that acted as a barometer to predict unrest. The penchant for murder implied that murder was a consequence of society: the forces that created the penchant were responsible, and so the individual murderer could be seen as an 'innocent' victim of the ills of society.

Despite the public popularity of 'social physics', Quetelet's l'homme moyen was not popular with many academics. The term 'social physics' had, in fact, been coined by the French philosopher Auguste Comte, who, as part of his overall philosophy of science, believed that humans would first develop an understanding of the 'scientific method' through the physical sciences, which they would then be able to apply to the harder and more important 'social sciences'. When Comte realised that Quetelet had adopted his term 'social physics', Comte adopted the more familiar term, sociology, for the science of society.

From a technical standpoint, Quetelet had based this theory on an initial observation that physical traits, such as heights, seemed to be Normally (Gaussian) distributed. The first problem is that heights are not Normally distributed. Firstly, the Normal distribution is unbounded, so there is a positive probability of someone having a height of 5 metres or even -3 metres. Secondly, and in tension with the first point, the incidence of giants and dwarfs in the real population exceeds the number expected from a Normal distribution of heights. Quetelet was confusing 'looks like' with 'is'. In addition, since murders and suicides are 'rare', there can be little confidence in the statistics. Many experts of the time, including Comte, rejected Quetelet's theories because they did not believe that 'laws of society' could be identified simply by examining statistics and observing correlations between data, and even Quetelet, later in life, counselled against over-reliance on statistics.
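The unboundedness objection can be checked with a short calculation. This is only a sketch: the mean of 1.70 metres and standard deviation of 0.10 metres are illustrative assumptions of mine, not figures from Quetelet, and the helper names are hypothetical.

```python
import math

def normal_tail_above(x, mu, sigma):
    """P(X > x) for X ~ Normal(mu, sigma), via the complementary error function."""
    z = (x - mu) / sigma
    return 0.5 * math.erfc(z / math.sqrt(2))

def normal_tail_below(x, mu, sigma):
    """P(X < x) for X ~ Normal(mu, sigma)."""
    z = (x - mu) / sigma
    return 0.5 * math.erfc(-z / math.sqrt(2))

# Illustrative assumptions: adult heights in metres.
MU = 1.70
SIGMA = 0.10

# Probability the model assigns to someone taller than 5 metres.
p_five_metres = normal_tail_above(5.0, MU, SIGMA)

# Probability the model assigns to a negative height.
p_negative = normal_tail_below(0.0, MU, SIGMA)

print(f"P(height > 5 m) = {p_five_metres:.3e}")
print(f"P(height < 0 m) = {p_negative:.3e}")
```

Both probabilities are absurdly small but strictly positive, which is all the objection needs. I use 0 metres rather than -3 metres because, under these assumed parameters, the probability of a height below -3 metres is still positive in theory but too small for a double-precision float to represent.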

Beyond these practical criticisms there were philosophical objections. L'homme moyen was a 'statistical' composite of all society who was governed by universal and constant laws. L'homme moyen was nothing like the Enlightenment's l'homme éclairé, the person who applied rational thinking to guide their action, thinking that was guided by science and reason and not statistics. The decline of Quetelet's theories in Europe coincided not just with the political stability of the Second Empire, but with a change in attitude. The poor were no longer unfortunate as a consequence of their appalling living conditions, but through their individual failings, like drunkenness or laziness. The second half of the nineteenth century was about 'self-help', not the causality of 'social physics'.

The early criticisms of Quetelet did not come from the innumerate. One of the severest critics of Quetelet was a famous mining engineer, Frederic Le Play, a graduate of the Ecole Polytechnique and the Ecole Nationale Superieure des Mines, centres of French rationalism and mathematics. Le Play felt that the direct observation of facts was achieved by the scientist, a trained specialist, closely observing phenomena, families for example, not by clerks trawling through reams of data. Le Play was not alone in his approach, but part of a much broader movement that dominated German and British science for the first half of the nineteenth century, Naturphilosophie or Romantic science.

Modern science is associated with objectivity: scientists should extract themselves from the world in order to be able to identify the 'truth', perfect knowledge. This objective was the motivation for Laplace's analysis of errors. While the Revolutionary French pursued truth, in Prussia Immanuel Kant asked whether objectivity was ever really possible, and, primarily in Germany and empiricist Britain, these doubts evolved into a 'Romantic' view of science, Naturphilosophie. If Laplace championed Newton's mechanistic view that the universe was a giant clock, the Romantics saw it more as a complex, 'living' organism and looked to discover universal principles by observing individual subjects. Two key figures associated with Romantic science are Charles Darwin and Alexander von Humboldt, both of whom explored the world, literally, travelling thousands of miles and observing nature in the raw.

The impact of Naturphilosophie on English science is discussed in Richard Holmes's book, The Age of Wonder. Holmes describes the period as a "second scientific revolution", but in one way it was a counter-revolution, with a rejection of the mathematisation of science that had taken place in the first scientific revolution and a return to Aristotelian methods of observation and qualitative description.

Possibly the most important legacy of Naturphilosophie was a shift in emphasis in science from considering the world 'as it is' to trying to understand 'how it changes'. Darwin is the best known example of this shift: biology was not just about the classification of living species but about trying to understand how the diversity of life had come about.

There was a problem with Romantic science: was it objective? As Patricia Fara explains:
 Victorian [i.e. 1837-1901] scientists were appalled to think that subjectivity might lie at the very heart of science. .... Scientists were exhorted to exert self-discipline and behave as if they were recording instruments producing an objective view.
It was in this context that Quetelet's quantitative methods re-emerged in Britain. In 1850, Sir John Herschel, one of the key figures of the Age of Wonder, reviewed Quetelet's works and concluded that the Law of Errors was fundamental to science. In 1857, Henry Thomas Buckle published the first part of a History of Civilisation in England, which was an explanation of the superiority of Europe, and England in particular, based on Quetelet's social physics. Sir Francis Galton combined the work of his half-cousin, Charles Darwin, with that of Quetelet to come up with a statistical model of Hereditary Genius in the 1870s and in the process introduced the concepts of 'reversion to the mean' and statistical correlation. At the start of the twentieth century Galton's statistical approach was championed by Karl Pearson, who said that the questions of evolution and genetics were "in the first place statistical, in the second place statistical and only in the third place biological", and the aim of biologists following this approach was to "seek hidden causes or true values" behind the observed data processed with statistical tools.

Le Play's approach of synthesising a coherent narrative out of hundreds of special cases was forgotten.  The methodology was related to the historical schools of economics and institutional economics.  This ethos seems to have been smothered by the reductionism of modern science.

On the morning of the Monday following the Turing summit I listened to Helga Nowotny being interviewed about her new book, The Cunning of Uncertainty, where she notes that we have evolved to identify cause and effect, not connections that lead to complexity and result in uncertainty (around 16:20 on the podcast). She then argues that there is no (economic) Theory of Innovation; all we have is case studies of innovation. Case studies are not bad, but they describe a specific set of circumstances in a specific context. In this setup we are unable to create a predictive theory. This does not mean a thoughtful person cannot direct innovation, but it does imply we cannot create an algorithm to produce innovation. I thought Nowotny here was saying something relevant to economics: is it right to scorn the historical/institutional schools for delivering only case studies and not theories? Of course, most vernacular training comes from case studies and few practitioners value theory.

At this point one might ask: but what about "machine learning"? Surely an algorithm that learns could resolve the problems of big data. However, machine learning in banking is a problem. The "acid test" of a credit system is, apparently, that it delivers an audit trail as to how a specific decision was made. The decision needs to be able to be justified. The nature of machine learning, apparently, makes the task of justification difficult. As the machine learns, the algorithm changes, the response to a given set of inputs becomes unstable, and it becomes difficult to follow the "learning machine's train of thought".

Now let us assume that an algorithm could be created that could make auditable lending decisions by employing Bayesian techniques in conjunction with social media data. This seemed to be the utopia some of the bankers at the summit aspired to. But one wonders whether, if the legal barriers imposed in the UK and US preventing the synthesis of social data and lending decisions were removed, the result would be that the experts at this synthesis, Google and Facebook, would be better placed to become banks than banks to become data mongers. The banks, as we know them, would fast disappear; the gods punish us by giving us what we pray for.

Is there an alternative for banks? I think so: go back to basics. Treat customers as individuals, not data points in a 50-dimensional space, and employ staff who develop narratives around customers from which they can create an intuition about lending. Old-fashioned banking based on the human and social sciences, not an algorithm-driven mathematical science that, even if it were possible, could well lead to their demise.
