[Photograph by Literary and Media Committee, TAPMI]
This is the age of big data. We are constantly in quest of more numbers and more complex algorithms to crunch them. We seem to believe that this will solve most of the world’s problems - in the economy, in society and even in our personal lives. As a corollary, rules of thumb and gut instincts are getting short shrift. We think they often violate the principles of logic and lead us into making bad decisions. We might have had to depend on heuristics and our gut feelings in the agricultural and manufacturing eras. But this is the digital age. We can optimise everything.
Gerd Gigerenzer, a sixty-nine-year-old German psychologist who has been studying how humans make decisions for most of his career, doesn't think so. In the real world, rules of thumb not only work well, they also perform better than complex models, he says. We shouldn’t turn our noses up at heuristics; we should embrace them.
That view is increasingly gaining global attention, partly because of the failure of complex models in predicting major events, such as the financial crisis in 2008 and the election of Donald Trump last year, and partly because Gigerenzer is backing what he says with some cutting-edge research. His team at the Center for Adaptive Behavior and Cognition (ABC) at the Max Planck Institute for Human Development, Berlin, studies the role of heuristics in decision making in a way that can be coded into a computer programme, tested, and used in the real world. In finance, the Bank of England is using their insights to design simpler rules to avert banking crises. In healthcare, medical organisations are working with them to teach risk literacy to doctors and patients so they can evaluate evidence better. Artificial intelligence developers are looking to their work to see if they can make machines think better.
An adaptive toolbox
Gigerenzer was in India recently to conduct a seminar called the Winter School on Bounded Rationality at TA Pai Management Institute, Manipal, where his audience was a skeptical bunch of research scholars from some of the top institutions in India and abroad.
Gigerenzer, a tall man sporting a white mustache and possessing the gait of a country gentleman, was in full flow - explaining, clarifying, defending his worldview using a mix of evidence from research, personal anecdotes and a deadpan display of academic confidence. When a finance professor with an obvious love for mathematical analysis said he didn’t agree with Gigerenzer’s views, he replied, “Well, I have three days to convince you”.
In short, Gigerenzer's arguments go like this. There is a big difference between risk and uncertainty. You are dealing with risk when you know all the alternatives, outcomes and their probabilities. You are dealing with uncertainty when you don’t know all the alternatives, outcomes or their probabilities.
When you are dealing with risk, complex mathematical models, fine-tuned for optimisation, work. However, when you are dealing with uncertainty, they don’t work well, because the world is dynamic.
What you then need is a set of simple, robust rules of thumb and gut instincts sharpened by years of experience. You need an adaptive toolbox. To use the toolbox well, logical rationality - knowing rules such as transitivity and set theory - won’t suffice. What’s needed is ecological rationality, that is, knowing which heuristic works in which environment.
Innovating over a cup of coffee
To study all of this, Gigerenzer has assembled an international and interdisciplinary team at ABC in the Max Planck Institute. “I don’t believe in the borders of the regular disciplines”, Gigerenzer said. “They may be good for teaching but certainly not for innovation. My group has about 35 researchers and half of them, at any point, are from ten different disciplines - psychology, machine learning, computers, economics, engineering, philosophy, biology and so on. The point is to get all these minds together to solve one problem.”
That problem is: How do humans and other animals make decisions under uncertainty, that is, when time and information are limited and the future is unknown? They try to solve this problem by designing models of how people make decisions, and by conducting experiments, testing one model against another to see where rules of thumb (heuristics) perform better than others.
“Another important thing,” he continued, “is to make them feel like a family, because people from different disciplines typically avoid talking to one another. They claim no one would understand their language. At Max Planck I have put them all on one floor. We have tea and coffee every day at 4 PM. No one is obligated to come, so they come. And they talk, they ask questions: ‘What does this concept mean?’, ‘Why don’t you do that instead of this?’ And so we make progress.”
“It is one of the most intellectually stimulating environments for research,” Özgür Simsek, a researcher in Gigerenzer’s team, said. Simsek studied industrial engineering in Turkey, before moving to the University of Massachusetts, Amherst to get a master’s in operations research and a PhD in artificial intelligence and machine learning. After her doctorate, she was looking for something novel to work on, and found the research on bounded rationality at Max Planck interesting. “I thought the research on bounded rationality had perhaps something new to say about AI and machine learning, and perhaps we could bring some of these ideas into AI. At the same time, I thought my computational background, working in algorithms, analyzing algorithms, developing algorithms could be useful in understanding heuristics better,” Simsek said. It is eight years since she joined.
The path that Konstantinos Katsikopoulos took to land at the Max Planck Institute was not too different. He studied in Athens, and then went on to get his PhD in operations research from MIT. He came to know about Gigerenzer’s work when he stumbled on a book titled “Simple Heuristics that Make Us Smart.” He was intrigued by the words simple and smart. He already knew the limitations of complex optimising models. “When I studied mathematics, the most basic thing I learned is that any claims about optimality or optimising are conditional on the model. Your results are always the best according to a model of the world. And a model of the world is not reality. You can’t take the model per se as a benchmark of success in the world.” The idea of comparing complex optimisation models against simple heuristics appealed to him. He joined Max Planck as a postdoc fifteen years ago. Katsikopoulos recently shifted to the Business School of the University of Southampton, UK, and continues to be associated with Max Planck through the Harding Center for Risk Literacy as an adjunct scientist. (The Harding Center, a part of Max Planck, trains physicians and patients to better understand medical evidence, and promotes risk literacy among school children. Gigerenzer is not a fan of the Nudge (using insights from behavioural economics and psychology to change people’s behaviour through subtle cues and changes in a setting). He believes teaching people how to assess risks is a more straightforward and effective way to get positive results.)
Another researcher, Shenghua Luan, was doing his PhD in psychology at the University of Florida, after graduating from Peking University in China. At Florida, he attended a seminar in which Gigerenzer’s work was discussed. He started studying Gigerenzer’s papers and got hooked. After his PhD, he spent some time at Max Planck as a postdoc, before moving to Singapore to teach at Singapore Management University. Soon, he was missing the exciting research at Max Planck so much that he decided to go back. When he speaks about his work, his enthusiasm can spill over. “This is not some abstract stuff. This is what people do in real life,” he said.
The skeptical statistician
Gigerenzer’s own intellectual journey started when he was wondering why he had chosen a career in academia over one in entertainment music years earlier. “As a musician at that time, I was earning, maybe, five or ten times more than what an assistant professor would earn. Now, I did not sit down and list all possible consequences of staying in music, all possible consequences of going to academia, weigh them and add them up, because that made no sense. I wouldn’t have been able to estimate all of them. I just took the decision. It was a qualitative decision. Thinking about it got me interested in how people make decisions.”
Gigerenzer was contrasting his decision-making process with a method that has long been popular among intellectuals of scientific temper. They believed one can arrive at an optimal decision by listing all the likely advantages and disadvantages of the various options. Charles Darwin tried that technique to decide whether to marry or not. Benjamin Franklin strongly recommended this method - he called it Moral Algebra - to his nephew, who was looking for a wife. “if you do not learn it, I apprehend you will never be married,” Franklin wrote to him. It has not vanished. We will find some version of this method being practiced by the bureaucracy in government and business even today. Gigerenzer saw that he didn’t use this method to make one of the most important decisions of his life. In fact, most of us don’t. Our decisions are mostly qualitative, not quantitative.
“When I started as a student I didn’t know very much about psychology,” Gigerenzer said. “Basically, I knew about Freud and a few other psychoanalysts. I quickly realised that I can explain almost everything with these concepts. I got disinterested. I didn’t want a theory that explains everything. Then I got interested in personality psychology. I remember there was a big book, 400-500 pages, on the subject and I knew it almost by heart. Some friends tested me and I could tell them what sentence was on what page. However, once I understood it, I realised its slimness. Much of behaviour is not just inside. We are social beings. We are dependent on our ecology. I then became interested in thinking and reasoning, and took courses on philosophy and logic. I loved it. I thought it was nice, but I wanted something concrete. And then I started studying probability and statistics.”
Later, Gigerenzer spent a year at the Center for Interdisciplinary Research at Bielefeld studying the history of probability. The centre had gathered a diverse set of experts from biology, mathematics, philosophy, economics and many other fields. Their aim was to study the intellectual history of probability, and the output was a two-volume book called The Probabilistic Revolution, which Gigerenzer co-edited. “I can think of no other comparable work that comes even close to covering the same important material in the history of science and philosophy,” Stanford University philosopher Patrick Suppes said of that book.
For Gigerenzer, that one year turned out to be one of the most important of his entire life. “This is where I learned people in different disciplines understood concepts like probability entirely differently. I learned how the concepts themselves and their meanings changed over time. All this intense involvement of my own mind in probability also taught me the limits of probability theory which many of my dear colleagues still don’t see,” he said.
Who on earth is Linda?
Then came the question: What is the alternative when probability doesn’t work? That was when he turned his attention to the heuristics that Herbert Simon, the American polymath who proposed the theory of bounded rationality, referred to in his works. That term was getting more and more popular in psychology because of work by two experimental psychologists, Amos Tversky and Daniel Kahneman. “I read their work and found it very interesting. However, because of my training in statistics I was suspicious about some of their claims”, Gigerenzer said.
One of their most famous studies involved a question that went like this.
Linda is 31 years old, single, outspoken, and very bright. She majored in philosophy. As a student, she was deeply concerned with issues of discrimination and social justice, and also participated in anti-nuclear demonstrations.
Which is more probable?
- Linda is a bank teller.
- Linda is a bank teller and is active in the feminist movement.
Called the Linda problem, the question’s purpose was to demonstrate “the raw power of the mind’s rules of thumb to mislead,” as Michael Lewis puts it in his latest book, The Undoing Project, which chronicles the collaboration between Tversky and Kahneman.
In Thinking, Fast and Slow, Kahneman writes about the response to a version of the question: “We were convinced that statistically sophisticated respondents would do better, so we administered the same questionnaire to doctoral students in the decision-science program of the Stanford Graduate School of Business, all of whom had taken several advanced courses in probability, statistics, and decision theory. We were surprised again: 85% of these respondents also ranked “feminist bank teller” as more likely than “bank teller””.
And for Kahneman, who eventually won the 2002 Economics Nobel, it was a serious error: “About 85% to 90% of undergraduates at several major universities chose the second option, contrary to logic. Remarkably, the sinners seemed to have no shame. When I asked my large undergraduate class in some indignation, “Do you realize that you have violated an elementary logical rule?” someone in the back row shouted, “So what?”...,” Kahneman wrote in his book.
Gigerenzer, with all the perspective he gained from studying the history of probability, would have sympathised with those students.
“If you read the description, nothing in it suggests that she might be a banker. So, when you ask what is more probable, ‘bank teller’ or ‘bank teller and an activist’, many people say, ‘hmmm, maybe the second’. And Kahneman says this is wrong because a single instance of Linda being a bank teller can never be less probable than the conjunction of Linda being a bank teller and a feminist. He then asks that to be accepted as a proof of human irrationality. But, it’s far from that, because it implies that people should treat the term ‘what is more probable’ in terms of probability theory. If you look in the Oxford English Dictionary, you will see probability has quite different meanings and they are all legitimate. So, they ask, ‘where is the evidence that Linda is a bank teller’, and since there is none, they go to the other option,” Gigerenzer said.
To test his own hypothesis, Gigerenzer framed the question differently. Instead of asking what is more probable, he asked,
There are 100 persons who fit the description above (that is, Linda’s). How many of them are:
- Bank tellers? __ of 100
- Bank tellers and active in the feminist movement? __ of 100
When he posed the question in this way, the Linda problem mostly disappeared.
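The frequency framing makes the conjunction rule almost visible: whoever is counted as a feminist bank teller has already been counted as a bank teller. A minimal Python sketch of that counting follows. The population figures are invented purely for illustration and are not from Gigerenzer's experiments; only the subset relation matters.

```python
# Hypothetical population of 100 people who fit Linda's description.
# The split below is made up for illustration; it is not experimental data.
people = (
    [{"bank_teller": True, "feminist": True}] * 4
    + [{"bank_teller": True, "feminist": False}] * 1
    + [{"bank_teller": False, "feminist": True}] * 80
    + [{"bank_teller": False, "feminist": False}] * 15
)

# Count everyone who is a bank teller, feminist or not.
tellers = sum(p["bank_teller"] for p in people)

# Count only those who are both; each of them is also in the count above.
feminist_tellers = sum(p["bank_teller"] and p["feminist"] for p in people)

print(f"Bank tellers: {tellers} of {len(people)}")
print(f"Bank tellers active in the feminist movement: {feminist_tellers} of {len(people)}")
```

However the 100 people are split, the second count can never exceed the first, which is what the rule P(A and B) ≤ P(A) says when expressed in frequencies.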
The catch that changed the world
Many smart, successful people fail rationality tests inside a lab because rationality is defined rather narrowly. It’s logical rationality - about not violating some law of logic or probability. But outside the lab, in the real world, we cannot do well with just logical rationality; we need ecological rationality - the kind of thinking that helps us get what we want in an environment that’s uncertain and dynamic. This means exercising our instincts, using simple but robust rules of thumb. This means behaving in a way that helps achieve one’s purpose rather than constantly looking at a list of biases to see if we have fallen for any of them. For example, overconfidence is a bias, according to a standard book on cognitive errors. But in the real world, it’s the entrepreneur endowed with overconfidence who takes bold steps and goes into uncharted territory. Another example is probability matching - which is not optimal according to the laws of statistics - but in the real world, some would willingly choose a path where the probability of success is low because that path also has less competition.
In many cases, rules of thumb don’t violate laws of logic or probability. They are there because they are useful.
Let’s go back in history. It’s June 25th, 1983. Lord’s, in England. It was the World Cup final between India and the West Indies. The West Indies team was the clear favorite. They had won the first two World Cups. Vivian Richards was at his confident best. And then, at one point, he top-edged a ball from Madan Lal. Kapil Dev, fielding at mid-on, chased the ball almost to the boundary, all eyes on him and his eyes fixed on the ball, till he caught it. That fantastic catch changed the course of the match, and India walked away with the cup.
How did Kapil Dev, playing under extreme pressure, figure out where the ball would land? How did he pace himself so well that he was right there to catch it? Did his brain have “something equivalent to a mathematical calculation” (to use Richard Dawkins’ words from The Selfish Gene) going on to predict the trajectory of the ball and to direct him on how fast to run? Or was it something else?
It’s something else. Gigerenzer says fielders - be it in cricket or in baseball - consciously or unconsciously follow a simple heuristic: fix your gaze on the ball, start running, and adjust your speed so that the angle of gaze stays constant, and you won’t err.
It’s not just Kapil Dev. Check out two more examples of running catches by Steve Waugh and Martin Crowe.
They are not calculating the trajectory of the ball; they are simply using the gaze heuristic. Its use is not limited to cricket grounds. “We know from animal research that a predator catches prey by keeping its optical angle constant. Sailors use it to avoid a collision. They don’t estimate their own trajectory and they don’t estimate that of others. They fix their gaze; if the angle remains constant, they just get away,” Gigerenzer said.
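The rule of keeping the angle of gaze constant can be tried out in a toy simulation. The sketch below is an idealised illustration, not Gigerenzer's own model. It assumes a two-dimensional world with no air resistance, a fielder who locks on when the ball is at its highest point, and a fielder who can always run fast enough to hold the angle. The function name and all default numbers are hypothetical. The simulation knows the ball's coordinates because it has to move the ball; the fielder's rule never predicts where the ball will land.

```python
import math

def simulate_gaze_heuristic(vx=15.0, vy=20.0, fielder_x=70.0, g=9.81, dt=0.001):
    """Idealised 2D sketch: the fielder holds the gaze angle fixed from the ball's apex.

    Assumes the fielder starts farther along the pitch than the ball's apex
    (fielder_x > ball_x at lock-on) and can run arbitrarily fast.
    """
    # Start the heuristic at the apex, when the ball is already high in the air.
    t = vy / g
    ball_x = vx * t
    ball_y = vy * t - 0.5 * g * t * t

    # Tangent of the gaze angle at the moment the fielder locks on to the ball.
    tan_gaze = ball_y / (fielder_x - ball_x)

    while True:
        t += dt
        next_y = vy * t - 0.5 * g * t * t
        if next_y <= 0.0:
            break  # the ball has reached the ground
        ball_x, ball_y = vx * t, next_y
        # The only rule: move to the spot from which the ball is seen
        # at the same angle as before. No landing point is ever computed.
        fielder_x = ball_x + ball_y / tan_gaze

    return ball_x, fielder_x, math.degrees(math.atan(tan_gaze))

ball_x, fielder_x, angle = simulate_gaze_heuristic()
print(f"gaze angle held at {angle:.1f} degrees")
print(f"ball comes down near x = {ball_x:.2f} m, fielder is at x = {fielder_x:.2f} m")
```

Because the horizontal gap between fielder and ball is always the ball's height divided by the tangent of the fixed angle, the gap shrinks to nothing as the ball comes down, wherever the fielder happened to start.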
If you are a sailor in a small fishing boat, you can’t afford to take your computer, and calculate the direction and speed of another boat. So, you have to depend on your eyes, and a simple heuristic.
But what if someone using sensors and supercomputers to calculate the trajectory were directing the fielder? Would he perform better?
Where simplicity trumps complexity
Nathan Berg was trained in mathematics, played in the band of celebrated jazz musician Maynard Ferguson in his growing-up years, and now teaches quantitative analysis at New Zealand’s Otago Business School. He was aware of the limitations of optimising models, and was attracted to the entirely different approach taken by Gerd and his team. “I was enthralled with the possibility that something new might be going on, but I stumbled over and over again,” Berg said. “I remember three or four conversations with Gerd about the gaze heuristic. My proposition was that with the gaze heuristic, you can get almost as good as what an optimising robot would do. ‘No, Nathan,’ Gerd would say, ‘optimisation is one interesting benchmark, but you are still trapped in the idea that because it’s a heuristic, it is by definition second best, and at most it could be nearly optimal.’
“What he was saying is that if you are trying to optimise you would have to estimate parameters about the world, and thereby introduce model risk. It opened me up to the idea that a heuristic need not be the second best but actually the best. But Gerd wouldn’t use the words first best, because it’s an environment where it’s impossible to define what’s the best. The comparison he is interested in making is that a simple rule in a complex world can outperform a complex rule in an artificially simplified world. That’s a distinction that took me a long while to fully digest.”
The problem with complex models is not calculations - computers can do that pretty well and fast. The problem is that they would demand that you make estimations. And that’s where things go wrong.
One of Gigerenzer’s favorite examples is modern portfolio theory, pioneered by Harry Markowitz back in the 1950s. Markowitz offered a mathematical framework to design your portfolio so you can maximise your returns for any given level of risk. His theory is elegant, is taught in finance courses in universities across the world, has finance professors swearing by it, and won him a Nobel prize in 1990.
There is a low-tech way to design your portfolio. It’s simply called the 1/N formula, or the equality heuristic. Simply divide your money equally across the N funds. It sounds too simplistic for the complex world of finance, and is unlikely to impress any investor from whom you are raising funds (nor would it impress you if someone asking for your money said 1/N was their portfolio allocation strategy).
But the crucial question is how it really compares with the Markowitz model and its various derivatives in the real world. Three researchers, Victor DeMiguel, Lorenzo Garlappi and Raman Uppal, compared optimal diversification models with naive (1/N) diversification and found that none of the former consistently outperformed 1/N. For an optimised portfolio with 25 assets to beat the performance of 1/N diversification, it would need an estimation window of 3200 months, or about 266 years, and one with 50 assets, 6000 months, or 500 years. “That means, in the year 2500, people can stop using the simple heuristic and do the complex computation, if the same stocks are still around,” Gigerenzer said.
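A rough way to see why estimation is the sticking point is to count what each approach has to know before it can allocate anything. The sketch below is illustrative only, with hypothetical function names and fund labels. It implements 1/N and counts the inputs a textbook mean-variance optimiser must estimate from past data: one expected return per asset, plus the distinct entries of the covariance matrix. It does not reproduce the models tested by DeMiguel, Garlappi and Uppal.

```python
def equal_weight_allocation(total_funds: float, assets: list[str]) -> dict[str, float]:
    """1/N heuristic: split the money equally; nothing is estimated from data."""
    share = total_funds / len(assets)
    return {asset: share for asset in assets}

def mean_variance_parameter_count(n_assets: int) -> int:
    """Inputs a textbook mean-variance optimiser needs estimates for:
    n expected returns, plus n variances and n*(n-1)/2 covariances."""
    return n_assets + n_assets * (n_assets + 1) // 2

# Hypothetical fund names, used only to show the call.
portfolio = equal_weight_allocation(100_000, ["Fund A", "Fund B", "Fund C", "Fund D"])
print(portfolio)

for n in (25, 50):
    print(f"{n} assets: 1/N estimates 0 parameters, "
          f"mean-variance estimates {mean_variance_parameter_count(n)}")
```

Every one of those estimated inputs carries sampling error, which is one reading of why such long stretches of data are needed before the optimised portfolio reliably pulls ahead.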
The most prominent endorsement of 1/N diversification, however, came not from these studies but from Markowitz himself. When he had to invest his own money, he didn’t invoke his mean-variance framework. He simply went for 1/N.
In the real world, there are several cases where simple rules trump complex algorithms. Take a marketing campaign aimed at 'active customers'. How does one determine who the active customers are? Academics might suggest one of the variants of Pareto/Negative Binomial Distribution models. But some studies have shown that using the hiatus heuristic, that is, simply treating customers who haven’t bought from you in the last nine months as inactive and filtering them out, will give you as good or better results. Similarly, lay people who go by the recognition heuristic - betting on the more recognized name - tend to do better than those who use complex models to predict sports outcomes.
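The hiatus heuristic is simple enough to write down in a few lines. The following Python sketch is an illustration under stated assumptions: the function names, the use of whole calendar months, and the sample customers are all hypothetical. Only the nine-month cut-off comes from the description above.

```python
from datetime import date

def months_between(earlier: date, later: date) -> int:
    """Whole calendar months elapsed from `earlier` to `later`."""
    months = (later.year - earlier.year) * 12 + (later.month - earlier.month)
    if later.day < earlier.day:
        months -= 1  # the current month is not yet complete
    return months

def is_active(last_purchase: date, today: date, hiatus_months: int = 9) -> bool:
    """Hiatus heuristic: a customer is active only if the last purchase
    was made less than `hiatus_months` months ago."""
    return months_between(last_purchase, today) < hiatus_months

# Made-up customers and dates, purely for illustration.
today = date(2024, 10, 1)
last_purchases = {
    "customer_1": date(2024, 3, 15),   # about six months ago
    "customer_2": date(2023, 12, 1),   # ten months ago
}
for name, last_purchase in last_purchases.items():
    status = "active" if is_active(last_purchase, today) else "inactive"
    print(name, status)
```

The rule has a single parameter, the length of the hiatus, and needs no fitted purchase model at all.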
On the other hand, complex models sooner or later fail to predict or fail to help you take better decisions. In 2008, Google Flu Trends, a project by the search engine giant, was celebrated among data enthusiasts and the general public for its ability to predict the prevalence of flu by analysing search terms. It turned out that it was overestimating the numbers, and Google stopped publishing it. Similarly, the Basel Committee on Banking Supervision came out with increasingly complex, increasingly fine-tuned and increasingly voluminous rules and regulations for banks, and yet none of that could prevent the financial crisis in 2008.
“We have to get over the illusion that complex problems need complex solutions”, Gigerenzer said. “But, its opposite is also not true. Simple heuristics are not always better. We need to treat hammers and screwdrivers as different tools with specific purposes, and ask a very reasonable question: Will this tool work better than others?”
Most situations that we face fall somewhere between risk and uncertainty, or they have elements of both risk and uncertainty, and so we need a combination of both. Take the Miracle on the Hudson. In January 2009, two pilots correctly decided to land their bird-hit plane on the Hudson River instead of taking it further to an airport, saving the lives of all the passengers and the crew. Chesley Sullenberger and his co-pilot didn't use an elaborate mathematical equation to figure out that their damaged plane might not hold up till they reached the airport; they used a heuristic (something similar to the gaze heuristic). However, having decided to land on the river, they went for a checklist, which is the opposite of a heuristic. They used both.
The glass of rationality
Some of Gigerenzer’s critics downplay his work by saying it’s no different from what Tversky and Kahneman have argued. Tversky and Kahneman focus on the negative side of heuristics while Gigerenzer focuses on the positive side; the former say the glass is half-empty and the latter half-full.
I asked Katsikopoulos about this criticism, and his answer was long and measured.
“You can say at a higher level of abstraction that these things are similar. Because Tversky and Kahneman also think about people, not about algorithms. They both think about people and human behaviour. They try to describe where it succeeds and where it fails using standard techniques such as experimentation and looking at data and reasoning. In that way they are similar. So I can understand when people look at this for the first time and they say, oh, so much the same, Gigerenzer looks at the positive power of intuition and Kahneman at the negative. And this difference is the most important. But, being more informed, I don’t think it’s so important. First of all, because both sides want to look at the conditions under which intuition has positive effects. In a way neither side believes one or the other.”
“The second reason is that there are differences in the method and the sources from which these two sides get the inspiration and how they go about accumulating knowledge. I would say some of the strength of our approach is that ours is broader. Our group is interdisciplinary and so we have knowledge in mathematics, statistics, economics, biology - it’s important as well - philosophy. So sometimes we have a more integrated view of what is rational. Laymen, non-experts may believe philosophers and mathematicians have the last say on defining what is rational. But, that is not true. It’s not true that there is just one meaning of probability and one meaning of logic. Especially if you consider the whole of human knowledge across human disciplines. The Tversky and Kahneman side is less sensitive to that because their foundations come more from experimental psychology, and from that part of mathematics that actually believes that the problem of defining rationality is solved. That colours their methods, interpretations.”
“One method that they have not used is to really run model competitions on how heuristics and optimisation models perform in the world. Because they have never done that, they never found out that heuristics can perform better than optimisation. In a way you can say there is no logical flaw in their investigation. In some sense it’s incomplete because they didn’t use this method. But we have done that, so we are more positive about heuristics”.
“It would be equally wrong to say that heuristics are always better and big data are useless. That would equally be a big mistake. But so far, the other mistake has been made more. So I suppose it’s fine.”
The sacred gift and the faithful servant
A quote attributed to Einstein captures that imbalance well. It goes like this: the intuitive mind is a sacred gift and the rational mind is a faithful servant. We have created a society that honors the servant and has forgotten the gift.
In the days since Einstein, we seem to be honoring the servant with even greater fervour, thanks to the exponentially growing power of two weapons he holds in his hands: data, and the ability to process it. And, our memory hasn’t gotten any better, when we have to remember the gift. That tendency has serious implications for business leaders and policy makers.
I asked Gigerenzer if his work - spanning books, lectures, research papers - had one big message. He said, “We need to dare to think for ourselves, instead of anxiously adapting to our environment. We have in the western world fewer and fewer people who are willing to take responsibility, to make decisions on their own, and a tendency of the management to delegate to consulting firms, which is often a waste of time and money.”
“My advice would be to trust more in expert knowledge, in long years of experience. Don’t buy statistical algorithms you don’t understand. Many managers buy big data algorithms which come in black boxes because they are not sure, they don’t really understand what all these are about. But they think, ‘if I don’t buy that, and if things go wrong, I am responsible, and have to take the blame. If I buy that, it costs the company something, but I am safe’. There is a lot of defensive decision-making in society and unwillingness to take responsibility, and the fear of one’s own common sense.”