Lately, I've found myself attending academic colloquiums.1 They are organized semi-weekly at my school by our math department, and each talk is led by a different invited professor on a topic in their specialty. These one-time lectures are meant to be a resource for experienced faculty from different institutions to exchange knowledge and ideas. Surprisingly, many of them were very well presented. What's even more surprising, they were easy to follow for a "non-experienced non-faculty" like myself.
Over the course of these colloquiums, I realized that I am slowly rediscovering math for myself. Seeing that: wow, math can be beautiful. I get fascinated over and over again by the multiplicity of ways to represent our complex world with a math model, and then, by solving the model, explain the world.
For example, there was recently a talk on using drift-diffusion models for collective decision making. The abstract got me intrigued, so I went. The presentation turned out great (and I am very picky in that sense!), and even after the meetup was over, my mind kept coming back to the talk. So I reached out to the speaker and asked for a few papers his presentation was based on. "How do individuals combine private evidence and social information to make decisions?" asked one of them. Over the course of several papers, to address this question, the authors start by introducing a model, follow that by analyzing it on small networks, and then extend their work to large networks.
What kind of model can you use to represent human interactions and decision making?
The World and its Models
Consider a population of n agents that has to decide between two choices/hypotheses: H+ and H-. They do so by accumulating evidence and computing the conditional probabilities, P(H±|evidence), that one of the two hypotheses is correct.
For example, there is a coin that is known to be, with equal probability, one of two biased coins: one biased towards Heads (H+), landing Heads with probability p in the interval (0.5, 1], and another biased towards Tails (H-), landing Tails with the same probability p. You want to identify whether this coin is H+ or H-.
Naturally, to make a decision, you'll need to accumulate some evidence toward one or the other hypothesis. In our case, you can do that by tossing the coin many times. If you get more Heads, it's more likely to be the H+ coin. If you get more Tails, it's more likely to be the H- coin.
Now the question is: when do you make a final decision? On the 10th flip? 100th? 10,000th? Well, sure, you can use some formulas and derive the exact probability that it is one coin or the other given a sequence of outcomes. However, the question still remains the same: when do you stop? At 95% confidence, 99%, 99.8%? Whatever that number is for you, let's call it your "confidence threshold". Naturally, the higher your threshold, the more accurate your educated guess; yet the longer you will spend making that decision. Hence, slow deciders. And vice versa.
The tricky part is that if two coins H+ and H- are only veeeery slightly biased (i.e. p is 0.5+epsilon), then you'll need a very large number of tosses before you can "confidently" make a decision, whatever your confidence threshold is.
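To make this concrete, here is a minimal sketch of this sequential rule: flip, update the posterior for H+ after each toss, and stop once either posterior crosses your confidence threshold. It is not code from the papers; the function name, parameters, and defaults are my own assumptions. Running it with p close to 0.5 shows just how many flips a lone decider can need.

```python
import random

def simulate_decision(p=0.6, confidence=0.99, true_coin="H+", max_flips=100_000, rng=None):
    """Flip the (unknown) coin until the posterior for either hypothesis
    reaches the chosen confidence threshold. Returns (decision, n_flips)."""
    rng = rng or random.Random()
    p_heads = p if true_coin == "H+" else 1 - p   # actual bias of the coin we hold
    heads = tails = 0
    for n in range(1, max_flips + 1):
        if rng.random() < p_heads:
            heads += 1
        else:
            tails += 1
        # With equal priors, P(H+ | data) depends only on heads - tails:
        # likelihood ratio = (p / (1 - p)) ** (heads - tails)
        lr = (p / (1 - p)) ** (heads - tails)
        post_plus = lr / (1 + lr)
        if post_plus >= confidence:
            return "H+", n
        if post_plus <= 1 - confidence:
            return "H-", n
    return "undecided", max_flips

decision, flips = simulate_decision(p=0.51, confidence=0.99)
print(decision, flips)   # with p = 0.5 + epsilon, expect thousands of flips
```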
That's where network information comes into play.2 Imagine that a group of agents is dealing with the same choice as you, and each of you can see when others settle on a decision and what that decision is. Once a decision is made by an agent, they do not change it. In this case, you can not only accumulate your own (private) evidence about the coin by tossing it multiple times, but also observe others and the decisions they make.3
For the intuition behind it, think about buying an item on Amazon – say, a new fancy pen. One way to decide whether the item is worth buying is to read its reviews. Another way is to observe what your friends think about it: i.e. do they buy it or refrain from buying it? The first is the private data/evidence you accumulate on your own, and the second is your network information.
The model setup
Now, back to our model. For simplicity, let's assume that the time passed corresponds to the amount of private data each agent accumulates. (Think of private data as the evidence you accumulate by tossing a coin multiple times: the more tosses are made, the more time has passed.) And as described earlier, each agent i has their own "confidence threshold"; call it 𝜃i.
We agreed earlier that we can use some formulas and derive the exact probability that it is one coin or the other given a sequence of outcomes. Let's call this probability the "belief" of a given agent i, and denote it yi(t). Note that the belief changes with time (i.e. with accumulated evidence); hence, it is written as a function of time t. For example, yi(0)=0: each agent starts from zero belief, since there is zero evidence.
So, to repeat, the belief function yi(t) of each agent i will be based on randomly generated evidence for that agent i. This can be implemented on a computer. Simply put, the computer randomly chooses the true coin – [H+] or [H-] – and then, for every agent i, it generates a sequence of heads and tails by tossing that coin for that agent; the agent then calculates their personal "belief" towards one or the other hypothesis using that generated evidence.4
Once, at some time t=Ti, yi(Ti) equals 𝜃i, agent i announces their decision that they believe the H+ hypothesis. Similarly, if yi(Ti) equals -𝜃i, they announce their belief in the H- hypothesis. Once an agent arrives at a decision, it is final. Think of it as: if you buy that fancy pen, you can't return it; and if you remove it from your bookmarks, you'll never come back to it.
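As a rough illustration, here is one way such private belief trajectories could be generated, measuring the belief as a running log-likelihood ratio as in footnote 4. This is a sketch under my own assumptions (one coin toss per time step; the parameter values and all function and variable names are mine, not the papers'):

```python
import random
import math

def private_beliefs(n_agents=5, p=0.6, theta=3.0, true_coin="H+", max_steps=10_000, seed=0):
    """For each agent, toss the (same true) coin once per time step and
    accumulate the log-likelihood ratio y_i(t); record the first time T_i
    at which |y_i| reaches the threshold theta, and the resulting decision."""
    rng = random.Random(seed)
    p_heads = p if true_coin == "H+" else 1 - p
    step = math.log(p / (1 - p))   # LLR increment for one Heads (-step for one Tails)
    results = []
    for i in range(n_agents):
        y, t = 0.0, 0
        while abs(y) < theta and t < max_steps:
            t += 1
            y += step if rng.random() < p_heads else -step
        decision = "H+" if y >= theta else ("H-" if y <= -theta else "undecided")
        results.append((i, t, decision))
    return results

for i, T_i, decision in private_beliefs():
    print(f"agent {i}: decided {decision} at t = {T_i}")
```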
Now, the beautiful part – how do we incorporate other people's decisions into our belief system? Let's first think through the intuition.
The model in action
Every time one agent arrives at their final decision, say H+, all other agents update their beliefs based on that information – specifically, they experience a "jump" in their belief towards H+. The intuition behind it is that if you see someone confident in their decision, you update your "priors" towards that belief, no matter what priors you had before (just like in Bayesian thinking).
Notice that if an agent was relatively close to their threshold favoring the H+ outcome, they would reach the threshold right after experiencing that "jump" in their belief towards H+. So once one agent arrives at their final decision, it triggers a wave: a group of other agents who were close to the threshold at that time immediately reach it and make a final decision in favor of H+ too.
Okay, let's now put a little bit of math into it. Before going into the complexity of groups with agents having different thresholds, let's observe the behavior of a group of agents that all have the same threshold 𝜃. Consider the first agent who arrives at their final decision (e.g. H+) at time T1, i.e. y1(T1)=𝜃. Everyone else sees that and knows that this agent has accumulated 𝜃 worth of evidence over time T1. Hence, they update their belief yi(T1) with that amount of evidence 𝜃 towards H+, which makes their new belief yi(T1)+𝜃. Naturally, as discussed earlier, some agents were "close enough" to the threshold that, when they update their belief by +𝜃, they immediately arrive at their final decision.
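In code, that first wave amounts to shifting every undecided belief by 𝜃 and re-checking the threshold. Below is a toy sketch, assuming identical thresholds and using my own names; it also shows that the agents swept up by the first wave are exactly those whose private belief was already non-negative:

```python
def first_wave(beliefs, theta):
    """After the first agent announces H+ (having reached +theta), every other
    agent adds theta to their belief; those pushed to +theta or beyond decide
    immediately. Returns (decided_indices, updated_beliefs).
    A toy sketch with identical thresholds; names and structure are mine."""
    updated = [y + theta for y in beliefs]
    decided = [i for i, y in enumerate(updated) if y >= theta]   # i.e. original y_i >= 0
    return decided, updated

# Agents whose private belief was already leaning towards H+ (y_i >= 0)
# are exactly the ones swept up by the first wave:
decided, updated = first_wave([-2.5, -0.4, 0.1, 1.9], theta=3.0)
print(decided)   # -> [2, 3]
```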
Now, note that the number of agents who arrived at a decision in the first wave, and the number of those who remained undecided, also reveal information. If those who have arrived at a final decision are the majority, it means that all of them had already accumulated private evidence supporting the H+ hypothesis, which makes H+ more likely to be true, so everyone's belief has to be updated further towards H+. If, on the other hand, those who arrived at a decision are far from a majority (i.e. most agents remain undecided), it means that the majority of the group has accumulated private evidence supporting the H- hypothesis, so beliefs have to be updated towards it.
To put it more formally, since the threshold is symmetric (+𝜃/-𝜃), observing an undecided agent j after the first wave reveals that yj(T1) had been in the open interval (-𝜃, 0) before the jump (+𝜃), meaning that the belief can be updated again, this time in the opposite direction, by the amount of private evidence the undecided agents had accumulated towards H-.5
Naturally, if some agents were close to the boundary, that causes another wave of decisions. And potentially another. And another. These waves continue until either everyone arrives at a final decision or no new agent arrives at a decision after the most recent update of their belief.6
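The whole sequence of waves can be sketched as a loop: collect the newly decided agents, apply a social update to everyone still undecided, and repeat until no one new crosses the threshold. In the papers the size of each update follows from Bayesian reasoning, including the correction from observing who stayed undecided (see footnote 5); the toy version below uses a deliberately simplistic, pluggable update rule and names of my own, and only illustrates the wave structure:

```python
def decision_cascade(beliefs, theta, social_update):
    """Toy wave loop: repeatedly apply a social update to the still-undecided
    agents until no new agent crosses the threshold. `social_update` maps the
    counts of newly decided (H+, H-) agents to a belief increment; here it is
    left as a pluggable function rather than the papers' Bayesian update."""
    decisions = {}                     # agent index -> "H+" or "H-"
    while True:
        newly_plus, newly_minus = [], []
        for i, y in enumerate(beliefs):
            if i in decisions:
                continue
            if y >= theta:
                decisions[i] = "H+"; newly_plus.append(i)
            elif y <= -theta:
                decisions[i] = "H-"; newly_minus.append(i)
        if not newly_plus and not newly_minus:
            return decisions, beliefs
        jump = social_update(len(newly_plus), len(newly_minus))
        beliefs = [y + jump if i not in decisions else y for i, y in enumerate(beliefs)]

# Simplest possible social update: +theta per new H+ decision, -theta per new H-
decisions, final_beliefs = decision_cascade(
    beliefs=[3.2, 1.9, 0.4, -0.6, -2.8], theta=3.0,
    social_update=lambda n_plus, n_minus: 3.0 * (n_plus - n_minus),
)
print(decisions)
```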
Back to the world
This model describes a very simplified version of the world, where everyone knows each other's thresholds and everyone sees everyone's decisions, not to mention all the assumptions made along the way. In the studies, they expand this model further into many versions – "maps" – of the world. For example, they split the population of agents into clusters of those who know each other's thresholds and those who don't. They also introduce the idea of an initial bias that agents can hold: that is, agents no longer start with an unbiased belief towards one or the other hypothesis, i.e. yi(0)≠0 for some or all agents i.
Among the results is a mathematically proven idea that fast-made decisions are highly affected by prior biases, whereas slowly made decisions often converge to a correct decision regardless of prior biases. Another result, which very much stuck with me for some reason (and inspired me to write this post), is that although fast-made decisions are highly affected by prior biases, they are the ones that help cautious deciders lead a group to a correct collective decision:
"We show that in a group of identical agents, a wrong first decision leads approximately half the network astray. However, in heterogeneous networks a wrong first choice is usually made by hasty, uninformed agents and only convinces others who are similarly quick to decide. Cautious agents can observe the decisions of early adopters and make the right choice. Thus, in diverse groups decisions by unreliable agents, even when wrong, can reveal the better option."
Final thoughts
In conclusion, surely math is hard. And often quite painful. Nonetheless, there is that beautiful part about it that connects the unconnected and turns what was only intuitive before into something formally proven. It reminds me of all the great entrepreneurial projects: the initial idea is all nice and beautiful, and may even seem magical and unreal to some; then the realization stage is all messy, tangled in bureaucracy and materialism; and then the final result, once fully developed, after all the work, ends up as nice and beautiful as it was intended in the beginning, even seemingly magical and unreal to some. And so is math. <3
References:
Heterogeneity improves speed and accuracy in social networks. B. Karamched, M. Stickler, W. Ott, B. Lindner, Z. P. Kilpatrick, K. Josić. Physical Review Letters 125 (21), 218302.
Bayesian evidence accumulation on social networks. B. Karamched, S. Stolarczyk, Z. P. Kilpatrick, K. Josić. SIAM Journal on Applied Dynamical Systems 19 (3), 1884-1919.
Fast decisions reflect biases, slow decisions do not. S. Linn, S. D. Lawley, B. R. Karamched, Z. P. Kilpatrick, K. Josić. arXiv, 2024.
Footnotes:
1. colloquium: https://en.wikipedia.org/wiki/Colloquium
2. Network information, or collective knowledge, refers to the "phenomenon where individuals rely on the conclusions and beliefs of others in order to form their own opinions and make judgments, particularly when detailed information is lacking" – an AI-generated definition on ScienceDirect.com.
3. Just as in the financial markets of a market-driven economy, each buyer and seller has their own private information that they use to make decisions (i.e. buy/sell), and the collection of this information helps set prices on the market even without each party openly sharing their private information. See:
The collective knowledge of markets (plancorp.com)
The collective wisdom of financial markets (peterlazaroff.com)
4. In this particular study, they use the log-likelihood ratio (LLR) to measure the "belief" of an agent, which is log( P(H+ | evidence) / P(H- | evidence) ).
5. Here, they construct the conditional probability density for the belief, using the previously described LLR, to define the "evidence amount" that each agent has accumulated by the time T of the first decision. They call it R+(T) for the H+ hypothesis and R-(T) for H-. Thresholds are symmetric, so R-(T) = -R+(T). The resulting belief increment after the first wave is defined as c1 = a1*R+(T) + (N-a1-2)*R-(T) = (2a1 - N + 2)*R+(T), where a1 is the number of agents who updated towards H+ in the first wave and N is the total number of agents. This means that if a1 > (N/2 - 1) – i.e. the "majority" arrived at decision H+ during the first wave – the new update favors the first-wave guess (H+), and c1 > 0. Otherwise – i.e. the "majority" stayed undecided after the first wave – the new update favors the opposite hypothesis (H-), and c1 < 0.
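As a quick sanity check of the sign, plugging in some hypothetical numbers of my own: with N = 10 agents and a1 = 6 deciding for H+ in the first wave, c1 = (2*6 - 10 + 2)*R+(T) = 4*R+(T) > 0, pushing the undecided further towards H+; with a1 = 3, c1 = (2*3 - 10 + 2)*R+(T) = -2*R+(T) < 0, pushing them towards H-.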
6. It is assumed that everyone first waits for all the wave-updates to happen before the accumulation of private evidence resumes, aka time stops (wow, magic).