-
The brain inside a Large Language Model is a random number generator
Suppose you ask a Large Language Model (LLM) the same question several times. Why do you get a different but related answer each time? For example, I asked ChatGPT the following question three times:
Question: Explain Artificial Intelligence in 20 words or less?
Answer 1: AI is technology that enables machines to learn, reason, and make decisions, mimicking human intelligence.
Answer 2: Artificial intelligence is the simulation of human intelligence processes by machines, including learning, reasoning, and problem solving.
Answer 3: AI is the creation of intelligent machines that can think, learn, and solve problems like humans, changing industries worldwide.
How can a machine respond to the same question with a different answer? At first blush this would suggest that the computer is thinking. The real reason, however, is astounding and explains why we are still a very long way from computers thinking for themselves.
I attended a seminar this week at which a PhD candidate in AI spilled the beans. LLMs use a weighting regime to calculate the probability of one word being followed by another. For example, if the first word is ‘I’ then the most common word to follow it, say 99% of the time, might be ‘am’. An LLM will follow ‘I’ with ‘am’ 99% of the time unless the probabilities change. Suppose the second most common word to follow ‘I’ is ‘want’, occurring 1% of the time. By drawing random numbers and adding them to the probability of ‘want’, the model can let ‘want’ displace ‘am’ in the word string. Training the LLM on billions of passages of prose should generate a reasonable number of patterns to recognise when calculating response probabilities. Using random numbers to perturb those probabilities makes the responses differ, giving the impression that the computer has ‘thought’ of a different answer.
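To make this concrete, here is a minimal sketch in Matlab (my own illustration, not anything taken from ChatGPT itself): with the probabilities of the two continuations of ‘I’ held fixed, a uniform random draw decides which word is emitted, so repeated runs produce different word strings even though nothing about the underlying ‘knowledge’ has changed.
% Toy illustration (my own): a random draw picks the word that follows 'I'
words = {'am', 'want'};                  % hypothetical continuations of 'I'
p_am  = 0.99;                            % hypothetical probability of 'am' from training
u = rand(10000, 1);                      % one uniform random number per simulated response
choice = 1 + (u > p_am);                 % 1 -> 'am', 2 -> 'want'
for k = 1:2
    fprintf('%-4s chosen %5.1f%% of the time\n', words{k}, 100*mean(choice == k));
end
Run it twice without fixing the seed and the mix of responses changes; that is the entire source of the ‘variety’ in the answers.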
To demonstrate the use of random numbers to simulate thought, I asked ChatGPT to generate some Matlab code to estimate an LLM. The first function that popped out was:
% Function to initialize weights and biases
function [W, b] = initialize_parameters(layer_sizes)
W = {};  % cell array for weight matrices
b = {};  % cell array for bias vectors
for i = 2:length(layer_sizes)
    W{end+1} = rand(layer_sizes(i), layer_sizes(i-1)) * 0.01;  % small random values
    b{end+1} = zeros(layer_sizes(i), 1);                       % zero initialization for biases
end
end
where the line that fills W draws small random values to seed the weight matrices. This seeding is what leads the LLM to choose nuanced variations in response to the same question. A random number generator is the brain behind an LLM.
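To underline the point, here is a short sketch of my own (it assumes Matlab’s rng function and the initialize_parameters function above): fix the random seed and the ‘random’ weights, and hence the responses they drive, are identical on every run.
% My own illustration: with the seed fixed, the 'random' weights never change
rng(42);                                   % fix the random number generator's seed
[W1, ~] = initialize_parameters([4 3 2]);
rng(42);                                   % reset to the same seed
[W2, ~] = initialize_parameters([4 3 2]);
disp(isequal(W1, W2));                     % prints 1: identical weights every time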
Two things jump out of this exposé. First, the response of an LLM is actually deterministic: in the absence of a random number generator, the response will be the same forever. Second, and more concerning, an LLM lacks originality; it will just parrot back results that it has encountered during training. Unless new thought makes its way into the training set in sufficient abundance to materially impact the word-structure probabilities, the development of language will stagnate. An LLM will not generate anything new. This might be fine for some languages, such as French, where government departments are dedicated to preserving the language. But progressive and adaptive languages such as English will just morph into whatever ChatGPT’s internet crawlers have settled upon.
-
How much is that DOGE in the window? A lesson in Government
So it looks like we will get to see who wins out of the Godzilla Government v King Kong Elon Musk battle now that the US election has delivered Trump back to the White House. I have argued that this alone will have been worth electing Trump for. Musk is confident of being able to strip out $2 trillion worth of costs from government, and it will be key to see how he goes about it. Government is essentially a service industry, which differs from Musk’s traditional manufacturing background: whereas his approach to cost cutting has been to trim back to what is physically minimal to produce a car or a rocket, he will need to explore what is organisationally minimal to achieve a government objective. The Department of Government Efficiency (DOGE) may question the need for a particular government function, but it will simply have to accept the presence of any policy that Congress installs. The main role of DOGE is to judge whether the policy is being implemented efficiently.
One can imagine all sorts of approaches to making government more efficient, not least by using technology to solve coordination problems. The ability of one government department to share information with another is almost non-existent in most countries, and moving toward an open system could make an individual’s interaction with the government faster and cleaner, as well as reduce duplication and outright fraud. Cutting costs does not just mean cutting people. Providing the means to access government services quickly and efficiently is a quantifiable benefit.
Understanding the budgeting process is another focus for attention. The private sector, as Elon Musk well knows, has to pitch its ideas to the capital markets to gain funding. Government has a non-market based approach to raising capital. Is there a way to make the business of government more attuned to the oversight function of the capital markets? The US Treasury has a monopoly on raising capital for the government but there may be opportunities for ‘start-up’ programs to seek funding independently of the general appropriation framework. This would effectively allow the government to partner with the private sector to deliver a valuable service and each to share in the equity thereof. The National Science Foundation funds tons of research, much of which goes unnoticed. Motivating the administrators to seize upon and commercialise ideas can both manage costs and generate profit.
It feels like Musk has been given Vivek Ramaswamy to co-lead the DOGE in order to rein Musk in and soften the message he may choose to deliver. But it is naive to think that DOGE is just about firing people and shuttering failed departments. There will, and should, be some of that, but the lasting contribution will be making a monolith more like a collection of microeconomic actors that fit together.
-
US Election-eve special
With campaigning by Trump and Harris having effectively concluded, it now comes down to the grubby business of voting – if you have not already done so.
Some observations…
Early voters are people who just want to get it over with. US election campaigns are excessively long compared with the UK system, where a six-week campaign is considered an eternity. The fact that half the US electorate lines up early to vote suggests they would prefer a shorter campaign.
So who is going to win? My Money versus Mischief model of voter behaviour is instructive. To recap: swinging voters are driven by their back pocket and by whom they can piss off. Harris has made some cash promises but not followed through with specifics. Trump has resiled from pork-barrelling but been quite specific about how he would bring down energy prices. Both candidates have gifts for voters, so they are tied in this dimension. Trump, however, clearly dominates the Mischief vote since a vote for him is a clear challenge to the liberal media and the Democrat elite. A vote for Trump will surely piss off the fourth estate. My prediction is for a Trump victory on this point alone.
The margin will be large unless there is a protest vote. A protest vote would benefit the Greens’ candidate, reducing the margin between Trump and Harris. Any protest, however, will harm the traditional left-wing candidate and siphon proportionately more votes away from Harris. In the absence of a protest vote, the Trump vote will be expressed in the voting booth itself, as voters make up their minds when presented with the actual decision.
Voter turnout is expected to be ‘high’ but this still means that 30% of the eligible electors don’t vote. In my book this is an extraordinary number given the importance of the US in international affairs. I accept that not voting is a choice, but how many of those who do not participate actually make that conscious decision?
-
The Private Debt renaissance
Michael Milken is generally regarded as the father of the junk bond market. His innovation extended high-yielding debt to companies to ensure that profits would be paid out to investors rather than held as retained earnings. The theory is that forcing companies to pay out profits is better than leaving those companies to invest their free cash in sub-optimal projects. There are many corporate finance aspects to issuing high-yield bonds, with the focus on the shareholder/manager conflict of interest within a company. Less attention has been paid to the impact of investor preference for risk.
I am regularly asked ‘…what is the next big thing in Finance?’ I think that investors are tired of bearing equity risk without control, so other ways to earn a competitive rate of return are being actively sought. Private debt could solve this problem. The long-run equity premium is about 7% over cash which, with a cash return of 2-3%, means that a rate of return of 9-10% is about what investors can expect in nominal terms from investing in equities. This comes with equity risk, however, which averages about 18% as measured by the standard deviation of equity returns. Is there a way to lock in, say, a 10% return while reducing or limiting risk? This is where private debt comes in. Private debt contracts have the ability to transfer risk from the shareholders back to a firm’s managers. The investors specify a market rate of return that they want to receive from a company, which then tries to meet that objective as interest. If the company meets the interest payment the investor is happy; if not, the company goes into receivership, which re-allocates claims. Private equity firms are leading the charge for private debt. These institutions can deal with both debt issuance and bankruptcy.
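As a rough back-of-the-envelope check on those numbers (my own sketch in Matlab, using only the figures quoted above):
% Back-of-the-envelope comparison using the figures quoted above
cash      = 0.025;            % mid-point of the 2-3% cash return
premium   = 0.07;             % long-run equity premium over cash
equity_mu = cash + premium;   % expected nominal equity return
equity_sd = 0.18;             % standard deviation of equity returns
debt_mu   = 0.10;             % the contracted private debt rate in the example
fprintf('Equity: %.1f%% expected return with %.0f%% volatility\n', 100*equity_mu, 100*equity_sd);
fprintf('Private debt: %.1f%% contracted return, with default risk shifted onto managers\n', 100*debt_mu);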
The private debt phenomenon differs from Milken’s model in that it aims to achieve a macro-finance outcome: a specified rate of return with limited risk. Milken’s focus was the micro-economics of forcing managers to pay out profits. These are different objectives achievable with the same financial instruments. The interesting aspect of the private debt market is the scale that it can reach. Almost every listed and unlisted company can participate in the private debt market.
-
Godzilla Government v King Kong Elon Musk
I have argued before that Elon Musk is Minimum Average Cost man. Minimum average cost is the long-run equilibrium cost of producing a good or service: the point where marginal cost equals average cost, which also equals price in equilibrium. Musk’s biography is peppered with examples where he strips out cost from his car production lines or his rocket business, often cutting too much and having to reinstall processes to patch up design holes that come from excessive cost cutting. This approach to costs is the dynamic that leads him, in effect, to find the correct cost setting.
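For readers who want the textbook mechanics, here is a small sketch (my own toy cost function, purely illustrative) showing that average cost bottoms out exactly where it crosses marginal cost:
% Toy cost function of my own: C(q) = 100 + 2q + 0.05q^2
q  = 1:200;
C  = 100 + 2*q + 0.05*q.^2;
AC = C ./ q;                 % average cost
MC = 2 + 0.1*q;              % marginal cost, dC/dq
[~, i] = min(AC);            % output level with minimum average cost
fprintf('AC is minimised at q = %d, where AC = %.2f and MC = %.2f\n', q(i), AC(i), MC(i));
% At the minimum, AC and MC coincide; in a competitive equilibrium price settles here too.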
Musk has made friends with Donald Trump, with an apparent agreement for Musk to head up a ‘Government Efficiency Commission’ should Trump win the Presidential election. This will be absolutely fascinating to witness and will be a real-life Godzilla v King Kong match-up. It is reason, alone, to elect Trump, since Musk would try to take the chainsaw to government while the Government would try to sandbag his every action. Governments do not face competition that drives costs down, and it is unclear what each individual department’s objectives are. Furthermore, it takes someone with hands-on experience of managing cost, rather than a Congressional committee, to implement the correct dynamic. From Musk’s perspective, a government function that has zero benefit should have zero cost – which means firing more than the tea lady.
Musk was a guest speaker at Trump’s rally in Pennsylvania on the weekend and it was interesting to watch a hushed crowd listening to his comments. Trump’s rallies are not known for their discipline so the fact that they were quietly listening to Musk’s every word is testimony to his credibility and standing.
In terms of social benefit, Musk’s Government Efficiency Commission promises far more than putting men on Mars. I can envisage him asking the heads of every department what they do and to justify the cost. This simple question will tie department heads in knots. Since price equals minimum average cost in a competitive economy, the cost of running a government function is a proxy for the price it needs to charge. Opening up the government to private competition is an obvious result. Musk has already beaten NASA at this game, so he will be difficult to challenge.
But in hitching his Tesla to the Republicans he risks reprisals from the Democrats should they be re-elected. The Dems are voracious users of ‘lawfare’ to damage critics, and they could easily cost Elon a fortune by targeting X (formerly Twitter) and cancelling SpaceX’s contracts with NASA. His companies could get regulated into bankruptcy. From this perspective, Elon’s decision appears reckless, inviting the downside. That said, his risk-taking track record is pretty good, so Trump might take heart.
-
US Presidential election: if you don’t have anything to say then start singing
Why are exit polls so accurate while pre-election voting-intention polls are largely rubbish? The UK election showed this in stark terms: opinion polls leading up to the election showed a close race, whereas the exit poll accurately measured a Labour Party landslide victory with an 80 seat majority (in actuality the majority came in at 82). The UK exit poll sampled 2,200 electors at 200 polling booths during the first half of the voting day. The statistical sample size was no larger than that of the surveys of pre-election voting intentions. It is tempting to say that exit polls capture the opinions of actual voters whereas voting-intention polls are muddied by non-voters, but that does not explain an error this systematic.
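For scale, a quick sampling-error calculation (my own sketch, using the standard 95% margin-of-error formula for a proportion) shows that sample size cannot be the explanation:
% Margin of error for a sampled proportion at 95% confidence, worst case p = 0.5
moe = @(n) 1.96 * sqrt(0.25 ./ n) * 100;       % in percentage points
fprintf('n = 2,200:  +/- %.1f points\n', moe(2200));
fprintf('n = 20,000: +/- %.1f points\n', moe(20000));
% Both are within a couple of points, so the big misses in voting-intention
% polls must come from what is being measured, not how many people are asked.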
My theory is that the electors who matter most, the swing voters, make up their minds in the polling booth itself. They may enter the voting station having answered a pre-poll survey one way but then, when actually pulling the lever, decide to do something else, which they report on exit. The pre-election opinion polls are therefore just as accurate as the exit polls; they are simply measuring a different thing. Intentions are very different from actualities.
I cannot help thinking that this will again be the case in the current US Presidential election. The media now report that Kamala Harris is ahead in ‘polls of polls’ by a small margin, but it is difficult to fathom a small margin given who she is running against. Donald Trump evokes just as much vitriol amongst his haters (liberals, assassins, the media) as he evokes patriotism amongst his supporters. It is hard to imagine a close outcome in this environment; the winner will be whoever manages to stick in the mind of the swing voter in the polling place on election day. There are two points here,
- The US Presidential race is wide open now and will be until election day
- The winner, if you agree with my thesis, will be the candidate who leaves the most indelibly positive impression on voters lever-in-hand.
To this second point, Harris’s game plan seems to be to stay out of the public arena and let other actors define Trump negatively (US celebrities, the media, Trump himself when off-message). Trump, on the other hand, so long as he remains on message (MAGA, immigration), is left free to present his case, even in front of an unsympathetic media. It strikes me that Harris’s controlled absence prevents her from creating that positive recall amongst swing voters on election day. Trump seems to be in the box seat and it is Harris who has let him get there…
There’s an old saying in advertising that goes ‘…if you have something to say, say it; if you don’t, sing it…’. This seems apt. Perhaps Harris should start practising her arias while it’s not too late?
-
The Fed…the UK, Trump and Biden
The US Federal Reserve and Interest Rates…
After two decades with US interest rates way below their long-term target (the ‘neutral policy position’ is about 4% for cash), the Federal Reserve must be feeling comfortable. Yesterday’s CPI print is the first negative month-on-month figure, a few more of which might put the inflation bear to sleep. So what would the Fed be thinking of doing? Absolutely nothing, is my thinking. With the economy bounding along, unemployment plumbing lows and inflation under control, there is no reason for the Fed to lower interest rates without a crisis to rescue.
Yet the markets are predicting the Fed will cut rates as early as September. When rates were stuck below 1% for years, I remember the markets were almost entirely bearish about rate hikes that never materialised. The ‘lower for longer’ camp had almost no members. The ‘higher for longer’ camp would seem the place to be now, but it, too, has no friends. I can see the cash rate remaining at 5% at least until the next recession.
… the UK election numbers and what they mean for Trump-Biden.
My theory of the thoughtful swinging voter is that they make up their minds when they are in the polling booth, based on money and mischief. They may enter the voting venue intending to do one thing but they can easily change their mind while there. This is impossible to observe directly; however, the ‘exit poll’ is a near proxy for measuring this behaviour. People are more likely to tell you what they have just done, since it is fresh in their memory and they can fill out a mock ballot with the confidence of anonymity. The exit poll from the UK election predicted Labour to win 410 seats and the Conservative share of the vote to be severely reduced by a competing right-wing party, Reform UK, making the UK’s first-past-the-post system favour Labour. In fact, the Labour party won 412 seats and the Conservative party’s vote was decimated. The exit poll was amazingly accurate, based on a stratified sample of 20,000 exiting voters at 150 polling venues. (Labour’s vote share was only 33%, meaning that the huge seat margin will be cut or eliminated if the conservative forces can recombine.)
So what does this say about the looming Trump-Biden electoral battle? First, the propensity for voters to decide their vote in the polling place means that either candidate can still win. Second, the money and mischief theory of voter behaviour that I have previously espoused is relevant here. [Swinging voters respond to promises of money and vote for the candidate who is more likely to piss off the intelligentsia.] Trump is firmly inside the voter’s hip-pocket, even to the extent of promising to abolish income tax. Mischief may or may not favour him, however. If Joe Biden becomes the presumptive loser in the election – through his age and signs of dementia – then Biden could pick up a sympathy vote and be seen as the underdog. Voters may well just repeat their vote from the previous election, expecting others to vote in favour of Trump. This would reduce Trump’s share of the vote. Ironically, Trump might benefit if Biden stands aside, a new Democrat candidate receives the support of the biased media, and Trump achieves underdog status.
Biden’s performance in the Presidential debate was embarrassing for him and Trump’s reaction was, impressively, restrained. How do you disarm the sympathy for a declining octogenarian? This will become a theme of the campaign if Biden continues.
-
Speed is not the solution to Generative AI
What is it about human thought and creativity that cannot be captured by a computer?
I have always said that humans are smart but slow, while computers are dumb but fast. A human plus a computer is very powerful.
Generative AI is all about making computers think for themselves. Despite the doomsday warnings and related fantasy, computers remain dumb workhorses. Researchers have decided to buy ever more computing power to try to emulate human creativity. But this hasn’t worked, nor do I think it will, because no matter how much information a computer can access, access is probably not the key to creativity.
It is worth emphasising here that there is a difference between knowledge and intelligence. Providing a computer with as much memory and information as possible will certainly make it knowledgeable, but will it make it intelligent? Elon Musk famously diverted a shipment of Nvidia processors from Tesla to X to provide the computing power to know everything. But Musk is wrong in believing that this will trip the intelligence switch. A dumb computer with access to a faster processor and all the information in the world is still a dumb computer. Knowing what has failed historically is valuable insofar as you can avoid repeating failure, but it does not stimulate creative success.
Why are humans slow thinkers? Rather than being a weakness, perhaps this is a trait of creativity? Humans do not know everything but, given a limited dataset, can conjure up brilliant solutions. Rather than combing through billions of potential combinations, could it be that restricting knowledge to a limited set stimulates a leap of thought that creatively bridges the gap? Asking questions is a human trait and the first step toward finding a solution…
-
AI: A statistics package in the hands of monkeys
There are two ways to approach an empirical question.
1. Formulate an hypothesis based on some physical, structural or behavioural model. Test the hypothesis using a relevant dataset. Interpret the results relative to the model predictions.
2. Dump a bunch of data into a bucket, stir and implement the correlations that appear.
I have been intrigued by the veil of mystique with which AI distinguishes itself from statistics. One journalist poses the question aptly: “If machine learning is a subsidiary of statistics, how could someone with virtually no background in stats develop a deep understanding of cutting-edge Machine Learning concepts?” [Joe Davison, Medium, June 28 2018, ‘No, AI is not just glorified statistics’]. The journalist’s intent is to argue that AI is fundamentally different from statistics – that understanding the statistical algorithms is not necessary. I disagree. I question whether a ‘deep understanding’ of these cutting-edge concepts was ever attained.
Let me explain the dangers of a statistics package in the hands of monkeys. Recently we were visited by an AI-person asking us to invest millions in his stock-picking model. Stock picking has been a fascination of statisticians for centuries, so there is an established literature of acceptable practice amongst researchers for proving their success or otherwise. Measures such as information ratios, Sharpe ratios, Sortino ratios, benchmark-relative returns and so on are standard. But this AI-person was out and about seeking investors without any of this supporting evidence. When quizzed about how he chose his investment universe and the subset of securities that made up his portfolio, the response was that ‘the AI chose it’. Add to this that, where he did calculate some measures of success, the calculations were wrong (an information ratio of 27, without a benchmark!). Financial datasets are notoriously unstable, with outlying observations driving inference and spurious correlations galore. Ceding control of your dataset to an AI algorithm is hardly comforting…
…to me. However, this seems to be standard AI practice. The bigger the data bucket and the more crunch time needed, the better the story, it seems. AI practice is to specify, say, a linear relation between some X-variables as they affect a Y-variable, calculate the coefficients and just use them. It is a fact, however, that the optimal coefficient estimates from a multivariate regression are functionally related to the variances and covariances in the data. The response coefficients are positively related to the covariances of X with Y, and negatively related to the variance of X. It is also a fact that these coefficients will be exactly the same irrespective of whether the user identifies as an AI data-scientist or a statistical researcher. Clearly, where the end result is the same and the method for estimating the relations is the same, the same thing is being generated. So who are the monkeys here?
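To make the point concrete, here is a small sketch (my own simulated data, nothing from the visitor’s model): the least-squares fit of a linear model and the textbook ratio of covariance to variance produce exactly the same coefficient, whoever presses the button.
% My own illustration with simulated data
rng(1);
n = 1000;
X = randn(n, 1);
Y = 2 + 3*X + randn(n, 1);           % true slope of 3 plus noise

beta  = [ones(n,1) X] \ Y;           % least-squares fit: intercept and slope
S     = cov(X, Y);                   % 2x2 covariance matrix of X and Y
slope = S(1,2) / var(X);             % Cov(X,Y) / Var(X)

fprintf('Regression slope:  %.4f\n', beta(2));
fprintf('Cov(X,Y)/Var(X):   %.4f\n', slope);   % identical, to machine precision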
It seems to me that the AI field fails to distinguish itself from other users of the mathematical techniques employed in its model-building. The monkeyness shows up as a lack of thinking about the datasets and the problem that they are trying to solve. One of the first things you are told is to ‘look at your data’, yet most people start calculating means and variances as soon as they come into possession of a new dataset. The AI-types just stick the data in a bucket and stir! They don’t want to know where the data came from, how it is calculated or what it represents. They just want results – means and variances and covariances – that may repeat themselves or not.
The monkey that visited us seeking capital (and whom I described above) is a classic case of someone destined to repeat the mistakes statisticians made centuries before him, and who really doesn’t know how naive he is. He will patch up his model when it fails to deliver actual investment results and arbitrarily impose constraints to improve historical performance, while not improving anything in the future. He will lose money for investors and not know why.
It does seem odd that AI researchers have embraced the Finance industry without bothering to learn from the mistakes that have gone before them. Do they genuinely believe that they are the first to apply mathematical techniques to these datasets? A monkey with a statistics package is a dangerous combination.
-
Sovereign reserves and the USD
Many market observers have noted that holdings of US Government bonds by Sovereign Reserves Managers (central banks and sovereign wealth funds) have declined in recent years whereas the USD remains well bid. These commentators view this as contradictory, arguing that the long-term implication of lower Treasury bond holdings should be a weaker dollar.
This is incorrect. I am very proud to say that I wrote a paper together with Min C. Lie (1) almost 20 years ago that argued for separate bond and hedging benchmarks for Sovereign investors. At that time, if a Sovereign held 40% of its assets in US Treasuries it also held 40% of its assets in USD. This is inefficient. Min Lie and I argued that the Sovereign could reduce its holdings of US Treasuries in a dollar-neutral way by using currency forward contracts to buy exposure to dollars. This broke the link between the portion of debt in a portfolio and the currency.
My suspicion is that Sovereign investors are reducing their holdings of US Treasuries while hedging their currency exposure back to USD. This leaves the USD unaffected by the trade (selling USD is offset by buying USD forward) while other reserve assets are purchased in place of the bonds, e.g. Chinese bonds or emerging market debt. The easiest way to express a benchmark is to nominate fully-hedged, partially-hedged or unhedged to base currency. Again, my suspicion is that Sovereign investors are adopting a fully hedged benchmark.
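A stylised example of the accounting (my own numbers, purely illustrative) shows why the dollar is untouched:
% My own stylised numbers: the bond switch and the forward hedge net out
usd_lost_selling_treasuries = -100;  % USD 100m of Treasuries sold, proceeds moved into non-USD bonds
usd_bought_forward          = +100;  % USD 100m bought back via currency forwards (the hedge)
fprintf('Net change in USD exposure: %d\n', usd_lost_selling_treasuries + usd_bought_forward);
% Treasury holdings fall, other reserve assets rise, and net USD exposure is unchanged.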
Another observation is that Sovereign investors are continuing to divest their US Treasury holdings quarter after quarter. This slow switch is taking place as they reinvest their coupon receipts into other bond markets. With coupons on sovereign debt relatively low, the process of moving to a new benchmark could take years. Expect the decline in US Treasury holdings by Sovereign investors to continue for many years.
(1) The paper is Fisher and Lie (2004), ‘Asset allocation for Central Banks: Optimally combining liquidity, duration, currency and non-government risk’, in Risk Management for Central Bank Foreign Reserves, Bernadell et al. (eds), European Central Bank Press.