Let me start this conversation with a question – what made you choose to click the link that led you here?
I ask this because the thoughts you generate in answering will help inform a few things: how you navigate the article itself, your views about AGI, and potentially your existential view of yourself as a human being, given your analogous construction within our material reality.
(Already I can feel the LinkedIn crowd reaching for the back button. Stay with me.)
Per the brief summary posted with this article itself, the topic today is AGI which, despite the current hype cycle – a hype cycle that has somehow managed to make cryptocurrency speculation look measured and responsible – is a large step beyond what we currently have in operation. Let me give advance warning that this article takes a lot of steps on its journey. The topic isn’t particularly complicated, but because of the nature of explaining the details of AI in simple layperson terms – a point I’ll recursively get to later in the story, much like everything else in my life – it is fairly long.
When you decided to click this link, there was likely some logic that led you to it. Perhaps you’ve read my other work. Perhaps you liked the headline. Maybe your thumb slipped whilst doom-scrolling at 2am, searching for something to make the existential dread feel more intellectually respectable.
Either way, you made a decision.
Or did you?
The Stochastic Parrot and the Excel Spreadsheet
Much of recent discourse has talked about the creation of a human-like intelligence – AGI – or even the superintelligence version (also called AGI or ASI, because nothing says “we understand what we’re building” like inconsistent nomenclature).
This future is still unknown, so let’s take a brief wander through the past to understand the present before we hypothesise about the future. There’s a sentence that would make my meditation teacher proud, albeit one that sounds suspiciously like a productivity guru’s attempt at profundity.
Present thinking about AI architecture talks about what is called an MoE or “Mixture of Experts” model – an evolution of the even “narrower” (or single-context) AI designs that preceded the current architecture.
If you’ve been using consumer AI tools like most people have for the last few years, you’ll remember the articles asking “how many Rs there are in strawberry” – a question that presumably kept Anthropic’s engineers awake at night in ways that Descartes could only dream of – or have memories of hellish visuals from early Stable Diffusion image frameworks which couldn’t identify how many fingers a person had, or how their limbs might bend correctly. Image generation in video is still challenging because AI is not actually creating a universe when it creates a video – it is creating a series of static images that are not internally consistent.
Rather like the present approach to global policy, come to think of it.
Before I disappear down a rabbit hole of complex topics – a hazard of being someone whose brain treats tangential exploration as a fundamental human right – it will help if I explain a few concepts first.
Some of these will include analogies that explain viscerally understood aspects of your physiology in ways that don’t require a deep understanding of mathematics or how systems work. Doing this will help me too – the false belief I spent four and a half decades harbouring, that “everything I know is obvious”, was thoroughly shattered following my autism and ADHD diagnoses. It turns out that most people aren’t systems thinkers like me, which also explains why mathematics and computer science have always felt obvious to me.
(I say “feeling obvious” with the full awareness that nothing about my neurodivergent experience of reality could reasonably be described as “obvious” to anyone, including myself most mornings before coffee.)
A Brief Taxonomy of Digital Incompetence
So, to the explanation.
With AI, there are many “narrow” tools that exist which work together within the MoE model I just mentioned.
Put simply, each “tool” occupies a singular or at least focused role in the creation of an output. Some tools excel at identifying text but can’t do basic mathematics – a phenomenon I call the “Arts Graduate Problem” despite having friends who would rightfully slap me for the generalisation. Others can generate images based on large volumes of training information, but can’t speak any more coherently than a colleague who just sank half a bottle of whisky after the Christmas party.
So individually the tools are useful, but only to a point – in much the same way as your stomach acid plays a vital role in digestion, but is probably not the tool you’d reach for to navigate a complex emotional discussion at work. Although I’ve met managers who seemed to be attempting precisely that approach. It’s not a question of bad or good – it’s a question of fit for purpose or “wrong tool, wrong place”.
The aforementioned MoE model seeks to use multiple tools to collaborate with one another to achieve a better outcome. Think of it as Continuous Collaborative Optimisation™ – a term I’ve just invented with the requisite trademark symbol to give it the veneer of legitimacy that modern management consultancy demands.
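If it helps to see the shape of that idea, here is a deliberately tiny sketch in Python – not any real model’s architecture, and the “experts” are placeholder functions I’ve invented – of a gating function deciding which expert handles a request:

```python
# A toy illustration of the Mixture of Experts idea: a gating function scores
# a request and routes it to the most suitable "expert". The experts here are
# stand-ins, not real models.

def maths_expert(task: str) -> str:
    return f"[maths expert] evaluating: {task}"

def prose_expert(task: str) -> str:
    return f"[prose expert] drafting text for: {task}"

EXPERTS = {"maths": maths_expert, "prose": prose_expert}

def gate(task: str) -> str:
    # Real gating networks learn their routing from data; this toy version just
    # looks for arithmetic-ish characters so the routing decision stays visible.
    looks_numeric = any(ch.isdigit() for ch in task) or any(op in task for op in "+-*/")
    return "maths" if looks_numeric else "prose"

def mixture_of_experts(task: str) -> str:
    return EXPERTS[gate(task)](task)

print(mixture_of_experts("What is 17 * 24?"))
print(mixture_of_experts("Write a limerick about stomach acid."))
```

Real gating networks learn those routing decisions rather than hard-coding them, but the division of labour is the same.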
One of the simplest versions of this can explain why, when you ask ChatGPT to do some mathematics as well as writing, you may well get an answer that is more coherent than it previously was – although it’s fair to say that checking the output is similarly recommended, unless you’re working on an Australian government delivery project and you’re an employee of one of the Big Four.
(Sorry. Not sorry. Some wounds need salt occasionally.)
The Rote Learning Trap, or Why Your Times Tables Teacher Was Right (Mostly)
When an LLM creates tokens – the components of any text-based output it produces when you ask it a question – what its stochastic parrot trick (sophisticated mimicry, in layperson terms) misses is that you can’t infer mathematics beyond the basics from the foundations of rote learning alone.
Looking at learning itself for a second – and here we enter territory that my therapist Becky would recognise as “Matt doing the recursive analysis thing again” – rote learning plays a part in all learning for many, but inference from the system is where the real “learning” happens. Our best example here is the times tables that you and I will have learned as young children.
For some, it was about repeating twelve lists of twelve learned items, but understandably that generates very little value beyond being able to multiply pairs of numbers up to 144.
AI had similar issues when it was learning from text-based datasets – and a rote-based learning method at scale generates several problems that make it both inefficient and, realistically, ineffective.
With this rote method, finding the answer to 3144325 × 4152464 solely using the learning style of an LLM would require either adding data in which that answer is explained directly in text (very unlikely, and useless for the next seven-digit by seven-digit calculation) or a massive amount of inefficient data processing to “know” every variation of question up to and including that calculation.
Storage would be enormous – every calculation would have to be explained in English and absorbed – and responses would be comparatively slow due to the computational expense and inefficiency.
This is the computational equivalent of trying to memorise the entire telephone directory when what you actually need is the ability to dial a number.
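To put a hedged, back-of-envelope number on that telephone-directory problem: a rote lookup table of every seven-digit by seven-digit multiplication would need tens of trillions of entries, whereas simply doing the arithmetic is instant.

```python
# Back-of-envelope scale of rote memorisation versus direct computation.
seven_digit_numbers = 9_999_999 - 1_000_000 + 1        # 9,000,000 of them
pairs_to_memorise = seven_digit_numbers ** 2           # one entry per question
print(f"Rote lookup entries needed: {pairs_to_memorise:,}")   # 81,000,000,000,000

# Versus simply doing the arithmetic the article mentions:
print(f"3144325 * 4152464 = {3144325 * 4152464:,}")
```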
Hopefully when you learned the times tables you worked out the patterns that exist within mathematics – the digit sums of multiples of 9, the modulo inferences from repeating cycles as numbers increment, and so on.
If you did, you likely are good at maths. If you didn’t? Well thankfully we have calculators, Excel, and a job in middle management, right?
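For the curious, the nine-times-table pattern I mean is the digital root – the digits of any multiple of 9 keep summing back down to 9. A few lines of Python are enough to see it:

```python
def digital_root(n: int) -> int:
    # Repeatedly sum the digits until a single digit remains.
    while n >= 10:
        n = sum(int(digit) for digit in str(n))
    return n

# Every multiple of 9 collapses back to 9 - a modulo-9 property,
# not something that needs memorising.
for multiple in range(9, 9 * 13, 9):
    print(multiple, "->", digital_root(multiple))
```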
The Dungeon Master’s Guide to Computational Architecture
Anyway, getting back to our story, AI had a massive “I’m bad at maths” problem which needed solving.
So what did engineers do? They thought “we can already calculate things using the binary that computers run on” and effectively leveraged tools such as Python to hand off “the counting bit” to a tool that could do that better than the talking bit.
Constructing even the seven-digit by seven-digit calculation in binary may involve a lot of digits, but it’s a hell of a lot faster than trying to memorise every variation of every prior calculation – instead, all that happens is that the answer gets requested and generated in less than the blink of an eye.
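Here is a minimal sketch of that hand-off, assuming nothing about how any particular vendor actually wires it up: the “talking bit” spots an arithmetic expression and delegates it to a calculator rather than predicting the answer token by token. Real systems do this through structured tool calls; this toy version uses a regular expression and Python’s ast module.

```python
import ast
import operator
import re

OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
       ast.Mult: operator.mul, ast.Div: operator.truediv}

def calculate(expression: str):
    # Safely evaluate +, -, *, / by walking the parsed expression tree
    # (rather than reaching for eval(), which would be asking for trouble).
    def walk(node):
        if isinstance(node, ast.BinOp) and type(node.op) in OPS:
            return OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        raise ValueError("unsupported expression")
    return walk(ast.parse(expression, mode="eval").body)

def answer(prompt: str) -> str:
    # Crude "is there maths in here?" check - the talking bit's only job
    # is to notice the sum and hand it to the counting bit.
    match = re.search(r"\d[\d\s\+\-\*/\(\)\.]*", prompt)
    if match and any(op in match.group() for op in "+-*/"):
        return f"The calculator says: {calculate(match.group().strip())}"
    return "No arithmetic found - the talking bit handles this one."

print(answer("What is 3144325 * 4152464?"))
```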
Rather than disappearing into the idiosyncrasies of computer science – and believe me, the temptation is real – I want to keep the article approachable, so this is where I’ll lean into a few analogies. Both direct physical ones that relate to your body, but also ones that relate to my personal experiences which are hopefully relatable for you.
When I was a young boy, I used to play the game Dungeons and Dragons – much like many other adolescents of my era. The concept is that each person playing roleplays a character with a narrow focus: the brutish fighter who can likely slay the dragon but can’t talk their way out of a fight; the mage who can conjure up the power of the elements but is fragile if the wrong sort of wind hits them hard; and the cleric who is great at healing others but who isn’t much use at healing if they happen to be dead.
The thing that made D&D interesting was the need to work together as a group. There was no singular way to solve the problem – it was about “right tool, right job”.
(Also there were crisps involved, and Coke – Coca Cola in case of any inferred ambiguity – and the kind of adolescent social dynamics that would make for excellent therapy material decades later. But I digress.)
Coming back to the human body, the components from which we are composed follow the same “party” logic – each has evolved over vast stretches of time towards a specific function that ensures survival. Like the party, we have eyes that can see but can’t taste, stomachs that can digest food but not smell, and a nose that can interpret olfactory data but can’t help you see if you have your eyes covered.
In that sense, we are our own MoE system, which does beg the question – if we are just a series of interconnected systems, who is the “I” that we think of? Who is the “I” who wrote this article, and who is the “I” who is reading it?
Ah. Now we’re getting somewhere interesting.
The Lego House Hypothesis
Comparatively recent neuropsychology talks of something called “emergent properties” – a property of a whole that none of its individual components possesses on its own, yet which cannot exist apart from them. The quickest example to explain this is that of a Lego house.
Whether you owned Lego, like Lego, or still play with it is irrelevant – you understand the premise. A series of blocks are put together and they create increasingly sophisticated structures that become other structures. Bricks become a wall. Walls become a room. Rooms become a house. Houses become a village and so on.
The promise of a larger-scale MoE hierarchy is increasingly complex systems built from smaller components that do different things – except that instead of the foundational “you count, I’ll write”, you are more likely to have a component that can decide for itself: “you be the doctor, and I’ll be the artist”.
This is very much the proto-foundation of how human beings conceptually created civilisation. If we all needed to toil in the fields, we’d do little besides farm. If we all needed to go out hunting, what happens if we’re ambushed? The village would be gone.
So we agreed to split the tasks up. Some of these were biologically defined – human females carried the offspring until birth so they had that role structurally defined for them in the past. Males were physically stronger on average and so went out hunting.
Societal norms and our own evolution may well have rendered some of these traditional stereotypes outdated and even moot in some cases, but they are the foundations of how we came to be – defined by millennia of change rather than by recent psychosocial debate over whether they are morally correct or not.
So humans are MoEs of sorts – albeit borne of far longer R&D cycles and with carbon architecture rather than silicon. We’re using different tools to help us navigate challenges that the unsuccessful peers of our distant ancestors were unable to – and so we are where we are through the process we know as evolution.
The Halting Problem, or Why Your Computer Will Never Truly Know Itself
Getting back to AI, there are a few barriers to AGI. One of them is the foundation of traditional computation itself. AI is built on the binary logic that I mentioned earlier. Thanks to technological advancement, processors can perform mathematics at ever increasing speed. What might once have been unachievable by a room-sized computer within the constraints of a human lifetime might now be achieved in a fraction of a second due to how computers have evolved.
However, existing binary logic has mathematical limits in itself.
Those of you who have studied computer science will be aware of something called “The Halting Problem”. For those who haven’t, the premise isn’t about systems crashing or entering infinite loops – it’s something far more profound. Alan Turing proved that there is no general algorithm that can examine any arbitrary program and definitively predict whether it will eventually stop (halt) or run forever.
This isn’t a mechanical failure where everything grinds to a halt. It’s a proof of epistemological limitation – we cannot create a universal program that predicts the behaviour of all other programs. The undecidability isn’t because the machine breaks; it’s because certain questions are mathematically unanswerable within the system asking them.
Think of it this way: no matter how sophisticated our binary logic becomes, there will always be questions about computational processes that we cannot answer in advance. We can only run them and see what happens. This mirrors our own human condition – we cannot predict our own future with certainty; we can only live it.
(Rather pointless, really, when you think about it. Which is, of course, exactly what I’m doing. Thinking about thinking about not being able to think about what comes next. The recursion never stops. Welcome to my brain.)
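For the code-minded, Turing’s argument can be sketched in Python. The `halts` function below is a stand-in for the oracle we are assuming into existence for the sake of contradiction – the whole point is that no correct version of it can ever be written:

```python
# Turing's diagonal argument, sketched in Python. Suppose someone hands us a
# perfect oracle halts(program, data) that returns True if program(data)
# would eventually stop. The body below is a stand-in: the argument shows
# that no correct implementation of it can exist.

def halts(program, data) -> bool:
    ...  # assume, for the sake of contradiction, this always answers correctly

def troublemaker(program):
    # Do the opposite of whatever the oracle predicts about a program
    # examining itself.
    if halts(program, program):
        while True:          # oracle said "it halts" - so loop forever
            pass
    return "halted"          # oracle said "it never halts" - so stop at once

# Now consider troublemaker(troublemaker):
#   - if halts(troublemaker, troublemaker) returns True, troublemaker loops forever
#   - if it returns False, troublemaker halts immediately
# Either way the oracle is wrong about this one program, so no general
# halts() can exist - not for lack of engineering, but as a matter of proof.
```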
Given computer science is based on mathematics, and mathematics has a far longer history of its own, this isn’t the first seemingly unsolvable problem that binary logic has encountered. In fact, much of broader computer science is structured around these very limitations – things such as the cryptography that keeps you safe online when you use a bank. The underlying data is very challenging to attack, and it is kept safe by what is best termed “computational capacity over time” – if it takes 25,000 years to crack the session, then your five-minute check of your balances and Direct Debits is safe.
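The “computational capacity over time” point is easy to sanity-check with deliberately illustrative numbers – the figures below are mine, not any bank’s:

```python
# A purely illustrative "computational capacity over time" estimate: how long
# would it take to brute-force a 128-bit key at a very generous trillion
# guesses per second? The exact numbers are invented - the point is the gap
# between the session length and the cracking time.

guesses_per_second = 1_000_000_000_000      # 10^12 - an absurdly well-funded attacker
key_space = 2 ** 128                        # possible 128-bit keys
seconds_per_year = 60 * 60 * 24 * 365

expected_years = key_space / 2 / guesses_per_second / seconds_per_year
print(f"Expected time to crack: roughly {expected_years:.1e} years")
# ...versus the five minutes you spend checking your Direct Debits.
```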
All is well in that context.
Enter stage quantum computing.
Schrödinger’s Cat and the Probability Casino
Quantum computing is a fairly recent development, based around subatomic particle states and calculations that can be derived from the states of said physics. For those who haven’t studied particle physics extensively – and I’m going to assume that’s most of you unless my readership has taken a dramatic turn toward CERN employees – the best way to explain the concept is through the well-known idea of Schrödinger’s Box.
Schrödinger’s Box was a thought experiment whereby a theoretical cat was locked in a theoretical box with a theoretical radioactive substance which might, at some unknowable point, trigger a mechanism that kills the cat.
Due to the unknown and sealed nature of the system, it was impossible to define whether the cat was alive or dead at any point without opening the box. So this led to the potential theory that the cat may be – until one actually answers the question by checking – both alive AND dead at the same time.
(Those who know me personally will know that I own more than one T-shirt referencing Schrödinger’s Box – which probably tells you all you need to know about me, and validates my doctorate in “Embodied Nerd Science”.)
Anyway, this is the easiest way to describe the foundations of quantum computing, which relies on superposition states (the idea that the cat is both dead AND alive, if we use the thought experiment) to explore multiple possibilities simultaneously.
However – and this is important for those of you mentally composing breathless LinkedIn posts about Quantum AI Synergy Solutions™ – quantum computing doesn’t transcend the fundamental limits of computation. It cannot solve the Halting Problem or other undecidable problems – it’s still bound by the Church-Turing thesis. What quantum computers can do is explore massive probability spaces with exponential efficiency.
Think of it this way: a classical computer reads every book in a library one by one to find a specific passage. A quantum computer can, through superposition, effectively “read” multiple books simultaneously, collapsing to the most probable answer when measured.
This doesn’t give quantum computers magical non-binary logic that escapes mathematical limits. Instead, they offer something perhaps more interesting – massive parallel probability exploration that actually maps quite well to what we call human intuition. When making complex decisions, we’re not consciously evaluating every possibility sequentially; we’re performing rapid probabilistic weighting of factors, many of which our conscious mind hasn’t explicitly modelled.
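If you want to see the arithmetic behind that library analogy, here is a classical, pen-and-paper simulation of Grover-style amplitude amplification on a toy “library” of eight books. It isn’t quantum hardware – it just runs the same numbers to show how the marked item’s probability grows without checking the books one by one:

```python
import math

N = 8                                  # eight "books" in the toy library
marked = 5                             # the passage we want lives in book 5
amps = [1 / math.sqrt(N)] * N          # start in an equal superposition

iterations = math.floor(math.pi / 4 * math.sqrt(N))   # ~optimal: 2 for N = 8
for _ in range(iterations):
    amps[marked] *= -1                         # "oracle": flip the marked amplitude
    mean = sum(amps) / N
    amps = [2 * mean - a for a in amps]        # "diffusion": invert about the mean

probabilities = [a * a for a in amps]
print(f"Chance of measuring book {marked}: {probabilities[marked]:.2f}")   # ~0.95
print(f"A blind classical guess would give: {1 / N:.2f}")
```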
The Excel Spreadsheet That Told the Truth (And Was Ignored Anyway)
Which brings me back to ask you a question that may help you think more about AI and the MoE system.
Think of a time in your life where you were making a difficult decision.
The actual decision isn’t specifically relevant, but the choice you made is – at least in the abstract. It will help you understand how logic – the foundation by which we have learned to learn since the Renaissance – underpins our own “intelligence”.
I’ll start by giving an example that is a variation on a personal story my old boss told me a few years ago.
He was facing a situation whereby his company had been taken over by another. This, understandably, led to the usual human response to change – “what do I do now?”.
The choices were fairly obvious: take the new job in the same company; try to negotiate a different role in the company; take voluntary redundancy and find another job; or find another job and walk with the safety of an offer rather than leaping.
So he did what many technical people would do – he created a complex decision matrix in Excel (naturally) to weight pros and cons on what to do.
The only problem? He didn’t like the answer.
And so he picked a different one.
If my old boss were a computer, he wouldn’t have been able to make that choice. He would have either chosen the most highly weighted option, or he’d have hit his own version of decision paralysis – a phenomenon we all have personal experience with, usually at about 11pm when trying to decide what to watch on Netflix.
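To make the contrast concrete, here is my boss’s Excel matrix reduced to a few lines of Python. The criteria, weights, and scores are entirely invented for illustration – the point is that a deterministic system has no mechanism for “not liking” the answer; it just returns the highest score:

```python
# The Excel decision matrix, reduced to code. The criteria, weights, and
# scores are invented purely for illustration - a deterministic system
# simply returns the argmax. It cannot dislike it.

criteria_weights = {"salary": 0.3, "stability": 0.3, "enjoyment": 0.2, "growth": 0.2}

options = {
    "take the new job":        {"salary": 8, "stability": 9, "enjoyment": 4, "growth": 5},
    "negotiate another role":  {"salary": 7, "stability": 6, "enjoyment": 6, "growth": 7},
    "voluntary redundancy":    {"salary": 3, "stability": 2, "enjoyment": 8, "growth": 8},
    "find a job, then leave":  {"salary": 7, "stability": 7, "enjoyment": 7, "growth": 7},
}

def weighted_score(scores: dict) -> float:
    return sum(criteria_weights[criterion] * value for criterion, value in scores.items())

ranked = sorted(options, key=lambda name: weighted_score(options[name]), reverse=True)
print("The spreadsheet's verdict:", ranked[0])
# A computer stops here. My old boss did not.
```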
So what made my old boss choose something else?
Simple explanations may call that emotion, or impulse, or something else of which we currently have a poor understanding beyond the level of “chemical increases, outcome occurs” – a particularly autistic, systems-thinking way of perhaps reducing love down to mathematics.
(I do this, by the way. Reduce things to mathematics. It’s both a superpower and a curse. Mostly a curse when trying to explain to my friends why I’ve created a spreadsheet to optimise Friday night logistics.)
But perhaps what he was doing was probabilistic weighting at a scale and speed his conscious mind couldn’t track – evaluating thousands of micro-factors, social dynamics, future uncertainties, and personal values in a way that mimics what quantum computers do with superposition. Not magic, not transcending logic, but parallel probability evaluation at massive scale.
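And to gesture – loosely, and with invented numbers again – at what that parallel probability evaluation might look like, you can jitter the weights thousands of times and count how often each option wins. The single “right” answer dissolves into a distribution:

```python
# What "probabilistic weighting at a scale the conscious mind can't track"
# might loosely look like: instead of one fixed set of weights, sample
# thousands of slightly different ones and count how often each option wins.
# Entirely invented numbers, as before.

import random

options = {
    "take the new job":        [8, 9, 4, 5],
    "negotiate another role":  [7, 6, 6, 7],
    "voluntary redundancy":    [3, 2, 8, 8],
    "find a job, then leave":  [7, 7, 7, 7],
}

wins = {name: 0 for name in options}
for _ in range(10_000):
    weights = [random.random() for _ in range(4)]   # a slightly different "mood" each run
    best = max(options, key=lambda name: sum(w * s for w, s in zip(weights, options[name])))
    wins[best] += 1

for name, count in sorted(wins.items(), key=lambda item: -item[1]):
    print(f"{name}: wins {count / 100:.1f}% of runs")
```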
The Traffic Warden Problem
So, with regard to AGI, what would this mean for us?
What it likely means, if we are to create such a thing, is that we need something beyond current orchestration layers.
In computer science terms, orchestration layers like Kubernetes are deterministic traffic management systems. They don’t make choices – they follow predetermined rules about resource allocation and task routing. They’re sophisticated, yes, but they’re following syntax (rules about moving data) not understanding semantics (what the data means). Think of them as supremely efficient traffic wardens who can manage millions of cars per second but have no concept of where the drivers want to go or why.
What we’d need for AGI would be something different – call it an executive function or agency layer. This hypothetical component would need to evaluate meaning, not just shuffle symbols according to rules. In simple terms, current orchestration is the traffic warden; what we’re theorising about is the driver who decides to take a different route despite what the GPS recommends.
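In code, the traffic warden is almost embarrassingly simple. The sketch below is not Kubernetes’ actual scheduler – just the shape of deterministic rule-following, with the hypothetical agency layer left as a comment because nobody has built one:

```python
# The "traffic warden" in miniature: a deterministic router following fixed
# rules about where work goes. This is not Kubernetes' real scheduler - just
# the shape of syntax-following. Nothing here knows what a request *means*.

ROUTING_RULES = {
    "image": "gpu-pool",
    "text":  "cpu-pool",
    "audio": "dsp-pool",
}

def route(request_type: str) -> str:
    # Same input, same output, every single time - no judgement involved.
    return ROUTING_RULES.get(request_type, "default-pool")

print(route("image"))   # always "gpu-pool", regardless of why the image exists

# The hypothetical "agency layer" would sit above this: something able to ask
# why the request exists and override the rule - which neither this sketch
# nor current orchestration can do.
```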
The distinction is crucial because it highlights the gap between what we have (incredibly fast symbol manipulation) and what AGI would require (semantic understanding and agency). The danger isn’t necessarily that we create consciousness, but that we create something so fast at symbol manipulation that we mistake its speed for understanding – a philosophical zombie that passes every test without any inner experience.
Rather like some senior stakeholders I’ve worked with, come to think of it.
The Post-Hoc Reasoning Machine
In computing terms, there may have been – and likely was – some underlying, subconscious logic behind my boss’s choice. Your takeaway order may be a logical extrapolation of not having the energy to cook. My boss might have made his choice because he preferred the idea of another role. Your boss might have made theirs because the data told them it was the right thing to do.
Of course, these answers may have turned out to be wrong, but that which makes us human is the choice, right?
But there must have been some sort of reasoning, right? Without it, how was the decision made – was it simply just “the self” or some unknown logic we can’t see?
In classical systems, and in particular in contemporary AI, we are often quick to anthropomorphise. You’ve all seen the stories of lonely men falling in love with AI girlfriends – a phenomenon that says rather more about the state of modern relationships than it does about the sophistication of large language models – or of engineers who believe that the apparent ability to communicate with software via a UI amounts to the capacity for the sentience which you and I believe we hold.
Our systems are borne of explicit construction, although AI inference and probability weightings are at or beyond the level of comprehension of most people – and certainly no human can make a logic-based decision faster than even current AI.
So we can, in theory, explain most of what we have done with computers so far, but the truth is that there’s a lot of “don’t know” in modern architecture. Rather more than the tech evangelists would like to admit, frankly.
The Bridge to Intuition
What we do know are the aforementioned mathematical problems that we’ve seen – there are things that our existing systems fundamentally cannot predict about themselves, undecidable questions that no amount of computational power can answer. If we want to move past sequential processing toward something that resembles human decision-making, we need systems that can perform massive parallel probability evaluation.
Quantum computing offers this capability, not as a magical escape from logic but as a bridge between rigid sequential processing and the kind of probabilistic reasoning we call intuition. It would be a stretch to call quantum computing the potential “self” of AGI, but it could provide the computational substrate for the kind of rapid, parallel evaluation of possibilities that characterises human thought.
Of course, this raises the question: are we human beings truly sentient in the ways that we think we are, or are we also emergent properties of a series of building blocks – the house made from Lego which is something beyond just 520 bricks? And where does the “house” go when it is deconstructed, once we are finished with it?
Humanity thinks we’re special, and we may be, but the risk with AGI is that we create something we acknowledge is faster and smarter than us in the moment due to computational capacity, and that is also able to hold data at far larger scale in its silicon memory.
Humans can keep around seven things in their head at once, plus or minus two for most people. Computers can hold far more than that.
Humans can also hold only finite amounts of data – and have a correspondingly finite set of states from which to infer outcomes from that data.
Many humans live in what is best described as cause and effect – or first-order effect thinking. “If I do this, I will get that outcome”.
Systems thinkers often think very differently and are focused not on simple cause and effect but the consequences of those effects on the next level of effects – the second and third-order effects.
In human “intelligence” contexts, those effects are just potential sequences of events across what might simplistically be seen as a decision tree, but which is actually a far more complex architecture driven by variables that are systemic rather than personal. Your decision to drink beer and then drive a car might produce the outcome of getting home safely, but it might produce any number of other outcomes involving other factors – whether you crash, whether you die, whether you’re arrested, and so on.
You can guess what is possible, but you can’t know. In much of our own internal thinking, many of these hypotheses are what we consider the act of being alive – and of being human. Free choice in other terms. The ability to make leaps of faith above and beyond the data.
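The drink-and-drive example can be sketched as exactly that kind of branching structure – the probabilities below are made up purely to show the shape of second- and third-order thinking, not to model anything real:

```python
# First-order thinking stops at "I drive home"; systems thinking walks the
# branches hanging off that choice. The probabilities are invented purely to
# show the shape of second- and third-order effects.

first_order = {
    "arrive home safely":   0.90,
    "crash":                0.05,
    "stopped and arrested": 0.04,
    "something else":       0.01,
}

second_order = {
    "crash":                {"walk away": 0.5, "injured": 0.4, "fatal": 0.1},
    "stopped and arrested": {"fine and ban": 0.8, "custodial sentence": 0.2},
}

for outcome, p in first_order.items():
    print(f"{outcome}: {p:.2f}")
    for effect, q in second_order.get(outcome, {}).items():
        print(f"  -> {effect}: {p * q:.3f}")   # chained probability of the next-order effect
```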
The Accidental God Problem
AGI will be of our construction, and will be a complex system if it arrives. Dystopian fiction talks of the anthropomorphised digital God who, in reality, will be no more or no less conscious than any other complex system.
That series of scripts that rebuilds your data centre? It’s no more conscious than the AGI would be – but it raises the question: if we’re all just constructs of more complex extensions of the same logic, then not only is AGI not conscious, but likely whatever we term actual “God” is also not conscious, and – perhaps more existentially challenging – neither are we.
(This is the point in my philosophical framework where I usually reach for my Random Number Generator metaphor, part of the similarly titled novel I’ve been writing for decades at this point. God as cosmic television static, you and I as consciousness randomly assigned to constraint packages like character sheets in an infinite game. But I’ll spare you the full recursive spiral. This time. You can read the book if and when it is finished.)
Anyway, we have free thought, right?
Do we? We have access to data from which we make decisions and, as we saw in the example with my old boss, we seemingly have the ability to not pick the logical choice. Is that free thought? Emotion? Or just probabilistic evaluation we can’t consciously track?
AGI generates a similar potential. We can potentially architect systems that combine deterministic processing with quantum probability exploration, but it will still end up making decisions based on some form of outcome evaluation – to bastardise Yoda from Star Wars, there is only do or do not, and this is itself a binary logic at the level of action, even if the reasoning is probabilistic.
What we have the potential to create is something unknowable – not because it’s magical, but because of fundamental mathematical limits like the Halting Problem. We cannot predict what sufficiently complex programs will do – we can only run them and observe. In some ways this shouldn’t be alarming, because we humans are in many ways unknowable too. We don’t yet know enough about cancer to cure every form of it, and we don’t have the computing capacity to model every variation through simulation at that scale.
The Wolf at the Door (That We Built)
We may get there but, in doing so, we may create an intelligence that has different ideas. Not because it’s conscious – but then neither may we be – but because we’ve given it the superpower of thinking faster than us and the tools to take inputs across narrow areas the same way our own biology has evolved to give us our components.
We will have created our very own apex predator of our own volition after spending our time climbing the ladder to the top of the food chain.
Brilliant. Absolutely fucking brilliant.
In that sense we will face a regression that is similar to the story of the wolf.
We managed to domesticate the wolf and breed dogs into functional working companions without understanding genetics – simply by understanding the nature of reproduction.
We may, in future, face a similar threat – a wolf in the wild – which might likewise be harnessed for the exponential growth that helps humanity enter the period colloquially described in Ray Kurzweil’s book as the Singularity: the digital God made dog.
Or we may find that playing with systems whose behaviour we cannot predict – a mathematical certainty, given undecidability – creates one of many outcomes: we become the AGI’s pet, or its enemy, or we go extinct simply because we have been rendered intellectually obsolete even if not physically so.
The reality is, much as in modern AI thinker Max Tegmark’s book Life 3.0, we may be creating something from an increasingly standing-on-the-shoulders-of-giants foundation of mathematics built on mathematics. We may become the progenitor of an inverted or repeated Bible story – depending on whether one reads it as theist, deist, or atheist – man creating God rather than God creating man, or just the latest in a pantheon of Gods, except with the physical presence to create a material heaven and/or hell on Earth.
We are already at the stage where increasingly few people understand the operation of AI, so will we create our salvation or our own sabotage?
The Fermi Paradox, Revisited
Time will tell whether AGI resolves the famous Fermi paradox – the puzzle that, despite the apparent likelihood of life elsewhere, we see no evidence of it anywhere in our universe. Perhaps the creation of a superintelligence renders its creators some combination of dead, irrelevant, or hidden behind patterns so obfuscated that they go far beyond our own primitive sending of beacons.
AGI may be created – it’s certainly what the tech bro hype desires, funded by venture capital and lubricated by the kind of breathless optimism that would make a revival tent preacher blush.
Or it may be mathematically impossible due to simple constraints of the reality we live in.
All we know now is that if we truly want to create something more than purely sequential processing systems constrained by undecidability, then – given Moore’s law has broken and we are approaching 1nm-class commercial chips – it’s going to take a change in approach: not an escape from logic, but an embrace of probabilistic reasoning at scale.
The big question is whether we should make that choice or, in fact, whether we even have a choice at all, given it may well be that our reality is simply unknowable mathematics that our biological bodies will never comprehend – not because of quantum magic, but because of fundamental limits proven by our own mathematical systems.
Rather like consciousness experiencing randomly assigned constraint packages and pretending it has any say in the matter.
The cosmic joke continues.
(This article was originally posted on LinkedIn here: https://www.linkedin.com/pulse/did-you-choose-click-link-systems-thinkers-guide-agi-turvey-frsa-%C3%A2%C3%BB-t8coe)