I hate almost everything about contemporary AI. Above all, that it has become synonymous with a single paradigm: generative AI. I hate that it is based on a computational variation of a flawed understanding of cognition, connectionism. That it is made possible by vast quantities of data and parameters, instead of a high quality of sense-making and understanding. I hate that it is a matter of statistical extrapolation, and that, in such a way, its "depth" is achieved by ultra-stacking of shallowness. That it is non-deterministic, so you get different answers to the same question, even when all the needed input is there. I hate that most corporations skip the unsolved data governance and data quality problems and jump directly on the AI bandwagon. I hate the individual me-too deluge and the associated monetizing of hype and anxiety.
Then, you may wonder, if I hate it so much, why is it that I use AI all the time (fact), and why I now even have the audacity to write about what it takes to use it well? (Yes, that's what this essay is about.)
Well, the answer is the name and the claim: AI. As I explained in another essay, AI is neither artificial nor is it intelligence. It is natural. And the fact that we call it "artificial", in other words, non-natural, is also natural.
Let me explain. If the smallest material unit is the atom (or quark, lepton, string, preon, or whatever your favorite theory claims), then the smallest unit of cognition is distinction. Everything is a matter of making distinctions, and distinctions on distinctions. One of the best alternative mathematical systems, the Calculus of Indications, is entirely based on distinctions. They create something from nothing, and everything from everything else. The deepest and widest theory of society, that of Niklas Luhmann, is also based on a deep understanding of the role of distinctions.
We have developed zillions of theories and practices based on distinctions such as subject/object, nature/culture, nature/nurture, mind/body and, of course, natural/artificial. Some distinctions have more utility than others. There are unquestioned distinctions that are the foundations of some scientific cathedrals. But let's not succumb to the temptation to deviate here. In our case, it's worth focusing on the common distinction between those that are born and those that are made. And yet, if in one sense our body ends at our skin, in another it does not. When the hammer is ready-to-hand, then it is the hand.1 The mind is not in the brain, not because it is extended into the environment, but because it is a relational phenomenon.2
Our cognition emerges from our interaction with our environment, including the part we create ourselves. One such part, of increasing significance, is AI. The current AI might not be the one we wish we had, but it is the path-dependent adjacent possible that we do have. So let's make the best of it. But how?
By balancing trust and suspicion.
Let me remind you that AI was going nowhere, even after waking up a decade ago, until 2017, when a paper called Attention Is All You Need3 changed the game. After that, it took a few years and a few billion dollars to bring AI from the most obscure to the most talked-about topic. So, if for LLMs, or at least those built on transformers, self-attention is all they need, what is it that a human needs to use AI effectively?
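An aside for the technically curious, to make the parallel concrete: the self-attention of that paper is, at its core, the scaled dot-product operation

$$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{QK^{\top}}{\sqrt{d_k}}\right)V$$

where $Q$, $K$, and $V$ are the query, key, and value matrices derived from the input tokens, and $d_k$ is the dimension of the keys. In plain words, every token weighs the relevance of every other token before contributing to the prediction. Keep that weighing in mind; what follows is its human counterpart.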
My answer is: trusticion. That's a portmanteau of trust and suspicion. If attention is all a machine needs to learn from text, then trusticion is all a human needs to learn from a machine that has learned from text.
It may look specific to AI. It's not. And it may look simple. Again, it's not.
Trusticion
Nowadays, there’s unanimous agreement that the main problem with LLMs is their occasional hallucinations.
If we indulge in this anthropomorphic narrative, then it's certainly not true that LLMs hallucinate only occasionally and that this is a problem. They hallucinate always; that is how they work.
An increasingly large part of those hallucinations and confabulations, to various degrees, coincides with what we can qualify as useful. We tend to consider a generated text useful if it is factually correct or if we find it insightful. But that doesn't change the fact that for an AI based on an LLM alone, 100% of what is generated is hallucinated.
LLMs get better, and the useless part of their hallucinations shrinks. But it is still there, in a different average amount depending on the type of task. So LLMs have task-dependent thresholds of reliability.
It is not only LLMs that have task-dependent thresholds of reliability. The increasingly capable Large Reasoning Models (LRMs) can be so impressive that they fool us into thinking they are trustworthy. Yet, while LRMs are good at medium-complexity tasks, they are not so good at low-complexity tasks and very bad at high-complexity tasks.4
It’s up to us to develop the ability to both trust and critically assess what we receive, shaping new muscle memory and continually updating our attunement. That’s remarkably similar to what’s needed to increase the chances of serendipity. Serendipitous events can arise from noise, error, or something alive, but we must be able to tell what’s relevant from what’s not. The same applies to social media: the sheer volume of disinformation, misinformation, and noise can either brainwash us or lead us to discard it all as useless and miss valuable opportunities for learning or serendipity.
Trusticion needs to be applied fractally: across all models, within a particular model, for a specific interaction, and even for a single response. In what we trust, there is both trust and suspicion, and the same goes for what we suspect to be total slop. An annoying bout of logorrhoea might contain something valuable, while something that appears pristine could be 1% wrong, but in the part that matters most.
And one more thing, in relation to an earlier essay on AI. Realizing that AI is a fourth-order observation is necessary but insufficient for developing trusticion. Yet it shows that trusticion is not that different from self-attention.
Shorter Wings
Adaptations that give a survival advantage can serve to avoid danger or to get food more effectively. In most cases, they take a long time. The toxic skin of poison dart frogs took 30 million years to develop. Human depth perception took twice as long. Adaptations related to obtaining food seem to be faster, but are still in the range of a million years. The opposable thumb in humans took two million years. The long neck of giraffes took a similar time.
But there are exceptions. Some adaptations can serve both purposes, avoiding danger and improving food supply, and they can develop surprisingly quickly. Such is the case with cliff swallows.
It took only 30 years for their wings to get shorter. Cliff swallows tend to build nests on bridge supports, and they often get killed by passing vehicles. Shorter wings allowed for quicker vertical takeoff, and the number of road-killed cliff swallows significantly declined.5 Shorter wings also helped them adapt to climate change by increasing their maneuverability and making them better at catching the fewer insects left during cold snaps.
Trusticion in humans is like the shorter wings of cliff swallows. It needs to develop fast. It may help us avoid being hit by something big moving fast, and help us catch the rare insectsights contained in the increasing amount of noise.
That insight of Heidegger, important as it is from a philosophical point of view, has also been confirmed repeatedly over the last few decades by both neuroscience and cognitive science. Every “tool is an extension of the hand in both a physical and a perceptual sense” (Iriki et al., 1996). Since the seminal paper by Iriki et al., this finding has been confirmed and further explored (Serino et al., 2015).
References:
Iriki, A., Tanaka, M., & Iwamura, Y. (1996). Coding of modified body schema during tool use by macaque postcentral neurones. Neuroreport, 7(14), 2325–2330. https://doi.org/10.1097/00001756-199610020-00010
Serino, A., Canzoneri, E., Marzolla, M., di Pellegrino, G., & Magosso, E. (2015). Extending peripersonal space representation without tool-use: Evidence from a combined behavioral-computational approach. Frontiers in Behavioral Neuroscience, 9. https://doi.org/10.3389/fnbeh.2015.00004
There is a popular theory, the Extended Mind hypothesis of Clark and Chalmers (Clark & Chalmers, 1998), which indeed challenges the neurocentric understanding of the mind. However, I’m more inclined to subscribe to the view that the mind is not located in the brain, not because it is spread beyond it, but because, when the mind is understood as a relational phenomenon, it doesn’t make sense to speak of location at all (Di Paolo, 2009).
References:
Clark, A., & Chalmers, D. (1998). The Extended Mind. Analysis, 58(1), 7–19.
Di Paolo, E. (2009). Extended Life. Topoi, 28(1), 9–21. https://doi.org/10.1007/s11245-008-9042-3
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, L., & Polosukhin, I. (2017). Attention Is All You Need (No. arXiv:1706.03762). arXiv. https://doi.org/10.48550/arXiv.1706.03762
See the recent paper, The Illusion of Thinking.
Brown, C. R., & Brown, M. B. (2013). Where has all the road kill gone? Current Biology, 23(6), R233–R234. https://doi.org/10.1016/j.cub.2013.02.023