Wordless Thinking – Tamseel's Weblog

Henrik posted a new essay (When is it better to think without words?) some days ago, and I had been holding off on reading it. I think I do this because Henrik’s essays feel too sacred to be read casually. And it turned out I was right. I literally jumped to my feet, several times, as I read it today at dawn.

What makes Henrik’s essays so lovely is this: it’s as if there’s the hum of a melody you heard somewhere in childhood, and its details have faded (perhaps you only heard it from far off, and your ears didn’t have the capacity to perceive those details in the first place). That faded hum stays in your mind for days and months and years. Then someday you turn your attention to someone, and they start playing that precise melody that has been taking up your mind for so long, with such sharp and crisp details that your ears savor it with great attention so as to fill in the missing ones. And as the melody is about to end, you are startled to realize that there is more to it than you had ever heard, and that what you had been repeating was merely a segment of an extended symphony.

I think I have been doing this thinking-without-words thing for a few years, though in a much more rudimentary form than Hadamard’s. This description seems oddly familiar:

He also saw something that looked like equations, but as if seen from a distance, without glasses on: he was unable to make out what they said.

this type of deep, consciously-blurry concentration

Often, when reading something or paying attention to something for a long time excites my mind too much, I go for a walk, which sometimes lasts an hour or more, and during which I think without words¹. I feel a strange tension, because there’s this wide panorama, really really wide, and I can see bizarre connections between remote things. But when I try to zoom in and look clearly at the details, to check whether an apparent connection makes any sense, the whole image shatters and falls apart. And so I am left empty-handed, retaining neither the wide panorama nor any conclusion about the validity of any particular connection or pattern. It’s a very unsettling feeling.

This was how I had described that feeling a while ago:

You can replace “thinking in pre-concepts” with “thinking without words”.

The thought reference was the essay Think by Talha Ashraf:

[Deep thinking] depends on not only one level away from direct results but then another level away where you think about the effects of the effects and then again the effects of those effects and then also try to think about the results of these combinations. The only problem with this is that these levels exist as preconcepts, things that cannot even be defined with words because they are too vague at that point. Most of them you probably cant even identify as concepts so you wont be able to write them down and if you cant write them down you run into a very basic problem that everytime you start thinking about them you have to start from scratch. You cant start again from where you left off becuase the end was not a concept but a combination of vague concepts and their effects. This means that the only way to think in depth is through uninterruped chains of thought. You can only have uninterrupted chains of thoughts if you have time in solitude and the depth of your thoughts are then limited by the longest uninterruped duration of thinking that you can have. …

Now there is one caveat in solitude allowing the longest chains of thought. Which is that you can keep thinking about vague concepts in vague ways without ever actually turning them into something concrete. And this is where a lot of people get stuck at. After you have spent sufficient time in your thoughts you actually want to try writing your vague thoughts by converting them into words. You are basically trying to think in “type” where instead of interrupting your thoughts to write, you think in writing by just starting to write down your thoughts.

The thread in that screenshot was my afterthought on the difficulty I was having with the “thinking in type” described in the thought reference. I was feeling this tension, becoming ever more skeptical of my whole thinking (like, is the panorama even there, or am I suffering from a thinking-schizophrenia?), because when I attempted to re-think those thoughts in words, they just wouldn’t come; without that multi-dimensionality there was nowhere to begin. That was even more problematic because, if I could somehow grasp them in a concrete manner, I could at least shrug them off, saying, nah, they don’t make any sense. But now I couldn’t even do that, because absence of evidence is not evidence of absence. But also, if there’s no evidence, then on what grounds am I standing? I can’t live suspended in the air.

Reading Henrik’s essay clarifies a lot of things.

It seems the problem I have is this. What I have called the panorama is a sort of web of associations in a higher-dimensional space. It feels blurry because the brain’s equivalent of eyes is physically incapable of seeing it distinctly. What we can look at is a representation of it that has been compressed into a lower dimension. But compression with minimal loss is very resource-intensive. My brain is low on processing power, which makes the compression computationally slow. My working memory is also very limited, so the intermediate results that pile up exhaust it before the computation completes. The process crashes.

The way to get anywhere, it seems, is not to try to compress the whole panorama at once, but only chunks of it that are large enough to contain useful associations. If I can somehow carry at least two remote points and the association between them² from the high-dimensional space into the low-dimensional one, I will have something concrete to build upon. And the good thing is that our hardware is flexible. If we keep exercising these compression computations, our working memory expands and our processing speed improves, and thus we can bring larger and larger chunks from the high-dimensional space into the low-dimensional one³.

Looking at the writings on my weblog, it seems I have been able to run a handful of computations, compressing very small chunks of that wider panorama, without crashing. For those of you who have seen me: I feel I have progressed about as much with this exercise as I would if I had gone to a gym someday⁴.

But the thing is, this whole affair is not about compression at all. It’s about building a better model in the higher-dimensional space, a model that is closer to the truth. That is why we compress it into the lower-dimensional space in the first place: so that we can scrutinize it more thoroughly, looking for contradictions and flaws, fixing the parts that need to be fixed and demolishing the parts that need to be demolished, and thus updating the high-dimensional model⁵. A very useful by-product, though, is that we can share our findings (based on scrutiny of the compressions) with others and read the findings others share, and by re-inspecting those findings for ourselves, speed up the model-iteration process by orders of magnitude.

¹ Probably more often than this, I do the other kind of stroll, in which I do think in words (or in speech, to be precise). It’s a conversation I have with myself. [Probably that’s why I have become accustomed to circling on the roof of my hostel or the veranda of my home (where I am alone) instead of somewhere like a park, because I find it very hard to have a deep conversation with someone in the presence of others (regardless of whether they are paying any attention or not). Even though I don’t actually speak the words aloud when I am talking to myself, it shows in my behavior (I laugh at funny things and even slap myself, though much more rarely).]
² The other kind of stroll, where I talk with myself (or think in speech/words), also helps make new associations, but those are mostly between points much closer together in that high-dimensional space.
³ But the compressions are still lossy. You can’t substitute them for the thinking in high-dimensional space. Physicists and mathematicians are able to talk to each other in words or symbols only when they can map the compressed representations onto their own individual thinking in the high-dimensional space.
⁴ Those who have actually seen me know that’s a false comparison. The state of affairs is much more feeble in meat-space.
⁵ If I am not mistaken about the technical processes, this happens with LLMs only in the training phase. Once an LLM has been trained, it does not update its parameters with every single computation it performs, unlike human beings.