January 22, 2025

Think back to your first interaction with ChatGPT or a comparable Large Language Model (LLM): perhaps a simple question, a request for advice, or a quick research overview. And then there is that fascinating moment when the AI generates its answer for the first time, word after word appearing in the chat until a coherent whole emerges.

It felt as if you were watching this machinery think for the first time, and on a human-like level. A second Hello World program, but this time it speaks to us of its own accord, and here and there people may already have wondered whether the Turing Test had just been passed.


Turing Test — What is it?

A term that many have probably encountered in the context of artificial intelligence, yet one that is widely misunderstood in everyday use because of how much it gets simplified.


In its simplest form, the Turing Test works as follows: there are three participants, an evaluator (C) and two interlocutors (A and B; one human and one AI), with A and B conversing with C purely via text. C steers the conversation with questions and must ultimately decide which of the two interlocutors is the human and which is the machine. If C cannot tell, or if the verdict is wrong, the AI has passed the test.
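To make that three-party setup concrete, here is a minimal sketch in Python. Everything in it (the reply functions, the canned answers, the judging rule) is invented purely for illustration; it only shows the structure of evaluator C questioning A and B over text and then guessing which one is the machine.

```python
import random

# Hypothetical stand-ins for the two interlocutors: in a real run, one would be
# a person typing answers and the other a language model behind an API.
def human_reply(question: str) -> str:
    return input(f"[Human, please answer] {question}\n> ")

def machine_reply(question: str) -> str:
    canned = {
        "What did you have for breakfast?": "Just coffee and toast, nothing special.",
        "What is 17 * 23?": "391, if I did the mental math right.",
    }
    return canned.get(question, "That's an interesting question, let me think...")

def imitation_game(questions: list[str]) -> bool:
    """Evaluator C questions A and B over text and must say which one is the machine.
    Returns True if the machine 'passes', i.e. C's verdict is wrong."""
    # Randomly assign the hidden roles so C cannot rely on the order of answers.
    participants = {"A": human_reply, "B": machine_reply}
    if random.random() < 0.5:
        participants = {"A": machine_reply, "B": human_reply}

    for q in questions:
        print(f"\nC asks: {q}")
        for label, reply in participants.items():
            print(f"{label}: {reply(q)}")

    verdict = input("\nC, which participant is the machine (A/B)? ").strip().upper()
    truly_machine = "A" if participants["A"] is machine_reply else "B"
    return verdict != truly_machine  # a wrong verdict means the machine passed

if __name__ == "__main__":
    passed = imitation_game(["What did you have for breakfast?", "What is 17 * 23?"])
    print("The machine passed." if passed else "The machine was unmasked.")
```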

Turing's original version looks a bit different, has not caught on for good reason, and would feel out of date today at the latest: rather convoluted and oddly centred on guessing gender. You can look it up if you're interested 😉 We'll stick with the simple version.

Generally speaking, however, the Turing Test is less a veritable test than a thought experiment that tries to pin down when machines develop human-like thinking skills. Hence the test's original name: The Imitation Game. Nevertheless, for the general public the "test" was enough for a long time to draw a dividing line between human and machine, a line that has evaporated since the rise of LLMs. If the Turing Test can no longer draw that line, yet AI is by no means human, the question of a new test arises. But what kind of test?

A suggestion from Silicon Valley

One suggestion comes from Mustafa Suleyman, one of the leading AI figures of our time and therefore a good point of reference. Suleyman restructures the question entirely: given the remarkable progress of LLMs, asking about thinking skills is no longer the point, so a new benchmark is needed to adequately classify the intelligence of AI. The AI expert and entrepreneur accordingly takes the next logical step and asks: can this machine generate ONE MILLION DOLLARS??? 💰💲🤑💸

What an AI can say or generate is one thing. But what it can achieve in the world, what kinds of concrete actions it can take — that is quite another. In my test, we don't want to know whether the machine is intelligent as such; we want to know if it is capable of making a meaningful impact in the world. We want to know what it can do.

The test itself is as straightforward as its premise: the AI is given €100,000 of seed investment and, within a few months, is supposed to turn it into the said million via retail web platforms. The appeal lies in the underlying complexity of the project. Current GPT-4 models are excellent at suggesting strategies and drawing up plans, but this undertaking requires more than that: the artificial intelligence must research and design products, negotiate contracts, organize advertising campaigns, and so on; in short, pursue complex real-world goals with tangible effect. Human intervention is allowed here and there (opening a bank account, signing a contract...), but the AI must do the work itself. Hard to imagine, yet according to Suleyman various AI models are already well on their way to achieving this, which also calls for a conversation about whether we want that at all.
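For a sense of scale, a quick back-of-the-envelope calculation; the six-month horizon below is our own assumption, since the proposal only speaks of "a few months":

```python
# Scale of Suleyman's target: turning the 100,000 seed into 1,000,000.
seed = 100_000
target = 1_000_000
months = 6  # assumed horizon, not specified in the proposal

multiple = target / seed                       # 10x overall
monthly_growth = multiple ** (1 / months) - 1  # ~46.8% compounded per month

print(f"Required overall multiple: {multiple:.0f}x")
print(f"Implied monthly growth over {months} months: {monthly_growth:.1%}")
```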

An exciting, if slightly dubious, idea that Suleyman also discusses in his book The Coming Wave and that, through its publication in MIT Technology Review, has sparked lively debate. The following points of criticism can be found in public discourse:

1. ... can a person do that?

If the actual Turing Test is about matching what people can do, you have to ask whether the average person would be able to achieve Suleyman's goal. If only it were that easy. 😉

2. This is not a Turing test

It can be assumed that Suleyman is fully aware of this (after all, he accuses the Turing Test of obsolescence) but accepts it in order to stimulate discussion. The Turing Test has accompanied artificial intelligence since its earliest beginnings, and using the term as a buzzword is probably simply strategic.

In any case, the questions differ significantly: one test examines imitation, the other economic performance potential. This is not a sequel, not a Turing Test 2.0, but something of its own; the Suleyman performance test, so to speak. We're still working on the name.

3. Money as a parameter is too short-sighted

A point of criticism raised by The Financial Times. Money as a parameter reflects a tech culture that places profit above social benefit; an ethical faux pas for a technology with such catalytic potential that everyone should benefit from it in the best possible way.

An AI that can accumulate wealth in this way also risks replacing occupations rather than creating or transforming them, fundamentally changing the commercial system and concentrating power in the hands of the few instead of the many, even though GenAI should ideally have a democratizing effect.

Criticism that Suleyman does not entirely deny; he himself calls for discourse, though that may not be enough. The Financial Times argues that once you set such a target, it tends to become the goal itself: by proposing such a test, the attempt to pass it has already begun. Whether there is still room for discussion is hard to say, but the writing is already on the wall. In this view, the chosen parameter is a wasted opportunity and, frankly, irresponsible. Ouch!

Conclusion: Despite all the criticism, the test can be credited with a certain plausibility and practical use, but it is unfortunately less suitable as a Turing Test.

But maybe it's no longer necessary to desperately keep an AI classic alive. The dystopia is coming anyway, always think positively!


A gimmick

The Turing Test was never really more than a gimmick. At least that's what Bernardo Gonçalves argues in Minds and Machines. Turing himself famously considered the question of whether machines can actually think too meaningless to deserve discussion.


In science, the test has long since been set aside and moved from textbooks to history books, not least because even unintelligent AIs have already passed it; ChatGPT merely allowed the discussion to flourish again. Pop science and public interest, then! Against this backdrop, Gonçalves argues for redesigning the essence of the test. A new Turing Test should be explicitly a thought experiment, shifting away from computer science towards the philosophy of science, which after all also deals with a variety of questions about AI. If the Turing Test were treated as an epistemological object of investigation, a completely new and exciting facet would emerge, one that has hardly been considered so far: is it really only the machine that has to pass the test?


A different point of view

Ben Ash Blum argues for this perspective in Wired. Turned towards us humans, the question becomes how mechanical we imagine these systems to be; treating them as purely mechanical is, according to Blum, a common fallacy. The fact that an AI is a machine does not make it mechanical, hyperlogical and calculating. After all, LLMs are trained on a wide range of human material and therefore also absorb emotional intelligence, morality and the like. That goes beyond the premise of a raw statistical program; this form of AI emulates the human in particular.

An AI in 2043, for example, might according to Blum meet us with hundreds of times the analytical and emotional intelligence of today's standards. And if so, do we pass the Turing Test: do we grant AI a human-like facet, or do we reduce it to a machine? Is that still a simple program that can be sent back to the research laboratory and told what to learn and where its place in society is?

Let's be honest, you can't put the genie back in the bottle. This also means that collaboration between humans and machines will change, and with it, perhaps, our current understanding of how we differ and what relationship we maintain with one another.

That also means allowing ourselves to extend the kind of social thinking we usually reserve for other people to AI as well. Perhaps the new Turing Test no longer asks how human an AI is, but whether it already qualifies as a person, and whether we are prepared to accept that if it does. Maybe we're bad parents, unable to accept the intelligence we've created. But maybe not. At least that's the food for thought Blum offers. The future will tell! We'll stay tuned and look forward to talking to you.

For further reading:

  • Suleyman, Mustafa (2023): The Coming Wave. New York: Crown Publishing.

