A brief list of randomly chosen experiences that undoubtedly form a man's consciousness: visiting St. Peter's Basilica in Rome, sitting in the silent auditorium of the Teatro Olimpico in Vicenza, chatting and drinking beer with friends in a pub, walking along the Champs-Élysées in Paris, jogging in Central Park in New York, a dinner in a posh restaurant, a bivouac while climbing a mountain, fishing, falling in love, having sex, giving birth to a child, raising it, and so on and so on … When and how will computers be able to grasp such experiences? Erich Maria Remarque wrote his novel All Quiet on the Western Front as a veteran of the First World War: would any reader be prone to pay attention through all two hundred pages to a computer's attempt to catch up?
Consciousness is also a mind-and-body creation, literally interwoven with the body and the body's support systems – the sort of thing a robot can hardly experience.
On the other hand, in terms of connectedness with the world around, or of how information comes in, a strong parallel exists between a computer and the Neocortex, the place where the thinking and learning of mammals – man not excluded – take place. Like a computer, the Neocortex has no direct informational connection to the physical world: all of its connection to the world around us passes (both ways) through the „old brain“, whose workings are much less known than those of the Neocortex. Could this be significant for the further development of AI?
Moreover, just as AI can perform complex calculations without understanding arithmetic, creatures (including humans) can display finely tuned behavior without understanding why they do so. The rationale for their behavior is “free-floating” – implicit in the creatures’ design but not represented in their minds. Competence without comprehension is the default in nature [286]. The mental items that populate consciousness are more like fiction than accurate representations of reality. Computers may continue to increase in competence but will hardly develop genuine comprehension, since they lack the autonomy and social practices that have nurtured comprehension in humans. As a result, computer comprehension can never align with the human one.
„Life dwells in stories,“ Salman Rushdie [287] claims. The quote implies a strong two-way interconnectedness of a real story and life: not only is there no story that might be considered true (a property in no conflict with fantasy!) which is not embedded in life, but there is also no authentic life that is not composed of stories – of multiple, intertwining layers of stories. Yes, there are “life-stories” of computers – monotonous, parametric, and boring even if they were not predictable.
Sensory equipment comparable to that of man, together with (to some extent) free motion in the physical world and physical interaction with it, renders a precondition for consciousness; a precondition beyond today's imagination when it comes to computers. Paradoxically, man in the loop might be the solution closest at hand. The Neuralink project [288,289] may be the first step: aiming to let people control computers with thoughts, the researchers inserted a sensory and communication chip into the first patient's brain. But how many chips, in which particular positions in a human brain, would it take to forward a consciousness to a computer? And even then, it would not be the computer's consciousness but the consciousness of the human that would transmit to the computer. A weird idea … A quote of Daniel Alarcón for the BBC [290] fits: „Here in the Caribbean [a cab driver says], we all have wonderful stories. Gabo [Gabriel García Márquez] only types well.“ So far, AI is a superb typist – but its stories are only retrieved and, moreover, often just mindlessly taken over.
The long story can bear shortening: so far, all efforts in machine learning have been challenging cognitive processes that take place in the (human) Neocortex. Human consciousness, however, also arises outside the Neocortex, in the so-called „old brain“, as various findings of recent biomedical research show [291]. Inseparable from receptors, the diverse parts and regions of the „old brain“ „think and learn“ in ways that are far from being revealed (at least to the extent the processes in the Neocortex are) and surely resist any so far thinkable imitation by artificial neural networks. As a result, the most promising way today to endow a machine with consciousness would be a kind of reversal of Elon Musk's Neuralink project – attaching the („old“) human brain to an artificial neural network as its interface to the world and, in addition, as a „co-thinker“. Research projects like DreamDiffusion [292], generating images from brain EEG signals, and Mark Harnett's investigation at MIT of how electrical activity in mammalian cortical cells helps produce the neural computations that give rise to behavior [293] delve into this realm.
The idea of a truly creative AI – creative in the sense of poiésis – rather shows a naivety.
Reasoning
Representing the state of the art of deep learning, or of AI today, GPT (short for Generative Pre-trained Transformer) is a kind of computer program that anticipates what should continue after particular words or phrases; GPT models can create new text that may look as if created by man. It is not about the truth of the statement or of the generated text, respectively (for architecture, authenticity represents “the truth”). It is only about the generated text being in the pre-trained, pre-defined relation to the learning dataset; it is about following the pattern that has been discovered in the learning dataset and articulated explicitly in the process of training – whether following the criteria of regression (in supervised learning), of cumulative reward (in reinforcement learning), or criteria induced from the analysis of the unlabeled dataset (in unsupervised learning). It is the pattern that says what is „correct“ and what is „false“. And once we know how this „correctness“ emerges, we can regard it as „usuality“ – which, by the way, is in itself another explication that – and why – AI is not truly intelligent and cannot truly create, when human intelligence is defined, among other things, by the ability to think critically, to master successfully unprecedented and unusual situations, to articulate unprecedented ideas, and to create, adapt, and transform the living environment; most notably, with the principle of disruption being inherent to human creativity.
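The point about „usuality“ can be made concrete with a deliberately tiny sketch – not GPT itself, but a toy bigram predictor (the corpus and function names are illustrative assumptions): whatever continuation was most frequent in the training data becomes the „correct“ one, and nothing outside that data can ever be produced.

```python
from collections import Counter, defaultdict

# A toy illustration (not GPT): a bigram "language model" that can only
# reproduce the "usuality" of its training dataset -- the most frequent
# continuation observed in training is what counts as "correct".
corpus = ("form follows function . form follows fashion . "
          "form follows function").split()

# "Training": count which word follows which in the dataset.
following = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev][nxt] += 1

def predict_next(word):
    """Return the most usual continuation seen during 'training'."""
    return following[word].most_common(1)[0][0]

print(predict_next("follows"))  # -> 'function', simply the dominant pattern
```

The model never weighs whether „function“ is true or apt; it merely reports that this continuation occurred most often – the mechanism behind „correctness as usuality“, writ small.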
In general, it has been LLMs that have been setting the benchmark recently: not only in terms of popularity, frequency of use, and contribution credited by professionals and the general public, but also by the level of development and results achieved, and by the volumes of financial investment. The amounts provided, as well as the eagerness to invest, exceed any expectation. From a technical point of view, an LLM tends to be at the heart of most AI applications unveiled today – not only the so-called assistants or personas, digital characters that, by means of AI algorithms, imitate the behavior and qualities of a real human persona and are used in marketing, virtual conversation, conversational interfaces, search, and more.
Current language models fall short in understanding aspects of the world not easily described in words, and struggle with complex, long-form tasks (as is characteristic, among others, of architecture and the built environment). Video sequences offer valuable temporal information absent from language and static images, making them attractive for joint modeling with language. Such models could develop an understanding of both human textual knowledge and the physical world. However, learning from millions of tokens of video and language sequences poses challenges due to memory constraints, computational complexity, and limited datasets. A large world model (LWM) is not a specific class of network or a learning strategy, but a deep learning model that uses transformer models and is trained on massive datasets by playing a guess-the-next-word game with itself over and over again. LWMs provide the ability to generate coherent and contextually appropriate responses over extended interactions, giving the impression of understanding or modeling the world. However, it is important to note that these models do not understand the world in the way humans do. They have no consciousness or beliefs. Instead, they learn patterns from the data they are trained on and generate responses based on those patterns [294,295].
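The memory constraint mentioned above can be illustrated with back-of-the-envelope arithmetic (the figures are assumptions for illustration, not measurements of any particular model): naive transformer self-attention materializes an n × n score matrix per head, so memory grows quadratically with sequence length.

```python
# Illustrative sketch: why "millions of tokens of video and language"
# strain memory. Standard self-attention builds an n x n score matrix
# per attention head; at fp16 that is 2 bytes per entry.
def attention_matrix_bytes(n_tokens, bytes_per_value=2):
    """Bytes needed for one dense n x n attention score matrix."""
    return n_tokens ** 2 * bytes_per_value

for n in (4_000, 100_000, 1_000_000):
    gib = attention_matrix_bytes(n) / 2**30
    print(f"{n:>9} tokens -> {gib:,.2f} GiB per attention head")
```

A few thousand tokens fit comfortably; a million tokens would need on the order of terabytes per head for the dense matrix alone – which is why long video-plus-language training forces approximations (chunking, sparse or blockwise attention) rather than the naive computation.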
Above the business-as-usual and also the state-of-the-art R&D, still behind the horizon, there is the vision of artificial general intelligence (AGI). The path to this goal is expected to require a different approach than today's generative AI models, which are still – in a way, as reviewed in section (4) – inspired by the McCulloch-Pitts theory of neural networks of 1943. Evidence is missing for the assumption that the machine-learning methods behind ChatGPT and other advances in AI of the past 20 years could act outside their training data, which is considered a precondition for AGI. Verses, a California-based cognitive computing company, proposes a paradigm shift in the approach to AGI [296]. Specializing in cognitive computing, Verses aims to build next-generation intelligent software systems inspired by the wisdom and genius of nature. Within an alternative approach called active inference, Verses proposes an unparalleled model, which could resemble the new Hawkins'/Numenta's theory of mind and brain introduced in section (4). The team is working on what it calls distributed intelligence, a system that can self-organize and retrain in real time – identify its mistakes and fix them by re-training, as biological organisms do. Digital intelligence based on a web of intelligent agents is believed to be cheaper, more environmentally sustainable, and more geopolitically defensible than one vast system trained on billions of data points. To date, Verses has developed Genius, an operating system for „continually learning autonomous agents“ operating at the edge of the company's connected devices. Genius combines biological inspiration, cognitive capabilities, adaptability, and open standards to create a unique AI platform that goes beyond traditional approaches. NASA's Jet Propulsion Laboratory and Volvo are among the beta users of Genius. „Minimizing complexity“ is believed to be the way found at the road fork.
Instead of building ever-bigger AI models, Verses AI aims to deliver „99% smaller models“ without sacrificing quality and performance, and promises to release a public beta version of Genius in the summer of 2024 [297].
Verses' concept of distributed intelligence is closely related to, and a predecessor of, the field of multi-agent systems. It executes AI algorithms across multiple nodes or devices that can act independently and communicate asynchronously, exploits large-scale computation and the spatial distribution of computing resources, and, not requiring all relevant data to be aggregated in a single location, operates on sub-samples or hashed impressions of large datasets. Due to its scale and loose coupling, a distributed AI system is robust and elastic. An alternative to it, also (perhaps) inspired by a human-brain reasoning strategy similar to what Jeff Hawkins proposes [298], decentralized AI likewise executes algorithms across multiple devices or individuals but, organizing the devices in a network, goes a step further; it often uses blockchain technology to create transparent and secure platforms for collaboration. Both approaches involve distribution: distributed AI specifically focuses on solving problems using distributed approaches, whereas decentralized AI emphasizes the decentralized execution of AI algorithms [299,300]. Recently, the latter attracted the attention of one of the icons of AI, Emad Mostaque, founder and until recently CEO of Stability AI, who together with several high-profile researchers left the company in March 2024 to „pursue decentralized AI“ [301].
Poiésis: Architectural design within and against AEC ecosystem
From architecture and urban design through construction and MEP (Mechanical, Electrical, Plumbing), environmental, climatic, meteorological, and microclimatic expertise to transportation expertise, economy, demography, and sociology, multiple professions engage in the development of the built environment. The background of some of these fields is natural science, whilst for others it is social science or even art – poetics or poiésis [302], as will be explained soon. According to the nature of the contribution provided by the respective expertise, the design and evaluation approaches range from „hard“ to „soft“, from quantitative, parametric, and material to qualitative and emotional. According to such origin and nature, quantitative parameters define the approach as well as the output in some cases, while in others it is manifestations of consciousness; let us call them feelings, moods, or emotions to keep it simple. Obviously, as explained in the previous subchapter, manifestations of consciousness resist parametric algorithmization as well as entering datasets.
There is a saying in Czech that goes something like “Just as one calls into the forest, so it echoes back”. A rephrasing of the saying in terms of the training dataset and the algorithm may sound: you can get the requested parametric answers if you address the right question to the correct forest; however, no forest and no question exist that give the coveted emotion back – give it in any situation, not to say an unclear one, as is the rule with man's feelings.
Creativity, to be authentic and true, cannot be but poiétic, or poetic [302]. The poetic principle requires consciousness together with intention: only consciousness together with intention is able to deliver poiésis [285]. In terms of architecture and the built environment, consciousness is reserved for man – or, more precisely, for Dasein, as Heidegger coined and proved. An algorithm, however complex and sophisticated the artificial network it works on, can deliver only on the principle of equality (or similarity, which, however, is only a deficient mode of equality) or by random choice. Face to face with new solutions, prior knowledge is the prerequisite. Prior knowledge is another aptitude reserved for consciousness [285] – for a human, not for a machine, and not for an algorithm. No consciousness, no will of its own, and no true creativity, but algorithms and immense data searched through, assessed, and prioritized according to defined criteria – these are the attributes of today's AI. And even the state-of-the-art theory offers no vision of how machines could overcome this shortcoming.
Inevitably, when deployed on buildings, AI works in some respects and cannot but fail in others.
Approaching architecture as the most significant among the creators of the built environment, let us be clear: it is not a natural-science scheme, an algorithm, or a calculus that is architecture's starting point. Moreover, architecture is not a linear sequence of signs – as opposed to speech or text. On the other hand, among many other attributes, architecture can be consumer goods, too; and the more of a consumer good a practical architecture shall be, the more a pattern, a calculus, and an algorithm contribute to the delivery; but even then, the environment, the narrative of the development, and/or the people passing, entering, and using the building or structure „make the difference“.
In theory, architecture is unanimously distinguished from the arts. But even so, even if architecture shall not be an art like painting, sculpture, drama, dance, or literature, let us not be shy: it is poetics or poiésis, as Martin Heidegger coins it in ancient Greek, that is the starting point and method of architectural creativity. Poetically dwells man, Heidegger puts it [302]: full of merit, yet poetically dwells a man. Poiésis precludes the algorithm and vice versa; similarly, a training dataset limits poiésis. By definition, in this regard, as claimed above in this section, a dataset must always be far from comprehensive. Then it cannot but limit the creativity for which, inevitably, the training dataset is „the whole world“ – there is nothing beyond.
Also, Encyclopedia Britannica distinguishes and confirms the emotional, social and societal, non-parametric nature of architecture: … the art and technique of designing and building, as distinguished from the skills associated with construction [303]. The characteristics that distinguish a work of architecture from other built structures are (1) the suitability of the work to use by human beings in general and its adaptability to particular human activities [and needs], …, and (3) the communication of experience and ideas through its form. Obviously, “use by human beings”, “human activities and needs”, as well as “communication of experiences and ideas” cannot but resist algorithmization as well as digital parametrization. Tackling and elaborating these needs, experiences, and ideas represents the heart of the initial and most important phase of any design process – a phase that precedes any parameterization and refuses it. Significantly, this is the phase and process that IT and AI developers and computer scientists do not know about (unfortunately, given the decline of both the architectural profession and theory during the last 70 years [8,304], a good proportion of architects are no better at it): hence the gap (pinpointed in section (1) of the paper) in the mutual understanding of the needs, capabilities, and processes of the respective expertises involved in the design and development of AI tools for architects. To make a long story short: a good proportion of the issue behind recent flawed attempts to foster the architectural design process is the AI developer teaching the machine to design architecture without knowing what architecture is.
Among all types and natures of human creations, architecture intertwines the most with human consciousness; not by accident. In the essay Poetically Dwells Man [302] – elaborating further on his seminal opus Being and Time [285] and the theme of Dasein (being-there, or existence, in English) after the Second World War, in relation to the timely and pressing topic of housing and, by extension, architecture – Heidegger coins the concept of das Geviert, the fourfold in English: the union of the earthly and the heavenly, the human and the divine in man's existence and in the world of his being – thus, as we have seen, in architecture. This is not only another strong argument refuting the vision of architecture created by an algorithm. It is no coincidence that materiality manifests itself in both consciousness and architecture: it manifests itself in them in the same way and is a strong link between them. This recalls the dual nature of architecture – of ideas, emotions, and experiences on the one hand, and of the material, physical on the other – which slowly begins to lead to uncovering the feasible way of deploying AI in architecture and grasping its prospects.
Dalibor Vesely featured and critically reviewed another face of architecture's duality, starting with the title of his groundbreaking book Architecture in the Age of Divided Representation: The Question of Creativity in the Shadow of Production [305]. Creativity can never be substituted by production; however, the material side of architecture – its physical properties in terms of microclimatic convenience, durability, security, ergonomics, operational efficiency, and sustainability – deserves and is keen to enjoy productivity – a productivity that is parametric and algorithm-inclined by nature.
So far in the field of AI in architecture, as in the whole AEC field, however, only analogous, parametric-oriented approaches have been witnessed (the differences between the diverse neural networks and AI algorithms, as outlined in (1), make no difference in this regard). Tackling data by a computational algorithm can provide poetics only by chance, randomly. It is not a question of learning or training; by definition, a poetic „output“ cannot be trained. Even if bokeh salience offers a „hallucination“, it is neither poiésis nor a creative act; it is just a random interpretation of the training data that we only belatedly realize was misleading. In conclusion, the idea of a creative contribution of AI to conceptual architectural design is debunked, and together with it the theoretical collateral and all AI's outputs in the field so far. On the other hand, debunking as erroneous the vision of AI or an AI „superuser“ replacing „the architect genius“ [306] should not prevent algorithmizing and machine-generating what fits; and that is the physical aspect of architecture.