Artificial Intelligence in Architecture and Built Environment Development 2024: A Critical Review and Outlook, 12th part: (5) Discussion

AI is a super-parrot: it is superb at repeating what it has learned, as Tomas Mikolov explains in a conversation with Dan Vavra [105]. In other words, for a Generative Pre-trained Transformer, the magnitude and comprehensiveness of the training dataset is the starting point; the algorithm, running on an artificial neural network, is the method or tool; and computational performance is the limit.

Nonetheless, financing appears to be another limit. Touched on in section (2) in connection with security issues, the Alexander Karp story also eloquently illustrates the economic starting point for the deployment of AI in architecture. Had he invested, as an architect, the same talent, the same significant skills, and the same intense effort he put into AI, he would never have become the billionaire he is; most likely, neither he nor any of his fellow AI tycoons would have become a billionaire in the field of built environment development either. Obviously, in terms of economic and investment attractiveness, designing architecture and even developing real estate cannot compete with data mining (not only) in the service of national security.

Obviously? The United States allocates approximately three percent of its gross domestic product (GDP) to defense outlays [333]. On the other hand, the finance, insurance, real estate, rental, and leasing industries (which one way or another could not perform without architects) contributed 20.2 percent to the US GDP, and the construction industry alone approximately four percent [334,335]. The only objective justification for consistently underestimating, even ignoring, the potential of monetizable or otherwise tangible benefits of architectural design and built environment development can stem only from mistrust of, or even rejection of, any possibility of improving their performance. These benefits can be economic, environmental, cultural, and social. Climate change has turned attention to the environmental impacts and sustainability of construction; yet even in these respects, there are still no major efforts to improve the performance of architectural design and construction planning by applying machine learning. Other aspects of built environment development and architecture remain oblivious to AI; or better to say, AI remains oblivious to these fields. Any doubt that this deserves to change? Cui bono? Which profession should intervene? Or is it a public interest, and should governments act?

Employing another, more technical perspective, the market potential of AI in architecture and built environment development bears comparison with the BIM software market. Providing advanced 3D modeling solutions for architecture, engineering, and construction professionals, the BIM software market was valued at $5.2 billion in 2019 and $5.71 billion in 2020; the projected value for 2027 is $11.96 billion, a compound annual growth rate (CAGR) of 11.1% from 2020 to 2027. Another study suggests that the BIM software market was valued at USD 9.665 billion in 2021 and is expected to reach as much as USD 23.95 billion by 2027, a projected CAGR of 16.33% [336,337]. Enhancing operational performance, decision-making, cost estimation, and collaboration, BIM software tools benefit from government support. Appreciating its boost to operational efficiency and its support for remote collaboration (accelerated during the COVID-19 pandemic), private builders' adoption further contributes to the BIM software market growth. All these incentives could apply to AI's support for the professions involved; nonetheless, the reality is still far behind. Following the patterns witnessed in the economy of AI R&D today, a prospect of an (almost) immediate market value of $10 billion should be able to attract some half-trillion in investment. However, architecture and built environment development experience no warm welcome of this kind in AI R&D (and investment).
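The growth figures above can be cross-checked against the standard CAGR formula; a minimal sketch in Python (the function name is illustrative, the valuations are those cited in [336,337]):

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate between two market valuations."""
    return (end_value / start_value) ** (1 / years) - 1

# First study: USD 5.71B (2020) -> USD 11.96B projected (2027)
print(f"{cagr(5.71, 11.96, 7):.1%}")   # ~11.1%

# Second study: USD 9.665B (2021) -> USD 23.95B projected (2027)
print(f"{cagr(9.665, 23.95, 6):.1%}")  # ~16.3%
```

Both reported CAGRs are consistent with the cited start and end valuations, which also confirms that the second study's figures are USD 9.665 and 23.95 billion (decimal points, not thousands separators).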

Though not primarily aimed at it, Carlota Perez's call [338] for understanding AI in a broader context still addresses the contradiction; though this often escapes common understanding, architecture and the built environment fundamentally contribute to such a context.

Often regarded as the next technological revolution, AI may be better understood as a pivotal development within the ongoing information and communications technology (ICT) revolution, which began with microprocessors in the 1970s. The ICT revolution accelerated in the 1990s when the US government privatized the Internet, which led to intensified innovation and globalization. AI may indeed represent a third leap; however, it is essential to recognize that ICT has already brought us to the brink of a golden age. Realizing this potential hinges on understanding the role of market-shaping public policy during previous technological revolutions. Without such policies, AI will fall short of its potential to drive inclusive social and environmental progress.

The question of whether AI constitutes a new technological revolution remains significant. The early stages of revolutions involve creative destruction across the entire economy, not just specific sectors. During these periods, new technologies create and eliminate jobs, reshaping industries and regions. To maximize social gains, institutions and regulations must guide these technologies.

AI relies on a massive energy supply and the Internet, which, in turn, depends on powerful microprocessors. These technologies mechanize mental work, and their future evolution may combine AI with biotech, new materials, new dwelling concepts, and new public spaces (among others) within the context of an ongoing ICT golden age. The historical context matters for economic decisions made by investors, firms, governments, and households. For sustainable progress, AI's development must occur within a regulated system, avoiding detachment from the real economy and real life that, as noted, is fundamentally embedded in architecture and the built environment.

State-of-the-art floorplan generation

As mentioned in (2), the results of AI's deployment in architecture achieved until recently show that the so-far ruling principle of lossy compression and subsequent "creative" decompression within supervised (or unsupervised) learning has exhausted its possibilities without being able to deliver truly usable results. Not only does the strategy reduce the comprehensive, three-plus-dimensional spatial architectural task to image processing; the training stock is another issue. Hundreds of thousands of images, each labeled by humans, which are a precondition for the "statistics" to work properly, are unachievable in real life, and unachievable for two distinct reasons. First, such images are available on the internet, and in public resources in general, only very sparsely. Training datasets – the open-source platforms predicted in section (4) – pose the first questions on material assembly, material quality, and size. Given state-of-the-art machine learning, the size should (significantly) exceed two to the Nth power, where N is the number of parameters specifying the AI task: thousands rather than hundreds of parameters when it comes to the comprehensive parametric and physical structure that materializes architecture: a building. Even if N were "only" in the lower hundreds, the number would have a hundred and more zeros – a googol; the question of computing power – or rather, of optimizing the parameter structure – is immediately raised, since a googol exceeds the estimate of the number of elementary particles in the known universe. Considering the issue of computing power and the needed volumes of training datasets combined, the efforts to generate floorplans and apartment layouts using GANs prove futile.
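The order-of-magnitude claim is easy to verify: 2^N passes a googol (10^100) as soon as N reaches 333, i.e. well within the "lower hundreds" of parameters. A quick check:

```python
import math

GOOGOL = 10 ** 100

# Smallest number of binary parameters N for which 2**N reaches a googol
N = math.ceil(100 / math.log10(2))
print(N)                    # 333
print(2 ** N > GOOGOL)      # True
print(len(str(2 ** N)))     # 101 decimal digits

# For comparison, the number of elementary particles in the observable
# universe is commonly estimated at only around 10**80.
```

Even this simplest case, with every parameter reduced to a binary choice, already dwarfs any conceivable training stock.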

Stanislas Chaillou carried out and introduced bold research and development [329] as the starting point of the "next era" of his approach to the use of artificial intelligence in architectural design – an approach put into effect by co-founding the start-up Rayon (introduced in (2)). Engineering a road map "from the parcel to the building footprint, from the footprint to a room split, from a room split to a furnished one," and on to a room rendering, Chaillou intends to equip the "design pipeline" with a "catalog" of (four) pre-defined "styles" to satisfy subjective preferences. Further on, "design categories" of "footprint, program, orientation, thickness and texture, connectivity, and circulation" are to be tracked and algorithmized. By reducing the number of design parameters, the approach successfully deals with the computing-power problem. Nonetheless, confronting Chaillou's design categories with the poiétic, truly creative design aspects that, as section (3) of this paper recollects, a computer algorithm can never grasp and perform (unless the computer acquires consciousness and, with it, the ability of true reasoning) instantly reveals another issue. Chaillou's or, better to say, Rayon's "design pipeline" does not respect the distinction; with the design categories lying in, or running into, the sphere of poiétic creativity, the "design pipeline" is doomed to provide fruitless outputs at any design stage. Never mind that "the designer is then invited to 'pick' a preferred option and modify it if needed, before actioning the next step. Browsing through the generated options however can be frustrating, and time-consuming," Chaillou admits. "To that end, the set of metrics defined in the 'Qualify' chapter can demonstrate their full potential here and complement our generation pipeline. By using them as filters, the user can narrow down the range of options and find in a matter of seconds the relevant option for its design.
This duality of Generation-Filtering is where the value of our work gets all the more evidenced: we provide here a complete framework, leveraging AI while staying within reach of a standard user. Once filtered according to a given criterion (Footprint, Program, Orientation, Thickness & Texture, Connectivity or Circulation), we provide the user with a tree-like representation of her/his choice. At the center is a selected option, and around it, its nearest neighbors classified according to a user-selected criterion. The user can then narrow down the search and find its ideal design option, or select another option within the tree, to recompute the graph."
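The Generation–Filtering duality described above can be sketched in a few lines of Python. Everything here – function names, random per-criterion scores – is a hypothetical illustration of the workflow, not Rayon's actual API:

```python
import random

# Hypothetical sketch of a generate-then-filter design pipeline
# (illustrative only; names do not reflect Rayon's actual implementation).

CRITERIA = ["footprint", "program", "orientation",
            "thickness_texture", "connectivity", "circulation"]

def generate_options(n: int) -> list[dict]:
    """Stand-in generator: each option gets a random score per criterion."""
    return [{c: random.random() for c in CRITERIA} for _ in range(n)]

def filter_options(options: list[dict], criterion: str,
                   threshold: float) -> list[dict]:
    """Narrow the pool: keep options scoring above a threshold on one criterion."""
    return [o for o in options if o[criterion] >= threshold]

def nearest_neighbors(options: list[dict], pick: dict,
                      criterion: str, k: int = 3) -> list[dict]:
    """Around a selected option, rank the rest by closeness on one criterion."""
    rest = [o for o in options if o is not pick]
    return sorted(rest, key=lambda o: abs(o[criterion] - pick[criterion]))[:k]

# Generate a pool, filter it, then build the "tree" around a picked option.
pool = generate_options(200)
shortlist = filter_options(pool, "circulation", 0.8)
if shortlist:
    center = shortlist[0]
    tree = nearest_neighbors(shortlist, center, "circulation")
```

The sketch makes the structural point of the critique visible: both the filter thresholds and the neighbor metric operate on numbers the generator itself assigned, so the loop never leaves the space of generated options.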

As the metrics run into the sphere of poiétic creativity, "the set of … 'Qualify' metrics" in fact has no "potential to complement … the generation pipeline" – contrary to Chaillou's expectations and promises. Not only in Chaillou's application (Rayon) but in AI generally, in terms of authenticity and poiésis, a truly satisfactory output can come into existence only haphazardly. Browsing through piles of substandard proposals turns out to be a waste of time and energy, and a diversion to man-made modification renders deliverance; the sooner approached, the better. Inevitably, such a "design pipeline's" outputs can only compare to AI-generated elevator music: not any architecture an architect would go in for. Only a layman with scant user experience and cultural background, lacking outlook, can be satisfied with what the algorithm performs – or a software professional who, however, appreciates not the architecture but how the application works and what it produces.

The gap between architects – and authors across creative fields in general – on the one hand and AI application developers and data scientists on the other shows again. In February 2024, OpenAI revealed the Sora AI application [153], which, in response to a text prompt, can create videos up to a minute in length. Sora is supposed to be able to generate complex scenes with multiple characters, specific types of movement, and precise details of individual objects and backgrounds. "Our model understands not only what the user asked for in their input, but also how those things exist in the real world," OpenAI representatives said in a post on the company's blog [339]. Impressive; however, a gadget like Sora, however further developed and perfected, will never replace author teams headed by Quentin Tarantino or Jonathan Glazer – for reasons starting from the simple fact that a prompt that could capture all the poetics that the screenwriter, director, actors, cinematographer, and other co-creators put into the work would have to be a performance more complex than the resulting film itself. As an illustration, consider the detail that during the production of a film, the creators communicate not only through language but also through other senses and means of expression. Nonetheless, this does not imply that Sora, as a representative of a class of algorithms, is doomed to be useless. Authors can, and probably will, use it in place of a storyboard, as a research tool, and as another means of communication in addition to natural human senses and communication tools.

In its processes of birth and elaboration, architecture is more similar to filmmaking than a non-architect would expect. In terms of a competent architectural design workflow, by contrast with world-imitation and other advanced reinforcement learning strategies, the strategy and technique recently introduced by Chaillou turn out to be a cul-de-sac – and so do all the other applications that reduce spatial creation to supervised-learning image processing.

References

Introduction figure: Svarovska, E.: PromeAI deployment in progressing from a simple volumetric model to a near photo-realistic rendering. Student's design. Department of Architecture, Faculty of Civil Engineering, CTU in Prague, 2023. Author's archive.

Michal Sourek
