ARTIFICIAL INTELLIGENCE IN ARCHITECTURE AND BUILT ENVIRONMENT DEVELOPMENT 2024: A CRITICAL REVIEW AND OUTLOOK, 7th part: AI in architecture and engineering

The previous subsection demonstrated the great expectations that the field of AI lives with; nothing similar applies to architecture and the development of the built environment. As noted at the close of the overview of AI production ecosystems, the scale of inquiry must change to examine this field.

Among many others, Zaha Hadid Architects (ZHA) also encountered AI, using the technology to render forms that were not free enough to stop resembling the patterns of antique temples that served as the image dataset fed to the GAN [213]. In doctoral research supervised by Patrik Schumacher of ZHA in 2017, Daniel Bolojan created the Parametric Semiology Study, using machine learning algorithms and other gaming-AI tools implemented in Unity 3D to model the behavior of human agents and so test the layout of a proposed space [214].

Stanislas Chaillou, Nvidia, and others have worked on AI applications that generate floorplans and apartment layouts [215]. ArchiGAN uses generative networks to create 2D and 3D building designs based on input parameters such as dimensions and space requirements. Another model, CityGAN, generates drafts of city blocks and buildings. From a practical point of view, and with regard to the efficiency of deployment, the results of both applications are questionable, as in all other similar cases. Working on the principle of image-to-image translation with conditional adversarial networks (cGANs), Phillip Isola's research group [216] produced a series of machine-generated facades that follow the “style” and character of the pattern supplied as the “input” [217,218]. Introduced by the same team, Pix2Pix is shorthand for an implementation of generic image-to-image translation using cGANs [219]. By investigating the generation of building façades, the team opened the door to architectural design using GANs. Andrew Witt expanded on this work in his QUILTING exhibition: showcased as one linear, endless animation of an urban skyline, larger facade designs were created by enlarging the final layers of Pix2Pix [220].
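For orientation, the training objective behind image-to-image translation with conditional adversarial networks can be written as below. This is a sketch in the notation commonly used for Pix2Pix, not in notation taken from this review: x is the input image (for example, a facade label map), y the target image, z a noise vector, G the generator, D the discriminator, and λ a weighting factor.

```latex
\begin{aligned}
\mathcal{L}_{cGAN}(G, D) &= \mathbb{E}_{x,y}\big[\log D(x, y)\big]
  + \mathbb{E}_{x,z}\big[\log\big(1 - D(x, G(x, z))\big)\big] \\
\mathcal{L}_{L1}(G) &= \mathbb{E}_{x,y,z}\big[\lVert y - G(x, z) \rVert_{1}\big] \\
G^{*} &= \arg\min_{G}\max_{D}\; \mathcal{L}_{cGAN}(G, D) + \lambda\, \mathcal{L}_{L1}(G)
\end{aligned}
```

The adversarial term pushes the generated facade to look plausible given the input, while the L1 term keeps it close to the paired ground-truth image.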

The use of GANs for floorplan recognition and generation was first studied by Zheng and Huang in 2018 [221]. Using the Pix2PixHD [222] GAN architecture, they translated floorplan images into programmatic patches of color and, inversely, turned patches of color into floorplans. The same year, Nathan Peters [223], in his thesis at the Harvard Graduate School of Design, researched laying out rooms within the footprint of a single-family home; the output was the initially empty footprint rendered in color patches. Developed in 2019 by Kyle Steinfeld [79], GAN Loci attempts to generate perspective-like images of urban scenes assembled from facade-like textures, pathways, street furniture, pedestrians, cars, etc., training to achieve a required “mood”: suburban, public park, and so on [224,225]. Blending the outcomes of Isola's team and Steinfeld's R&D, Sketch2Pix provides an interactive application for architectural sketching augmented by automated image-to-image translation [226].
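The idea of representing a floorplan's program as patches of color can be illustrated with a minimal sketch. It shows only the data representation that such image-to-image models are trained on, not the GAN itself; the room names, the palette, and the tiny 4×4 plan are hypothetical and are not taken from [221] or [223].

```python
# Hypothetical room programs and RGB colors (assumed for illustration only).
PALETTE = {
    "wall": (0, 0, 0),
    "living": (255, 200, 0),
    "bedroom": (0, 120, 255),
    "bath": (0, 200, 120),
}
# Reverse lookup used to read labels back from exact palette colors.
COLOR_TO_LABEL = {rgb: label for label, rgb in PALETTE.items()}


def encode(plan):
    """Turn a grid of room labels into a grid of RGB color patches."""
    return [[PALETTE[label] for label in row] for row in plan]


def decode(image):
    """Inverse direction: recover room labels from exact palette colors."""
    return [[COLOR_TO_LABEL[rgb] for rgb in row] for row in image]


plan = [
    ["wall", "wall", "wall", "wall"],
    ["wall", "living", "bedroom", "wall"],
    ["wall", "living", "bath", "wall"],
    ["wall", "wall", "wall", "wall"],
]
image = encode(plan)
```

In a trained model, the network would learn the mapping between a drawn plan and such a color-patch image in both directions; here the mapping is a plain lookup, which is enough to show why the encoding is invertible.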

Nono Martinez's thesis at the Harvard Graduate School of Design, 2016, which investigated a different approach, deserves noting [227]. The idea of a human in the loop rests at the heart of the method, which treats the GAN as a design assistant. Martinez trained models for specific sketching tasks and proposed an interface that allows the human designer to further “hand-elaborate” the model at any moment of the design process. Section (5) of this paper returns to the human-in-the-loop principle and elaborates on it.

Thom Mayne of Morphosis employed AI to develop operational strategies that generate output that could never be predicted. The studio developed Combinatorial Design Studies: a Grasshopper definition of one formal study, elaborated by GAN technology, provided a range of further combinatorial options [228]. The study is characteristic of the 2010s state of the art of AI in architectural design in crediting a learning machine with true creativity, adaptability, and intuition; a misconception shows already at this elementary level, where the research's mission statement explicitly contradicts a theory that has not (yet) been disproved [229]. On the other hand, the paradigm-breaking approach deserves appreciation.

Foster + Partners, another globally renowned architectural studio, does not stand aside: in its Applied R+D team, architects and engineers, together with expert programmers, combine the best of human intuition and computational rigor, working with new technologies such as augmented reality, machine learning, and real-time simulation [230].

In terms of practical use, and based on experience from other fields such as image processing, predictive simulations have been considered a benchmark. ComfortGAN, for example, investigates the challenges of predicting a building's indoor thermal comfort [231]. Structural design has also been looking to AI. Using variational autoencoders, for instance, research at MIT investigates how diverse structures can be generated while ensuring performance standards [232]. However, because structural design carries essential material liability, the still unsolved black-box problem of the algorithms, which does not allow reliance on the machine, has so far confined the deployment of AI in structural design to theory and conceptual drafting.

Typically deploying supervised learning, ZMO.AI promises interior design driven by AI that, however, soon proves unable to meet professional standards [233]. During 2022, hundreds of generative models were released, giving the year its label; a proportion of them are no longer (separately) supported in 2024. Stable Diffusion, an unheard-of state-of-the-art AI model available to everyone through a safety-centric open-source license [234], has been creating hyperrealistic art; ChatGPT dared to answer questions about the meaning of life; and Galactica by Meta has been learning humanity's scientific knowledge but has also revealed the limitations of large language models [235]. The new arrivals offered hierarchical text-conditional image generation with CLIP latents, high-resolution image synthesis with latent diffusion models, a dataset (LAION-5B) containing 5.85 billion image-text pairs used to train models such as Stable Diffusion and even CLIP itself, personalizing text-to-image generation using textual inversion, fine-tuning text-to-image diffusion models for subject-driven generation, text-to-video generation without text-video data, frame interpolation for large motion, a trainable “bag-of-freebies” setting a new state of the art for real-time object detectors, building open-ended embodied agents with Internet-scale knowledge, human-level play in the game of Diplomacy by combining language models with strategic reasoning (Cicero), training language models to follow instructions with human feedback, language models for dialog applications, robust speech recognition via large-scale weak supervision (Whisper), instant neural graphics primitives with a multiresolution hash encoding, scalable large-scene neural view synthesis (Block-NeRF), text-to-3D using 2D diffusion (DreamFusion), and Point-E, a system for generating 3D point clouds from complex prompts [236].
None of these tools addresses the architectural field directly, but early-adopting architects are getting acquainted with them to learn the benefits they provide; the next section looks at this more closely.

On an urban scale, attempts are ongoing to generate “typical style” road and circulation patterns and networks using, among others, Neural Turtle Graphics [237,238]. Over the past decade, the deployment of online platforms has provided end users with an adequate infrastructure [239], including for deploying generative AI: (former) Spacemaker [240,241], Cove.tool [242], Giraffe [243], or Creo [244] are a few examples of this growing ecosystem, offering simplified access to AI-based predictive models [245], generative design, real-time simulation, additive manufacturing, and IoT in order to iterate faster, reduce costs, and improve the product.

City digital twins are also embracing machine-learning algorithms. Commonly used in the engineering and manufacturing sectors, digital twins are increasingly being adopted by cities for various purposes, including emission reduction (primarily from buildings). AI-supported digital twins also help municipalities manage traffic effectively, contribute to economic development planning, and assist in climate action planning and monitoring; for facility-management purposes, they represent streets, buildings, trees, fire hydrants, and other urban assets using both live and historical data. Notable city digital twin implementations include Las Vegas, Transport for London, and Mannheim. Additionally, ABI Research predicts that over 500 urban digital twins will be deployed by 2025, resulting in US$280 billion in savings for city planners by 2030. AI strategies and models, in particular, help address challenges of data availability and awareness, which prove crucial for successful adoption of the technology [246].

Block-NeRF is a variant of Neural Radiance Fields that can represent large-scale environments, specifically, to render city-scale scenes spanning multiple blocks. Waymo built a grid of Block-NeRFs from 2.8 million images to create the largest neural scene representation to date, capable of rendering an entire neighborhood of San Francisco [247].
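How a grid of separate block models can serve one camera view may be sketched as follows. The sketch assumes, as we understand the Block-NeRF approach, that only blocks near the camera are rendered and that their outputs are blended with inverse-distance weights; the block names, coordinates, radius, and exponent below are placeholders, and no actual rendering is performed.

```python
import math


def blend_weights(camera, centers, radius, p=4.0):
    """Return normalized inverse-distance weights for nearby blocks.

    camera  -- (x, y) position of the camera (assumed 2D for simplicity)
    centers -- dict mapping a block name to its (x, y) center
    radius  -- blocks farther away than this are skipped entirely
    p       -- exponent of the inverse-distance weighting (placeholder value)
    """
    weights = {}
    for name, center in centers.items():
        distance = math.dist(camera, center)
        if distance <= radius:
            # Clamp the distance to avoid division by zero at a block center.
            weights[name] = max(distance, 1e-9) ** -p
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


# Hypothetical block centers along one street.
centers = {"block_a": (0.0, 0.0), "block_b": (10.0, 0.0), "block_c": (100.0, 0.0)}
weights = blend_weights(camera=(5.0, 0.0), centers=centers, radius=20.0)
```

A camera midway between two blocks draws equally on both, while a distant block is ignored, which is what lets the representation scale to a whole neighborhood.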

Not only start-ups, academia, and spin-offs of global star architectural studios are taking up AI: the global CAD giant Autodesk runs the Machine Intelligence AI Lab, and much of Autodesk's software, including Fusion 360, is (said to be) AI-enhanced and applies generative design today [248]. Nonetheless, broad as this listing may seem, the development of AI for and in AEC is still in its infancy, failing to catch up with LLMs, text-to-image processing, and the deployment of AI in internet search, content placement, and advertising, as well as in healthcare, pharmaceuticals, insurance, or the justice system with regard to custody and bail [249].

In the narrow field of architecture itself, the results so far are modest at best. None of the above applications has yet been widely used or appreciated in architectural practice. After years spent studying the limits of deploying AI in architectural design, in what amounts to a sort of basic research, Stanislas Chaillou, one of the most distinguished protagonists of the field, accepted that AI is not capable of replacing the human architect: in architecture (more than in other professions), AI shall not replace human intelligence but augment it. The question remains how. Today, as the co-founder of Rayon, a Paris-based startup, he preaches building the next generation of space design tools: a collaborative online platform aspiring to provide an updated toolset to the “architecture of the 90%” [250], that is, production architecture, as this paper calls it and elaborates on it later. The results, however, do not seem able to address professionals; so far, the application rather resembles the customer “design” tools that furniture producers offer their clients to ease purchasing [251,252].

A challenge of creativity

Design starts with a poiétic idea that emerges well before any “early-stage design” CAD tool can apply. Misunderstanding this begins with overlooking the poiétic starting points and nature of architecture (as argued in section (3) of this paper), and even with disregarding the general definition of architecture (as given by Encyclopedia Britannica, also in (3)). The misconception is widespread and entrenched throughout the AEC field and is close to a rule in the CAD field; consequently, the situation cannot be any better in the development of AI-driven “creative” tools for architectural design. As shown repeatedly in this paper, this intractable situation not only wastes effort on trying to make an algorithm truly creative but has so far also sidelined R&D in promising fields such as generative-pattern-based pre-design; design-development support by “advice whispering” (continuous assessment of parameters, design reviews, and evaluation of solutions); extended-reality technologies; imitation-based learning; spatial computing; novel network and algorithm paradigms; and computing efficiency and performance.

Nonetheless, the issue of poiésis in architectural design (section (3) delves into the concept), represented by the marginalized phase of pre-parametric, conceptual design, has recently been addressed by Wearrecho, a startup based in Prague, Czech Republic [253]. Released towards the end of 2023, the operational version 1.0 provides a comprehensive working platform that not only allows architects to make architecture as it deserves to be made (from spaces, diachronically, in space and motion) but also links this creative realm seamlessly with the parametric BIM domain (with Autodesk Revit as the backbone of the latter). The unique workflow elaborates a virtual twin of the architecture to be designed. Embedded in both domains, VR/XR and CAD/BIM, and approachable in whichever domain better accommodates the feature to be used, the twin is a dual one that keeps all properties and aspects regardless of the domain from which they originate.

As inspiration from robotics gives rise to the concept of production software that, as a robot, designs the development of the built environment, the question of creativity takes on a new form. As further elaborated in section (3) of this paper, poiésis, a precondition of true creativity, precludes parametrization, which is the starting point of CAD software. On the other hand, the seamless connection of free, diachronic architectural creation with the parametric design realm, as provided by Wearrecho, may open a path to grasping at least a part of the domain of true creativity. Until the question has been examined, the prospects should be kept open, considering either state-of-the-art CAD platforms or an extension beyond the parameters as the world in which design robots will learn and perform.

So far, however, the deployment of AI in architectural design has been lagging far behind any idea of a design robot. After getting acquainted with AI purely experimentally in the 2010s, Zaha Hadid Architects, along with many others in design disciplines and artistic professions, adopted AI applications such as DALL-E into their standard workflows. The core is support of project development using AI imagery: generating images that proceed step by step toward an articulate expression of the poiésis of the architecture within design concepts [254]. The approach sets a clear position on the deployment of AI in architecture and artistic design: AI should be embraced by architects rather than met with skepticism or fear … “I am not at all worried about facing the newly empowered competition enabled by AI”, as Patrik Schumacher puts it. On the other hand, critical voices are heard, too, such as that of The Guardian's architecture critic Oliver Wainwright, who claims that all this can be dismissed as superficial because it works via mere image generation [255], though putting architects' jobs … at risk [256,257].

A calm, artisanal point of view can prevail in positive terms: challenging the poiésis, the architectural narrative represented by the step-by-step AI-generated images may be able to develop the conceptual idea more straightforwardly or, better said, more quickly and more consistently. Labeling the approach as reverse prompting may explain the technology quite understandably. Economizing the process is a benefit of AI's deployment. This practical use of AI applications, which could be labeled as simulating human fantasy, turns out to be an augmentation of an architect's creative potential: it does not replace the architect, does not make the architect obsolete, and is not to be confused with true creativity, as will be elaborated in section (3) of the paper. And the skill of prompting only confirms the role of “a pencil and brush” in a trained architect's hand in the new era.

Tools like Lookx.AI and Krea.ai fill the role of the new pencil and brush in a straightforward manner. Lookx.AI allows the architect to generate images directly from SketchUp drafts easily and quickly [258]. The first Lookx.AI outputs can then be passed on to Krea.ai [259]. By upscaling, enhancing details, and correcting the uneven lines produced by Lookx.AI, Krea.ai shows promising potential, too.

Architecture students tend to be early adopters of the new tools. At the School of Architecture, Faculty of Civil Engineering, Czech Technical University in Prague, Czech Republic, students use AI tools such as DALL-E 3, PromeAI, Midjourney, Runway, PhotoshopAI, Adobe Firefly AI, Inpaint AI, or Lookx.AI to shorten the way from a simple sketch to a materiality-rich rendering or from a photo to a video. The benefit is time saved, which is more valuable to architecture students than popular opinion holds, and, in addition, an uplift in the quality of designs and their presentations. A new generation of architects may soon enter practice that, instead of fearing AI, will master it as a natural creative and economical aid and tool and will be eager for advances in the field.

Though only experimentally, an “AI design process” is being applied in the studio led by the author of this paper to “replace”, or rather enhance, the “traditional design process”. From the first encounter with the client through sketching, development of the CAD/BIM model of the architecture, and its visualizations or a VR tour, to the final design, diverse AI tools are applied concurrently with standard CAD tools such as SketchUp, Enscape, or Revit, and with Wearrecho. A unique working platform developed by the studio's internal startup, Wearrecho works through a dual virtual twin of the architecture under elaboration; it allows free diachronic creation of architecture from spaces (in the Unreal Engine environment), parametric elaboration of the design (in the BIM Revit environment), seamless switching between the two environments, and fluent communication between the design and project stakeholders, who can join the scene of the design both in place and remotely. After Midjourney [260] in the first phase, Stable Diffusion [261] joins the workflow pipeline to get the sketches ready for CAD/BIM model development. ControlNet [262] joins to provide visualizations or a VR tour.

Beyond practical application, but still in a very useful manner, the studies of Carlos Bañón, together with his online workshops, lead toward mastery in the deployment of Stable Diffusion, Midjourney, and ControlNet [263,264]. Text-to-image, diffusion, and other AI tools are gradually taking on the role of design assistants that, quickly and tirelessly, deliver variant impressions in response to prompts: depictions of possible architectural solutions with different degrees of adherence to the specified spatial parameters. The prompting technique improves constantly, both in terms of prepared templates and request categories and in terms of user training. Nevertheless, it is not exceptional for the supplied images to be rejected altogether in the end. Even then, their value as inspiration can remain. Mastery in using a tool is the challenge; the necessity of the craft and of true creativity remains untouched.

References

Introduction figure: Labbe, J.: Diverse AI-tools deployment at a multi-level design development. Luka Development Scenario, Prague. MS architekti, Prague. 2024. Author's archive.

Michal Sourek
