Generative CAD is coming: $60bn Autodesk is building its own large CAD model
CAD giant Autodesk is putting generative AI into production, too.
At press time, Autodesk (US:ADSK) stock was up more than 20% in 2024 at $293.32 per share, for a total market cap of $63.064 billion.
‘We're developing a generative AI base model that's unlike any model out there,’ Raji Arasu, Autodesk's executive vice president and chief technology officer, said recently at a public event. The base model she is referring to, revealed in May 2024, is called ‘Bernini’, a generative AI project that converts text, hand-drawn sketches, and other inputs into 3D files.
In the current large-model market, image-to-3D generation is nothing new: Google DeepMind's just-released Genie 2, Tencent's Hunyuan in China, and Shengshu Technology's VoxCraft, to name a few, offer similar functionality. But for a global CAD giant like Autodesk, Bernini is of real practical significance to its core business.
Uncovering Bernini
Bernini is named after Gian Lorenzo Bernini, the famous 17th-century Italian sculptor and architect. For training data, the model was trained on 10 million publicly available 3D shapes by Autodesk AI Labs in collaboration with the Chinese University of Hong Kong.
What sets Bernini apart from other foundation models is three characteristics of the 3D shapes it generates:
1. It generates functional 3D structures. For example, the water bottles it generates are hollow and can actually hold water, rather than merely looking like bottles;
2. It separates shape and texture. Generating shapes and textures separately gives users the freedom to adjust each independently, blend them, or make other design changes, avoiding the problem of texture and contour being conflated in 3D objects;
3. It provides a wide range of variants. Optimised for professional geometry workflows, it can generate multiple functional 3D shape variants from a single input, giving designers choice and accelerating their creative workflow.
Autodesk user interface screenshot
To achieve these characteristics, the base model must overcome a natural barrier in the design and manufacturing process: the AI needs to be fully attuned to the complex logic of the inputs and outputs of design work. ‘Accepting multimodal inputs such as text, sketches, voxels and point clouds replicates the creator's design process,’ Raji Arasu explained.
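As an illustration only (not Autodesk's actual API), a multimodal request of this kind might be modelled as a container that records which of the four input types it carries; every field name below is hypothetical:

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Point = Tuple[float, float, float]

@dataclass
class GenerationRequest:
    # Hypothetical multimodal input container; fields are illustrative.
    text: Optional[str] = None                 # natural-language prompt
    sketch_png: Optional[bytes] = None         # raster hand-drawn sketch
    voxels: Optional[List[Point]] = None       # occupied voxel coordinates
    point_cloud: Optional[List[Point]] = None  # raw scanned points

    def modalities(self) -> List[str]:
        # Report which input modalities this request actually carries
        return [name for name, value in (
            ("text", self.text),
            ("sketch", self.sketch_png),
            ("voxels", self.voxels),
            ("point_cloud", self.point_cloud),
        ) if value is not None]

req = GenerationRequest(text="a hollow water bottle",
                        point_cloud=[(0.0, 0.0, 0.0), (0.1, 0.0, 0.2)])
print(req.modalities())  # → ['text', 'point_cloud']
```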
In addition, since generating geometry such as 3D CAD requires reasoning about space and structure according to the laws of physics, it also demands a high level of precision and accuracy.
Viewed against that bar, Bernini's launch has not been quick. Most of a year has passed since its progress was revealed in May 2024. At that time, Autodesk released only a concept video; it was not until the company's user conference in San Diego in October that CEO Andrew Anagnost unveiled a Bernini preview.
Andrew Anagnost noted that Bernini is trained on public data, is not yet ready for commercial use, and has been opened up to the AI community. He did, however, disclose a possible business plan for Bernini: ‘The method of training Bernini is data-independent, so customers can use their own data, if needed, to optimise Bernini and continuously improve the model.’
How Bernini was ‘made’
Bernini's training also ran on NVIDIA GPUs, but beyond GPUs, Raji Arasu believes that the handling and use of data matters more when training a model. In her talk she revealed more about this process, dividing it into data handling, data preparation, cost- and efficiency-conscious model training, and complexity management for model inference.
‘Billions of objects and petabytes of data of different sizes, shapes and workloads need to be processed,’ Raji Arasu said.
Autodesk needed to build a data foundation in the cloud for its massive volume of large design files. For this it chose Amazon DynamoDB as the primary database and created a canonical data model that enables writes across hundreds of partitions with high throughput and near-zero latency.
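The partition-spreading idea can be sketched in plain Python. The schema below (a `design_id` partition key, `DESIGN#`/`REV#` key formats, MD5-mod hashing) is invented for illustration and is not Autodesk's actual data model; DynamoDB hashes partition keys internally, but the principle shown, that a high-cardinality partition key keeps writes evenly spread, is the same:

```python
import hashlib
from collections import Counter

NUM_PARTITIONS = 8  # illustrative; real tables have many more

def partition_for(design_id: str) -> int:
    # Mimic DynamoDB's internal key hashing with MD5 modulo a partition count
    digest = hashlib.md5(design_id.encode()).hexdigest()
    return int(digest, 16) % NUM_PARTITIONS

def canonical_item(design_id: str, revision: int, payload: dict) -> dict:
    # One canonical item shape for every design file, regardless of source format
    return {
        "pk": f"DESIGN#{design_id}",   # partition key: high cardinality
        "sk": f"REV#{revision:06d}",   # sort key: revision history per design
        "payload": payload,
    }

# Simulate 10,000 writes keyed by distinct design IDs and measure skew:
writes = Counter(partition_for(f"design-{i}") for i in range(10_000))
hottest_vs_mean = max(writes.values()) / (10_000 / NUM_PARTITIONS)
print(f"hottest partition at {hottest_vs_mean:.2f}x the mean load")
```

Because every design gets its own partition key, no single partition becomes a hot spot, which is what sustains high write throughput.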
Beyond data performance, Autodesk completed data preparation for foundation-model training by combining cloud services such as Amazon EMR, Amazon EKS, AWS Glue, and Amazon SageMaker to featurise, tag, and segment large amounts of complex historical data.
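The featurise/tag/segment steps can be sketched as a minimal pipeline; the record fields, tag rule, and shard size below are hypothetical stand-ins, not Autodesk's pipeline:

```python
# Hedged sketch of a feature/tag/segment pass over 3D records.

def featurise(record: dict) -> dict:
    # Derive simple features from raw geometry (vertex count, z-extent)
    verts = record["vertices"]
    record["n_vertices"] = len(verts)
    record["z_extent"] = (min(v[2] for v in verts), max(v[2] for v in verts))
    return record

def tag(record: dict) -> dict:
    # Illustrative tagging rule: label models by size class
    record["tags"] = ["large"] if record["n_vertices"] > 1000 else ["small"]
    return record

def segment(records, shard_size=2):
    # Split the corpus into fixed-size shards for distributed training
    for i in range(0, len(records), shard_size):
        yield records[i:i + shard_size]

corpus = [{"id": i, "vertices": [(0.0, 0.0, float(i))] * (10 * i)}
          for i in range(1, 6)]
prepared = [tag(featurise(r)) for r in corpus]
shards = list(segment(prepared))
print(len(shards))  # → 3
```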
In the model-training phase, Autodesk also faced problems such as GPU selection; it eventually used Amazon SageMaker as a unified solution for instance testing, infrastructure management, and more, letting the team focus on data preparation, model development, and customer-facing AI features.
Latency, cost, and performance all need to be properly managed when running model inference at scale. ‘Amazon SageMaker's auto-scaling and multi-model endpoints seamlessly support real-time and batch inference for high throughput, minimal latency and maximum cost efficiency,’ Raji Arasu said.
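Conceptually, a multi-model endpoint serves many models behind one endpoint, loading each model lazily on first request and evicting the least recently used when capacity runs out. The sketch below simulates that behaviour in plain Python; the capacity, model names, and return values are illustrative, not SageMaker internals:

```python
from collections import OrderedDict

class MultiModelEndpoint:
    # Toy simulation of one endpoint hosting many models with LRU eviction.
    def __init__(self, capacity: int):
        self.capacity = capacity
        self.loaded: "OrderedDict[str, str]" = OrderedDict()
        self.load_count = 0  # counts cold starts (model loads)

    def invoke(self, target_model: str, payload: str) -> str:
        if target_model not in self.loaded:
            if len(self.loaded) >= self.capacity:
                self.loaded.popitem(last=False)   # evict least-recently-used
            self.loaded[target_model] = f"weights:{target_model}"
            self.load_count += 1                   # cold start
        self.loaded.move_to_end(target_model)      # mark as most recent
        return f"{target_model} -> prediction for {payload!r}"

ep = MultiModelEndpoint(capacity=2)
ep.invoke("shape-v1", "bottle")
ep.invoke("texture-v1", "bottle")
ep.invoke("shape-v1", "chair")      # warm: shape-v1 already loaded
ep.invoke("variant-v1", "chair")    # evicts texture-v1 (least recent)
print(ep.load_count, list(ep.loaded))  # → 3 ['shape-v1', 'variant-v1']
```

Sharing one endpoint this way amortises idle capacity across models, which is the cost-efficiency point in the quote above.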
As this shows, Autodesk's build of Bernini makes heavy use of Amazon SageMaker, the best-known AI and machine-learning service from Amazon Web Services (AWS). A number of high-profile organisations, including Autodesk, are leveraging Amazon SageMaker HyperPod for model training. SageMaker HyperPod has recently rolled out several major updates, such as Flexible Training Plans, which create more automated training jobs that optimise costs by efficiently using a wider range of cheaper compute resources, and Task Governance, which prioritises different training tasks to maximise resource utilisation across model training, fine-tuning, and inference.
Based on these innovations, Autodesk ultimately cut base-model deployment time in half and increased AI productivity by 30 per cent while keeping operational costs stable. Raji Arasu also revealed that Autodesk has begun rolling out AI capabilities built on these base models to customers.
‘We act as a design partner to our customers, helping them balance parameters like material strength and cost so they can determine the best design. All of this is done to minimise tedious tasks and maximise creativity,’ Raji Arasu said.
Correct data: the prerequisite for training large models
Since ChatGPT, NVIDIA GPUs have become a coveted prize for major AI companies, and how many GPUs a large company orders in a year can even make front-page headlines. Yet, as the birth of Bernini shows, while GPUs certainly matter, it is data that determines the quality of large-model training.
And training a large model that can genuinely work in a business setting is a full-stack systems problem spanning many layers.
‘Large models are just one part of generative AI application innovation. Other capabilities need to be strengthened if generative AI applications are to be built well. First and foremost, you need to augment the large model's capabilities with an organisation's own data,’ Ruisong Chu, Amazon global vice president and president of AWS Greater China, said in a speech, expressing a similar view.
As for the hallucination problem, now a major concern with large models, it can also be tackled from the perspective of ‘metadata’, that is, ensuring the quality of the data in the knowledge base before training. ‘Metadata in the database, i.e. high-quality data that has been reviewed and approved, helps reduce latency and improve the responses of large models,’ said Mai-Lan Tomsen Bukovec, vice president of technology at AWS. She also cautioned organisations training large models to carefully distinguish data created by people from data generated by AI.
It is because of the importance of data that large models are accelerating the way companies monetise their data assets. Two months ago, Qunhe Technology (Manycore Tech), the parent company of Kujiale, a 3D spatial-design platform from China, also unveiled a new business plan, launching a data-training platform for embodied intelligence and other fields to open up the world's largest indoor scene-awareness deep-learning dataset. The company disclosed that the platform already holds more than 320 million 3D models, with an average of 77.8 million monthly active visitors. It will open up physically correct 3D spatial data assets, spatial-cognition solutions, and spatial-intelligence training services to AIGC, embodied-intelligence, AR/VR, and other enterprises.
Survey data shows that about 77% of companies will increase or significantly increase their investment in AI and emerging technologies within three years, and that the three areas where AI will deliver efficiency gains first are automation, data analysis, and assisted augmentation.
‘AI is reshaping every industry’ is no longer just a slogan. ‘More and more industries are becoming active in the generative AI space,’ said Uwem Ukpon, vice president of global services at AWS. The industrial application of generative AI has gradually expanded from finance at the very beginning to a wide range of sectors, including the public sector, traditional industries, and government organisations, as well as life sciences and healthcare. At the same time, generative AI offers enterprises new business opportunities; Autodesk is betting on large-scale, data-driven generative CAD as a potential future market.