Future Autonomous Driving Systems to Integrate Hardware and AI Software into Mobile AI Agents

Deep News 12:10

At the Second Autonomous Driving Industry Development Forum held in Beijing on April 25, Deng Zhidong, a tenured professor in Tsinghua University's Department of Computer Science and Director of the Visual Intelligence Research Center at the Artificial Intelligence Research Institute, delivered a speech highlighting the current challenges and future direction of autonomous driving technology.

Professor Deng outlined four major challenges facing physical AI. First, large language models (LLMs) inherently lack a concept of physical space. A key problem for Vision-Language-Action (VLA) and Vision-Action (VA) frameworks is enabling these models to adhere to physical laws and possess spatial positioning capabilities. He noted that the introduction of language also brings subjectivity into the system.

Second, comprehending the real physical world remains the most significant bottleneck. Understanding of digital and text-based worlds has advanced considerably, but reaching a comparable level of comprehension in the real physical world, which includes recognizing multiple attributes of objects and their spatial relationships, is foundational yet extremely challenging. Success here could lead to the emergence of a super AI agent for autonomous driving.

Third, there is the challenge of bridging modules within a latent space in open, dynamic environments. Functions that were traditionally separate, such as perception, decision-making, and planning, are now integrated within a latent space and need a medium such as language to connect them. Professor Deng suggested that textual language is an optimal solution, as it can incorporate semantic and knowledge enhancements and thereby make decisions and plans more interpretable. Techniques such as VLA combined with Retrieval-Augmented Generation (RAG) can draw effectively on prior driving knowledge.

Fourth, he advocated for the development of an "empirical physical AI." Driving is a skill-based task reliant more on experience and technique than on intelligence or extensive knowledge. The difference between an experienced and a novice driver often comes down to accumulated mileage. Professor Deng proposed that such driving experience should be stored in a learnable, evolvable long-term memory, accessible during the reasoning phase to significantly reduce AI computational demands and enable smooth, agile responses.
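To make the idea of an experience memory concrete, the following is a minimal sketch and not a description of Professor Deng's system. It assumes a driving situation can be reduced to a short numeric vector (here, gap in meters and closing speed in meters per second) and that a costly planner is consulted only when no sufficiently similar situation has been stored. The names `ExperienceMemory`, `expensive_planner`, and `decide`, the 0.5 distance threshold, and the 3-second braking rule are all hypothetical and chosen purely for illustration.

```python
import math
from dataclasses import dataclass, field


@dataclass
class ExperienceMemory:
    """Hypothetical long-term store of (situation, action) pairs."""
    max_distance: float = 0.5  # assumed similarity threshold, illustrative only
    entries: list = field(default_factory=list)

    def store(self, situation, action):
        # The memory "evolves" by appending each newly solved situation.
        self.entries.append((tuple(situation), action))

    def recall(self, situation):
        """Return the action of the nearest stored situation, or None."""
        query = tuple(situation)
        best_action, best_dist = None, math.inf
        for stored, action in self.entries:
            dist = math.dist(stored, query)  # plain Euclidean distance
            if dist < best_dist:
                best_action, best_dist = action, dist
        return best_action if best_dist <= self.max_distance else None


def expensive_planner(situation):
    """Stand-in for a costly model call (hypothetical rule, not a real policy)."""
    gap_m, closing_speed = situation
    time_to_contact = gap_m / closing_speed if closing_speed > 0 else math.inf
    return "brake" if time_to_contact < 3.0 else "cruise"


def decide(memory, situation, stats):
    """Reuse stored experience when possible; otherwise plan and remember."""
    action = memory.recall(situation)
    if action is None:
        stats["planner_calls"] += 1
        action = expensive_planner(situation)
        memory.store(situation, action)
    return action


memory = ExperienceMemory()
stats = {"planner_calls": 0}
first = decide(memory, (20.0, 10.0), stats)   # unseen situation: planner runs
second = decide(memory, (20.2, 10.1), stats)  # near-duplicate: answered from memory
third = decide(memory, (60.0, 5.0), stats)    # unseen situation: planner runs again
```

In this toy run, the second decision is served from memory without calling the planner, which is the kind of compute saving the proposal points to. A real system would presumably use learned embeddings and a far richer notion of similarity.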

Additionally, Professor Deng summarized several core observations on the evolution of autonomous driving. The technology is shifting from segmented, hybrid closed-loop systems to end-to-end approaches driven by large models. Embodied intelligence in vehicles essentially involves installing a "cerebrum" and "cerebellum" onto a mature industrial product. World models can be categorized as external or internal: VLA frameworks need to develop embedded internal world models, while VA typically focuses on external world models and policy optimization.

Regarding AI's impact on automotive R&D, manufacturing, and product value paradigms, Professor Deng stated that the "data flywheel" would replace traditional blueprint models. The core competitiveness of automakers will transition from precision mechanical manufacturing capability to data closed-loop operational capability. This shift extends beyond autonomous driving to potentially encompass the entire enterprise. On the product value side, software profits are expected to surpass hardware profits. Hardware will become standardized and pre-installed, with future profits increasingly generated through over-the-air (OTA) software upgrades, subscription services, and other value-added offerings.

In conclusion, Professor Deng projected that future AI-driven autonomous driving systems will be integrated mobile AI agents combining a hardware platform with AI software. Compared to pre-installed standardized hardware, diversified AI software and intelligent value-added services are likely to occupy the higher end of the value chain. To achieve this, it will be essential to address user needs concerning safety, efficiency, comfort, and emotional support during the use of AI-powered vehicles.

