The paper acceptance results for the international artificial intelligence conference AAAI 2026 were recently announced. Among the accepted papers is "FastDriveVLA: Efficient End-to-End Driving via Plug-and-Play Reconstruction-based Token Pruning," written jointly by XPeng Inc. and the National Key Laboratory of Multimedia Information Processing at the School of Computer Science, Peking University. Its main contribution is FastDriveVLA, an efficient visual token pruning framework designed specifically for end-to-end autonomous driving VLA models.
Reportedly, FastDriveVLA includes a plug-and-play visual token pruner called ReconPruner. During inference on the vehicle-side model, ReconPruner can be embedded directly into an autonomous driving VLA model to prune visual tokens, without retraining the entire model. To train the pruner, the team built a dedicated dataset named nuScenes-FG, containing 241,000 image-mask pairs from six camera perspectives. This large-scale annotated dataset for foreground segmentation in driving scenes can also be used in future autonomous driving research.
In tests on the nuScenes autonomous driving dataset, the pruning framework achieved state-of-the-art (SOTA) results across various pruning ratios. With 25% of visual tokens pruned, driving performance showed almost no degradation, and the L2 trajectory error and collision rate metrics even surpassed those of the unpruned baseline model. With 50% of tokens pruned, performance was more balanced across all metrics. At the same time, the inference efficiency of the VLA model improved significantly.
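To make the idea of a pruning ratio concrete, here is a minimal Python sketch of score-based token pruning. It is an illustration only and not the ReconPruner implementation: the function name `prune_tokens`, the per-token importance scores, and the toy values are all assumptions introduced here. The sketch simply drops the lowest-scoring fraction of tokens and keeps the rest in their original order.

```python
from typing import List, Sequence, Tuple


def prune_tokens(
    tokens: Sequence[Sequence[float]],
    scores: Sequence[float],
    prune_ratio: float,
) -> Tuple[List[Sequence[float]], List[int]]:
    """Drop the lowest-scoring fraction of tokens, keeping original order.

    Hypothetical helper: `scores` stands in for whatever importance
    estimate a pruner produces; it is not taken from the paper.
    """
    if len(tokens) != len(scores):
        raise ValueError("tokens and scores must have the same length")
    if not 0.0 <= prune_ratio < 1.0:
        raise ValueError("prune_ratio must be in [0, 1)")

    n_keep = len(tokens) - int(len(tokens) * prune_ratio)
    # Rank token indices by score, highest first, and keep the top n_keep.
    ranked = sorted(range(len(tokens)), key=lambda i: scores[i], reverse=True)
    kept = sorted(ranked[:n_keep])  # restore the original token order
    return [tokens[i] for i in kept], kept


# Toy example: 8 two-dimensional tokens with made-up importance scores.
toy_tokens = [[float(i), float(i) * 2.0] for i in range(8)]
toy_scores = [0.9, 0.1, 0.8, 0.2, 0.7, 0.3, 0.6, 0.4]

kept_25, idx_25 = prune_tokens(toy_tokens, toy_scores, 0.25)
kept_50, idx_50 = prune_tokens(toy_tokens, toy_scores, 0.50)
print(idx_25)  # indices kept when 25% of tokens are pruned
print(idx_50)  # indices kept when 50% of tokens are pruned
```

Because the downstream model then processes fewer visual tokens, a higher pruning ratio generally means less computation per frame, which is the trade-off the reported 25% and 50% settings explore.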
The FastDriveVLA framework, proposed jointly by XPeng Inc. and Peking University, establishes a new paradigm for efficient visual token pruning in autonomous driving VLA models and sets a new benchmark for the efficient deployment of large vehicle-side models. XPeng Chairman He Xiaopeng commented on Weibo: "We are very pleased to have achieved another new breakthrough on our path to exploring L4. We will continue to advance in the field of Physical AI and look forward to the second-generation VLA delivering an even better smart driving experience to our Peng friends."