Nvidia Launches Multimodal AI Model to Combine Vision, Speech and Language

MT Newswires Live 00:49

Nvidia (NVDA) said Tuesday it has launched Nemotron 3 Nano Omni, an open multimodal AI model designed to combine vision, speech and language capabilities into a single system.

The model can process text, images, audio and video together, eliminating the need for separate models, and delivers higher accuracy in tasks such as document intelligence, audio-video reasoning and computer-use applications, the company said.

Nvidia said the model delivers up to nine times higher throughput than comparable models, reducing costs and improving scalability while maintaining responsiveness.

Nemotron 3 Nano Omni has been adopted by companies such as Foxconn and Palantir (PLTR), and others such as Dell Technologies (DELL) and DocuSign (DOCU) are evaluating the technology, Nvidia said.

Price: 209.59, Change: -7.02, Percent Change: -3.24
