According to exclusive information, SenseTime Group Inc. (HKEX: 0020) is confidentially developing a new multimodal large model. The model primarily targets "design" scenarios, and the effort is led by the company's co-founder and chief scientist, Lin Dahua. The goal is a "thinking" image generation model that can compete with OpenAI's GPT-Image-2.
Internally, the model is codenamed "U1 Pro." It is being developed by the SenseTime Research Institute and belongs to the company's SenseNova model family. Invitation-only internal testing is expected to begin in July of this year, after which the model will be offered to clients.
According to an informed source, when faced with complex design requirements, the model can work like a "thinking designer," carrying out a long-cycle process of design, generation, and review. It also supports 8K resolution output. The source added that in numerous internal evaluations using the same prompts, the images generated by "U1 Pro" were comparable to, and in some cases better than, those produced by GPT-Image-2.
In the text-to-image scoring of the LMSYS Chatbot Arena, GPT-Image-2 significantly outperformed Google's Nano Banana 2 in image quality, text rendering, and instruction following, earning widespread acclaim within the design industry. Industry observers widely expect OpenAI to release new AI image generation models in the near future, with "design" as a key strategic focus.
The emergence of details about SenseTime's new "U1" model signals to the market that beyond the programming domain led by top AI firms like Anthropic and Zhipu AI, "design" is becoming the next major arena for competition among multimodal models.