
DeepSeek, one of China’s most prominent artificial intelligence companies, shook the AI community on Monday by releasing Janus Pro 7B, an open-source image generation model. The release follows the company’s recent launch of DeepSeek-R1, a reasoning-focused model, and further cements its status as one of the very few companies in the world capable of releasing fully open-source frontier foundation models. According to DeepSeek, Janus Pro 7B outperforms OpenAI’s DALL-E 3 on several benchmarks, and it is licensed permissively for both academic and commercial use.
According to its Hugging Face listing, Janus Pro 7B is the successor to the Janus and Janus Pro 1B models. DeepSeek characterizes it as an autoregressive framework that unifies multimodal understanding and generation. The listing also notes that its architecture and encoder have been scaled up substantially to improve its capabilities.
The model gains efficiency by decoupling visual encoding into separate pathways while still processing everything with a single, unified transformer architecture. For multimodal understanding, it uses a SigLIP-L vision encoder; for generation tasks, it uses a tokeniser with a downsample rate of 16.
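To make the downsample rate concrete, the short sketch below works through the arithmetic. It assumes the rate of 16 applies to each spatial dimension, so that every 16-by-16 block of pixels becomes one token. The function name and the 384-pixel image size are illustrative assumptions, not figures from DeepSeek's listing.

```python
def image_token_count(height: int, width: int, downsample_rate: int = 16) -> int:
    """Return the number of tokens in the downsampled grid.

    Assumes each spatial dimension is reduced by `downsample_rate`
    (an assumption for illustration, not a confirmed detail of Janus Pro).
    """
    if height % downsample_rate or width % downsample_rate:
        raise ValueError("image sides must be multiples of the downsample rate")
    return (height // downsample_rate) * (width // downsample_rate)


# Hypothetical 384x384 image: 384 / 16 = 24 per side, so 24 * 24 tokens.
print(image_token_count(384, 384))  # 576
```

Under these assumptions, doubling the image side quadruples the token count, which is one reason the choice of downsample rate matters for an autoregressive model that generates images token by token.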
According to DeepSeek’s internal tests, Janus Pro 7B scores 80 percent on the GenEval benchmark and 84.2 on DPG-Bench, beating both DALL-E 3 and Stable Diffusion models. Independent evaluations over the next few days will give a better idea of its all-round performance.
The model is open-sourced under the MIT license on GitHub and Hugging Face. The current version is limited in the output it can generate, though DeepSeek intends to improve that over time. Because the model can be hosted in the US, it sidesteps the concern that data might be sent to Chinese servers. OpenAI CEO Sam Altman has complimented DeepSeek’s rapid rise, calling the R1 model “impressive” and saying the competitive landscape in AI is heating up.