New AI Model Lumina-DiMOO Revolutionizes Multimodal Generation: Open-Source Tool Outperforms Existing Systems in Text-to-Image, Image Editing, and Understanding Tasks

An Omni Diffusion Large Language Model for Multi-Modal Generation and Understanding

Abstract: We introduce Lumina-DiMOO, an open-source foundational model for seamless multimodal generation and understanding. Lumina-DiMOO sets itself apart from prior unified models by using fully discrete diffusion modeling to handle inputs and outputs across various modalities. This approach allows Lumina-DiMOO to achieve higher sampling efficiency than previous autoregressive (AR) or hybrid AR-diffusion paradigms and to adeptly support a broad spectrum of multimodal tasks,...