
Pico-Banana-400K: A Large-Scale Dataset for Text-Guided Image Editing

Recent advances in multimodal models have demonstrated remarkable text-guided image editing capabilities, with
systems like GPT-4o and Nano-Banana setting new benchmarks. However, the research community’s progress remains
constrained by the absence of large-scale, high-quality, and openly accessible datasets built from real images. We
introduce Pico-Banana-400K, a comprehensive 400K-image dataset for instruction-based image editing. Our dataset is
constructed by leveraging Nano-Banana to generate diverse edit pairs from real photographs in the OpenImages collection.
What distinguishes…

