Machine Learning

Unified Open-World Segmentation with Multi-Modal Prompts

Recent years have witnessed the rapid development of open-world image segmentation, including open-vocabulary segmentation and in-context segmentation. Nonetheless, existing methods are limited to a single modality prompt, which lacks the flexibility and accuracy needed for complex object-aware prompting. In this work, we present COSINE, a unified open-world segmentation model that Consolidates Open-vocabulary Segmentation and IN-context sEgmentation. By framing open-vocabulary task and in-context segmentation task as promptable segmentation tasks, COSINE supports diverse modalities of input…

Recent years have witnessed the rapid development of open-world image segmentation, including open-vocabulary segmentation and in-context segmentation. Nonetheless, existing methods are limited to a single modality prompt, which lacks the flexibility and accuracy needed for complex object-aware prompting. In this work, we present COSINE, a unified open-world segmentation model that Consolidates Open-vocabulary Segmentation and IN-context sEgmentation. By framing open-vocabulary task and in-context segmentation task as promptable segmentation tasks, COSINE supports diverse modalities of input… Read More

Related Posts

The Complete Guide to Docker for Machine Learning Engineers

Reasoning’s Razor: Reasoning Improves Accuracy but Can Hurt Recall at Critical Operating Points in Safety and Hallucination Detection

Exploring LLMs with MLX and the Neural Accelerators in the M5 GPU

Leave a Reply Cancel reply