AI Events (@eventai) P.1092

Presentation Title: Counting Understanding in Vision-Language Models

Presenter: Arash Marioriyad

🌀 Abstract:
Counting is one of the most significant compositional-understanding failure modes in vision-language models (VLMs) such as CLIP. While humans readily generalize over numerical concepts even in early stages of development, these models often struggle to interpret numbers beyond three, and the difficulty grows as the number increases. This presentation explores the counting-related limitations of VLMs and examines the solutions proposed in the field to address them.

📄 Papers:
- Teaching CLIP to Count to Ten (ICCV, 2023)
- CLIP-Count: Towards Text-Guided Zero-Shot Object Counting (ACM-MM, 2023)

Session Details:
- Date: Sunday
- Time: 5:00 - 6:00 PM
- Location: Online at vc.sharif.edu/ch/rohban
@RIMLLab

〰️〰️〰️〰️〰️

This channel was created to spread the word about AI-related events such as symposia, conferences, workshops, and classes.

@eventai

www.tgoop.com/eventai/1092

3.3K views · Nov 18 at 19:33

Create: 2024-11-18
Last Update: 2024-12-24 22:48:48

By AI Events
