tgoop.com/eventai/1092
Presentation Title: Counting Understanding in Vision-Language Models
Presenter: Arash Marioriyad
🌀 Abstract:
Counting-related challenges represent some of the most significant compositional understanding failure modes in vision-language models (VLMs) such as CLIP. While humans, even in early stages of development, readily generalize over numerical concepts, these models often struggle to accurately interpret numbers beyond three, with the difficulty intensifying as the numerical value increases. In this presentation, we explore the counting-related limitations of VLMs and examine the proposed solutions within the field to address these issues.
📄 Papers:
- Teaching CLIP to Count to Ten (ICCV, 2023)
- CLIP-Count: Towards Text-Guided Zero-Shot Object Counting (ACM-MM, 2023)
Session Details:
- Date: Sunday
- Time: 5:00 - 6:00 PM
- Location: Online at vc.sharif.edu/ch/rohban
@RIMLLab
〰️〰️〰️〰️〰️
This channel was created to announce artificial intelligence events such as symposiums, conferences, workshops, and classes.
@eventai
BY AI Events