EVENT DETAILS
Structured schematic images, such as diagrams, maps, and puzzles, convey meaning through discrete visual elements and spatial relations. Although modern vision-language models (VLMs) offer strong semantic priors, they often struggle with fine-grained structure, precise relational grounding, and long-horizon state tracking. I propose a neuro-symbolic approach to schematic visual understanding centered on explicit, grounded intermediate representations.
The work builds on cognitive science accounts in which visual understanding proceeds from primitives and objects to qualitative spatial relations and task-relevant structures. It extends CogSketch, a cognitive visual-understanding system that represents scenes using glyphs and qualitative relations and links them to analogical reasoning systems such as SME, MAC/FAC, and SAGE. CogSketch combined with analogical reasoning has been used in both cognitive modeling and deployed systems where the input is digital ink. This prospectus addresses the challenge of starting instead from images of structured semantic materials. VLMs are used as components in a representation-building pipeline that produces visual elements and spatial relations for downstream symbolic and analogical reasoning. Planned experiments to test these ideas include visual-to-formal encoding for puzzle solving, planning, and theory-of-mind reasoning. My claim is that explicit, grounded representations offer a more interpretable, data-efficient, and reliable basis for advanced reasoning than direct end-to-end vision-language methods alone.
TIME Friday May 22, 2026 at 3:00 PM - 5:00 PM
CONTACT Wynante R Charles wynante.charles@northwestern.edu
CALENDAR Department of Computer Science (CS)