BEGIN:VCALENDAR
VERSION:2.0
METHOD:PUBLISH
BEGIN:VEVENT
UID:20260610T012759-82970617-northwestern.edu
DTSTAMP:20260610T012759
DTSTART:20260522T150000
DTEND:20260522T170000
SUMMARY:Wangcheng Xu CS PhD Prospectus: Neuro-symbolic Visual Understanding of Structured Schematic Images
LOCATION:
DESCRIPTION:Structured schematic images--such as diagrams, maps, and puzzles--convey meaning through discrete visual elements and spatial relations. Although modern vision-language models offer strong semantic priors, they often struggle with fine-grained structure, precise relational grounding, and long-horizon state tracking. I propose a neuro-symbolic approach to schematic visual understanding centered on explicit, grounded intermediate representations.\nThe work builds on cognitive science accounts in which visual understanding proceeds from primitives and objects to qualitative spatial relations and task-relevant structures. It extends CogSketch, a cognitive visual-understanding system that represents scenes using glyphs and qualitative relations and links them to analogical reasoning systems such as SME, MAC/FAC, and SAGE. CogSketch plus analogy has been used in both cognitive modeling and deployed systems where the input is digital ink. This prospectus addresses the challenge of starting with images of structured semantic materials. VLMs are used as components in a representation-building pipeline that produces visual elements and spatial relations for downstream symbolic and analogical reasoning. Planned experiments to test these ideas include visual-to-formal encoding for puzzle solving, planning, and theory of mind reasoning. My claim is that explicit grounded representations offer a more interpretable, data-efficient, and reliable basis for advanced reasoning than direct end-to-end vision-language methods alone.\n\nPiP URL: https://planitpurple.northwestern.edu/event/642551
ORGANIZER;CN="Department of Computer Science (CS)":mailto:do-not-reply@northwestern.edu
END:VEVENT
END:VCALENDAR
