COMP_SCI 496: Special Topics In Data Systems Seminar
Prerequisites
Permission by instructor
Description
As LLMs are increasingly embedded in real-world systems, prompts have become the primary means of encoding user intent. They steer data retrieval, direct output generation, guide recovery from runtime errors, and coordinate tool use. Yet despite their centrality, prompts are often crafted manually through ad hoc trial and error, and most LLM pipelines neither version prompts nor track how they evolve over time. Even worse, prompts typically remain static, opaque strings, completely detached from the broader program logic.
At the same time, these pipelines are rapidly evolving into complex data-centric applications that involve interactions with knowledge bases, conditional fallback, data validation, and multi-agent orchestration. Several popular frameworks (e.g., LangChain, DSPy) allow developers to easily construct LLM processing pipelines, and semantic query engines provide a declarative interface to LLM-augmented query processing. Yet, in all of these approaches, prompt logic exists outside the system: it is a black box that stays hidden from the optimizer and execution engine at runtime.
This course will explore modern data systems that integrate LLMs as a fundamental component of the query processing engine. It is intended for graduate and advanced undergraduate students interested in systems research. Students will learn how to: (1) read and critically evaluate systems research papers; (2) craft presentations that distill and convey core research ideas; and (3) plan and execute a final project that answers an interesting systems research question.
- This course fulfills the Technical Elective area.
REFERENCE TEXTBOOKS: N/A
REQUIRED TEXTBOOK: N/A
COURSE COORDINATORS: Andrew Crotty
COURSE INSTRUCTOR: Andrew Crotty