Events

Apr

9

IDEAL Workshop on Learning in Networks: Discovering Hidden Structure

Department of Computer Science (CS)

All Day 3514, Mudd Hall (formerly Seeley G. Mudd Library)

EVENT DETAILS

April 9th and 10th
Description:

Real-world networks often exhibit a hidden structure, which we wish to infer. For example, many networks exhibit community structure. Inferring communities is a valuable tool in network analysis; community detection has been used in a wide array of applications including recommender systems (e.g. Netflix), webpage sorting, fraud detection, and neurobiology. Inspired by these real-world networks, researchers in probability, statistics, information theory, and machine learning have studied structure recovery problems in random graph models. In addition to community detection, problems of this type include graph matching, recovery of planted subgraphs, and inference of graph properties. This workshop will bring together leading experts in the field, and both local and external participants, with the goal of sharing the latest advances and launching new collaborations.

Form to register:
https://docs.google.com/forms/d/e/1FAIpQLSfZoMhFJMR00yQDSmYQYg33qQ-lpKtnGBm3jyV3XMXzq7Sgrg/viewform

Speakers:
Elchanan Mossel (MIT)
Tselil Schramm (Stanford University)
Alex Wein (UC Davis)
Jiaming Xu (Duke University)

Logistics:
Date: April 9th-10th
In-person Location: Northwestern University: Mudd Library 3rd floor, 2233 Tech Drive, Evanston

TIME Tuesday, April 9, 2024

LOCATION 3514, Mudd Hall (formerly Seeley G. Mudd Library)

CONTACT Wynante R Charles wynante.charles@northwestern.edu

CALENDAR Department of Computer Science (CS)

Jul

31

Designing interactive systems for reasoning with ontological uncertainty in data analysis

Department of Computer Science (CS)

2:00 PM ITW, Ford Motor Company Engineering Design Center

EVENT DETAILS

Data analysis processes are regularly employed to inform high-stakes decisions. However, typical workflows for implementing data analysis overlook the considerable subjectivity that is baked into the analysis process and how that might impact the results. This subjectivity reflects ontological uncertainty---implicit, qualitative uncertainty regarding how data should be analysed and modelled. While statisticians have proposed techniques such as multiverse analysis to surface implicit ontological uncertainty, analysts currently do not possess the tools to implement, evaluate and communicate the results of such analyses. This dissertation bridges that gap by providing a pipeline for systematically reasoning about ontological uncertainty that is implicit in data analysis. In the first study, I developed and evaluated multiverse, an R library, which lowers the barrier to implementing multiverse analysis. The library provides flexible and expressive syntax to allow analysts to declare any alternative data analysis step through local changes in code. The library is designed to integrate into both computational notebook and scripting data analysis workflows, and optimises execution by pruning redundant computations. I evaluated how the multiverse R library supports programming multiverse analyses using (a) principles of cognitive ergonomics, and (b) case studies based on semi-structured interviews with researchers who have successfully implemented an end-to-end analysis using multiverse. I identified design trade-offs (e.g., increased flexibility versus learnability), and suggested future directions for supporting analysts in adopting multiverse analyses (e.g., how to evaluate a multiverse analysis?). In the second study, I addressed the issue of evaluation by first identifying principles for validating the composition of, and interpreting the uncertainty in, the results of a multiverse analysis.
I designed Milliways, a novel interactive visualisation system, to support the principled validation and interpretation of multiverse analyses. Milliways provides interlinked panels presenting result distributions, individual analysis composition, multiverse code specification, and data summaries. In the third study, I compared two approaches for depicting ontological uncertainty---ensembles and p-boxes---by conducting experiments to investigate the impact of the visual representation on how multiple uncertainty distributions are interpreted. Based on these results, I identified how the results of multiverse analyses should be visualised so that viewers adopt the desired (possibilistic) interpretation of ontological uncertainty. Together, these three studies outline a systematic approach for surfacing, reasoning about, and communicating ontological uncertainty that is often implicit in data analysis processes.

TIME Thursday, July 31, 2025 at 2:00 PM - 4:00 PM

LOCATION ITW, Ford Motor Company Engineering Design Center

CONTACT Wynante R Charles wynante.charles@northwestern.edu

CALENDAR Department of Computer Science (CS)

Aug

4

Clustering without knowing the number of clusters.

Department of Computer Science (CS)

1:30 PM 3501, Mudd Hall (formerly Seeley G. Mudd Library)

EVENT DETAILS

How much does prior knowledge about the number of clusters, $k$, influence the statistical feasibility and algorithmic performance of clustering methods? In this thesis, we explore two complementary clustering paradigms to elucidate the central role played by cluster cardinality. We first investigate Gaussian Mixture Models, a widely used framework for clustering high-dimensional data. Here, knowledge of $k$ proves critical: we demonstrate a fundamental statistical barrier wherein mixtures of spherical Gaussians with an unknown number of components become indistinguishable from a single Gaussian distribution, unless their pairwise mean separation is on the order of $\min(\sqrt{\log k}, \sqrt{d})$. Without prior knowledge or stronger assumptions, even the detection of multiple clusters is impossible, highlighting that knowing the correct number of components is an inherent necessity in Gaussian Mixture Models. In contrast, we examine Correlation Clustering, which is explicitly formulated without reference to the number of clusters. Objects are clustered based solely on possibly inconsistent pairwise similarity or dissimilarity labels. We provide improved approximation algorithms for the local-error objective in both complete and incomplete information settings. Furthermore, we introduce a more general model, which we call Correlation Clustering with Asymmetric Classification Errors, and present novel approximation guarantees tailored to this richer scenario.

Together, these results reveal a fundamental dichotomy: the cluster cardinality is either a crucial piece of structural information that defines feasibility, as in Gaussian Mixture Models, or an intentionally omitted parameter whose absence motivates alternative local objective functions, as in Correlation Clustering. Leveraging this dichotomy is thus essential for effectively choosing and employing clustering methods in practice.

TIME Monday, August 4, 2025 at 1:30 PM - 4:00 PM

LOCATION 3501, Mudd Hall (formerly Seeley G. Mudd Library)

CONTACT Wynante R Charles wynante.charles@northwestern.edu

CALENDAR Department of Computer Science (CS)

Sep

25

Bagel Thursday

Department of Computer Science (CS)

9:00 AM

EVENT DETAILS

TBA

TIME Thursday, September 25, 2025 at 9:00 AM - 11:00 AM

CONTACT Wynante R Charles wynante.charles@northwestern.edu

CALENDAR Department of Computer Science (CS)
