Center for Deep Learning


News, Blog, Events

CDL is Looking for a Post-Doc:

Center for Deep Learning (CDL) seeks an exceptional postdoctoral researcher in the area of deep learning. CDL comprises faculty members and PhD candidates from a variety of departments working on deep learning and reinforcement learning problems in the areas of natural language processing, bioinformatics, customer and business intelligence, computer vision, and the Internet of Things. The research is primarily focused on designing new architectures and models, novel optimization algorithms, and algorithms and implementations at the system and compiler level. Prior knowledge of model serving and Kubernetes is an advantage. We seek applicants with knowledge and expertise in these areas who are eager to conduct independent research and to lead projects involving PhD candidates.

Examples of current projects:
● Text summarization models
● Dialog systems
● Several deep learning aspects related to model serving
● NLP in healthcare and bioinformatics
● Improving deep learning at the compiler level
● Reinforcement learning (related to autoML)

Application material:

Please send the following material by email:

CV, the names of three references, a manuscript on which you are the first author, a statement (of at most two pages) outlining your research accomplishments, your future research plan, and your career ambitions.
A successful candidate should have a PhD and a well-established research track record as demonstrated by publications and open source software. The PhD must have been obtained in the last five years.


Responsibilities:
● Conduct independent research
● Lead deep learning projects with PhD students
● Interact with sponsoring companies regarding select projects
● Participate in outreach (assist organizing a workshop, CDL advisory board meetings)
Northwestern University is ranked the thirteenth most innovative university and twentieth overall by the World University Rankings. Most of the interactions will be with faculty and students from the Departments of Industrial Engineering and Management Sciences and Computer Science.


CDL Seminar:

Analytic Applications to Change the World

Timothy Chou

Tuesday, November 10, 3 p.m. - 4 p.m. CST

This talk is organized as three mini-TED talks. In the first I will talk about a new class of software: enterprise analytic/AI applications. For too long the state of the art of analytics and data science has been based on the "wrong rock" method. It's time to take the next steps that we've already seen happen in enterprise workflow applications.

Next, while there is a lot of data generated by People, it is dwarfed by what is generated by Things. Unfortunately, most of the techniques for the Internet of People don't work for the Internet of Things. I will discuss five big challenges in connecting and collecting data from Things.

Finally, I would like to challenge the audience to use their talents to truly change the world. Twenty-five percent of the globe's population will be in Africa in less than twenty years. All developing economies share the needs for the basic infrastructure of power, water, food, education, and healthcare. Will we try to repeat what we did in the first world, or do we have an opportunity to re-invent the future? As a small example I will close with a discussion of our Pediatric Cloud Project - a project to connect all 1,000,000 healthcare machines in all the children's hospitals in the world and through software and data transform children's healthcare.

Timothy Chou

Timothy Chou began his career at one of the original Kleiner Perkins startups, Tandem Computers. He has had a long career in enterprise software. He served as President of Oracle On Demand beginning in 2000, which was the beginning of Oracle's multi-billion dollar cloud business. Today he serves on the board of directors of two public companies: Blackbaud and Teradata. In parallel to his commercial career he started teaching introductory computer architecture at Stanford University in 1982, and after leaving Oracle he returned to Stanford and launched the first class on cloud computing.

Watch the recording here.


Diego Klabjan Featured on the Data Mindset Podcast

Diego Klabjan was interviewed for the Data Mindset Podcast. He discusses contemporary topics in deep learning and how to bring up the next generation of students in the field.



REFIT: aRtificial intElligence For Internet of Things

CDL Blog Post 9/3/2020

Matthew Alvarez MSCS ’21, Aditya Sinha MSIT ’20, Nancy Zhang MSiA ’20, J Montgomery Maxwell MS ESAM ’20


In recent years the popularity of the Internet of Things (IoT) ecosystem has exploded. One area of interest in the IoT landscape is the combination of IoT infrastructure and machine learning. Traditionally, such systems use existing streaming solutions (e.g., AWS Kinesis, Azure Stream Analytics, Google Cloud Platform Stream Analytics); however, these solutions typically require a significant amount of work to integrate and support a specific use case.

Northwestern University’s Center for Deep Learning is developing an open-source streaming platform that is designed to seamlessly integrate IoT workloads with modern machine learning approaches. The aRtificial intElligence For Internet of Things (REFIT) system is designed to be highly scalable and fault tolerant by building on production-tested open-source technologies. REFIT is also packaged with Helm, a package manager for Kubernetes, which enables you to deploy REFIT as a Helm chart on a cloud provider (such as AWS, GCP, or Azure), on premises, or locally using minikube.

Note: local deployments on minikube are recommended only for development or testing.


The primary goals of REFIT are as follows:

  • Produce predictions for a stream of events
  • Continually improve the predictive component of the system
  • Scale to meet the elastic demand of IoT infrastructure

Some secondary goals include:

  • Secure data in transit & at rest
  • Integrate with upstream components
  • Multitenancy


System Configuration

REFIT is meant to be a highly configurable system that can support many IoT use cases out of the box. To make REFIT work for your use case, you only need to provide a “project schema”: a YAML file that consists of a few pieces of key information and a set of fields that can come from your IoT infrastructure.
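As a purely hypothetical illustration of such a file (the project name, keys, and field names below are invented, not REFIT's actual schema), a project schema might look like:

```yaml
# Hypothetical project schema; all keys and field names are illustrative only.
project: building-sensors
fields:
  - name: sensor_id
    type: string
  - name: temperature
    type: double
  - name: timestamp
    type: timestamp
```

The fields would describe the sensor readings your IoT infrastructure emits, so the ingestion and inference services know what to expect.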

System Design

REFIT implements a publish-subscribe architecture, using Apache Pulsar to communicate between individual components of the system. In addition to Pulsar, REFIT uses Apache Cassandra to persist data on disk and Redis as an in-memory cache for machine learning models that are ready for use. On top of these components, REFIT runs four services to ingest, train on, predict from, and visualize data from upstream IoT components.
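The publish-subscribe decoupling that Pulsar provides can be illustrated with a minimal in-memory sketch (the topic name and message fields here are hypothetical; a real deployment would use the Pulsar client API rather than this toy broker):

```python
from collections import defaultdict

class Broker:
    """Minimal in-memory stand-in for a pub-sub broker such as Apache Pulsar."""
    def __init__(self):
        self.subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        # A service registers interest in a topic without knowing the publisher.
        self.subscribers[topic].append(handler)

    def publish(self, topic, message):
        # The publisher delivers to every subscriber of the topic.
        for handler in self.subscribers[topic]:
            handler(message)

broker = Broker()
received = []
# e.g., the Inference Service subscribes to an ingestion topic...
broker.subscribe("ingestion", lambda msg: received.append(msg))
# ...and the Ingestion Service publishes sensor events to it.
broker.publish("ingestion", {"sensor_id": "s1", "value": 21.5})
```

Because publisher and subscriber only share a topic name, each REFIT service can be scaled or replaced independently.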

Ingestion Service: The REFIT Ingestion Service is an Apache Camel application that is designed to integrate with a variety of upstream systems and services. The Ingestion Service's purpose is to receive sensor data from existing IoT infrastructure, translate the information to an internal protocol buffer representation, and send the serialized data to the Inference Service.

AI Application: The AI Application component of the REFIT system is where the machine learning models are created. Periodically, the AI Application reads historical data from the database and trains a new machine learning model. Once the new model is ready for use, the AI Application saves it to an in-memory data store and notifies the Inference Service of the new model.
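That retrain-and-notify cycle can be sketched as follows, with a dictionary standing in for the Redis cache and a list standing in for a Pulsar notification topic (the trivial mean "model" and all names are illustrative, not REFIT's actual code):

```python
import statistics

model_store = {}    # stands in for the Redis in-memory model cache
notifications = []  # stands in for a Pulsar topic the Inference Service consumes

def train_model(history):
    """'Train' a trivial model: always predict the mean of historical readings."""
    mean = statistics.mean(history)
    return lambda x: mean

def retrain_cycle(history, version):
    # Read historical data, train, save to the in-memory store, notify.
    model = train_model(history)
    model_store[version] = model
    notifications.append(version)
    return model

retrain_cycle([20.0, 22.0, 24.0], version="v1")
```

The Inference Service would react to the notification by loading version "v1" from the cache and swapping it in for future predictions.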

Inference Service: The Inference Service is an Apache Flink streaming application that consumes events from the IoT infrastructure and events from the AI Application. Using Flink’s Stateful Function API the Inference Service is able to extract features from IoT data and create predictions using the machine learning model provided by the AI Application. After the Inference Service extracts features and creates a prediction, it will publish an event for the Integration Service to consume.
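A minimal sketch of that stateful feature-extraction-and-prediction step, with a plain Python object standing in for Flink's keyed state (the feature names and the threshold "model" are invented for illustration):

```python
class InferenceState:
    """Per-sensor state, analogous to keyed state in Flink Stateful Functions."""
    def __init__(self, window=3):
        self.window = window
        self.values = []

    def extract_features(self, value):
        # Keep a rolling window of recent readings and derive features from it.
        self.values = (self.values + [value])[-self.window:]
        return {
            "latest": value,
            "rolling_mean": sum(self.values) / len(self.values),
        }

def predict(features, model):
    # Apply the model supplied by the AI Application to the extracted features.
    return model(features["rolling_mean"])

state = InferenceState()
model = lambda x: x > 25.0  # stand-in for a model from the AI Application
features = state.extract_features(30.0)
prediction = predict(features, model)
```

In REFIT, the resulting prediction would be published as an event for the Integration Service to consume.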

Integration Service: The Integration Service is the final internal service in the REFIT application. Much like the Ingestion Service, the Integration Service is an Apache Camel application that is designed to integrate with a variety of downstream systems and services. The Integration Service receives event data along with predictions from the Inference Service, encrypts the data, and routes it to REFIT's Cassandra database. The Integration Service also provides a Grafana integration that allows you to visualize event data and predictions in real time. In addition to its existing functionality, you can configure the application to integrate with many downstream systems using Camel's many components or by creating your own custom component.

Grafana: REFIT’s Integration Service enables the system to easily integrate with many other applications. One such application is Grafana, which REFIT uses to allow you to visualize and monitor data streams in real time. REFIT’s modular approach allows the system to dynamically generate dashboard templates that can act as a base to visualize, manipulate, and learn from the trends in the data.

Efficient Architecture Search for Continual Learning

July 16, 2020

Diego Klabjan, Qiang Gao, and Zhipeng Luo have submitted a new paper for publication that presents a novel approach to achieving higher classification accuracy in continual learning with neural networks. Read about CLEAS.

New A.I. Tool Created by Center for Deep Learning Team Featured in Northwestern News

June 18, 2020

CAVIDOTS provides easy-to-skim summaries of academic papers.  Read the full article here, and a detailed blog post below. 

Center for Deep Learning creates COVID-19 query tool, organizing key information for medical researchers in a large multi-document dataset

Ning Wang, Diego Klabjan, Han Liu

CAVIDOTS — Corona Virus Document Text Summarization

April 22, 2020

The coronavirus has severely endangered public health and damaged economies. To provide medical researchers with more efficient tools to acquire relevant and salient information to fight the virus, we developed a query tool, CAVIDOTS (CoronA VIrus DOcument Text Summarization), accessible here, for investigating COVID-19 related academic articles provided in this dataset. (Full Article can be found HERE)


Inaugural Center for Deep Learning Symposium Looks to Future of Artificial Intelligence

November 14, 2019

At its inaugural symposium on November 14, Northwestern Engineering’s Center for Deep Learning, a community of deep learning-focused data scientists who conduct research and collaborate with industry, convened members of academia along with current and potential corporate partners to discuss the state of the technology and the best path forward. Full Article can be found HERE


Center for Deep Learning Presents: Machine Learning on Google Cloud

Yufeng Guo
Developer, Advocate, Machine Learning, Google

Thursday, August 22, 2019 

This interactive session will focus on the tools for doing machine learning on Google Cloud Platform (GCP). From exploration, to training, to model serving, we will talk about the available tools and how to piece them together, depending on your needs. We will explore some common patterns, as well as take time to address any questions, so bring use cases with you to discuss.

BIO: Yufeng is a Developer Advocate at Google focusing on Cloud AI, where he is working to make machine learning more understandable and usable for all. He is the creator of the YouTube series AI Adventures, exploring the art, science, and tools of machine learning. He enjoys hearing about new and interesting applications of machine learning, so share your use cases with him on Twitter @YufengG.

Thoughts on autoML

Diego Klabjan Blog Post 7/29/19

Diego Klabjan

Director, Center for Deep Learning; Director, Master of Science in Analytics

There are many start-ups screaming Data Science as a Service (DSaaS). The shortage of data scientists is well documented, and thus DSaaS makes sense. However, data science is complex, and steps such as feature engineering and hyperparameter tuning are tough nuts to crack.

There are several feature selection algorithms with varying degrees of success and adoption, but feature selection is different from feature creation, which often requires domain expertise. Given a set of features, feature selection can be automated, but feature creation remains unsolvable without substantial human involvement and interaction.

On the surface, hyperparameter tuning and model construction are further down the automation and self-service path thanks to the abundant research work and commercial offerings of autoML. Scholars and industry researchers have developed many algorithms for autoML, with the two prevailing approaches being Bayesian optimization and reinforcement learning. Training of a new configuration can now be aborted early and perhaps restarted later if other attempts are not as promising as their first indicators suggested. Training of a new configuration has also been sped up by exploiting previously obtained weights. In short, given a particular and unique image problem, autoML can, without human assistance, configure a CNN-like network together with all other hyperparameters (optimization algorithm, mini-batch size, etc.). Indeed, autoML is able to construct novel architectures that compete with and even outperform hand-crafted architectures, and to create brand new activation functions.
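One concrete form of this early-abort idea is successive halving: train all candidate configurations briefly, discard the worse half, and give the survivors a larger training budget. The sketch below uses a toy evaluation function in place of actual training (the candidate learning rates and the objective are invented for illustration):

```python
def successive_halving(configs, evaluate, budget=1, eta=2):
    """Repeatedly evaluate survivors with a growing budget and
    keep only the top 1/eta fraction, until one configuration remains."""
    survivors = list(configs)
    while len(survivors) > 1:
        # Score each surviving configuration at the current budget.
        scores = {c: evaluate(c, budget) for c in survivors}
        survivors.sort(key=lambda c: scores[c], reverse=True)
        # Abort the worse half early; only the rest train longer.
        survivors = survivors[: max(1, len(survivors) // eta)]
        budget *= eta
    return survivors[0]

# Toy objective: configurations are learning rates; quality peaks near 0.1.
best = successive_halving(
    [0.001, 0.01, 0.1, 1.0],
    evaluate=lambda lr, budget: -abs(lr - 0.1) * budget,
)
```

In real autoML systems the evaluation step is an actual (partial) training run, which is where the enormous compute cost discussed below comes from.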

Google’s autoML, which is part of Google Cloud Platform, has competed in Kaggle competitions. On older competitions, autoML would have always placed in the top ten. In a recent competition, autoML competed ‘live’ and finished second. This attests that automatic hyperparameter tuning can compete with the best humans.

There is one important aspect left out. autoML requires substantial computing resources. On deep learning problems it often requires weeks and even months of computing time on thousands of GPUs. There are not many companies that can afford such expenses for a single AI business problem. Even Fortune 500 companies that we collaborate with are reluctant to go down this path. If organizations with billions in quarterly revenue cannot make a business case for autoML, then it is obvious that scholars in academia cannot conduct research in this area. We can always work on toy problems, but that would take us only so far. The impact would be limited due to the unknown scalability of proposed solutions, and publishing work on limited computational experiments would be hindered. A PhD student of mine recently stated, “I do not want to work on an autoML project since we cannot compete with Google due to computational resources.” This says a lot.

The implication is that autoML is going to be further developed only by experts at tech giants who already have the computational resources in place. Most if not all of the research will be left out of academia. This does not imply that autoML is doomed, since in AI it is easy to argue that research in industry is ahead of academic contributions. However, it does imply that progress is going to be slower, since only one party is going to drive the agenda. On the positive side, Amazon, Google, and Microsoft have a keen interest in improving their autoML solutions as part of their cloud platforms. It can be an important differentiating factor for attracting customers.

Before autoML becomes more widely used in industry, the computational requirements must be lowered, and this is possible only with further research. I guess we are at the mercy of FAANG (like we are for many other aspects outside autoML) to make autoML more affordable.


Autonomous Cars: Sporadic Need of Drivers (without a Driver’s License)

Diego Klabjan Blog Post 6/25/19

Diego Klabjan

Director, Center for Deep Learning; Director, Master of Science in Analytics

Based on SAE International's standard J3016, there are six levels of car automation, with level 0 being the old-fashioned car with no automation and the highest, level 5, offering full automation in any geographical region and (weather) condition, with a human only entering the destination. As a reference point, Tesla cars are at level 3 (short sellers claim it is level 2, while Elon believes it is level 5 or soon-to-be-level-5).

Waymo is probably the farthest ahead with level 4 cars – full automation on select types of roads, geographical regions (“trained” in the Bay Area or Phoenix, but not in both), and weather conditions (let the car not drive in Chicago in snow or experience potholes when snow is not on the ground). Atmospheric rain storms in San Francisco have probably also wreaked havoc on Waymo's cars.

In “Self-driving cars have a problem: safer human-driven ones,” The Wall Street Journal, June 15, 2019, Waymo's CEO John Krafcik stated that level 5 was decades away. A human will be needed to occasionally intervene for the foreseeable future (a flooded street in San Francisco or a large pothole in Chicago).

Let us now switch to the other side of the equation: humans, aka drivers. Apparently, they will be needed for decades. The existence of humans in the future is not in question, except for believers in superintelligence and singularity, but the survival of drivers is less clear.

I have three young-adult children, each with a driver's license; however, they are likely the last generation with driving knowledge. Ten years from now I actually doubt they will still know how to drive. While they currently have their own car at home, owned by the parents, they frequently use car-sharing. I doubt they will ever own a car since they will not need one. The number of miles they drive per year is steadily going down, and I am confident that five years from now it will drop to zero. As with any other skill or knowledge, if you do not practice it, you forget how to do it.

Thirty years from now I predict that the only people with driving knowledge will be those who are currently between 30 and 45 years old (I am assuming that all of us older than 45 will not be delighted to drive at the age of 75 or will RIP). Those who are now 30 years old or younger will forget how to drive in a decade or so. Those currently below the driving age and yet-to-be-born will probably never even learn how to drive. 

People outside this age range of 30 to 45 will not be able to own a car unless we get level 5 cars, which seems unlikely. No matter how much they long to have a car, occasionally they will have to take control of it, but they will not know how to operate it. As a result, there will be no car ownership for those outside ages 30 to 45.

The logic does not quite add up, since car-sharing drivers will always be needed to transport my children and, occasionally, me. In short, thirty years from now, the only drivers will be those who are currently between 30 and 45 years old or are, and will be, car-sharing drivers. The latter will be in a similar situation as airplane pilots are now. Car-sharing drivers will mostly be trained in simulators, just enough to sporadically take control of a car. Since we consider today's pilots to know how to fly, we should likewise call future car-sharing operators drivers.

There is another problem with not having level 5 cars soon. I personally find Tesla's autopilot useless (I do not own a Tesla, and thus this claim is based on reported facts), as well as level 4 cars. The main purpose of an autonomous car is to increase my productivity. The car should drive while I am working on more important things. If I have to pay attention to the road and traffic even without actively driving, it defeats the purpose; it is still a waste of time. The only useful autonomous cars are level 5. There is clearly a safety benefit to levels 1-4, but at least in my case, this argument goes only so far.

In summary, the future of car automation levels 0-4 is bleak. They do increase safety, but they do not increase productivity if a human driver needs to be in the car paying attention to weather and potholes. Furthermore, the lack of future drivers will make them even more problematic. In short, a human in the loop is not a compelling proposition.

One solution is to have level 4 cars being remotely assisted or teleoperated. This practice is already being advocated by some start-ups (Designed Driver) and, in essence, is a human on the loop. In such a scenario, I will be able to do productive work while being driven by an autonomous car with teleoperated assistance. This business model also aligns nicely with the aforementioned lack of drivers since there will only be a need for ‘drivers’ capable of driving in a simulator. Is this a next generation job that will not be wiped out by AI? You bet it is, and we will probably need many. 

There is a challenge on the technical side. A car would have to identify an unusual situation or low-confidence action in order to invoke a teleoperator. Whether this can be done with sufficient reliability remains to be seen. Current deep learning models are capable of concluding "this is an animal that I have not yet seen" without specifying what kind of animal it is. There is hope for deep learning solutions that reliably identify an unseen situation and pass control to a teleoperator.

Northwestern Engineering Launches New Center for Deep Learning

NU News - 9/25/18

As artificial intelligence grows in prominence, Northwestern Engineering is launching the Center for Deep Learning, which will build a community of deep learning-focused data scientists to service the research and industry needs of the Midwest.

Led by faculty in Computer Science and Industrial Engineering and Management Sciences, the interdisciplinary center will produce academic research and technological solutions in collaboration with corporate partners, ranging from Fortune 500 corporations to startups and from financial technology to pharmaceuticals.

Representatives of approximately 30 companies attended the September 24 kickoff meeting at the newly renovated Seeley G. Mudd Building.

“It’s a wide range of people from diverse industries,” said Jim Bray, director of Northwestern’s Office of Corporate Engagement. “There’s broad applicability of deep learning to different industries.”

NVIDIA, the Santa Clara, California-based company which created the graphics processing unit (GPU), is the inaugural partner with the center. The company has donated high-performance hardware that the center is already utilizing for deep learning research.

Diego Klabjan, professor of industrial engineering and management sciences and director of the Master of Science in Analytics program, co-directs the initiative with Doug Downey, associate professor of computer science, and Mark Werwath, director of the Master of Engineering Management program and co-director of the Farley Center for Entrepreneurship and Innovation.

“The center is an opportunity for folks in the industry to apply this new technology and have it actually provide business impact,” Downey said. “It also affords us as academics an opportunity to gain more insight into the kinds of problems that arise when you try to take this technology out of the center and have impact as a real-world product.”

Within the first year, Downey said, the center hopes to create technology that results in both academic papers and business value for multiple companies.

“They know that the expertise is coming from places like Northwestern. They come here to hire some of the best data scientists in the world,” Werwath said. “It’s all very synergistic in the sense that it’s all a win-win-win.”

Corporate partners will benefit from the talent moving through the center, able to recruit Northwestern students well versed in deep learning.

“We see a lot of interest from companies in this space, a lot of interest in students coming out in this field, and it’s a hot job market right now,” said Tim Angell, senior associate director at Northwestern Corporate Engagement. “There’s recognition that Northwestern has strength in this area.”

Diego Klabjan Blog Post


Imagine an AI OS allowing the power of AI to be harnessed seamlessly by system builders, without the need for extensive model building or framework-specific engineering.

Imagine an open and accessible forum bringing together industry players, startups and academics collaborating to create real-world solutions.

Imagine democratizing cutting edge AI to transform foundational research, technology innovation, and education by building an open source AI operating system and ecosystem.

At Northwestern University, your imagination has become reality. Our Deep Learning Lab, created this year, recognizes AI’s rapid rise and importance in industry and society. New applications like self-driving cars, personalized medicine, personalized advertising and marketing, virtual assistants, and more are expected to revolutionize industries in coming years.

Bringing about this future will require new advances in several areas, from fundamental advances in learning and inference techniques, to new paradigms for programming AIs that make the technology widely accessible, to infrastructure challenges like reliability, efficiency, and security. Deep Learning Lab will be a global leader in these areas.

The lab envisions collaborative research projects in:

• Bioinformatics

• Image and video AI

• Personalization (marketing, product design, etc.)

• Optimization

• Natural language processing (NLP)

• Internet of Things (IoT)

Sponsors of Deep Learning Lab, like NVIDIA, are excited about helping define the future of deep learning and AI technologies. Sponsors also gain access to top Northwestern researchers, student interns, graduates, and more.