Alumna Alice Zhao Writes New Edition of "SQL Pocket Guide"
Alice Zhao (MSiA '13) recently celebrated the publication of a new edition of O'Reilly's SQL Pocket Guide. We spoke with her about the writing process, what she hopes the book accomplishes, and how MSiA impacted the final product.
Mar 8, 2022
Congratulations on the publication of your book! Tell us a little bit about the writing process. What did you enjoy most about it?
Thank you! O’Reilly reached out to me two years ago to write the latest version of their SQL Pocket Guide. The last edition was published a decade ago, and they were looking for someone to modernize the book.
I started the process by reading the previous edition, which was one of O’Reilly’s most popular SQL books. It was incredibly thorough and well-written, but it read more like documentation. That made sense at the time, but these days, when people have a coding issue, they typically want to see examples of how to solve the problem. I ended up reorganizing the book, simplifying the language, including more examples and adding a few more practical chapters.
I would say the most enjoyable part of the writing process was creating content that I wished I had available when I was first learning SQL. One thing that always confused me was the difference between MySQL, PostgreSQL, SQLite, etc. I reviewed several other SQL books and websites and none of them had comprehensive, consistent side-by-side comparisons for each concept. So I created that! It felt like I was truly adding something original to the SQL world.
What’s cool to me now is that I often find myself referencing the book when I’m coding in SQL. I used to love creating one-page cheat sheets for exams in college, and I joke that I pretty much created the ultimate cheat sheet with this book! Even better, I get to share it with the world.
Why is SQL an important skill for data analysts to have in their back pocket?
One of my favorite quotes from the book is — if there was an award for best supporting programming language, SQL would take home the prize. While it’s not the first language that a data scientist or analyst might choose to use, it’s an essential one.
Often, aspiring data scientists and analysts focus on learning algorithms and coding in either Python or R. While those skills are absolutely useful, at some point in every data analysis project, you need to get data. Beginners may get a dataset handed to them or download a .csv file online, but in the real world, when you’re working at a company, that data almost always sits in a database, and the most common way to access it is by using SQL.
While I was teaching data science bootcamps, I would often remind students of the importance of SQL and encourage them to get as much practice as possible. I’ve had so many students come back to me after interviewing or being on the job for a few months to tell me that was some of the best advice they received.
How did what you learned in MSiA impact the content and structure of the book?
During the first quarter of the MSiA program, we took a class to learn the basics of SQL. At the time, I understood it as a stand-alone language — anytime you encounter a relational database, use SQL.
By the second quarter, we were learning other tools, and I was surprised that you could actually write SQL within other programming languages... and it showed up everywhere! It popped up when manipulating data in R, SAS, and in our big data class. With that in mind, I added an entire chapter of the book on the various places you can write SQL, including within R and Python code.
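As an illustration of writing SQL inside another language, here is a minimal sketch in Python using the standard library's sqlite3 module. It is not taken from the book; the `orders` table, its columns, the sample rows, and the `top_customers` helper are all hypothetical and exist only to show the pattern.

```python
import sqlite3


def top_customers(conn, min_total):
    """Return (customer, total) rows whose summed order amount is at least min_total."""
    # The SQL is an ordinary string; the "?" placeholder keeps the value out of the query text.
    query = """
        SELECT customer, SUM(amount) AS total
        FROM orders
        GROUP BY customer
        HAVING SUM(amount) >= ?
        ORDER BY total DESC
    """
    return conn.execute(query, (min_total,)).fetchall()


# Hypothetical in-memory database so the example runs without any setup.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("Ana", 20.0), ("Ben", 5.0), ("Ana", 15.0), ("Cy", 12.5)],
)

rows = top_customers(conn, 10)
print(rows)
```

The query does the grouping and filtering in the database, and Python only receives the finished rows, which is the usual division of labor when SQL is embedded in analysis code.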
Another memorable part of the MSiA program was my internship. That’s where I learned that theory and classroom work can only get you so far. The real-world experience of dealing with the particular quirks of a company’s database and struggling with queries that take forever to run is the other half of the learning experience. Throughout the book, I sprinkle in practical tips that I learned on the job to help others when they run into similar issues.
You’ve been active in other media too, bringing analytics work to other projects like this one where you analyze comedy from Ali Wong. Tell us a little bit about the importance of data scientists applying their skills and knowledge to popular culture projects.
I think it makes data science less intimidating. When I tell people I’m a data scientist, they often say that they have no idea what that is or they are really impressed with all the fancy algorithms I must know. I like to do data analysis projects on stand-up comedy, reality TV and baking to show people how fun and down-to-earth data analysis can be.
I started writing blog posts and making YouTube videos the year after I graduated from the MSiA program to keep my skills fresh. I was shocked to learn that so many people found them entertaining and relatable. The best thing to come out of all of this is the number of people who have messaged me to tell me how I’ve helped them understand a technical concept or how I’ve inspired them to become data scientists.
I think combining data science and pop culture helps reach demographics that might not typically go into the field. When I started the MSiA program 10 years ago, a lot of the “fun” data analysis was around sports and politics. I wasn’t into either of those, so I decided to analyze data about The Bachelor. Now I see data analysis courses focused specifically on Bachelor data, TikTok videos about Excel functions and much more. The field of data science thrives on diversity, and I love that pop culture is helping reach such a large variety of people.