1 Year of Teaching Myself Computer Science

War Stories: What Are We Doing Here?

In the winter of 2023, I came to a humbling realization.

While I had a firm foundation in the mathematics behind data science, I didn't understand any of the software/hardware that enabled me to conduct data science. This made me think...

What the f*** is data science?

The “field of data science” itself is still very young and is still figuring out its identity. It lies in this weird limbo between computer science, mathematics, statistics, and some domain of application. Honestly, this awkwardness made me shy away from studying “Data Science,” and I focused my energy on theory, spending my time at university studying applied mathematics and statistics, which I do not regret. However, as I left university and began work, I soon realized that I didn’t understand how to effectively use and optimize software/hardware for data mining and analysis. In my mind, all I needed to know was the theory behind these models; understanding the software/hardware underneath was someone else’s problem, and I adopted an “if it ain’t broke, don’t fix it” mentality. Ultimately, I realized that understanding it is my problem. Here are a few stories:

These are just a few of my experiences. The point is that I could not avoid the gigantic Snorlax blocking the bridge; I needed to find that Poké Flute. Now it’s been one year, and while I've learned a lot, I'm still nowhere near where I want to be. In this post, I hope to recount my experiences from the past year, outline the resources that I used, and share what I plan on doing this year!

One Year of Fun

To start off, I needed a frame of reference. The most helpful resource that I found was Oz Nova’s teachyourselfcs. While I did not use all the resources that the website provided, I used his “areas of computer science” as a guide to what I should know. Additionally, I had already developed a solid foundation in discrete mathematics and machine learning/deep learning through school, so I spent much of my time on the other fields of computer science.

An additional logistical note: I have generally excluded links for courses since they may be deprecated. However, the general strategy that I used to find lectures is as follows. If the course has public lecture recordings, there is no additional work to be done! If not, you can usually find a “COVID” version of the course, which will have lecture recordings. The same applies to projects/labs and homework, which are more likely to be publicly available. I want to emphasize that most of my learning was done through projects/labs, so make sure you do them. I would also recommend choosing the method that works best for you! I learn best through reading and doing, so the textbook and project pair was ideal for me. As for textbooks, I would recommend using a fork of [0x6c,0x69,0x62,0x67,0x65,0x6e,0x0a] (where the string is UTF-8 encoded).

With that sorted out, here we go! I started with introductory computer architecture/systems. During this time, I primarily used two class resources…

It was around this time that I switched from VSCode to Vim. In the end, a text editor is a text editor, but I have really been enjoying using Vim for all things programming. In the summer, I decided to delve deeper into operating systems and distributed systems. To this end, I used the following references…

I read one textbook for operating systems, which was Operating Systems: Principles and Practice, as well as a bunch of papers on distributed systems from A Distributed Systems Reading List. Then, I finished up 2024 by focusing on computer networking via UC Berkeley’s CS 168 – Introduction to the Internet.

What I’m Doing Now

As for learning, I have been focusing my energy on going through database systems. Specifically, Professor Andy Pavlo from CMU has a bunch of great, publicly available resources for database systems.

I have also been working through compilers (here is my current implementation of the Lox programming language in Zig).

Future Plans

Again, this is nowhere near where I want to be, and I consider this past year a first step in a lifelong journey of continuous learning. While I am focusing on database systems and compilers right now, I know that I want to explore additional topics this year: reinforcement learning, data compression/coding theory, advanced computer architecture, “graduate level” statistical inference, and machine learning. While this whole endeavor was sparked by my frustration at not understanding the software/hardware that enables productive data science, I have grown deeply interested in the world of computer systems and theoretical computer science, and I hope you stick with me for the ride!