How to Read Deep Learning Paper as a Software Engineer - https://www.youtube.com/watch?v=nL7lAo95D-o Deep learning papers can look daunting to read, especially if you don't have a strong theoretical background in machine or deep learning. Some papers can be so dense with jargon, formulas, and magical-looking results that you might feel you're missing ten years' worth of background just to parse the title. In this video, I'll show you how I read most deep learning papers easily, even in a subfield that's new to me. To start out, there are two ways to read a paper depending on your goals. The first way is to roughly understand what was done, but not in depth, usually to use the results of the research or to understand another paper that builds on them. This way of reading relies heavily on trusting that the paper is right. It's not too difficult to do: if you read the abstract and conclusion of a paper, you can get the gist of it. It's a bit like knowing roughly what React does, but being more interested in using it than in understanding how it's implemented, like the shadow DOM. The second way to read a paper is where people get lost. The goal is to understand a research project so deeply that you could reproduce its results. In that setup there is little trust, and you actually need to understand the paper's methods. This way of reading papers is what I'll be covering today. Before we start, let me preface it with a warning about deep learning research in general. It's a highly empirical field, meaning a lot of things get discovered just because they were tried. It also means that whenever reading a deep learning paper, you need to understand the context in which it was written. Much of what's in it can be slightly wrong, or at least the reasoning for why it worked can be wrong. Be acutely aware of that. You'll see it's easier to spot this in very old deep learning papers with the knowledge we have today.
Okay, so the full method I use looks somewhat like this. There are seven steps, involving multiple read-throughs, which is very normal for any research paper. Papers in general are not linear; they're a structure with a lot of interdependencies. The first step to reading a paper is to have enough contextual information about it to orient yourself properly. Going into a new field is highly disorienting, and if you don't have context, you will get lost. Researchers with lots of experience in the field already have the context, which is why they can grasp a paper easily. My favorite technique for quickly absorbing all that context before reading a paper is the following. I find two or three blog posts summarizing the main findings of the paper. They don't have to be well written or even right; they just help me orient myself with the nomenclature and so on. After that I find five to ten diverse videos on the paper's main findings. Again, you're going for a diverse set of opinions about what's important in the paper. I'd suggest varying the lengths here; anywhere between six and 60 minutes is good. The first material you ingest on the paper's subject will feel disorienting. However, after a few, you'll start to notice patterns that tell you where the real focus of the paper lies and where its strengths are. This also means that as you read more papers in the field, it will get easier and easier to absorb the context quickly. Now that you have the context in your head, you can open up the paper for a first casual read. I start with a linear read from end to end, where I note down on a separate sheet every element I don't understand. Everything I do understand is fine; I just keep reading over it. I don't deeply research the things I don't understand during this first read, I just make note of them. There are usually five kinds of things I take note of that make a paper harder to understand.
These are unknowns I need to categorize and elucidate to make sense of the whole research. The first category of unknowns is critical elements of the paper that aren't explained by the authors and are implied to be understood. The second type is critical topics that are explained in the paper but need some work to grasp fully. The third type of unknown is elements that the authors didn't fully understand themselves, so they kind of waved them away. This type of unknown is usually very hard to get an answer for and is a core target for criticism in the years following publication. The fourth type is, surprisingly, stuff that the authors got wrong. It does happen. It doesn't mean the paper is garbage, though; sometimes things work out anyway. The fifth type is elements of the paper that the reviewers wanted added before publication but that provide no value whatsoever to the paper. These usually stick out. Once you're done noting down all of the unknowns, you're ready to start filling some gaps. Try to categorize the stuff you didn't understand into these five categories. Only tackle the external unknowns for now: research whatever you don't understand and make sure you come back to the paper with a better understanding. Don't get too lost here. If it's a specific method the paper makes use of, leverage the first way of reading a paper, which is to superficially understand the intuition behind it, or better yet, just read a blog post that contains the technical information you need. Now comes the second read-through. The whole purpose here is to reduce the number of internal unknowns to a minimum. In that read, I like to start with the abstract and the introduction to understand the setup of the paper. Try to understand the paper's motivation, because it will help with understanding why the methods are the way they are. Then you jump to the discussion and conclusion sections at the end of the paper to see the end point of the experiment.
It's important that you understand deeply what the authors got out of the research. Generally, the conclusion will connect back to the motivation from the start, since the paper was not written introduction first with the conclusion arriving six months later; it was all written together in a recursive fashion. Finally, I take each of the figures and try to understand what everything in them means: the axes, what the graphs are highlighting, the units, all of that. At this point I feel a very deep connection with the whole research. I use the introduction and the conclusion as anchors and try to fit the results shown in the figures into a sort of logical timeline. If something feels disconnected, it's a hint that there is maybe something else I didn't understand apart from the specific method or result the authors got. Now there are two ways to finish up the second read-through: reading through the technical details slowly and making sure you understand each step conceptually, including the formulas; or opening up the paper's code base first, if it's provided, and mapping out each of the sections. I like the second way because, as a software engineer, it gives you confidence that this stuff is not magic. At the end of the day, it's code that was written and run to generate whatever results. Do a good pass on the code base until you feel reasonably comfortable about where things are, and cross-reference some of the names in the code with those in the paper. Disclaimer, though: research code is usually very messy and doesn't have a standardized structure, so keep that in mind. Once done with the code, you're ready to check out the methods and results sections in depth. Go slowly and carefully and understand the methodology behind the paper step by step. There are a few repeating elements in a deep learning experiment that are almost always there.
Understanding that structure helps a lot in knowing how to navigate the method section. First, there is the data that is fed to the deep learning model. The data has a specific structure that dictates the architecture choices for the model. As you dive into a specific field, you will see the same kinds of datasets used over and over again. Then there is the architecture of the model per se: the layers used, the connections between layers, and so on. There might be a lot of different sub-elements in an architecture that you don't necessarily understand at first, and that's okay. After that comes the type of gradient descent optimizer and the training regimen used. There are a lot of empirical details usually hidden here, so it's fine if the why isn't always 100% clear for the training. Finally, there is the pipeline from the raw data to whatever result is generated. Sometimes papers will combine multiple architectures in specific ways, like with an ensemble method. In any given research project, there are usually only a few tweaks that make up the core contribution. As you read more papers, you will start to spot them easily. Break each of the steps from the method down and try to understand the logical flow. If something feels weird or surprising, check the code or other resources for reasons why the authors did it the way they did. Finally, go slowly through the results section and fill in the blanks in your timeline between the introduction and the conclusion with the numerical or qualitative results.
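To make those four recurring elements concrete, here is a minimal sketch in plain Python: a toy linear model trained with vanilla gradient descent. The dataset, model, and hyperparameters are all made up for illustration; no real paper's experiment is this small, but the data / architecture / optimizer / results-pipeline skeleton is the same one you'll find in most research code bases.

```python
# 1. Data: a tiny synthetic dataset with a known structure (y = 2x).
data = [(x, 2.0 * x) for x in [0.5, 1.0, 1.5, 2.0]]

# 2. Architecture: the simplest possible "model", a single weight w.
w = 0.0

def model(x, w):
    return w * x

# 3. Optimizer and training regimen: vanilla gradient descent on the
#    mean squared error, with a fixed learning rate and epoch count.
lr, epochs = 0.1, 50
for _ in range(epochs):
    grad = sum(2 * (model(x, w) - y) * x for x, y in data) / len(data)
    w -= lr * grad

# 4. Pipeline to results: evaluate the trained model and report a metric.
mse = sum((model(x, w) - y) ** 2 for x, y in data) / len(data)
print(f"learned w = {w:.3f}, final MSE = {mse:.6f}")
```

When you open a real research code base, try to locate each of these four pieces first, whatever files they're spread across; the rest of the code is usually plumbing around them.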
And once done, I usually like to do a final read from start to finish, looking at my notes to make sure I understand the complete flow. Again, if something feels weird during that read, it might be that I still don't understand something explained within the paper, or that the authors didn't explain something fully for a genuine lack of knowledge, or that the authors were flat-out wrong, or that a reviewer asked to add some superfluous information that feels disjointed anyway. In all cases, after this third read, you should understand the paper well enough to take on a reproducibility effort if need be. I hope you enjoyed the video. Don't forget to like if that was the case, and leave a comment if you have any questions. I'm here to help. Have a great week everyone, and see you in the next video.