# Debug

{% embed url="https://www.youtube.com/watch?v=d07le7otRUM" %}
Debug - Troubleshooting
{% endembed %}

## Summary

* The 5 most common bugs in deep learning models include:
  * Incorrect shapes for tensors.
  * Pre-processing inputs incorrectly.
  * Incorrect input to the loss function.
  * Forgetting to set the network to train mode correctly.
  * Numerical instability - inf/NaN.
* 3 pieces of general advice for implementing models:
  * Start with **a lightweight implementation**.
  * Use **off-the-shelf components** such as Keras if possible, since most Keras components work well out of the box.
  * Build **complicated data pipelines later**.
* The first step is to **get the model to run**:
  * For **shape mismatch and casting issues**, you should step through your model creation and inference step-by-step in a debugger, checking for correct shapes and data types of your tensors.
  * For **out-of-memory issues**, you can scale back your memory-intensive operations one-by-one.
  * For **other issues**, search for the error message; Stack Overflow has an answer most of the time.
* The second step is to have the model **overfit a single batch**:
  * **Error goes up:** Commonly this is due to a flipped sign somewhere in the loss function or gradient.
  * **Error explodes:** This is usually a numerical issue, but can also be caused by a high learning rate.
  * **Error oscillates:** You can lower the learning rate and inspect the data for shuffled labels or incorrect data augmentation.
  * **Error plateaus:** You can increase the learning rate and get rid of regularization. Then you can inspect the loss function and the data pipeline for correctness.
* The third step is to **compare the model to a known result**:
  * The most useful results come from **an official model implementation evaluated on a dataset similar to yours**.
  * If you can’t find an official implementation on a similar dataset, you can compare your approach to results from **an official model implementation evaluated on a benchmark dataset**.
  * If there is no official implementation of your approach, you can compare it to results from **an unofficial model implementation**.
  * Then, you can compare to results from **a paper with no code**, results from **the model on a benchmark dataset**, and results from **a similar model on a similar dataset**.
  * An under-rated source of results is **simple baselines**, which can help make sure that your model is learning anything at all.
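
The inf/NaN bug from the list above often comes from an overflow in an exponential. A minimal NumPy sketch (the logit values are illustrative) showing the naive softmax blowing up where the numerically stable version does not:

```python
import numpy as np

def softmax_naive(x):
    # exp overflows to inf for large inputs, and inf / inf gives NaN
    e = np.exp(x)
    return e / e.sum()

def softmax_stable(x):
    # subtracting the max before exponentiating avoids the overflow
    e = np.exp(x - x.max())
    return e / e.sum()

logits = np.array([10.0, 1000.0, -5.0])
naive = softmax_naive(logits)    # contains NaN
stable = softmax_stable(logits)  # finite and sums to 1
```

Checking intermediate tensors with something like `np.isfinite(...).all()` during debugging catches this class of bug at its source instead of many steps later.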
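
For the first step, the debugger walk-through amounts to asserting shapes and dtypes after every operation. A minimal NumPy sketch of a hypothetical two-layer forward pass (all shapes and names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

batch = rng.normal(size=(32, 784)).astype(np.float32)  # hypothetical input batch

w1 = rng.normal(size=(784, 128)).astype(np.float32)
hidden = batch @ w1
# catch a shape mismatch at the line that causes it, not deep in the stack
assert hidden.shape == (32, 128), hidden.shape

w2 = rng.normal(size=(128, 10)).astype(np.float32)
logits = hidden @ w2
assert logits.shape == (32, 10), logits.shape
# casting issues (e.g. a silent upcast to float64) show up here too
assert logits.dtype == np.float32, logits.dtype
```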
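
The second step can be sanity-checked even with a linear model: give it more parameters than examples and it should memorize one fixed batch, driving the training loss toward zero. A minimal NumPy sketch (sizes, learning rate, and step count are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

# One fixed batch with more parameters (8) than examples (4),
# so an exact fit exists even for random labels.
x = rng.normal(size=(4, 8))
y = rng.normal(size=(4, 1))

w = np.zeros((8, 1))
lr = 0.05

for step in range(5000):
    err = x @ w - y
    loss = (err ** 2).mean()
    grad = 2 * x.T @ err / len(x)  # gradient of mean squared error
    w -= lr * grad

# loss should now be near zero; if it instead goes up, explodes,
# oscillates, or plateaus, consult the symptoms listed above
```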
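
For the simple-baselines point, even a majority-class predictor gives a floor your model must beat. A minimal sketch (the toy labels are made up):

```python
import numpy as np
from collections import Counter

def majority_baseline_accuracy(train_labels, test_labels):
    """Accuracy of always predicting the most common training label."""
    majority = Counter(train_labels).most_common(1)[0][0]
    return float(np.mean(np.asarray(test_labels) == majority))

train_labels = [0, 0, 0, 1, 1]  # majority class is 0
test_labels = [0, 1, 0, 0, 1]
acc = majority_baseline_accuracy(train_labels, test_labels)  # 0.6
# a trained model that cannot beat this number is not learning anything
```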
