ML in a nutshell
Optimization and machine learning are intimately connected. At a very coarse level, ML works as follows.
First, you come up somehow with a very complicated model $M$, which computes an output $y = M(x, \theta)$ as a function of an input $x$ and of a vector of parameters $\theta$. In general, $x$, $y$, and $\theta$ are vectors, as the model has multiple inputs, multiple outputs, and several parameters. The model needs to be complicated, because only complicated models can represent complicated phenomena; for instance, $M$ can be a multi-layer neural net with parameters $\theta \in \mathbb{R}^k$, where $k$ is the number of parameters of the model.
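To make this concrete, here is a hypothetical example of such a model (the shapes and values are our own, purely for illustration): a single linear layer with two inputs and one output, whose parameter vector packs two weights and a bias.

```python
def M(x, theta):
    # A tiny illustrative model: one linear layer with 2 inputs and
    # 1 output.  theta packs two weights and a bias (k = 3).
    w0, w1, b = theta
    return w0 * x[0] + w1 * x[1] + b

x = [1.0, 2.0]               # input vector
theta = [0.5, -0.3, 0.1]     # parameter vector, k = 3
y = M(x, theta)              # output: 0.5*1.0 - 0.3*2.0 + 0.1 = 0.0
```

A real model would stack many such layers with nonlinearities in between, but the signature is the same: input and parameters in, output out.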
Second, you come up with a notion of loss, that is, a measure of how badly the model is doing. For instance, if you have a list of inputs $x_1, \ldots, x_n$, and a set of desired outputs $y_1, \ldots, y_n$, you can use as loss:

$$L(\theta) = \sum_{i=1}^{n} \big( y_i - M(x_i, \theta) \big)^2 \; .$$
Here, we wrote $L(\theta)$ because, once the inputs and the desired outputs are chosen, the loss depends only on $\theta$.
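As a sketch, the squared-error loss above can be computed like this (the linear model and the data here are hypothetical, chosen only to illustrate the formula):

```python
def M(x, theta):
    # hypothetical one-input linear model: y = theta[0]*x + theta[1]
    return theta[0] * x + theta[1]

def L(theta, xs, ys):
    # sum of squared errors; once xs and ys are fixed,
    # this is a function of theta alone
    return sum((y - M(x, theta)) ** 2 for x, y in zip(xs, ys))

xs = [0.0, 1.0, 2.0]
ys = [1.0, 3.0, 5.0]           # generated by y = 2x + 1
print(L([2.0, 1.0], xs, ys))   # perfect parameters: loss 0.0
```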
Once the loss is chosen, you decrease it by computing its gradient with respect to $\theta$. Remembering that $\theta = [\theta_1, \ldots, \theta_k]$,

$$\nabla_\theta L = \left[ \frac{\partial L}{\partial \theta_1}, \ldots, \frac{\partial L}{\partial \theta_k} \right] \; .$$
The gradient is a vector that indicates how to tweak $\theta$ to decrease the loss. You then choose a small step size $\delta$, and you update $\theta$ via $\theta := \theta - \delta \nabla_\theta L$. This makes the loss a little bit smaller, and the model a little bit better. If you repeat this step many times, the model will hopefully get (a good bit) better.
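The update loop can be sketched as follows. Since we have not yet built autogradient, this sketch approximates the gradient numerically with central differences; the toy loss and the step size are our own choices for illustration.

```python
def grad(L, theta, eps=1e-6):
    # numerical gradient of L at theta, one coordinate at a time
    # (autogradient will replace this later)
    g = []
    for i in range(len(theta)):
        tp = list(theta); tp[i] += eps
        tm = list(theta); tm[i] -= eps
        g.append((L(tp) - L(tm)) / (2 * eps))
    return g

# toy loss with minimum at theta = [3, 0]
L = lambda th: (th[0] - 3.0) ** 2 + th[1] ** 2

theta = [0.0, 1.0]
delta = 0.1                    # small step size
for _ in range(200):
    g = grad(L, theta)
    theta = [t - delta * gi for t, gi in zip(theta, g)]
print(theta)                   # close to [3.0, 0.0]
```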
It is possible to use these advanced ML libraries without ever knowing what is under the hood, and how autogradient works. Here, we will instead dive in, and implement autogradient.
Building a model $M(x, \theta)$ corresponds to building an expression with inputs $x$ and $\theta$. We will provide a representation for expressions that enables both the calculation of the expression value, and the differentiation with respect to any of the inputs. This will enable us to implement autogradient. On this basis, we will be able to implement a simple ML framework.
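As a rough sketch of what such a representation might look like (the class names here are illustrative, not necessarily those of the framework you will build), an expression is a tree of nodes, each of which knows how to compute its own value from its children:

```python
class V:
    # a variable leaf; assign .value before evaluating
    def __init__(self, value=None):
        self.value = value
    def compute(self):
        return self.value

class Plus:
    def __init__(self, left, right):
        self.left, self.right = left, right
    def compute(self):
        return self.left.compute() + self.right.compute()

class Times:
    def __init__(self, left, right):
        self.left, self.right = left, right
    def compute(self):
        return self.left.compute() * self.right.compute()

# Build the expression x*theta + theta once; evaluate for any inputs.
x, theta = V(), V()
e = Plus(Times(x, theta), theta)
x.value, theta.value = 3.0, 2.0
print(e.compute())             # 3*2 + 2 = 8.0
```

The key property is that the expression is a data structure, not just a number: the same tree can be re-evaluated with new inputs, and, as we will see, walked again to compute derivatives.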
We say "we", but we mean you. You will implement it; we will just provide guidance.
Question 1 With these clarifications, we ask you to implement the compute_gradient method, which again must:
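As a hedged illustration of the idea behind such a method (on a toy expression class of our own, not the one in this assignment), the gradient can be propagated from the output down to the leaves with the chain rule: each node records the partial derivative of its value with respect to each child, and `compute_gradient` accumulates the product of partials along every path.

```python
class Expr:
    # Toy expression node (illustrative; your class will differ).
    # children holds (child, partial) pairs: the partial derivative
    # of this node's value with respect to that child's value.
    def __init__(self, value, children=()):
        self.value = value
        self.children = children
        self.gradient = 0.0

    def __add__(self, other):
        return Expr(self.value + other.value,
                    [(self, 1.0), (other, 1.0)])

    def __mul__(self, other):
        return Expr(self.value * other.value,
                    [(self, other.value), (other, self.value)])

    def compute_gradient(self, de_dout=1.0):
        # Chain rule: accumulate the output's sensitivity to this node,
        # then pass it to each child scaled by the local partial.
        self.gradient += de_dout
        for child, partial in self.children:
            child.compute_gradient(de_dout * partial)

x, theta = Expr(3.0), Expr(2.0)
e = x * theta + theta              # value: 3*2 + 2 = 8
e.compute_gradient()
print(x.gradient, theta.gradient)  # de/dx = 2.0, de/dtheta = 3 + 1 = 4.0
```

Note how `theta.gradient` accumulates contributions from both places where `theta` appears in the expression; the `+=` in `compute_gradient` is what makes this work.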
Question 2: Rounding up the implementation
Question 3: Implementation of the fit function