Hyperparameter Optimization
$$ M(x_1, x_2, x_3, ... , x_n) = \sum_{q=0}^{2n} \Phi_q\left(\sum_{p=1}^{n} \phi_{q,p}(x_{p})\right). $$
At the initial stage, we restrict the tunable options of the model to the following list:
- Number of addends. A single addend does not look as scary as the entire network:
$$\Phi\left(\sum_{p=1}^{n} \phi_{p}(x_{p})\right). $$
We may add and remove addends during training.
- We express all functions as equal-length sequences of linear blocks. The optimal number of blocks must be determined during training.
- Regularization parameter.
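
The options listed above can be collected into a small configuration object. The sketch below is one possible reading, not a definitive implementation: it assumes that every univariate function is piecewise linear on a uniform grid (our interpretation of "sequential linear blocks"), and the names `HyperParams`, `identity_nodes`, `piecewise_linear`, `addend`, and `model` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class HyperParams:
    """Hypothetical container for the three options in the list above."""
    n_addends: int     # number of terms Phi_q(...) kept in the outer sum
    n_blocks: int      # number of linear blocks per univariate function
    reg_lambda: float  # regularization parameter (not used in the forward pass below)


def identity_nodes(lo: float, hi: float, n_blocks: int) -> List[float]:
    """Node values that make a piecewise-linear function the identity on [lo, hi]."""
    return [lo + (hi - lo) * k / n_blocks for k in range(n_blocks + 1)]


def piecewise_linear(values: Sequence[float], lo: float, hi: float, t: float) -> float:
    """Evaluate a function given by node values on a uniform grid over [lo, hi]."""
    n_blocks = len(values) - 1
    t = min(max(t, lo), hi)              # clamp arguments outside the grid
    pos = (t - lo) / (hi - lo) * n_blocks
    i = min(int(pos), n_blocks - 1)      # index of the active linear block
    frac = pos - i
    return values[i] * (1.0 - frac) + values[i + 1] * frac


Range = Tuple[float, float]


def addend(outer, inner, x, inner_range: Range, outer_range: Range) -> float:
    """One addend: Phi(sum_p phi_p(x_p))."""
    s = sum(piecewise_linear(phi, inner_range[0], inner_range[1], xp)
            for phi, xp in zip(inner, x))
    return piecewise_linear(outer, outer_range[0], outer_range[1], s)


def model(addends, x, inner_range: Range, outer_range: Range) -> float:
    """Full model: the sum of all addends currently in the list."""
    return sum(addend(outer, inner, x, inner_range, outer_range)
               for outer, inner in addends)


# Usage: two inputs, every function initialized to the identity (an arbitrary choice).
hp = HyperParams(n_addends=2, n_blocks=4, reg_lambda=1e-3)
inner_range, outer_range = (0.0, 1.0), (0.0, 2.0)
addends = [
    (identity_nodes(*outer_range, hp.n_blocks),
     [identity_nodes(*inner_range, hp.n_blocks) for _ in range(2)])
    for _ in range(hp.n_addends)
]
y_two = model(addends, [0.25, 0.5], inner_range, outer_range)
addends.pop()  # removing an addend is just a list operation
y_one = model(addends, [0.25, 0.5], inner_range, outer_range)
```

Keeping the addends in a plain list makes adding or removing a term during training a local change, and storing each function as `n_blocks + 1` node values means a different number of blocks only changes the length of those lists.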