


Hyperparameter Optimization

$$ M(x_1, x_2, x_3, \ldots, x_n) = \sum_{q=0}^{2n} \Phi_q\left(\sum_{p=1}^{n} \phi_{q,p}(x_{p})\right). $$

At the initial stage, we restrict the tunable options of the model to the following list:
  • Number of addends. A single addend does not look as scary as the entire network:
    $$\Phi\left(\sum_{p=1}^{n} \phi_{p}(x_{p})\right). $$
    We may add and remove addends during training.

  • Number of blocks. We express every function as a sequence of linear blocks, and all sequences have the same length. The optimal number of blocks must be determined during training.

  • Regularization parameter.
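
The three options above can be gathered into one small configuration object. The following is only a minimal sketch under stated assumptions, not the actual implementation: the names `HyperParams`, `build_model`, `predict`, and `l2_penalty` are hypothetical, each "linear block" is assumed to be a scalar affine map followed by `tanh` (except the last block in a sequence), and the regularizer is assumed to be an L2 penalty on the weights.

```python
import math
import random
from dataclasses import dataclass


@dataclass
class HyperParams:
    num_addends: int   # number of terms in the outer sum (2n + 1 in the full formula)
    num_blocks: int    # shared length of every block sequence
    reg_lambda: float  # regularization parameter


def make_function(num_blocks, rng):
    """One univariate function: a sequence of (weight, bias) scalar blocks."""
    return [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(num_blocks)]


def apply_function(blocks, t):
    """Apply the blocks in order; tanh after every block except the last (assumed)."""
    for i, (w, b) in enumerate(blocks):
        t = w * t + b
        if i < len(blocks) - 1:
            t = math.tanh(t)
    return t


def make_addend(n_inputs, num_blocks, rng):
    """One addend Phi(sum_p phi_p(x_p)): an outer function and n inner functions."""
    return {
        "outer": make_function(num_blocks, rng),
        "inner": [make_function(num_blocks, rng) for _ in range(n_inputs)],
    }


def apply_addend(addend, x):
    inner_sum = sum(apply_function(phi, xp) for phi, xp in zip(addend["inner"], x))
    return apply_function(addend["outer"], inner_sum)


def build_model(n_inputs, hp, seed=0):
    """The model is a plain list of addends, so addends can be appended or removed."""
    rng = random.Random(seed)
    return [make_addend(n_inputs, hp.num_blocks, rng) for _ in range(hp.num_addends)]


def predict(model, x):
    """M(x) is the sum of all addends."""
    return sum(apply_addend(addend, x) for addend in model)


def l2_penalty(model, hp):
    """Assumed regularizer: reg_lambda times the sum of squared weights."""
    total = 0.0
    for addend in model:
        for blocks in [addend["outer"], *addend["inner"]]:
            total += sum(w * w for w, _ in blocks)
    return hp.reg_lambda * total


hp = HyperParams(num_addends=3, num_blocks=2, reg_lambda=0.01)
model = build_model(n_inputs=2, hp=hp)
print(predict(model, [0.5, -0.2]), l2_penalty(model, hp))
```

Because the model is stored as a list, adding or removing an addend during training amounts to `model.append(...)` or `model.pop()`, while changing `num_blocks` requires rebuilding every function with the new length.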