Initial value of He
Let's find the initial value of He. The initial value of He is a random number that follows a normal distribution with a mean of 0 and a standard deviation of "sqrt(2 / number of inputs)".
The number of inputs is the value of m when a layer converts m inputs to n outputs.
The initial value of He is generally used as the initial value of the weights when the ReLU function is the activation function. By choosing a good initial value, the values to which the activation function is applied after the conversion from m to n in each layer vary moderately, neither collapsing toward 0 nor growing very large.
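Written as a formula, the definition above reads as follows. The symbol w for a single weight is notation introduced here for illustration only, and m = 728 is just an example value for the number of inputs.

```latex
w \sim \mathcal{N}\left(0,\ \sigma^{2}\right), \qquad \sigma = \sqrt{\frac{2}{m}}

% Example: m = 728 gives \sigma = \sqrt{2 / 728} \approx 0.0524
```

The more inputs a layer has, the smaller each weight starts out, so that the sum over all m inputs stays at a similar scale.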
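# Get the initial value of He
# (randn is the normal distribution function defined in the full example below)
sub he_init_value {
  my ($inputs_length) = @_;

  # The standard deviation is sqrt(2 / number of inputs)
  return randn(0, sqrt(2 / $inputs_length));
}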
Initialize an array of weights using the initial value of He
use strict;
use warnings;

# Function to find random numbers that follow a normal distribution (Box-Muller transform)
# $m is the mean, $sigma is the standard deviation
sub randn {
  my ($m, $sigma) = @_;

  my ($r1, $r2) = (rand(), rand());

  # rand() can return 0, and log(0) is not defined, so draw again
  while ($r1 == 0) { $r1 = rand(); }

  return $sigma * sqrt(-2 * log($r1)) * sin(2 * 3.14159265359 * $r2) + $m;
}

# Create the initial value of He
sub create_he_init_value {
  my ($inputs_length) = @_;

  return randn(0, sqrt(2 / $inputs_length));
}

# Create an array filled with initial values of He
sub array_create_he_init_value {
  my ($array_length, $inputs_length) = @_;

  my $nums_out = [];
  for (my $i = 0; $i < $array_length; $i++) {
    $nums_out->[$i] = create_he_init_value($inputs_length);
  }

  return $nums_out;
}

# If the number of inputs is 728 and the number of outputs is 30,
# the length of the array that holds the matrix is "728 * 30".
my $inputs_length = 728;
my $outputs_length = 30;
my $weights_mat = {
  rows_length => $outputs_length,
  columns_length => $inputs_length,
};
my $weights_values_length = $inputs_length * $outputs_length;
$weights_mat->{values} = array_create_he_init_value($weights_values_length, $inputs_length);

use Data::Dumper;
print Dumper $weights_mat;
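As a rough sanity check of the program above, the sample mean and the sample standard deviation of many generated values should come out close to 0 and "sqrt(2 / number of inputs)". The following is only a minimal sketch: sample_mean_and_std is a hypothetical helper introduced for this check, the sample count of 100000 is an arbitrary choice, and randn is repeated so that the script runs on its own.

```perl
use strict;
use warnings;

# Same Box-Muller based function as in the program above,
# repeated here so that this sketch runs on its own
sub randn {
  my ($m, $sigma) = @_;
  my ($r1, $r2) = (rand(), rand());
  while ($r1 == 0) { $r1 = rand(); }
  return $sigma * sqrt(-2 * log($r1)) * sin(2 * 3.14159265359 * $r2) + $m;
}

# Hypothetical helper (not part of the original program):
# returns the sample mean and standard deviation of an array reference
sub sample_mean_and_std {
  my ($nums) = @_;

  my $length = @$nums;

  my $sum = 0;
  $sum += $_ for @$nums;
  my $mean = $sum / $length;

  my $square_sum = 0;
  $square_sum += ($_ - $mean) ** 2 for @$nums;

  return ($mean, sqrt($square_sum / $length));
}

my $inputs_length = 728;
my $expected_std = sqrt(2 / $inputs_length);

# Generate many initial values of He (100000 is an arbitrary sample count)
my $samples = [];
for (my $i = 0; $i < 100000; $i++) {
  push @$samples, randn(0, $expected_std);
}

my ($mean, $std) = sample_mean_and_std($samples);

printf("expected std: %.4f\n", $expected_std);
printf("sample mean: %.4f, sample std: %.4f\n", $mean, $std);
```

Because the values are random, the printed numbers differ slightly on every run, but the sample standard deviation should stay near the expected one.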
Initial values other than the initial value of He
When using the sigmoid function as the activation function, it is generally considered better to use the Xavier initial value.
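A Xavier initial value can be sketched in the same way as the initial value of He. This sketch assumes the commonly used simplified form in which the standard deviation is "sqrt(1 / number of inputs)"; the original formulation by Glorot and Bengio uses "sqrt(2 / (number of inputs + number of outputs))" instead. The name create_xavier_init_value is hypothetical and simply mirrors create_he_init_value.

```perl
use strict;
use warnings;

# Function to find random numbers that follow a normal distribution (Box-Muller transform)
# $m is the mean, $sigma is the standard deviation
sub randn {
  my ($m, $sigma) = @_;
  my ($r1, $r2) = (rand(), rand());
  while ($r1 == 0) { $r1 = rand(); }
  return $sigma * sqrt(-2 * log($r1)) * sin(2 * 3.14159265359 * $r2) + $m;
}

# Hypothetical function: create a Xavier initial value
# Assumption: simplified form with standard deviation sqrt(1 / number of inputs)
sub create_xavier_init_value {
  my ($inputs_length) = @_;

  return randn(0, sqrt(1 / $inputs_length));
}

# Example: one weight for a layer with 728 inputs
my $xavier_init_value = create_xavier_init_value(728);
print "$xavier_init_value\n";
```

The only difference from the initial value of He in this form is the numerator inside the square root, 1 instead of 2, so the weights start out with a somewhat smaller spread.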