Finding the cross entropy error (loss function)

Let's write a function in Perl to find the cross entropy error. The cross entropy error is one of the loss functions used to measure the error between the output of the network and the expected output (the correct answer).
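Written as a formula (this is what the code below computes), with y_i the outputs and t_i the desired outputs:

E = -\sum_i \left( t_i \log y_i + (1 - t_i) \log(1 - y_i) \right)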

use strict;
use warnings;

# Cross entropy error
sub cross_entropy_cost {
  my ($outputs, $desired_outputs) = @_;
  
  if (@$outputs != @$desired_outputs) {
    die "Outputs length is different from Desired length";
  }
  
  my $cross_entropy_cost = 0;
  
  for (my $i = 0; $i < @$outputs; $i++) {
    $cross_entropy_cost += -$desired_outputs->[$i] * log($outputs->[$i]) - (1 - $desired_outputs->[$i]) * log(1 - $outputs->[$i]);
  }
  
  return $cross_entropy_cost;
}

my $outputs = [0.7, 0.2, 0.1];
my $desired_outputs = [1, 0, 0];
my $cross_entropy_cost = cross_entropy_cost($outputs, $desired_outputs);

print "$cross_entropy_cost\n";

In deep learning, the weight and bias parameters are adjusted so that the error computed by the loss function becomes small.
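A common way to make this adjustment is gradient descent, where each parameter is nudged in the direction that decreases the loss (η is a small learning rate):

w \leftarrow w - \eta \frac{\partial E}{\partial w}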

For pattern recognition (classification) problems, the cross entropy error seems preferable to the sum-of-squares error as a loss function, because the partial derivative of the sum-of-squares error takes a more complicated form and its calculation is more involved.
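Concretely, for a sigmoid output layer (a standard result, stated here without derivation), the sum-of-squares error leads to a delta of the form (a_i - t_i) a_i (1 - a_i), where the extra a_i (1 - a_i) factor can become very small and slow down learning, while the cross entropy error leads simply to

\frac{\partial E}{\partial z_i} = a_i - t_i

where z_i is the input to output neuron i, a_i is its activated output, and t_i is the desired output. This is exactly the formula implemented in the next section.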

Partial derivative of cross entropy error

Let's write the partial derivative of the cross entropy error in Perl. The partial derivative of the loss function is needed when implementing the backpropagation method.

Note that the partial derivative of the loss function returns an array (one value per output element), unlike the loss function itself, which returns a single value.

use strict;
use warnings;

sub cross_entropy_cost_delta {
  # $outputs is accepted as the first argument but is not used in this calculation
  my ($outputs, $activate_outputs, $desired_outputs) = @_;

  if (@$activate_outputs != @$desired_outputs) {
    die "Outputs length is different from Desired length";
  }
  
  my $cross_entropy_cost_delta = [];
  for (my $i = 0; $i < @$activate_outputs; $i++) {
    $cross_entropy_cost_delta->[$i] = $activate_outputs->[$i] - $desired_outputs->[$i];
  }
  
  return $cross_entropy_cost_delta;
}

my $activate_outputs = [0.6, 0, 0.2];
my $desired_outputs = [1, 0, 0];
my $cross_entropy_cost_delta = cross_entropy_cost_delta(undef, $activate_outputs, $desired_outputs);

print "@$cross_entropy_cost_delta\n";

For software engineers, a useful mental image of a partial derivative is the rate of change in the output (the value of the loss function) with respect to a small change in one individual input.

Try increasing the first input value by 0.01. If the output increases by 0.3, the slope is 0.3 / 0.01, which is 30.

Next, try increasing the second input value by 0.01. If the output increases by 0.5, the slope is 0.5 / 0.01, which is 50.
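Here is a minimal sketch in Perl of this numerical-slope idea, assuming the cross_entropy_cost function from earlier in this article is available. The variable names and example values are only for illustration; with these particular values the slope comes out negative, because the first output moves toward its desired value of 1 and the loss decreases.

# Numerical slope: (change in the loss) / (small change in one input)
my $base_outputs = [0.7, 0.2, 0.1];
my $base_desired = [1, 0, 0];

my $base_cost = cross_entropy_cost($base_outputs, $base_desired);

# Increase the first input to the loss function by 0.01
my $h = 0.01;
my $moved_outputs = [@$base_outputs];
$moved_outputs->[0] += $h;

my $moved_cost = cross_entropy_cost($moved_outputs, $base_desired);

# Ratio of the change in output to the change in input
my $slope = ($moved_cost - $base_cost) / $h;
print "$slope\n";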

The term "partial derivative" may sound intimidating, but the idea is actually simple.

Mathematics might have felt easier if it had simply been called "the ratio of the change in output to the change in input".

Related Information