How to find the slope in the case of a multi-stage function --Differentiation of the composite function

I will explain how to find the slope in the case of a multi-stage function. How to find the slope is explained in What is the slope, so it is assumed that you understand this.

When using a multi-stage function to find the slope, the terminology of mathematics describes it as the derivative of a composite function. Here, we will explain in easy-to-understand words that software engineers can understand.

Apply functions in multiple stages

First, consider a sample in which a squared function and a doubled function are applied in succession.

use strict;
use warnings;

sub pow2 {
  my ($input) = @_;
  
  my $output = $input ** 2;
  
  return $output;
}

sub mul2 {
  my ($input) = @_;
  
  my $output = $input * 2;
  
  return $output;
}

my $input = 3;
my $output_pow2 = pow2 ($input);
my $output_mul2_pow2 = mul2 ($output_pow2);

print "$output_mul2_pow2\n";

We squared 3 and doubled it, so the result is 18.

Find the value based on the definition of slope

Let's find the value based on the definition of slope. The function is multi-stage, but it's not difficult because you only see the ratio of the small change of the first input to the small change of the last output (the denominator is the small change of the input).

use strict;
use warnings;

sub pow2 {
  my ($input) = @_;
  
  my $output = $input ** 2;
  
  return $output;
}

sub mul2 {
  my ($input) = @_;
  
  my $output = $input * 2;
  
  return $output;
}

my $input = 3;
my $output_pow2 = pow2 ($input);
my $output_mul2_pow2 = mul2 ($output_pow2);

my $delta = 0.00000001;
my $input_plus_delta = $input + $delta;
my $output_pow2_plus_delta = pow2 ($input_plus_delta);
my $output_mul2_pow2_plus_delta = mul2 ($output_pow2_plus_delta);

my $grad = ($output_mul2_pow2_plus_delta-$output_mul2_pow2) / ($input_plus_delta-$input);

# 12
print "$grad\n";

The slope is now 12.

Formula for finding the slope of a multi-stage function

In fact, there is a formula for finding the slope of a multi-stage function. All you have to do is multiply the result of the slope obtained using the derivative of each function.

# Formula for finding the slope of a multi-stage function
my $grad = pow2_derivative ($input) * mul2_derivative ($input_plus_delta);

Now, let's see if the results match, excluding the error, given the derivatives of each function.

use strict;
use warnings;

sub pow2 {
  my ($input) = @_;
  
  my $output = $input ** 2;
  
  return $output;
}

sub pow2_derivative {
  my ($input) = @_;
  
  my $output = $input * 2;
  
  return $output;
}


sub mul2 {
  my ($input) = @_;
  
  my $output = $input * 2;
  
  return $output;
}

sub mul2_derivative {
  my ($input) = @_;
  
  my $output = 2;
  
  return $output;
}

my $input = 3;
my $output_pow2 = pow2 ($input);
my $output_mul2_pow2 = mul2 ($output_pow2);

my $delta = 0.00000001;
my $input_plus_delta = $input + $delta;
my $output_pow2_plus_delta = pow2 ($input_plus_delta);
my $output_mul2_pow2_plus_delta = mul2 ($output_pow2_plus_delta);

#Slope obtained using the definition
my $grad = ($output_mul2_pow2_plus_delta-$output_mul2_pow2) / ($input_plus_delta-$input);

# Slope obtained using the formula
my $grad_formula = pow2_derivative ($input) * mul2_derivative ($output_pow2_plus_delta);

# 12
print "$grad_formula\n";

The result was 12. It suits you. This time, there seems to be no error.

This time, how to find the slope in the case of 2 steps, but the way of thinking does not change whether it is 3 steps or 4 steps.

In the case of deep learning, the term partial derivative is used, but think of it as meaning to find the slope for one input. Partial differentiation is to consider it as a constant except for one input to change. Therefore, if you can understand the contents of this time, you can understand the partial differential naturally.

When using mathematical formulas, you need to understand the words of math and the implicit understanding that the words of math contain, but when expressed in code, everything is expressed in code, so for software engineers. I think it feels easy to understand.

Associated Information