autodiff_reverse

Attribute Macro autodiff_reverse 

#[autodiff_reverse]
🔬This is a nightly-only experimental API. (autodiff #124509)

This macro uses reverse-mode automatic differentiation to generate a new function that computes the derivative of the function to which it is applied. It may only be applied to a function.

The expected usage syntax is: #[autodiff_reverse(NAME, INPUT_ACTIVITIES, OUTPUT_ACTIVITY)]

  • NAME: A valid function name, written as a bare identifier; the generated function is given this name.
  • INPUT_ACTIVITIES: Specifies one valid activity for each input parameter.
  • OUTPUT_ACTIVITY: Must not be set if the function implicitly returns nothing (or explicitly returns -> ()). Otherwise, it must be set to one of the allowed activities.

Each activity must be one of Active, Duplicated, or Const; more options will be exposed later.

Active can be used for float scalar values. If used on an input, a new float is appended to the return tuple of the generated function. If the function returns a float scalar, Active can be used for the return value as well; in that case a float scalar is appended to the argument list, where it acts as the seed.
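
To make the signature change concrete, the sketch below writes by hand what a reverse-mode derivative of a one-argument function could look like under these rules. The names `square` and `d_square` are hypothetical, and the body of `d_square` is a manual stand-in for illustration only; it is not the code the macro emits, and it compiles on stable Rust without the `autodiff` feature.

```rust
/// Primal function. With the macro it might be annotated as
/// `#[autodiff_reverse(d_square, Active, Active)]` (assumed usage).
fn square(x: f64) -> f64 {
    x * x
}

/// Hand-written stand-in for the generated function.
/// - `x` is an `Active` input, so its derivative is appended to the return tuple.
/// - The return value is `Active`, so a `seed` is appended to the argument list.
fn d_square(x: f64, seed: f64) -> (f64, f64) {
    let primal = square(x);
    let dx = 2.0 * x * seed; // d(x * x)/dx, scaled by the seed
    (primal, dx)
}

fn main() {
    let (y, dx) = d_square(3.0, 1.0);
    println!("y = {y}, dx = {dx}"); // y = 9, dx = 6
}
```

Passing a seed of 1.0 yields the plain derivative; any other seed scales it.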

Duplicated can be used on references, raw pointers, or other indirect input arguments. It creates a new shadow argument of the same type, following the original argument. A const reference or pointer argument will receive a mutable reference or pointer as shadow.
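
As an illustration of the shadow argument, here is a hand-written sketch for a function that takes its input by shared reference. `sum_sq` and `d_sum_sq` are hypothetical names, the return shape is an assumption, and the derivative body is a manual stand-in rather than the macro's actual output.

```rust
/// Primal function taking its input by reference. With the macro it might be
/// annotated as `#[autodiff_reverse(d_sum_sq, Duplicated, Active)]` (assumed usage).
fn sum_sq(v: &[f64; 2]) -> f64 {
    v[0] * v[0] + v[1] * v[1]
}

/// Hand-written stand-in: the shadow `dv` directly follows `v`. Although `v`
/// is a shared reference, its shadow is mutable so that the gradient can be
/// written into it. The trailing `seed` belongs to the `Active` return value.
fn d_sum_sq(v: &[f64; 2], dv: &mut [f64; 2], seed: f64) -> f64 {
    dv[0] += 2.0 * v[0] * seed;
    dv[1] += 2.0 * v[1] * seed;
    sum_sq(v)
}

fn main() {
    let v = [3.0, 4.0];
    let mut dv = [0.0; 2]; // the shadow starts at zero
    let y = d_sum_sq(&v, &mut dv, 1.0);
    println!("y = {y}, dv = {dv:?}"); // y = 25, dv = [6.0, 8.0]
}
```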

Const should be used on non-float arguments. It can also be used on float-based arguments as an optimization when the derivative with respect to that argument is not needed.
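
The following hand-written sketch shows how a Const argument might leave the signature untouched. `powi` and `d_powi` are hypothetical names, and the derivative is computed manually for illustration rather than produced by the macro.

```rust
/// Primal function with a non-float parameter. With the macro it might be
/// annotated as `#[autodiff_reverse(d_powi, Active, Const, Active)]` (assumed usage).
fn powi(x: f64, n: i32) -> f64 {
    x.powi(n)
}

/// Hand-written stand-in: `n` is `Const`, so it gets neither a shadow argument
/// nor an entry in the return tuple. Only `x` contributes a derivative.
fn d_powi(x: f64, n: i32, seed: f64) -> (f64, f64) {
    let dx = f64::from(n) * x.powi(n - 1) * seed; // d(x^n)/dx = n * x^(n-1)
    (powi(x, n), dx)
}

fn main() {
    let (y, dx) = d_powi(2.0, 3, 1.0);
    println!("y = {y}, dx = {dx}");
}
```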

Usage examples:

#![feature(autodiff)]
use std::autodiff::*;
#[autodiff_reverse(rb_rev, Active, Active, Active)]
fn rosenbrock(x: f64, y: f64) -> f64 {
    (1.0 - x).powi(2) + 100.0 * (y - x.powi(2)).powi(2)
}
#[autodiff_reverse(rb_inp_rev, Active, Active, Duplicated)]
fn rosenbrock_inp(x: f64, y: f64, out: &mut f64) {
    *out = (1.0 - x).powi(2) + 100.0 * (y - x.powi(2)).powi(2);
}

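fn main() {
    // Arguments: x, y, and the seed for the Active return value.
    let (output1, dx1, dy1) = rb_rev(1.0, 3.0, 1.0);
    dbg!(output1, dx1, dy1); // (400.0, -800.0, 400.0)

    // `out` is Duplicated, so its shadow `seed` follows it in the argument list.
    let mut output2 = 0.0;
    let mut seed = 1.0;
    let (dx2, dy2) = rb_inp_rev(1.0, 3.0, &mut output2, &mut seed);
    dbg!(dx2, dy2, output2, seed); // (-800.0, 400.0, 400.0, 0.0)
}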
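
The values in the comments can be cross-checked without the nightly feature. The sketch below approximates both partial derivatives of `rosenbrock` at (1.0, 3.0) with central finite differences. The helper `central_diff` and the step size are choices made for this illustration and are not part of the autodiff API.

```rust
fn rosenbrock(x: f64, y: f64) -> f64 {
    (1.0 - x).powi(2) + 100.0 * (y - x.powi(2)).powi(2)
}

/// Central finite difference of a one-argument closure (hypothetical helper).
fn central_diff(f: impl Fn(f64) -> f64, at: f64, h: f64) -> f64 {
    (f(at + h) - f(at - h)) / (2.0 * h)
}

fn main() {
    let (x, y) = (1.0, 3.0);
    let h = 1e-5;
    // Vary one argument at a time while holding the other fixed.
    let dx = central_diff(|t| rosenbrock(t, y), x, h);
    let dy = central_diff(|t| rosenbrock(x, t), y, h);
    // Both should be close to the analytic values -800 and 400.
    assert!((dx + 800.0).abs() < 1e-3);
    assert!((dy - 400.0).abs() < 1e-3);
    println!("f = {}, dx ~ {dx:.3}, dy ~ {dy:.3}", rosenbrock(x, y));
}
```

Finite differences need two function evaluations per input, whereas a reverse-mode derivative obtains all input derivatives from a single call.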

We often want to track how one or more input floats affect one output float. This output can be a scalar return value, or a mutable reference or pointer argument. Inputs passed by reference or pointer should be marked as Duplicated and their shadows initialized to 0.0. The output should be marked as Active (for a scalar return value) or Duplicated (for a reference or pointer argument), and its shadow, the seed, should be initialized to 1.0.

After calling the generated function, the shadow(s) of the input(s) will contain the derivatives, and the shadow of the output(s) (the “seed”) will be reset to zero. If the function has more than one output float marked as Active or Duplicated, users might want to set one seed to 1.0 and the others to 0.0 to compute partial derivatives. Unlike forward mode, a call to the generated function does not reset the shadows of the inputs. Reverse mode is generally more efficient if there are more Active or Duplicated input floats than output floats.
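
The seeding and reset behavior described here can be sketched by hand for a function with one input and two outputs. `f` and `d_f` are hypothetical names, and `d_f` only imitates the described semantics (input shadows accumulate, the seed is zeroed); it is not the code the macro generates.

```rust
/// Primal with one input and two outputs, all passed by reference.
/// With the macro, both arguments might be marked `Duplicated` (assumed usage).
fn f(x: &f64, out: &mut [f64; 2]) {
    let x = *x;
    out[0] = x * x;
    out[1] = 3.0 * x;
}

/// Hand-written stand-in for the reverse pass of `f`.
fn d_f(x: &f64, dx: &mut f64, out: &mut [f64; 2], dout: &mut [f64; 2]) {
    f(x, out);
    // Accumulate into the input shadow; it is not reset first.
    *dx += 2.0 * *x * dout[0] + 3.0 * dout[1];
    // The output shadow (the seed) is consumed and reset to zero.
    *dout = [0.0; 2];
}

fn main() {
    let x = 2.0;
    let mut out = [0.0; 2];

    // Seed [1, 0] selects d(out[0])/dx = 2x = 4.
    let (mut dx, mut seed) = (0.0, [1.0, 0.0]);
    d_f(&x, &mut dx, &mut out, &mut seed);
    assert_eq!((dx, seed), (4.0, [0.0, 0.0]));

    // Reusing `dx` without zeroing it accumulates: 4 + d(out[1])/dx = 4 + 3.
    seed = [0.0, 1.0];
    d_f(&x, &mut dx, &mut out, &mut seed);
    assert_eq!(dx, 7.0);
    println!("out = {out:?}, accumulated dx = {dx}");
}
```

To obtain the second partial derivative on its own, `dx` would have to be set back to 0.0 before the second call.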

Related information can also be found under the term “Vector-Jacobian Product” (VJP).