
Implement value_gradient_and_hessian #305


Merged

gdalle merged 14 commits into main from gd/grad_and_hess on Jun 6, 2024

Conversation

@gdalle (Member) commented on Jun 5, 2024

DI source

  • Add value_gradient_and_hessian for dense and sparse backends (a usage sketch follows the lists below)
  • Adjust the extras accordingly so that the gradient is prepared as well
  • Add maybe_inner, maybe_outer and maybe_dense_ad logic to accommodate first-order backends

DI extensions

Add operator for:

  • FastDifferentiation
  • FiniteDiff
  • ForwardDiff
  • PolyesterForwardDiff
  • ReverseDiff
  • Symbolics
  • Zygote

DI docs

  • Document value_gradient_and_hessian

DIT source

  • Add correctness, benchmark, and type-stability tests for value_gradient_and_hessian
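
For illustration, here is a minimal usage sketch of the new operator (the function, input, and backend choice are made up; the call signature mirrors DI's other operators such as value_and_gradient):

```julia
using DifferentiationInterface
import ForwardDiff  # loading ForwardDiff activates the corresponding DI extension

f(x) = sum(abs2, x)          # illustrative function: f(x) = ‖x‖²
backend = AutoForwardDiff()  # ADTypes backend, re-exported by DI
x = float.(1:3)

# One call returns the value, the gradient, and the Hessian together.
y, grad, hess = value_gradient_and_hessian(f, backend, x)
```

Computing all three quantities in one call lets a backend share work between the first- and second-order passes instead of differentiating from scratch twice.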

@codecov-commenter commented on Jun 5, 2024

Codecov Report

Attention: Patch coverage is 96.49123% with 6 lines in your changes missing coverage. Please review.

Project coverage is 96.32%. Comparing base (af0b6ce) to head (de8cbb7).

Files                                                  Patch %   Lines
...ferentiationInterfaceTest/src/tests/correctness.jl  84.00%    4 Missing ⚠️
DifferentiationInterface/src/utils/maybe.jl            66.66%    2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #305      +/-   ##
==========================================
+ Coverage   96.11%   96.32%   +0.20%     
==========================================
  Files          94       95       +1     
  Lines        4191     4294     +103     
==========================================
+ Hits         4028     4136     +108     
+ Misses        163      158       -5     


@gdalle linked an issue on Jun 5, 2024 that may be closed by this pull request
@gdalle merged commit bb75cf8 into main on Jun 6, 2024
50 checks passed
@gdalle deleted the gd/grad_and_hess branch on Jun 6, 2024 at 07:26
@gerlero (Contributor) commented on Jun 6, 2024

@gdalle Thanks a lot for the work! I've already managed to test the new functionality with my projects and can confirm that it works.

Unfortunately, in the case of ForwardDiff (not a surprise to me, given that it's what prompted JuliaDiff/AbstractDifferentiation.jl#122), the DiffResults-based implementations of value_and_derivative and value_derivative_and_second_derivative are a bit slower (by about 5% in my test code, which does a lot of differentiation) than AbstractDifferentiation's implementation based on manipulating dual numbers.

with AbstractDifferentiation:
5.917 ms (1243 allocations: 111.25 KiB)

with DifferentiationInterface:
6.207 ms (1243 allocations: 111.25 KiB)

Although the difference is small, I can't justify migrating my code from AbstractDifferentiation if it will be slower. That is, unless you're okay with also taking AbstractDifferentiation's implementation verbatim here (even if it won't work for non-scalar functions)?
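
For reference, the dual-number trick in question looks roughly like this in the scalar case (a sketch only, not the AbstractDifferentiation code; InnerTag and OuterTag are hypothetical tag types, and production code must follow ForwardDiff's tag-ordering rules to avoid perturbation confusion):

```julia
using ForwardDiff: Dual, value, partials

struct InnerTag end  # hypothetical tag types; real code should use
struct OuterTag end  # properly ordered ForwardDiff tags

# One nested-dual sweep yields f(x), f'(x), and f''(x) together.
function value_derivative_and_second_derivative_sketch(f, x::Real)
    inner = Dual{InnerTag}(x, one(x))          # seeds the first derivative
    outer = Dual{OuterTag}(inner, one(inner))  # seeds the second derivative
    y = f(outer)
    return (value(value(y)),              # f(x)
            partials(value(y), 1),        # f'(x)
            partials(partials(y, 1), 1))  # f''(x)
end

value_derivative_and_second_derivative_sketch(sin, 1.0)  # ≈ (0.841, 0.540, -0.841)
```

The appeal is that a single call to f propagates both derivative orders at once, instead of going through a DiffResults buffer.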

@gdalle (Member, Author) commented on Jun 6, 2024

Wait, we can do value_derivative_and_second_derivative using the public ForwardDiff API and DiffResults? I thought this only worked for the Hessian (which computes the gradient as a side product if you supply a DiffResult).

In any case, yes, a PR would be welcome with a custom dual number implementation. Let's start with a version that works in the scalar case, and then make it work in the vector case.
Here is the file you need to modify: https://github.com/gdalle/DifferentiationInterface.jl/blob/main/DifferentiationInterface/ext/DifferentiationInterfaceForwardDiffExt/onearg.jl
And here are some utilities that are useful for writing code that works on both scalar and vector output: https://github.com/gdalle/DifferentiationInterface.jl/blob/main/DifferentiationInterface/ext/DifferentiationInterfaceForwardDiffExt/utils.jl
You can take a look at DI.pushforward for inspiration.
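
For context, the DiffResults mechanism referred to here works roughly as follows (a sketch with a made-up f and x; ForwardDiff.hessian! fills the value, gradient, and Hessian fields of a HessianResult in a single call):

```julia
using ForwardDiff, DiffResults

f(x) = sum(abs2, x)  # illustrative function
x = float.(1:3)

result = DiffResults.HessianResult(x)        # preallocated buffer for all three fields
result = ForwardDiff.hessian!(result, f, x)  # fills value, gradient, and Hessian at once

y    = DiffResults.value(result)
grad = DiffResults.gradient(result)
hess = DiffResults.hessian(result)
```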

@gdalle (Member, Author) commented on Jun 6, 2024

Do you want to open said PR? I don't have complete confidence in my ability to get your code working on vectors, so it would definitely be great if you could help.

@gerlero (Contributor) commented on Jun 6, 2024

Sure, I'll start it later today. (I also don't think I can get it working on vectors, but it's worth a try.)

Development

Successfully merging this pull request may close these issues:

  • Return gradient with hessian, or derivative with second_derivative