
Add Ops for Gaussian Hypergeometric Function, Pochhammer Symbol, and Factorials #90

Merged
merged 2 commits into pymc-devs:main on Jan 5, 2023

Conversation

ColtAllen
Contributor

This PR is a continuation of #87 in a new fork and adds Ops for the Gaussian Hypergeometric Function, Pochhammer Symbol, and Factorials as hyp2f1, poch, and factorial, respectively.

hyp2f1 involves an infinite summation and uses a scipy implementation for this reason, but once scan is performant enough to assume this task, hyp2f1 can be rewritten in terms of poch and factorial in a future PR.
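For context, 2F1 is defined by the series 2F1(a, b; c; z) = Σ_n (a)_n (b)_n / (c)_n · z^n / n!, which is what would make a poch/factorial-based rewrite possible. A minimal sketch of that relationship using scipy (illustrative only; the name hyp2f1_truncated and the fixed 50-term cutoff are not part of this PR):

```python
# Illustrative sketch only: the truncated series that ties hyp2f1 to poch and
# factorial, checked against scipy's reference implementation.
import numpy as np
from scipy.special import factorial, hyp2f1, poch


def hyp2f1_truncated(a, b, c, z, n_terms=50):
    """Approximate 2F1(a, b; c; z) = sum_n (a)_n (b)_n / (c)_n * z**n / n!."""
    n = np.arange(n_terms)
    terms = poch(a, n) * poch(b, n) / poch(c, n) * z**n / factorial(n)
    return terms.sum()


# For |z| < 1 the truncated series matches scipy closely.
print(hyp2f1_truncated(1.5, 2.0, 3.5, 0.4))
print(hyp2f1(1.5, 2.0, 3.5, 0.4))
```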

The only failing test is test_grad for hyp2f1. It seems to be a data type issue (the derivative Op appears to be handed a symbolic ScalarVariable where its Python implementation expects a number):

TypeError: ('float() argument must be a string or a number, not 'ScalarVariable'
Apply node that caused the error: Elemwise{hyp2f1_der}(input 0, input 1, input 2, input 3, TensorConstant{(1, 1) of 0.0})
Toposort index: 0
Inputs types: [TensorType(float64, (2, 3)), TensorType(float64, (2, 3)), TensorType(float64, (2, 3)), TensorType(float64, (2, 3)), TensorType(float64, (1, 1))]
Inputs shapes: [(2, 3), (2, 3), (2, 3), (2, 3), (1, 1)]
Inputs strides: [(24, 8), (24, 8), (24, 8), (24, 8), (8, 8)]
Inputs values: ['not shown', 'not shown', 'not shown', 'not shown', array([[0.]])]
Outputs clients: [[Elemwise{Mul}[(0, 1)](random_projection, Elemwise{hyp2f1_der}.0)]]

Backtrace when the node is created (use PyTensor flag traceback__limit=N to make it longer):
  File "/mnt/c/Users/colta/portfolio/pytensor/tests/tensor/utils.py", line 590, in test_grad
    utt.verify_grad(
  File "/mnt/c/Users/colta/portfolio/pytensor/tests/unittest_tools.py", line 70, in verify_grad
    orig_verify_grad(op, pt, n_tests, rng, *args, **kwargs)
  File "/mnt/c/Users/colta/portfolio/pytensor/pytensor/gradient.py", line 1844, in verify_grad
    symbolic_grad = grad(cost, tensor_pt, disconnected_inputs="ignore")
  File "/mnt/c/Users/colta/portfolio/pytensor/pytensor/gradient.py", line 619, in grad
    rval: Sequence[Variable] = _populate_grad_dict(
  File "/mnt/c/Users/colta/portfolio/pytensor/pytensor/gradient.py", line 1426, in _populate_grad_dict
    rval = [access_grad_cache(elem) for elem in wrt]
  File "/mnt/c/Users/colta/portfolio/pytensor/pytensor/gradient.py", line 1426, in <listcomp>
    rval = [access_grad_cache(elem) for elem in wrt]
  File "/mnt/c/Users/colta/portfolio/pytensor/pytensor/gradient.py", line 1381, in access_grad_cache
    term = access_term_cache(node)[idx]
  File "/mnt/c/Users/colta/portfolio/pytensor/pytensor/gradient.py", line 1209, in access_term_cache
    input_grads = node.op.L_op(inputs, node.outputs, new_output_grads)

HINT: Use the PyTensor flag `exception_verbosity=high` for a debug print-out and storage map footprint of this Apply node.',
'Test Elemwise{hyp2f1,no_inplace}::normal: Error occurred while computing the gradient on the following inputs:
[array([[764.16214925, 550.49823533, 542.19109217],
        [613.93532095, 341.6967123 , 284.38215306]]),
 array([[764.16214925, 550.49823533, 542.19109217],
        [613.93532095, 341.6967123 , 284.38215306]]),
 array([[764.16214925, 550.49823533, 542.19109217],
        [613.93532095, 341.6967123 , 284.38215306]]),
 array([[0.38208107, 0.27524912, 0.27109555],
        [0.30696766, 0.17084836, 0.14219108]])]')

@ricardoV94 ricardoV94 changed the base branch from downstream_1288 to main December 7, 2022 16:01
@ricardoV94
Member

@ColtAllen Do you mind squashing your commits into: 1) Add factorial and poch helpers and 2) Add Hyp2F1 and derivatives?

@ColtAllen
Contributor Author

@ColtAllen Do you mind squashing your commits into: 1) Add factorial and poch helpers and 2) Add Hyp2F1 and derivatives?

Commits are squashed, but I edited the wrong commit message; sorry about that.

@ricardoV94
Member

ricardoV94 commented Dec 7, 2022

You can edit the commit message with interactive rebase (back up your branch first if it's your first time)

https://docs.github.com/en/pull-requests/committing-changes-to-your-project/creating-and-editing-commits/changing-a-commit-message

@twiecki
Member

twiecki commented Dec 21, 2022

@ColtAllen Can you run pre-commit on this?

@ColtAllen
Contributor Author

Hey @twiecki,

After a conversation with @ricardoV94, we decided the next step is for the hyp2f1 derivatives to be written in Stan. He offered to do so since he's more familiar with that language. This could involve considerable changes to what has already been written, so it may be best to wait until afterward to run pre-commit.
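
As an aside (not part of this PR): the derivative of 2F1 with respect to z has the closed form (a·b/c) · 2F1(a+1, b+1; c+1; z); it is the derivatives with respect to a, b, and c that require extra series evaluations, which is presumably why the Stan implementation was brought up. A quick scipy sanity check of the z-derivative identity:

```python
# Sanity check (illustrative, not PR code): the closed-form z-derivative of 2F1,
# d/dz 2F1(a, b; c; z) = (a * b / c) * 2F1(a + 1, b + 1; c + 1; z),
# compared against a central finite difference.
from scipy.special import hyp2f1

a, b, c, z, eps = 1.5, 2.0, 3.5, 0.4, 1e-6

analytic = (a * b / c) * hyp2f1(a + 1, b + 1, c + 1, z)
numeric = (hyp2f1(a, b, c, z + eps) - hyp2f1(a, b, c, z - eps)) / (2 * eps)

print(analytic, numeric)  # the two values should agree to several decimal places
```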

@codecov-commenter

codecov-commenter commented Dec 30, 2022

Codecov Report

Merging #90 (10d4d94) into main (8051ffb) will increase coverage by 0.00%.
The diff coverage is 84.21%.

Additional details and impacted files

@@           Coverage Diff           @@
##             main      #90   +/-   ##
=======================================
  Coverage   79.95%   79.95%           
=======================================
  Files         170      170           
  Lines       44856    44950   +94     
  Branches     9498     9510   +12     
=======================================
+ Hits        35863    35942   +79     
- Misses       6780     6790   +10     
- Partials     2213     2218    +5     
| Impacted Files | Coverage Δ |
|---|---|
| pytensor/scalar/math.py | 85.00% <82.55%> (-0.30%) ⬇️ |
| pytensor/tensor/inplace.py | 100.00% <100.00%> (ø) |
| pytensor/tensor/math.py | 90.69% <100.00%> (+0.01%) ⬆️ |
| pytensor/tensor/special.py | 90.90% <100.00%> (+0.26%) ⬆️ |

@ricardoV94
Member

@OriolAbril does the error in RTD make any sense to you?

@OriolAbril
Member

It doesn't mean anything to me; I'd need to try to reproduce it locally to get a better idea :/. Has anything similar ever happened outside RTD?

@ricardoV94
Member

@ColtAllen I've finally come around to this one. Do you want to have a look?

Comment on lines +1529 to +1547
```python
def check_2f1_converges(a, b, c, z) -> bool:
    num_terms = 0
    is_polynomial = False

    def is_nonpositive_integer(x):
        return x <= 0 and x.is_integer()

    if is_nonpositive_integer(a) and abs(a) >= num_terms:
        is_polynomial = True
        num_terms = int(np.floor(abs(a)))
    if is_nonpositive_integer(b) and abs(b) >= num_terms:
        is_polynomial = True
        num_terms = int(np.floor(abs(b)))

    is_undefined = is_nonpositive_integer(c) and abs(c) <= num_terms

    return not is_undefined and (
        is_polynomial or np.abs(z) < 1 or (np.abs(z) == 1 and c > (a + b))
    )
```
Contributor Author

Just idle curiosity, but what is the significance of the num_terms variable?

Member

@ricardoV94 ricardoV94 Jan 3, 2023

I think when the series has a negative integer in the denominator, it will eventually hit a divide-by-zero and blow up... unless it has a negative integer in the numerator that reaches zero first, truncating the series before that happens.
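
A toy illustration of that truncation with scipy's poch (not library code; the specific values a = -3 and c = -5 are just for demonstration):

```python
# Toy illustration (not library code): with a = -3 the numerator Pochhammer
# (a)_n hits zero at n = 4, truncating the series at n = 3, before the
# denominator (c)_n with c = -5 reaches zero at n = 6.
from scipy.special import poch

a, c = -3.0, -5.0
for n in range(7):
    print(n, poch(a, n), poch(c, n))
# (a)_n: 1, -3, 6, -6, 0, 0, 0   -> the series is a degree-3 polynomial
# (c)_n: 1, -5, 20, -60, 120, -120, 0 -> its zero is never actually reached
```

In terms of the check above, num_terms would be 3 here, and since |c| = 5 > 3 the series is not flagged as undefined, i.e. it converges as a polynomial.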

@ColtAllen
Contributor Author

@ColtAllen I've finally come around to this one. Do you want to have a look?

Nice! We discussed using Stan, but it looks like you were able to get this running with numpy. Will there be a performance hit in doing so?

Also, thinking long-term, do you think this is good to go for the foreseeable future, or are there known bottlenecks worth revisiting later as additional backends are adopted and improvements are made elsewhere in pytensor? If so, these are worth documenting for future work.

@ricardoV94
Member

ricardoV94 commented Jan 3, 2023

I didn't think of using Stan directly, just adapting their implementation. Using numpy is certainly not the most efficient thing we can do; the broader issue is discussed here: #83

I want to take a stab at it in the following days, but for now I think numpy will suffice. Depending on the range of values it can actually converge pretty quickly (~10 iterations).

Actually I wanted to ask if you could try it in the model that motivated the issue and see how it fares.
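
For a rough sense of that convergence rate, here is a plain-Python sketch of the series using a simple term-ratio recurrence (illustrative only; hyp2f1_series, the tolerance, and the iteration cap are not the PR's actual implementation):

```python
# Rough sketch (not the PR's code) of how quickly the 2F1 series converges for
# modest |z|: each term is obtained from the previous one by a simple ratio.
def hyp2f1_series(a, b, c, z, tol=1e-12, max_iter=1000):
    total, term = 1.0, 1.0
    for n in range(max_iter):
        term *= (a + n) * (b + n) / (c + n) * z / (n + 1)
        total += term
        if abs(term) < tol * abs(total):
            return total, n + 1  # value, number of terms used
    return total, max_iter


value, n_terms = hyp2f1_series(1.5, 2.0, 3.5, 0.1)
print(value, n_terms)  # converges in roughly a dozen terms for z = 0.1
```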

@OriolAbril
Member

The rtd issue doesn't seem to happen in other PRs :/.

@michaelosthege
Member

rtd build should get fixed by a rebase @ricardoV94

@ricardoV94 ricardoV94 added the enhancement New feature or request label Jan 4, 2023
Member

@ricardoV94 ricardoV94 left a comment

Self-approval warning :)

@twiecki twiecki merged commit 9b2cb97 into pymc-devs:main Jan 5, 2023
@ColtAllen ColtAllen deleted the downstream_1288 branch January 10, 2023 04:37