Skip to content

Nodes #35

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 15 commits into from
Aug 26, 2022
Merged

Nodes #35

merged 15 commits into from
Aug 26, 2022

Conversation

uncomputable
Copy link
Collaborator

This PR removes linear programs entirely and replaces them with heap structures.

Philosophy

Linear programs exist because of the (de)serialisation of Simplicity programs. However, besides that there is really no need for index bookkeeping in our code. On top of that, sharing and pruning can are transport-layer optimisations and should be done during serialisation. Therefore, let us use one of Rust's strengths instead, namely reference bookkeeping.

Immediate effects

The code is more concise and readable. Children of left and right assertions are checked to be hidden nodes (which already gave away a faulty program). Providing a witness during finalisation gives proper errors and never panics. Fuzz tests have been temporarily removed, because programs can no longer be encoded (future PR).

Future extensions

By constructing programs inductively, we enable many useful features: We could locally check if the types of children nodes match the type of their parent as we construct it, and provide an error to the user if there is a mismatch. We could implement a program factory that keeps track of the sub-programs it has produced and that returns an Rc::clone whenever the same sub-program is produced twice. In general, freeing ourselves from indices allows for many runtime optimisations in Rust.

- rewrite once encoding of programs is done
@uncomputable uncomputable requested a review from apoelstra August 22, 2022 11:51
@apoelstra
Copy link
Collaborator

This might be a huge PITA but could you change the commits which comment out whole blocks with // to just wrap the code blocks in /*? It would greatly reduce the about of diff and for some reason my differ is not smart enough to highlight that the // -prefixed code is the same.

@uncomputable
Copy link
Collaborator Author

Will do; this is a useful trick. Removing whole files and re-pasting the adapted parts later is also an option. It disables git blame (which is why I used comments), but maybe that is not too bad.

@apoelstra
Copy link
Collaborator

If you remove them and re-paste in later commits that would be very hard to review because I'd have to combine the commits' diffs to check that there had been no net change.

Breaking git blame isn't the end of the world, it's pretty common to have to re-blame many times to get to a point in history prior to reformatting/moving/whatever.

- temporarily disable most of the library
@uncomputable
Copy link
Collaborator Author

Done adding block comments. As far as I can see, this reduced the additions / deletions by ~1000 lines.

@uncomputable
Copy link
Collaborator Author

uncomputable commented Aug 24, 2022

We could think about removing the generic parameter Witness for Node, like for Program. I don't know what your plans are for generic witness types. Convert CommitNode<Witness, App> to CommitNode<Value, App> via functions that are specific to the witness type? For instance, "if Witness = (), then provide a witness via an iterator over Value."

pub(crate) mod schnorr0;
pub(crate) mod schnorr6;
pub(crate) mod sighash_all;
// FIXME: Schnorr and SIGHASH_ALL programs have left assertions whose children are not a hidden node
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found a silly mistake on my part (left and right hidden nodes being reversed). Schnorr and SIGHASH_ALL work and I will force push the fix.

@apoelstra
Copy link
Collaborator

apoelstra commented Aug 25, 2022

We could think about removing the generic parameter Witness for Node, like for Program. I don't know what your plans are for generic witness types. Convert CommitNode<Witness, App> to CommitNode<Value, App> via functions that are specific to the witness type? For instance, "if Witness = (), then provide a witness via an iterator over Value."

I think my intention here was to simulate your commit-time/redeem-time distinction by swapping out Value for () for programs that did not have any witnesses. What you've done is cleaner. We can remove the generic.

More specifically, we can remove Witness entirely from CommitNode (we still need the variant of course but it doesn't need any associated data). Then in Node we'd have Witess(Value) rather than Witness(Witness).

@apoelstra
Copy link
Collaborator

Can you remind me what purpose RefWrapper serves?

@uncomputable
Copy link
Collaborator Author

The RefWrapper implements Copy, Eq and Hash for node references via pointer methods. This makes it very easy to put references into hash maps and to compare two nodes.

@apoelstra
Copy link
Collaborator

Ah, that's right! We should definitely have a doccomment explaining that.

@apoelstra
Copy link
Collaborator

apoelstra commented Aug 25, 2022

Done reviewing 637576b. Overall this looks great. I think we should merge it as-is and do fixups later so that we're not iterating on such a massive diff. But there are a few missing pieces:

  • We should export Node and CommitNode at the crate level
  • Type checking should be done locally and in CommitNode (I know this is a big job, but just mentioning it here)
  • ...and then re-done for all nodes that are affected by Witness nodes during conversion
  • Decoding needs to check sharing before/during conversion to the internal representationi
  • As discussed elsewhere we can drop the Witness generic. CommitNode should have no witness data and Node should use Value as its witness data.

I am also musing about whether or not we should rename Node to RedeemNode or something. I'm not sure what the best word for this is. I guess we should nail down what the rules/invariants for these types are. My understanding is

  • CommitNode should be a complete tree, i.e. have no hidden nodes. (Is this true? Maybe not..)
  • CommitNode should have no witnesses (I'm pretty confident this is true)
  • Node may have hidden nodes (also confident)
  • Node must have witnesses (less confident).

I guess my observation here is that there are two distinctions to be made: whether or not the tree is complete in the sense of having no hidden nodes, and whether or not the tree has witnesses. Which one are we trying to capture here?

Copy link
Collaborator

@apoelstra apoelstra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK 637576b

@uncomputable
Copy link
Collaborator Author

uncomputable commented Aug 26, 2022

I completely agree with the missing pieces. The broken decoding only became visible after I re-enabled fuzzing in a follow-up branch. I will create a fix-up PR.

What do you mean by redoing type checking for nodes "that are affected by Witness nodes"? Isn't the value type of a witness node equal to its node type? Maybe this rule does not apply when Witness is neither () nor Value. In this case, the issue will be resolved by removing the Witness generic.

@uncomputable
Copy link
Collaborator Author

Your understanding of the node types is pretty spot-on: CommitNode is for the time of commitment and Node is for the time of redemption. Renaming Node to RedeemNode would make its purpose more explicit.

  • CommitNode must enable hidden nodes, because programs are first decoded as CommitNode and then finalized as Node. If the program is pruned, then it has hidden nodes.
  • CommitNode should not have witness data, like you said.
  • Node may have hidden nodes, just like CommitNode and like you said.
  • Node must have witness data because that is used to compute the IMR. There is no definition of an IMR without witness data and these Merkle roots cannot be changed later, so the witness data must be there from the beginning.

@apoelstra
Copy link
Collaborator

What do you mean by redoing type checking for nodes "that are affected by Witness nodes"? Isn't the value type of a witness node equal to its node type?

You're right, sorry, I was confusing witness addition with pruning. But as you say, the Node/CommitNode distinction has nothing to do with pruning, so we don't need to worry about this. (Later on, when we support pruning (and unpruning?) we'll have to think about this.)

@uncomputable
Copy link
Collaborator Author

Pruning will be fun. As far as I have thought it through, we construct a new Node (tree) as we execute a given Node on the Bit Machine. We replace case with assertl and assertr as appropriate. The IMRs may change in the process. The new Node goes into sharing as it is encoded into bits.

@apoelstra apoelstra merged commit 8f3ad9e into BlockstreamResearch:master Aug 26, 2022
@uncomputable uncomputable deleted the nodes branch August 29, 2022 18:56
@uncomputable uncomputable mentioned this pull request Aug 30, 2022
uncomputable added a commit that referenced this pull request Sep 6, 2022
acea6e8 Rename rtt_natural (Christian Lewe)
9b327d5 Add fuzz test for decoding of witness (Christian Lewe)
a9bf0f1 Add fuzz test for decoding of programs (Christian Lewe)
15bfbfb Re-enable test programs (Christian Lewe)
d61e946 Add Debug and Display methods (Christian Lewe)
db0f762 Export de/encoding during fuzzing only (Christian Lewe)
6b6cec1 Add encoding of programs without witness (Christian Lewe)
3bdee2a Adapt encoding (Christian Lewe)
2ff57bd Add powers of two to Type (Christian Lewe)
bbad34e Fix decoding (Christian Lewe)

Pull request description:

  This PR adapts the encoding of programs to the new heap structures from #35. Importantly, programs without witness data can now be encoded and compressed via sharing. This works by populating each witness node with a `Unit` value during encoding, and requires that witness nodes are made fresh during decoding _(i.e., each witness node has at most one parent)_. Fuzz and unit tests that were disabled in #35 are adapted and re-enabled. An implementation of `Display` is added to programs for easier debugging.

ACKs for top commit:
  apoelstra:
    ACK acea6e8

Tree-SHA512: 227151d510e2599fdc2c6b6e626a53d22d46911f14813f731b6b6b03c8ce0af9794fba81595f5d143ab694855ece2da8e357878222ddaf678754f66232f58909
@uncomputable uncomputable mentioned this pull request Sep 6, 2022
uncomputable added a commit that referenced this pull request Sep 7, 2022
26fd68b Export node types at crate level (Christian Lewe)
46b9a42 Fix cargo doc (Christian Lewe)
5221ae1 Add documentation of RefWrapper's (Christian Lewe)
57a5891 Remove Witness generic (Christian Lewe)
967c737 Cargo fmt (Christian Lewe)
6df4111 Rename Node to RedeemNode (Christian Lewe)

Pull request description:

  This PR is a follow-up of #35 and fixes various things that came up in the discussion. `Node` is renamed to `RedeemNode` for more clarity. The `Witness` generic is removed so that `CommitNode` contains no witness data and `RedeemNode` contains `Value`. The node types are exported at crate level. Finally, documentation is improved.

ACKs for top commit:
  apoelstra:
    ACK 26fd68b
  sanket1729:
    ACK 26fd68b

Tree-SHA512: 5aa1855818af4c651a3e4541c9a68cabea8537bd75d80d891a29680589f3caba4ce6b8ce7406111be0ad39292bd82f1f08b59478cac9fffac341449a129ff3ab
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants