-
Notifications
You must be signed in to change notification settings - Fork 405
Trivial Serialization Tweaks #1956
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Trivial Serialization Tweaks #1956
Conversation
An enum implements de/serialization via `impl_writeable_tlv_based_enum_upgradable` currently cannot contain an object that only implements `MaybeReadable`. This solves that by implementing the required blocks for `ignorable`, opting to return `Ok(None)` in the top-level read in cases where the inner field read returns `Ok(None)`.
...to make it easier to add new implementations and implement it for all tuples which implement `Readabe` + `Writeable`. Note that we don't want to just convert to a blanket implementation as we'd really like to keep our optimized `Vec<u8>` wrapper or we'll end up spinning way too much when writing vecs of bytes.
Codecov ReportBase: 90.72% // Head: 90.75% // Increases project coverage by
Additional details and impacted files@@ Coverage Diff @@
## main #1956 +/- ##
==========================================
+ Coverage 90.72% 90.75% +0.03%
==========================================
Files 97 97
Lines 50579 51356 +777
Branches 50579 51356 +777
==========================================
+ Hits 45888 46610 +722
- Misses 4691 4746 +55
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. ☔ View full report at Codecov. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ACK after Arik's comment
@@ -581,6 +587,9 @@ macro_rules! _init_tlv_based_struct_field { | |||
($field: ident, option) => { | |||
$field | |||
}; | |||
($field: ident, ignorable) => { | |||
if $field.is_none() { return Ok(None); } else { $field.unwrap() } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No test coverage here, I assume it's covered in follow-up
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yea, I can drop it if you prefer but it is used in #1897
lightning/src/util/ser.rs
Outdated
/// more than 65K entries, we need to be able to store larger values. Thus, we define a variable | ||
/// length integer here which is backwards-compatible but treats 0xffff as "read eight more bytes". | ||
struct U16Or64(pub u64); | ||
impl Writeable for U16Or64 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This kinda looks like U160r64
. Would prefer to name this either CollectionLength
or at least U16OrU64
which is a little easier to read.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Strongly agree with Jeff. I only ever read it as 160r64 and didn't realize there was an Or
in there until Jeff's comment. I wasn't sure what either 160
or r
meant, but didn't even think to ask. The O
of Or
remains rather unfortunate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lol I had to Google it before realizing it was not a zero. Not sure it's worth putting the implementation details into the name, tbh.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Heh, alright, well y'all need to get better terminal fonts, but in the mean time I'll rename it :p
lightning/src/util/ser.rs
Outdated
@@ -694,18 +723,23 @@ macro_rules! impl_for_vec { | |||
impl Writeable for Vec<u8> { | |||
#[inline] | |||
fn write<W: Writer>(&self, w: &mut W) -> Result<(), io::Error> { | |||
(self.len() as u16).write(w)?; | |||
U16Or64(self.len() as u64).write(w)?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we be concerned with writing an incompatible length for an LN message? Or are we leaving that to the caller to enforce / assume will never happen?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yea, I mean we could add some assertion, but its currently up to the caller anyway (we as
), so it remains up to the caller. If there's something with more than 2^16 elements it won't fit in the wire 2^16-byte messages anyway, though.
0xffffu16.write(writer)?; | ||
(self.0 - 0xffff).write(writer) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's unlikely, but I don't really see how this is backwards compatible if we write more than 65k entries, seems like we could still fail to deserialize on previous versions. Maybe there's no specific map this is possible for, though.
#[inline] | ||
fn read<R: Read>(r: &mut R) -> Result<Self, DecodeError> { | ||
let mut val: u64 = <u16 as Readable>::read(r)? as u64; | ||
if val == 0xffff { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be clear, if a previous version happened to write 0xffff
exactly, we could fail to deserialize now?
Regarding the 64k-entries stuff - yes, this is technically not backwards compatible for exactly |
LGTM after squash |
lightning/src/util/ser.rs
Outdated
/// The lightning protocol uses u16s for lengths in most cases. As our serialization framework | ||
/// primarily targets that, we must as well. However, because we may serialize objects that have | ||
/// more than 65K entries, we need to be able to store larger values. Thus, we define a variable | ||
/// length integer here which is backwards-compatible but treats 0xffff as "read eight more bytes". |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/which is backwards-compatible/that is backwards-compatible for < 0xffff
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note the last part that says it's backwards compatible only for < 0xffff.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think if somebody is at exactly 0xffff, they're about to run into problems anyway.
b439740
to
d0c8b14
Compare
Squashed, diff since the initial PR: $ git diff-tree -U1 abbd452fae0956400c0512ea87b15b4fc3e44ef1 d0c8b14c
diff --git a/lightning/src/util/ser.rs b/lightning/src/util/ser.rs
index 8d4ab9f2f..ff5e7c923 100644
--- a/lightning/src/util/ser.rs
+++ b/lightning/src/util/ser.rs
@@ -388,5 +388,9 @@ impl Readable for BigSize {
/// more than 65K entries, we need to be able to store larger values. Thus, we define a variable
-/// length integer here which is backwards-compatible but treats 0xffff as "read eight more bytes".
-struct U16Or64(pub u64);
-impl Writeable for U16Or64 {
+/// length integer here that is backwards-compatible but treats 0xffff as "read eight more bytes".
+///
+/// To ensure we only have one valid encoding per value, we add 0xffff to values written as eight
+/// bytes. Thus, 0xfffe is serialized as 0xfffe, whereas 0xffff is serialized as
+/// 0xffff0000000000000000 (i.e. read-eight-bytes then zero).
+struct U16OrU64(pub u64);
+impl Writeable for U16OrU64 {
#[inline]
@@ -397,3 +401,3 @@ impl Writeable for U16Or64 {
0xffffu16.write(writer)?;
- (self.0 - 0xfffe).write(writer)
+ (self.0 - 0xffff).write(writer)
}
@@ -402,3 +406,3 @@ impl Writeable for U16Or64 {
-impl Readable for U16Or64 {
+impl Readable for U16OrU64 {
#[inline]
@@ -408,5 +412,5 @@ impl Readable for U16Or64 {
val = <u64 as Readable>::read(r)?
- .checked_add(0xfffe).ok_or(DecodeError::InvalidValue)?;
+ .checked_add(0xffff).ok_or(DecodeError::InvalidValue)?;
}
- Ok(U16Or64(val))
+ Ok(U16OrU64(val))
}
@@ -628,3 +632,3 @@ macro_rules! impl_for_map {
fn write<W: Writer>(&self, w: &mut W) -> Result<(), io::Error> {
- U16Or64(self.len() as u64).write(w)?;
+ U16OrU64(self.len() as u64).write(w)?;
for (key, value) in self.iter() {
@@ -642,3 +646,3 @@ macro_rules! impl_for_map {
fn read<R: Read>(r: &mut R) -> Result<Self, DecodeError> {
- let len: U16Or64 = Readable::read(r)?;
+ let len: U16OrU64 = Readable::read(r)?;
let mut ret = $constr(len.0 as usize);
@@ -668,3 +672,3 @@ where T: Writeable + Eq + Hash
fn write<W: Writer>(&self, w: &mut W) -> Result<(), io::Error> {
- U16Or64(self.len() as u64).write(w)?;
+ U16OrU64(self.len() as u64).write(w)?;
for item in self.iter() {
@@ -681,3 +685,3 @@ where T: Readable + Eq + Hash
fn read<R: Read>(r: &mut R) -> Result<Self, DecodeError> {
- let len: U16Or64 = Readable::read(r)?;
+ let len: U16OrU64 = Readable::read(r)?;
let mut ret = HashSet::with_capacity(cmp::min(len.0 as usize, MAX_BUF_SIZE / core::mem::size_of::<T>()));
@@ -698,3 +702,3 @@ macro_rules! impl_for_vec {
fn write<W: Writer>(&self, w: &mut W) -> Result<(), io::Error> {
- U16Or64(self.len() as u64).write(w)?;
+ U16OrU64(self.len() as u64).write(w)?;
for elem in self.iter() {
@@ -709,3 +713,3 @@ macro_rules! impl_for_vec {
fn read<R: Read>(r: &mut R) -> Result<Self, DecodeError> {
- let len: U16Or64 = Readable::read(r)?;
+ let len: U16OrU64 = Readable::read(r)?;
let mut ret = Vec::with_capacity(cmp::min(len.0 as usize, MAX_BUF_SIZE / core::mem::size_of::<$ty>()));
@@ -725,3 +729,3 @@ impl Writeable for Vec<u8> {
fn write<W: Writer>(&self, w: &mut W) -> Result<(), io::Error> {
- U16Or64(self.len() as u64).write(w)?;
+ U16OrU64(self.len() as u64).write(w)?;
w.write_all(&self)
@@ -733,3 +737,3 @@ impl Readable for Vec<u8> {
fn read<R: Read>(r: &mut R) -> Result<Self, DecodeError> {
- let mut len: U16Or64 = Readable::read(r)?;
+ let mut len: U16OrU64 = Readable::read(r)?;
let mut ret = Vec::new();
@@ -1076,3 +1080,3 @@ impl Writeable for String {
fn write<W: Writer>(&self, w: &mut W) -> Result<(), io::Error> {
- U16Or64(self.len() as u64).write(w)?;
+ U16OrU64(self.len() as u64).write(w)?;
w.write_all(self.as_bytes()) |
Actually, pushed another rename to align with Jeff/Arik's push above. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I blame Github's and my diff tool's fonts.
The lightning protocol uses u16s for lengths in most cases. As our serialization framework primarily targets that, we must as well. However, because we may serialize objects that have more than 65K entries, we want to be able to store larger values. Thus, we define a variable length integer here which is backwards-compatible but treats 0xffff as "read eight more bytes". This doesn't address any specific known issue, but feels like good practice just in case.
6270a8f
to
3e9727b
Compare
Squashed with no further changes. |
val = <u64 as Readable>::read(r)? | ||
.checked_add(0xffff).ok_or(DecodeError::InvalidValue)?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, wait, can we forgo the addition here and the subtraction in the write by simply ignoring the u16
on read (when val == 0xffff
) and writing the full u64
after 0xffff
in the write (when val >= 0xffff
)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can, yes, but then we have multiple ways to represent the same integers, which is both awkward, and makes the fuzzers angry, which would be a bit of work to fix and would reduce coverage on the gossip messages we need to round-trip exactly. I don't think the performance penalty here matters, we should really almost never be hitting the 8-byte cases.
These are the serialization prereqs for #1897.