-
Notifications
You must be signed in to change notification settings - Fork 13.5k
[lldb/DWARF] Respect member layout for types parsed through declarations #110648
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
LLDB code for using the type layout data from DWARF was not kicking in for types which were initially parsed from a declaration. The problem was in these lines of code: ``` if (type) layout_info.bit_size = type->GetByteSize(nullptr).value_or(0) * 8; ``` which determine the types layout size by getting the size from the lldb_private::Type object. The problem is that if the type object does not have this information cached, this request can trigger another (recursive) request to lay out/complete the type. This one, somewhat surprisingly, succeeds, but does that without the type layout information (because it hasn't been computed yet). The reasons why this hasn't been noticed so far are: - this is a relatively new bug. I haven't checked but I suspect it was introduced in the "delay type definition search" patchset from this summer -- if we search for the definition eagerly, we will always have a cached size value. - it requires the presence of another bug/issue, as otherwise the automatically computed layout will match the real thing. - it reproduces (much) more easily with -flimit-debug-info (though it is possible to trigger it without that flag). My fix consists of always fetching type size information from DWARF (which so far existed as a fallback path). I'm not quite sure why this code was there in the first place (the code goes back to before the Great LLDB Reformat), but I don't believe it is necessary, as the type size (for types parsed from definition DIEs) is set exactly from this attribute (in ParseStructureLikeDIE).
@@ -251,6 +253,7 @@ c1: | |||
.LB1: | |||
.byte 6 # Abbrev [6] DW_TAG_class_type | |||
.asciz "B1" # DW_AT_name | |||
.byte 8 # DW_AT_byte_size |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adding this was necessary because lldb crashed without it (in llvm record layout code). Removal of the check was hiding this problem by recomputing the size without lldb's input. While this is something that we may to fix, I don't think we're likely to run into this situation (structure definition without a size attribute) outside of hyper-reduced test cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems reasonable to me, given we end up here only for definition DIEs. (did have couple of questions though)
My fix consists of always fetching type size information from DWARF (which so far existed as a fallback path). I'm not quite sure why this code was there in the first place (the code goes back to before the Great LLDB Reformat), but I don't believe it is necessary, as the type size (for types parsed from definition DIEs) is set exactly from this attribute (in ParseStructureLikeDIE).
Was it possibly protecting against cases where we generate structure definitions without a size attribute? (as you allude to in your test comment)? Reporting back a ExternalLayout::Size
of 0
back to the RecordLayoutBuilder
seems problematic. If Alignment == 0
, the RecordLayoutBuilder
knows to infer it. But we don't have that for Size
, so it just takes it at face-value. I don't know if that's something we should account for. Probably not?
Side-note, should we also check DW_AT_bit_size
here? The DWARFv5 spec says:
A structure type, union type or class type entry may have either a
DW_AT_byte_size or a DW_AT_bit_size attribute (see Section 2.21 on page 56),
whose value is the amount of storage needed to hold an instance of the structure,
union or class type, including any padding.
The problem is that if the type object does not have this information cached, this request can trigger another (recursive) request to lay out/complete the type. This one, somewhat surprisingly, succeeds, but does that without the type layout information (because it hasn't been computed yet)
That sounds worrying. Wonder if we do this anywhere else around the DWARF AST parser. And is there any safeguarding against these recursive calls?
I can't disprove that, but I think it would be a very roundabout way of achieving that. To support the scenario of missing DW_AT_byte_size, I'd probably do something like set the size field to
In principle I would say that we should, but I think this problem is in the same bucket as compiler putting a nonsensical (small) value in the DW_AT_byte_size field. In theory that should not crash lldb, but we're not likely to run into an actual compiler that does that.
We could do that, but then we'd also need to do that in ParseStructureLikeDIE. And we may need to figure out what does it mean to occupy a non-whole number of bytes, so I'd say that's orthogonal to this.
We have a bunch of calls like that, but I don't think they're a problem, as they would just trigger a regular type completion, compute the dwarf-guided layout and then get the size from that. The problem is really specific to this place in the code, where we've already completed the clang type, but we haven't created the layout for it. |
Yea I agree, I think this makes things easier to reason about. And if the "no-DW_AT_byte_size on definition" case comes up, that can be addressed differently.
Yup, since it wasn't there before, happy to defer it.
Ack |
…ons (llvm#110648) LLDB code for using the type layout data from DWARF was not kicking in for types which were initially parsed from a declaration. The problem was in these lines of code: ``` if (type) layout_info.bit_size = type->GetByteSize(nullptr).value_or(0) * 8; ``` which determine the types layout size by getting the size from the lldb_private::Type object. The problem is that if the type object does not have this information cached, this request can trigger another (recursive) request to lay out/complete the type. This one, somewhat surprisingly, succeeds, but does that without the type layout information (because it hasn't been computed yet). The reasons why this hasn't been noticed so far are: - this is a relatively new bug. I haven't checked but I suspect it was introduced in the "delay type definition search" patchset from this summer -- if we search for the definition eagerly, we will always have a cached size value. - it requires the presence of another bug/issue, as otherwise the automatically computed layout will match the real thing. - it reproduces (much) more easily with -flimit-debug-info (though it is possible to trigger it without that flag). My fix consists of always fetching type size information from DWARF (which so far existed as a fallback path). I'm not quite sure why this code was there in the first place (the code goes back to before the Great LLDB Reformat), but I don't believe it is necessary, as the type size (for types parsed from definition DIEs) is set exactly from this attribute (in ParseStructureLikeDIE).
…ons (llvm#110648) LLDB code for using the type layout data from DWARF was not kicking in for types which were initially parsed from a declaration. The problem was in these lines of code: ``` if (type) layout_info.bit_size = type->GetByteSize(nullptr).value_or(0) * 8; ``` which determine the types layout size by getting the size from the lldb_private::Type object. The problem is that if the type object does not have this information cached, this request can trigger another (recursive) request to lay out/complete the type. This one, somewhat surprisingly, succeeds, but does that without the type layout information (because it hasn't been computed yet). The reasons why this hasn't been noticed so far are: - this is a relatively new bug. I haven't checked but I suspect it was introduced in the "delay type definition search" patchset from this summer -- if we search for the definition eagerly, we will always have a cached size value. - it requires the presence of another bug/issue, as otherwise the automatically computed layout will match the real thing. - it reproduces (much) more easily with -flimit-debug-info (though it is possible to trigger it without that flag). My fix consists of always fetching type size information from DWARF (which so far existed as a fallback path). I'm not quite sure why this code was there in the first place (the code goes back to before the Great LLDB Reformat), but I don't believe it is necessary, as the type size (for types parsed from definition DIEs) is set exactly from this attribute (in ParseStructureLikeDIE). (cherry picked from commit 15f9020)
…-20240723 [lldb/DWARF] Respect member layout for types parsed through declarations (llvm#110648)
…ons (llvm#110648) LLDB code for using the type layout data from DWARF was not kicking in for types which were initially parsed from a declaration. The problem was in these lines of code: ``` if (type) layout_info.bit_size = type->GetByteSize(nullptr).value_or(0) * 8; ``` which determine the types layout size by getting the size from the lldb_private::Type object. The problem is that if the type object does not have this information cached, this request can trigger another (recursive) request to lay out/complete the type. This one, somewhat surprisingly, succeeds, but does that without the type layout information (because it hasn't been computed yet). The reasons why this hasn't been noticed so far are: - this is a relatively new bug. I haven't checked but I suspect it was introduced in the "delay type definition search" patchset from this summer -- if we search for the definition eagerly, we will always have a cached size value. - it requires the presence of another bug/issue, as otherwise the automatically computed layout will match the real thing. - it reproduces (much) more easily with -flimit-debug-info (though it is possible to trigger it without that flag). My fix consists of always fetching type size information from DWARF (which so far existed as a fallback path). I'm not quite sure why this code was there in the first place (the code goes back to before the Great LLDB Reformat), but I don't believe it is necessary, as the type size (for types parsed from definition DIEs) is set exactly from this attribute (in ParseStructureLikeDIE). (cherry picked from commit 15f9020)
…-20240723 [lldb/DWARF] Respect member layout for types parsed through declarations (llvm#110648)
…ecordType Became unused since the recent llvm#110648
…ecordType (llvm#120456) Became unused since the recent llvm#110648 (cherry picked from commit 6fd267d)
…o CompleteRecordType (#120456) Became unused since the recent llvm/llvm-project#110648
LLDB code for using the type layout data from DWARF was not kicking in for types which were initially parsed from a declaration. The problem was in these lines of code:
which determine the types layout size by getting the size from the lldb_private::Type object. The problem is that if the type object does not have this information cached, this request can trigger another (recursive) request to lay out/complete the type. This one, somewhat surprisingly, succeeds, but does that without the type layout information (because it hasn't been computed yet). The reasons why this hasn't been noticed so far are:
My fix consists of always fetching type size information from DWARF (which so far existed as a fallback path). I'm not quite sure why this code was there in the first place (the code goes back to before the Great LLDB Reformat), but I don't believe it is necessary, as the type size (for types parsed from definition DIEs) is set exactly from this attribute (in ParseStructureLikeDIE).