-
Notifications
You must be signed in to change notification settings - Fork 13.6k
[LLDB] Add unary operators Dereference and AddressOf to DIL #134428
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@llvm/pr-subscribers-lldb Author: Ilia Kuklin (kuilpd) ChangesFull diff: https://github.com/llvm/llvm-project/pull/134428.diff 9 Files Affected:
diff --git a/lldb/docs/dil-expr-lang.ebnf b/lldb/docs/dil-expr-lang.ebnf
index 0bbbecbdc78c1..13b9cc22e6000 100644
--- a/lldb/docs/dil-expr-lang.ebnf
+++ b/lldb/docs/dil-expr-lang.ebnf
@@ -3,7 +3,9 @@
(* This is currently a subset of the final DIL Language, matching the current
DIL implementation. *)
-expression = primary_expression ;
+expression = unary_expression ;
+
+unary_expression = unary_operator primary_expression ;
primary_expression = id_expression
| "(" expression ")";
diff --git a/lldb/include/lldb/ValueObject/DILAST.h b/lldb/include/lldb/ValueObject/DILAST.h
index 05d87e9cc4b6b..323ebe8dd49ec 100644
--- a/lldb/include/lldb/ValueObject/DILAST.h
+++ b/lldb/include/lldb/ValueObject/DILAST.h
@@ -20,6 +20,13 @@ namespace lldb_private::dil {
enum class NodeKind {
eErrorNode,
eIdentifierNode,
+ eUnaryOpNode,
+};
+
+/// The Unary operators recognized by DIL.
+enum class UnaryOpKind {
+ AddrOf, // "&"
+ Deref, // "*"
};
/// Forward declaration, for use in DIL AST nodes. Definition is at the very
@@ -44,6 +51,8 @@ class ASTNode {
virtual llvm::Expected<lldb::ValueObjectSP> Accept(Visitor *v) const = 0;
+ virtual bool is_rvalue() const { return false; }
+
uint32_t GetLocation() const { return m_location; }
NodeKind GetKind() const { return m_kind; }
@@ -81,6 +90,27 @@ class IdentifierNode : public ASTNode {
std::string m_name;
};
+class UnaryOpNode : public ASTNode {
+public:
+ UnaryOpNode(uint32_t location, UnaryOpKind kind, ASTNodeUP rhs)
+ : ASTNode(location, NodeKind::eUnaryOpNode), m_kind(kind),
+ m_rhs(std::move(rhs)) {}
+
+ llvm::Expected<lldb::ValueObjectSP> Accept(Visitor *v) const override;
+ bool is_rvalue() const override { return m_kind != UnaryOpKind::Deref; }
+
+ UnaryOpKind kind() const { return m_kind; }
+ ASTNode *rhs() const { return m_rhs.get(); }
+
+ static bool classof(const ASTNode *node) {
+ return node->GetKind() == NodeKind::eUnaryOpNode;
+ }
+
+private:
+ UnaryOpKind m_kind;
+ ASTNodeUP m_rhs;
+};
+
/// This class contains one Visit method for each specialized type of
/// DIL AST node. The Visit methods are used to dispatch a DIL AST node to
/// the correct function in the DIL expression evaluator for evaluating that
@@ -90,6 +120,8 @@ class Visitor {
virtual ~Visitor() = default;
virtual llvm::Expected<lldb::ValueObjectSP>
Visit(const IdentifierNode *node) = 0;
+ virtual llvm::Expected<lldb::ValueObjectSP>
+ Visit(const UnaryOpNode *node) = 0;
};
} // namespace lldb_private::dil
diff --git a/lldb/include/lldb/ValueObject/DILEval.h b/lldb/include/lldb/ValueObject/DILEval.h
index 335035d3f9248..0080f4dba9291 100644
--- a/lldb/include/lldb/ValueObject/DILEval.h
+++ b/lldb/include/lldb/ValueObject/DILEval.h
@@ -38,6 +38,18 @@ lldb::ValueObjectSP LookupGlobalIdentifier(llvm::StringRef name_ref,
lldb::DynamicValueType use_dynamic,
CompilerType *scope_ptr = nullptr);
+class FlowAnalysis {
+public:
+ FlowAnalysis(bool address_of_is_pending)
+ : m_address_of_is_pending(address_of_is_pending) {}
+
+ bool AddressOfIsPending() const { return m_address_of_is_pending; }
+ void DiscardAddressOf() { m_address_of_is_pending = false; }
+
+private:
+ bool m_address_of_is_pending;
+};
+
class Interpreter : Visitor {
public:
Interpreter(lldb::TargetSP target, llvm::StringRef expr,
@@ -47,12 +59,29 @@ class Interpreter : Visitor {
llvm::Expected<lldb::ValueObjectSP> Evaluate(const ASTNode *node);
private:
+ llvm::Expected<lldb::ValueObjectSP>
+ EvaluateNode(const ASTNode *node, FlowAnalysis *flow = nullptr);
+
llvm::Expected<lldb::ValueObjectSP>
Visit(const IdentifierNode *node) override;
+ llvm::Expected<lldb::ValueObjectSP> Visit(const UnaryOpNode *node) override;
+
+ lldb::ValueObjectSP EvaluateDereference(lldb::ValueObjectSP rhs);
+
+ FlowAnalysis *flow_analysis() { return m_flow_analysis_chain.back(); }
// Used by the interpreter to create objects, perform casts, etc.
lldb::TargetSP m_target;
llvm::StringRef m_expr;
+ // Flow analysis chain represents the expression evaluation flow for the
+ // current code branch. Each node in the chain corresponds to an AST node,
+ // describing the semantics of the evaluation for it. Currently, flow analysis
+ // propagates the information about the pending address-of operator, so that
+ // combination of address-of and a subsequent dereference can be eliminated.
+ // End of the chain (i.e. `back()`) contains the flow analysis instance for
+ // the current node. It may be `nullptr` if no relevant information is
+ // available, the caller/user is supposed to check.
+ std::vector<FlowAnalysis *> m_flow_analysis_chain;
lldb::ValueObjectSP m_scope;
lldb::DynamicValueType m_default_dynamic;
std::shared_ptr<StackFrame> m_exe_ctx_scope;
diff --git a/lldb/include/lldb/ValueObject/DILLexer.h b/lldb/include/lldb/ValueObject/DILLexer.h
index d15fc382d1623..3508b8b7a85c6 100644
--- a/lldb/include/lldb/ValueObject/DILLexer.h
+++ b/lldb/include/lldb/ValueObject/DILLexer.h
@@ -24,11 +24,13 @@ namespace lldb_private::dil {
class Token {
public:
enum Kind {
+ amp,
coloncolon,
eof,
identifier,
l_paren,
r_paren,
+ star,
};
Token(Kind kind, std::string spelling, uint32_t start)
diff --git a/lldb/include/lldb/ValueObject/DILParser.h b/lldb/include/lldb/ValueObject/DILParser.h
index 9b7a6cd487939..9f09994a54729 100644
--- a/lldb/include/lldb/ValueObject/DILParser.h
+++ b/lldb/include/lldb/ValueObject/DILParser.h
@@ -83,6 +83,7 @@ class DILParser {
ASTNodeUP Run();
ASTNodeUP ParseExpression();
+ ASTNodeUP ParseUnaryExpression();
ASTNodeUP ParsePrimaryExpression();
std::string ParseNestedNameSpecifier();
diff --git a/lldb/source/ValueObject/DILAST.cpp b/lldb/source/ValueObject/DILAST.cpp
index e75958d784627..ea847587501ee 100644
--- a/lldb/source/ValueObject/DILAST.cpp
+++ b/lldb/source/ValueObject/DILAST.cpp
@@ -19,4 +19,8 @@ llvm::Expected<lldb::ValueObjectSP> IdentifierNode::Accept(Visitor *v) const {
return v->Visit(this);
}
+llvm::Expected<lldb::ValueObjectSP> UnaryOpNode::Accept(Visitor *v) const {
+ return v->Visit(this);
+}
+
} // namespace lldb_private::dil
diff --git a/lldb/source/ValueObject/DILEval.cpp b/lldb/source/ValueObject/DILEval.cpp
index 4889834c7a3c1..b6aa349a62e62 100644
--- a/lldb/source/ValueObject/DILEval.cpp
+++ b/lldb/source/ValueObject/DILEval.cpp
@@ -18,6 +18,22 @@
namespace lldb_private::dil {
+static lldb::ValueObjectSP
+ArrayToPointerConversion(lldb::ValueObjectSP valobj,
+ std::shared_ptr<ExecutionContextScope> ctx) {
+ assert(valobj->IsArrayType() &&
+ "an argument to array-to-pointer conversion must be an array");
+
+ uint64_t addr = valobj->GetLoadAddress();
+ llvm::StringRef name = "result";
+ ExecutionContext exe_ctx;
+ ctx->CalculateExecutionContext(exe_ctx);
+ return ValueObject::CreateValueObjectFromAddress(
+ name, addr, exe_ctx,
+ valobj->GetCompilerType().GetArrayElementType(ctx.get()).GetPointerType(),
+ /* do_deref */ false);
+}
+
static lldb::ValueObjectSP LookupStaticIdentifier(
VariableList &variable_list, std::shared_ptr<StackFrame> exe_scope,
llvm::StringRef name_ref, llvm::StringRef unqualified_name) {
@@ -206,10 +222,25 @@ Interpreter::Interpreter(lldb::TargetSP target, llvm::StringRef expr,
: m_target(std::move(target)), m_expr(expr), m_default_dynamic(use_dynamic),
m_exe_ctx_scope(frame_sp) {}
-llvm::Expected<lldb::ValueObjectSP> Interpreter::Evaluate(const ASTNode *node) {
+llvm::Expected<lldb::ValueObjectSP> Interpreter::Evaluate(const ASTNode *tree) {
+ // Evaluate an AST.
+ auto value_or_error = EvaluateNode(tree);
+
+ // Return the computed result-or-error.
+ return value_or_error;
+}
+llvm::Expected<lldb::ValueObjectSP>
+Interpreter::EvaluateNode(const ASTNode *node, FlowAnalysis *flow) {
+ // Set up the evaluation context for the current node.
+ m_flow_analysis_chain.push_back(flow);
// Traverse an AST pointed by the `node`.
- return node->Accept(this);
+ auto value_or_error = node->Accept(this);
+ // Cleanup the context.
+ m_flow_analysis_chain.pop_back();
+ // Return the computed value-or-error. The caller is responsible for
+ // checking if an error occured during the evaluation.
+ return value_or_error;
}
llvm::Expected<lldb::ValueObjectSP>
@@ -232,4 +263,106 @@ Interpreter::Visit(const IdentifierNode *node) {
return identifier;
}
-} // namespace lldb_private::dil
+llvm::Expected<lldb::ValueObjectSP>
+Interpreter::Visit(const UnaryOpNode *node) {
+ FlowAnalysis rhs_flow(
+ /* address_of_is_pending */ node->kind() == UnaryOpKind::AddrOf);
+
+ Status error;
+ auto rhs_or_err = EvaluateNode(node->rhs(), &rhs_flow);
+ if (!rhs_or_err) {
+ return rhs_or_err;
+ }
+ lldb::ValueObjectSP rhs = *rhs_or_err;
+
+ if (rhs->GetCompilerType().IsReferenceType()) {
+ rhs = rhs->Dereference(error);
+ if (error.Fail()) {
+ return llvm::make_error<DILDiagnosticError>(m_expr, error.AsCString(),
+ node->GetLocation(), 1);
+ }
+ }
+ CompilerType rhs_type = rhs->GetCompilerType();
+ switch (node->kind()) {
+ case UnaryOpKind::Deref: {
+ if (rhs_type.IsArrayType())
+ rhs = ArrayToPointerConversion(rhs, m_exe_ctx_scope);
+
+ lldb::ValueObjectSP dynamic_rhs = rhs->GetDynamicValue(m_default_dynamic);
+ if (dynamic_rhs)
+ rhs = dynamic_rhs;
+
+ if (rhs->GetCompilerType().IsPointerType())
+ return EvaluateDereference(rhs);
+ lldb::ValueObjectSP child_sp = rhs->Dereference(error);
+ if (error.Success())
+ rhs = child_sp;
+
+ return rhs;
+ }
+ case UnaryOpKind::AddrOf: {
+ if (node->rhs()->is_rvalue()) {
+ std::string errMsg =
+ llvm::formatv("cannot take the address of an rvalue of type {0}",
+ rhs_type.TypeDescription());
+ return llvm::make_error<DILDiagnosticError>(m_expr, errMsg,
+ node->GetLocation(), 1);
+ }
+ if (rhs->IsBitfield()) {
+ return llvm::make_error<DILDiagnosticError>(
+ m_expr, "address of bit-field requested", node->GetLocation(), 1);
+ }
+ // If the address-of operation wasn't cancelled during the evaluation of
+ // RHS (e.g. because of the address-of-a-dereference elision), apply it
+ // here.
+ if (rhs_flow.AddressOfIsPending()) {
+ Status error;
+ lldb::ValueObjectSP value = rhs->AddressOf(error);
+ if (error.Fail()) {
+ return llvm::make_error<DILDiagnosticError>(m_expr, error.AsCString(),
+ node->GetLocation(), 1);
+ }
+ return value;
+ }
+ return rhs;
+ }
+ }
+
+ // Unsupported/invalid operation.
+ return llvm::make_error<DILDiagnosticError>(
+ m_expr, "invalid ast: unexpected binary operator", node->GetLocation(),
+ 1);
+}
+
+lldb::ValueObjectSP Interpreter::EvaluateDereference(lldb::ValueObjectSP rhs) {
+ // If rhs is a reference, dereference it first.
+ Status error;
+ if (rhs->GetCompilerType().IsReferenceType())
+ rhs = rhs->Dereference(error);
+
+ assert(rhs->GetCompilerType().IsPointerType() &&
+ "invalid ast: must be a pointer type");
+
+ if (rhs->GetDerefValobj())
+ return rhs->GetDerefValobj()->GetSP();
+
+ CompilerType pointer_type = rhs->GetCompilerType();
+ lldb::addr_t base_addr = rhs->GetValueAsUnsigned(0);
+
+ llvm::StringRef name = "result";
+ ExecutionContext exe_ctx(m_target.get(), false);
+ lldb::ValueObjectSP value = ValueObject::CreateValueObjectFromAddress(
+ name, base_addr, exe_ctx, pointer_type,
+ /* do_deref */ false);
+
+ // If we're in the address-of context, skip the dereference and cancel the
+ // pending address-of operation as well.
+ if (flow_analysis() && flow_analysis()->AddressOfIsPending()) {
+ flow_analysis()->DiscardAddressOf();
+ return value;
+ }
+
+ return value->Dereference(error);
+}
+
+} // namespace lldb_private::dil
\ No newline at end of file
diff --git a/lldb/source/ValueObject/DILLexer.cpp b/lldb/source/ValueObject/DILLexer.cpp
index 1f013288c839b..b9c2e7971e3b4 100644
--- a/lldb/source/ValueObject/DILLexer.cpp
+++ b/lldb/source/ValueObject/DILLexer.cpp
@@ -19,6 +19,8 @@ namespace lldb_private::dil {
llvm::StringRef Token::GetTokenName(Kind kind) {
switch (kind) {
+ case Kind::amp:
+ return "amp";
case Kind::coloncolon:
return "coloncolon";
case Kind::eof:
@@ -29,6 +31,8 @@ llvm::StringRef Token::GetTokenName(Kind kind) {
return "l_paren";
case Kind::r_paren:
return "r_paren";
+ case Token::star:
+ return "star";
}
llvm_unreachable("Unknown token name");
}
@@ -82,9 +86,8 @@ llvm::Expected<Token> DILLexer::Lex(llvm::StringRef expr,
return Token(Token::identifier, maybe_word->str(), position);
constexpr std::pair<Token::Kind, const char *> operators[] = {
- {Token::l_paren, "("},
- {Token::r_paren, ")"},
- {Token::coloncolon, "::"},
+ {Token::amp, "&"}, {Token::coloncolon, "::"}, {Token::l_paren, "("},
+ {Token::r_paren, ")"}, {Token::star, "*"},
};
for (auto [kind, str] : operators) {
if (remainder.consume_front(str))
diff --git a/lldb/source/ValueObject/DILParser.cpp b/lldb/source/ValueObject/DILParser.cpp
index a8baba2c06e7a..c233c535c0355 100644
--- a/lldb/source/ValueObject/DILParser.cpp
+++ b/lldb/source/ValueObject/DILParser.cpp
@@ -82,7 +82,37 @@ ASTNodeUP DILParser::Run() {
// expression:
// primary_expression
//
-ASTNodeUP DILParser::ParseExpression() { return ParsePrimaryExpression(); }
+ASTNodeUP DILParser::ParseExpression() { return ParseUnaryExpression(); }
+
+// Parse an unary_expression.
+//
+// unary_expression:
+// unary_operator primary_expression
+//
+// unary_operator:
+// "&"
+// "*"
+//
+ASTNodeUP DILParser::ParseUnaryExpression() {
+ if (CurToken().IsOneOf({Token::amp, Token::star})) {
+ Token token = CurToken();
+ uint32_t loc = token.GetLocation();
+ m_dil_lexer.Advance();
+ auto rhs = ParsePrimaryExpression();
+ switch (token.GetKind()) {
+ case Token::star:
+ return std::make_unique<UnaryOpNode>(loc, UnaryOpKind::Deref,
+ std::move(rhs));
+ case Token::amp:
+ return std::make_unique<UnaryOpNode>(loc, UnaryOpKind::AddrOf,
+ std::move(rhs));
+
+ default:
+ llvm_unreachable("invalid token kind");
+ }
+ }
+ return ParsePrimaryExpression();
+}
// Parse a primary_expression.
//
|
2b85f5a
to
58c4f08
Compare
lldb/docs/dil-expr-lang.ebnf
Outdated
expression = primary_expression ; | ||
expression = unary_expression ; | ||
|
||
unary_expression = unary_operator primary_expression ; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need to add just plain primary_expression -- we're not forcing all expressions to have unary ops -- could still be plain variable names.
I see this is just a draft, but in order to save time for everyone, I want to say this as early as possible. With this patch, the biggest question you will have to answer is "why is the implementation of I'm also not very convinced by this "FlowAnalysis" thingy. Is it supposed to be a performance optimization (collapsing |
✅ With the latest revision this PR passed the C/C++ code formatter. |
Trying to add unit tests with this patch. We already have a big test suite that came from I didn't find an existing way to build the binary for debugging using in-tree compiler and libc++, so had to improvise how to do it in CMake. The compiler options for building I took from Python testing. I think it needs some improvement on how to handle this when the required projects and runtimes are not enabled.
It's not, the implementation itself is one line, the rest is type and error checking. Same with dereference, only creating an address value first. The node itself is a base for other unary operations in the future.
I removed the dereferencing so that it matches the current frame var behavior. |
This is the reason. Long-term maintainability of the project. We already have (too) many different ways to build test binaries, so you need a very good reason to introduce another method, and frankly, I don't think you have one. If you don't build the test binaries with the same compiler as the rest of the tests, you will expose the test to all of the quirks and bugs of the host compiler, which means that this test will be testing a different thing for every user.
The fancy reference handling was my main concern, which you've now addressed. I haven't looked at the rest of the code too closely, so it may be fine, but in general, I think you shouldn't be looking at the type of the value in the evaluator: You already have the value in hand, so if the expression says you should dereference it, you should just ask the value to dereference itself. If it (and it's type system) knows how to dereference the value, it will give you a result. If not, you'll get an error. |
Is it a general consensus that people shouldn't use unit tests anymore?
This is what I tried to do, I looked at how Python builds test binaries and repeated the same commands in CMake. |
They shouldn't use unit tests for building debug subjects. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The fancy reference handling was my main concern, which you've now addressed. I haven't looked at the rest of the code too closely, so it may be fine, but in general, I think you shouldn't be looking at the type of the value in the evaluator: You already have the value in hand, so if the expression says you should dereference it, you should just ask the value to dereference itself. If it (and it's type system) knows how to dereference the value, it will give you a result. If not, you'll get an error.
I took a closer look at the implementation now, and I do think you're looking at the type of the value too much. More on that in the inline comments.
I'm also still waiting on the rationale behind the FlowAnalysis subsystem.
lldb/source/ValueObject/DILEval.cpp
Outdated
static lldb::ValueObjectSP | ||
ArrayToPointerConversion(lldb::ValueObjectSP valobj, | ||
std::shared_ptr<ExecutionContextScope> ctx) { | ||
assert(valobj->IsArrayType() && | ||
"an argument to array-to-pointer conversion must be an array"); | ||
|
||
uint64_t addr = valobj->GetLoadAddress(); | ||
llvm::StringRef name = "result"; | ||
ExecutionContext exe_ctx; | ||
ctx->CalculateExecutionContext(exe_ctx); | ||
return ValueObject::CreateValueObjectFromAddress( | ||
name, addr, exe_ctx, | ||
valobj->GetCompilerType().GetArrayElementType(ctx.get()).GetPointerType(), | ||
/* do_deref */ false); | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is an interesting question (and one that may deserve a PR of its own). For C, this makes sense, but I'm not sure if we want to do this for all languages. Also, the implementations assumes that an array is just a list of elements (i.e., that the address of the first array element is the same as the address of the array itself). This is true for C, but it may not be true for all languages that may want to support. I'm not sure what's the situation with Swift, but Go (which we don't support, but it makes a good point) has a first class "array slice" type, which is basically equivalent to llvm::ArrayRef
-- under the hood it's a pointer+length pair. I think it'd be reasonable for Go's type system to claim that this is an array type, but you definitely couldn't get to the first element of the slice by reinterpret_cast
ing the address of the slice.
Another problem here is that this ignores any data formatters the type may have. A user may have defined a synthetic child provider for the array type, and that provider may define (by providing a special $$dereference$$
child) what it wants to be the result "dereferencing" that object. By special-casing array types, you're subverting that behavior.
For these reasons, I think this logic should go into ValueObject::Dereference
. There, we could either:
- implement this logic directly (but without reinterpret_casting -- just do something roughly equivalent to
GetChildAtIndex(0)
) - or delegate the operation to the type system in some way, so that it can define what exactly (if anything) does "dereferencing an array" mean for the language in question
@jimingham, @adrian-prantl, do you concur, or have anything to add here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I concur. As a general rule, we want the evaluator to know as little as possible about how any operation is performed and delegate as much as possible to ValueObject methods.
The evaluator can't know anything about the language of the values it is operating on or it will either only work for C-backed ValueObjects or end up cluttered with "if C/C++/swift/go/future_language" code, which we really want to avoid. However, ValueObjects always know what type system their type comes from, and so they can naturally do the right thing for their language. So we should defer to them whenever we're doing any non-trivial operations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not quite sure here, how would we go about implementing something like array to pointer conversion for C only (or any other language specific thing)? Do we just add an if (Language::LanguageIsCFamily()
check ValueObject::Dereference
or is there a better way to delegate it to TypeSystemClang?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well.. that's why I said it's an interesting question :)
I don't think it should be a language check. It should be plumbed into the type system code.. somehow. Currently ValueObject::Dereference
calls GetCompilerType().GetChildCompilerTypeAtIndex
(which eventually calls into the type system). That's kind of a problem because the type system doesn't actually know that it's being asked to dereference. Nonetheless, after it returns, it does tell you whether the returned child is a dereference of the parent (via the child_is_deref_of_parent
by-ref result).
So, one option that might be acceptable (to me, others may disagree) would be to call this function on array types as well and then to check whether the result has the child_is_deref_of_parent
flag set (we'd also need to change the TypeSystemClang implementation so that it sets the flag for the first element of the array).
The alternative would be to create a new function that explicitly tells the type system that you're trying to dereference the value, and then it can do pretty much anything it wants.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any downside to the second option? It seems to me the general philosophy here is to have DIL do as little as possible and the type system as much as possible, since then we're asking the entity that actually knows what it is doing...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should I make a separate PR and merge this one as is?
As I understand, I should make a CompilerType::Dereference
function that calls TypeSystem::Dereference
that will be implemented in TypeSystemClang
? And then call that function from ValueObject::Dereference
instead of GetChildCompilerTypeAtIndex
, but without type checks beforehand.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that's definitely a job for a separate patch.
To answer Jim's question, I don't see a specific downside to the second option. Just some open questions. I don't exactly know what the interface of the new methods should be for instance (maybe it should not return a ValueObject since GetChildCompilerTypeAtIndex does not either). Ideally, I also wouldn't want too much code duplication between this and GetChildCompilerTypeAtIndex. But generally, yes, I think this would be better.
lldb/source/ValueObject/DILEval.cpp
Outdated
return rhs; | ||
} | ||
case UnaryOpKind::AddrOf: { | ||
if (node->rhs()->is_rvalue()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While I don't feel as strongly about this as about the array dereference thing, I do think that (and I hope you start to see a theme here) we shouldn't need to track the rvalue-ness of an expression/value separately from the ValueObject itself. I think that a ValueObject should know whether it's able to take the address of itself, because people may invoke AddressOf
directly. The current behavior of AddressOf doesn't make much sense to me -- it lets me chain AddressOf twice (but not three times!) -- but I think that's a bug that should be fixed inside that function, not worked around in the DIL (I'm not saying you should fix that here -- you can leave that to a separate patch, or even a separate person):
(lldb) script lldb.frame.FindVariable("y")
(int) y = 47
(lldb) script lldb.frame.FindVariable("y").AddressOf()
(int *) &y = 0x00007fffffffd7d0
(lldb) script lldb.frame.FindVariable("y").AddressOf().AddressOf()
(int **) &&y = 0x00007fffffffd7d0 # ???
(lldb) script lldb.frame.FindVariable("y").AddressOf().AddressOf().Dereference()
(int *) *&&y = 0x000000000000002f # :facepalm:
(lldb) script lldb.frame.FindVariable("y").AddressOf().AddressOf().AddressOf()
No value
(lldb) script lldb.frame.FindVariable("y").AddressOf().AddressOf().AddressOf().GetError()
error: error: invalid value object
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From what I could gather, the first one is ValueObject::AddressOf
that creates a ValueObjectConstResult
and fills in its address. The second one calls ValueObjectConstResultImpl::AddressOf
that also creates ValueObjectConstResult
but with no address this time (which makes sense for a const value), so the third one fails.
I'm guessing that the first ValueObjectConstResult
is pretty much an implementation of a reference, so it has to have an address behind it so it can be dereferenced normally.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can believe that's what happens now, but I don't think that's what should happen. I think my snipped above shows pretty clearly that the behavior of double AddressOf
is broken. Now, I claim that if we (you, I, someone) fix it do to something reasonable, one of these two things will happen:
- The second
AddressOf
will return an error. At that point the DIL can just return that error and be done. I think this is the likeliest scenario. - The second
AddressOf
will actually return something reasonable. I don't know what would that be, but I suppose it could happen. If that happens, then I think the DIL should return for&(&(foo))
exactly the same value as you'd get for a doubleAddressOf
call.
Notice that in both cases, the DIL implementation of &
is the same: call AddressOf
and return the result (be it an error or a valid object). And that's my point: the DIL shouldn't be second-guessing the implementation of AddressOf
(or any other ValueObject function). I also think that fixing this isn't a prerequisite for this patch. I'll happily approve a PR which produces the same (broken) behavior as what we have right now. What I don't want is to build an rvalue tracking system to work around a ValueObject bug.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I created #137688 to address this issue.
lldb/source/ValueObject/DILEval.cpp
Outdated
if (rhs->IsBitfield()) { | ||
return llvm::make_error<DILDiagnosticError>( | ||
m_expr, "address of bit-field requested", node->GetLocation()); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens if you call AddressOf
on a bitfield?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ValueObject::GetAddressOf
returns LLDB_INVALID_ADDRESS
for a bitfield without an error, so this check is to provide a meaningful error message.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The error object (in the by-ref return value) will contain the string <variable> doesn't have a valid address
which, while slightly vague, is not all that bad. And current frame var does show it:
(lldb) v a.y
(int:4) a.y = 7
(lldb) v &a.y
error: 'a.y' doesn't have a valid address
If you want to make the error more specific, I think you should do it inside AddressOf
so that the improvements are available to all users.
lldb/source/ValueObject/DILEval.cpp
Outdated
assert(rhs->GetCompilerType().IsPointerType() && | ||
"invalid ast: must be a pointer type"); | ||
|
||
if (rhs->GetDerefValobj()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why doesn't this call Dereference()
?
GetDerefValobj
is currently unused. It was added in #87197, but without any context. Such a function might be useful in some situations, but I don't think this is one of them. You really should let the ValueObject tell you what is the result of dereferencing it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This checks if the value has already been dereferenced. The same check is also done in the beginning of Dereference()
, but if we want to skip CreateValueObjectFromAddress
as well, this is done separately beforehand.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, but that's exactly my problem with it. I don't see why the behavior of the DIL should depened on whether somebody has called Dereference()
on this value before. If Dereference()
returns a value (regardless of whether it has computed it just now, or if it returned a cached result), you should use that. If not, then maybe (and that's a very big MAYBE) you can do something on top of that, but I don't think the behavior should depend on whether the dereferenced object is cached or not.
lldb/source/ValueObject/DILEval.cpp
Outdated
ExecutionContext exe_ctx(m_target.get(), false); | ||
lldb::ValueObjectSP value = ValueObject::CreateValueObjectFromAddress( | ||
name, base_addr, exe_ctx, pointer_type, | ||
/* do_deref */ false); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the comment style used in llvm
/* do_deref */ false); | |
/*do_deref=*/false); |
lldb/source/ValueObject/DILEval.cpp
Outdated
return value->Dereference(error); | ||
} | ||
|
||
} // namespace lldb_private::dil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
missing newline
|
||
private: | ||
UnaryOpKind m_kind; | ||
ASTNodeUP m_rhs; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's kinda weird to have the right side of a unary expression (and it would get particularly weird if we end up having a postfix ++
operator). Clang calls this Val
or SubExpr
, so maybe call this m_(sub_)expr?
FlowAnalysis is also used in subscript operator, to short-circuit getting an address of an element: |
I see. So, I agree I would really want to suggest that you take out this feature and implement dereference/addressof in the simplest possible terms. Once we have all of the pieces in place (*), we can take a look the performance of the code and see if anything needs optimizing. It may turn out that that we can optimize something else so that (*) My goal is to replace/delete the old "frame var" implementation as soon as possible. Until we do that, every line here is just adding technical debt. Fancy addressof handling is not required to do that. What we need is (simple) implementations of operations that are supported currently. Besides |
As far as I remember flow analysis was implemented because according to C rules I could see how DIL may not want to care about this and just do dereference and return error if the pointer is invalid. I had to implement it in lldb-eval because we did have some user expressions that relied on this fact :) |
…reference` error, cleanup code.
@labath |
Yes, I think that would be best, although if you structure the error message like the current command, it may not be even necessary to have a dedicated error message for that. This is what I get with the current implementation, and I think it's quite okay:
|
No description provided.