[OpenMP][test] Define print_possible_return_addresses on SPARC #138523

Merged
8 changes: 8 additions & 0 deletions openmp/runtime/test/ompt/callback.h
@@ -311,6 +311,14 @@ ompt_label_##id:
printf("%" PRIu64 ": current_address=%p or %p or %p\n", \
ompt_get_thread_data()->value, ((char *)addr) - 2, \
((char *)addr) - 8, ((char *)addr) - 12)
#elif KMP_ARCH_SPARC
// FIXME: Need to distinguish between 32 and 64-bit SPARC?
// On SPARC the NOP instruction is 4 bytes long.
// FIXME: Explain. Can use __builtin_frob_return_addr?
#define print_possible_return_addresses(addr) \
Contributor

Hmm, I don't know much about how OpenMP works, so what is being passed in addr here?
Some sort of register dump struct?

Collaborator Author

Sorry, I missed this question: never got an email notification for the update.

You can see this in runtime/test/ompt/callback.h: print_current_address first emits a nop insn followed by a local label, then passes that label's address to print_possible_return_addresses.

Contributor

Wait, so this essentially tries to find where the start of the inserted nop is, right?
If so, then I understand the addr-12 case:

  • nop itself is 4 bytes, so we need to decrement the label's address by that much: addr-4
  • Depending on optimization level there might be a ba plus the corresponding delay slot inserted, so we need to decrement again by zero or two instructions: addr-4-0 -> addr-4 or addr-4-8 -> addr-12

See e.g. https://godbolt.org/z/8hYcsWTTd
Though, as far as I can tell, there's no difference between 32-bit and 64-bit SPARC.

But what I still don't understand is: why is addr-20 a possible address?

Collaborator

The value printed by this function will be compared to the value of __builtin_return_address(0) executed in the kmpc function generated for the directive before the nop. So what we try to identify here is not the IP of the nop, but the IP of the instruction following the kmpc call. I extended your example with an arbitrary kmpc API call (copied from runtime/tasking/omp51_task_dep_inoutset.c): https://godbolt.org/z/h6KhaYhMW

So, the additional bytes account for the mov following the function call (and the !APP/!NO_APP ?).

Collaborator Author

> Wait, so this essentially tries to find where the start of the inserted nop is, right? If so, then I understand the addr-12 case:
>
> * `nop` itself is 4 bytes, so we need to decrement the label's address by that much: `addr-4`
>
> * Depending on optimization level there might be a `ba` plus the corresponding delay slot inserted, so we need to decrement again by zero or two instructions: `addr-4-0` -> `addr-4` or `addr-4-8` -> `addr-12`

We don't need to worry about optimization, at least not initially: print_possible_return_addresses is only used inside the openmp testsuite, and that is always compiled with just -fopenmp without any -O option. I found that in both Release and Debug builds, where libomp itself is compiled with either optimization or debug options.

> See e.g. https://godbolt.org/z/8hYcsWTTd Though, as far as I can tell, there's no difference between 32-bit and 64-bit SPARC.
>
> But what I still don't understand is: why is addr-20 a possible address?

I saw the need only in two tests: ompt/synchronization/masked.c and ompt/synchronization/master.c. In both cases, after a compiler-inserted call to __kmpc_end_master due to #pragma omp directives, we got a code sequence like

   0x10000005fcc:	sethi  %hi(0x400), %i1
   0x10000005fd0:	add  %i1, 0x220, %i1	! 0x620
   0x10000005fd4:	call  0x100002008a0 <__kmpc_end_master@plt>
   0x10000005fd8:	ldx  [ %i0 + %i1 ], %o0
   0x10000005fdc:	b  0x10000005fe4				<= codeptr_ra
   0x10000005fe0:	nop 
   0x10000005fe4:	nop 						<= current_addr
   0x10000005fe8:	b  0x10000005ff0
   0x10000005fec:	nop 

i.e. 8 more bytes to allow for.

Contributor

I see, I think all is okay then.

printf("%" PRIu64 ": current_address=%p or %p\n", \
ompt_get_thread_data()->value, ((char *)addr) - 12, \
(char *)addr - 20)
#else
#error Unsupported target architecture, cannot determine address offset!
#endif