Description
The LLVM project contains llvm-ml, a MASM-syntax assembler intended to behave like ml.exe or ml64.exe. As of https://reviews.llvm.org/D121510, it also contains some MASM-syntax assembly source files in llvm/lib/Support/BLAKE3. But the former cannot accept the latter as input:
$ build/bin/llvm-ml -m64 llvm/lib/Support/BLAKE3/blake3_avx512_x86-64_windows_msvc.asm
llvm/lib/Support/BLAKE3/blake3_avx512_x86-64_windows_msvc.asm:8:27: error: unexpected token in align directive
_TEXT SEGMENT ALIGN(16) 'CODE'
^
llvm/lib/Support/BLAKE3/blake3_avx512_x86-64_windows_msvc.asm:2408:25: error: endp outside of procedure block
blake3_hash_many_avx512 ENDP
^
llvm/lib/Support/BLAKE3/blake3_avx512_x86-64_windows_msvc.asm:2490:1: error: symbol '@@' is already defined
@@:
^
llvm/lib/Support/BLAKE3/blake3_avx512_x86-64_windows_msvc.asm:2502:33: error: endp outside of procedure block
blake3_compress_in_place_avx512 ENDP
^
llvm/lib/Support/BLAKE3/blake3_avx512_x86-64_windows_msvc.asm:2534:1: error: symbol '@@' is already defined
@@:
^
llvm/lib/Support/BLAKE3/blake3_avx512_x86-64_windows_msvc.asm:2585:1: error: symbol '@@' is already defined
@@:
^
llvm/lib/Support/BLAKE3/blake3_avx512_x86-64_windows_msvc.asm:2601:28: error: endp outside of procedure block
blake3_compress_xof_avx512 ENDP
^
My own partial diagnosis of the problems, just from looking at the source file:
It looks as if declaring @@ as a label multiple times is expected to be allowed, with jmp @b or jmp @f going to the previous or next instance of it respectively. But llvm-ml is expecting it to be like any other label, defined at most once.
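To make the expected semantics concrete, here is a minimal Python sketch of how anonymous-label references could be resolved. It is only an illustration of the behaviour described above, not llvm-ml's or ml64.exe's actual implementation; the function name resolve_anonymous_labels and the line-based matching are assumptions made for the example.

```python
def resolve_anonymous_labels(lines):
    """Map the index of each line ending in @B/@F to the index of its target `@@:` line."""
    # Indices of every line that defines an anonymous label.
    defs = [i for i, line in enumerate(lines) if line.strip().startswith("@@:")]
    targets = {}
    for i, line in enumerate(lines):
        ref = line.strip().lower()
        if ref.endswith("@b"):
            # @B refers to the nearest definition before this line.
            earlier = [d for d in defs if d < i]
            targets[i] = earlier[-1] if earlier else None
        elif ref.endswith("@f"):
            # @F refers to the nearest definition after this line.
            later = [d for d in defs if d > i]
            targets[i] = later[0] if later else None
    return targets


example = [
    "@@:",
    "    dec rcx",
    "    jnz @B",
    "    jmp @F",
    "    nop",
    "@@:",
    "    ret",
]
print(resolve_anonymous_labels(example))  # {2: 0, 3: 5}
```

The point of the sketch is that each @@ definition is just a position in the file, so repeated definitions never collide the way ordinary named symbols do.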
The 'endp outside of procedure block' errors appear to arise from nesting two PROC/ENDP pairs around the same code in order to define two procedures with the same address and extent (foo PROC, bar PROC, code code code, bar ENDP, foo ENDP). llvm-ml is complaining on the second ENDP, suggesting that the real ml64.exe is able to track multiple open PROC definitions and llvm-ml isn't.
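One way to track multiple open PROC definitions is a stack of procedure names. The Python sketch below shows the idea under stated assumptions: check_proc_nesting is a hypothetical helper, the strict innermost-first matching rule is a guess at what ml64.exe requires, and the mismatch message is invented for the example. It is not taken from llvm-ml's sources.

```python
def check_proc_nesting(lines):
    """Return a list of error strings; an empty list means every ENDP matched a PROC."""
    open_procs = []  # stack of procedure names that are currently open
    errors = []
    for lineno, line in enumerate(lines, start=1):
        words = line.split()
        if len(words) < 2:
            continue
        name, keyword = words[0], words[1].upper()
        if keyword == "PROC":
            open_procs.append(name)
        elif keyword == "ENDP":
            if not open_procs:
                errors.append(f"{lineno}: endp outside of procedure block")
            elif open_procs[-1] != name:
                # Hypothetical diagnostic; the real assembler's wording is unknown.
                errors.append(
                    f"{lineno}: endp for '{name}' does not match '{open_procs[-1]}'"
                )
            else:
                open_procs.pop()
    return errors


nested = ["foo PROC", "bar PROC", "    ret", "bar ENDP", "foo ENDP"]
print(check_proc_nesting(nested))  # []
```

With a stack, the second ENDP in the nested case closes the outer procedure instead of being reported as an error. A tracker that remembers only one open procedure would reject that same line.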
As for the first error in the segment directive: it looks to me as if llvm-ml has terminated parsing after the word SEGMENT, and attempted to parse the rest of the line independently, as if there had been a newline in between. (For example, I can write _TEXT SEGMENT mov rax,1 without provoking an error.) So the suffix ALIGN(16) is being regarded as a standalone alignment directive, rather than an attribute on the segment directive, and that's why the error message about 'CODE' is reporting an unexpected token in the align directive and not in the segment directive.
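For contrast, here is a rough Python sketch of parsing the whole line as one segment directive, so that ALIGN(16) and 'CODE' are consumed as attributes of SEGMENT. The token pattern, the parse_segment_line helper and the returned dictionary layout are all assumptions for illustration; the sketch covers only the two attribute forms seen in the BLAKE3 file, not the full MASM segment grammar.

```python
import re

# Hypothetical token pattern: ALIGN(n), a quoted class name, or any other bare word.
TOKEN = re.compile(r"ALIGN\(\d+\)|'[^']*'|\S+", re.IGNORECASE)


def parse_segment_line(line):
    """Parse `name SEGMENT attrs...`; return None if the line is not a segment directive."""
    tokens = TOKEN.findall(line)
    if len(tokens) < 2 or tokens[1].upper() != "SEGMENT":
        return None
    segment = {"name": tokens[0], "align": None, "class": None, "other": []}
    # Everything up to end of line belongs to the directive.
    for tok in tokens[2:]:
        m = re.fullmatch(r"ALIGN\((\d+)\)", tok, re.IGNORECASE)
        if m:
            segment["align"] = int(m.group(1))
        elif len(tok) >= 2 and tok.startswith("'") and tok.endswith("'"):
            segment["class"] = tok[1:-1]
        else:
            # Unrecognised operands are collected so a caller could reject them.
            segment["other"].append(tok)
    return segment


print(parse_segment_line("_TEXT SEGMENT ALIGN(16) 'CODE'"))
```

Under this approach, a line such as _TEXT SEGMENT mov rax,1 would leave mov and rax,1 in the "other" list, which a caller could report as invalid segment attributes rather than silently assembling them as an instruction.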
(This is all as of git commit fc8f465, with llvm-ml built from the sources in that commit trying to assemble the BLAKE3 code in the same commit.)