Description
Hi Rust folks,
I'm using a procedural macro to read and preprocess some strings to be stored in an application without hitting the disk at any intermediate point. The macro outputs a TokenStream for a literal fixed-size array of string slices. When scaling up the number of strings to a moderately high count (100k), I noticed something strange: compiling the application suddenly takes orders of magnitude more time and makes the compiler consume almost 40 GiB of RAM in the process. Here's an MWE, for both the proc macro lib and the consumer binary bin:
lib/Cargo.toml:
[package]
name = "lib"
version = "0.1.0"
edition = "2021"
# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html
[lib]
proc-macro = true
[dependencies]
lib/src/lib.rs:
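use proc_macro::TokenStream;

// Parse the macro input (e.g. `100000`) as the number of elements to generate.
fn parse_count(stream: TokenStream) -> u32 {
    stream.to_string().parse().unwrap()
}

#[proc_macro]
pub fn generate_data(input: TokenStream) -> TokenStream {
    // Build the source text of an array literal: ["0", "1", "2", ...]
    let mut s = String::from("[");
    for i in 0..parse_count(input) {
        if i > 0 {
            s.push_str(", ");
        }
        s.push_str(&format!("\"{}\"", i));
    }
    s.push_str("]");
    // Turn the source text back into tokens for the compiler.
    s.parse().unwrap()
}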
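To see what the macro hands back to the compiler, the same string-building logic can be run as an ordinary function outside a proc-macro crate. This is only a sketch for inspecting the expansion at a small count; the name expansion_source is hypothetical and not part of the MWE.

```rust
/// Builds the same source text that `generate_data!` parses into a TokenStream.
/// `expansion_source` is a hypothetical helper used for illustration only.
fn expansion_source(count: u32) -> String {
    let mut s = String::from("[");
    for i in 0..count {
        if i > 0 {
            s.push_str(", ");
        }
        s.push_str(&format!("\"{}\"", i));
    }
    s.push(']');
    s
}

fn main() {
    // With a count of 3, this prints the three-element array literal
    // that the macro would expand to.
    println!("{}", expansion_source(3));
}
```

With 100000 as the input, the expansion is the same shape: one flat array literal holding 100000 string literals.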
bin/Cargo.toml:
[package]
name = "bin"
version = "0.1.0"
edition = "2021"
# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html
[dependencies]
lib = {path = "../lib"}
bin/src/main.rs:
use lib::generate_data;

fn main() {
    let data = generate_data!(100000);
    println!("{}", data.len());
}
Compiling the above takes many minutes, compared to just a few seconds when the issue does not occur (see cases below). Memory consumption also peaks very high (the measurement is truncated at the start; usage increases slowly over many minutes).
This issue is present on both the latest stable and nightly (see versions used below), in both debug and release builds. In my project I have lto = true and opt-level = "z" as well, but neither helps. Running RUSTFLAGS="-Z time-passes" cargo build on nightly reveals that most of the time is spent in MIR borrow checking:
...
time: 110.766; rss: 161MB -> 623MB ( +462MB) MIR_borrow_checking
...
I have also verified that it is not the macro that causes the hang by using proc-macro-error to emit a warning just as the macro is about to return the final TokenStream. The warning is emitted right away, before the memory hogging begins.
From my testing, with the same element count, the issue does not occur when:
- The array is defined using a reference, i.e. the proc macro returns a reference to an array (&[...]).
- The stored type in the array is something other than &str; compilation happens in seconds with e.g. u32.
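For reference, the three shapes being compared look like this when written by hand at a tiny size. This is only an illustration of the forms; the function name shapes_len is hypothetical, and three elements are far too few to trigger the slowdown.

```rust
/// Returns the lengths of the three array shapes discussed above.
/// `shapes_len` is a hypothetical helper used for illustration only.
fn shapes_len() -> (usize, usize, usize) {
    // Shape from the report: an array of string slices held by value.
    let by_value: [&str; 3] = ["0", "1", "2"];

    // First variant: the macro would emit a reference to the array literal.
    let by_ref: &[&str; 3] = &["0", "1", "2"];

    // Second variant: the element type is not a reference (here u32).
    let numbers: [u32; 3] = [0, 1, 2];

    (by_value.len(), by_ref.len(), numbers.len())
}

fn main() {
    let (a, b, c) = shapes_len();
    println!("{} {} {}", a, b, c);
}
```

In the MWE, the first variant would correspond to the macro starting its output with "&[" instead of "[".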
This leads me to believe that the references of the &str types are at fault, but I don't have enough experience in rustc internals to say anything more than that something doesn't scale (quadratic time?).
Meta
rustc +stable --version --verbose:
rustc 1.56.0 (09c42c458 2021-10-18)
binary: rustc
commit-hash: 09c42c45858d5f3aedfa670698275303a3d19afa
commit-date: 2021-10-18
host: x86_64-unknown-linux-gnu
release: 1.56.0
LLVM version: 13.0.0
rustc +nightly --version --verbose:
rustc 1.58.0-nightly (e249ce6b2 2021-10-30)
binary: rustc
commit-hash: e249ce6b2345587d6e11052779c86adbad626dff
commit-date: 2021-10-30
host: x86_64-unknown-linux-gnu
release: 1.58.0-nightly
LLVM version: 13.0.0
Backtrace
No backtrace; compilation does eventually succeed given enough time and memory.