Vec::clone and String::clone are very slow #17844

Closed
@rusty-nail2


Platform: Windows, rust nightly.

The optimizations made to fix #13472 appear to have been undone by #15471 and 3316b1e.

In 32-bit mode, rustc -O --test test.rs && ./test --bench gives:

running 7 tests
test clone_str                  ... bench:   1540862 ns/iter (+/- 25433)
test clone_vec_from_fn          ... bench:    745343 ns/iter (+/- 7535)
test clone_vec_from_incremental ... bench:   1088468 ns/iter (+/- 5358)
test fast_clone_vec             ... bench:    310484 ns/iter (+/- 3212)
test memory_copy                ... bench:    102162 ns/iter (+/- 1220)
test naive_clone_vec            ... bench:    745599 ns/iter (+/- 8111)

With -C target-cpu=x86-64 and SSE enabled (-C target-feature=+sse2), things look better, but String::clone still stands out:

running 7 tests
test clone_str                  ... bench:   1661609 ns/iter (+/- 60586)
test clone_vec_from_fn          ... bench:    359253 ns/iter (+/- 202820)
test clone_vec_from_incremental ... bench:    357418 ns/iter (+/- 3406)
test fast_clone_vec             ... bench:    309973 ns/iter (+/- 9762)
test memory_copy                ... bench:    102129 ns/iter (+/- 1317)
test naive_clone_vec            ... bench:    753919 ns/iter (+/- 8473)

Performance seems very dependent on minor details: in 32-bit non-SSE mode, clone_vec_from_incremental (1088468 ns/iter) is much slower than clone_vec_from_fn (745343 ns/iter), even though both clone a Vec<u8> with identical contents.
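To make the figures above easier to compare, here is a small sketch that converts a ns/iter reading into throughput and into a slowdown factor relative to the memory_copy baseline. The helper names (`mib_per_sec`, `slowdown`) are made up for illustration and are not part of the benchmark program. The sketch assumes each iteration copies exactly SIZE = 1024*1024 bytes, and it is written for current stable Rust, not the nightly used in this report.

```rust
/// Bytes copied per iteration, matching SIZE = 1024*1024 in the benchmark.
const SIZE_BYTES: f64 = 1024.0 * 1024.0;

/// Convert a `ns/iter` reading into MiB/s for a buffer of SIZE_BYTES.
/// (Hypothetical helper, for illustration only.)
fn mib_per_sec(ns_per_iter: f64) -> f64 {
    let iters_per_sec = 1e9 / ns_per_iter;
    iters_per_sec * SIZE_BYTES / (1024.0 * 1024.0)
}

/// How many times slower a benchmark is than a baseline reading.
/// (Hypothetical helper, for illustration only.)
fn slowdown(ns_per_iter: f64, baseline_ns: f64) -> f64 {
    ns_per_iter / baseline_ns
}
```

With the 32-bit numbers, `slowdown(1540862.0, 102162.0)` puts clone_str at roughly 15 times the cost of the raw memory copy, while fast_clone_vec is roughly 3 times.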

Benchmark program:

extern crate test;
extern crate core;

use test::{Bencher, black_box};
use core::ptr;

static SIZE: uint = 1024*1024;

#[bench]
fn clone_str(bh: &mut Bencher) {
    let mut x = String::with_capacity(SIZE);
    for _ in range(0, SIZE) {
        x.push('x');
    }
    bh.iter(|| black_box(x.clone()));
}

#[bench]
fn clone_vec_from_incremental(bh: &mut Bencher) {
    let mut x: Vec<u8> = Vec::with_capacity(SIZE);
    for _ in range(0, SIZE) {
        x.push(0x78);
    }
    bh.iter(|| black_box(x.clone()));
}

#[bench]
fn clone_vec_from_fn(bh: &mut Bencher) {
    let x: Vec<u8> = Vec::from_fn(SIZE, |_| 0x78);
    bh.iter(|| black_box(x.clone()));
}

#[bench]
fn naive_clone_vec(bh: &mut Bencher) {
    let x: Vec<u8> = Vec::from_fn(SIZE, |_| 0x78);
    bh.iter(|| {
        black_box(Vec::from_fn(SIZE, |i| x[i]));
    });
}

#[bench]
fn fast_clone_vec(bh: &mut Bencher) {
    let x: Vec<u8> = Vec::from_fn(SIZE, |_| 0x78);
    bh.iter(|| {
        let mut copy = Vec::<u8>::with_capacity(SIZE);
        unsafe {
            ptr::copy_memory(copy.as_mut_ptr(), x.as_ptr(), x.len());
            copy.set_len(SIZE);
        }
        black_box(copy)
    });
}

#[bench]
fn memory_copy(bh: &mut Bencher) {
    let x: Vec<u8> = Vec::from_fn(SIZE, |_| 0x78);
    let mut y: Vec<u8> = Vec::from_fn(SIZE, |_| 0);
    bh.iter(|| {
        unsafe {
            ptr::copy_memory(y.as_mut_ptr(), x.as_ptr(), x.len());
        }
    })
}
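The program above targets a 2014 nightly (`uint`, `range`, `Vec::from_fn`, `ptr::copy_memory`) and will not build on a current toolchain. For anyone trying to reproduce the comparison today, here is a rough sketch of the naive and fast copies in current stable Rust. It is an approximation, not the original benchmark: `time_it` is a hypothetical stand-in for `Bencher::iter`, and `ptr::copy_nonoverlapping` is my substitution for `ptr::copy_memory`, chosen on the assumption that the two buffers never overlap.

```rust
use std::hint::black_box;
use std::ptr;
use std::time::{Duration, Instant};

/// Element-by-element copy, analogous to `naive_clone_vec`.
fn naive_clone(src: &[u8]) -> Vec<u8> {
    (0..src.len()).map(|i| src[i]).collect()
}

/// Bulk copy into a pre-allocated buffer, analogous to `fast_clone_vec`.
fn fast_clone(src: &[u8]) -> Vec<u8> {
    let mut copy = Vec::<u8>::with_capacity(src.len());
    unsafe {
        // SAFETY: `copy` has capacity for `src.len()` bytes, the two
        // allocations are distinct, and every bit pattern is a valid `u8`.
        // Note the argument order: source first, then destination.
        ptr::copy_nonoverlapping(src.as_ptr(), copy.as_mut_ptr(), src.len());
        copy.set_len(src.len());
    }
    copy
}

/// Hypothetical timing helper: mean wall-clock time of `f` over `iters`
/// runs. `iters` must be non-zero. Far cruder than `Bencher::iter`.
fn time_it<T>(iters: u32, mut f: impl FnMut() -> T) -> Duration {
    let start = Instant::now();
    for _ in 0..iters {
        // Keep the optimizer from discarding the result.
        black_box(f());
    }
    start.elapsed() / iters
}
```

Something like `time_it(100, || fast_clone(&x))` next to `time_it(100, || x.clone())`, built with optimizations, should give a rough idea of whether the gap reported here still exists. The numbers will not be directly comparable with the ns/iter output above.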

Labels: A-collections (Area: `std::collections`), I-slow (Issue: Problems and improvements with respect to performance of generated code.)
