Commit cebed8a

auto merge of #15593 : steveklabnik/rust/string_guide, r=kballard
I decided to change it up a little today and hack out the beginning of the String guide. Strings are different enough in Rust that I think they deserve a specific guide, especially for those who are used to managed languages.

I decided to start with Strings because they get asked about a lot in IRC, and also based on discussions like this one on reddit: http://www.reddit.com/r/rust/comments/2ac390/generic_string_literals/

I blatantly stole bits from our other documentation on Strings. It's a little sparse at current, but I wanted to start somewhere.

I am not exactly sure what should go in "Best Practices," and would like the feedback from the team on this. Specifically due to comments like this one: http://www.reddit.com/r/rust/comments/2ac390/generic_string_literals/citmxb5
2 parents f50e4ee + 226b7d1 commit cebed8a

2 files changed

+133
-4
lines changed

src/doc/guide-strings.md

Lines changed: 129 additions & 0 deletions
@@ -0,0 +1,129 @@
% The Strings Guide

# Strings

Strings are an important concept to master in any programming language. If you
come from a managed language background, you may be surprised at the complexity
of string handling in a systems programming language. Efficient access and
allocation of memory for a dynamically sized structure involves a lot of
details. Luckily, Rust has lots of tools to help us here.

A **string** is a sequence of Unicode scalar values encoded as a stream of
UTF-8 bytes. All strings are guaranteed to be validly encoded UTF-8 sequences.
Additionally, strings are not null-terminated and can contain null bytes.

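As a rough illustration of that definition, the sketch below counts UTF-8 bytes and Unicode scalar values separately, and treats an embedded null byte as ordinary data. It is written against current Rust rather than the compiler this commit targeted, and the helper names `byte_and_scalar_counts` and `contains_null` are made up for this example.

```rust
/// Returns (number of UTF-8 bytes, number of Unicode scalar values).
/// Hypothetical helper, not part of the standard library.
fn byte_and_scalar_counts(s: &str) -> (usize, usize) {
    // `len()` counts bytes; `chars()` iterates over scalar values.
    (s.len(), s.chars().count())
}

/// Returns true if the string contains a null byte. Strings are not
/// null-terminated, so a null byte is just another byte in the data.
fn contains_null(s: &str) -> bool {
    s.bytes().any(|b| b == 0)
}
```

A string with a two-byte character such as `é` has one more byte than it has scalar values, which is why the two counts are reported separately.
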
Rust has two main types of strings: `&str` and `String`.

## &str

The first kind is a `&str`. This is pronounced a 'string slice.' String literals
are of the type `&str`:

```{rust}
let string = "Hello there.";
```

Like any Rust reference, a string slice has an associated lifetime. A string
literal is a `&'static str`. A string slice can be written without an explicit
lifetime in many cases, such as in function arguments. In these cases the
lifetime will be inferred:

```{rust}
fn takes_slice(slice: &str) {
    println!("Got: {}", slice);
}
```

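To make the inference concrete, here is a sketch in current Rust showing that a signature with the lifetime left out and one with it written explicitly mean the same thing. The function names `first_word`, `first_word_explicit`, and `greeting` are invented for this example.

```rust
/// Lifetime left out: the returned slice is inferred to borrow from `s`.
fn first_word(s: &str) -> &str {
    s.split(' ').next().unwrap_or("")
}

/// The same signature with the lifetime written out by hand.
fn first_word_explicit<'a>(s: &'a str) -> &'a str {
    s.split(' ').next().unwrap_or("")
}

/// String literals live for the whole program, so they are `&'static str`.
fn greeting() -> &'static str {
    "Hello there."
}
```
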
Like vector slices, string slices are simply a pointer plus a length. This
means that they're a 'view' into an already-allocated string, such as a
`&'static str` or a `String`.

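The 'pointer plus a length' description can be checked with a small sketch in current Rust. The helper names are hypothetical, and the two-word size is what current compilers produce on common targets rather than something this guide states.

```rust
use std::mem::size_of;

/// Borrows the first `n` bytes of `s` as a slice; no string data is copied.
/// Panics if `n` is out of range or does not fall on a character boundary.
fn prefix(s: &str, n: usize) -> &str {
    &s[..n]
}

/// Checks that a `&str` is two machine words wide: a pointer and a length.
fn slice_is_pointer_plus_length() -> bool {
    size_of::<&str>() == 2 * size_of::<usize>()
}
```

Because `prefix` only borrows, the slice it returns points at the same memory as the string it was taken from.
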
## String

A `String` is a heap-allocated string. This string is growable, and is also
guaranteed to be UTF-8.

```{rust}
let mut s = "Hello".to_string();
println!("{}", s);

s.push_str(", world.");
println!("{}", s);
```

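As a further sketch of what 'growable' means, the following builds the same greeting piece by piece in current Rust. The function name `build_greeting` and the starting capacity of 16 bytes are arbitrary choices for this example.

```rust
/// Builds "Hello, world." step by step in a growable heap buffer.
fn build_greeting() -> String {
    // Reserve some space up front; the buffer would grow on demand anyway.
    let mut s = String::with_capacity(16);
    s.push_str("Hello");   // append a string slice
    s.push(',');           // append a single char
    s.push_str(" world.");
    s
}
```
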
You can coerce a `String` into a `&str` with the `as_slice()` method:

```{rust}
fn takes_slice(slice: &str) {
    println!("Got: {}", slice);
}

fn main() {
    let s = "Hello".to_string();
    takes_slice(s.as_slice());
}
```

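For readers on a current toolchain: `as_slice()` is no longer available on `String`. A minimal sketch of the same idea in current Rust uses `as_str()`, or a plain `&s` that coerces to `&str`. The function `describe_slice` is a stand-in for `takes_slice` that returns its message instead of printing it, a change made only so the result can be checked.

```rust
/// Same shape as `takes_slice`, but returns the message instead of
/// printing it (a change made only for this sketch).
fn describe_slice(slice: &str) -> String {
    format!("Got: {}", slice)
}

/// Passes an owned `String` where a `&str` is expected, in two ways.
fn pass_owned_string() -> (String, String) {
    let s = "Hello".to_string();
    let explicit = describe_slice(s.as_str()); // explicit conversion
    let coerced = describe_slice(&s);          // `&String` coerces to `&str`
    (explicit, coerced)
}
```
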
You can also get a `&str` from a stack-allocated array of bytes:

```{rust}
use std::str;

let x: &[u8] = &[b'a', b'b'];
let stack_str: &str = str::from_utf8(x).unwrap();
```

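The `unwrap()` in that example panics if the bytes are not valid UTF-8. A hedged sketch, in current Rust, of handling that case instead; `bytes_to_str` is a hypothetical helper name:

```rust
use std::str;

/// Returns the bytes as a `&str` if they are valid UTF-8, or `None` otherwise.
fn bytes_to_str(bytes: &[u8]) -> Option<&str> {
    // `from_utf8` validates the bytes and borrows them; nothing is copied.
    str::from_utf8(bytes).ok()
}
```
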
## Best Practices

### `String` vs. `&str`

In general, you should prefer `String` when you need ownership, and `&str` when
you just need to borrow a string. This is very similar to using `Vec<T>` vs. `&[T]`,
and `T` vs. `&T` in general.

This means starting off with this:

```{rust,ignore}
fn foo(s: &str) {
```

and only moving to this:

```{rust,ignore}
fn foo(s: String) {
```

if you have good reason. It's not polite to hold on to ownership you don't
need, and it can make your lifetimes more complex. Furthermore, you can pass
either kind of string into the `&str` version of `foo` by using `.as_slice()`
on any `String` you need to pass in, so the `&str` version is more flexible.

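A small sketch of that flexibility, written for current Rust (where `&owned` takes the place of `.as_slice()`). The functions `shout`, `shout_owned`, and `call_both` are invented for this example.

```rust
/// Borrowing version: works with literals and with owned `String`s.
fn shout(s: &str) -> String {
    s.to_uppercase()
}

/// Owning version: callers must give up (or clone) their `String`.
fn shout_owned(s: String) -> String {
    s.to_uppercase()
}

/// Calls both versions to show what each one demands of the caller.
fn call_both() -> (String, String, String) {
    let owned = "hello".to_string();
    let from_literal = shout("hello");  // a literal is already a `&str`
    let from_borrow = shout(&owned);    // borrow; `owned` stays usable
    let from_move = shout_owned(owned); // `owned` is moved and gone after this
    (from_literal, from_borrow, from_move)
}
```
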
### Comparisons

To compare a `String` to a constant string, prefer `as_slice()`...

```{rust}
fn compare(string: String) {
    if string.as_slice() == "Hello" {
        println!("yes");
    }
}
```

... over `to_string()`:

```{rust}
fn compare(string: String) {
    if string == "Hello".to_string() {
        println!("yes");
    }
}
```

Converting a `String` to a `&str` is cheap, but converting the `&str` to a
`String` involves an allocation.

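The same comparison can be sketched in current Rust, where `as_str()` stands in for `as_slice()` and a `String` can also be compared with a literal directly. The function names are hypothetical, and each returns a `bool` instead of printing so the result can be checked.

```rust
/// Compares by borrowing the `String` as a `&str`; no allocation.
fn is_hello_borrowed(string: String) -> bool {
    string.as_str() == "Hello"
}

/// Current Rust also allows comparing a `String` with a literal directly.
fn is_hello_direct(string: String) -> bool {
    string == "Hello"
}

/// The discouraged form: allocates a fresh `String` just to compare.
fn is_hello_allocating(string: String) -> bool {
    string == "Hello".to_string()
}
```

All three give the same answer; only the last one pays for an allocation.
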
## Other Documentation

* [the `&str` API documentation](/std/str/index.html)
* [the `String` API documentation](std/string/index.html)

src/libcollections/str.rs

Lines changed: 4 additions & 4 deletions
@@ -55,10 +55,10 @@ other languages.
 
 # Representation
 
-Rust's string type, `str`, is a sequence of unicode codepoints encoded as a
-stream of UTF-8 bytes. All safely-created strings are guaranteed to be validly
-encoded UTF-8 sequences. Additionally, strings are not null-terminated
-and can contain null codepoints.
+Rust's string type, `str`, is a sequence of unicode scalar values encoded as a
+stream of UTF-8 bytes. All strings are guaranteed to be validly encoded UTF-8
+sequences. Additionally, strings are not null-terminated and can contain null
+bytes.
 
 The actual representation of strings have direct mappings to vectors: `&str`
 is the same as `&[u8]`.
