@@ -92,9 +92,33 @@ fn foo(s: String) {
 ```
 
 If you have good reason. It's not polite to hold on to ownership you don't
-need, and it can make your lifetimes more complex. Furthermore, you can pass
-either kind of string into `foo` by using `.as_slice()` on any `String` you
-need to pass in, so the `&str` version is more flexible.
+need, and it can make your lifetimes more complex.
+
+## Generic functions
+
+To write a function that's generic over types of strings, use [the `Str`
+trait](http://doc.rust-lang.org/std/str/trait.Str.html):
+
+```{rust}
+fn some_string_length<T: Str>(x: T) -> uint {
+    x.as_slice().len()
+}
+
+fn main() {
+    let s = "Hello, world";
+
+    println!("{}", some_string_length(s));
+
+    let s = "Hello, world".to_string();
+
+    println!("{}", some_string_length(s));
+}
+```
+
+Both of these lines will print `12`.
+
+The only method that the `Str` trait has is `as_slice()`, which gives you
+access to a `&str` value from the underlying string.
 
 ## Comparisons
 
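The `Str` trait and the `uint` type in the hunk above belong to pre-1.0 Rust and are not available in current stable releases. As a minimal sketch of the same idea for a current toolchain, the version below assumes `AsRef<str>` as the bound and `usize` as the return type. That substitution is an assumption made for illustration, not something this change specifies.

```rust
// Sketch: generic over string types via `AsRef<str>`, an assumed
// stand-in for the pre-1.0 `Str` trait used in the diff.
fn some_string_length<T: AsRef<str>>(x: T) -> usize {
    // `as_ref()` plays the role that `as_slice()` plays in the original.
    x.as_ref().len()
}

fn main() {
    // A string slice.
    let s = "Hello, world";
    println!("{}", some_string_length(s));

    // An owned `String`.
    let s = "Hello, world".to_string();
    println!("{}", some_string_length(s));
}
```

Both `&str` and `String` implement `AsRef<str>`, so the two calls go through the same function body, much as the `Str`-bounded original does.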
@@ -121,6 +145,65 @@ fn compare(string: String) {
 Converting a `String` to a `&str` is cheap, but converting the `&str` to a
 `String` involves an allocation.
 
+## Indexing strings
+
+You may be tempted to try to access a certain character of a `String`, like
+this:
+
+```{rust,ignore}
+let s = "hello".to_string();
+
+println!("{}", s[0]);
+```
+
+This does not compile. This is on purpose. In the world of UTF-8, direct
+indexing is basically never what you want to do. The reason is that each
+character can occupy a variable number of bytes, so finding the nth character
+means iterating through the string from the start, which is an O(n) operation.
+
+To iterate over a string, use the `graphemes()` method on `&str`:
+
+```{rust}
+let s = "αἰθήρ";
+
+for l in s.graphemes(true) {
+    println!("{}", l);
+}
+```
+
+Note that `l` has the type `&str` here, since a single grapheme can consist of
+multiple codepoints, so a `char` wouldn't be appropriate.
+
+This will print out each character in turn, as you'd expect: first "α", then
+"ἰ", etc. You can see that this is different from just the individual bytes.
+Here's a version that prints out each byte:
+
+```{rust}
+let s = "αἰθήρ";
+
+for l in s.bytes() {
+    println!("{}", l);
+}
+```
+
+This will print:
+
+```{notrust,ignore}
+206
+177
+225
+188
+176
+206
+184
+206
+174
+207
+129
+```
+
+Many more bytes than graphemes!
+
 # Other Documentation
 
 * [the `&str` API documentation](/std/str/index.html)
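The indexing hunk above relies on `graphemes()`, which current stable Rust no longer ships in the standard library (grapheme segmentation now lives in external crates). As a hedged, std-only sketch of the byte-versus-character point, the example below rebuilds the string from the eleven bytes listed in the diff and uses `chars()`. Note that `chars()` yields Unicode code points, not graphemes, and that `nth_char` is a hypothetical helper introduced only for illustration.

```rust
// Hypothetical helper: returns the nth code point of a string.
// It walks the string from the start, so it is O(n), which is the
// cost that direct indexing would otherwise hide.
fn nth_char(s: &str, n: usize) -> Option<char> {
    s.chars().nth(n)
}

fn main() {
    // The eleven bytes shown in the diff's byte listing.
    let bytes = [206u8, 177, 225, 188, 176, 206, 184, 206, 174, 207, 129];
    let s = std::str::from_utf8(&bytes).expect("the listed bytes are valid UTF-8");

    // Length in bytes versus length in code points.
    println!("bytes: {}", s.len());
    println!("chars: {}", s.chars().count());

    // Reaching a character by position requires iteration.
    println!("first char: {:?}", nth_char(s, 0));

    // Slicing by byte range is allowed, but only on character boundaries;
    // the first character of this string occupies bytes 0..2.
    println!("first two bytes as text: {}", &s[0..2]);
}
```

For this particular string every grapheme happens to be a single code point, so the `chars()` count matches what the `graphemes()` loop in the diff would visit. That is not true of strings containing combining marks.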