Commit 0b4f0a4

Add a first stab at a tutorial
You build it with `cd doc/tutorial; node build.js`, and then point your browser at doc/tutorial/web/index.html. Not remotely ready for publicity yet.
1 parent 80c926c commit 0b4f0a4

14 files changed, +2672 -0 lines changed

doc/tutorial/args.md

+126
@@ -0,0 +1,126 @@
# Argument passing

Rust datatypes are not trivial to copy (the way, for example,
JavaScript values can be copied by simply taking one or two machine
words and plunking them somewhere else). Shared boxes require
reference count updates, big records or tags require an arbitrary
amount of data to be copied (plus updating the reference counts of
shared boxes hanging off them), and unique pointers require their
origin to be de-initialized.

For this reason, the way Rust passes arguments to functions is a bit
more involved than it is in most languages. It performs some
compile-time cleverness to get rid of most of the cost of copying
arguments, and forces you to put in explicit copy operators in the
places where it cannot.

## Safe references

The foundation of Rust's argument-passing optimization is the fact
that Rust tasks form single-threaded worlds, which share no data with
other tasks, and that most data is immutable.

Take the following program:

    let x = get_really_big_record();
    myfunc(x);

We want to pass `x` to `myfunc` by pointer (which is easy), *and* we
want to ensure that `x` stays intact for the duration of the call
(which, in this example, is also easy). So we can just use the
existing value as the argument, without copying.

There are more involved cases. The call could look like this:

    myfunc(x, {|| x = get_another_record(); });

Now, if `myfunc` first calls its second argument and then accesses its
first argument, it will see a different value from the one that was
passed to it.

The compiler will insert an implicit copy of `x` in such a case,
*except* if `x` contains something mutable, in which case a copy would
result in code that behaves differently (if you mutate the copy, `x`
stays unchanged). That would be bad, so the compiler will disallow
such code.

When inserting an implicit copy for something big, the compiler will
warn, so that you know that the code is not as efficient as it looks.

There are even trickier cases, in which the Rust compiler is forced
to pessimistically assume a value will get mutated, even though it is
not sure.

    fn for_each(v: [mutable @int], iter: block(@int)) {
        for elt in v { iter(elt); }
    }

For all this function knows, calling `iter` (which is a closure that
might have access to the vector that's passed as `v`) could cause the
elements in the vector to be mutated, with the effect that it cannot
guarantee that the boxes will live for the duration of the call. So it
has to copy them. In this case, this will happen implicitly (bumping a
reference count is considered cheap enough to not warn about it).

## The copy operator

If the `for_each` function given above were to take a vector of
`{mutable x: int}` instead of `@int`, it would not be able to
implicitly copy, since if the `iter` function changes a copy of a
mutable record, the changes won't be visible in the record itself. If
we *do* want to allow copies there, we have to explicitly allow it
with the `copy` operator:

    type mutrec = {mutable x: int};
    fn for_each(v: [mutable mutrec], iter: block(mutrec)) {
        for elt in v { iter(copy elt); }
    }
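
For comparison, the same idea can be sketched in present-day Rust, where an explicit `clone()` plays roughly the role that the `copy` operator plays here. This is only an illustration in newer syntax than this tutorial uses. `MutRec` and `for_each` are adapted from the example above, and `sum_after_bump` is a made-up helper that exists only to show that the original records stay untouched.

```rust
// Present-day Rust sketch; the tutorial's `copy elt` becomes `elt.clone()`.
#[derive(Clone, Debug, PartialEq)]
struct MutRec {
    x: i32,
}

// Hands each callback its own copy of the record, so whatever the
// callback does to its argument cannot reach the vector's elements.
fn for_each(v: &[MutRec], mut iter: impl FnMut(MutRec)) {
    for elt in v {
        iter(elt.clone());
    }
}

// Hypothetical helper: bumps `x` on every copy it receives and
// returns the sum of the bumped values.
fn sum_after_bump(v: &[MutRec]) -> i32 {
    let mut total = 0;
    for_each(v, |mut r| {
        r.x += 1; // mutates the copy only
        total += r.x;
    });
    total
}
```

Calling `sum_after_bump` on records holding 1 and 2 yields 5, while the records themselves still hold 1 and 2 afterwards.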

## Argument passing styles

The fact that arguments are conceptually passed by safe reference does
not mean all arguments are passed by pointer. Composite types like
records and tags *are* passed by pointer, but others, like integers
and pointers, are simply passed by value.

It is possible, when defining a function, to specify a passing style
for a parameter by prefixing the parameter name with a symbol. The
most common special style is by-mutable-reference, written `&`:

    fn vec_push(&v: [int], elt: int) {
        v += [elt];
    }

This will make it possible for the function to mutate the parameter.
Clearly, you are only allowed to pass things that can actually be
mutated to such a function.
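
As a rough point of comparison, present-day Rust expresses by-mutable-reference in the parameter's type rather than with a prefix on its name. The sketch below is a translation of `vec_push` into that newer syntax, not the form this tutorial's compiler accepts.

```rust
// Present-day Rust sketch of the tutorial's `vec_push(&v: [int], elt: int)`.
// The `&mut` in the signature marks the argument as mutable by the callee.
fn vec_push(v: &mut Vec<i32>, elt: i32) {
    v.push(elt);
}
```

The caller has to opt in as well: the vector must be declared with `let mut`, and the call is written `vec_push(&mut numbers, 4)`, which matches the rule that only things that can actually be mutated may be passed.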

Another style is by-move, which will cause the argument to become
de-initialized on the caller side, and give ownership of it to the
called function. This is written `-`.
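
In present-day Rust, moving is what happens by default when a non-copyable value is passed by value, so no `-` marker is needed. The sketch below assumes that newer syntax, and `total_len` is a made-up function used only to show ownership changing hands.

```rust
// Present-day Rust sketch: passing a `Vec` by value moves it.
// After `total_len(words)`, the caller's `words` can no longer be used.
fn total_len(words: Vec<String>) -> usize {
    // This function now owns `words`; the vector is freed on return.
    words.iter().map(|w| w.len()).sum()
}
```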

Finally, the default passing styles (by-value for non-structural
types, by-reference for structural ones) are written `+` for by-value
and `&&` for by(-immutable)-reference. It is sometimes necessary to
override the defaults. We'll talk more about this when discussing
[generics][gens].

[gens]: FIXME

## Other uses of safe references

Safe references are not only used for argument passing. When you
destructure a value in an `alt` expression, or loop over a vector
with `for`, variables bound to the inside of the given data structure
will use safe references, not copies. This means such references have
little overhead, but you'll occasionally have to copy them to ensure
safety.

    let my_rec = {a: 4, b: [1, 2, 3]};
    alt my_rec {
        {a, b} {
            log b; // This is okay
            my_rec = {a: a + 1, b: b + [a]};
            log b; // Here reference b has become invalid
        }
    }

doc/tutorial/build.js

+82
@@ -0,0 +1,82 @@
var fs = require("fs"), md = require("./lib/markdown");

function markdown(str) { return md.toHTML(str, "Maruku"); }

// Calls c(created, modified) with the times of the file's first and
// last commits, in milliseconds.
function fileDates(file, c) {
  function takeTime(str) {
    return Number(str.match(/^(\S+)\s/)[1]) * 1000;
  }
  require("child_process").exec("git rev-list --timestamp HEAD -- " + file, function(err, stdout) {
    if (err != null) { console.log("Failed to run git rev-list"); return; }
    var history = stdout.split("\n");
    if (history.length && history[history.length-1] == "") history.pop();
    // git rev-list prints the newest commit first, so the first line is
    // the last modification and the last line is the creation.
    var modified = history.length ? takeTime(history[0]) : Date.now();
    var created = modified;
    if (history.length > 1) created = takeTime(history[history.length-1]);
    c(created, modified);
  });
}

function head(title) {
  return "<html><head><link rel='stylesheet' href='style.css' type='text/css'>" +
    "<meta http-equiv='Content-Type' content='text/html; charset=utf-8'><title>" +
    title + "</title></head><body>\n";
}

function foot(created, modified) {
  var r = "<p class='head'>";
  var crStr = formatTime(created), modStr = formatTime(modified);
  if (created) r += "Created " + crStr;
  if (crStr != modStr)
    r += (created ? ", l" : "L") + "ast modified on " + modStr;
  return r + "</p>";
}

function formatTime(tm) {
  var d = new Date(tm);
  // getMonth() is zero-based, so January has to sit at index 0.
  var months = ["January", "February", "March", "April", "May", "June", "July", "August",
                "September", "October", "November", "December"];
  return months[d.getMonth()] + " " + d.getDate() + ", " + d.getFullYear();
}

var files = fs.readFileSync("order", "utf8").split("\n").filter(function(x) { return x; });
var max_modified = 0;
var sections = [];

// Querying git for modified dates has to be done async in node it seems...
var queried = 0;
for (var i = 0; i < files.length; ++i)
  (function(i) { // Make lexical i stable
    // The names in `order` lack the extension, but git tracks the .md files.
    fileDates(files[i] + ".md", function(ctime, mtime) {
      sections[i] = {
        text: fs.readFileSync(files[i] + ".md", "utf8"),
        ctime: ctime, mtime: mtime,
        name: files[i],
      };
      max_modified = Math.max(mtime, max_modified);
      if (++queried == files.length) buildTutorial();
    });
  })(i);

function htmlName(i) { return sections[i].name + ".html"; }

function buildTutorial() {
  var index = head("Rust language tutorial") + "<div id='content'>" +
    markdown(fs.readFileSync("index.md", "utf8")) + "<ol>";
  for (var i = 0; i < sections.length; ++i) {
    var s = sections[i];
    var html = htmlName(i);
    var title = s.text.match(/^# (.*)\n/)[1];
    index += '<li><a href="' + html + '">' + title + "</a></li>";

    var nav = '<p class="head">Section ' + (i + 1) + ' of the Rust language tutorial.<br>';
    if (i > 0) nav += '<a href="' + htmlName(i-1) + '">« Section ' + i + "</a> | ";
    nav += '<a href="index.html">Index</a>';
    if (i + 1 < sections.length) nav += ' | <a href="' + htmlName(i+1) + '">Section ' + (i + 2) + " »</a>";
    nav += "</p>";
    fs.writeFileSync("web/" + html, head(title) + nav + '<div id="content">' + markdown(s.text) + "</div>" +
                     nav + foot(s.ctime, s.mtime) + "</body></html>");
  }
  index += "</ol></div>" + foot(null, max_modified) + "</body></html>";
  fs.writeFileSync("web/index.html", index);
}

doc/tutorial/control.md

+169
@@ -0,0 +1,169 @@
# Control structures

## Conditionals

We've seen `if` pass by a few times already. To recap, braces are
compulsory, an optional `else` clause can be appended, and multiple
`if`/`else` constructs can be chained together:

    if false {
        std::io::println("that's odd");
    } else if true {
        std::io::println("right");
    } else {
        std::io::println("neither true nor false");
    }

The condition given to an `if` construct *must* be of type boolean (no
implicit conversion happens). If the arms return a value, this value
must be of the same type for every arm in which control reaches the
end of the block:

    fn signum(x: int) -> int {
        if x < 0 { -1 }
        else if x > 0 { 1 }
        else { ret 0; }
    }

The `ret` (return) and its semicolon could have been left out without
changing the meaning of this function, but it illustrates that you
will not get a type error in this case, although the last arm doesn't
have type `int`, because control doesn't reach the end of that arm
(`ret` is jumping out of the function).
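
The same typing rule still holds in present-day Rust, and a sketch in that newer syntax may make it easier to try out. It assumes `return` in place of `ret` and `i32` in place of `int`; nothing else about the function changes.

```rust
// Present-day Rust sketch; `ret` is spelled `return` and `int` becomes `i32`.
fn signum(x: i32) -> i32 {
    if x < 0 {
        -1
    } else if x > 0 {
        1
    } else {
        // Control leaves the function here, so this arm never yields a
        // value and places no constraint on the type of the `if`.
        return 0;
    }
}
```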

## Pattern matching

Rust's `alt` construct is a generalized, cleaned-up version of C's
`switch` construct. You provide it with a value and a number of arms,
each labelled with a pattern, and it will execute the arm that matches
the value.

    alt my_number {
        0       { std::io::println("zero"); }
        1 | 2   { std::io::println("one or two"); }
        3 to 10 { std::io::println("three to ten"); }
        _       { std::io::println("something else"); }
    }

There is no 'falling through' between arms, as in C. Only one arm is
executed, and it doesn't have to explicitly `break` out of the
construct when it is finished.

The part to the left of each arm is called the pattern. Literals are
valid patterns, and will match only their own value. The pipe operator
(`|`) can be used to assign multiple patterns to a single arm. Ranges
of numeric literal patterns can be expressed with `to`. The underscore
(`_`) is a wildcard pattern that matches everything.

If the arm with the wildcard pattern were left off in the above
example, running it on a number greater than ten (or negative) would
cause a run-time failure. When no arm matches, `alt` constructs do not
silently fall through; they blow up instead.
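
For readers who want to experiment, here is a sketch of the same arms in present-day Rust. It assumes the newer spelling (`match` for `alt`, `=>` before each arm body, `3..=10` for `3 to 10`), and `describe` is a made-up wrapper that returns the text instead of printing it.

```rust
// Present-day Rust sketch of the `alt my_number` example.
fn describe(n: i32) -> &'static str {
    match n {
        0 => "zero",
        1 | 2 => "one or two",
        3..=10 => "three to ten",
        _ => "something else",
    }
}
```

One difference is worth knowing: a present-day compiler rejects a `match` that lacks the wildcard arm at compile time, rather than letting it fail at run time as described above.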

A powerful application of pattern matching is *destructuring*, where
you use the matching to get at the contents of data types. Remember
that `(float, float)` is a tuple of two floats:

    fn angle(vec: (float, float)) -> float {
        alt vec {
            (0f, y) when y < 0f { 1.5 * std::math::pi }
            (0f, y) { 0.5 * std::math::pi }
            (x, y) { std::math::atan(y / x) }
        }
    }

A variable name in a pattern matches everything, *and* binds that name
to the value of the matched thing inside of the arm block. Thus, `(0f,
y)` matches any tuple whose first element is zero, and binds `y` to
the second element. `(x, y)` matches any tuple, and binds both
elements to a variable.

Any `alt` arm can have a guard clause (written `when EXPR`), which is
an expression of type `bool` that determines, after the pattern is
found to match, whether the arm is taken or not. The variables bound
by the pattern are available in this guard expression.
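
Destructuring and guards can be tried out with the following sketch in present-day Rust, where a guard is written `if EXPR` rather than `when EXPR` and `float` becomes `f64`. As a choice made for this sketch only, the comparison with zero is moved into the guards instead of using a float literal as a pattern.

```rust
use std::f64::consts::PI;

// Present-day Rust sketch of the `angle` example.
fn angle(vec: (f64, f64)) -> f64 {
    match vec {
        // Each arm binds the tuple's elements, then the guard decides
        // whether the arm is taken.
        (x, y) if x == 0.0 && y < 0.0 => 1.5 * PI,
        (x, _) if x == 0.0 => 0.5 * PI,
        (x, y) => (y / x).atan(),
    }
}
```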

## Destructuring let

To a limited extent, it is possible to use destructuring patterns when
declaring a variable with `let`. For example, you can say this to
extract the fields from a tuple:

    let (a, b) = get_tuple_of_two_ints();

This will introduce two new variables, `a` and `b`, bound to the
content of the tuple.

You may only use irrefutable patterns in let bindings, though. Things
like literals, which only match a specific value, are not allowed.

## Loops

`while` produces a loop that runs as long as its given condition
(which must have type `bool`) evaluates to true. Inside a loop, the
keyword `break` can be used to abort the loop, and `cont` can be used
to abort the current iteration and continue with the next.

    let x = 5;
    while true {
        x += x - 3;
        if x % 5 == 0 { break; }
        std::io::println(std::int::str(x));
    }

This code prints out a weird sequence of numbers and stops as soon as
it finds one that can be divided by five.
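
To see which numbers the loop produces, it can be sketched in present-day Rust. The version below is an adaptation, not the tutorial's own code: it collects the values that the original would print into a vector, uses `loop` where the original writes `while true`, and `weird_sequence` is a made-up name.

```rust
// Present-day Rust sketch of the loop above.
fn weird_sequence() -> Vec<i32> {
    let mut x = 5; // present-day Rust needs `mut` for a reassigned local
    let mut printed = Vec::new();
    loop {
        x += x - 3;
        if x % 5 == 0 {
            break;
        }
        printed.push(x);
    }
    printed
}
```

Starting from 5, `x` becomes 7, 11, 19 and then 35, at which point the loop stops, so the sequence is 7, 11, 19.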

When iterating over a vector, use `for` instead.

    for elt in ["red", "green", "blue"] {
        std::io::println(elt);
    }

This will go over each element in the given vector (a three-element
vector of strings, in this case), and repeatedly execute the body with
`elt` bound to the current element. You may add an optional type
declaration (`elt: str`) for the iteration variable if you want.

For more involved iteration, such as going over the elements of a hash
table, Rust uses higher-order functions. We'll come back to those in a
moment.

## Failure

The `fail` keyword causes the current [task][tasks] to fail. You use
it to indicate unexpected failure, much like you'd use `exit(1)` in a
C program, except that in Rust, it is possible for other tasks to
handle the failure, allowing the program to continue running.

`fail` takes an optional argument, which must have type `str`. Trying
to access a vector out of bounds, or running a pattern match with no
matching clauses, both result in the equivalent of a `fail`.
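
The idea that a failure stays contained and can be handled elsewhere can be sketched in present-day Rust, under two assumptions that are not part of this tutorial: `fail` is spelled `panic!`, and an operating-system thread stands in for a task. `worker_failed` is a made-up helper.

```rust
use std::thread;

// Present-day Rust sketch: runs an out-of-bounds-prone lookup in a
// separate thread and reports whether that thread failed.
fn worker_failed(index: usize) -> bool {
    let worker = thread::spawn(move || {
        let v = vec![1, 2, 3];
        // Indexing out of bounds fails just like an explicit `panic!`.
        v[index]
    });
    // `Err` means the worker failed; the calling thread keeps running.
    worker.join().is_err()
}
```

With an index of 10 the worker fails and the function returns `true`; with an index of 1 it returns `false`. In both cases the caller carries on.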

[tasks]: FIXME

## Logging

Rust has a built-in logging mechanism, using the `log` statement.
Logging is polymorphic: any type of value can be logged, and the
runtime will do its best to output a textual representation of the
value.

    log "hi";
    log (1, [2.5, -1.8]);

By default, you *will not* see the output of your log statements. The
environment variable `RUST_LOG` controls which log statements actually
get output. It can contain a comma-separated list of paths for modules
that should be logged. For example, running `rustc` with
`RUST_LOG=rustc::front::attr` will turn on logging in its attribute
parser. If you compile a program `foo.rs`, you can set `RUST_LOG` to
`foo` to enable its logging.

Turned-off `log` statements impose minimal overhead on the code that
contains them, so except in code that needs to be really, really fast,
you should feel free to scatter around debug logging statements, and
leave them in.

For interactive debugging, you often want unconditional logging. For
this, use `log_err` instead of `log` [FIXME better name].
