Commit 1199935

Fill in the foreign-function part of the tutorial
1 parent 4fec179 commit 1199935

5 files changed

+191
-2
lines changed


doc/tutorial/ext.md

+3
@@ -0,0 +1,3 @@
1+
# Syntax extension
2+
3+
FIXME to be written

doc/tutorial/ffi.md

+184-1
@@ -1,3 +1,186 @@
11
# Interacting with foreign code
22

3-
FIXME to be written
3+
One of Rust's aims, as a systems programming language, is to
4+
interoperate well with C code.
5+
6+
We'll start with an example. It's a bit bigger than usual, and
7+
contains a number of new concepts. We'll go over it one piece at a
8+
time.
9+
10+
This is a program that uses OpenSSL's `SHA1` function to compute the
11+
hash of its first command-line argument, which it then converts to a
12+
hexadecimal string and prints to standard output. If you have the
13+
OpenSSL libraries installed, it should 'just work'.
14+
15+
use std;
16+
import std::{vec, str};
17+
18+
native "cdecl" mod ssl {
19+
fn SHA1(src: *u8, sz: uint, out: *u8) -> *u8;
20+
}
21+
22+
fn as_hex(data: [u8]) -> str {
23+
let acc = "";
24+
for byte in data { acc += #fmt("%02x", byte as uint); }
25+
ret acc;
26+
}
27+
28+
fn sha1(data: str) -> str unsafe {
29+
let bytes = str::bytes(data);
30+
let hash = ssl::SHA1(vec::unsafe::to_ptr(bytes),
31+
vec::len(bytes), std::ptr::null());
32+
ret as_hex(vec::unsafe::from_buf(hash, 20u));
33+
}
34+
35+
fn main(args: [str]) {
36+
std::io::println(sha1(args[1]));
37+
}
38+
39+
## Native modules
40+
41+
Before we can call `SHA1`, we have to declare it. That is what this
42+
part of the program is responsible for:
43+
44+
native "cdecl" mod ssl {
45+
fn SHA1(src: *u8, sz: uint, out: *u8) -> *u8;
46+
}
47+
48+
A `native` module declaration tells the compiler that the program
49+
should be linked with a library by that name, and that the given list
50+
of functions is available in that library.
51+
52+
In this case, the compiler will map the name `ssl` to a shared library name in
53+
a platform-specific way (`libssl.so` on Linux, for example), and link
54+
that in. If you want the module to have a different name from the
55+
actual library, you can say `native "cdecl" mod something = "ssl" {
56+
... }`.
57+
58+
The `"cdecl"` string indicates the calling convention to use for
59+
functions in this module. Most C libraries use cdecl as their calling
60+
convention. You can also specify `"x86stdcall"` to use stdcall
61+
instead.
62+
63+
FIXME: Mention c-stack variants? Are they going to change?
64+
65+
## Unsafe pointers
66+
67+
The native `SHA1` function is declared to take three arguments, and
68+
return a pointer.
69+
70+
fn SHA1(src: *u8, sz: uint, out: *u8) -> *u8;
71+
72+
When declaring the argument types to a foreign function, the Rust
73+
compiler has no way to check whether your declaration is correct, so
74+
you have to be careful. If you get the number or types of the
75+
arguments wrong, you're likely to get a segmentation fault. Or,
76+
probably even worse, your code will work on one platform, but break on
77+
another.
78+
79+
In this case, `SHA1` is defined as taking two `unsigned char*`
80+
arguments and one `unsigned long`. The Rust equivalents are `*u8`
81+
unsafe pointers and a `uint` (which, like `unsigned long`, is a
82+
machine-word-sized type on most platforms).
83+
84+
Unsafe pointers can be created through various functions in the
85+
standard library, usually with `unsafe` somewhere in their name. You can
86+
dereference an unsafe pointer with the `*` operator, but use
87+
caution: unlike Rust's other pointer types, unsafe pointers are
88+
completely unmanaged, so they might point at invalid memory, or be
89+
null pointers.
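
The caution above can be illustrated with a minimal sketch in modern Rust syntax, which differs from the dialect this tutorial was written in (today's `*const u8` plays roughly the role of `*u8`). The helper `first_byte_via_pointer` is a hypothetical name introduced only for this example.

```rust
/// Hypothetical helper: reads the first byte of a slice through a raw pointer.
fn first_byte_via_pointer(data: &[u8]) -> Option<u8> {
    // Creating a raw pointer is safe; only dereferencing it is not.
    let p: *const u8 = data.as_ptr();
    if data.is_empty() {
        // For an empty slice the pointer must not be dereferenced.
        return None;
    }
    // SAFETY: `data` is non-empty and borrowed for the whole call,
    // so `p` points at a valid, initialized byte.
    Some(unsafe { *p })
}

fn main() {
    println!("{:?}", first_byte_via_pointer(b"abc"));
    println!("{:?}", first_byte_via_pointer(&[]));
}
```

The pointer is only dereferenced while the borrowed slice is guaranteed to be alive, which is the kind of reasoning an unmanaged pointer leaves to the programmer.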
90+
91+
## Unsafe blocks
92+
93+
The `sha1` function is the most obscure part of the program.
94+
95+
fn sha1(data: str) -> str unsafe {
96+
let bytes = str::bytes(data);
97+
let hash = ssl::SHA1(vec::unsafe::to_ptr(bytes),
98+
vec::len(bytes), std::ptr::null());
99+
ret as_hex(vec::unsafe::from_buf(hash, 20u));
100+
}
101+
102+
Firstly, what does the `unsafe` keyword at the top of the function
103+
mean? `unsafe` is a block modifier: it marks the block following it
104+
as unsafe.
105+
106+
Some operations, like dereferencing unsafe pointers or calling
107+
functions that have been marked unsafe, are only allowed inside unsafe
108+
blocks. With the `unsafe` keyword, you're telling the compiler 'I know
109+
what I'm doing'. The main motivation for such an annotation is that
110+
when you have a memory error (and you will, if you're using unsafe
111+
constructs), you have some idea where to look: it will most likely be
112+
caused by some unsafe code.
113+
114+
Unsafe blocks isolate unsafety. Unsafe functions, on the other hand,
115+
advertise it to the world. An unsafe function is written like this:
116+
117+
unsafe fn kaboom() { log "I'm harmless!"; }
118+
119+
This function can only be called from an unsafe block or another
120+
unsafe function.
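
For comparison, here is a small sketch of the same distinction in modern Rust syntax (not the dialect used in this tutorial). `call_kaboom` is a hypothetical wrapper added for illustration, and `kaboom` returns its message instead of logging it so the result can be checked.

```rust
/// An unsafe function advertises that callers must uphold some contract.
/// This one has no real preconditions; it only mirrors the tutorial's example.
unsafe fn kaboom() -> &'static str {
    "I'm harmless!"
}

/// Hypothetical safe wrapper: the unsafe block isolates the unsafe call,
/// so callers of `call_kaboom` need no `unsafe` of their own.
fn call_kaboom() -> &'static str {
    // SAFETY: `kaboom` has no preconditions to violate.
    unsafe { kaboom() }
}

fn main() {
    println!("{}", call_kaboom());
}
```

Calling `kaboom()` outside an unsafe block or unsafe function is rejected by the compiler, while `call_kaboom()` can be called from anywhere.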
121+
122+
## Pointer fiddling
123+
124+
The standard library defines a number of helper functions for dealing
125+
with unsafe data, casting between types, and generally subverting
126+
Rust's safety mechanisms.
127+
128+
Let's look at our `sha1` function again.
129+
130+
let bytes = str::bytes(data);
131+
let hash = ssl::SHA1(vec::unsafe::to_ptr(bytes),
132+
vec::len(bytes), std::ptr::null());
133+
ret as_hex(vec::unsafe::from_buf(hash, 20u));
134+
135+
The `str::bytes` function is perfectly safe; it converts a string to
136+
a `[u8]`. This byte array is then fed to `vec::unsafe::to_ptr`, which
137+
returns an unsafe pointer to its contents.
138+
139+
This pointer will become invalid as soon as the vector it points into
140+
is cleaned up, so you should be very careful how you use it. In this
141+
case, the local variable `bytes` outlives the pointer, so we're good.
142+
143+
Passing a null pointer as the third argument to `SHA1` causes it to use a
144+
static buffer, which saves us the effort of allocating memory
145+
ourselves. `ptr::null` is a generic function that will return an
146+
unsafe null pointer of the correct type (Rust generics are awesome
147+
like that: they can take the right form depending on the type that they
148+
are expected to return).
149+
150+
Finally, `vec::unsafe::from_buf` builds up a new `[u8]` from the
151+
unsafe pointer that was returned by `SHA1`. SHA1 digests are always
152+
twenty bytes long, so we can pass `20u` for the length of the new
153+
vector.
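
The pointer round trip described above can be sketched in modern Rust syntax, which is an assumption of this example rather than the dialect of the tutorial. No OpenSSL call is made here: `from_buf` is a hypothetical helper named after the tutorial's `vec::unsafe::from_buf`, and the pointer comes from a local vector instead of from `SHA1`.

```rust
use std::slice;

/// Hypothetical helper named after the tutorial's `vec::unsafe::from_buf`.
///
/// # Safety
/// `ptr` must point at `len` readable, initialized bytes.
unsafe fn from_buf(ptr: *const u8, len: usize) -> Vec<u8> {
    // Borrow the memory as a slice, then copy it into an owned vector.
    let view = unsafe { slice::from_raw_parts(ptr, len) };
    view.to_vec()
}

/// Formats each byte as two lowercase hexadecimal digits.
fn as_hex(data: &[u8]) -> String {
    data.iter().map(|byte| format!("{:02x}", byte)).collect()
}

fn main() {
    let bytes: Vec<u8> = "abc".as_bytes().to_vec();
    let ptr = bytes.as_ptr();
    // `bytes` is still alive here, so `ptr` is valid for `bytes.len()` bytes.
    let copy = unsafe { from_buf(ptr, bytes.len()) };
    println!("{}", as_hex(&copy)); // prints 616263
}
```

Copying the bytes into an owned vector means the result stays valid even after the memory behind the pointer is reused, which matters when the source is a static buffer such as the one `SHA1` returns.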
154+
155+
## Passing structures
156+
157+
C functions often take pointers to structs as arguments. Since Rust
158+
records are binary-compatible with C structs, Rust programs can call
159+
such functions directly.
160+
161+
This program uses the POSIX function `gettimeofday` to get a
162+
microsecond-resolution timer.
163+
164+
use std;
165+
type timeval = {tv_sec: u32, tv_usec: u32};
166+
native "cdecl" mod libc = "" {
167+
fn gettimeofday(tv: *mutable timeval, tz: *()) -> i32;
168+
}
169+
fn unix_time_in_microseconds() -> u64 unsafe {
170+
let x = {tv_sec: 0u32, tv_usec: 0u32};
171+
libc::gettimeofday(std::ptr::addr_of(x), std::ptr::null());
172+
ret (x.tv_sec as u64) * 1000_000_u64 + (x.tv_usec as u64);
173+
}
174+
175+
The `libc = ""` sets the library name of the native module to the empty string
176+
to prevent the Rust compiler from trying to link it. The standard C
177+
library is already linked with Rust programs.
178+
179+
A `timeval`, in C, is a struct with two integers (32 bits each on the platform assumed here). Thus, we
180+
define a record type with the same contents, and declare
181+
`gettimeofday` to take a pointer to such a record.
182+
183+
The second argument to `gettimeofday` (the time zone) is not used by
184+
this program, so it simply declares it to be a pointer to the nil
185+
type. Since null pointers look the same, no matter which type they are
186+
supposed to point at, this is safe.
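
As a rough illustration of passing a struct by pointer, the sketch below uses modern Rust syntax (an assumption, not the dialect of this tutorial). To stay self-contained it does not call the real `gettimeofday`: `fake_gettimeofday` is a hypothetical stand-in written in Rust with the C calling convention, and the values it writes are made up. The field widths follow the tutorial's 32-bit assumption.

```rust
/// `#[repr(C)]` requests the field layout a C compiler would use.
#[repr(C)]
#[derive(Debug, Default, Clone, Copy)]
struct Timeval {
    tv_sec: u32,
    tv_usec: u32,
}

/// Hypothetical stand-in for a C function that fills in a struct.
///
/// # Safety
/// `tv` must be null or point at a writable `Timeval`.
unsafe extern "C" fn fake_gettimeofday(tv: *mut Timeval) -> i32 {
    if tv.is_null() {
        return -1;
    }
    // Write made-up values through the pointer, as a C callee would.
    unsafe {
        (*tv).tv_sec = 12;
        (*tv).tv_usec = 345_678;
    }
    0
}

/// Combines seconds and microseconds into one microsecond count.
fn to_microseconds(tv: &Timeval) -> u64 {
    (tv.tv_sec as u64) * 1_000_000 + (tv.tv_usec as u64)
}

fn main() {
    let mut tv = Timeval::default();
    // `&mut tv` coerces to `*mut Timeval` at the call site.
    let status = unsafe { fake_gettimeofday(&mut tv) };
    println!("status {}, {} microseconds", status, to_microseconds(&tv));
}
```

When binding the real function, the struct's field types need to match the platform's C definition of `timeval`, whose field widths differ between systems.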

doc/tutorial/order

+1
@@ -8,5 +8,6 @@ args
88
generic
99
mod
1010
ffi
11+
ext
1112
task
1213
test

doc/tutorial/syntax.md

+2
@@ -1,5 +1,7 @@
11
# Syntax Basics
22

3+
FIXME: briefly mention syntax extensions, #fmt
4+
35
## Braces
46

57
Assuming you've programmed in any C-family language (C++, Java,

doc/tutorial/web/default.css

+1-1
@@ -4,7 +4,7 @@
44
.cm-s-default span.cm-def {color: #00f;}
55
.cm-s-default span.cm-variable {color: black;}
66
.cm-s-default span.cm-variable-2 {color: #05a;}
7-
.cm-s-default span.cm-variable-3 {color: #0a5;}
7+
.cm-s-default span.cm-variable-3 {color: #085;}
88
.cm-s-default span.cm-property {color: black;}
99
.cm-s-default span.cm-operator {color: black;}
1010
.cm-s-default span.cm-comment {color: #a50;}
