| author | Niko Matsakis <niko@alum.mit.edu> | 2012-01-10 19:41:57 -0800 |
|---|---|---|
| committer | Niko Matsakis <niko@alum.mit.edu> | 2012-01-10 19:57:00 -0800 |
| commit | ef895b96320a9d5c64090bad1c8a147b0431eef1 (patch) | |
| tree | 416e6d05266fe8d3afcb199024121f0c61e8aa0a /doc/tutorial | |
| parent | 441a42c5d2707ae93a8be3c6be5c426e7416e50b (diff) | |
update various parts of the tutorial
Diffstat (limited to 'doc/tutorial')
| -rw-r--r-- | doc/tutorial/data.md | 4 | ||||
| -rw-r--r-- | doc/tutorial/func.md | 168 | ||||
| -rw-r--r-- | doc/tutorial/generic.md | 4 | ||||
| -rw-r--r-- | doc/tutorial/task.md | 75 |
4 files changed, 202 insertions, 49 deletions
diff --git a/doc/tutorial/data.md b/doc/tutorial/data.md index 661fb2f0c6d..1906aac05f4 100644 --- a/doc/tutorial/data.md +++ b/doc/tutorial/data.md @@ -188,6 +188,8 @@ All pointer types can be dereferenced with the `*` unary operator. ### Shared boxes +<a name="shared-box"></a> + Shared boxes are pointers to heap-allocated, reference counted memory. A cycle collector ensures that circular references do not result in memory leaks. @@ -207,6 +209,8 @@ Shared boxes never cross task boundaries. ### Unique boxes +<a name="unique-box"></a> + In contrast to shared boxes, unique boxes are not reference counted. Instead, it is statically guaranteed that only a single owner of the box exists at any time. diff --git a/doc/tutorial/func.md b/doc/tutorial/func.md index 15811da04b3..f7397204f5c 100644 --- a/doc/tutorial/func.md +++ b/doc/tutorial/func.md @@ -29,48 +29,86 @@ expected to return. ## Closures -Normal Rust functions (declared with `fn`) do not close over their -environment. A `lambda` expression can be used to create a closure. - - fn make_plus_function(x: int) -> lambda(int) -> int { - lambda(y: int) -> int { x + y } - } - let plus_two = make_plus_function(2); - assert plus_two(3) == 5; - -A `lambda` function *copies* its environment (in this case, the -binding for `x`). It can not mutate the closed-over bindings, and will -not see changes made to these variables after the `lambda` was -evaluated. `lambda`s can be put in data structures and passed around -without limitation. - -The type of a closure is `lambda(args) -> type`, as opposed to -`fn(args) -> type`. The `fn` type stands for 'bare' functions, with no -closure attached. Keep this in mind when writing higher-order -functions. - -A different form of closure is the block. Blocks are written like they -are in Ruby: `{|x| x + y}`, the formal parameters between pipes, -followed by the function body. 
They are stack-allocated and properly -close over their environment (they see updates to closed over -variables, for example). But blocks can only be used in a limited set -of circumstances. They can be passed to other functions, but not -stored in data structures or returned. - - fn map_int(f: block(int) -> int, vec: [int]) -> [int] { - let result = []; - for i in vec { result += [f(i)]; } - ret result; - } - map_int({|x| x + 1 }, [1, 2, 3]); - -The type of blocks is spelled `block(args) -> type`. Both closures and -bare functions are automatically convert to `block`s when appropriate. -Most higher-order functions should take their function arguments as -`block`s. - -A block with no arguments is written `{|| body(); }`—you can not leave -off the pipes. +Named rust functions, like those in the previous section, do not close +over their environment. Rust also includes support for closures, which +are anonymous functions that can access the variables that were in +scope at the time the closure was created. Closures are represented +as the pair of a function pointer (as in C) and the environment, which +is where the values of the closed over variables are stored. Rust +includes support for three varieties of closure, each with different +costs and capabilities: + +- Stack closures (written `block`) store their environment in the + stack frame of their creator; they are very lightweight but cannot + be stored in a data structure. +- Boxed closures (written `fn@`) store their environment in a + [shared box](data#shared-box). These are good for storing within + data structures but cannot be sent to another task. +- Unique closures (written `fn~`) store their environment in a + [unique box](data#unique-box). These are limited in the kinds of + data that they can close over so that they can be safely sent + between tasks. As with any unique pointer, copying a unique closure + results in a deep clone of the environment. 
+ +Both boxed closures and unique closures are subtypes of stack +closures, meaning that wherever a stack closure type appears, a boxed +or unique closure value can be used. This is due to the restrictions +placed on the use of stack closures, which ensure that all operations +on a stack closure are also safe on any kind of closure. + +### Working with closures + +Closures are specified by writing an inline, anonymous function +declaration. For example, the following code creates a boxed closure: + + let plus_two = fn@(x: int) -> int { + ret x + 2; + }; + +Creating a unique closure is very similar: + + let plus_two_uniq = fn~(x: int) -> int { + ret x + 2; + }; + +Stack closures can be created in a similar way; however, because stack +closures literally point into their creator's stack frame, they can +only be used in a very specific way. Stack closures may be passed as +parameters and they may be called, but they may not be stored into +local variables or fields. Creating a stack closure can therefore be +done using a syntax like the following: + + let doubled = vec::map([1, 2, 3], block(x: int) -> int { + x * 2 + }); + +Here the `vec::map()` is the standard higher-order map function, which +applies the closure to each item in the vector and returns a new +vector containing the results. + +### Shorthand syntax + +The syntax in the previous section was very explicit; it fully +specifies the kind of closure as well as the type of every parameter +and the return type. In practice, however, closures are often used as +parameters to functions, and all of these details can be inferred. +Therefore, we support a shorthand syntax similar to Ruby or Smalltalk +blocks, which looks as follows: + + let doubled = vec::map([1, 2, 3], {|x| x*2}); + +Here the vertical bars after the open brace `{` indicate that this is +a closure. A list of parameters appears between the bars. The bars +must always be present: if there are no arguments, then simply write +`{||...}`. 
+ +As a further simplification, if the final parameter to a function is a +closure, the closure need not be placed within parenthesis. +Therefore, one could write + + let doubled = vec::map([1, 2, 3]) {|x| x*2}; + +This form is often easier to parse as it involves less nesting. ## Binding @@ -79,8 +117,8 @@ Partial application is done using the `bind` keyword in Rust. let daynum = bind std::vec::position(_, ["mo", "tu", "we", "do", "fr", "sa", "su"]); -Binding a function produces a closure (`lambda` type) in which some of -the arguments to the bound function have already been provided. +Binding a function produces a boxed closure (`fn@` type) in which some +of the arguments to the bound function have already been provided. `daynum` will be a function taking a single string argument, and returning the day of the week that string corresponds to (if any). @@ -103,11 +141,47 @@ To run such an iteration, you could do this: # fn for_rev(v: [int], act: block(int)) {} for_rev([1, 2, 3], {|n| log n; }); -But Rust allows a more pleasant syntax for this situation, with the -loop block moved out of the parenthesis and the final semicolon -omitted: +Making use of the shorthand where a final closure argument can be +moved outside of the parentheses permits the following, which +looks quite like a normal loop: # fn for_rev(v: [int], act: block(int)) {} for_rev([1, 2, 3]) {|n| log n; } + +Note that, because `for_rev()` returns unit type, no semicolon is +needed when the final closure is pulled outside of the parentheses. + +## Capture clauses + +When creating a boxed or unique closure, the default is to copy in the +values of any closed over variables. But sometimes, particularly if a +value is large or expensive to copy, you would like to *move* the +value into the closure instead. Rust supports this via the use of a +capture clause, which lets you specify precisely whether each variable +used in the closure is copied or moved. 
+ +As an example, let's assume we had some type of unique tree type: + + tag tree<T> = tree_rec<T>; + type tree_rec<T> = ~{left: option<tree>, right: option<tree>, val: T}; + +Now if we have a function like the following: + + let some_tree: tree<T> = ...; + let some_closure = fn~() { + ... use some_tree in some way ... + }; + +Here the variable `some_tree` is used within the closure body, so a +deep copy will be performed. This can become quite expensive if the +tree is large. If we know that `some_tree` will not be used again, +we could avoid this expense by making use of a capture clause like so: + + let some_tree: tree<T> = ...; + let some_closure = fn~[move some_tree]() { + ... use some_tree in some way ... + }; + +This is particularly useful when moving data into [child tasks](task). diff --git a/doc/tutorial/generic.md b/doc/tutorial/generic.md index 02d37fd0e9b..c22d4016c40 100644 --- a/doc/tutorial/generic.md +++ b/doc/tutorial/generic.md @@ -87,6 +87,8 @@ without any sophistication). ## Kinds +<a name="kind"></a> + Perhaps surprisingly, the 'copy' (duplicate) operation is not defined for all Rust types. Resource types (types with destructors) can not be copied, and neither can any type whose copying would require copying a @@ -100,7 +102,7 @@ unless you explicitly declare that type parameter to have copyable // This does not compile fn head_bad<T>(v: [T]) -> T { v[0] } // This does - fn head<copy T>(v: [T]) -> T { v[0] } + fn head<T:copy>(v: [T]) -> T { v[0] } When instantiating a generic function, you can only instantiate it with types that fit its kinds. So you could not apply `head` to a diff --git a/doc/tutorial/task.md b/doc/tutorial/task.md index 50a7ad193ec..b373e27dc21 100644 --- a/doc/tutorial/task.md +++ b/doc/tutorial/task.md @@ -1,3 +1,76 @@ # Tasks -FIXME to be written +Rust supports a system of lightweight tasks, similar to what is found +in Erlang or other actor systems. Rust tasks communicate via messages +and do not share data. 
However, it is possible to send data without +copying it by making use of [unique boxes][uniques] (still, the data +is owned by only one task at a time). + +[uniques]: data.html#unique-box + +NOTE: As Rust evolves, we expect the Task API to grow and change +somewhat. The tutorial documents the API as it exists today. + +## Spawning a task + +Spawning a task is done using the various spawn functions in the +module task. Let's begin with the simplest one, `task::spawn()`, and +later move on to the others: + + let some_value = 22; + let child_task = task::spawn {|| + std::io::println("This executes in the child task."); + std::io::println(#fmt("%d", some_value)); + }; + +The argument to `task::spawn()` is a [unique closure](func) of type +`fn~()`, meaning that it takes no arguments and generates no return +value. The effect of `task::spawn()` is to fire up a child task that +will execute the closure in parallel with the creator. The result is +a task id, here stored into the variable `child_task`. + +## Ports and channels + +Now that we have spawned a child task, it would be nice if we could +communicate with it. This is done by creating a *port* with an +associated *channel*. A port is simply a location to receive messages +of a particular type. A channel is used to send messages to a port. +For example, imagine we wish to perform two expensive computations +in parallel. We might write something like: + + let port = comm::port::<int>(); + let chan = comm::chan::<int>(port); + let child_task = task::spawn {|| + let result = some_expensive_computation(); + comm::send(chan, result); + }; + some_other_expensive_computation(); + let result = comm::recv(port); + +Let's walk through this code line-by-line. The first line creates a +port for receiving integers: + + let port = comm::port::<int>(); + +This port is where we will receive the message from the child task +once it is complete. 
The second line creates a channel for sending +integers to the port `port`: + + let chan = comm::chan::<int>(port); + +The channel will be used by the child to send a message to the port. +The next statement actually spawns the child: + + let child_task = task::spawn {|| + let result = some_expensive_computation(); + comm::send(chan, result); + }; + +This child will perform the expensive computation send the result +over the channel. Finally, the parent continues by performing +some other expensive computation and then waiting for the child's result +to arrive on the port: + + some_other_expensive_computation(); + let result = comm::recv(port); + |
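The `task::spawn`, `comm::port` and `comm::chan` calls in the diff above belong to the pre-1.0 API that this commit documents. As a rough present-day analogue, and only as a sketch, the same port-and-channel pattern can be written with `std::thread` and `std::sync::mpsc` from the current standard library. The bodies of the two computation functions and the name `run_in_parallel` are hypothetical stand-ins, not part of the original tutorial.

```rust
use std::sync::mpsc;
use std::thread;

// Hypothetical stand-ins for the computations named in the tutorial text.
fn some_expensive_computation() -> i32 {
    (1..=10).sum()
}

fn some_other_expensive_computation() -> i32 {
    (1..=5).product()
}

// Runs both computations in parallel and returns (child_result, parent_result).
fn run_in_parallel() -> (i32, i32) {
    // `channel()` creates both endpoints at once: the sender plays the role
    // of the tutorial's channel, the receiver the role of its port.
    let (chan, port) = mpsc::channel::<i32>();

    // `move` transfers ownership of `chan` into the child's closure, loosely
    // comparable to a `[move ...]` capture clause on a unique closure.
    let child = thread::spawn(move || {
        let result = some_expensive_computation();
        chan.send(result).expect("receiver was dropped");
    });

    // The parent keeps working, then blocks until the child's message arrives.
    let other = some_other_expensive_computation();
    let result = port.recv().expect("sender was dropped");
    child.join().expect("child thread panicked");
    (result, other)
}
```

Calling `run_in_parallel()` from a `main` function yields the child's result paired with the parent's. Unlike the 2012 API, the modern one returns sender and receiver together rather than deriving a channel from an existing port, and `thread::spawn` hands back a join handle rather than a task id.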
