From 64dad2cb0333e12d0e5b4bb4e9b013af68f14156 Mon Sep 17 00:00:00 2001 From: Alan Andrade Date: Mon, 12 May 2014 10:31:13 -0700 Subject: Cleanup lifetime guide Clean pointers guide --- src/doc/guide-pointers.md | 171 +++++++--------------------------------------- 1 file changed, 23 insertions(+), 148 deletions(-) (limited to 'src/doc/guide-pointers.md') diff --git a/src/doc/guide-pointers.md b/src/doc/guide-pointers.md index 948d033e06c..bee1dbcd2ce 100644 --- a/src/doc/guide-pointers.md +++ b/src/doc/guide-pointers.md @@ -5,7 +5,7 @@ are also one of the more confusing topics for newcomers to Rust. They can also be confusing for people coming from other languages that support pointers, such as C++. This guide will help you understand this important topic. -# You don't actually need pointers +# You don't actually need pointers, use references I have good news for you: you probably don't need to care about pointers, especially as you're getting started. Think of it this way: Rust is a language @@ -37,7 +37,7 @@ error: mismatched types: expected `&int` but found `<generic integer #0>` (expec What gives? It needs a pointer! Therefore I have to use pointers!" -Turns out, you don't. All you need is a reference. Try this on for size: +Turns out, you don't. __All you need is a reference__. Try this on for size: ~~~rust # fn succ(x: &int) -> int { *x + 1 } @@ -74,22 +74,22 @@ Here are the use-cases for pointers. I've prefixed them with the name of the pointer that satisfies that use-case: 1. Owned: `Box<Trait>` must be a pointer, because you don't know the size of the -object, so indirection is mandatory. +object, so indirection is mandatory. Notation might change once Rust +support DST fully so we recommend you stay tuned. + 2. Owned: You need a recursive data structure. These can be infinite sized, so indirection is mandatory. + 3. Owned: A very, very, very rare situation in which you have a *huge* chunk of data that you wish to pass to many methods. Passing a pointer will make this more efficient.
If you're coming from another language where this technique is common, such as C++, please read "A note..." below. -4. Managed: Having only a single owner to a piece of data would be inconvenient -or impossible. This is only often useful when a program is very large or very -complicated. Using a managed pointer will activate Rust's garbage collection -mechanism. -5. Reference: You're writing a function, and you need a pointer, but you don't + +4. Reference: You're writing a function, and you need a pointer, but you don't care about its ownership. If you make the argument a reference, callers can send in whatever kind they want. -Five exceptions. That's it. Otherwise, you shouldn't need them. Be sceptical +Four exceptions. That's it. Otherwise, you shouldn't need them. Be sceptical of pointers in Rust: use them for a deliberate purpose, not just to make the compiler happy. @@ -165,6 +165,7 @@ approximation of owned pointers follows: 1. Only one owned pointer may exist to a particular place in memory. It may be borrowed from that owner, however. + 2. The Rust compiler uses static analysis to determine where the pointer is in scope, and handles allocating and de-allocating that memory. Owned pointers are not garbage collected. @@ -204,6 +205,10 @@ The inner lists _must_ be an owned pointer, because we can't know how many elements are in the list. Without knowing the length, we don't know the size, and therefore require the indirection that pointers offer. +> Note: Nil is just part of the List enum and even though is being used +> to represent the concept of "nothing", you shouldn't think of it as +> NULL. Rust doesn't have NULL. + ## Efficiency This should almost never be a concern, but because creating an owned pointer @@ -248,81 +253,6 @@ fn main() { Now it'll be copying a pointer-sized chunk of memory rather than the whole struct. 
-# Managed Pointers - -> **Note**: the `@` form of managed pointers is deprecated and behind a -> feature gate (it requires a `#![feature(managed_pointers)]` attribute on -> the crate root). There are replacements, currently -> there is `std::rc::Rc` and `std::gc::Gc` for shared ownership via reference -> counting and garbage collection respectively. - -Managed pointers, notated by an `@`, are used when having a single owner for -some data isn't convenient or possible. This generally happens when your -program is very large and complicated. - -For example, let's say you're using an owned pointer, and you want to do this: - -~~~rust{.ignore} -struct Point { - x: int, - y: int, -} - -fn main() { - let a = box Point { x: 10, y: 20 }; - let b = a; - println!("{}", b.x); - println!("{}", a.x); -} -~~~ - -You'll get this error: - -~~~ {.notrust} -test.rs:10:20: 10:21 error: use of moved value: `a` -test.rs:10 println!("{}", a.x); - ^ -note: in expansion of format_args! -<std macros>:158:27: 158:81 note: expansion site -<std macros>:157:5: 159:6 note: in expansion of println! -test.rs:10:5: 10:25 note: expansion site -test.rs:8:9: 8:10 note: `a` moved here because it has type `Box<Point>`, which is moved by default (use `ref` to override) -test.rs:8 let b = a; - ^ -~~~ - -As the message says, owned pointers only allow for one owner at a time. When you assign `a` to `b`, `a` becomes invalid. Change your code to this, however: - -~~~rust -struct Point { - x: int, - y: int, -} - -fn main() { - let a = @Point { x: 10, y: 20 }; - let b = a; - println!("{}", b.x); - println!("{}", a.x); -} -~~~ - -And it works: - -~~~ {.notrust} -10 -10 -~~~ - -So why not just use managed pointers everywhere? There are two big drawbacks to -managed pointers: - -1. They activate Rust's garbage collector. Other pointer types don't share this -drawback. -2. You cannot pass this data to another task.
Shared ownership across -concurrency boundaries is the source of endless pain in other languages, so -Rust does not let you do this. - # References References are the third major kind of pointer Rust supports. They are @@ -346,7 +276,7 @@ fn compute_distance(p1: &Point, p2: &Point) -> f32 { } fn main() { - let origin = @Point { x: 0.0, y: 0.0 }; + let origin = &Point { x: 0.0, y: 0.0 }; let p1 = box Point { x: 5.0, y: 3.0 }; println!("{:?}", compute_distance(origin, p1)); @@ -354,8 +284,9 @@ fn main() { ~~~ This prints `5.83095189`. You can see that the `compute_distance` function -takes in two references, but we give it a managed and unique pointer. Of -course, if this were a real program, we wouldn't have any of these pointers, +takes in two references, but we give it a stack allocated reference and an +owned box reference. +Of course, if this were a real program, we wouldn't have any of these pointers, they're just there to demonstrate the concepts. So how is this hard? Well, because we're ignoring ownership, the compiler needs @@ -364,9 +295,11 @@ safety, a reference's representation at runtime is the same as that of an ordinary pointer in a C program. They introduce zero overhead. The compiler does all safety checks at compile time. -This theory is called 'region pointers,' and involve a concept called -'lifetimes'. Here's the simple explanation: would you expect this code to -compile? +This theory is called 'region pointers' and you can read more about it +[here](http://www.cs.umd.edu/projects/cyclone/papers/cyclone-regions.pdf). +Region pointers evolved into what we know today as 'lifetimes'. + +Here's the simple explanation: would you expect this code to compile? ~~~rust{.ignore} fn main() { @@ -428,64 +361,6 @@ hard for a computer, too! There is an entire [guide devoted to references and lifetimes](guide-lifetimes.html) that goes into lifetimes in great detail, so if you want the full details, check that out. 
-# Returning Pointers - -We've talked a lot about functions that accept various kinds of pointers, but -what about returning them? In general, it is better to let the caller decide -how to use a function's output, instead of assuming a certain type of pointer -is best. - -What does that mean? Don't do this: - -~~~rust -fn foo(x: Box<int>) -> Box<int> { - return box *x; -} - -fn main() { - let x = box 5; - let y = foo(x); -} -~~~ - -Do this: - -~~~rust -fn foo(x: Box<int>) -> int { - return *x; -} - -fn main() { - let x = box 5; - let y = box foo(x); -} -~~~ - -This gives you flexibility, without sacrificing performance. For example, this will -also work: - -~~~rust -fn foo(x: Box<int>) -> int { - return *x; -} - -fn main() { - let x = box 5; - let y = @foo(x); -} -~~~ - -You may think that this gives us terrible performance: return a value and then -immediately box it up?!?! Isn't that the worst of both worlds? Rust is smarter -than that. There is no copy in this code. `main` allocates enough room for the -`@int`, passes a pointer to that memory into `foo` as `x`, and then `foo` writes -the value straight into that pointer. This writes the return value directly into -the allocated box. - -This is important enough that it bears repeating: pointers are not for optimizing -returning values from your code. Allow the caller to choose how they want to -use your output.
- # Related Resources -- cgit 1.4.1-3-g733a5 From 99744653d5abb949e446daf0732be79c76aa6f79 Mon Sep 17 00:00:00 2001 From: Alan Andrade Date: Sat, 24 May 2014 13:15:48 -0700 Subject: get over bold text madness, changes per PR, brought the "returning pointers" section back to pointers guide --- src/doc/guide-lifetimes.md | 30 +++++++++++----------- src/doc/guide-pointers.md | 63 ++++++++++++++++++++++++++++++++++++++-------- 2 files changed, 67 insertions(+), 26 deletions(-) (limited to 'src/doc/guide-pointers.md') diff --git a/src/doc/guide-lifetimes.md b/src/doc/guide-lifetimes.md index e0e33337c9a..3c0d8c4797c 100644 --- a/src/doc/guide-lifetimes.md +++ b/src/doc/guide-lifetimes.md @@ -45,12 +45,11 @@ let on_the_heap : Box<Point> = box Point {x: 7.0, y: 9.0}; Suppose we wanted to write a procedure that computed the distance between any two points, no matter where they were stored. One option is to define a function -that takes two arguments of type `Point`—that is, it takes the points __by value__. -But if we define it this way, calling the function will cause the points __to be -copied__. For points, this is probably not so bad, but often copies are -expensive. Worse, if the data type contains mutable fields, copying can change -the semantics of your program in unexpected ways. So we'd like to define -a function that takes the points just as a __reference__/__borrowed pointer__. +that takes two arguments of type `Point`—that is, it takes the points by value. +But if we define it this way, calling the function will cause the points to be +copied. For points, this is probably not so bad, but often copies are +expensive. So we'd like to define a function that takes the points just as +a reference.
~~~ # struct Point {x: f64, y: f64} @@ -62,27 +61,26 @@ fn compute_distance(p1: &Point, p2: &Point) -> f64 { } ~~~ -Now we can call `compute_distance()` +Now we can call `compute_distance()`: ~~~ # struct Point {x: f64, y: f64} # let on_the_stack : Point = Point{x: 3.0, y: 4.0}; # let on_the_heap : Box<Point> = box Point{x: 7.0, y: 9.0}; # fn compute_distance(p1: &Point, p2: &Point) -> f64 { 0.0 } -compute_distance(&on_the_stack, on_the_heap); +compute_distance(&on_the_stack, &*on_the_heap); ~~~ Here, the `&` operator takes the address of the variable `on_the_stack`; this is because `on_the_stack` has the type `Point` (that is, a struct value) and we have to take its address to get a value. We also call this _borrowing_ the local variable -`on_the_stack`, because we have created __an alias__: that is, another +`on_the_stack`, because we have created an alias: that is, another name for the same data. -In contrast, we can pass `on_the_heap` to `compute_distance` directly. -The compiler automatically converts a box like `Box<Point>` to a reference like -`&Point`. This is another form of borrowing: in this case, the caller lends -the contents of the box to the callee. +For the second argument, we need to grab the contents of `on_the_heap` +by using the `*` operator, and then get a reference to that data. In +order to convert `Box<T>` into a `&T`, we need to use `&*`. Whenever a caller lends data to a callee, there are some limitations on what the caller can do with the original. For example, if the contents of a @@ -166,12 +164,12 @@ as well as from the owned box, and then compute the distance between them. # Lifetimes -We’ve seen a few examples of borrowing data. Up till this point, we’ve glossed +We’ve seen a few examples of borrowing data. To this point, we’ve glossed over issues of safety. As stated in the introduction, at runtime a reference is simply a pointer, nothing more. Therefore, avoiding C's problems with dangling pointers requires a compile-time safety check.
The basis for the check is the notion of __lifetimes__. A lifetime is a +The basis for the check is the notion of _lifetimes_. A lifetime is a static approximation of the span of execution during which the pointer is valid: it always corresponds to some expression or block within the program. @@ -324,7 +322,7 @@ circle constant][tau] and not that dreadfully outdated notion of pi). The second match is more interesting. Here we match against a rectangle and extract its size: but rather than copy the `size` -struct, we use a __by-reference binding__ to create a pointer to it. In +struct, we use a by-reference binding to create a pointer to it. In other words, a pattern binding like `ref size` binds the name `size` to a pointer of type `&size` into the _interior of the enum_. diff --git a/src/doc/guide-pointers.md b/src/doc/guide-pointers.md index bee1dbcd2ce..248142851b7 100644 --- a/src/doc/guide-pointers.md +++ b/src/doc/guide-pointers.md @@ -37,7 +37,7 @@ error: mismatched types: expected `&int` but found `<generic integer #0>` (expec What gives? It needs a pointer! Therefore I have to use pointers!" -Turns out, you don't. __All you need is a reference__. Try this on for size: +Turns out, you don't. All you need is a reference. Try this on for size: ~~~rust # fn succ(x: &int) -> int { *x + 1 } @@ -74,8 +74,7 @@ Here are the use-cases for pointers. I've prefixed them with the name of the pointer that satisfies that use-case: 1. Owned: `Box<Trait>` must be a pointer, because you don't know the size of the -object, so indirection is mandatory. Notation might change once Rust -support DST fully so we recommend you stay tuned. +object, so indirection is mandatory. 2. Owned: You need a recursive data structure. These can be infinite sized, so indirection is mandatory. @@ -89,7 +88,10 @@ common, such as C++, please read "A note..." below. care about its ownership. If you make the argument a reference, callers can send in whatever kind they want. -Four exceptions. That's it.
Otherwise, you shouldn't need them. Be sceptical +5. Shared: You need to share data among tasks. You can achieve that via the +`Rc` and `Arc` types. + +Five exceptions. That's it. Otherwise, you shouldn't need them. Be sceptical of pointers in Rust: use them for a deliberate purpose, not just to make the compiler happy. @@ -205,10 +207,6 @@ The inner lists _must_ be an owned pointer, because we can't know how many elements are in the list. Without knowing the length, we don't know the size, and therefore require the indirection that pointers offer. -> Note: Nil is just part of the List enum and even though is being used -> to represent the concept of "nothing", you shouldn't think of it as -> NULL. Rust doesn't have NULL. - ## Efficiency This should almost never be a concern, but because creating an owned pointer @@ -284,8 +282,8 @@ fn main() { ~~~ This prints `5.83095189`. You can see that the `compute_distance` function -takes in two references, but we give it a stack allocated reference and an -owned box reference. +takes in two references, a reference to a value on the stack, and a reference +to a value in a box. Of course, if this were a real program, we wouldn't have any of these pointers, they're just there to demonstrate the concepts. @@ -361,6 +359,51 @@ hard for a computer, too! There is an entire [guide devoted to references and lifetimes](guide-lifetimes.html) that goes into lifetimes in great detail, so if you want the full details, check that out. +# Returning Pointers + +We've talked a lot about functions that accept various kinds of pointers, but +what about returning them? In general, it is better to let the caller decide +how to use a function's output, instead of assuming a certain type of pointer +is best. + +What does that mean? 
Don't do this: + +~~~rust +fn foo(x: Box<int>) -> Box<int> { + return box *x; +} + +fn main() { + let x = box 5; + let y = foo(x); +} +~~~ + +Do this: + +~~~rust +fn foo(x: Box<int>) -> int { + return *x; +} + +fn main() { + let x = box 5; + let y = box foo(x); +} +~~~ + +This gives you flexibility, without sacrificing performance. + +You may think that this gives us terrible performance: return a value and then +immediately box it up ?! Isn't that the worst of both worlds? Rust is smarter +than that. There is no copy in this code. `main` allocates enough room for the +`box int`, passes a pointer to that memory into `foo` as `x`, and then `foo` writes +the value straight into that pointer. This writes the return value directly into +the allocated box. + +This is important enough that it bears repeating: pointers are not for optimizing +returning values from your code. Allow the caller to choose how they want to +use your output. # Related Resources -- cgit 1.4.1-3-g733a5
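
For readers checking the patched examples against a current toolchain, here is a minimal sketch of the two techniques the second patch documents: borrowing the contents of a box with `&*`, and returning a plain value so that the caller chooses the pointer type. It assumes present-day Rust syntax, which is not what the patches use: `Box::new` stands in for the 2014 `box` keyword and `i32` for `int`. The body of `compute_distance` is an assumed Euclidean distance, since the lifetimes-guide hunk only shows a stub returning `0.0`.

```rust
// Sketch in present-day Rust; the patches themselves use 2014 syntax
// (`box` expressions, `int`), which no longer compiles.
struct Point {
    x: f64,
    y: f64,
}

// Takes references, so callers can lend data stored anywhere.
// The Euclidean formula is an assumption; the patch shows only a stub.
fn compute_distance(p1: &Point, p2: &Point) -> f64 {
    let dx = p1.x - p2.x;
    let dy = p1.y - p2.y;
    (dx * dx + dy * dy).sqrt()
}

// Returns a plain value rather than a box; the caller decides
// whether the result should live on the heap.
fn foo(x: Box<i32>) -> i32 {
    *x
}

fn main() {
    let on_the_stack = Point { x: 3.0, y: 4.0 };
    let on_the_heap: Box<Point> = Box::new(Point { x: 7.0, y: 9.0 });

    // `*` reaches the Point inside the box, `&` borrows it: Box<Point> -> &Point.
    let d = compute_distance(&on_the_stack, &*on_the_heap);
    println!("{}", d);

    // The caller, not `foo`, chooses to box the returned value.
    let y = Box::new(foo(Box::new(5)));
    println!("{}", y);
}
```

Current compilers would also accept `&on_the_heap` for the second argument through deref coercion; the explicit `&*` is kept here because it is the form the patch introduces.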