| author | bors <bors@rust-lang.org> | 2014-07-26 00:46:16 +0000 |
|---|---|---|
| committer | bors <bors@rust-lang.org> | 2014-07-26 00:46:16 +0000 |
| commit | cf1381c1d000a24f95be7d53c4318c18c2daddbb (patch) | |
| tree | 8c04097c7494e4ab96c91e5e3f24341d4640bf90 | |
| parent | 92c97059ff6fe728d47d7fd2c06a658fd12957d0 (diff) | |
| parent | 377b2508f2771156620f8be49940c88565f9af1a (diff) | |
auto merge of #15789 : steveklabnik/rust/guide_pointers, r=cmr
This is super, super WIP, but I'm going to go get lunch for a while, and figured I'd toss my work up here in case anyone wants to see my work as I do it. This contains a new introductory section explaining the basics of pointers, and some pitfalls that Rust attempts to solve. I'd be interested in hearing how my explanation is, as well as if this belongs here. Pointers are such a crucial concept, I don't mind having a beginners' section on them in the main docs, even though our main audience is supposed to understand them already. Reasonable people may disagree, however.
| -rw-r--r-- | src/doc/guide-pointers.md | 865 |
1 files changed, 603 insertions, 262 deletions
diff --git a/src/doc/guide-pointers.md b/src/doc/guide-pointers.md index 17a1114be55..5ec16e852a5 100644 --- a/src/doc/guide-pointers.md +++ b/src/doc/guide-pointers.md @@ -5,334 +5,383 @@ are also one of the more confusing topics for newcomers to Rust. They can also be confusing for people coming from other languages that support pointers, such as C++. This guide will help you understand this important topic. -# You don't actually need pointers, use references - -I have good news for you: you probably don't need to care about pointers, -especially as you're getting started. Think of it this way: Rust is a language -that emphasizes safety. Pointers, as the joke goes, are very pointy: it's easy -to accidentally stab yourself. Therefore, Rust is made in a way such that you -don't need them very often. - -"But guide!" you may cry. "My co-worker wrote a function that looks like this: - -~~~rust -fn succ(x: &int) -> int { *x + 1 } -~~~ - -So I wrote this code to try it out: - -~~~rust{.ignore} -fn main() { - let number = 5; - let succ_number = succ(number); - println!("{}", succ_number); +Be sceptical of non-reference pointers in Rust: use them for a deliberate +purpose, not just to make the compiler happy. Each pointer type comes with an +explanation about when they are appropriate to use. Default to references +unless you're in one of those specific situations. + +You may be interested in the [cheat sheet](#cheat-sheet), which gives a quick +overview of the types, names, and purpose of the various pointers. + +# An introduction + +If you aren't familiar with the concept of pointers, here's a short +introduction. Pointers are a very fundamental concept in systems programming +languages, so it's important to understand them. + +## Pointer Basics + +When you create a new variable binding, you're giving a name to a value that's +stored at a particular location on the stack. (If you're not familiar with the +"heap" vs. 
"stack", please check out [this Stack Overflow +question](http://stackoverflow.com/questions/79923/what-and-where-are-the-stack-and-heap), +as the rest of this guide assumes you know the difference.) Like this: + +```{rust} +let x = 5i; +let y = 8i; +``` +| location | value | +|----------|-------| +| 0xd3e030 | 5 | +| 0xd3e028 | 8 | + +We're making up memory locations here, they're just sample values. Anyway, the +point is that `x`, the name we're using for our variable, corresponds to the +memory location `0xd3e030`, and the value at that location is `5`. When we +refer to `x`, we get the corresponding value. Hence, `x` is `5`. + +Let's introduce a pointer. In some languages, there is just one type of +'pointer,' but in Rust, we have many types. In this case, we'll use a Rust +**reference**, which is the simplest kind of pointer. + +```{rust} +let x = 5i; +let y = 8i; +let z = &y; +``` +|location | value | +|-------- |----------| +|0xd3e030 | 5 | +|0xd3e028 | 8 | +|0xd3e020 | 0xd3e028 | + +See the difference? Rather than contain a value, the value of a pointer is a +location in memory. In this case, the location of `y`. `x` and `y` have the +type `int`, but `z` has the type `&int`. We can print this location using the +`{:p}` format string: + +```{rust} +let x = 5i; +let y = 8i; +let z = &y; + +println!("{:p}", z); +``` + +This would print `0xd3e028`, with our fictional memory addresses. + +Because `int` and `&int` are different types, we can't, for example, add them +together: + +```{rust,ignore} +let x = 5i; +let y = 8i; +let z = &y; + +println!("{}", x + z); +``` + +This gives us an error: + +```{notrust,ignore} +hello.rs:6:24: 6:25 error: mismatched types: expected `int` but found `&int` (expected int but found &-ptr) +hello.rs:6 println!("{}", x + z); + ^ +``` + +We can **dereference** the pointer by using the `*` operator. Dereferencing a +pointer means accessing the value at the location stored in the pointer. 
This +will work: + +```{rust} +let x = 5i; +let y = 8i; +let z = &y; + +println!("{}", x + *z); +``` + +It prints `13`. + +That's it! That's all pointers are: they point to some memory location. Not +much else to them. Now that we've discussed the 'what' of pointers, let's +talk about the 'why.' + +## Pointer uses + +Rust's pointers are quite useful, but in different ways than in other systems +languages. We'll talk about best practices for Rust pointers later in +the guide, but here are some ways that pointers are useful in other languages: + +In C, strings are a pointer to a list of `char`s, ending with a null byte. +The only way to use strings is to get quite familiar with pointers. + +Pointers are useful to point to memory locations that are not on the stack. For +example, our example used two stack variables, so we were able to give them +names. But if we allocated some heap memory, we wouldn't have that name +available. In C, `malloc` is used to allocate heap memory, and it returns a +pointer. + +As a more general variant of the previous two points, any time you have a +structure that can change in size, you need a pointer. You can't tell at +compile time how much memory to allocate, so you've gotta use a pointer to +point at the memory where it will be allocated, and deal with it at run time. + +Pointers are useful in languages that are pass-by-value, rather than +pass-by-reference. Basically, languages can make two choices (this is made +up syntax, it's not Rust): + +```{notrust,ignore} +fn foo(x) { + x = 5 } -~~~ - -And now I get an error: - -~~~text -error: mismatched types: expected `&int` but found `<generic integer #0>` (expected &-ptr but found integral variable) -~~~ - -What gives? It needs a pointer! Therefore I have to use pointers!" - -Turns out, you don't. All you need is a reference. 
Try this on for size: -~~~rust -# fn succ(x: &int) -> int { *x + 1 } fn main() { - let number = 5; - let succ_number = succ(&number); - println!("{}", succ_number); + i = 1 + foo(i) + // what is the value of i here? } -~~~ +``` -It's that easy! One extra little `&` there. This code will run, and print `6`. +In languages that are pass-by-value, `foo` will get a copy of `i`, and so +the original version of `i` is not modified. At the comment, `i` will still be +`1`. In a language that is pass-by-reference, `foo` will get a reference to `i`, +and therefore, can change its value. At the comment, `i` will be `5`. -That's all you need to know. Your co-worker could have written the function -like this: +So what do pointers have to do with this? Well, since pointers point to a +location in memory... -~~~rust -fn succ(x: int) -> int { x + 1 } +```{notrust,ignore} +fn foo(&int x) { + *x = 5 +} fn main() { - let number = 5; - let succ_number = succ(number); - println!("{}", succ_number); + i = 1 + foo(&i) + // what is the value of i here? } -~~~ - -No pointers even needed. Then again, this is a simple example. I assume that -your real-world `succ` function is more complicated, and maybe your co-worker -had a good reason for `x` to be a pointer of some kind. In that case, references -are your best friend. Don't worry about it, life is too short. - -However. - -Here are the use-cases for pointers. I've prefixed them with the name of the -pointer that satisfies that use-case: +``` -1. Owned: `Box<Trait>` must be a pointer, because you don't know the size of the -object, so indirection is mandatory. +Even in a language which is pass by value, `i` will be `5` at the comment. You +see, because the argument `x` is a pointer, we do send a copy over to `foo`, +but because it points at a memory location, which we then assign to, the +original value is still changed. This pattern is called +'pass-reference-by-value.' Tricky! -2. Owned: You need a recursive data structure. 
These can be infinite sized, so -indirection is mandatory. +## Common pointer problems -3. Owned: A very, very, very rare situation in which you have a *huge* chunk of -data that you wish to pass to many methods. Passing a pointer will make this -more efficient. If you're coming from another language where this technique is -common, such as C++, please read "A note..." below. +We've talked about pointers, and we've sung their praises. So what's the +downside? Well, Rust attempts to mitigate each of these kinds of problems, +but here are problems with pointers in other languages: -4. Reference: You're writing a function, and you need a pointer, but you don't -care about its ownership. If you make the argument a reference, callers -can send in whatever kind they want. +Uninitialized pointers can cause a problem. For example, what does this program +do? -5. Shared: You need to share data among tasks. You can achieve that via the -`Rc` and `Arc` types. +```{notrust,ignore} +&int x; +*x = 5; // whoops! +``` -Five exceptions. That's it. Otherwise, you shouldn't need them. Be sceptical -of pointers in Rust: use them for a deliberate purpose, not just to make the -compiler happy. +Who knows? We just declare a pointer, but don't point it at anything, and then +set the memory location that it points at to be `5`. But which location? Nobody +knows. This might be harmless, and it might be catastrophic. -## A note for those proficient in pointers +When you combine pointers and functions, it's easy to accidentally invalidate +the memory the pointer is pointing to. For example: -If you're coming to Rust from a language like C or C++, you may be used to -passing things by reference, or passing things by pointer. In some languages, -like Java, you can't even have objects without a pointer to them. 
Therefore, if -you were writing this Rust code: +```{notrust,ignore} +fn make_pointer(): &int { + x = 5; -~~~rust -# fn transform(p: Point) -> Point { p } -#[deriving(Show)] -struct Point { - x: int, - y: int, + return &x; } fn main() { - let p0 = Point { x: 5, y: 10}; - let p1 = transform(p0); - println!("{}", p1); + &int i = make_pointer(); + *i = 5; // uh oh! } +``` -~~~ +`x` is local to the `make_pointer` function, and therefore, is invalid as soon +as `make_pointer` returns. But we return a pointer to its memory location, and +so back in `main`, we try to use that pointer, and it's a very similar +situation to our first one. Setting invalid memory locations is bad. -I think you'd implement `transform` like this: +As one last example of a big problem with pointers, **aliasing** can be an +issue. Two pointers are said to alias when they point at the same location +in memory. Like this: -~~~rust -# struct Point { -# x: int, -# y: int, -# } -# let p0 = Point { x: 5, y: 10}; -fn transform(p: &Point) -> Point { - Point { x: p.x + 1, y: p.y + 1} +```{notrust,ignore} +fn mutate(&int i, int j) { + *i = j; } -// and change this: -let p1 = transform(&p0); -~~~ +fn main() { + x = 5; + y = &x; + z = &x; //y and z are aliased -This does work, but you don't need to create those references! The better way to write this is simply: -~~~rust -#[deriving(Show)] -struct Point { - x: int, - y: int, -} + run_in_new_thread(mutate, y, 1); + run_in_new_thread(mutate, z, 100); -fn transform(p: Point) -> Point { - Point { x: p.x + 1, y: p.y + 1} + // what is the value of x here? } +``` -fn main() { - let p0 = Point { x: 5, y: 10}; - let p1 = transform(p0); - println!("{}", p1); -} -~~~ +In this made-up example, `run_in_new_thread` spins up a new thread, and calls +the given function name with its arguments. Since we have two threads, and +they're both operating on aliases to `x`, we can't tell which one finishes +first, and therefore, the value of `x` is actually non-deterministic. 
Worse, +what if one of them had invalidated the memory location they pointed to? We'd +have the same problem as before, where we'd be setting an invalid location. -But won't this be inefficient? Well, that's a complicated question, but it's -important to know that Rust, like C and C++, store aggregate data types -'unboxed,' whereas languages like Java and Ruby store these types as 'boxed.' -For smaller structs, this way will be more efficient. For larger ones, it may -be less so. But don't reach for that pointer until you must! Make sure that the -struct is large enough by performing some tests before you add in the -complexity of pointers. +## Conclusion -# Owned Pointers +That's a basic overview of pointers as a general concept. As we alluded to +before, Rust has different kinds of pointers, rather than just one, and +mitigates all of the problems that we talked about, too. This does mean that +Rust pointers are slightly more complicated than in other languages, but +it's worth it to not have the problems that simple pointers have. -Owned pointers are the conceptually simplest kind of pointer in Rust. A rough -approximation of owned pointers follows: +# References -1. Only one owned pointer may exist to a particular place in memory. It may be -borrowed from that owner, however. +The most basic type of pointer that Rust has is called a 'reference.' Rust +references look like this: -2. The Rust compiler uses static analysis to determine where the pointer is in -scope, and handles allocating and de-allocating that memory. Owned pointers are -not garbage collected. +```{rust} +let x = 5i; +let y = &x; -These two properties make for three use cases. +println!("{}", *y); +println!("{:p}", y); +println!("{}", y); +``` -## References to Traits +We'd say "`y` is a reference to `x`." The first `println!` prints out the +value of `y`'s referent by using the dereference operator, `*`. 
The second +one prints out the memory location that `y` points to, by using the pointer +format string. The third `println!` *also* prints out the value of `y`'s +referent, because `println!` will automatically dereference it for us. -Traits must be referenced through a pointer, because the struct that implements -the trait may be a different size than a different struct that implements the -trait. Therefore, unboxed traits don't make any sense, and aren't allowed. +Here's a function that takes a reference: -## Recursive Data Structures +```{rust} +fn succ(x: &int) -> int { *x + 1 } +``` -Sometimes, you need a recursive data structure. The simplest is known as a 'cons list': +You can also use `&` as an operator to create a reference, so we can +call this function in two different ways: -~~~rust -#[deriving(Show)] -enum List<T> { - Nil, - Cons(T, Box<List<T>>), -} +```{rust} +fn succ(x: &int) -> int { *x + 1 } fn main() { - let list: List<int> = Cons(1, box Cons(2, box Cons(3, box Nil))); - println!("{}", list); -} -~~~ - -This prints: - -~~~text -Cons(1, box Cons(2, box Cons(3, box Nil))) -~~~ -The inner lists _must_ be an owned pointer, because we can't know how many -elements are in the list. Without knowing the length, we don't know the size, -and therefore require the indirection that pointers offer. + let x = 5i; + let y = &x; -## Efficiency - -This should almost never be a concern, but because creating an owned pointer -boxes its value, it therefore makes referring to the value the size of the box. -This may make passing an owned pointer to a function less expensive than -passing the value itself. Don't worry yourself with this case until you've -proved that it's an issue through benchmarks. 
- -For example, this will work: - -~~~rust -struct Point { - x: int, - y: int, + println!("{}", succ(y)); + println!("{}", succ(&x)); } +``` -fn main() { - let a = Point { x: 10, y: 20 }; - spawn(proc() { - println!("{}", a.x); - }); -} -~~~ +Both of these `println!`s will print out `6`. -This struct is tiny, so it's fine. If `Point` were large, this would be more -efficient: +Of course, if this were real code, we wouldn't bother with the reference, and +just write: -~~~rust -struct Point { - x: int, - y: int, -} +```{rust} +fn succ(x: int) -> int { x + 1 } +``` -fn main() { - let a = box Point { x: 10, y: 20 }; - spawn(proc() { - println!("{}", a.x); - }); -} -~~~ +References are immutable by default: -Now it'll be copying a pointer-sized chunk of memory rather than the whole -struct. +```{rust,ignore} +let x = 5i; +let y = &x; -# References +*y = 5; // error: cannot assign to immutable dereference of `&`-pointer `*y` +``` -References are the third major kind of pointer Rust supports. They are -simultaneously the simplest and the most complicated kind. Let me explain: -references are considered 'borrowed' because they claim no ownership over the -data they're pointing to. They're just borrowing it for a while. So in that -sense, they're simple: just keep whatever ownership the data already has. For -example: - -~~~rust -struct Point { - x: f32, - y: f32, -} +They can be made mutable with `mut`, but only if its referent is also mutable. 
+This works: -fn compute_distance(p1: &Point, p2: &Point) -> f32 { - let x_d = p1.x - p2.x; - let y_d = p1.y - p2.y; +```{rust} +let mut x = 5i; +let y = &mut x; +``` - (x_d * x_d + y_d * y_d).sqrt() -} +This does not: -fn main() { - let origin = &Point { x: 0.0, y: 0.0 }; - let p1 = box Point { x: 5.0, y: 3.0 }; +```{rust,ignore} +let x = 5i; +let y = &mut x; // error: cannot borrow immutable local variable `x` as mutable +``` - println!("{}", compute_distance(origin, &*p1)); -} -~~~ +Immutable pointers are allowed to alias: -This prints `5.83095189`. You can see that the `compute_distance` function -takes in two references, a reference to a value on the stack, and a reference -to a value in a box. -Of course, if this were a real program, we wouldn't have any of these pointers, -they're just there to demonstrate the concepts. +```{rust} +let x = 5i; +let y = &x; +let z = &x; +``` -So how is this hard? Well, because we're ignoring ownership, the compiler needs -to take great care to make sure that everything is safe. Despite their complete -safety, a reference's representation at runtime is the same as that of -an ordinary pointer in a C program. They introduce zero overhead. The compiler -does all safety checks at compile time. +Mutable ones, however, are not: -This theory is called 'region pointers' and you can read more about it -[here](http://www.cs.umd.edu/projects/cyclone/papers/cyclone-regions.pdf). -Region pointers evolved into what we know today as 'lifetimes'. +```{rust,ignore} +let x = 5i; +let y = &mut x; +let z = &mut x; // error: cannot borrow `x` as mutable more than once at a time +``` + +Despite their complete safety, a reference's representation at runtime is the +same as that of an ordinary pointer in a C program. They introduce zero +overhead. The compiler does all safety checks at compile time. The theory that +allows for this was originally called **region pointers**. Region pointers +evolved into what we know today as **lifetimes**. 
Here's the simple explanation: would you expect this code to compile? -~~~rust{.ignore} +```{rust,ignore} fn main() { println!("{}", x); let x = 5; } -~~~ +``` Probably not. That's because you know that the name `x` is valid from where it's declared to when it goes out of scope. In this case, that's the end of the `main` function. So you know this code will cause an error. We call this duration a 'lifetime'. Let's try a more complex example: -~~~rust +```{rust} fn main() { - let mut x = box 5i; + let x = &mut 5i; + if *x < 10 { let y = &x; + println!("Oh no: {}", y); return; } + *x -= 1; + println!("Oh no: {}", x); } -~~~ +``` Here, we're borrowing a pointer to `x` inside of the `if`. The compiler, however, is able to determine that that pointer will go out of scope without `x` being mutated, and therefore, lets us pass. This wouldn't work: -~~~rust{.ignore} +```{rust,ignore} fn main() { - let mut x = box 5i; + let x = &mut 5i; + if *x < 10 { let y = &x; *x -= 1; @@ -340,73 +389,365 @@ fn main() { println!("Oh no: {}", y); return; } + *x -= 1; + println!("Oh no: {}", x); } -~~~ +``` It gives this error: -~~~text +```{notrust,ignore} test.rs:5:8: 5:10 error: cannot assign to `*x` because it is borrowed test.rs:5 *x -= 1; ^~ test.rs:4:16: 4:18 note: borrow of `*x` occurs here test.rs:4 let y = &x; ^~ -~~~ +``` As you might guess, this kind of analysis is complex for a human, and therefore hard for a computer, too! There is an entire [guide devoted to references and lifetimes](guide-lifetimes.html) that goes into lifetimes in great detail, so if you want the full details, check that out. +## Best practices + +In general, prefer stack allocation over heap allocation. Using references to +stack allocated information is preferred whenever possible. Therefore, +references are the default pointer type you should use, unless you have +specific reason to use a different type. 
The other types of pointers cover when +they're appropriate to use in their own best practices sections. + +Use references when you want to use a pointer, but do not want to take ownership. +References just borrow ownership, which is more polite if you don't need the +ownership. In other words, prefer: + +```{rust} +fn succ(x: &int) -> int { *x + 1 } +``` + +to + +```{rust} +fn succ(x: Box<int>) -> int { *x + 1 } +``` + +As a corollary to that rule, references allow you to accept a wide variety of +other pointers, and so are useful so that you don't have to write a number +of variants per pointer. In other words, prefer: + +```{rust} +fn succ(x: &int) -> int { *x + 1 } +``` + +to + +```{rust} +fn box_succ(x: Box<int>) -> int { *x + 1 } + +fn rc_succ(x: std::rc::Rc<int>) -> int { *x + 1 } +``` + +# Boxes + +`Box<T>` is Rust's 'boxed pointer' type. Boxes provide the simplest form of +heap allocation in Rust. Creating a box looks like this: + +```{rust} +let x = box(std::boxed::HEAP) 5i; +``` + +`box` is a keyword that does 'placement new,' which we'll talk about in a bit. +`box` will be useful for creating a number of heap-allocated types, but is not +quite finished yet. In the meantime, `box`'s type defaults to +`std::boxed::HEAP`, and so you can leave it off: + +```{rust} +let x = box 5i; +``` + +As you might assume from the `HEAP`, boxes are heap allocated. They are +deallocated automatically by Rust when they go out of scope: + +```{rust} +{ + let x = box 5i; + + // stuff happens + +} // x is destructed and its memory is free'd here +``` + +However, boxes do _not_ use reference counting or garbage collection. Boxes are +what's called an **affine type**. This means that the Rust compiler, at compile +time, determines when the box comes into and goes out of scope, and inserts the +appropriate calls there. Furthermore, boxes are a specific kind of affine type, +known as a **region**. 
You can read more about regions [in this paper on the +Cyclone programming +language](http://www.cs.umd.edu/projects/cyclone/papers/cyclone-regions.pdf). + +You don't need to fully grok the theory of affine types or regions to grok +boxes, though. As a rough approximation, you can treat this Rust code: + +```{rust} +{ + let x = box 5i; + + // stuff happens +} +``` + +As being similar to this C code: + +```{notrust,ignore} +{ + int *x; + x = (int *)malloc(sizeof(int)); + + // stuff happens + + free(x); +} +``` + +Of course, this is a 10,000 foot view. It leaves out destructors, for example. +But the general idea is correct: you get the semantics of `malloc`/`free`, but +with some improvements: + +1. It's impossible to allocate the incorrect amount of memory, because Rust + figures it out from the types. +2. You cannot forget to `free` memory you've allocated, because Rust does it + for you. +3. Rust ensures that this `free` happens at the right time, when it is truly + not used. Use-after-free is not possible. +4. Rust enforces that no other writeable pointers alias to this heap memory, + which means writing to an invalid pointer is not possible. + +See the section on references or the [lifetimes guide](guide-lifetimes.html) +for more detail on how lifetimes work. + +Using boxes and references together is very common. For example: + +```{rust} +fn add_one(x: &int) -> int { + *x + 1 +} + +fn main() { + let x = box 5i; + + println!("{}", add_one(&*x)); +} +``` + +In this case, Rust knows that `x` is being 'borrowed' by the `add_one()` +function, and since it's only reading the value, allows it. + +We can borrow `x` multiple times, as long as it's not simultaneous: + +```{rust} +fn add_one(x: &int) -> int { + *x + 1 +} + +fn main() { + let x = box 5i; + + println!("{}", add_one(&*x)); + println!("{}", add_one(&*x)); + println!("{}", add_one(&*x)); +} +``` + +Or as long as it's not a mutable borrow. 
This will error: + +```{rust,ignore} +fn add_one(x: &mut int) -> int { + *x + 1 +} + +fn main() { + let x = box 5i; + + println!("{}", add_one(&*x)); // error: cannot borrow immutable dereference + // of `&`-pointer as mutable +} +``` + +Notice we changed the signature of `add_one()` to request a mutable reference. + +# Best practices + +Boxes are appropriate to use in two situations: Recursive data structures, +and occasionally, when returning data. + +## Recursive data structures + +Sometimes, you need a recursive data structure. The simplest is known as a +'cons list': + + +```{rust} +#[deriving(Show)] +enum List<T> { + Cons(T, Box<List<T>>), + Nil, +} + +fn main() { + let list: List<int> = Cons(1, box Cons(2, box Cons(3, box Nil))); + println!("{}", list); +} +``` + +This prints: + +```{notrust,ignore} +Cons(1, box Cons(2, box Cons(3, box Nil))) +``` + +The reference to another `List` inside of the `Cons` enum variant must be a box, +because we don't know the length of the list. Because we don't know the length, +we don't know the size, and therefore, we need to heap allocate our list. + +Working with recursive or other unknown-sized data structures is the primary +use-case for boxes. + +## Returning data + +This is important enough to have its own section entirely. The TL;DR is this: +you don't generally want to return pointers, even when you might in a language +like C or C++. + +See [Returning Pointers](#returning-pointers) below for more. + +# Rc and Arc + +This part is coming soon. + +## Best practices + +This part is coming soon. + +# Gc + +The `Gc<T>` type exists for historical reasons, and is [still used +internally](https://github.com/rust-lang/rust/issues/7929) by the compiler. +It is not even a 'real' garbage collected type at the moment. + +In the future, Rust may have a real garbage collected type, and so it +has not yet been removed for that reason. + +## Best practices + +There is currently no legitimate use case for the `Gc<T>` type. 
+ +# Raw Pointers + +This part is coming soon. + +## Best practices + +This part is coming soon. + # Returning Pointers -We've talked a lot about functions that accept various kinds of pointers, but -what about returning them? In general, it is better to let the caller decide -how to use a function's output, instead of assuming a certain type of pointer -is best. +In many languages with pointers, you'd return a pointer from a function +so as to avoid a copying a large data structure. For example: -What does that mean? Don't do this: +```{rust} +struct BigStruct { + one: int, + two: int, + // etc + one_hundred: int, +} -~~~rust -fn foo(x: Box<int>) -> Box<int> { +fn foo(x: Box<BigStruct>) -> Box<BigStruct> { return box *x; } fn main() { - let x = box 5; + let x = box BigStruct { + one: 1, + two: 2, + one_hundred: 100, + }; + let y = foo(x); } -~~~ +``` + +The idea is that by passing around a box, you're only copying a pointer, rather +than the hundred `int`s that make up the `BigStruct`. -Do this: +This is an antipattern in Rust. Instead, write this: -~~~rust -fn foo(x: Box<int>) -> int { +```{rust} +struct BigStruct { + one: int, + two: int, + // etc + one_hundred: int, +} + +fn foo(x: Box<BigStruct>) -> BigStruct { return *x; } fn main() { - let x = box 5; + let x = box BigStruct { + one: 1, + two: 2, + one_hundred: 100, + }; + let y = box foo(x); } -~~~ +``` -This gives you flexibility, without sacrificing performance. +This gives you flexibility without sacrificing performance. You may think that this gives us terrible performance: return a value and then immediately box it up ?! Isn't that the worst of both worlds? Rust is smarter -than that. There is no copy in this code. `main` allocates enough room for the -`box int`, passes a pointer to that memory into `foo` as `x`, and then `foo` writes -the value straight into that pointer. This writes the return value directly into +than that. There is no copy in this code. 
main allocates enough room for the +`box , passes a pointer to that memory into foo as x, and then foo writes the +value straight into that pointer. This writes the return value directly into the allocated box. -This is important enough that it bears repeating: pointers are not for optimizing -returning values from your code. Allow the caller to choose how they want to -use your output. +This is important enough that it bears repeating: pointers are not for +optimizing returning values from your code. Allow the caller to choose how they +want to use your output. + +# Creating your own Pointers + +This part is coming soon. + +## Best practices + +This part is coming soon. + +# Cheat Sheet + +Here's a quick rundown of Rust's pointer types: + +| Type | Name | Summary | +|--------------|---------------------|-------------------------------------------| +| `&T` | Reference | Allows one or more references to read `T` | +| `&mut T` | Mutable Reference | Allows a single reference to | +| | | read and write `T` | +| `Box<T>` | Box | Heap allocated `T` with a single owner | +| | | that may read and write `T`. | +| `Rc<T>` | "arr cee" pointer | Heap allocated `T` with many readers | +| `Arc<T>` | Arc pointer | Same as above, but safe sharing across | +| | | threads | +| `*const T` | Raw pointer | Unsafe read access to `T` | +| `*mut T` | Mutable raw pointer | Unsafe read and write access to `T` | -# Related Resources +# Related resources +* [API documentation for Box](std/boxed/index.html) * [Lifetimes guide](guide-lifetimes.html) +* [Cyclone paper on regions](http://www.cs.umd.edu/projects/cyclone/papers/cyclone-regions.pdf), which inspired Rust's lifetime system |
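The examples in this diff use 2014-era syntax (`5i`, `int`, `box`, `#[deriving(Show)]`), which current compilers reject. Below is a minimal sketch of the guide's `succ` and cons-list examples in present-day Rust. It assumes `i32` in place of `int` and `Box::new` in place of `box`. The `len` helper is hypothetical, added only for illustration, and is not part of the commit.

```rust
// Sketch only: modern-Rust equivalents of two examples from the guide.
// Assumptions: `i32` stands in for the old `int`, `Box::new` for `box`.

/// Takes a reference, so the caller keeps ownership of the value.
fn succ(x: &i32) -> i32 {
    *x + 1
}

/// The guide's cons list. The recursive field is boxed so that
/// `List<T>` has a size known at compile time.
#[derive(Debug)]
enum List<T> {
    Cons(T, Box<List<T>>),
    Nil,
}

/// Hypothetical helper (not in the commit): counts the elements
/// by following each box to the next node.
fn len<T>(list: &List<T>) -> usize {
    match list {
        List::Cons(_, rest) => 1 + len(rest),
        List::Nil => 0,
    }
}

fn main() {
    let x = 5;
    let boxed = Box::new(5);

    // A plain reference and a borrowed box are both accepted by `succ`,
    // which is why the guide recommends `&T` parameters.
    println!("{}", succ(&x));
    println!("{}", succ(&boxed));

    let list = List::Cons(1, Box::new(List::Cons(2, Box::new(List::Nil))));
    println!("{:?} has {} elements", list, len(&list));
}
```

Passing `&boxed` where `&i32` is expected relies on deref coercion, so the guide's `&*x` spelling is usually unnecessary in current Rust.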
