diff options
| author | Matthias Krüger <matthias.krueger@famsik.de> | 2022-08-28 09:35:11 +0200 |
|---|---|---|
| committer | GitHub <noreply@github.com> | 2022-08-28 09:35:11 +0200 |
| commit | b9306c231a9ffaf83d420ec09f165f58444d142e (patch) | |
| tree | 94de45df1e1e09e43ddeb4bcecc929af3e21d5d4 /library/std/src/io/readbuf.rs | |
| parent | 91f128baf7704a477ab7c499143a160fb069b3ad (diff) | |
| parent | ac70aea98509c33ec75208f7b42c8d905c74ebaf (diff) | |
| download | rust-b9306c231a9ffaf83d420ec09f165f58444d142e.tar.gz rust-b9306c231a9ffaf83d420ec09f165f58444d142e.zip | |
Rollup merge of #97015 - nrc:read-buf-cursor, r=Mark-Simulacrum
std::io: migrate ReadBuf to BorrowBuf/BorrowCursor
This PR replaces `ReadBuf` (used by the `Read::read_buf` family of methods) with `BorrowBuf` and `BorrowCursor`.
The general idea is to split `ReadBuf` because its API is large and confusing. `BorrowBuf` represents a borrowed buffer which is mostly read-only and (other than for construction) deals only with filled vs unfilled segments. a `BorrowCursor` is a mostly write-only view of the unfilled part of a `BorrowBuf` which distinguishes between initialized and uninitialized segments. For `Read::read_buf`, the caller would create a `BorrowBuf`, then pass a `BorrowCursor` to `read_buf`.
In addition to the major API split, I've made the following smaller changes:
* Removed some methods entirely from the API (mostly the functionality can be replicated with two calls rather than a single one)
* Unified naming, e.g., by replacing initialized with init and assume_init with set_init
* Added an easy way to get the number of bytes written to a cursor (`written` method)
As well as simplifying the API (IMO), this approach has the following advantages:
* Since we pass the cursor by value, we remove the 'unsoundness footgun' where a malicious `read_buf` could swap out the `ReadBuf`.
* Since `read_buf` cannot write into the filled part of the buffer, we prevent the filled part shrinking or changing which could cause underflow for the caller or unexpected behaviour.
## Outline
```rust
pub struct BorrowBuf<'a>
impl Debug for BorrowBuf<'_>
impl<'a> From<&'a mut [u8]> for BorrowBuf<'a>
impl<'a> From<&'a mut [MaybeUninit<u8>]> for BorrowBuf<'a>
impl<'a> BorrowBuf<'a> {
pub fn capacity(&self) -> usize
pub fn len(&self) -> usize
pub fn init_len(&self) -> usize
pub fn filled(&self) -> &[u8]
pub fn unfilled<'this>(&'this mut self) -> BorrowCursor<'this, 'a>
pub fn clear(&mut self) -> &mut Self
pub unsafe fn set_init(&mut self, n: usize) -> &mut Self
}
pub struct BorrowCursor<'buf, 'data>
impl<'buf, 'data> BorrowCursor<'buf, 'data> {
pub fn clone<'this>(&'this mut self) -> BorrowCursor<'this, 'data>
pub fn capacity(&self) -> usize
pub fn written(&self) -> usize
pub fn init_ref(&self) -> &[u8]
pub fn init_mut(&mut self) -> &mut [u8]
pub fn uninit_mut(&mut self) -> &mut [MaybeUninit<u8>]
pub unsafe fn as_mut(&mut self) -> &mut [MaybeUninit<u8>]
pub unsafe fn advance(&mut self, n: usize) -> &mut Self
pub fn ensure_init(&mut self) -> &mut Self
pub unsafe fn set_init(&mut self, n: usize) -> &mut Self
pub fn append(&mut self, buf: &[u8])
}
```
## TODO
* ~~Migrate non-unix libs and tests~~
* ~~Naming~~
* ~~`BorrowBuf` or `BorrowedBuf` or `SliceBuf`? (We might want an owned equivalent for the async IO traits)~~
* ~~Should we rename the `readbuf` module? We might keep the name indicate it includes both the buf and cursor variations and someday the owned version too. Or we could change it. It is not publicly exposed, so it is not that important~~.
* ~~`read_buf` method: we read into the cursor now, so the `_buf` suffix is a bit weird.~~
* ~~Documentation~~
* Tests are incomplete (I adjusted existing tests, but did not add new ones).
cc https://github.com/rust-lang/rust/issues/78485, https://github.com/rust-lang/rust/issues/94741
supersedes: https://github.com/rust-lang/rust/pull/95770, https://github.com/rust-lang/rust/pull/93359
fixes #93305
Diffstat (limited to 'library/std/src/io/readbuf.rs')
| -rw-r--r-- | library/std/src/io/readbuf.rs | 307 |
1 files changed, 182 insertions, 125 deletions
diff --git a/library/std/src/io/readbuf.rs b/library/std/src/io/readbuf.rs index 78d1113f837..b1a84095f13 100644 --- a/library/std/src/io/readbuf.rs +++ b/library/std/src/io/readbuf.rs @@ -5,9 +5,10 @@ mod tests; use crate::cmp; use crate::fmt::{self, Debug, Formatter}; -use crate::mem::MaybeUninit; +use crate::io::{Result, Write}; +use crate::mem::{self, MaybeUninit}; -/// A wrapper around a byte buffer that is incrementally filled and initialized. +/// A borrowed byte buffer which is incrementally filled and initialized. /// /// This type is a sort of "double cursor". It tracks three regions in the buffer: a region at the beginning of the /// buffer that has been logically filled with data, a region that has been initialized at some point but not yet @@ -20,230 +21,286 @@ use crate::mem::MaybeUninit; /// [ filled | unfilled ] /// [ initialized | uninitialized ] /// ``` -pub struct ReadBuf<'a> { - buf: &'a mut [MaybeUninit<u8>], +/// +/// A `BorrowedBuf` is created around some existing data (or capacity for data) via a unique reference +/// (`&mut`). The `BorrowedBuf` can be configured (e.g., using `clear` or `set_init`), but cannot be +/// directly written. To write into the buffer, use `unfilled` to create a `BorrowedCursor`. The cursor +/// has write-only access to the unfilled portion of the buffer (you can think of it as a +/// write-only iterator). +/// +/// The lifetime `'data` is a bound on the lifetime of the underlying data. +pub struct BorrowedBuf<'data> { + /// The buffer's underlying data. + buf: &'data mut [MaybeUninit<u8>], + /// The length of `self.buf` which is known to be filled. filled: usize, - initialized: usize, + /// The length of `self.buf` which is known to be initialized. + init: usize, } -impl Debug for ReadBuf<'_> { +impl Debug for BorrowedBuf<'_> { fn fmt(&self, f: &mut Formatter<'_>) -> fmt::Result { - f.debug_struct("ReadBuf") - .field("init", &self.initialized()) + f.debug_struct("BorrowedBuf") + .field("init", &self.init) .field("filled", &self.filled) .field("capacity", &self.capacity()) .finish() } } -impl<'a> ReadBuf<'a> { - /// Creates a new `ReadBuf` from a fully initialized buffer. +/// Create a new `BorrowedBuf` from a fully initialized slice. +impl<'data> From<&'data mut [u8]> for BorrowedBuf<'data> { #[inline] - pub fn new(buf: &'a mut [u8]) -> ReadBuf<'a> { - let len = buf.len(); + fn from(slice: &'data mut [u8]) -> BorrowedBuf<'data> { + let len = slice.len(); - ReadBuf { - //SAFETY: initialized data never becoming uninitialized is an invariant of ReadBuf - buf: unsafe { (buf as *mut [u8]).as_uninit_slice_mut().unwrap() }, + BorrowedBuf { + // SAFETY: initialized data never becoming uninitialized is an invariant of BorrowedBuf + buf: unsafe { (slice as *mut [u8]).as_uninit_slice_mut().unwrap() }, filled: 0, - initialized: len, + init: len, } } +} - /// Creates a new `ReadBuf` from a fully uninitialized buffer. - /// - /// Use `assume_init` if part of the buffer is known to be already initialized. +/// Create a new `BorrowedBuf` from an uninitialized buffer. +/// +/// Use `set_init` if part of the buffer is known to be already initialized. +impl<'data> From<&'data mut [MaybeUninit<u8>]> for BorrowedBuf<'data> { #[inline] - pub fn uninit(buf: &'a mut [MaybeUninit<u8>]) -> ReadBuf<'a> { - ReadBuf { buf, filled: 0, initialized: 0 } + fn from(buf: &'data mut [MaybeUninit<u8>]) -> BorrowedBuf<'data> { + BorrowedBuf { buf, filled: 0, init: 0 } } +} +impl<'data> BorrowedBuf<'data> { /// Returns the total capacity of the buffer. #[inline] pub fn capacity(&self) -> usize { self.buf.len() } + /// Returns the length of the filled part of the buffer. + #[inline] + pub fn len(&self) -> usize { + self.filled + } + + /// Returns the length of the initialized part of the buffer. + #[inline] + pub fn init_len(&self) -> usize { + self.init + } + /// Returns a shared reference to the filled portion of the buffer. #[inline] pub fn filled(&self) -> &[u8] { - //SAFETY: We only slice the filled part of the buffer, which is always valid + // SAFETY: We only slice the filled part of the buffer, which is always valid unsafe { MaybeUninit::slice_assume_init_ref(&self.buf[0..self.filled]) } } - /// Returns a mutable reference to the filled portion of the buffer. + /// Returns a cursor over the unfilled part of the buffer. #[inline] - pub fn filled_mut(&mut self) -> &mut [u8] { - //SAFETY: We only slice the filled part of the buffer, which is always valid - unsafe { MaybeUninit::slice_assume_init_mut(&mut self.buf[0..self.filled]) } + pub fn unfilled<'this>(&'this mut self) -> BorrowedCursor<'this> { + BorrowedCursor { + start: self.filled, + // SAFETY: we never assign into `BorrowedCursor::buf`, so treating its + // lifetime covariantly is safe. + buf: unsafe { + mem::transmute::<&'this mut BorrowedBuf<'data>, &'this mut BorrowedBuf<'this>>(self) + }, + } } - /// Returns a shared reference to the initialized portion of the buffer. + /// Clears the buffer, resetting the filled region to empty. /// - /// This includes the filled portion. + /// The number of initialized bytes is not changed, and the contents of the buffer are not modified. #[inline] - pub fn initialized(&self) -> &[u8] { - //SAFETY: We only slice the initialized part of the buffer, which is always valid - unsafe { MaybeUninit::slice_assume_init_ref(&self.buf[0..self.initialized]) } + pub fn clear(&mut self) -> &mut Self { + self.filled = 0; + self } - /// Returns a mutable reference to the initialized portion of the buffer. + /// Asserts that the first `n` bytes of the buffer are initialized. /// - /// This includes the filled portion. - #[inline] - pub fn initialized_mut(&mut self) -> &mut [u8] { - //SAFETY: We only slice the initialized part of the buffer, which is always valid - unsafe { MaybeUninit::slice_assume_init_mut(&mut self.buf[0..self.initialized]) } - } - - /// Returns a mutable reference to the unfilled part of the buffer without ensuring that it has been fully - /// initialized. + /// `BorrowedBuf` assumes that bytes are never de-initialized, so this method does nothing when called with fewer + /// bytes than are already known to be initialized. /// /// # Safety /// - /// The caller must not de-initialize portions of the buffer that have already been initialized. + /// The caller must ensure that the first `n` unfilled bytes of the buffer have already been initialized. #[inline] - pub unsafe fn unfilled_mut(&mut self) -> &mut [MaybeUninit<u8>] { - &mut self.buf[self.filled..] + pub unsafe fn set_init(&mut self, n: usize) -> &mut Self { + self.init = cmp::max(self.init, n); + self } +} - /// Returns a mutable reference to the uninitialized part of the buffer. +/// A writeable view of the unfilled portion of a [`BorrowedBuf`](BorrowedBuf). +/// +/// Provides access to the initialized and uninitialized parts of the underlying `BorrowedBuf`. +/// Data can be written directly to the cursor by using [`append`](BorrowedCursor::append) or +/// indirectly by getting a slice of part or all of the cursor and writing into the slice. In the +/// indirect case, the caller must call [`advance`](BorrowedCursor::advance) after writing to inform +/// the cursor how many bytes have been written. +/// +/// Once data is written to the cursor, it becomes part of the filled portion of the underlying +/// `BorrowedBuf` and can no longer be accessed or re-written by the cursor. I.e., the cursor tracks +/// the unfilled part of the underlying `BorrowedBuf`. +/// +/// The lifetime `'a` is a bound on the lifetime of the underlying buffer (which means it is a bound +/// on the data in that buffer by transitivity). +#[derive(Debug)] +pub struct BorrowedCursor<'a> { + /// The underlying buffer. + // Safety invariant: we treat the type of buf as covariant in the lifetime of `BorrowedBuf` when + // we create a `BorrowedCursor`. This is only safe if we never replace `buf` by assigning into + // it, so don't do that! + buf: &'a mut BorrowedBuf<'a>, + /// The length of the filled portion of the underlying buffer at the time of the cursor's + /// creation. + start: usize, +} + +impl<'a> BorrowedCursor<'a> { + /// Reborrow this cursor by cloning it with a smaller lifetime. /// - /// It is safe to uninitialize any of these bytes. + /// Since a cursor maintains unique access to its underlying buffer, the borrowed cursor is + /// not accessible while the new cursor exists. #[inline] - pub fn uninitialized_mut(&mut self) -> &mut [MaybeUninit<u8>] { - &mut self.buf[self.initialized..] + pub fn reborrow<'this>(&'this mut self) -> BorrowedCursor<'this> { + BorrowedCursor { + // SAFETY: we never assign into `BorrowedCursor::buf`, so treating its + // lifetime covariantly is safe. + buf: unsafe { + mem::transmute::<&'this mut BorrowedBuf<'a>, &'this mut BorrowedBuf<'this>>( + self.buf, + ) + }, + start: self.start, + } } - /// Returns a mutable reference to the unfilled part of the buffer, ensuring it is fully initialized. - /// - /// Since `ReadBuf` tracks the region of the buffer that has been initialized, this is effectively "free" after - /// the first use. + /// Returns the available space in the cursor. #[inline] - pub fn initialize_unfilled(&mut self) -> &mut [u8] { - // should optimize out the assertion - self.initialize_unfilled_to(self.remaining()) + pub fn capacity(&self) -> usize { + self.buf.capacity() - self.buf.filled } - /// Returns a mutable reference to the first `n` bytes of the unfilled part of the buffer, ensuring it is - /// fully initialized. - /// - /// # Panics + /// Returns the number of bytes written to this cursor since it was created from a `BorrowedBuf`. /// - /// Panics if `self.remaining()` is less than `n`. + /// Note that if this cursor is a reborrowed clone of another, then the count returned is the + /// count written via either cursor, not the count since the cursor was reborrowed. #[inline] - pub fn initialize_unfilled_to(&mut self, n: usize) -> &mut [u8] { - assert!(self.remaining() >= n); - - let extra_init = self.initialized - self.filled; - // If we don't have enough initialized, do zeroing - if n > extra_init { - let uninit = n - extra_init; - let unfilled = &mut self.uninitialized_mut()[0..uninit]; - - for byte in unfilled.iter_mut() { - byte.write(0); - } - - // SAFETY: we just initialized uninit bytes, and the previous bytes were already init - unsafe { - self.assume_init(n); - } - } - - let filled = self.filled; + pub fn written(&self) -> usize { + self.buf.filled - self.start + } - &mut self.initialized_mut()[filled..filled + n] + /// Returns a shared reference to the initialized portion of the cursor. + #[inline] + pub fn init_ref(&self) -> &[u8] { + // SAFETY: We only slice the initialized part of the buffer, which is always valid + unsafe { MaybeUninit::slice_assume_init_ref(&self.buf.buf[self.buf.filled..self.buf.init]) } } - /// Returns the number of bytes at the end of the slice that have not yet been filled. + /// Returns a mutable reference to the initialized portion of the cursor. #[inline] - pub fn remaining(&self) -> usize { - self.capacity() - self.filled + pub fn init_mut(&mut self) -> &mut [u8] { + // SAFETY: We only slice the initialized part of the buffer, which is always valid + unsafe { + MaybeUninit::slice_assume_init_mut(&mut self.buf.buf[self.buf.filled..self.buf.init]) + } } - /// Clears the buffer, resetting the filled region to empty. + /// Returns a mutable reference to the uninitialized part of the cursor. /// - /// The number of initialized bytes is not changed, and the contents of the buffer are not modified. + /// It is safe to uninitialize any of these bytes. #[inline] - pub fn clear(&mut self) -> &mut Self { - self.set_filled(0) // The assertion in `set_filled` is optimized out + pub fn uninit_mut(&mut self) -> &mut [MaybeUninit<u8>] { + &mut self.buf.buf[self.buf.init..] } - /// Increases the size of the filled region of the buffer. - /// - /// The number of initialized bytes is not changed. + /// Returns a mutable reference to the whole cursor. /// - /// # Panics + /// # Safety /// - /// Panics if the filled region of the buffer would become larger than the initialized region. + /// The caller must not uninitialize any bytes in the initialized portion of the cursor. #[inline] - pub fn add_filled(&mut self, n: usize) -> &mut Self { - self.set_filled(self.filled + n) + pub unsafe fn as_mut(&mut self) -> &mut [MaybeUninit<u8>] { + &mut self.buf.buf[self.buf.filled..] } - /// Sets the size of the filled region of the buffer. + /// Advance the cursor by asserting that `n` bytes have been filled. /// - /// The number of initialized bytes is not changed. + /// After advancing, the `n` bytes are no longer accessible via the cursor and can only be + /// accessed via the underlying buffer. I.e., the buffer's filled portion grows by `n` elements + /// and its unfilled portion (and the capacity of this cursor) shrinks by `n` elements. /// - /// Note that this can be used to *shrink* the filled region of the buffer in addition to growing it (for - /// example, by a `Read` implementation that compresses data in-place). - /// - /// # Panics + /// # Safety /// - /// Panics if the filled region of the buffer would become larger than the initialized region. + /// The caller must ensure that the first `n` bytes of the cursor have been properly + /// initialised. + #[inline] + pub unsafe fn advance(&mut self, n: usize) -> &mut Self { + self.buf.filled += n; + self.buf.init = cmp::max(self.buf.init, self.buf.filled); + self + } + + /// Initializes all bytes in the cursor. #[inline] - pub fn set_filled(&mut self, n: usize) -> &mut Self { - assert!(n <= self.initialized); + pub fn ensure_init(&mut self) -> &mut Self { + for byte in self.uninit_mut() { + byte.write(0); + } + self.buf.init = self.buf.capacity(); - self.filled = n; self } - /// Asserts that the first `n` unfilled bytes of the buffer are initialized. + /// Asserts that the first `n` unfilled bytes of the cursor are initialized. /// - /// `ReadBuf` assumes that bytes are never de-initialized, so this method does nothing when called with fewer - /// bytes than are already known to be initialized. + /// `BorrowedBuf` assumes that bytes are never de-initialized, so this method does nothing when + /// called with fewer bytes than are already known to be initialized. /// /// # Safety /// - /// The caller must ensure that the first `n` unfilled bytes of the buffer have already been initialized. + /// The caller must ensure that the first `n` bytes of the buffer have already been initialized. #[inline] - pub unsafe fn assume_init(&mut self, n: usize) -> &mut Self { - self.initialized = cmp::max(self.initialized, self.filled + n); + pub unsafe fn set_init(&mut self, n: usize) -> &mut Self { + self.buf.init = cmp::max(self.buf.init, self.buf.filled + n); self } - /// Appends data to the buffer, advancing the written position and possibly also the initialized position. + /// Appends data to the cursor, advancing position within its buffer. /// /// # Panics /// - /// Panics if `self.remaining()` is less than `buf.len()`. + /// Panics if `self.capacity()` is less than `buf.len()`. #[inline] pub fn append(&mut self, buf: &[u8]) { - assert!(self.remaining() >= buf.len()); + assert!(self.capacity() >= buf.len()); // SAFETY: we do not de-initialize any of the elements of the slice unsafe { - MaybeUninit::write_slice(&mut self.unfilled_mut()[..buf.len()], buf); + MaybeUninit::write_slice(&mut self.as_mut()[..buf.len()], buf); } // SAFETY: We just added the entire contents of buf to the filled section. unsafe { - self.assume_init(buf.len()); + self.set_init(buf.len()); } - self.add_filled(buf.len()); + self.buf.filled += buf.len(); } +} - /// Returns the amount of bytes that have been filled. - #[inline] - pub fn filled_len(&self) -> usize { - self.filled +impl<'a> Write for BorrowedCursor<'a> { + fn write(&mut self, buf: &[u8]) -> Result<usize> { + self.append(buf); + Ok(buf.len()) } - /// Returns the amount of bytes that have been initialized. - #[inline] - pub fn initialized_len(&self) -> usize { - self.initialized + fn flush(&mut self) -> Result<()> { + Ok(()) } } |
