path: root/library/std/src/io/tests.rs
Age | Commit message | Author | Lines
2022-08-28 | Rollup merge of #97015 - nrc:read-buf-cursor, r=Mark-Simulacrum | Matthias Krüger | -10/+10
std::io: migrate ReadBuf to BorrowBuf/BorrowCursor

This PR replaces `ReadBuf` (used by the `Read::read_buf` family of methods) with `BorrowBuf` and `BorrowCursor`.

The general idea is to split `ReadBuf` because its API is large and confusing. `BorrowBuf` represents a borrowed buffer which is mostly read-only and (other than for construction) deals only with filled vs unfilled segments. A `BorrowCursor` is a mostly write-only view of the unfilled part of a `BorrowBuf` which distinguishes between initialized and uninitialized segments. For `Read::read_buf`, the caller would create a `BorrowBuf`, then pass a `BorrowCursor` to `read_buf`.

In addition to the major API split, I've made the following smaller changes:

* Removed some methods entirely from the API (mostly the functionality can be replicated with two calls rather than a single one)
* Unified naming, e.g., by replacing initialized with init and assume_init with set_init
* Added an easy way to get the number of bytes written to a cursor (`written` method)

As well as simplifying the API (IMO), this approach has the following advantages:

* Since we pass the cursor by value, we remove the 'unsoundness footgun' where a malicious `read_buf` could swap out the `ReadBuf`.
* Since `read_buf` cannot write into the filled part of the buffer, we prevent the filled part shrinking or changing, which could cause underflow for the caller or unexpected behaviour.

## Outline

```rust
pub struct BorrowBuf<'a>

impl Debug for BorrowBuf<'_>

impl<'a> From<&'a mut [u8]> for BorrowBuf<'a>
impl<'a> From<&'a mut [MaybeUninit<u8>]> for BorrowBuf<'a>

impl<'a> BorrowBuf<'a> {
    pub fn capacity(&self) -> usize
    pub fn len(&self) -> usize
    pub fn init_len(&self) -> usize
    pub fn filled(&self) -> &[u8]
    pub fn unfilled<'this>(&'this mut self) -> BorrowCursor<'this, 'a>
    pub fn clear(&mut self) -> &mut Self
    pub unsafe fn set_init(&mut self, n: usize) -> &mut Self
}

pub struct BorrowCursor<'buf, 'data>

impl<'buf, 'data> BorrowCursor<'buf, 'data> {
    pub fn clone<'this>(&'this mut self) -> BorrowCursor<'this, 'data>
    pub fn capacity(&self) -> usize
    pub fn written(&self) -> usize
    pub fn init_ref(&self) -> &[u8]
    pub fn init_mut(&mut self) -> &mut [u8]
    pub fn uninit_mut(&mut self) -> &mut [MaybeUninit<u8>]
    pub unsafe fn as_mut(&mut self) -> &mut [MaybeUninit<u8>]
    pub unsafe fn advance(&mut self, n: usize) -> &mut Self
    pub fn ensure_init(&mut self) -> &mut Self
    pub unsafe fn set_init(&mut self, n: usize) -> &mut Self
    pub fn append(&mut self, buf: &[u8])
}
```

## TODO

* ~~Migrate non-unix libs and tests~~
* ~~Naming~~
* ~~`BorrowBuf` or `BorrowedBuf` or `SliceBuf`? (We might want an owned equivalent for the async IO traits)~~
* ~~Should we rename the `readbuf` module? We might keep the name to indicate it includes both the buf and cursor variations and someday the owned version too. Or we could change it. It is not publicly exposed, so it is not that important.~~
* ~~`read_buf` method: we read into the cursor now, so the `_buf` suffix is a bit weird.~~
* ~~Documentation~~
* Tests are incomplete (I adjusted existing tests, but did not add new ones).

cc https://github.com/rust-lang/rust/issues/78485, https://github.com/rust-lang/rust/issues/94741

supersedes: https://github.com/rust-lang/rust/pull/95770, https://github.com/rust-lang/rust/pull/93359

fixes #93305
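The buffer/cursor split described in this PR can be illustrated with a deliberately simplified model in safe, stable Rust. `ToyBuf`, `ToyCursor`, and `fake_read` below are hypothetical names invented for this sketch; they are not the std types, and the initialized/uninitialized tracking (`MaybeUninit`, `set_init`, `ensure_init`) is left out entirely. The sketch only shows the filled/unfilled split, the `written` counter, and passing the cursor by value.

```rust
/// Hypothetical stand-in for the buffer half: it only tracks how much of
/// the borrowed slice has been filled.
pub struct ToyBuf<'a> {
    data: &'a mut [u8],
    filled: usize,
}

/// Hypothetical write-only view of the unfilled tail of a `ToyBuf`.
pub struct ToyCursor<'buf, 'data> {
    buf: &'buf mut ToyBuf<'data>,
    start: usize, // value of `filled` when this cursor was created
}

impl<'a> ToyBuf<'a> {
    pub fn new(data: &'a mut [u8]) -> Self {
        ToyBuf { data, filled: 0 }
    }
    pub fn capacity(&self) -> usize {
        self.data.len()
    }
    pub fn filled(&self) -> &[u8] {
        &self.data[..self.filled]
    }
    pub fn unfilled<'this>(&'this mut self) -> ToyCursor<'this, 'a> {
        let start = self.filled;
        ToyCursor { buf: self, start }
    }
}

impl<'buf, 'data> ToyCursor<'buf, 'data> {
    /// Remaining room in the unfilled part.
    pub fn capacity(&self) -> usize {
        self.buf.data.len() - self.buf.filled
    }
    /// Bytes written through this cursor since it was created.
    pub fn written(&self) -> usize {
        self.buf.filled - self.start
    }
    /// Copies `src` into the unfilled part and marks it as filled.
    pub fn append(&mut self, src: &[u8]) {
        assert!(src.len() <= self.capacity(), "not enough room in cursor");
        let at = self.buf.filled;
        self.buf.data[at..at + src.len()].copy_from_slice(src);
        self.buf.filled += src.len();
    }
}

/// A reader-like function takes the cursor by value, so it can only
/// append to the unfilled part; it cannot touch or replace the buffer.
pub fn fake_read(mut cursor: ToyCursor<'_, '_>) -> usize {
    cursor.append(b"hi");
    cursor.written()
}

pub fn demo() -> (usize, Vec<u8>, usize) {
    let mut storage = [0u8; 8];
    let mut buf = ToyBuf::new(&mut storage);
    let first = fake_read(buf.unfilled());
    let second = fake_read(buf.unfilled());
    (first + second, buf.filled().to_vec(), buf.capacity())
}

fn main() {
    let (written, filled, capacity) = demo();
    println!("wrote {written} of {capacity} bytes: {filled:?}");
}
```

Because the cursor only ever increases `filled`, the caller can rely on previously filled bytes staying in place, which is the property the PR description attributes to the real split.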
2022-08-18 | make many std tests work in Miri | Ralf Jung | -1/+2
2022-08-05 | non-linux platforms | Nick Cameron | -3/+3
Signed-off-by: Nick Cameron <nrc@ncameron.org>
2022-08-04 | std::io: migrate ReadBuf to BorrowBuf/BorrowCursor | Nick Cameron | -10/+10
Signed-off-by: Nick Cameron <nrc@ncameron.org>
2022-07-25 | Rollup merge of #95040 - frank-king:fix/94981, r=Mark-Simulacrum | Yuki Okushi | -0/+19
protect `std::io::Take::limit` from overflow in `read`

Resolves #94981
2022-05-29 | protect `std::io::Take::limit` from overflow in `read` | Frank King | -0/+19
fixes #94981
2022-03-11 | Update tests. | Mara Bos | -4/+4
2022-02-04 | Hide Repr details from io::Error, and rework `io::Error::new_const`. | Thom Chiovoloni | -2/+2
2021-11-02 | read_buf | DrMeepster | -1/+44
2021-10-07 | Optimize File::read_to_end and read_to_string | John Kugelman | -1/+1
Reading a file into an empty vector or string buffer can incur unnecessary `read` syscalls and memory re-allocations as the buffer "warms up" and grows to its final size. This is perhaps a necessary evil with generic readers, but files can be read in smarter by checking the file size and reserving that much capacity.

`std::fs::read` and `read_to_string` already perform this optimization: they open the file, read its metadata, and call `with_capacity` with the file size. This ensures that the buffer does not need to be resized and avoids an initial string of small `read` syscalls.

However, if a user opens the `File` themselves and calls `file.read_to_end` or `file.read_to_string` they do not get this optimization.

```rust
let mut buf = Vec::new();
file.read_to_end(&mut buf)?;
```

I searched through this project's codebase and even here there are a *lot* of examples of this. They're found all over in unit tests, which isn't a big deal, but there are also several real instances in the compiler and in Cargo. I've documented the ones I found in a comment here: https://github.com/rust-lang/rust/issues/89516#issuecomment-934423999

Most telling, the `Read` trait and the `read_to_end` method both show this exact pattern as examples of how to use readers. What this says to me is that this shouldn't be solved by simply fixing the instances of it in this codebase. If it's here it's certain to be prevalent in the wider Rust ecosystem.

To that end, this commit adds specializations of `read_to_end` and `read_to_string` directly on `File`. This way it's no longer a minor footgun to start with an empty buffer when reading a file in.

A nice side effect of this change is that code that accesses a `File` as a bare `Read` constraint or via a `dyn Read` trait object will benefit. For example, this code from `compiler/rustc_serialize/src/json.rs`:

```rust
pub fn from_reader(rdr: &mut dyn Read) -> Result<Json, BuilderError> {
    let mut contents = Vec::new();
    match rdr.read_to_end(&mut contents) {
```

Related changes:

- I also added specializations to `BufReader` to delegate to `self.inner`'s methods. That way it can call `File`'s optimized implementations if the inner reader is a file.
- The private `std::io::append_to_string` function is now marked `unsafe`.
- `File::read_to_string` being more efficient means that the performance note for `io::read_to_string` can be softened. I've added @camelid's suggested wording from: https://github.com/rust-lang/rust/issues/80218#issuecomment-936806502
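The size-based reservation described above can be written out by hand. The following is a minimal sketch of the idea, not the code this commit adds to `File`: `read_with_size_hint` and `demo` are hypothetical helpers, and the temporary file name is made up for the example.

```rust
use std::fs::{self, File};
use std::io::{self, Read};
use std::path::Path;

/// Hypothetical helper: open a file, ask the OS for its size, and reserve
/// that much capacity before reading, so the vector does not have to grow
/// step by step.
pub fn read_with_size_hint(path: &Path) -> io::Result<Vec<u8>> {
    let mut file = File::open(path)?;
    // The size is only a hint; if metadata is unavailable, start empty.
    let size = file.metadata().map(|m| m.len() as usize).unwrap_or(0);
    let mut buf = Vec::with_capacity(size);
    file.read_to_end(&mut buf)?;
    Ok(buf)
}

/// Writes a small temporary file, reads it back, and cleans up.
pub fn demo() -> io::Result<Vec<u8>> {
    let path = std::env::temp_dir()
        .join(format!("read_with_size_hint_demo_{}.txt", std::process::id()));
    fs::write(&path, b"hello world")?;
    let contents = read_with_size_hint(&path);
    fs::remove_file(&path)?;
    contents
}

fn main() -> io::Result<()> {
    let contents = demo()?;
    println!("read {} bytes", contents.len());
    Ok(())
}
```

With the specialization described in the commit message, callers who start from `Vec::new()` and call `file.read_to_end` would not need a helper like this.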
2021-09-22 | Fix read_to_end to not grow an exact size buffer | John Kugelman | -15/+3
If you know how much data to expect and use `Vec::with_capacity` to pre-allocate a buffer of that capacity, `Read::read_to_end` will still double its capacity. It needs some space to perform a read, even though that read ends up returning `0`.

It's a bummer to carefully pre-allocate 1GB to read a 1GB file into memory and end up using 2GB.

This fixes that behavior by special casing a full buffer and reading into a small "probe" buffer instead. If that read returns `0` then it's confirmed that the buffer was the perfect size. If it doesn't, the probe buffer is appended to the normal buffer and the read loop continues.

Fixing this allows several workarounds in the standard library to be removed:

- `Take` no longer needs to override `Read::read_to_end`.
- The `reservation_size` callback that allowed `Take` to inhibit the previous over-allocation behavior isn't needed.
- `fs::read` doesn't need to reserve an extra byte in `initial_buffer_size`.

Curiously, there was a unit test that specifically checked that `Read::read_to_end` *does* over-allocate. I removed that test, too.
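The probe-buffer approach can be sketched in safe Rust as follows. This is an illustration of the technique only, not the standard library's implementation: `read_to_end_probing` is a hypothetical function, the 32-byte probe size is an arbitrary choice for the example, and spare capacity is zero-filled before reading to keep the sketch free of `unsafe`.

```rust
use std::io::{self, Read};

/// Hypothetical sketch: read everything from `reader` into `buf` without
/// growing `buf` when its capacity already matches the data exactly.
pub fn read_to_end_probing<R: Read>(reader: &mut R, buf: &mut Vec<u8>) -> io::Result<usize> {
    let start_len = buf.len();
    loop {
        if buf.len() == buf.capacity() {
            // The buffer is full. Read into a small stack buffer first so
            // that an end-of-input result does not force a reallocation.
            let mut probe = [0u8; 32];
            match reader.read(&mut probe) {
                Ok(0) => return Ok(buf.len() - start_len),
                // More data exists: append it (this grows `buf`) and go on.
                Ok(n) => buf.extend_from_slice(&probe[..n]),
                Err(e) if e.kind() == io::ErrorKind::Interrupted => {}
                Err(e) => return Err(e),
            }
            continue;
        }

        // There is spare capacity: expose it as zeroed bytes, read into it,
        // then shrink back to what was actually read. Resizing up to the
        // current capacity does not reallocate.
        let len = buf.len();
        let cap = buf.capacity();
        buf.resize(cap, 0);
        match reader.read(&mut buf[len..]) {
            Ok(0) => {
                buf.truncate(len);
                return Ok(len - start_len);
            }
            Ok(n) => buf.truncate(len + n),
            Err(e) if e.kind() == io::ErrorKind::Interrupted => buf.truncate(len),
            Err(e) => {
                buf.truncate(len);
                return Err(e);
            }
        }
    }
}

/// Reads 1000 bytes into a pre-sized vector and reports whether the
/// capacity stayed the same.
pub fn demo() -> (usize, bool) {
    let data = vec![7u8; 1000];
    let mut reader = &data[..];
    let mut buf = Vec::with_capacity(data.len());
    let cap_before = buf.capacity();
    let n = read_to_end_probing(&mut reader, &mut buf).unwrap();
    (n, buf.capacity() == cap_before)
}

fn main() {
    let (n, same_capacity) = demo();
    println!("read {n} bytes, capacity unchanged: {same_capacity}");
}
```

The design point is that the final read, which only confirms end of input, lands in the stack probe rather than in freshly reserved heap space.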
2021-07-01 | Stabilize `Seek::rewind` | Aris Merchant | -0/+4
2021-06-18 | Auto merge of #85815 - YuhanLiin:buf-read-data-left, r=m-ou-se | bors | -0/+10
Add has_data_left() to BufRead

This is a continuation of #40747 and also addresses #40745. The problem with the previous PR was that it had "eof" in its method name. This PR uses a more descriptive method name, but I'm open to changing it.
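The check that `has_data_left()` provides can be approximated on stable Rust with `BufRead::fill_buf`, which returns an empty slice once the reader is exhausted. The helper `data_left` below is a hypothetical free function written for this sketch, assuming that "data left" means "`fill_buf` yields a non-empty slice"; it is not the method added by this PR.

```rust
use std::io::{self, BufRead, Cursor, Read};

/// Hypothetical helper: reports whether the reader still has bytes to
/// yield, by peeking at its internal buffer without consuming anything.
pub fn data_left<R: BufRead>(reader: &mut R) -> io::Result<bool> {
    reader.fill_buf().map(|bytes| !bytes.is_empty())
}

/// Checks a two-byte in-memory reader before and after draining it.
pub fn demo() -> (bool, bool) {
    let mut reader = Cursor::new(b"ab".to_vec());
    let before = data_left(&mut reader).unwrap();
    let mut sink = Vec::new();
    reader.read_to_end(&mut sink).unwrap();
    let after = data_left(&mut reader).unwrap();
    (before, after)
}

fn main() {
    let (before, after) = demo();
    println!("before draining: {before}, after draining: {after}");
}
```

Note that peeking may trigger a read on the underlying source when the internal buffer is empty, so the check is not free for readers backed by slow I/O.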
2021-06-17 | Rollup merge of #86202 - a1phyr:spec_io_bytes_size_hint, r=m-ou-se | Mara Bos | -1/+19
Specialize `io::Bytes::size_hint` for more types

Improve the result of `<io::Bytes as Iterator>::size_hint` for some readers. I did not manage to specialize `SizeHint` for `io::Cursor`.

Side question: would it be interesting for `io::Read` to have an optional `size_hint` method?
2021-06-10 | Specialize `io::Bytes::size_hint` for more types | Benoît du Garreau | -1/+19
2021-06-05 | Rename IoSlice(Mut)::advance_slice to advance_slices | Thomas de Zeeuw | -16/+16
2021-05-29 | Add has_data_left() to BufRead | YuhanLiin | -0/+10
2021-05-29 | Rename IoSlice(Mut)::advance to advance_slice | Thomas de Zeeuw | -18/+18
To make way for a new IoSlice(Mut)::advance function that advances a single slice. Also changes the signature to accept a `&mut &mut [IoSlice]`, not returning anything. This will better match the future IoSlice::advance function.
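The `&mut &mut [IoSlice]` signature mentioned here can be exercised as in the sketch below. It uses the name `advance_slices` from the later rename listed in this log (2021-06-05) and assumes a toolchain on which `IoSlice::advance_slices` is available as a stable function; the buffer sizes and the `demo` function are made up for the example.

```rust
use std::io::IoSlice;

/// Advances a list of three slices by 10 bytes and reports what is left:
/// the number of remaining slices, the length of the first remaining
/// slice, and its first byte.
pub fn demo() -> (usize, usize, u8) {
    let a = [1u8; 8];
    let b = [2u8; 16];
    let c = [3u8; 8];

    let mut storage = [IoSlice::new(&a), IoSlice::new(&b), IoSlice::new(&c)];
    // The function takes `&mut &mut [IoSlice]` so it can both shrink the
    // outer slice (dropping fully consumed entries) and trim the first
    // remaining entry in place. It returns nothing.
    let mut bufs = &mut storage[..];
    IoSlice::advance_slices(&mut bufs, 10);

    (bufs.len(), bufs[0].len(), bufs[0][0])
}

fn main() {
    let (remaining, first_len, first_byte) = demo();
    println!("{remaining} slices left; first has {first_len} bytes starting with {first_byte}");
}
```

Advancing by 10 consumes all 8 bytes of the first slice and 2 bytes of the second, which is why two slices remain and the first of them is 14 bytes long.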
2021-03-21 | Use io::Error::new_const everywhere to avoid allocations. | Mara Bos | -2/+2
2021-01-31 | Add tests for SizeHint implementations | Xavientois | -2/+29
2021-01-31 | Use fully qualified syntax to avoid dyn | Xavientois | -1/+21
2020-11-13 | move copy specialization tests to their own module | The8472 | -181/+0
2020-11-13 | add benchmarks | The8472 | -1/+131
2020-11-13 | move tests module into separate file | The8472 | -1/+52
2020-08-31 | std: move "mod tests/benches" to separate files | Lzu Tao | -0/+494
Also ran fmt in place, as requested.