path: root/library/std/src/fs.rs
Age | Commit message | Author | Lines
2021-10-10 | Add #[must_use] to core and std constructors | John Kugelman | -0/+2
2021-10-07 | Optimize File::read_to_end and read_to_string | John Kugelman | -15/+37
Reading a file into an empty vector or string buffer can incur unnecessary `read` syscalls and memory re-allocations as the buffer "warms up" and grows to its final size. This is perhaps a necessary evil with generic readers, but files can be read in smarter by checking the file size and reserving that much capacity.

`std::fs::read` and `read_to_string` already perform this optimization: they open the file, read its metadata, and call `with_capacity` with the file size. This ensures that the buffer does not need to be resized and avoids an initial string of small `read` syscalls.

However, if a user opens the `File` themselves and calls `file.read_to_end` or `file.read_to_string` they do not get this optimization.

```rust
let mut buf = Vec::new();
file.read_to_end(&mut buf)?;
```

I searched through this project's codebase and even here there are a *lot* of examples of this. They're found all over in unit tests, which isn't a big deal, but there are also several real instances in the compiler and in Cargo. I've documented the ones I found in a comment here: https://github.com/rust-lang/rust/issues/89516#issuecomment-934423999

Most telling, the `Read` trait and the `read_to_end` method both show this exact pattern as examples of how to use readers. What this says to me is that this shouldn't be solved by simply fixing the instances of it in this codebase. If it's here it's certain to be prevalent in the wider Rust ecosystem.

To that end, this commit adds specializations of `read_to_end` and `read_to_string` directly on `File`. This way it's no longer a minor footgun to start with an empty buffer when reading a file in.

A nice side effect of this change is that code that accesses a `File` as a bare `Read` constraint or via a `dyn Read` trait object will benefit. For example, this code from `compiler/rustc_serialize/src/json.rs`:

```rust
pub fn from_reader(rdr: &mut dyn Read) -> Result<Json, BuilderError> {
    let mut contents = Vec::new();
    match rdr.read_to_end(&mut contents) {
```

Related changes:

- I also added specializations to `BufReader` to delegate to `self.inner`'s methods. That way it can call `File`'s optimized implementations if the inner reader is a file.
- The private `std::io::append_to_string` function is now marked `unsafe`.
- `File::read_to_string` being more efficient means that the performance note for `io::read_to_string` can be softened. I've added @camelid's suggested wording from: https://github.com/rust-lang/rust/issues/80218#issuecomment-936806502
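The size-hint idea described in this commit message can be sketched in user code as follows. This is only an illustration, not the standard library's implementation; the helper name `read_with_size_hint` is hypothetical, and the file size is treated as a hint because the file may change between the metadata call and the reads.

```rust
use std::fs::File;
use std::io::{self, Read};
use std::path::Path;

/// Hypothetical helper (not part of std): read a whole file,
/// reserving capacity from the file's reported size first.
fn read_with_size_hint(path: &Path) -> io::Result<Vec<u8>> {
    let mut file = File::open(path)?;
    // The size is only a hint; fall back to 0 if metadata is unavailable
    // or the length does not fit in usize.
    let size = file
        .metadata()
        .ok()
        .and_then(|m| usize::try_from(m.len()).ok())
        .unwrap_or(0);
    let mut buf = Vec::with_capacity(size);
    // read_to_end still handles files that grew or shrank in the meantime.
    file.read_to_end(&mut buf)?;
    Ok(buf)
}
```

Calling `read_with_size_hint(Path::new("some/file"))` behaves like `std::fs::read`; the only point of the sketch is to show where the reservation happens.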
2021-10-04 | Auto merge of #89165 - jkugelman:read-to-end-overallocation, r=joshtriplett | bors | -4/+3
Fix read_to_end to not grow an exact size buffer

If you know how much data to expect and use `Vec::with_capacity` to pre-allocate a buffer of that capacity, `Read::read_to_end` will still double its capacity. It needs some space to perform a read, even though that read ends up returning `0`. It's a bummer to carefully pre-allocate 1GB to read a 1GB file into memory and end up using 2GB.

This fixes that behavior by special casing a full buffer and reading into a small "probe" buffer instead. If that read returns `0` then it's confirmed that the buffer was the perfect size. If it doesn't, the probe buffer is appended to the normal buffer and the read loop continues.

Fixing this allows several workarounds in the standard library to be removed:

- `Take` no longer needs to override `Read::read_to_end`.
- The `reservation_size` callback that allowed `Take` to inhibit the previous over-allocation behavior isn't needed.
- `fs::read` doesn't need to reserve an extra byte in `initial_buffer_size`.

Curiously, there was a unit test that specifically checked that `Read::read_to_end` *does* over-allocate. I removed that test, too.
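The "probe" buffer approach described above can be sketched roughly like this. It is a simplified illustration under assumptions, not the code that landed in std: the function name `read_to_end_probing`, the 32-byte probe size, and the 8 KiB chunk size are all hypothetical choices.

```rust
use std::io::{self, ErrorKind, Read};

/// Hypothetical sketch of the probe idea: never grow a full buffer
/// just to discover that the reader is already at EOF.
fn read_to_end_probing<R: Read>(reader: &mut R, buf: &mut Vec<u8>) -> io::Result<usize> {
    let start_len = buf.len();
    let mut chunk = [0u8; 8192];
    loop {
        if buf.len() == buf.capacity() {
            // Buffer is full: read into a small stack buffer instead of growing.
            let mut probe = [0u8; 32];
            let n = match reader.read(&mut probe) {
                Ok(n) => n,
                Err(e) if e.kind() == ErrorKind::Interrupted => continue,
                Err(e) => return Err(e),
            };
            if n == 0 {
                // EOF confirmed: the existing capacity was exactly enough.
                return Ok(buf.len() - start_len);
            }
            // More data than expected: keep the probed bytes and make room.
            buf.extend_from_slice(&probe[..n]);
            buf.reserve(chunk.len());
            continue;
        }
        // Read no more than the spare capacity, so this path never reallocates.
        let want = (buf.capacity() - buf.len()).min(chunk.len());
        match reader.read(&mut chunk[..want]) {
            Ok(0) => return Ok(buf.len() - start_len),
            Ok(n) => buf.extend_from_slice(&chunk[..n]),
            Err(e) if e.kind() == ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        }
    }
}
```

With a buffer created by `Vec::with_capacity(len)` and a reader that yields exactly `len` bytes, the final probe returns `0` and the vector's capacity is left untouched.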
2021-09-25 | Apply 16 commits (squashed) | Frank Steffahn | -7/+7
----------
Fix spacing for links inside code blocks, and improve link tooltips in alloc::fmt
----------
Fix spacing for links inside code blocks, and improve link tooltips in alloc::{rc, sync}
----------
Fix spacing for links inside code blocks, and improve link tooltips in alloc::string
----------
Fix spacing for links inside code blocks in alloc::vec
----------
Fix spacing for links inside code blocks in core::option
----------
Fix spacing for links inside code blocks, and improve a few link tooltips in core::result
----------
Fix spacing for links inside code blocks in core::{iter::{self, iterator}, stream::stream, poll}
----------
Fix spacing for links inside code blocks, and improve a few link tooltips in std::{fs, path}
----------
Fix spacing for links inside code blocks in std::{collections, time}
----------
Fix spacing for links inside code blocks, and make formatting of `&str`-like types consistent, in std::ffi::{c_str, os_str}
----------
Fix spacing for links inside code blocks, and improve link tooltips in std::ffi
----------
Fix spacing for links inside code blocks, and improve a few link tooltips in std::{io::{self, buffered::{bufreader, bufwriter}, cursor, util}, net::{self, addr}}
----------
Fix typo in link to `into` for `OsString` docs
----------
Remove tooltips that will probably become redundant in the future
----------
Apply suggestions from code review

Replacing `…std/primitive.reference.html` paths with just `reference`

Co-authored-by: Joshua Nelson <github@jyn.dev>
----------
Also replace `…std/primitive.reference.html` paths with just `reference` in `core::pin`
2021-09-22 | Fix read_to_end to not grow an exact size buffer | John Kugelman | -4/+3
If you know how much data to expect and use `Vec::with_capacity` to pre-allocate a buffer of that capacity, `Read::read_to_end` will still double its capacity. It needs some space to perform a read, even though that read ends up returning `0`. It's a bummer to carefully pre-allocate 1GB to read a 1GB file into memory and end up using 2GB.

This fixes that behavior by special casing a full buffer and reading into a small "probe" buffer instead. If that read returns `0` then it's confirmed that the buffer was the perfect size. If it doesn't, the probe buffer is appended to the normal buffer and the read loop continues.

Fixing this allows several workarounds in the standard library to be removed:

- `Take` no longer needs to override `Read::read_to_end`.
- The `reservation_size` callback that allowed `Take` to inhibit the previous over-allocation behavior isn't needed.
- `fs::read` doesn't need to reserve an extra byte in `initial_buffer_size`.

Curiously, there was a unit test that specifically checked that `Read::read_to_end` *does* over-allocate. I removed that test, too.
2021-08-19 | Add comments about impls for File, TcpStream, ChildStdin, etc. | Dan Gohman | -0/+6
2021-08-07 | Document that fs::read_dir skips . and .. | Timotej Lazar | -0/+2
2021-07-29 | Fix may not to appropriate might not or must not | Ali Malik | -4/+4
2021-07-29 | Add some doc aliases | D1mon | -0/+2
Add `mkdir` to `create_dir`, `rmdir` to `remove_dir`.
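For readers unfamiliar with doc aliases, the attribute looks like the following when applied to an ordinary function. This is a sketch on a hypothetical wrapper, `make_directory`, not the actual change to `std::fs::create_dir`; the alias only affects rustdoc search, so searching the generated docs for "mkdir" would surface this item.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Creates a new, empty directory at `path`.
///
/// Hypothetical wrapper used only to show the attribute; the alias has no
/// effect on how the function is called.
#[doc(alias = "mkdir")]
pub fn make_directory(path: &Path) -> io::Result<()> {
    fs::create_dir(path)
}
```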
2021-07-18 | Rollup merge of #87170 - xFrednet:clippy-5393-add-diagnostic-items, r=Manishearth,oli-obk | Yuki Okushi | -0/+3
Add diagnostic items for Clippy

This adds a bunch of diagnostic items to `std`/`core`/`alloc` functions, structs and traits used in Clippy. The actual refactorings in Clippy to use these items will be done in a different PR in Clippy after the next sync.

This PR doesn't include all paths Clippy uses, I've only gone through the first 85 lines of Clippy's [`paths.rs`](https://github.com/rust-lang/rust-clippy/blob/ecf85f4bdc319f9d9d853d1fff68a8a25e64c7a8/clippy_utils/src/paths.rs) (after rust-lang/rust-clippy#7466) to get some feedback early on. I've also decided against adding diagnostic items to methods, as it would be nicer and more scalable to access them in a nicer fashion, like adding a `is_diagnostic_assoc_item(did, sym::Iterator, sym::map)` function or something similar (Suggested by `@camsteffen` [on Zulip](https://rust-lang.zulipchat.com/#narrow/stream/147480-t-compiler.2Fwg-diagnostics/topic/Diagnostic.20Item.20Naming.20Convention.3F/near/225024603))

There seems to be some different naming conventions when it comes to diagnostic items, some use UpperCamelCase (`BinaryHeap`) and some snake_case (`hashmap_type`). This PR uses UpperCamelCase for structs and traits and snake_case with the module name as a prefix for functions. Any feedback on this is welcome.

cc: rust-lang/rust-clippy#5393

r? `@Manishearth`
2021-07-15 | Added diagnostic items to structs and traits for Clippy | xFrednet | -0/+3
2021-07-09 | Update docs for `fs::hard_link` | Aris Merchant | -2/+5
2021-07-06 | Rollup merge of #86852 - Amanieu:remove_doc_aliases, r=joshtriplett | Yuki Okushi | -3/+0
Remove some doc aliases

As per the new doc alias policy in https://github.com/rust-lang/std-dev-guide/pull/25, this removes some controversial doc aliases:

- `malloc`, `alloc`, `realloc`, etc.
- `length` (alias for `len`)
- `delete` (alias for `remove` in collections and also file/directory deletion)

r? `@joshtriplett`
2021-07-02 | Auto merge of #85746 - m-ou-se:io-error-other, r=joshtriplett | bors | -6/+6
Redefine `ErrorKind::Other` and stop using it in std.

This implements the idea I shared yesterday in the libs meeting when we were discussing how to handle adding new `ErrorKind`s to the standard library: This redefines `Other` to be for *user defined errors only*, and changes all uses of `Other` in the standard library to a `#[doc(hidden)]` and permanently `#[unstable]` `ErrorKind` that users can not match on. This ensures that adding `ErrorKind`s at a later point in time is not a breaking change, since the user couldn't match on these errors anyway. This way, we use the `#[non_exhaustive]` property of the enum in a more effective way.

Open questions:

- How do we check this change doesn't cause too much breakage? Will a crate run help and be enough?
- How do we ensure we don't accidentally start using `Other` again in the standard library? We don't have a `pub(not crate)` or `#[deprecated(in this crate only)]`.

cc https://github.com/rust-lang/rust/pull/79965

cc `@rust-lang/libs` `@ijackson`

r? `@dtolnay`
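From the user's side, the consequence of this redefinition is that code matching on `ErrorKind` needs a wildcard arm, and that `ErrorKind::Other` should be read as "an error created by user code". A minimal sketch, with a hypothetical `describe` function and made-up labels:

```rust
use std::io::{self, ErrorKind};

/// Hypothetical classifier showing how user code can match on `ErrorKind`.
fn describe(err: &io::Error) -> &'static str {
    match err.kind() {
        ErrorKind::NotFound => "not found",
        ErrorKind::PermissionDenied => "permission denied",
        // After this change, `Other` is meant for user-defined errors only.
        ErrorKind::Other => "user-defined error",
        // `ErrorKind` is `#[non_exhaustive]`, so a wildcard arm is required;
        // kinds added to std later, and uncategorized ones, land here.
        _ => "some other kind",
    }
}
```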
2021-06-30 | Remove "delete" doc aliases | Amanieu d'Antras | -3/+0
2021-06-18 | `no_run` and `ignore` doc attributes | Max Wase | -1/+2
2021-06-15 | Rename ErrorKind::Unknown to Uncategorized. | Mara Bos | -1/+1
2021-06-15 | Redefine `ErrorKind::Other` and stop using it in std. | Mara Bos | -6/+6
2021-05-27 | Review fixes + doc-features | Max Wase | -2/+3
2021-05-27 | Tracking issue add. | Max Wase | -1/+1
2021-05-27 | Fix `is_symlink()` method for `Path` using added `is_symlink()` method for `Metadata` | Max Wase | -0/+24
2021-05-19 | Move the implementation of `Path::exists` to `sys_common::fs` so platforms can specialize it | Chris Denton | -0/+26
Windows implementation of `fs::try_exists`
2021-04-10 | clean up example on read_to_string | Steve Klabnik | -1/+2
This is the same thing, but simpler.
2021-03-28 | Rollup merge of #83558 - m-ou-se:use-finish-non-exhaustive, r=jackh726 | Yuki Okushi | -1/+1
Use DebugStruct::finish_non_exhaustive() in std.

See https://github.com/rust-lang/rust/issues/67364
2021-03-28 | Rollup merge of #79399 - pickfire:patch-3, r=JohnTitor | Yuki Okushi | -3/+3
Use detailed and shorter fs error explaination

Includes suggestion from `@the8472` https://github.com/rust-lang/rust/issues/79390#issuecomment-733263336
2021-03-27 | Use detailed and shorter fs error explaination | Ivan Tham | -3/+3
Includes suggestion from the8472 https://github.com/rust-lang/rust/issues/79390#issuecomment-733263336

More detailed error explanation in fs doc
2021-03-27 | Use DebugStruct::finish_non_exhaustive() in std. | Mara Bos | -1/+1
2021-03-21 | Use io::Error::new_const everywhere to avoid allocations. | Mara Bos | -1/+4
2021-01-31 | Add doc aliases for "delete" | Konrad Borowski | -0/+3
This patch adds doc aliases for "delete". The added aliases are supposed to reference usages of `delete` in other programming languages.

- `HashMap::remove`, `BTreeMap::remove` -> `Map#delete` and `delete` keyword in JavaScript.
- `HashSet::remove`, `BTreeSet::remove` -> `Set#delete` in JavaScript.
- `mem::drop` -> `delete` keyword in C++.
- `fs::remove_file`, `fs::remove_dir`, `fs::remove_dir_all` -> `File#delete` in Java, `File#delete` and `Dir#delete` in Ruby.

Before this change, searching for "delete" in documentation returned no results.
2020-11-22 | Drop support for cloudabi targets | Lzu Tao | -1/+1
2020-11-14 | Disambiguate symlink argument names | David Tolnay | -11/+13
2020-11-14 | Auto merge of #75272 - the8472:spec-copy, r=KodrAus | bors | -1/+1
specialize io::copy to use copy_file_range, splice or sendfile

Fixes #74426. Also covers #60689 but only as an optimization instead of an official API. The specialization only covers std-owned structs so it should avoid the problems with #71091

Currently linux-only but it should be generalizable to other unix systems that have sendfile/sosplice and similar.

There is a bit of optimization potential around the syscall count. Right now it may end up doing more syscalls than the naive copy loop when doing short (<8KiB) copies between file descriptors. The test case executes the following:

```
[pid 103776] statx(3, "", AT_STATX_SYNC_AS_STAT|AT_EMPTY_PATH, STATX_ALL, {stx_mask=STATX_ALL|STATX_MNT_ID, stx_attributes=0, stx_mode=S_IFREG|0644, stx_size=17, ...}) = 0
[pid 103776] write(4, "wxyz", 4) = 4
[pid 103776] write(4, "iklmn", 5) = 5
[pid 103776] copy_file_range(3, NULL, 4, NULL, 5, 0) = 5
```

- 0-1 `stat` calls to identify the source file type. 0 if the type can be inferred from the struct from which the FD was extracted
- 𝖬 `write` to drain the `BufReader`/`BufWriter` wrappers. only happen when buffers are present. 𝖬 ≾ number of wrappers present. If there is a write buffer it may absorb the read buffer contents first so only result in a single write. Vectored writes would also be an option but that would require more invasive changes to `BufWriter`.
- 𝖭 `copy_file_range`/`splice`/`sendfile` until file size, EOF or the byte limit from `Take` is reached. This should generally be *much* more efficient than the read-write loop and also have other benefits such as DMA offload or extent sharing.

## Benchmarks

```
OLD
test io::tests::bench_file_to_file_copy ... bench: 21,002 ns/iter (+/- 750) = 6240 MB/s [ext4]
test io::tests::bench_file_to_file_copy ... bench: 35,704 ns/iter (+/- 1,108) = 3671 MB/s [btrfs]
test io::tests::bench_file_to_socket_copy ... bench: 57,002 ns/iter (+/- 4,205) = 2299 MB/s
test io::tests::bench_socket_pipe_socket_copy ... bench: 142,640 ns/iter (+/- 77,851) = 918 MB/s

NEW
test io::tests::bench_file_to_file_copy ... bench: 14,745 ns/iter (+/- 519) = 8889 MB/s [ext4]
test io::tests::bench_file_to_file_copy ... bench: 6,128 ns/iter (+/- 227) = 21389 MB/s [btrfs]
test io::tests::bench_file_to_socket_copy ... bench: 13,767 ns/iter (+/- 3,767) = 9520 MB/s
test io::tests::bench_socket_pipe_socket_copy ... bench: 26,471 ns/iter (+/- 6,412) = 4951 MB/s
```
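The specialization is transparent to callers: ordinary `io::copy` code between two `File`s is the kind of call that may take the faster path on Linux. A minimal sketch, assuming a hypothetical helper named `copy_file_contents`; which syscall is used underneath is an implementation detail and is not guaranteed by the API.

```rust
use std::fs::File;
use std::io;
use std::path::Path;

/// Hypothetical helper: copy the contents of `src` into `dst` via `io::copy`.
/// On Linux, std may service this with `copy_file_range`, `splice` or
/// `sendfile`; on other platforms it falls back to a read/write loop.
fn copy_file_contents(src: &Path, dst: &Path) -> io::Result<u64> {
    let mut reader = File::open(src)?;
    let mut writer = File::create(dst)?;
    // Returns the number of bytes copied.
    io::copy(&mut reader, &mut writer)
}
```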
2020-11-13 | specialize io::copy to use copy_file_range, splice or sendfile | The8472 | -1/+1
Currently it only applies to linux systems. It can be extended to make use of similar syscalls on other unix systems.
2020-10-21 | Make it platform-specific whether `hard_link` follows symlinks. | Dan Gohman | -2/+3
Also mention that where possible, `hard_link` does not follow symlinks.
2020-10-16 | Define `fs::hard_link` to not follow symlinks. | Dan Gohman | -2/+5
POSIX leaves it implementation-defined whether `link` follows symlinks. In practice, for example, on Linux it does not and on FreeBSD it does. So, switch to `linkat`, so that we can pick a behavior rather than depending on OS defaults. Pick the option to not follow symlinks. This is somewhat arbitrary, but seems the less surprising choice because hard linking is a very low-level feature which requires the source and destination to be on the same mounted filesystem, and following a symbolic link could end up in a different mounted filesystem.
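One way to observe what `fs::hard_link` produced is to inspect the new link with `fs::symlink_metadata`, which does not follow symlinks. The sketch below uses a hypothetical helper, `hard_link_and_inspect`; for a regular source file it reports `false`, while the result for a symlink source depends on the platform behavior discussed above.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical helper: create a hard link and report whether the new
/// directory entry is itself a symbolic link.
fn hard_link_and_inspect(original: &Path, link: &Path) -> io::Result<bool> {
    fs::hard_link(original, link)?;
    // symlink_metadata looks at the entry itself rather than its target.
    let meta = fs::symlink_metadata(link)?;
    Ok(meta.file_type().is_symlink())
}
```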
2020-09-02 | Convert many files to intra-doc links | Joshua Nelson | -15/+15
- Use intra-doc links for `std::io` in `std::fs`
- Use intra-doc links for File::read in unix/ext/fs.rs
- Remove explicit intra-doc links for `true` in `net/addr.rs`
- Use intra-doc links in alloc/src/sync.rs
- Use intra-doc links in src/ascii.rs
- Switch to intra-doc links in alloc/rc.rs
- Use intra-doc links in core/pin.rs
- Use intra-doc links in std/prelude
- Use shorter links in `std/fs.rs`

  `io` is already in scope.
2020-08-31 | std: move "mod tests/benches" to separate files | Lzu Tao | -1348/+3
Also doing fmt inplace as requested.
2020-08-21 | Rollup merge of #75324 - ericseppanen:master, r=JohnTitor | Yuki Okushi | -1/+3
clarify documentation of remove_dir errors

remove_dir will error if the path doesn't exist or isn't a directory. It's useful to clarify that this is "remove dir or fail" not "remove dir if it exists". I don't think this belongs in the title. "Removes an existing, empty directory" is strangely worded-- there's no such thing as a non-existing directory. Better to just say explicitly it will return an error.
2020-08-12 | Move to intra doc links in std/src/fs.rs | Alexis Bourget | -132/+59
2020-08-08 | clarify documentation of remove_dir errors | Eric Seppanen | -1/+3
remove_dir will error if the path doesn't exist or isn't a directory. It's useful to clarify that this is "remove dir or fail" not "remove dir if it exists". I don't think this belongs in the title. "Removes an existing, empty directory" is strangely worded-- there's no such thing as a non-existing directory. Better to just say explicitly it will return an error.
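Because `fs::remove_dir` is "remove dir or fail", callers who want "remove dir if it exists" have to handle the `NotFound` case themselves. A minimal sketch, assuming a hypothetical helper named `remove_dir_if_exists`:

```rust
use std::fs;
use std::io::{self, ErrorKind};
use std::path::Path;

/// Hypothetical helper: returns Ok(true) if the directory was removed,
/// Ok(false) if it did not exist, and any other error unchanged
/// (for example when the path is not a directory or is not empty).
fn remove_dir_if_exists(path: &Path) -> io::Result<bool> {
    match fs::remove_dir(path) {
        Ok(()) => Ok(true),
        Err(e) if e.kind() == ErrorKind::NotFound => Ok(false),
        Err(e) => Err(e),
    }
}
```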
2020-07-29 | Update `fs::remove_file` docs | Imbolc | -0/+1
Mention that absence of file causes an error
2020-07-27 | mv std libs to library/ | mark | -0/+3612