| Age | Commit message (Collapse) | Author | Lines |
|
Add -Z no-unique-section-names to reduce ELF header bloat.
This change adds a new compiler flag that can help reduce the size of ELF binaries that contain many functions.
By default, when enabling function sections (which is the default for most targets), the LLVM backend will generate different section names for each function. For example, a function `func` would generate a section called `.text.func`. Normally this is fine because the linker will merge all those sections into a single one in the binary. However, starting with [LLVM 12](https://github.com/llvm/llvm-project/commit/ee5d1a04), the backend will also generate unique section names for exception handling, resulting in thousands of `.gcc_except_table.*` sections ending up in the final binary because some linkers like LLD don't currently merge or strip these EH sections (see discussion [here](https://reviews.llvm.org/D83655)). This can bloat the ELF headers and string table significantly in binaries that contain many functions.
The new option is analogous to Clang's `-fno-unique-section-names`, and instructs LLVM to generate the same `.text` and `.gcc_except_table` section for each function, resulting in a smaller final binary.
The motivation to add this new option was because we have a binary that ended up with so many ELF sections (over 65,000) that it broke some existing ELF tools, which couldn't handle so many sections.
Here's our old binary:
```
$ readelf --sections old.elf | head -1
There are 71746 section headers, starting at offset 0x2a246508:
$ readelf --sections old.elf | grep shstrtab
[71742] .shstrtab STRTAB 0000000000000000 2977204c ad44bb 00 0 0 1
```
That's an 11MB+ string table. Here's the new binary using this option:
```
$ readelf --sections new.elf | head -1
There are 43 section headers, starting at offset 0x29143ca8:
$ readelf --sections new.elf | grep shstrtab
[40] .shstrtab STRTAB 0000000000000000 29143acc 0001db 00 0 0 1
```
The whole binary size went down by over 20MB, which is quite significant.
|
|
Both style-check and date-check are now on the 2021 edition, and this
commit also updates their repositories' submodules.
|
|
Scrape code examples from examples/ directory for Rustdoc
Adds support for the functionality described in https://github.com/rust-lang/rfcs/pull/3123
Matching changes to Cargo are here: https://github.com/rust-lang/cargo/pull/9525
Live demo here: https://willcrichton.net/example-analyzer/warp/trait.Filter.html#method.and
|
|
Implement -Z location-detail flag
This PR implements the `-Z location-detail` flag as described in https://github.com/rust-lang/rfcs/pull/2091 .
`-Z location-detail=val` controls what location details are tracked when using `caller_location`. This allows users to control what location details are printed as part of panic messages, by allowing them to exclude any combination of filenames, line numbers, and column numbers. This option is intended to provide users with a way to mitigate the size impact of `#[track_caller]`.
Some measurements of the savings of this approach on an embedded binary can be found here: https://github.com/rust-lang/rust/issues/70579#issuecomment-942556822 .
Closes #70580 (unless people want to leave that open as a place for discussion of further improvements).
This is my first real PR to rust, so any help correcting mistakes / understanding side effects / improving my tests is appreciated :)
I have one question: RFC 2091 specified this as a debugging option (I think that is what -Z implies?). Does that mean this can never be stabilized without a separate MCP? If so, do I need to submit an MCP now, or is the initial RFC specifying this option sufficient for this to be merged as is, and then an MCP would be needed for eventual stabilization?
|
|
scrape-examples options
|
|
|
|
I think this code is getting L0, not L1 cache size, if I'm reading the Intel manual right. (I might not be.) Either way, the code comment and the printed message should match, whichever way is right. :)
|
|
|
|
|
|
This reverts commit 7a62f29f3171767090949778ce0f161e930706b9.
|
|
The linkchecker doesn't seem happy with links to email addresses.
|
|
Fix broken link to target documentation.
Also fix formatting of developer list.
|
|
In particular, document the default properties and assumptions of code
built for the target.
(Work on this target sponsored by Profian.)
|
|
Most Rust freestanding/bare-metal targets use just `-unknown-none` here,
including aarch64-unknown-none, mipsel-unknown-none, and the BPF
targets. The *only* target using `-unknown-none-elf` is RISC-V.
The underlying toolchain doesn't care; LLVM accepts both `x86_64-unknown-none`
and `x86_64-unknown-none-elf`.
In addition, there's a long history of embedded x86 targets with varying
definitions for the `elf` suffix; on some of those embedded targets,
`elf` implied the inclusion of a C library based on newlib or similar.
Using `x86_64-unknown-none` avoids any potential ambiguity there.
(Work on this target sponsored by Profian.)
|
|
|
|
|
|
This change adds a new compiler flag that can help reduce the size of
ELF binaries that contain many functions.
By default, when enabling function sections (which is the default for most
targets), the LLVM backend will generate different section names for each
function. For example, a function "func" would generate a section called
".text.func". Normally this is fine because the linker will merge all those
sections into a single one in the binary. However, starting with LLVM 12
(llvm/llvm-project@ee5d1a0), the backend will
also generate unique section names for exception handling, resulting in
thousands of ".gcc_except_table.*" sections ending up in the final binary
because some linkers don't currently merge or strip these EH sections.
This can bloat the ELF headers and string table significantly in
binaries that contain many functions.
The new option is analogous to Clang's -fno-unique-section-names, and
instructs LLVM to generate the same ".text" and ".gcc_except_table"
section for each function, resulting in smaller object files and
potentially a smaller final binary.
|
|
Fix ABNF of inline asm options
This is the case since #73227.
r? `@camelid`
|
|
Add new tier-3 target: armv7-unknown-linux-uclibceabihf
This change adds a new tier-3 target: armv7-unknown-linux-uclibceabihf
This target is primarily used in embedded linux devices where system resources are slim and glibc is deemed too heavyweight. Cross compilation C toolchains are available [here](https://toolchains.bootlin.com/) or via [buildroot](https://buildroot.org).
The change is based largely on a previous PR #79380 with a few minor modifications. The author of that PR was unable to push the PR forward, and graciously allowed me to take it over.
Per the [target tier 3 policy](https://github.com/rust-lang/rfcs/blob/master/text/2803-target-tier-policy.md), I volunteer to be the "target maintainer".
This is my first PR to Rust itself, so I apologize if I've missed things!
|
|
|
|
|
|
Implement #85440 (Random test ordering)
This PR adds `--shuffle` and `--shuffle-seed` options to `libtest`. The options are similar to the [`-shuffle` option](https://github.com/golang/go/blob/c894b442d1e5e150ad33fa3ce13dbfab1c037b3a/src/testing/testing.go#L1482-L1499) that was recently added to Go.
Here are the relevant parts of the help message:
```
--shuffle Run tests in random order
--shuffle-seed SEED
Run tests in random order; seed the random number
generator with SEED
...
By default, the tests are run in alphabetical order. Use --shuffle or set
RUST_TEST_SHUFFLE to run the tests in random order. Pass the generated
"shuffle seed" to --shuffle-seed (or set RUST_TEST_SHUFFLE_SEED) to run the
tests in the same order again. Note that --shuffle and --shuffle-seed do not
affect whether the tests are run in parallel.
```
Is an RFC needed for this?
|
|
Enable AutoFDO.
This largely involves implementing the options debug-info-for-profiling
and profile-sample-use and forwarding them on to LLVM.
AutoFDO can be used on x86-64 Linux like this:
rustc -O -Clink-arg='Wl,--no-rosegment' -Cdebug-info-for-profiling main.rs -o main
perf record -b ./main
create_llvm_prof --binary=main --out=code.prof
rustc -O -Cprofile-sample-use=code.prof main.rs -o main2
Now `main2` will have feedback directed optimization applied to it.
The create_llvm_prof tool can be obtained from this github repository:
https://github.com/google/autofdo
The option -Clink-arg='Wl,--no-rosegment' is necessary to avoid lld
putting an extra RO segment before the executable code, which would make
the binary silently incompatible with create_llvm_prof.
|
|
|
|
This largely involves implementing the options debug-info-for-profiling
and profile-sample-use and forwarding them on to LLVM.
AutoFDO can be used on x86-64 Linux like this:
rustc -O -Cdebug-info-for-profiling main.rs -o main
perf record -b ./main
create_llvm_prof --binary=main --out=code.prof
rustc -O -Cprofile-sample-use=code.prof main.rs -o main2
Now `main2` will have feedback directed optimization applied to it.
The create_llvm_prof tool can be obtained from this github repository:
https://github.com/google/autofdo
Fixes #64892.
|
|
Rename `std::thread::available_conccurrency` to `std::thread::available_parallelism`
_Tracking issue: https://github.com/rust-lang/rust/issues/74479_
This PR renames `std::thread::available_conccurrency` to `std::thread::available_parallelism`.
## Rationale
The API was initially named `std::thread::hardware_concurrency`, mirroring the [C++ API of the same name](https://en.cppreference.com/w/cpp/thread/thread/hardware_concurrency). We eventually decided to omit any reference to the word "hardware" after [this comment](https://github.com/rust-lang/rust/pull/74480#issuecomment-662045841). And so we ended up with `available_concurrency` instead.
---
For a talk I was preparing this week I was reading through ["Understanding and expressing scalable concurrency" (A. Turon, 2013)](http://aturon.github.io/academic/turon-thesis.pdf), and the following passage stood out to me (emphasis mine):
> __Concurrency is a system-structuring mechanism.__ An interactive system that deals with disparate asynchronous events is naturally structured by division into concurrent threads with disparate responsibilities. Doing so creates a better fit between problem and solution, and can also decrease the average latency of the system by preventing long-running computations from obstructing quicker ones.
> __Parallelism is a resource.__ A given machine provides a certain capacity for parallelism, i.e., a bound on the number of computations it can perform simultaneously. The goal is to maximize throughput by intelligently using this resource. For interactive systems, parallelism can decrease latency as well.
_Chapter 2.1: Concurrency is not Parallelism. Page 30._
---
_"Concurrency is a system-structuring mechanism. Parallelism is a resource."_ — It feels like this accurately captures the way we should be thinking about these APIs. What this API returns is not "the amount of concurrency available to the program" which is a property of the program, and thus even with just a single thread is effectively unbounded. But instead it returns "the amount of _parallelism_ available to the program", which is a resource hard-constrained by the machine's capacity (and can be further restricted by e.g. operating systems).
That's why I'd like to propose we rename this API from `available_concurrency` to `available_parallelism`. This still meets the criteria we previously established of not attempting to define what exactly we mean by "hardware", "threads", and other such words. Instead we only talk about "concurrency" as an abstract resource available to our program.
r? `@joshtriplett`
|
|
|
|
Co-authored-by: Jonah Petri <jonah@petri.us>
|
|
|
|
|
|
Refs: https://github.com/rust-lang/rust/pull/85223
|
|
platform-support.md: correct ARMv7+MUSL platform triple notes
This PR fixes two minor inconsistencies in the platform support list.
- use "with MUSL" suffix for "armv7-unknown-linux-musleabi"
- add "hardfloat" suffix for "armv7-unknown-linux-musleabihf"
r? `@steveklabnik`
|
|
rustdoc: Clarified the attribute which prompts the warning
The example call was lacking clarification of the `#![warn(rustdoc::invalid_codeblock_attributes)]` attribute which generates the specified warning.
|
|
Add `pie` as another `relocation-model` value
MCP: https://github.com/rust-lang/compiler-team/issues/461
|
|
|
|
|
|
`std::thread::available_parallelism`
|
|
SOLID[1] is an embedded development platform provided by Kyoto
Microcomputer Co., Ltd. This commit introduces a basic Tier 3 support
for SOLID.
# New Targets
The following targets are added:
- `aarch64-kmc-solid_asp3`
- `armv7a-kmc-solid_asp3-eabi`
- `armv7a-kmc-solid_asp3-eabihf`
SOLID's target software system can be divided into two parts: an
RTOS kernel, which is responsible for threading and synchronization,
and Core Services, which provides filesystems, networking, and other
things. The RTOS kernel is a μITRON4.0[2][3]-derived kernel based on
the open-source TOPPERS RTOS kernels[4]. For uniprocessor systems
(more precisely, systems where only one processor core is allocated for
SOLID), this will be the TOPPERS/ASP3 kernel. As μITRON is
traditionally only specified at the source-code level, the ABI is
unique to each implementation, which is why `asp3` is included in the
target names.
More targets could be added later, as we support other base kernels
(there are at least three at the point of writing) and are interested
in supporting other processor architectures in the future.
# C Compiler
Although SOLID provides its own supported C/C++ build toolchain, GNU Arm
Embedded Toolchain seems to work for the purpose of building Rust.
# Unresolved Questions
A μITRON4 kernel can support `Thread::unpark` natively, but it's not
used by this commit's implementation because the underlying kernel
feature is also used to implement `Condvar`, and it's unclear whether
`std` should guarantee that parking tokens are not clobbered by other
synchronization primitives.
# Unsupported or Unimplemented Features
Most features are implemented. The following features are not
implemented due to the lack of native support:
- `fs::File::{file_attr, truncate, duplicate, set_permissions}`
- `fs::{symlink, link, canonicalize}`
- Process creation
- Command-line arguments
Backtrace generation is not really a good fit for embedded targets, so
it's intentionally left unimplemented. Unwinding is functional, however.
## Dynamic Linking
Dynamic linking is not supported. The target platform supports dynamic
linking, but enabling this in Rust causes several problems.
- The linker invocation used to build the shared object of `std` is
too long for the platform-provided linker to handle.
- A linker script with specific requirements is required for the
compiled shared object to be actually loadable.
As such, we decided to disable dynamic linking for now. Regardless, the
users can try to create shared objects by manually invoking the linker.
## Executable
Building an executable is not supported as the notion of "executable
files" isn't well-defined for these targets.
[1] https://solid.kmckk.com/SOLID/
[2] http://ertl.jp/ITRON/SPEC/mitron4-e.html
[3] https://en.wikipedia.org/wiki/ITRON_project
[4] https://toppers.jp/
|
|
Support `#[track_caller]` on closures and generators
## Lang team summary
This PR adds support for placing the `#[track_caller]` attribute on closure and generator expressions. This attribute's addition behaves identically (from a users perspective) to the attribute being placed on the method in impl Fn/FnOnce/FnMut for ... generated by compiler.
The attribute is currently "double" feature gated -- both `stmt_expr_attributes` (preexisting) and `closure_track_caller` (newly added) must be enabled in order to place these attributes on closures.
As the Fn* traits lack a `#[track_caller]` attribute in their definition, caller information does not propagate when invoking closures through dyn Fn*. There is no limitation that this PR adds in supporting this; it can be added in the future.
# Implementation details
This is implemented in the same way as for functions - an extra
location argument is appended to the end of the ABI. For closures,
this argument is *not* part of the 'tupled' argument storing the
parameters - the final closure argument for `#[track_caller]` closures
is no longer a tuple.
For direct (monomorphized) calls, the necessary support was already
implemented - we just needeed to adjust some assertions around checking
the ABI and argument count to take closures into account.
For calls through a trait object, more work was needed.
When creating a `ReifyShim`, we need to create a shim
for the trait method (e.g. `FnOnce::call_mut`) - unlike normal
functions, closures are never invoked directly, and always go through a
trait method.
Additional handling was needed for `InstanceDef::ClosureOnceShim`. In
order to pass location information throgh a direct (monomorphized) call
to `FnOnce::call_once` on an `FnMut` closure, we need to make
`ClosureOnceShim` aware of `#[tracked_caller]`. A new field
`track_caller` is added to `ClosureOnceShim` - this is used by
`InstanceDef::requires_caller` location, allowing codegen to
pass through the extra location argument.
Since `ClosureOnceShim.track_caller` is only used by codegen,
we end up generating two identical MIR shims - one for
`track_caller == true`, and one for `track_caller == false`. However,
these two shims are used by the entire crate (i.e. it's two shims total,
not two shims per unique closure), so this shouldn't a big deal.
|
|
This PR allows applying a `#[track_caller]` attribute to a
closure/generator expression. The attribute as interpreted as applying
to the compiler-generated implementation of the corresponding trait
method (`FnOnce::call_once`, `FnMut::call_mut`, `Fn::call`, or
`Generator::resume`).
This feature does not have its own feature gate - however, it requires
`#![feature(stmt_expr_attributes)]` in order to actually apply
an attribute to a closure or generator.
This is implemented in the same way as for functions - an extra
location argument is appended to the end of the ABI. For closures,
this argument is *not* part of the 'tupled' argument storing the
parameters - the final closure argument for `#[track_caller]` closures
is no longer a tuple.
For direct (monomorphized) calls, the necessary support was already
implemented - we just needeed to adjust some assertions around checking
the ABI and argument count to take closures into account.
For calls through a trait object, more work was needed.
When creating a `ReifyShim`, we need to create a shim
for the trait method (e.g. `FnOnce::call_mut`) - unlike normal
functions, closures are never invoked directly, and always go through a
trait method.
Additional handling was needed for `InstanceDef::ClosureOnceShim`. In
order to pass location information throgh a direct (monomorphized) call
to `FnOnce::call_once` on an `FnMut` closure, we need to make
`ClosureOnceShim` aware of `#[tracked_caller]`. A new field
`track_caller` is added to `ClosureOnceShim` - this is used by
`InstanceDef::requires_caller` location, allowing codegen to
pass through the extra location argument.
Since `ClosureOnceShim.track_caller` is only used by codegen,
we end up generating two identical MIR shims - one for
`track_caller == true`, and one for `track_caller == false`. However,
these two shims are used by the entire crate (i.e. it's two shims total,
not two shims per unique closure), so this shouldn't a big deal.
|
|
|
|
|
|
|
|
|
|
Update clobber_abi list to include k[1-7] regs
|
|
Update books
## rust-by-example
1 commits in 04f489c889235fe3b6dfe678ae5410d07deda958..9d4132b56c4999cd3ce1aeca5f1b2f2cb0d11c24
2021-08-17 08:01:20 -0300 to 2021-09-14 06:56:00 -0300
- Fix link to "integration testing" page (rust-lang/rust-by-example#1458)
## rustc-dev-guide
17 commits in 95f1acf9a39d6f402f654e917e2c1dfdb779c5fc..9198465b6ca8bed669df0cbb67c0e6d0b140803c
2021-08-31 12:38:30 -0500 to 2021-09-12 11:50:44 -0500
- Clarify difference of a help vs note diagnostic.
- remove ctag section
- Update suggested.md
- Update SUMMARY.md
- Move ctag section to "Suggested Workflow"
- Delete ctags.md
- Clarify paragraph in "Keeping things up to date"
- Docs: added section on rustdoc
- Docs: made suggested fix
- Docs: deleted copy
- Docs: added section discussing core ideas
- Docs: delete redundant use of correctness
- Docs: consolidated parallelism information
- Add links to overview.md (rust-lang/rustc-dev-guide#1202)
- Spelling change intermidiate to intermediate
- Fix a typo (rust-lang/rustc-dev-guide#1200)
- Documenting diagnostic items with their usage and naming conventions (rust-lang/rustc-dev-guide#1192)
## embedded-book
1 commits in c3a51e23859554369e6bbb5128dcef0e4f159fb5..4c76da9ddb4650203c129fceffdea95a3466c205
2021-08-26 07:04:58 +0000 to 2021-09-12 12:43:03 +0000
- 2.1(QEMU): update app name for git project init (rust-embedded/book#301)
|
|
Introduce -Z remap-cwd-prefix switch
This switch remaps any absolute paths rooted under the current
working directory to a new value. This includes remapping the
debug info in `DW_AT_comp_dir` and `DW_AT_decl_file`.
Importantly, this flag does not require passing the current working
directory to the compiler, such that the command line can be
run on any machine (with the same input files) and produce the
same results. This is critical property for debugging compiler
issues that crop up on remote machines.
This is based on adetaylor's https://github.com/rust-lang/rust/commit/dbc4ae7cba0ba8d650b91ddd459b86a02a2d05c5
Major Change Proposal: https://github.com/rust-lang/compiler-team/issues/450
Discussed on #38322. Would resolve issue #87325.
|
|
|
|
|
|
|