| Age | Commit message (Collapse) | Author | Lines |
|
|
|
|
|
|
|
|
|
|
|
I found some areas for improvement while attempting to debug the
SegFault issue when running rust programs compiled using MSVC, with
coverage instrumentation.
I discovered that LLVM's coverage writer was generating incomplete
function name variable names (that's not a typo: the name of the
variable that holds a function name).
The existing implementation used one-up numbers to distinguish
variables, and correcting the names did not fix the MSVC coverage bug,
but the fix in this PR makes the names and resulting LLVM IR easier to
follow and more consistent with Clang's implementation.
I also changed the way the `-Zinstrument-coverage` option is supported
in symbol_export.rs. The original implementation was incorrect, and the
corrected version matches the handling for `-Zprofile-generate`, as it
turns out.
(An argument could be made that maybe `-Zinstrument-coverage` should
automatically enable `-Cprofile-generate`. In fact, if
`-Cprofile-generate` is analagous to Clang's `-fprofile-generate`, as
some documentation implies, Clang always requires this flag for its
implementation of source-based code coverage. This would require a
little more validation, and if implemented, would probably require
updating some of the user-facing messages related to
`-Cprofile-generate` to not be so specific to the PGO use case.)
None of these changes fixed the MSVC coverage problems, but they should
still be welcome improvements.
Lastly, I added some additional FIXME comments in instrument_coverage.rs
describing issues I found with the generated LLVM IR that would be
resolved if the coverage instrumentation is injected with a `Statement`
instead of as a new `BasicBlock`. I describe seven advantages of this
change, but it requires some discussion before making a change like
this.
|
|
|
|
|
|
Completes support for coverage in external crates
Follow-up to #74959 :
The prior PR corrected for errors encountered when trying to generate
the coverage map on source code inlined from external crates (including
macros and generics) by avoiding adding external DefIds to the coverage
map.
This made it possible to generate a coverage report including external
crates, but the external crate coverage was incomplete (did not include
coverage for the DefIds that were eliminated.
The root issue was that the coverage map was converting Span locations
to source file and locations, using the SourceMap for the current crate,
and this would not work for spans from external crates (compliled with a
different SourceMap).
The solution was to convert the Spans to filename and location during
MIR generation instead, so precompiled external crates would already
have the correct source code locations embedded in their MIR, when
imported into another crate.
@wesleywiser FYI
r? @tmandry
|
|
The prior PR corrected for errors encountered when trying to generate
the coverage map on source code inlined from external crates (including
macros and generics) by avoiding adding external DefIds to the coverage
map.
This made it possible to generate a coverage report including external
crates, but the external crate coverage was incomplete (did not include
coverage for the DefIds that were eliminated.
The root issue was that the coverage map was converting Span locations
to source file and locations, using the SourceMap for the current crate,
and this would not work for spans from external crates (compliled with a
different SourceMap).
The solution was to convert the Spans to filename and location during
MIR generation instead, so precompiled external crates would already
have the correct source code locations embedded in their MIR, when
imported into another crate.
|
|
rustc: Improving safe wasm float->int casts
This commit improves code generation for WebAssembly targets when
translating floating to integer casts. This improvement is only relevant
when the `nontrapping-fptoint` feature is not enabled, but the feature
is not enabled by default right now. Additionally this improvement only
affects safe casts since unchecked casts were improved in #74659.
Some more background for this issue is present on #73591, but the
general gist of the issue is that in LLVM the `fptosi` and `fptoui`
instructions are defined to return an `undef` value if they execute on
out-of-bounds values; they notably do not trap. To implement these
instructions for WebAssembly the LLVM backend must therefore generate
quite a few instructions before executing `i32.trunc_f32_s` (for
example) because this WebAssembly instruction traps on out-of-bounds
values. This codegen into wasm instructions happens very late in the
code generator, so what ends up happening is that rustc inserts its own
codegen to implement Rust's saturating semantics, and then LLVM also
inserts its own codegen to make sure that the `fptosi` instruction
doesn't trap. Overall this means that a function like this:
#[no_mangle]
pub unsafe extern "C" fn cast(x: f64) -> u32 {
x as u32
}
will generate this WebAssembly today:
(func $cast (type 0) (param f64) (result i32)
(local i32 i32)
local.get 0
f64.const 0x1.fffffffep+31 (;=4.29497e+09;)
f64.gt
local.set 1
block ;; label = @1
block ;; label = @2
local.get 0
f64.const 0x0p+0 (;=0;)
local.get 0
f64.const 0x0p+0 (;=0;)
f64.gt
select
local.tee 0
f64.const 0x1p+32 (;=4.29497e+09;)
f64.lt
local.get 0
f64.const 0x0p+0 (;=0;)
f64.ge
i32.and
i32.eqz
br_if 0 (;@2;)
local.get 0
i32.trunc_f64_u
local.set 2
br 1 (;@1;)
end
i32.const 0
local.set 2
end
i32.const -1
local.get 2
local.get 1
select)
This PR improves the situation by updating the code generation for
float-to-int conversions in rustc, specifically only for WebAssembly
targets and only for some situations (float-to-u8 still has not great
codegen). The fix here is to use basic blocks and control flow to avoid
speculatively executing `fptosi`, and instead LLVM's raw intrinsic for
the WebAssembly instruction is used instead. This effectively extends
the support added in #74659 to checked casts. After this commit the
codegen for the above Rust function looks like:
(func $cast (type 0) (param f64) (result i32)
(local i32)
block ;; label = @1
local.get 0
f64.const 0x0p+0 (;=0;)
f64.ge
local.tee 1
i32.const 1
i32.xor
br_if 0 (;@1;)
local.get 0
f64.const 0x1.fffffffep+31 (;=4.29497e+09;)
f64.le
i32.eqz
br_if 0 (;@1;)
local.get 0
i32.trunc_f64_u
return
end
i32.const -1
i32.const 0
local.get 1
select)
For reference, in Rust 1.44, which did not have saturating
float-to-integer casts, the codegen LLVM would emit is:
(func $cast (type 0) (param f64) (result i32)
block ;; label = @1
local.get 0
f64.const 0x1p+32 (;=4.29497e+09;)
f64.lt
local.get 0
f64.const 0x0p+0 (;=0;)
f64.ge
i32.and
i32.eqz
br_if 0 (;@1;)
local.get 0
i32.trunc_f64_u
return
end
i32.const 0)
So we're relatively close to the original codegen, although it's
slightly different because the semantics of the function changed where
we're emulating the `i32.trunc_sat_f32_s` instruction rather than always
replacing out-of-bounds values with zero.
There is still work that could be done to improve casts such as `f32` to
`u8`. That form of cast still uses the `fptosi` instruction which
generates lots of branch-y code. This seems less important to tackle now
though. In the meantime this should take care of most use cases of
floating-point conversion and as a result I'm going to speculate that
this...
Closes #73591
|
|
Fixed a known issue in the coverage map where some regions had
nonsensical source code locations. External crate functions are already
included in their own coverage maps, per library, and don't need to also
be added to the importing crate's coverage map. (In fact, their source
start and end byte positions are not relevant to the importing crate's
SourceMap.)
The fix was to simply skip trying to add imported coverage info to the
coverage map if the instrumented function is not "local".
The injected counters are still relevant, however, and the LLVM
`instrprof.increment` intrinsic call parameters will map those counters
to the external crates' coverage maps, when generating runtime coverage
data.
|
|
This commit improves code generation for WebAssembly targets when
translating floating to integer casts. This improvement is only relevant
when the `nontrapping-fptoint` feature is not enabled, but the feature
is not enabled by default right now. Additionally this improvement only
affects safe casts since unchecked casts were improved in #74659.
Some more background for this issue is present on #73591, but the
general gist of the issue is that in LLVM the `fptosi` and `fptoui`
instructions are defined to return an `undef` value if they execute on
out-of-bounds values; they notably do not trap. To implement these
instructions for WebAssembly the LLVM backend must therefore generate
quite a few instructions before executing `i32.trunc_f32_s` (for
example) because this WebAssembly instruction traps on out-of-bounds
values. This codegen into wasm instructions happens very late in the
code generator, so what ends up happening is that rustc inserts its own
codegen to implement Rust's saturating semantics, and then LLVM also
inserts its own codegen to make sure that the `fptosi` instruction
doesn't trap. Overall this means that a function like this:
#[no_mangle]
pub unsafe extern "C" fn cast(x: f64) -> u32 {
x as u32
}
will generate this WebAssembly today:
(func $cast (type 0) (param f64) (result i32)
(local i32 i32)
local.get 0
f64.const 0x1.fffffffep+31 (;=4.29497e+09;)
f64.gt
local.set 1
block ;; label = @1
block ;; label = @2
local.get 0
f64.const 0x0p+0 (;=0;)
local.get 0
f64.const 0x0p+0 (;=0;)
f64.gt
select
local.tee 0
f64.const 0x1p+32 (;=4.29497e+09;)
f64.lt
local.get 0
f64.const 0x0p+0 (;=0;)
f64.ge
i32.and
i32.eqz
br_if 0 (;@2;)
local.get 0
i32.trunc_f64_u
local.set 2
br 1 (;@1;)
end
i32.const 0
local.set 2
end
i32.const -1
local.get 2
local.get 1
select)
This PR improves the situation by updating the code generation for
float-to-int conversions in rustc, specifically only for WebAssembly
targets and only for some situations (float-to-u8 still has not great
codegen). The fix here is to use basic blocks and control flow to avoid
speculatively executing `fptosi`, and instead LLVM's raw intrinsic for
the WebAssembly instruction is used instead. This effectively extends
the support added in #74659 to checked casts. After this commit the
codegen for the above Rust function looks like:
(func $cast (type 0) (param f64) (result i32)
(local i32)
block ;; label = @1
local.get 0
f64.const 0x0p+0 (;=0;)
f64.ge
local.tee 1
i32.const 1
i32.xor
br_if 0 (;@1;)
local.get 0
f64.const 0x1.fffffffep+31 (;=4.29497e+09;)
f64.le
i32.eqz
br_if 0 (;@1;)
local.get 0
i32.trunc_f64_u
return
end
i32.const -1
i32.const 0
local.get 1
select)
For reference, in Rust 1.44, which did not have saturating
float-to-integer casts, the codegen LLVM would emit is:
(func $cast (type 0) (param f64) (result i32)
block ;; label = @1
local.get 0
f64.const 0x1p+32 (;=4.29497e+09;)
f64.lt
local.get 0
f64.const 0x0p+0 (;=0;)
f64.ge
i32.and
i32.eqz
br_if 0 (;@1;)
local.get 0
i32.trunc_f64_u
return
end
i32.const 0)
So we're relatively close to the original codegen, although it's
slightly different because the semantics of the function changed where
we're emulating the `i32.trunc_sat_f32_s` instruction rather than always
replacing out-of-bounds values with zero.
There is still work that could be done to improve casts such as `f32` to
`u8`. That form of cast still uses the `fptosi` instruction which
generates lots of branch-y code. This seems less important to tackle now
though. In the meantime this should take care of most use cases of
floating-point conversion and as a result I'm going to speculate that
this...
Closes #73591
|
|
Found some problems with the coverage map encoding when testing with
more than one counter per function.
While debugging, I realized some better ways to structure the Rust
implementation of the coverage mapping generator. I refactored somewhat,
resulting in less code overall, expanded coverage of LLVM Coverage Map
capabilities, and much closer alignment with LLVM data structures, APIs,
and naming.
This should be easier to follow and easier to maintain.
|
|
This commit improves codegen for unchecked casts on WebAssembly targets
to use the singluar `iNN.trunc_fMM_{u,s}` instructions. Previously rustc
would codegen a bare `fptosi` and `fptoui` for float casts but for
WebAssembly targets the codegen for these instructions is quite large.
This large codegen is due to the fact that LLVM can speculate these
instructions so the trapping behavior of WebAssembly needs to be
protected against in case they're speculated.
The change here is to update the codegen for the unchecked cast
intrinsics to have a wasm-specific case where they call the appropriate
LLVM intrinsic to generate the right wasm instruction. The intrinsic is
explicitly opting-in to undefined behavior so a trap here for
out-of-bounds inputs on wasm should be acceptable.
cc #73591
|
|
This commit implements the `unused_generic_params` query, an initial
version of polymorphization which detects when an item does not use
generic parameters and is being needlessly monomorphized as a result.
Signed-off-by: David Wood <david@davidtw.co>
|
|
rustc now generates the coverage map and can support (limited)
coverage report generation, at the function level.
Example:
$ BUILD=$HOME/rust/build/x86_64-unknown-linux-gnu
$ $BUILD/stage1/bin/rustc -Zinstrument-coverage \
$HOME/rust/src/test/run-make-fulldeps/instrument-coverage/main.rs
$ LLVM_PROFILE_FILE="main.profraw" ./main
called
$ $BUILD/llvm/bin/llvm-profdata merge -sparse main.profraw -o main.profdata
$ $BUILD/llvm/bin/llvm-cov show --instr-profile=main.profdata main
1| 1|pub fn will_be_called() {
2| 1| println!("called");
3| 1|}
4| |
5| 0|pub fn will_not_be_called() {
6| 0| println!("should not have been called");
7| 0|}
8| |
9| 1|fn main() {
10| 1| let less = 1;
11| 1| let more = 100;
12| 1|
13| 1| if less < more {
14| 1| will_be_called();
15| 1| } else {
16| 1| will_not_be_called();
17| 1| }
18| 1|}
|
|
|
|
This eliminates a bunch of `Symbol::intern()` and `Symbol::as_str()`
calls, which is good, because they require locking the interner.
Note that the unsafety in `from_cycle_error()` is identical to the
unsafety on other adjacent impls.
|
|
Note that the output of `unpretty-debug.stdout` has changed. In that
test the hash values are normalized from a symbol numbers to small
numbers like "0#0" and "0#1". The increase in the number of static
symbols must have caused the original numbers to contain more digits,
resulting in different pretty-printing prior to normalization.
|
|
added regions with counter expressions and counters.
Added codegen_llvm/coverageinfo mod for upcoming coverage map
Move coverage region collection to CodegenCx finalization
Moved from `query coverageinfo` (renamed from `query coverage_data`),
as discussed in the PR at:
https://github.com/rust-lang/rust/pull/73684#issuecomment-649882503
Address merge conflict in MIR instrument_coverage test
The MIR test output format changed for int types.
moved debug messages out of block.rs
This makes the block.rs calls to add coverage mapping data to the
CodegenCx much more concise and readable.
move coverage intrinsic handling into llvm impl
I realized that having half of the coverage intrinsic handling in
`rustc_codegen_ssa` and half in `rustc_codegen_llvm` meant that any
non-llvm backend would be bound to the same decisions about how the
coverage-related MIR terminators should be handled.
To fix this, I moved the non-codegen portion of coverage intrinsic
handling into its own trait, and implemented it in `rustc_codegen_llvm`
alongside `codegen_intrinsic_call`.
I also added the (required?) stubs for the new intrinsics to
`IntrepretCx::emulate_intrinsic()`, to ensure calls to this function do
not fail if called with these new but known intrinsics.
address PR Feedback on 28 June 2020 2:48pm PDT
|
|
Add unstable `core::mem::variant_count` intrinsic
Adds a new `const fn` intrinsic which can be used to determine the number of variants in an `enum`.
I've shown this to a couple of people and they invariably ask 'why on earth?', but there's actually a very neat use case:
At the moment, if you want to create an opaque array type that's indexed by an `enum` with one element for each variant, you either have to hard-code the number of variants, add a `LENGTH` variant or use a `Vec`, none of which are suitable in general (number of variants could change; pattern matching `LENGTH` becomes frustrating; might not have `alloc`). By including this intrinsic, it becomes possible to write the following:
```rust
#[derive(Copy, Clone)]
enum OpaqueIndex {
A = 0,
B,
C,
}
struct OpaqueVec<T>(Box<[T; std::mem::num_variants::<OpaqueIndex>()]>);
impl<T> std::ops::Index<OpaqueIndex> for OpaqueVec<T> {
type Output = T;
fn index(&self, idx: OpaqueIndex) -> &Self::Output {
&self.0[idx as usize]
}
}
```
(We even have a use cases for this in `rustc` and I plan to use it to re-implement the lang-items table.)
|
|
|
|
code coverage foundation for hash and num_counters
This PR is the next iteration after PR #73011 (which is still waiting on bors to merge).
@wesleywiser - PTAL
r? @tmandry
(FYI, I'm also working on injecting the coverage maps, in another branch, while waiting for these to merge.)
Thanks!
|
|
Also added FIXME comments to note the possible need to accommodate
counter increment calls in source-based functions that differ from the
function context of the caller instance (e.g., inline functions).
|
|
|
|
This commit adds a query that allows the CoverageData to be pulled from
a call on tcx, avoiding the need to change the
`codegen_intrinsic_call()` signature (no need to pass in the FunctionCx
or any additional arguments.
The commit does not change where/when the CoverageData is computed. It's
still done in the `pass`, and saved in the MIR `Body`.
See discussion (in progress) here:
https://github.com/rust-lang/rust/pull/73488#discussion_r443825646
|
|
|
|
|
|
Replaced dummy values for hash and num_counters with computed values,
and refactored InstrumentCoverage pass to simplify injecting more
counters per function in upcoming versions.
Improved usage documentation and error messaging.
|
|
|
|
|
|
This initial version only injects counters at the top of each function.
Rust Coverage will require injecting additional counters at each
conditional code branch.
|
|
|
|
|
|
LLVM 8 was released on March 20, 2019, over a year ago.
|
|
|
|
Stabilize float::to_int_unchecked
This renames and stabilizes unsafe floating point to integer casts, which are intended to be the substitute for the currently unsound `as` behavior, once that changes to safe-but-slower saturating casts. As such, I believe this also likely unblocks #10184 (our oldest I-unsound issue!), as once this rolls out to stable it would be far easier IMO to change the behavior of `as` to be safe by default.
This does not stabilize the trait or the associated method, as they are deemed internal implementation details (and consumers should not, generally, want to expose them, as in practice all callers likely know statically/without generics what the return type is).
Closes #67058
|
|
|
|
|
|
This renames and stabilizes unsafe floating point to integer casts, which are
intended to be the substitute for the currently unsound `as` behavior, once that
changes to safe-but-slower saturating casts.
|
|
|
|
librustc_codegen_llvm: Replace deprecated API usage
|
|
implement zeroed and uninitialized with MaybeUninit
This is the second attempt of doing such a change (first PR: https://github.com/rust-lang/rust/pull/62150). The last change [got reverted](https://github.com/rust-lang/rust/pull/63343) because it [caused](https://github.com/rust-lang/rust/issues/62825) some [issues](https://github.com/rust-lang/rust/issues/52898#issuecomment-512182438) in [code that incorrectly used these functions](https://github.com/erlepereira/x11-rs/issues/99).
Since then, the [problematic code has been fixed](https://github.com/erlepereira/x11-rs/pull/101), and rustc [gained a lint](https://github.com/rust-lang/rust/pull/63346) that is able to detect many misuses of these functions statically and a [dynamic check that panics](https://github.com/rust-lang/rust/pull/66059) instead of causing UB for some incorrect uses.
Fixes https://github.com/rust-lang/rust/issues/62825
|
|
|
|
|
|
|
|
|
|
|
|
|