| Age | Commit message (Collapse) | Author | Lines |
|
|
|
|
|
|
|
I found some areas for improvement while attempting to debug the
SegFault issue when running rust programs compiled using MSVC, with
coverage instrumentation.
I discovered that LLVM's coverage writer was generating incomplete
function name variable names (that's not a typo: the name of the
variable that holds a function name).
The existing implementation used one-up numbers to distinguish
variables, and correcting the names did not fix the MSVC coverage bug,
but the fix in this PR makes the names and resulting LLVM IR easier to
follow and more consistent with Clang's implementation.
I also changed the way the `-Zinstrument-coverage` option is supported
in symbol_export.rs. The original implementation was incorrect, and the
corrected version matches the handling for `-Zprofile-generate`, as it
turns out.
(An argument could be made that maybe `-Zinstrument-coverage` should
automatically enable `-Cprofile-generate`. In fact, if
`-Cprofile-generate` is analagous to Clang's `-fprofile-generate`, as
some documentation implies, Clang always requires this flag for its
implementation of source-based code coverage. This would require a
little more validation, and if implemented, would probably require
updating some of the user-facing messages related to
`-Cprofile-generate` to not be so specific to the PGO use case.)
None of these changes fixed the MSVC coverage problems, but they should
still be welcome improvements.
Lastly, I added some additional FIXME comments in instrument_coverage.rs
describing issues I found with the generated LLVM IR that would be
resolved if the coverage instrumentation is injected with a `Statement`
instead of as a new `BasicBlock`. I describe seven advantages of this
change, but it requires some discussion before making a change like
this.
|
|
Completes support for coverage in external crates
Follow-up to #74959 :
The prior PR corrected for errors encountered when trying to generate
the coverage map on source code inlined from external crates (including
macros and generics) by avoiding adding external DefIds to the coverage
map.
This made it possible to generate a coverage report including external
crates, but the external crate coverage was incomplete (did not include
coverage for the DefIds that were eliminated.
The root issue was that the coverage map was converting Span locations
to source file and locations, using the SourceMap for the current crate,
and this would not work for spans from external crates (compliled with a
different SourceMap).
The solution was to convert the Spans to filename and location during
MIR generation instead, so precompiled external crates would already
have the correct source code locations embedded in their MIR, when
imported into another crate.
@wesleywiser FYI
r? @tmandry
|
|
The prior PR corrected for errors encountered when trying to generate
the coverage map on source code inlined from external crates (including
macros and generics) by avoiding adding external DefIds to the coverage
map.
This made it possible to generate a coverage report including external
crates, but the external crate coverage was incomplete (did not include
coverage for the DefIds that were eliminated.
The root issue was that the coverage map was converting Span locations
to source file and locations, using the SourceMap for the current crate,
and this would not work for spans from external crates (compliled with a
different SourceMap).
The solution was to convert the Spans to filename and location during
MIR generation instead, so precompiled external crates would already
have the correct source code locations embedded in their MIR, when
imported into another crate.
|
|
rustc: Improving safe wasm float->int casts
This commit improves code generation for WebAssembly targets when
translating floating to integer casts. This improvement is only relevant
when the `nontrapping-fptoint` feature is not enabled, but the feature
is not enabled by default right now. Additionally this improvement only
affects safe casts since unchecked casts were improved in #74659.
Some more background for this issue is present on #73591, but the
general gist of the issue is that in LLVM the `fptosi` and `fptoui`
instructions are defined to return an `undef` value if they execute on
out-of-bounds values; they notably do not trap. To implement these
instructions for WebAssembly the LLVM backend must therefore generate
quite a few instructions before executing `i32.trunc_f32_s` (for
example) because this WebAssembly instruction traps on out-of-bounds
values. This codegen into wasm instructions happens very late in the
code generator, so what ends up happening is that rustc inserts its own
codegen to implement Rust's saturating semantics, and then LLVM also
inserts its own codegen to make sure that the `fptosi` instruction
doesn't trap. Overall this means that a function like this:
#[no_mangle]
pub unsafe extern "C" fn cast(x: f64) -> u32 {
x as u32
}
will generate this WebAssembly today:
(func $cast (type 0) (param f64) (result i32)
(local i32 i32)
local.get 0
f64.const 0x1.fffffffep+31 (;=4.29497e+09;)
f64.gt
local.set 1
block ;; label = @1
block ;; label = @2
local.get 0
f64.const 0x0p+0 (;=0;)
local.get 0
f64.const 0x0p+0 (;=0;)
f64.gt
select
local.tee 0
f64.const 0x1p+32 (;=4.29497e+09;)
f64.lt
local.get 0
f64.const 0x0p+0 (;=0;)
f64.ge
i32.and
i32.eqz
br_if 0 (;@2;)
local.get 0
i32.trunc_f64_u
local.set 2
br 1 (;@1;)
end
i32.const 0
local.set 2
end
i32.const -1
local.get 2
local.get 1
select)
This PR improves the situation by updating the code generation for
float-to-int conversions in rustc, specifically only for WebAssembly
targets and only for some situations (float-to-u8 still has not great
codegen). The fix here is to use basic blocks and control flow to avoid
speculatively executing `fptosi`, and instead LLVM's raw intrinsic for
the WebAssembly instruction is used instead. This effectively extends
the support added in #74659 to checked casts. After this commit the
codegen for the above Rust function looks like:
(func $cast (type 0) (param f64) (result i32)
(local i32)
block ;; label = @1
local.get 0
f64.const 0x0p+0 (;=0;)
f64.ge
local.tee 1
i32.const 1
i32.xor
br_if 0 (;@1;)
local.get 0
f64.const 0x1.fffffffep+31 (;=4.29497e+09;)
f64.le
i32.eqz
br_if 0 (;@1;)
local.get 0
i32.trunc_f64_u
return
end
i32.const -1
i32.const 0
local.get 1
select)
For reference, in Rust 1.44, which did not have saturating
float-to-integer casts, the codegen LLVM would emit is:
(func $cast (type 0) (param f64) (result i32)
block ;; label = @1
local.get 0
f64.const 0x1p+32 (;=4.29497e+09;)
f64.lt
local.get 0
f64.const 0x0p+0 (;=0;)
f64.ge
i32.and
i32.eqz
br_if 0 (;@1;)
local.get 0
i32.trunc_f64_u
return
end
i32.const 0)
So we're relatively close to the original codegen, although it's
slightly different because the semantics of the function changed where
we're emulating the `i32.trunc_sat_f32_s` instruction rather than always
replacing out-of-bounds values with zero.
There is still work that could be done to improve casts such as `f32` to
`u8`. That form of cast still uses the `fptosi` instruction which
generates lots of branch-y code. This seems less important to tackle now
though. In the meantime this should take care of most use cases of
floating-point conversion and as a result I'm going to speculate that
this...
Closes #73591
|
|
This commit improves code generation for WebAssembly targets when
translating floating to integer casts. This improvement is only relevant
when the `nontrapping-fptoint` feature is not enabled, but the feature
is not enabled by default right now. Additionally this improvement only
affects safe casts since unchecked casts were improved in #74659.
Some more background for this issue is present on #73591, but the
general gist of the issue is that in LLVM the `fptosi` and `fptoui`
instructions are defined to return an `undef` value if they execute on
out-of-bounds values; they notably do not trap. To implement these
instructions for WebAssembly the LLVM backend must therefore generate
quite a few instructions before executing `i32.trunc_f32_s` (for
example) because this WebAssembly instruction traps on out-of-bounds
values. This codegen into wasm instructions happens very late in the
code generator, so what ends up happening is that rustc inserts its own
codegen to implement Rust's saturating semantics, and then LLVM also
inserts its own codegen to make sure that the `fptosi` instruction
doesn't trap. Overall this means that a function like this:
#[no_mangle]
pub unsafe extern "C" fn cast(x: f64) -> u32 {
x as u32
}
will generate this WebAssembly today:
(func $cast (type 0) (param f64) (result i32)
(local i32 i32)
local.get 0
f64.const 0x1.fffffffep+31 (;=4.29497e+09;)
f64.gt
local.set 1
block ;; label = @1
block ;; label = @2
local.get 0
f64.const 0x0p+0 (;=0;)
local.get 0
f64.const 0x0p+0 (;=0;)
f64.gt
select
local.tee 0
f64.const 0x1p+32 (;=4.29497e+09;)
f64.lt
local.get 0
f64.const 0x0p+0 (;=0;)
f64.ge
i32.and
i32.eqz
br_if 0 (;@2;)
local.get 0
i32.trunc_f64_u
local.set 2
br 1 (;@1;)
end
i32.const 0
local.set 2
end
i32.const -1
local.get 2
local.get 1
select)
This PR improves the situation by updating the code generation for
float-to-int conversions in rustc, specifically only for WebAssembly
targets and only for some situations (float-to-u8 still has not great
codegen). The fix here is to use basic blocks and control flow to avoid
speculatively executing `fptosi`, and instead LLVM's raw intrinsic for
the WebAssembly instruction is used instead. This effectively extends
the support added in #74659 to checked casts. After this commit the
codegen for the above Rust function looks like:
(func $cast (type 0) (param f64) (result i32)
(local i32)
block ;; label = @1
local.get 0
f64.const 0x0p+0 (;=0;)
f64.ge
local.tee 1
i32.const 1
i32.xor
br_if 0 (;@1;)
local.get 0
f64.const 0x1.fffffffep+31 (;=4.29497e+09;)
f64.le
i32.eqz
br_if 0 (;@1;)
local.get 0
i32.trunc_f64_u
return
end
i32.const -1
i32.const 0
local.get 1
select)
For reference, in Rust 1.44, which did not have saturating
float-to-integer casts, the codegen LLVM would emit is:
(func $cast (type 0) (param f64) (result i32)
block ;; label = @1
local.get 0
f64.const 0x1p+32 (;=4.29497e+09;)
f64.lt
local.get 0
f64.const 0x0p+0 (;=0;)
f64.ge
i32.and
i32.eqz
br_if 0 (;@1;)
local.get 0
i32.trunc_f64_u
return
end
i32.const 0)
So we're relatively close to the original codegen, although it's
slightly different because the semantics of the function changed where
we're emulating the `i32.trunc_sat_f32_s` instruction rather than always
replacing out-of-bounds values with zero.
There is still work that could be done to improve casts such as `f32` to
`u8`. That form of cast still uses the `fptosi` instruction which
generates lots of branch-y code. This seems less important to tackle now
though. In the meantime this should take care of most use cases of
floating-point conversion and as a result I'm going to speculate that
this...
Closes #73591
|
|
Found some problems with the coverage map encoding when testing with
more than one counter per function.
While debugging, I realized some better ways to structure the Rust
implementation of the coverage mapping generator. I refactored somewhat,
resulting in less code overall, expanded coverage of LLVM Coverage Map
capabilities, and much closer alignment with LLVM data structures, APIs,
and naming.
This should be easier to follow and easier to maintain.
|
|
functions
This patch extends the existing `type_i8p` method so that it requires an
explicit address space to be specified. Before this patch, the
`type_i8p` method implcitily assumed the default address space, which is
not a safe transformation on all targets, namely AVR.
The Rust compiler already has support for tracking the "instruction
address space" on a per-target basis. This patch extends the code
generation routines so that an address space must always be specified.
In my estimation, around 15% of the callers of `type_i8p` produced
invalid code on AVR due to the loss of address space prior to LLVM final
code generation. This would lead to unavoidable assertion errors
relating to invalid bitcasts.
With this patch, the address space is always either 1) explicitly set to
the instruction address space because the logic is dealing with functions
which must be placed there, or 2) explicitly set to the default address
space 0 because the logic can only operate on data space pointers and thus
we keep the existing semantics of assuming the default, "data" address space.
|
|
rustc now generates the coverage map and can support (limited)
coverage report generation, at the function level.
Example:
$ BUILD=$HOME/rust/build/x86_64-unknown-linux-gnu
$ $BUILD/stage1/bin/rustc -Zinstrument-coverage \
$HOME/rust/src/test/run-make-fulldeps/instrument-coverage/main.rs
$ LLVM_PROFILE_FILE="main.profraw" ./main
called
$ $BUILD/llvm/bin/llvm-profdata merge -sparse main.profraw -o main.profdata
$ $BUILD/llvm/bin/llvm-cov show --instr-profile=main.profdata main
1| 1|pub fn will_be_called() {
2| 1| println!("called");
3| 1|}
4| |
5| 0|pub fn will_not_be_called() {
6| 0| println!("should not have been called");
7| 0|}
8| |
9| 1|fn main() {
10| 1| let less = 1;
11| 1| let more = 100;
12| 1|
13| 1| if less < more {
14| 1| will_be_called();
15| 1| } else {
16| 1| will_not_be_called();
17| 1| }
18| 1|}
|
|
Note that the output of `unpretty-debug.stdout` has changed. In that
test the hash values are normalized from a symbol numbers to small
numbers like "0#0" and "0#1". The increase in the number of static
symbols must have caused the original numbers to contain more digits,
resulting in different pretty-printing prior to normalization.
|
|
|
|
Use WASM's saturating casts if they are available
WebAssembly supports saturating floating point to integer casts behind a target feature. The feature is already available on many browsers. Beginning with 1.45 Rust will start defining the behavior of floating point to integer casts to be saturating as well. For this Rust constructs additional checks on top of the `fptoui` / `fptosi` instructions it emits. Here we introduce the possibility for the codegen backend to construct saturating casts itself and only fall back to constructing the checks ourselves if that is not possible.
Resolves part of #73591
|
|
WebAssembly supports saturating floating point to integer casts behind a
target feature. The feature is already available on many browsers.
Beginning with 1.45 Rust will start defining the behavior of floating
point to integer casts to be saturating as well. For this Rust
constructs additional checks on top of the `fptoui` / `fptosi`
instructions it emits. Here we introduce the possibility for the codegen
backend to construct saturating casts itself and only fall back to
constructing the checks ourselves if that is not possible.
|
|
added regions with counter expressions and counters.
Added codegen_llvm/coverageinfo mod for upcoming coverage map
Move coverage region collection to CodegenCx finalization
Moved from `query coverageinfo` (renamed from `query coverage_data`),
as discussed in the PR at:
https://github.com/rust-lang/rust/pull/73684#issuecomment-649882503
Address merge conflict in MIR instrument_coverage test
The MIR test output format changed for int types.
moved debug messages out of block.rs
This makes the block.rs calls to add coverage mapping data to the
CodegenCx much more concise and readable.
move coverage intrinsic handling into llvm impl
I realized that having half of the coverage intrinsic handling in
`rustc_codegen_ssa` and half in `rustc_codegen_llvm` meant that any
non-llvm backend would be bound to the same decisions about how the
coverage-related MIR terminators should be handled.
To fix this, I moved the non-codegen portion of coverage intrinsic
handling into its own trait, and implemented it in `rustc_codegen_llvm`
alongside `codegen_intrinsic_call`.
I also added the (required?) stubs for the new intrinsics to
`IntrepretCx::emulate_intrinsic()`, to ensure calls to this function do
not fail if called with these new but known intrinsics.
address PR Feedback on 28 June 2020 2:48pm PDT
|
|
|
|
This initial version only injects counters at the top of each function.
Rust Coverage will require injecting additional counters at each
conditional code branch.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
asm! is left as a wrapper around llvm_asm! to maintain compatibility.
|
|
|
|
|
|
Optimize catch_unwind to match C++ try/catch
This refactors the implementation of catching unwinds to allow LLVM to inline the "try" closure directly into the happy path, avoiding indirection. This means that the catch_unwind implementation is (after this PR) zero-cost unless a panic is thrown.
https://rust.godbolt.org/z/cZcUSB is an example of the current codegen in a simple case. Notably, the codegen is *exactly the same* if `-Cpanic=abort` is passed, which is clearly not great.
This PR, on the other hand, generates the following assembly:
```asm
# -Cpanic=unwind:
push rbx
mov ebx,0x2a
call QWORD PTR [rip+0x1c53c] # <happy>
mov eax,ebx
pop rbx
ret
mov rdi,rax
call QWORD PTR [rip+0x1c537] # cleanup function call
call QWORD PTR [rip+0x1c539] # <unfortunate>
mov ebx,0xd
mov eax,ebx
pop rbx
ret
# -Cpanic=abort:
push rax
call QWORD PTR [rip+0x20a1] # <happy>
mov eax,0x2a
pop rcx
ret
```
Fixes #64224, and resolves #64222.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|