| Age | Commit message (Collapse) | Author | Lines |
|
For better throughput during parallel processing by LLVM, we used to sort
CGUs largest to smallest. This would lead to better thread utilization
by, for example, preventing a large CGU from being processed last and
having only one LLVM thread working while the rest remained idle.
However, this strategy would lead to high memory usage, as it meant the
LLVM-IR for all of the largest CGUs would be resident in memory at once.
Instead, we can compromise by ordering CGUs such that the largest and
smallest are first, second largest and smallest are next, etc. If there
are large size variations, this can reduce memory usage significantly.
|
|
Indicate both start and end of pass RSS in time-passes output
Previously, only the end of pass RSS was indicated. This could easily
lead one to believe that the change in RSS from one pass to the next was
attributable to the second pass, when in fact it occurred between the
end of the first pass and the start of the second.
Also, improve alignment of columns.
Sample of output:
```
time: 0.739; rss: 607MB -> 637MB item_types_checking
time: 8.429; rss: 637MB -> 775MB item_bodies_checking
time: 11.063; rss: 470MB -> 775MB type_check_crate
time: 0.232; rss: 775MB -> 777MB match_checking
time: 0.139; rss: 777MB -> 779MB liveness_and_intrinsic_checking
time: 0.372; rss: 775MB -> 779MB misc_checking_2
time: 8.188; rss: 779MB -> 1019MB MIR_borrow_checking
time: 0.062; rss: 1019MB -> 1021MB MIR_effect_checking
```
|
|
codegen: assume constants cannot fail to evaluate
https://github.com/rust-lang/rust/pull/80579 landed, so we can finally remove this old hack from codegen and instead assume that consts never fail to evaluate. :)
r? `@oli-obk`
|
|
also don't submit code to LLVM when the session has errors
|
|
Previously, only the end of pass RSS was indicated. This could easily
lead one to believe that the change in RSS from one pass to the next was
attributable to the second pass, when in fact it occurred between the
end of the first pass and the start of the second.
Also, improve alignment of columns.
|
|
clean up some const error reporting around promoteds
These are some error reporting simplifications enabled by https://github.com/rust-lang/rust/pull/80579.
Further simplifications are possible but could be blocked on making `const_err` a hard error.
r? ``````@oli-obk``````
|
|
Use -target when linking binaries for Mac Catalyst
When running `rustc` with `-target x86_64-apple-ios-macabi`, the linker
eventually gets run with `-arch x86_64`, because the linker back end splits the
LLVM target triple and uses the first token as the target architecture. However,
this does not work for the Mac Catalyst ABI, which is a separate target from
Darwin.
Specifying the full target triple with `-target` allows Mac Catalyst binaries to
link and run.
closes #80202
|
|
rustc: Stabilize `-Zrun-dsymutil` as `-Csplit-debuginfo`
This commit adds a new stable codegen option to rustc,
`-Csplit-debuginfo`. The old `-Zrun-dsymutil` flag is deleted and now
subsumed by this stable flag. Additionally `-Zsplit-dwarf` is also
subsumed by this flag but still requires `-Zunstable-options` to
actually activate. The `-Csplit-debuginfo` flag takes one of
three values:
* `off` - This indicates that split-debuginfo from the final artifact is
not desired. This is not supported on Windows and is the default on
Unix platforms except macOS. On macOS this means that `dsymutil` is
not executed.
* `packed` - This means that debuginfo is desired in one location
separate from the main executable. This is the default on Windows
(`*.pdb`) and macOS (`*.dSYM`). On other Unix platforms this subsumes
`-Zsplit-dwarf=single` and produces a `*.dwp` file.
* `unpacked` - This means that debuginfo will be roughly equivalent to
object files, meaning that it's throughout the build directory
rather than in one location (often the fastest for local development).
This is not the default on any platform and is not supported on Windows.
Each target can indicate its own default preference for how debuginfo is
handled. Almost all platforms default to `off` except for Windows and
macOS which default to `packed` for historical reasons.
Some equivalencies for previous unstable flags with the new flags are:
* `-Zrun-dsymutil=yes` -> `-Csplit-debuginfo=packed`
* `-Zrun-dsymutil=no` -> `-Csplit-debuginfo=unpacked`
* `-Zsplit-dwarf=single` -> `-Csplit-debuginfo=packed`
* `-Zsplit-dwarf=split` -> `-Csplit-debuginfo=unpacked`
Note that `-Csplit-debuginfo` still requires `-Zunstable-options` for
non-macOS platforms since split-dwarf support was *just* implemented in
rustc.
There's some more rationale listed on #79361, but the main gist of the
motivation for this commit is that `dsymutil` can take quite a long time
to execute in debug builds and provides little benefit. This means that
incremental compile times appear that much worse on macOS because the
compiler is constantly running `dsymutil` over every single binary it
produces during `cargo build` (even build scripts!). Ideally rustc would
switch to not running `dsymutil` by default, but that's a problem left
to get tackled another day.
Closes #79361
|
|
This commit adds a new stable codegen option to rustc,
`-Csplit-debuginfo`. The old `-Zrun-dsymutil` flag is deleted and now
subsumed by this stable flag. Additionally `-Zsplit-dwarf` is also
subsumed by this flag but still requires `-Zunstable-options` to
actually activate. The `-Csplit-debuginfo` flag takes one of
three values:
* `off` - This indicates that split-debuginfo from the final artifact is
not desired. This is not supported on Windows and is the default on
Unix platforms except macOS. On macOS this means that `dsymutil` is
not executed.
* `packed` - This means that debuginfo is desired in one location
separate from the main executable. This is the default on Windows
(`*.pdb`) and macOS (`*.dSYM`). On other Unix platforms this subsumes
`-Zsplit-dwarf=single` and produces a `*.dwp` file.
* `unpacked` - This means that debuginfo will be roughly equivalent to
object files, meaning that it's throughout the build directory
rather than in one location (often the fastest for local development).
This is not the default on any platform and is not supported on Windows.
Each target can indicate its own default preference for how debuginfo is
handled. Almost all platforms default to `off` except for Windows and
macOS which default to `packed` for historical reasons.
Some equivalencies for previous unstable flags with the new flags are:
* `-Zrun-dsymutil=yes` -> `-Csplit-debuginfo=packed`
* `-Zrun-dsymutil=no` -> `-Csplit-debuginfo=unpacked`
* `-Zsplit-dwarf=single` -> `-Csplit-debuginfo=packed`
* `-Zsplit-dwarf=split` -> `-Csplit-debuginfo=unpacked`
Note that `-Csplit-debuginfo` still requires `-Zunstable-options` for
non-macOS platforms since split-dwarf support was *just* implemented in
rustc.
There's some more rationale listed on #79361, but the main gist of the
motivation for this commit is that `dsymutil` can take quite a long time
to execute in debug builds and provides little benefit. This means that
incremental compile times appear that much worse on macOS because the
compiler is constantly running `dsymutil` over every single binary it
produces during `cargo build` (even build scripts!). Ideally rustc would
switch to not running `dsymutil` by default, but that's a problem left
to get tackled another day.
Closes #79361
|
|
Refractor a few more types to `rustc_type_ir`
In the continuation of #79169, ~~blocked on that PR~~.
This PR:
- moves `IntVarValue`, `FloatVarValue`, `InferTy` (and friends) and `Variance`
- creates the `IntTy`, `UintTy` and `FloatTy` enums in `rustc_type_ir`, based on their `ast` and `chalk_ir` equilavents, and uses them for types in the rest of the compiler.
~~I will split up that commit to make this easier to review and to have a better commit history.~~
EDIT: done, I split the PR in commits of 200-ish lines each
r? `````@nikomatsakis````` cc `````@jackh726`````
|
|
rustc_codegen_ssa: use wall time for codegen_to_LLVM_IR time-passes entry
Use elapsed wall time spent on codegen_to_LLVM_IR for all CGUs as a
whole, rather than the sum for each CGU (the distinction matters for
parallel builds, where some CGUs are processed in parallel).
|
|
Use elapsed wall time spent on codegen_to_LLVM_IR for all CGUs as a
whole, rather than the sum for each CGU (the distinction matters for
parallel builds, where some CGUs are processed in parallel).
|
|
bjorn3:no_extern_backend_optimization_level_query_provider, r=cjgillot
Don't provide backend_optimization_level query for extern crates
Fixes #71291
|
|
Fix sysroot option not being honored across rustc
Change link_sanitizer_runtime() to check if the sanitizer library exists in the specified/session sysroot, and if it doesn't exist, use the default sysroot. (See #79253.)
|
|
|
|
PlaceRef::ty: use method call syntax
|
|
|
|
Skip linking if it is not required
This allows to use `--emit=metadata,obj` and other metadata + non-link combinations.
Fixes #81117.
|
|
Change link_sanitizer_runtime() to check if the sanitizer library exists
in the specified/session sysroot, and if it doesn't exist, use the
default sysroot.
|
|
|
|
This allows to use `--emit=metadata,obj` and other metadata
+ non-link combinations.
Fixes #81117.
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
|
|
Remove is_dllimport_foreign_item definition from cg_ssa
It overwrites the definition from rustc_metadata.
cc https://rust-lang.zulipchat.com/#narrow/stream/182449-t-compiler.2Fhelp/topic/query.20provided.20twice/near/218927806
Marked as draft to test if this breaks anything.
|
|
|
|
|
|
It overwrites the definition from rustc_metadata
|
|
remove redundant closures (clippy::redundant_closure)
|
|
Emit a reactor for cdylib target on wasi
Fixes #79199, and relevant to #73432
Implements wasi reactors, as described in WebAssembly/WASI#13 and [`design/application-abi.md`](https://github.com/WebAssembly/WASI/blob/master/design/application-abi.md)
Empty `lib.rs`, `lib.crate-type = ["cdylib"]`:
```shell
$ cargo +reactor build --release --target wasm32-wasi
Compiling wasm-reactor v0.1.0 (/home/coolreader18/wasm-reactor)
Finished release [optimized] target(s) in 0.08s
$ wasm-dis target/wasm32-wasi/release/wasm_reactor.wasm >reactor.wat
```
`reactor.wat`:
```wat
(module
(type $none_=>_none (func))
(type $i32_=>_none (func (param i32)))
(type $i32_i32_=>_i32 (func (param i32 i32) (result i32)))
(type $i32_=>_i32 (func (param i32) (result i32)))
(type $i32_i32_i32_=>_i32 (func (param i32 i32 i32) (result i32)))
(import "wasi_snapshot_preview1" "fd_prestat_get" (func $__wasi_fd_prestat_get (param i32 i32) (result i32)))
(import "wasi_snapshot_preview1" "fd_prestat_dir_name" (func $__wasi_fd_prestat_dir_name (param i32 i32 i32) (result i32)))
(import "wasi_snapshot_preview1" "proc_exit" (func $__wasi_proc_exit (param i32)))
(import "wasi_snapshot_preview1" "environ_sizes_get" (func $__wasi_environ_sizes_get (param i32 i32) (result i32)))
(import "wasi_snapshot_preview1" "environ_get" (func $__wasi_environ_get (param i32 i32) (result i32)))
(memory $0 17)
(table $0 1 1 funcref)
(global $global$0 (mut i32) (i32.const 1048576))
(global $global$1 i32 (i32.const 1049096))
(global $global$2 i32 (i32.const 1049096))
(export "memory" (memory $0))
(export "_initialize" (func $_initialize))
(export "__data_end" (global $global$1))
(export "__heap_base" (global $global$2))
(func $__wasm_call_ctors
(call $__wasilibc_initialize_environ_eagerly)
(call $__wasilibc_populate_preopens)
)
(func $_initialize
(call $__wasm_call_ctors)
)
(func $malloc (param $0 i32) (result i32)
(call $dlmalloc
(local.get $0)
)
)
;; lots of dlmalloc, memset/memcpy, & libpreopen code
)
```
I went with repurposing cdylib because I figured that it doesn't make much sense to have a wasi shared library that can't be initialized, and even if someone was using it adding an `_initialize` export is a very small change.
|
|
|
|
Optimize DST field access
For
struct X<T: ?Sized>(T)
struct Y<T: ?Sized>(u8, T)
the offset of the unsized field is
0
mem::align_of_val(&self.1)
respectively. This patch changes the expression used to compute these
offsets so that the optimizer can perform this optimization.
Consider
```rust
fn f(x: &X<dyn Any>) -> &dyn Any {
&x.0
}
```
Before:
```asm
test:
movq %rsi, %rdx
movq 16(%rsi), %rax
leaq -1(%rax), %rcx
negq %rax
andq %rcx, %rax
addq %rdi, %rax
retq
```
After:
```asm
test:
movq %rsi, %rdx
movq %rdi, %rax
retq
```
|
|
|
|
|
|
|
|
Update and improve `rustc_codegen_{llvm,ssa}` docs
Fixes #75342.
These docs were very out of date and misleading. They even said that
they codegen'd the *AST*!
For some reason, the `rustc_codegen_ssa::base` docs were exactly
identical to the `rustc_codegen_llvm::base` docs. They didn't really
make sense, because they had LLVM-specific information even though
`rustc_codegen_ssa` is supposed to be somewhat generic. So I removed
them as they were misleading.
r? ``@pnkfelix`` maybe?
|
|
Rename kw::Invalid -> kw::Empty
See https://rust-lang.zulipchat.com/#narrow/stream/182449-t-compiler.2Fhelp/topic/Is.20there.20a.20symbol.20for.20the.20empty.20string.3F/near/220054471
for context.
r? `@petrochenkov`
|
|
See https://rust-lang.zulipchat.com/#narrow/stream/182449-t-compiler.2Fhelp/topic/Is.20there.20a.20symbol.20for.20the.20empty.20string.3F/near/220054471
for context.
|
|
where possible, pass slices instead of &Vec or &String (clippy::ptr_arg)
|
|
|
|
|
|
use matches!() macro in more places
|
|
Move some more code out of CodegenBackend::{codegen_crate,link}
Kind of a follow up to #77795
|
|
|
|
These docs were very out of date and misleading. They even said that
they codegen'd the *AST*!
For some reason, the `rustc_codegen_ssa::base` docs were exactly
identical to the `rustc_codegen_llvm::base` docs. They didn't really
make sense, because they had LLVM-specific information even though
`rustc_codegen_ssa` is supposed to be somewhat generic. So I removed
them as they were misleading.
|
|
`foreign_module` and `wasm_import_module` are not needed for linking,
and hence can be removed from CodegenResults.
|
|
|
|
|
|
For
struct X<T: ?Sized>(T)
struct Y<T: ?Sized>(u8, T)
the offset of the unsized field is
0
mem::align_of_val(&self.1)
respectively. This patch changes the expression used to compute these
offsets so that the optimizer can perform this optimization.
Consider
```rust
fn test(x: &X<dyn Any>) -> &dyn Any {
&x.0
}
```
Before:
```asm
test:
movq %rsi, %rdx
movq 16(%rsi), %rax
leaq -1(%rax), %rcx
negq %rax
andq %rcx, %rax
addq %rdi, %rax
retq
```
After:
```asm
test:
movq %rsi, %rdx
movq %rdi, %rax
retq
```
|
|
Always run intrinsics lowering pass
Move intrinsics lowering pass from the optimization phase (where it
would not run if -Zmir-opt-level=0), to the drop lowering phase where it
runs unconditionally.
The implementation of those intrinsics in code generation and
interpreter is unnecessary. Remove it.
|
|
|
|
This commit adds a Split DWARF compare mode to compiletest so that
debuginfo tests are also tested using Split DWARF in split mode (and
manually in single mode).
Signed-off-by: David Wood <david@davidtw.co>
|
|
This commit implements Split DWARF support, wiring up the flag (added in
earlier commits) to the modified FFI wrapper (also from earlier
commits).
Signed-off-by: David Wood <david@davidtw.co>
|