| Age | Commit message (Collapse) | Author | Lines |
|
This obviously doesn't work when cross-compiling from Linux.
Split out from: https://github.com/rust-lang/rust/pull/140772
|
|
Co-authored-by: Ookiineko <chiisaineko@protonmail.com>
|
|
This LLVM-C binding replaces the existing `LLVMRustInlineAsm` function.
|
|
llvm/llvm-project@f137c3d592e96330e450a8fd63ef7e8877fc1908
In LLVM 21 PR https://github.com/llvm/llvm-project/pull/130940
`TargetRegistry::createTargetMachine` was changed to take a `const
Triple&` and has deprecated the old `StringRef` method.
@rustbot label llvm-main
|
|
PassWrapper: adapt for llvm/llvm-project@d3d856ad8469
LLVM 21 moves to making it more explicit what this function call is doing, but nothing has changed behaviorally, so for now we just adjust to using the new name of the function.
`@rustbot` label llvm-main
|
|
add autodiff inline
closes: #138920
r? ```@ZuseZ4```
try-job: dist-aarch64-linux
|
|
LLVM 21 moves to making it more explicit what this function call is
doing, but nothing has changed behaviorally, so for now we just adjust
to using the new name of the function.
@rustbot label llvm-main
|
|
|
|
Add XtensaAsmPrinter
See https://github.com/rust-lang/rust/pull/133601. The PR was closed because it required LLVM 19 in CI added with (https://github.com/rust-lang/rust/commit/12167d7064597993355e41d3a8c20654bccaf0be)
|
|
|
|
|
|
|
|
|
|
Autodiff batching
Enzyme supports batching, which is especially known from the ML side when training neural networks.
There we would normally have a training loop, where in each iteration we would pass in some data (e.g. an image), and a target vector. Based on how close we are with our prediction we compute our loss, and then use backpropagation to compute the gradients and update our weights.
That's quite inefficient, so what you normally do is passing in a batch of 8/16/.. images and targets, and compute the gradients for those all at once, allowing better optimizations.
Enzyme supports batching in two ways, the first one (which I implemented here) just accepts a Batch size,
and then each Dual/Duplicated argument has not one, but N shadow arguments. So instead of
```rs
for i in 0..100 {
df(x[i], y[i], 1234);
}
```
You can now do
```rs
for i in 0..100.step_by(4) {
df(x[i+0],x[i+1],x[i+2],x[i+3], y[i+0], y[i+1], y[i+2], y[i+3], 1234);
}
```
which will give the same results, but allows better compiler optimizations. See the testcase for details.
There is a second variant, where we can mark certain arguments and instead of having to pass in N shadow arguments, Enzyme assumes that the argument is N times longer. I.e. instead of accepting 4 slices with 12 floats each, we would accept one slice with 48 floats. I'll implement this over the next days.
I will also add more tests for both modes.
For any one preferring some more interactive explanation, here's a video of Tim's llvm dev talk, where he presents his work. https://www.youtube.com/watch?v=edvaLAL5RqU
I'll also add some other docs to the dev guide and user docs in another PR.
r? ghost
Tracking:
- https://github.com/rust-lang/rust/issues/124509
- https://github.com/rust-lang/rust/issues/135283
|
|
|
|
llvm/llvm-project@94122d58fc77079a291a3d008914006cb509d9db
We also have to remove the LLVM argument in cast-target-abi.rs for LLVM
21. I'm not really sure what the best approach here is since that test
already uses revisions. We could also fork the test into a copy for LLVM
19-20 and another for LLVM 21, but what I did for now was drop the
lint-abort-on-error flag to LLVM figuring that some coverage was better
than none, but I'm happy to change this if that was a bad direction.
The above also applies for ffi-out-of-bounds-loads.rs.
r? dianqk
@rustbot label llvm-main
|
|
This is currently unused, but paves the way for future work on expansion
regions without having to worry about the FFI parts.
|
|
Promote ohos targets to tier2 with host tools.
### What does this PR try to resolve?
Try to promote the following [[Tier 2 without Host Tools](https://doc.rust-lang.org/rustc/platform-support.html#tier-2-without-host-tools)](https://doc.rust-lang.org/rustc/platform-support.html#tier-2-without-host-tools) targets to [[Tier 2 with Host Tools](https://doc.rust-lang.org/rustc/platform-support.html#tier-2-with-host-tools)](https://doc.rust-lang.org/rustc/platform-support.html#tier-2-with-host-tools):
- `aarch64-unknown-linux-ohos`
- `armv7-unknown-linux-ohos`
- `x86_64-unknown-linux-ohos`
### More Information?
see MCP: https://github.com/rust-lang/compiler-team/issues/811
### Blockage to be solved?
- [x] Submit an MCP
- [x] Submit code of promote ohos targets
- [x] Resolve related dependencies (`measureme`)
The modified code of the measureme has been merged (see https://github.com/rust-lang/measureme/pull/238). [done]
The new version will was released (https://github.com/rust-lang/measureme/pull/240). [done]
|
|
Adapt to LLVM dropping CfiFunctionIndex::begin()/end()
After https://github.com/llvm/llvm-project/pull/130382, RustWrapper needs to call CfiFunctionIndex::symbols() instead.
|
|
After https://github.com/llvm/llvm-project/pull/130382, RustWrapper
needs to call CfiFunctionIndex::symbols() instead.
|
|
It's no longer necessary now that `-Wunreachable_pub` is being passed.
|
|
Revert <https://github.com/rust-lang/rust/pull/138084> to buy time to
consider options that avoids breaking downstream usages of cargo on
distributed `rustc-src` artifacts, where such cargo invocations fail due
to inability to inherit `lints` from workspace root manifest's
`workspace.lints` (this is only valid for the source rust-lang/rust
workspace, but not really the distributed `rustc-src` artifacts).
This breakage was reported in
<https://github.com/rust-lang/rust/issues/138304>.
This reverts commit 48caf81484b50dca5a5cebb614899a3df81ca898, reversing
changes made to c6662879b27f5161e95f39395e3c9513a7b97028.
|
|
Use workspace lints for crates in `compiler/`
This is nicer and hopefully less error prone than specifying lints via bootstrap.
r? ``@jieyouxu``
|
|
setTargetTriple now accepts Triple rather than string
https://github.com/llvm/llvm-project/pull/129868 updated `setTargetTriple`
|
|
(Except for `rustc_codegen_cranelift`.)
It's no longer necessary now that `unreachable_pub` is in the workspace
lints.
|
|
By naming them in `[workspace.lints.rust]` in the top-level
`Cargo.toml`, and then making all `compiler/` crates inherit them with
`[lints] workspace = true`. (I omitted `rustc_codegen_{cranelift,gcc}`,
because they're a bit different.)
The advantages of this over the current approach:
- It uses a standard Cargo feature, rather than special handling in
bootstrap. So, easier to understand, and less likely to get
accidentally broken in the future.
- It works for proc macro crates.
It's a shame it doesn't work for rustc-specific lints, as the comments
explain.
|
|
|
|
|
|
|
|
|
|
The embedded bitcode should always be prepared for LTO/ThinLTO
Fixes #115344. Fixes #117220.
There are currently two methods for generating bitcode that used for LTO. One method involves using `-C linker-plugin-lto` to emit object files as bitcode, which is the typical setting used by cargo. The other method is through `-C embed-bitcode=yes`.
When using with `-C embed-bitcode=yes -C lto=no`, we run a complete non-LTO LLVM pipeline to obtain bitcode, then the bitcode is used for LTO. We run the Call Graph Profile Pass twice on the same module.
This PR is doing something similar to LLVM's `buildFatLTODefaultPipeline`, obtaining the bitcode for embedding after running `buildThinLTOPreLinkDefaultPipeline`.
r? nikic
|
|
See <https://github.com/rust-lang/rust/issues/137733>.
|
|
Emit getelementptr inbounds nuw for pointer::add()
Lower pointer::add (via intrinsic::offset with unsigned offset) to getelementptr inbounds nuw on LLVM versions that support it. This lets LLVM make use of the pre-condition that the offset addition does not wrap in an unsigned sense. Together with inbounds, this also implies that the offset is non-negative.
Fixes https://github.com/rust-lang/rust/issues/137217.
|
|
|
|
Rollup of 9 pull requests
Successful merges:
- #136910 (Implement feature `isolate_most_least_significant_one` for integer types)
- #137183 (Prune dead regionck code)
- #137333 (Use `edition = "2024"` in the compiler (redux))
- #137356 (Ferris 🦀 Identifier naming conventions)
- #137362 (Add build step log for `run-make-support`)
- #137377 (Always allow reusing cratenum in CrateLoader::load)
- #137388 (Fix(lib/fs/tests): Disable rename POSIX semantics FS tests under Windows 7)
- #137410 (Use StableHasher + Hash64 for dep_tracking_hash)
- #137413 (jubilee cleared out the review queue)
r? `@ghost`
`@rustbot` modify labels: rollup
|
|
|
|
|
|
|
|
This API allows us to set the nuw flag as well.
|
|
The formatting of the command line arguments has been moved to the
frontend in:
https://github.com/llvm/llvm-project/commit/e190d074a0a77c9f8a7d7938a8187a7e2076e290
However, the Rust logic introduced in
https://github.com/rust-lang/rust/commit/ad0ecebf432fcb80cb666034ea44f75b81e55f95
did not replicate the previous argument quoting behavior.
|
|
adding autodiff tests
I'd like to get started with upstreaming some tests, even though I'm still waiting for an answer on how to best integrate the enzyme pass. Can we therefore temporarily support the -Z llvm-plugins here without too much effort? And in that case, how would that work? I saw you can do remapping, e.g. `rust-src-base`, but I don't think that will give me the path to libEnzyme.so. Do you have another suggestion?
Other than that this test simply checks that the derivative of `x*x` is `2.0 * x`, which in this case is computed as
`%0 = fadd fast double %x.0.val, %x.0.val`
(I'll add a few more tests and move it to an autodiff folder if we can use the -Z flag)
r? ``@jieyouxu``
Locally at least `-Zllvm-plugins=${PWD}/build/x86_64-unknown-linux-gnu/enzyme/build/Enzyme/libEnzyme-19.so` seems to work if I copy the command I get from x.py test and run it manually. However, running x.py test itself fails.
Tracking:
- https://github.com/rust-lang/rust/issues/124509
Zulip discussion: https://rust-lang.zulipchat.com/#narrow/channel/326414-t-infra.2Fbootstrap/topic/Enzyme.20build.20changes
|
|
Bump `cc` to v1.2.13 for the compiler workspace
|
|
|
|
|
|
|
|
|
|
|
|
The LLVM-C binding takes an explicit context, whereas our binding obtained the
context from the scope argument.
|
|
|
|
|