about summary refs log tree commit diff
path: root/compiler/rustc_mir_transform/src/dest_prop.rs
blob: 1f38433fa5a4158e1e6f9dcac3b50e25c874b100 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
//! Propagates assignment destinations backwards in the CFG to eliminate redundant assignments.
//!
//! # Motivation
//!
//! MIR building can insert a lot of redundant copies, and Rust code in general often tends to move
//! values around a lot. The result is a lot of assignments of the form `dest = {move} src;` in MIR.
//! MIR building for constants in particular tends to create additional locals that are only used
//! inside a single block to shuffle a value around unnecessarily.
//!
//! LLVM by itself is not good enough at eliminating these redundant copies (eg. see
//! <https://github.com/rust-lang/rust/issues/32966>), so this leaves some performance on the table
//! that we can regain by implementing an optimization for removing these assign statements in rustc
//! itself. When this optimization runs fast enough, it can also speed up the constant evaluation
//! and code generation phases of rustc due to the reduced number of statements and locals.
//!
//! # The Optimization
//!
//! Conceptually, this optimization is "destination propagation". It is similar to the Named Return
//! Value Optimization, or NRVO, known from the C++ world, except that it isn't limited to return
//! values or the return place `_0`. On a very high level, independent of the actual implementation
//! details, it does the following:
//!
//! 1) Identify `dest = src;` statements with values for `dest` and `src` whose storage can soundly
//!    be merged.
//! 2) Replace all mentions of `src` with `dest` ("unifying" them and propagating the destination
//!    backwards).
//! 3) Delete the `dest = src;` statement (by making it a `nop`).
//!
//! Step 1) is by far the hardest, so it is explained in more detail below.
//!
//! ## Soundness
//!
//! We have a pair of places `p` and `q`, whose memory we would like to merge. In order for this to
//! be sound, we need to check a number of conditions:
//!
//! * `p` and `q` must both be *constant* - it does not make much sense to talk about merging them
//!   if they do not consistently refer to the same place in memory. This is satisfied if they do
//!   not contain any indirection through a pointer or any indexing projections.
//!
//! * `p` and `q` must have the **same type**. If we replace a local with a subtype or supertype,
//!   we may end up with a different vtable for that local. See the `subtyping-impacts-selection`
//!   tests for an example where that causes issues.
//!
//! * We need to make sure that the goal of "merging the memory" is actually structurally possible
//!   in MIR. For example, even if all the other conditions are satisfied, there is no way to
//!   "merge" `_5.foo` and `_6.bar`. For now, we ensure this by requiring that both `p` and `q` are
//!   locals with no further projections. Future iterations of this pass should improve on this.
//!
//! * Finally, we want `p` and `q` to use the same memory - however, we still need to make sure that
//!   each of them has enough "ownership" of that memory to continue "doing its job." More
//!   precisely, what we will check is that whenever the program performs a write to `p`, then it
//!   does not currently care about what the value in `q` is (and vice versa). We formalize the
//!   notion of "does not care what the value in `q` is" by checking the *liveness* of `q`.
//!
//!   Because of the difficulty of computing liveness of places that have their address taken, we do
//!   not even attempt to do it. Any places that are in a local that has its address taken is
//!   excluded from the optimization.
//!
//! The first two conditions are simple structural requirements on the `Assign` statements that can
//! be trivially checked. The third requirement however is more difficult and costly to check.
//!
//! ## Current implementation
//!
//! The current implementation relies on live range computation to check for conflicts. We only
//! allow to merge locals that have disjoint live ranges. The live range are defined with
//! half-statement granularity, so as to make all writes be live for at least a half statement.
//!
//! ## Future Improvements
//!
//! There are a number of ways in which this pass could be improved in the future:
//!
//! * Merging storage liveness ranges instead of removing storage statements completely. This may
//!   improve stack usage.
//!
//! * Allow merging locals into places with projections, eg `_5` into `_6.foo`.
//!
//! * Liveness analysis with more precision than whole locals at a time. The smaller benefit of this
//!   is that it would allow us to dest prop at "sub-local" levels in some cases. The bigger benefit
//!   of this is that such liveness analysis can report more accurate results about whole locals at
//!   a time. For example, consider:
//!
//!   ```ignore (syntax-highlighting-only)
//!   _1 = u;
//!   // unrelated code
//!   _1.f1 = v;
//!   _2 = _1.f1;
//!   ```
//!
//!   Because the current analysis only thinks in terms of locals, it does not have enough
//!   information to report that `_1` is dead in the "unrelated code" section.
//!
//! * Liveness analysis enabled by alias analysis. This would allow us to not just bail on locals
//!   that ever have their address taken. Of course that requires actually having alias analysis
//!   (and a model to build it on), so this might be a bit of a ways off.
//!
//! * Various perf improvements. There are a bunch of comments in here marked `PERF` with ideas for
//!   how to do things more efficiently. However, the complexity of the pass as a whole should be
//!   kept in mind.
//!
//! ## Previous Work
//!
//! A [previous attempt][attempt 1] at implementing an optimization like this turned out to be a
//! significant regression in compiler performance. Fixing the regressions introduced a lot of
//! undesirable complexity to the implementation.
//!
//! A [subsequent approach][attempt 2] tried to avoid the costly computation by limiting itself to
//! acyclic CFGs, but still turned out to be far too costly to run due to suboptimal performance
//! within individual basic blocks, requiring a walk across the entire block for every assignment
//! found within the block. For the `tuple-stress` benchmark, which has 458745 statements in a
//! single block, this proved to be far too costly.
//!
//! [Another approach after that][attempt 3] was much closer to correct, but had some soundness
//! issues - it was failing to consider stores outside live ranges, and failed to uphold some of the
//! requirements that MIR has for non-overlapping places within statements. However, it also had
//! performance issues caused by `O(l² * s)` runtime, where `l` is the number of locals and `s` is
//! the number of statements and terminators.
//!
//! Since the first attempt at this, the compiler has improved dramatically, and new analysis
//! frameworks have been added that should make this approach viable without requiring a limited
//! approach that only works for some classes of CFGs:
//! - rustc now has a powerful dataflow analysis framework that can handle forwards and backwards
//!   analyses efficiently.
//! - Layout optimizations for coroutines have been added to improve code generation for
//!   async/await, which are very similar in spirit to what this optimization does.
//!
//! [The next approach][attempt 4] computes a conflict matrix between locals by forbidding merging
//! locals with competing writes or with one write while the other is live.
//!
//! ## Pre/Post Optimization
//!
//! It is recommended to run `SimplifyCfg` and then `SimplifyLocals` some time after this pass, as
//! it replaces the eliminated assign statements with `nop`s and leaves unused locals behind.
//!
//! [liveness]: https://en.wikipedia.org/wiki/Live_variable_analysis
//! [attempt 1]: https://github.com/rust-lang/rust/pull/47954
//! [attempt 2]: https://github.com/rust-lang/rust/pull/71003
//! [attempt 3]: https://github.com/rust-lang/rust/pull/72632
//! [attempt 4]: https://github.com/rust-lang/rust/pull/96451

use rustc_data_structures::union_find::UnionFind;
use rustc_index::bit_set::DenseBitSet;
use rustc_index::interval::SparseIntervalMatrix;
use rustc_index::{IndexVec, newtype_index};
use rustc_middle::mir::visit::{MutVisitor, PlaceContext, VisitPlacesWith, Visitor};
use rustc_middle::mir::*;
use rustc_middle::ty::TyCtxt;
use rustc_mir_dataflow::impls::{DefUse, MaybeLiveLocals};
use rustc_mir_dataflow::points::DenseLocationMap;
use rustc_mir_dataflow::{Analysis, Results};
use tracing::{debug, trace};

pub(super) struct DestinationPropagation;

impl<'tcx> crate::MirPass<'tcx> for DestinationPropagation {
    fn is_enabled(&self, sess: &rustc_session::Session) -> bool {
        sess.mir_opt_level() >= 2
    }

    #[tracing::instrument(level = "trace", skip(self, tcx, body))]
    fn run_pass(&self, tcx: TyCtxt<'tcx>, body: &mut Body<'tcx>) {
        let def_id = body.source.def_id();
        trace!(?def_id);

        let borrowed = rustc_mir_dataflow::impls::borrowed_locals(body);

        let candidates = Candidates::find(body, &borrowed);
        trace!(?candidates);
        if candidates.c.is_empty() {
            return;
        }

        let live = MaybeLiveLocals.iterate_to_fixpoint(tcx, body, Some("MaybeLiveLocals-DestProp"));

        let points = DenseLocationMap::new(body);
        let mut relevant = RelevantLocals::compute(&candidates, body.local_decls.len());
        let mut live = save_as_intervals(&points, body, &relevant, live.results);

        dest_prop_mir_dump(tcx, body, &points, &live, &relevant);

        let mut merged_locals = DenseBitSet::new_empty(body.local_decls.len());

        for (src, dst) in candidates.c.into_iter() {
            trace!(?src, ?dst);

            let Some(mut src) = relevant.find(src) else { continue };
            let Some(mut dst) = relevant.find(dst) else { continue };
            if src == dst {
                continue;
            }

            let Some(src_live_ranges) = live.row(src) else { continue };
            let Some(dst_live_ranges) = live.row(dst) else { continue };
            trace!(?src, ?src_live_ranges);
            trace!(?dst, ?dst_live_ranges);

            if src_live_ranges.disjoint(dst_live_ranges) {
                // We want to replace `src` by `dst`.
                let mut orig_src = relevant.original[src];
                let mut orig_dst = relevant.original[dst];

                // The return place and function arguments are required and cannot be renamed.
                // This check cannot be made during candidate collection, as we may want to
                // unify the same non-required local with several required locals.
                match (is_local_required(orig_src, body), is_local_required(orig_dst, body)) {
                    // Renaming `src` is ok.
                    (false, _) => {}
                    // Renaming `src` is wrong, but renaming `dst` is ok.
                    (true, false) => {
                        std::mem::swap(&mut src, &mut dst);
                        std::mem::swap(&mut orig_src, &mut orig_dst);
                    }
                    // Neither local can be renamed, so skip this case.
                    (true, true) => continue,
                }

                trace!(?src, ?dst, "merge");
                merged_locals.insert(orig_src);
                merged_locals.insert(orig_dst);

                // Replace `src` by `dst`.
                let head = relevant.union(src, dst);
                live.union_rows(/* read */ src, /* write */ head);
                live.union_rows(/* read */ dst, /* write */ head);
            }
        }
        trace!(?merged_locals);
        trace!(?relevant.renames);

        if merged_locals.is_empty() {
            return;
        }

        apply_merges(body, tcx, relevant, merged_locals);
    }

    fn is_required(&self) -> bool {
        false
    }
}

//////////////////////////////////////////////////////////
// Merging
//
// Applies the actual optimization

fn apply_merges<'tcx>(
    body: &mut Body<'tcx>,
    tcx: TyCtxt<'tcx>,
    relevant: RelevantLocals,
    merged_locals: DenseBitSet<Local>,
) {
    let mut merger = Merger { tcx, relevant, merged_locals };
    merger.visit_body_preserves_cfg(body);
}

struct Merger<'tcx> {
    tcx: TyCtxt<'tcx>,
    relevant: RelevantLocals,
    merged_locals: DenseBitSet<Local>,
}

impl<'tcx> MutVisitor<'tcx> for Merger<'tcx> {
    fn tcx(&self) -> TyCtxt<'tcx> {
        self.tcx
    }

    fn visit_local(&mut self, local: &mut Local, _: PlaceContext, _location: Location) {
        if let Some(relevant) = self.relevant.find(*local) {
            *local = self.relevant.original[relevant];
        }
    }

    fn visit_statement(&mut self, statement: &mut Statement<'tcx>, location: Location) {
        match &statement.kind {
            // FIXME: Don't delete storage statements, but "merge" the storage ranges instead.
            StatementKind::StorageDead(local) | StatementKind::StorageLive(local)
                if self.merged_locals.contains(*local) =>
            {
                statement.make_nop(true);
                return;
            }
            _ => (),
        };
        self.super_statement(statement, location);
        match &statement.kind {
            StatementKind::Assign(box (dest, rvalue)) => {
                match rvalue {
                    Rvalue::CopyForDeref(place)
                    | Rvalue::Use(Operand::Copy(place) | Operand::Move(place)) => {
                        // These might've been turned into self-assignments by the replacement
                        // (this includes the original statement we wanted to eliminate).
                        if dest == place {
                            debug!("{:?} turned into self-assignment, deleting", location);
                            statement.make_nop(true);
                        }
                    }
                    _ => {}
                }
            }

            _ => {}
        }
    }
}

//////////////////////////////////////////////////////////
// Relevant locals
//
// Small utility to reduce size of the conflict matrix by only considering locals that appear in
// the candidates

newtype_index! {
    /// Represent a subset of locals which appear in candidates.
    struct RelevantLocal {}
}

#[derive(Debug)]
struct RelevantLocals {
    original: IndexVec<RelevantLocal, Local>,
    shrink: IndexVec<Local, Option<RelevantLocal>>,
    renames: UnionFind<RelevantLocal>,
}

impl RelevantLocals {
    #[tracing::instrument(level = "trace", skip(candidates, num_locals), ret)]
    fn compute(candidates: &Candidates, num_locals: usize) -> RelevantLocals {
        let mut original = IndexVec::with_capacity(candidates.c.len());
        let mut shrink = IndexVec::from_elem_n(None, num_locals);

        // Mark a local as relevant and record it into the maps.
        let mut declare = |local| {
            shrink.get_or_insert_with(local, || original.push(local));
        };

        for &(src, dest) in candidates.c.iter() {
            declare(src);
            declare(dest)
        }

        let renames = UnionFind::new(original.len());
        RelevantLocals { original, shrink, renames }
    }

    fn find(&mut self, src: Local) -> Option<RelevantLocal> {
        let src = self.shrink[src]?;
        let src = self.renames.find(src);
        Some(src)
    }

    fn union(&mut self, lhs: RelevantLocal, rhs: RelevantLocal) -> RelevantLocal {
        let head = self.renames.unify(lhs, rhs);
        // We need to ensure we keep the original local of the RHS, as it may be a required local.
        self.original[head] = self.original[rhs];
        head
    }
}

/////////////////////////////////////////////////////
// Candidate accumulation

#[derive(Debug, Default)]
struct Candidates {
    /// The set of candidates we are considering in this optimization.
    ///
    /// Whether a place ends up in the key or the value does not correspond to whether it appears as
    /// the lhs or rhs of any assignment. As a matter of fact, the places in here might never appear
    /// in an assignment at all. This happens because if we see an assignment like this:
    ///
    /// ```ignore (syntax-highlighting-only)
    /// _1.0 = _2.0
    /// ```
    ///
    /// We will still report that we would like to merge `_1` and `_2` in an attempt to allow us to
    /// remove that assignment.
    c: Vec<(Local, Local)>,
}

// We first implement some utility functions which we will expose removing candidates according to
// different needs. Throughout the liveness filtering, the `candidates` are only ever accessed
// through these methods, and not directly.
impl Candidates {
    /// Collects the candidates for merging.
    ///
    /// This is responsible for enforcing the first and third bullet point.
    fn find(body: &Body<'_>, borrowed: &DenseBitSet<Local>) -> Candidates {
        let mut visitor = FindAssignments { body, candidates: Default::default(), borrowed };
        visitor.visit_body(body);

        Candidates { c: visitor.candidates }
    }
}

struct FindAssignments<'a, 'tcx> {
    body: &'a Body<'tcx>,
    candidates: Vec<(Local, Local)>,
    borrowed: &'a DenseBitSet<Local>,
}

impl<'tcx> Visitor<'tcx> for FindAssignments<'_, 'tcx> {
    fn visit_statement(&mut self, statement: &Statement<'tcx>, _: Location) {
        if let StatementKind::Assign(box (
            lhs,
            Rvalue::CopyForDeref(rhs) | Rvalue::Use(Operand::Copy(rhs) | Operand::Move(rhs)),
        )) = &statement.kind
            && let Some(src) = lhs.as_local()
            && let Some(dest) = rhs.as_local()
        {
            // As described at the top of the file, we do not go near things that have
            // their address taken.
            if self.borrowed.contains(src) || self.borrowed.contains(dest) {
                return;
            }

            // As described at the top of this file, we do not touch locals which have
            // different types.
            let src_ty = self.body.local_decls()[src].ty;
            let dest_ty = self.body.local_decls()[dest].ty;
            if src_ty != dest_ty {
                // FIXME(#112651): This can be removed afterwards. Also update the module description.
                trace!("skipped `{src:?} = {dest:?}` due to subtyping: {src_ty} != {dest_ty}");
                return;
            }

            // We may insert duplicates here, but that's fine
            self.candidates.push((src, dest));
        }
    }
}

/// Some locals are part of the function's interface and can not be removed.
///
/// Note that these locals *can* still be merged with non-required locals by removing that other
/// local.
fn is_local_required(local: Local, body: &Body<'_>) -> bool {
    match body.local_kind(local) {
        LocalKind::Arg | LocalKind::ReturnPointer => true,
        LocalKind::Temp => false,
    }
}

/////////////////////////////////////////////////////////
// MIR Dump

fn dest_prop_mir_dump<'tcx>(
    tcx: TyCtxt<'tcx>,
    body: &Body<'tcx>,
    points: &DenseLocationMap,
    live: &SparseIntervalMatrix<RelevantLocal, TwoStepIndex>,
    relevant: &RelevantLocals,
) {
    let locals_live_at = |location| {
        live.rows()
            .filter(|&r| live.contains(r, location))
            .map(|rl| relevant.original[rl])
            .collect::<Vec<_>>()
    };

    if let Some(dumper) = MirDumper::new(tcx, "DestinationPropagation-dataflow", body) {
        let extra_data = &|pass_where, w: &mut dyn std::io::Write| {
            if let PassWhere::BeforeLocation(loc) = pass_where {
                let location = TwoStepIndex::new(points, loc, Effect::Before);
                let live = locals_live_at(location);
                writeln!(w, "        // before: {:?} => {:?}", location, live)?;
            }
            if let PassWhere::AfterLocation(loc) = pass_where {
                let location = TwoStepIndex::new(points, loc, Effect::After);
                let live = locals_live_at(location);
                writeln!(w, "        // after: {:?} => {:?}", location, live)?;
            }
            Ok(())
        };

        dumper.set_extra_data(extra_data).dump_mir(body)
    }
}

#[derive(Copy, Clone, Debug)]
enum Effect {
    Before,
    After,
}

rustc_index::newtype_index! {
    /// A reversed `PointIndex` but with the lower bit encoding early/late inside the statement.
    /// The reversed order allows to use the more efficient `IntervalSet::append` method while we
    /// iterate on the statements in reverse order.
    #[orderable]
    #[debug_format = "TwoStepIndex({})"]
    struct TwoStepIndex {}
}

impl TwoStepIndex {
    fn new(elements: &DenseLocationMap, location: Location, effect: Effect) -> TwoStepIndex {
        let point = elements.point_from_location(location);
        let effect = match effect {
            Effect::Before => 0,
            Effect::After => 1,
        };
        let max_index = 2 * elements.num_points() as u32 - 1;
        let index = 2 * point.as_u32() + (effect as u32);
        // Reverse the indexing to use more efficient `IntervalSet::append`.
        TwoStepIndex::from_u32(max_index - index)
    }
}

/// Add points depending on the result of the given dataflow analysis.
fn save_as_intervals<'tcx>(
    elements: &DenseLocationMap,
    body: &Body<'tcx>,
    relevant: &RelevantLocals,
    results: Results<DenseBitSet<Local>>,
) -> SparseIntervalMatrix<RelevantLocal, TwoStepIndex> {
    let mut values = SparseIntervalMatrix::new(2 * elements.num_points());
    let mut state = MaybeLiveLocals.bottom_value(body);
    let reachable_blocks = traversal::reachable_as_bitset(body);

    let two_step_loc = |location, effect| TwoStepIndex::new(elements, location, effect);
    let append_at =
        |values: &mut SparseIntervalMatrix<_, _>, state: &DenseBitSet<Local>, twostep| {
            for (relevant, &original) in relevant.original.iter_enumerated() {
                if state.contains(original) {
                    values.append(relevant, twostep);
                }
            }
        };

    // Iterate blocks in decreasing order, to visit locations in decreasing order. This
    // allows to use the more efficient `append` method to interval sets.
    for block in body.basic_blocks.indices().rev() {
        if !reachable_blocks.contains(block) {
            continue;
        }

        state.clone_from(&results[block]);

        let block_data = &body.basic_blocks[block];
        let loc = Location { block, statement_index: block_data.statements.len() };

        let term = block_data.terminator();
        let mut twostep = two_step_loc(loc, Effect::After);
        append_at(&mut values, &state, twostep);
        // Ensure we have a non-zero live range even for dead stores. This is done by marking all
        // the written-to locals as live in the second half of the statement.
        // We also ensure that operands read by terminators conflict with writes by that terminator.
        // For instance a function call may read args after having written to the destination.
        VisitPlacesWith(|place: Place<'tcx>, ctxt| {
            if let Some(relevant) = relevant.shrink[place.local] {
                match DefUse::for_place(place, ctxt) {
                    DefUse::Def | DefUse::Use | DefUse::PartialWrite => {
                        values.insert(relevant, twostep);
                    }
                    DefUse::NonUse => {}
                }
            }
        })
        .visit_terminator(term, loc);

        twostep = TwoStepIndex::from_u32(twostep.as_u32() + 1);
        debug_assert_eq!(twostep, two_step_loc(loc, Effect::Before));
        MaybeLiveLocals.apply_early_terminator_effect(&mut state, term, loc);
        MaybeLiveLocals.apply_primary_terminator_effect(&mut state, term, loc);
        append_at(&mut values, &state, twostep);

        for (statement_index, stmt) in block_data.statements.iter().enumerate().rev() {
            let loc = Location { block, statement_index };
            twostep = TwoStepIndex::from_u32(twostep.as_u32() + 1);
            debug_assert_eq!(twostep, two_step_loc(loc, Effect::After));
            append_at(&mut values, &state, twostep);
            // Like terminators, ensure we have a non-zero live range even for dead stores.
            // Some rvalues interleave reads and writes, for instance `Rvalue::Aggregate`, see
            // https://github.com/rust-lang/rust/issues/146383. By precaution, treat statements
            // as behaving so by default.
            // We make an exception for simple assignments `_a.stuff = {copy|move} _b.stuff`,
            // as marking `_b` live here would prevent unification.
            let is_simple_assignment = match stmt.kind {
                StatementKind::Assign(box (
                    lhs,
                    Rvalue::CopyForDeref(rhs)
                    | Rvalue::Use(Operand::Copy(rhs) | Operand::Move(rhs)),
                )) => lhs.projection == rhs.projection,
                _ => false,
            };
            VisitPlacesWith(|place: Place<'tcx>, ctxt| {
                if let Some(relevant) = relevant.shrink[place.local] {
                    match DefUse::for_place(place, ctxt) {
                        DefUse::Def | DefUse::PartialWrite => {
                            values.insert(relevant, twostep);
                        }
                        DefUse::Use if !is_simple_assignment => {
                            values.insert(relevant, twostep);
                        }
                        DefUse::Use | DefUse::NonUse => {}
                    }
                }
            })
            .visit_statement(stmt, loc);

            twostep = TwoStepIndex::from_u32(twostep.as_u32() + 1);
            debug_assert_eq!(twostep, two_step_loc(loc, Effect::Before));
            MaybeLiveLocals.apply_early_statement_effect(&mut state, stmt, loc);
            MaybeLiveLocals.apply_primary_statement_effect(&mut state, stmt, loc);
            // ... but reads from operands are marked as live here so they do not conflict with
            // the all the writes we manually marked as live in the second half of the statement.
            append_at(&mut values, &state, twostep);
        }
    }

    values
}