| Age | Commit message (Collapse) | Author | Lines |
|
|
|
|
|
|
|
Use assoc int consts3
Define module level int consts with associated constants instead of `min_value()` and `max_value()`. So the code become consistent with what the docs recommend etc. Seems natural.
Also remove the last usages of the int module constants from this repo (except src/test/ directory which I have still not really done anything in). Some places were missed in the previous PRs because the code uses `crate::<IntTy>` to reach the constants.
This is a continuation of #70857
r? @dtolnay
|
|
|
|
|
|
|
|
|
|
Constructing a Searcher in strip_prefix and strip_suffix is
unnecessarily slow when the pattern is a fixed-length string. Add
strip_prefix and strip_suffix methods to the Pattern trait, and add
optimized implementations of these methods in the str implementation.
The old implementation is retained as the default for these methods.
|
|
negatives fixed recently)
|
|
|
|
|
|
|
|
|
|
Improve code generated for `starts_with(<literal char>)`
This PR includes two minor improvements to the code generated when checking for string prefix/suffix.
The first commit simplifies the str/str operation, by taking advantage of the raw UTF-8 representation.
The second commit replaces the current str/char matching logic with a char->str encoding and then the previous method.
The resulting code should be equivalent in the generic case (one char is being encoded versus one char being decoded), but it becomes easy to optimize in the case of a literal char, which in most cases a developer might expect to be at least as simple as that of a literal string.
This PR should fix #41993
|
|
|
|
This enables constant folding when matching a literal char.
Fixes #41993.
|
|
The comparison can be performed on the raw bytes, as the chars can
only match if their UTF8 encoding matches.
This avoids the `is_char_boundary` checks and translates to a straight
`u8` slice comparison which is optimized to a memcmp or inline
comparison where appropriate.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Like #43008 (f668999), but _much more aggressive_.
|
|
|
|
This PR cuts down on a large number of `#[inline(always)]` and `#[inline]`
annotations in libcore for various core functions. The `#[inline(always)]`
annotation is almost never needed and is detrimental to debug build times as it
forces LLVM to perform inlining when it otherwise wouldn't need to in debug
builds. Additionally `#[inline]` is an unnecessary annoation on almost all
generic functions because the function will already be monomorphized into other
codegen units and otherwise rarely needs the extra "help" from us to tell LLVM
to inline something.
Overall this PR cut the compile time of a [microbenchmark][1] by 30% from 1s to
0.7s.
[1]: https://gist.github.com/alexcrichton/a7d70319a45aa60cf36a6a7bf540dd3a
|
|
|
|
|
|
|
|
|
|
This makes it clearer that we're not just looking for a lower bound but
rather know that the iterator is an `ExactSizeIterator`.
|
|
|
|
in order to allow a bit more felixibility in how to use it.
|
|
StrSearcher: Implement the complete reverse case for the two way algorithm
Fix quadratic behavior in StrSearcher in reverse search with periodic
needles.
This commit adds the missing pieces for the "short period" case in
reverse search. The short case will show up when the needle is literally
periodic, for example "abababab".
Two way uses a "critical factorization" of the needle: x = u v.
Searching matches v first, if mismatch at character k, skip k forward.
Matching u, if mismatch, skip period(x) forward.
To avoid O(mn) behavior after mismatch in u, memorize the already
matched prefix.
The short period case requires that |u| < period(x).
For the reverse search we need to compute a different critical
factorization x = u' v' where |v'| < period(x), because we are searching
for the reversed needle. A short v' also benefits the algorithm in
general.
The reverse critical factorization is computed quickly by using the same
maximal suffix algorithm, but terminating as soon as we have a location
with local period equal to period(x).
This adds extra fields crit_pos_back and memory_back for the reverse
case. The new overhead for TwoWaySearcher::new is low, and additionally
I think the "short period" case is uncommon in many applications of
string search.
The maximal_suffix methods were updated in documentation and the
algorithms updated to not use !0 and wrapping add, variable left is now
1 larger, offset 1 smaller.
Use periodicity when computing byteset: in the periodic case, just
iterate over one period instead of the whole needle.
Example before (rfind) after (twoway_rfind) benchmark shows the removal
of quadratic behavior.
needle: "ab" * 100, haystack: ("bb" + "ab" * 100) * 100
```
test periodic::rfind ... bench: 1,926,595 ns/iter (+/- 11,390) = 10 MB/s
test periodic::twoway_rfind ... bench: 51,740 ns/iter (+/- 66) = 386 MB/s
```
|