diff options
| author | Matthias Krüger <476013+matthiaskrgr@users.noreply.github.com> | 2025-06-26 15:47:17 +0200 |
|---|---|---|
| committer | GitHub <noreply@github.com> | 2025-06-26 15:47:17 +0200 |
| commit | 158340f5617f7e6a555046802133536ab61764ba (patch) | |
| tree | dff498ed9af7d3ed9884615633c91cd423d8a5c2 /tests/codegen/src-hash-algorithm/src-hash-algorithm-md5.rs | |
| parent | d5d5eb471e7ff3a2876569f229703274f70cbb49 (diff) | |
| parent | 1dfc8406dcb742453b36daf0ce7486183b1da79c (diff) | |
| download | rust-158340f5617f7e6a555046802133536ab61764ba.tar.gz rust-158340f5617f7e6a555046802133536ab61764ba.zip | |
Rollup merge of #141311 - folkertdev:tidy-natural-sort, r=jieyouxu
make `tidy-alphabetical` use a natural sort
The idea here is that these lines should be correctly sorted, even though a naive string comparison would say they are not:
```
foo2
foo10
```
This is the ["natural sort order"](https://en.wikipedia.org/wiki/Natural_sort_order).
There is more discussion in [#t-compiler/help > tidy natural sort](https://rust-lang.zulipchat.com/#narrow/channel/182449-t-compiler.2Fhelp/topic/tidy.20natural.20sort/with/519111079)
Unfortunately, no standard sorting tools are smart enough to to this automatically (casting some doubt on whether we should make this change). Here are some sort outputs:
```
> cat foo.txt | sort
foo
foo1
foo10
foo2
mp
mp1e2
np",
np1e2",
> cat foo.txt | sort -n
foo
foo1
foo10
foo2
mp
mp1e2
np",
np1e2",
> cat foo.txt | sort -V
foo
foo1
foo2
foo10
mp
mp1e2
np1e2",
np",
```
Disappointingly, "numeric" sort does not actually have the behavior we want. It only sorts by numeric value if the line starts with a number. The "version" sort looks promising, but does something very unintuitive if you look at the final 4 values. None of the other options seem to have the desired behavior in all cases:
```
-b, --ignore-leading-blanks ignore leading blanks
-d, --dictionary-order consider only blanks and alphanumeric characters
-f, --ignore-case fold lower case to upper case characters
-g, --general-numeric-sort compare according to general numerical value
-i, --ignore-nonprinting consider only printable characters
-M, --month-sort compare (unknown) < 'JAN' < ... < 'DEC'
-h, --human-numeric-sort compare human readable numbers (e.g., 2K 1G)
-n, --numeric-sort compare according to string numerical value
-R, --random-sort shuffle, but group identical keys. See shuf(1)
--random-source=FILE get random bytes from FILE
-r, --reverse reverse the result of comparisons
--sort=WORD sort according to WORD:
general-numeric -g, human-numeric -h, month -M,
numeric -n, random -R, version -V
-V, --version-sort natural sort of (version) numbers within text
```
r? ```@Noratrieb``` (it sounded like you know this code?)
Diffstat (limited to 'tests/codegen/src-hash-algorithm/src-hash-algorithm-md5.rs')
0 files changed, 0 insertions, 0 deletions
