Auto merge of #60261 - matklad:one-escape, r=petrochenkov - rust


author	bors <bors@rust-lang.org>	2019-05-06 00:16:16 +0000
committer	bors <bors@rust-lang.org>	2019-05-06 00:16:16 +0000
commit	46d0ca00ad60710cd3c46398b7a6ea080a9447ed (patch)
tree	669941c7808326bac651d6f42f9eb4cddc90b56c /src/libsyntax/ext
parent	40bd145cbef28d279aacc903907c3945f83a6296 (diff)
parent	1835cbeb6574997ec5188cb22b9538c61976d2b4 (diff)

Auto merge of #60261 - matklad:one-escape, r=petrochenkov

introduce unescape module

A WIP PR to gauge early feedback

Currently, we deal with escape sequences twice: once when we [lex](https://github.com/rust-lang/rust/blob/112f7e9ac564e2cfcfc13d599c8376a219fde1bc/src/libsyntax/parse/lexer/mod.rs#L928-L1065) a string, and a second time when we [unescape](https://github.com/rust-lang/rust/blob/112f7e9ac564e2cfcfc13d599c8376a219fde1bc/src/libsyntax/parse/mod.rs#L313-L366) literals. Note that we also produce different sets of diagnostics in these two cases.

This PR aims to remove this duplication, by introducing a new `unescape` module as a single source of truth for character escaping rules.

I think this would be a useful cleanup by itself, but I also need this for https://github.com/rust-lang/rust/pull/59706.

In its current state, the PR has an `unescape` module which fully (modulo bugs) deals with string and char literals. I am quite happy with the state of this module.

What this PR doesn't have yet:
* [x] handling of byte and byte string literals (should be simple to add)
* [x] good diagnostics
* [x] actual removal of code from lexer (giant `scan_char_or_byte` should go away completely)
* [x] performance check
* [x] general cleanup of the new code

Diagnostics will be the most labor-intensive bit here, but they are mostly a question of correctly adjusting spans to sub-tokens. The current setup for diagnostics is that `unescape` produces a plain old `enum` describing the various problems, and these are rendered into `Handler` separately. This split is not actually required (it is possible to just pass the `Handler` in), but I like the separation between diagnostics and logic that this approach imposes, and such separation should again be useful for #59706.
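To make the split concrete, here is a minimal sketch of the idea, not the code in this PR. The names `EscapeError`, `unescape_str`, `unescape_collect` and `render` are assumptions made for illustration, only a handful of escapes are handled, and a plain `String` stands in for whatever would be emitted through `Handler`. The unescaping function only reports byte ranges plus a `Result`, and a separate function turns an error value into a message.

```rust
use std::ops::Range;

/// Hypothetical error set for this sketch; the real module's variants may differ.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum EscapeError {
    /// A backslash at the very end of the literal body.
    LoneSlash,
    /// A backslash followed by a character that is not a known escape.
    InvalidEscape,
}

/// Walks the body of a string literal (quotes already stripped) and reports
/// every character, or every problem, together with its byte range in `src`.
/// No diagnostics are produced here: the caller decides what to do with errors.
fn unescape_str<F>(src: &str, mut callback: F)
where
    F: FnMut(Range<usize>, Result<char, EscapeError>),
{
    let mut chars = src.char_indices();
    while let Some((start, c)) = chars.next() {
        let res = if c != '\\' {
            Ok(c)
        } else {
            // Only a few simple escapes are handled in this sketch.
            match chars.next() {
                None => Err(EscapeError::LoneSlash),
                Some((_, 'n')) => Ok('\n'),
                Some((_, 't')) => Ok('\t'),
                Some((_, '\\')) => Ok('\\'),
                Some((_, '"')) => Ok('"'),
                Some(_) => Err(EscapeError::InvalidEscape),
            }
        };
        // Everything consumed so far ends where the unread remainder begins.
        let end = src.len() - chars.as_str().len();
        callback(start..end, res);
    }
}

/// Collects the unescaped text and the list of problems with their ranges.
fn unescape_collect(src: &str) -> (String, Vec<(Range<usize>, EscapeError)>) {
    let mut text = String::new();
    let mut errors = Vec::new();
    unescape_str(src, |range, res| match res {
        Ok(c) => text.push(c),
        Err(e) => errors.push((range, e)),
    });
    (text, errors)
}

/// Rendering lives apart from the escaping logic. In the compiler this is
/// where a sub-token span would be computed and handed to the `Handler`.
fn render(src: &str, range: Range<usize>, err: EscapeError) -> String {
    let msg = match err {
        EscapeError::LoneSlash => "trailing backslash",
        EscapeError::InvalidEscape => "unknown character escape",
    };
    format!(
        "error: {} `{}` at {}..{}",
        msg,
        &src[range.clone()],
        range.start,
        range.end
    )
}
```

With this shape, a caller that only needs the value can ignore the error ranges, while the lexer can map each range back onto the literal's span and report it. For example, `unescape_collect("a\\nb\\q")` yields the text `a`, newline, `b` and a single `InvalidEscape` covering bytes `4..6`.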

cc @eddyb , @petrochenkov

Diffstat (limited to 'src/libsyntax/ext')

-rw-r--r--

src/libsyntax/ext/base.rs

1 file changed, 1 insertion, 0 deletions

