Unexpected Unicode character class in HIR generated without Unicode support

#### What version of regex are you using?

Using `regex-syntax` 0.7.5.

#### Describe the bug at a high level.

When generating the HIR for certain regular expressions with Unicode support turned off, the resulting HIR may contain Unicode character classes. I'm not sure whether this is a bug or the intended behaviour, but the documentation suggests that it is not expected. Specifically, the documentation for [`hir::Class`](https://docs.rs/regex-syntax/latest/regex_syntax/hir/enum.Class.html) says:

> A character class corresponds to a set of characters. A character is either defined by a Unicode scalar value or a byte. Unicode characters are used by default, while bytes are used when Unicode mode (via the u flag) is disabled.

I assumed that an HIR produced without Unicode support would contain only character classes of the `Class::Bytes` variant. However, this is not the case.

#### What are the steps to reproduce the behavior?

Consider this example:

```rust
use regex_syntax; // 0.7.5

fn main() {
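    // Turn Unicode mode off and allow the HIR to match invalid UTF-8.
    let mut parser = regex_syntax::ParserBuilder::new()
        .utf8(false)
        .unicode(false)
        .build();

    // `\xc2` and `\xa0` are written as two separate byte escapes.
    let hir = parser.parse(r"(a|\xc2\xa0)");

    println!("{:?}", hir);
}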
```

It produces the following output:

```text
Ok(Capture(Capture { index: 1, name: None, sub: Class({'a'..='a', '\u{a0}'..='\u{a0}'}) }))
```

Here `sub` is a class of the `Class::Unicode` variant.
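One observation that may help when reading this output: the two escaped bytes `\xc2\xa0` happen to be the UTF-8 encoding of the single scalar value U+00A0, which matches the `'\u{a0}'` shown in the class. The sketch below only demonstrates that encoding fact with the standard library. The helper `single_scalar` is a hypothetical name introduced for illustration, and the sketch makes no claim about how `regex-syntax` arrives at its result internally.

```rust
/// Hypothetical helper: returns the scalar value if `bytes` is valid UTF-8
/// that encodes exactly one character, and `None` otherwise.
fn single_scalar(bytes: &[u8]) -> Option<char> {
    let s = std::str::from_utf8(bytes).ok()?;
    let mut chars = s.chars();
    let first = chars.next()?;
    // Reject inputs that contain more than one character.
    if chars.next().is_some() {
        None
    } else {
        Some(first)
    }
}

fn main() {
    // The two bytes escaped in the pattern as `\xc2\xa0`.
    println!("{:?}", single_scalar(&[0xC2, 0xA0]));
    // A lone 0xA0 is a continuation byte, so it is not valid UTF-8 on its own.
    println!("{:?}", single_scalar(&[0xA0]));
}
```

If the byte sequence in the pattern did not form valid UTF-8, it could not be described by a Unicode scalar value at all, so trying a pattern such as `(a|\xa0)` may be a useful comparison case.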

#### What is the expected behavior?

I expected `(a|\xc2\xa0)` to be represented as an alternation of two literals, not as a `Class::Unicode`.

