BUG: read_csv skips leading space where it shouldn't

- [X] I have checked that this issue has not already been reported.

- [X] I have confirmed this bug exists on the latest version of pandas.

- [X] (optional) I have confirmed this bug exists on the master branch of pandas.

#### Code Sample, a copy-pastable example

I wasn't able to isolate a (more) minimal example, so I'll just share what I was working on. `nltk` here is version `3.4.5`.

```python
import csv
import string

import nltk
import pandas as pd


UNFORMATTED = set(string.ascii_lowercase)
PUNCTUATION = set(" !\"&'(),-.:;?[]_`")
ALLOWED = UNFORMATTED | set(string.ascii_uppercase) | PUNCTUATION

EMPTY = '<NONE>'
CAPITALIZE = '<CAP>'


def generate_sequences(text: str, k: int):
    """
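    Yield (sequence, label) pairs: each k-character window of `text`,
    and a label for the character that follows it (the character itself
    if it is punctuation, CAPITALIZE if it is uppercase, EMPTY otherwise).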
    """
    for i in range(len(text) - k):
        seq = text[i:i + k]
        next_char = text[i + k]
        punct_char = (next_char if next_char in PUNCTUATION else
                      CAPITALIZE if next_char.isupper() else EMPTY)
        yield seq, punct_char


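# Requires the Gutenberg corpus, e.g. via nltk.download('gutenberg').
gutenberg = nltk.corpus.gutenberg
gutenberg.ensure_loaded()

# Take the first text and collapse every run of whitespace into one space.
sample_file = gutenberg.fileids()[0]
sample = ' '.join(gutenberg.raw(sample_file).split())

# One line per window: exactly 10 characters, then '|', then the label.
with open('seq.txt', 'w') as file:
    file.writelines(f"{seq}|{punct}\n" for seq, punct in generate_sequences(sample, k=10))

# Read everything back as raw strings: no quoting, no space skipping, no NA handling.
df = pd.read_csv('seq.txt', sep='|', quoting=csv.QUOTE_NONE,
                 names=['sequence', 'next_char'], skipinitialspace=False,
                 dtype=str, na_filter=False)

# Every parsed sequence should have the same length as the first one (10).
seq_length = len(df.at[0, 'sequence'])
lengths = df['sequence'].apply(len)
assert (lengths == seq_length).all()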
```

#### Problem description

An `AssertionError` is raised because one element of the `sequence` column does not have length 10, even though every line of the text file was written as exactly 10 characters, then the separator `|`, then a second value.

Upon inspection:

```python
>>> df.assign(length=lengths)[lengths != seq_length]
          sequence next_char  length
763047  it could     <NONE>       9
```

but

```python
>>> with open('seq.txt') as file:
...    lines = file.readlines()
...
>>> lines[763047]
' it could |<NONE>\n'
>>> len(lines[763047].split('|')[0])
10
```

This is unexpected because `skipinitialspace=False`, `quoting=csv.QUOTE_NONE`, `dtype=str` and `na_filter=False` were all passed to `pd.read_csv`, so every value should be read verbatim, leading spaces included.

Moreover, the behaviour is inconsistent: plenty of other values in `seq.txt` have a leading space and *do* get parsed correctly.

What's even weirder (and probably close to the crux of the problem) is that setting `EMPTY` to `'<EMPTY>'`, or to something else other than `'<NONE>'`, in the script above makes the problem disappear. Furthermore, every value of `EMPTY` I tried that has *exactly* four characters enclosed in angle brackets and starts with `NA` produces the error. That is, `'<NAAA>'` and `'<NAZZ>'` *do* produce the error, but `'<NA>'`, `'<NAAA'` and `'<NAA>'` do *not*.

Perhaps this has to do with NA parsing? Though I thought passing `na_filter=False` would have ruled that out.
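
Another reading of the observations above, offered only as a hypothesis: every token reported as triggering the error is 6 characters long, and every token reported as not triggering it has a different length. Changing the token length shifts the byte offset of every later line, so the position of the affected line in the file may matter more than the token's content. The sketch below checks the lengths and adds a hypothetical helper, `byte_offset_of_line`, for locating a line in the file. The trigger lists are copied from this report, not re-measured.

```python
# Tokens from this issue; which ones trigger the error is taken from the
# description above and is not re-tested here.
TRIGGERS = ['<NONE>', '<NAAA>', '<NAZZ>']
NON_TRIGGERS = ['<EMPTY>', '<NA>', '<NAAA', '<NAA>']

trigger_lengths = {len(token) for token in TRIGGERS}
other_lengths = {len(token) for token in NON_TRIGGERS}


def byte_offset_of_line(path, line_number):
    """Return the byte offset at which the 0-based line `line_number` starts."""
    offset = 0
    # Binary mode, so a '\r\n' line ending counts as two bytes.
    with open(path, 'rb') as file:
        for index, raw_line in enumerate(file):
            if index == line_number:
                return offset
            offset += len(raw_line)
    raise IndexError(f"{path} has no line {line_number}")
```

`byte_offset_of_line('seq.txt', 763047)` would give the position of the affected line. Whether that position coincides with anything meaningful inside the parser is not established here.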


#### Expected Output

All elements of `df['sequence']` are strings of the same length (10 in this case), so no `AssertionError`.
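
The expectation can also be written as a small check that runs under more than one parser engine, which might help narrow down where the discrepancy comes from. This is a sketch: `sequence_lengths` is a hypothetical helper, `engine` is the existing `read_csv` parameter, and I have not verified that the two engines actually disagree on `seq.txt`.

```python
import csv
import io

import pandas as pd


def sequence_lengths(source, engine):
    """Parse `source` (a path or file-like object) with the options from
    this issue and return the length of each `sequence` value."""
    df = pd.read_csv(source, sep='|', quoting=csv.QUOTE_NONE,
                     names=['sequence', 'next_char'], skipinitialspace=False,
                     dtype=str, na_filter=False, engine=engine)
    return df['sequence'].str.len().tolist()


# A two-line stand-in for seq.txt; the first line is the one from this report.
tiny = ' it could |<NONE>\nabcdefghij|<CAP>\n'
c_lengths = sequence_lengths(io.StringIO(tiny), engine='c')
python_lengths = sequence_lengths(io.StringIO(tiny), engine='python')
```

On the real data, comparing `sequence_lengths('seq.txt', engine='c')` with `sequence_lengths('seq.txt', engine='python')` would show whether both engines drop the leading space.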

#### Output of ``pd.show_versions()``

For installed environment:

<details>

```
INSTALLED VERSIONS
------------------
commit           : None
python           : 3.7.7.final.0
python-bits      : 64
OS               : Windows
OS-release       : 10
machine          : AMD64
processor        : Intel64 Family 6 Model 158 Stepping 10, GenuineIntel
byteorder        : little
LC_ALL           : None
LANG             : None
LOCALE           : None.None
pandas           : 1.0.3
numpy            : 1.18.1
pytz             : 2020.1
dateutil         : 2.8.1
pip              : 20.0.2
setuptools       : 46.1.3.post20200330
Cython           : None
pytest           : None
hypothesis       : None
sphinx           : None
blosc            : None
feather          : None
xlsxwriter       : None
lxml.etree       : None
html5lib         : None
pymysql          : None
psycopg2         : None
jinja2           : 2.11.2
IPython          : 7.13.0
pandas_datareader: None
bs4              : None
bottleneck       : None
fastparquet      : None
gcsfs            : None
lxml.etree       : None
matplotlib       : 3.2.1
numexpr          : None
odfpy            : None
openpyxl         : None
pandas_gbq       : None
pyarrow          : None
pytables         : None
pytest           : None
pyxlsb           : None
s3fs             : None
scipy            : 1.4.1
sqlalchemy       : None
tables           : None
tabulate         : None
xarray           : None
xlrd             : None
xlwt             : None
xlsxwriter       : None
numba            : None
```

</details>


For test on `master`:

<details>

```text
INSTALLED VERSIONS
------------------
commit           : 998a0deea39f11fa06071af77cc1afba65900330
python           : 3.8.2.final.0
python-bits      : 64
OS               : Windows
OS-release       : 10
machine          : AMD64
processor        : Intel64 Family 6 Model 158 Stepping 10, GenuineIntel
byteorder        : little
LC_ALL           : None
LANG             : None
LOCALE           : English_United Kingdom.1252
pandas           : 1.0.3
numpy            : 1.18.4
pytz             : 2020.1
dateutil         : 2.8.1
pip              : 20.1
setuptools       : 46.1.3
Cython           : 0.29.17
pytest           : 5.4.2
hypothesis       : 5.11.0
sphinx           : 3.0.3
blosc            : 1.9.1
feather          : None
xlsxwriter       : 1.2.8
lxml.etree       : 4.5.0
html5lib         : 1.0.1
pymysql          : None
psycopg2         : None
jinja2           : 2.11.2
IPython          : 7.14.0
pandas_datareader: None
bs4              : 4.9.0
bottleneck       : 1.3.2
fastparquet      : 0.3.3
gcsfs            : None
lxml.etree       : 4.5.0
matplotlib       : 3.2.1
numexpr          : 2.7.1
odfpy            : None
openpyxl         : 3.0.3
pandas_gbq       : None
pyarrow          : 0.17.0
pytables         : None
pytest           : 5.4.2
pyxlsb           : None
s3fs             : 0.4.2
scipy            : 1.4.1
sqlalchemy       : 1.3.16
tables           : 3.6.1
tabulate         : None
xarray           : 0.15.1
xlrd             : 1.2.0
xlwt             : 1.3.0
xlsxwriter       : 1.2.8
numba            : 0.49.1
```

</details>
