urllib sets wrong Content-Length for pseudo files on Linux #93296

Closed as not planned
@illia-v

Description

Bug report

The value of the Content-length header set by urllib.request.FileHandler.open_local_file may not match the actual length of the data on Linux.

This happens when a file from a special file system (e.g., procfs or sysfs) is requested.

open_local_file takes the length from st_size, which does not reflect the real content length of pseudo files on Linux (for procfs files it is typically zero).
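
To make the mismatch concrete outside of urllib, here is a minimal sketch that compares st_size with the number of bytes actually read. The helper name reported_and_actual_size is hypothetical and only serves the illustration; the /proc/cpuinfo path is the one used in this report.

```python
import os


def reported_and_actual_size(path):
    """Return (st_size reported by os.stat, number of bytes actually read).

    Hypothetical helper for illustration; not part of urllib.
    """
    reported = os.stat(path).st_size
    with open(path, "rb") as f:
        actual = len(f.read())
    return reported, actual


# Only meaningful on Linux, so skip quietly where procfs is not available.
if os.path.exists("/proc/cpuinfo"):
    reported, actual = reported_and_actual_size("/proc/cpuinfo")
    print(f"st_size={reported}, bytes read={actual}")
```

For a regular file both numbers should agree; for a procfs file the first is expected to be 0 while the second is not.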

cpython/Lib/urllib/request.py

Lines 1506 to 1511 in 8a0d9a6

size = stats.st_size
modified = email.utils.formatdate(stats.st_mtime, usegmt=True)
mtype = mimetypes.guess_type(filename)[0]
headers = email.message_from_string(
    'Content-type: %s\nContent-length: %d\nLast-modified: %s\n' %
    (mtype or 'text/plain', size, modified))

Example

>>> import urllib.request
>>> url = "file:///proc/cpuinfo"
>>> handler = urllib.request.FileHandler()
>>> response = handler.file_open(urllib.request.Request(url))
>>> data = response.read()
>>> headers = response.info()
>>> assert int(headers["Content-length"]) == len(data)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AssertionError
>>> print(headers["Content-length"])
0
>>> print(len(data))
18294
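
Until the header can be trusted, a caller can sidestep the problem by measuring the data it actually received. The following is a rough workaround sketch, not a proposed fix for urllib; read_file_url is a hypothetical helper name, and it assumes the default opener handles file: URLs through FileHandler.

```python
import urllib.request


def read_file_url(url):
    """Read a file: URL and return (data, header_length).

    header_length is the Content-length header parsed as an int, or None
    if the header is absent. Callers should use len(data) as the real size.
    Hypothetical helper for illustration only.
    """
    with urllib.request.urlopen(url) as response:
        data = response.read()
        header = response.headers.get("Content-length")
    return data, (int(header) if header is not None else None)
```

With the URL from the example above, len(data) would be the value to rely on, while header_length would be expected to come back as 0.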

Your environment

  • CPython versions tested on: 3.12.0 alpha 0
  • Operating system and architecture: Linux

Labels: stdlib (Python modules in the Lib dir), type-bug (An unexpected behavior, bug, or error)