RFC: add support for closeness comparison #170

Closed as not planned
@pmeier

Due to the limitations of floating point arithmetic, comparing floating point values for bitwise equality is appropriate in only a few situations. In typical situations, for example when comparing the output of a function against an expected result, it has thus become best practice to compare values for closeness rather than equality. Python added built-in support for closeness comparisons (math.isclose) with PEP485, which was introduced in Python 3.5.

With this I'm proposing to add an elementwise isclose operator:

def isclose(x1, x2, *, rtol: float, atol: float):
    pass

Similar to equal, x1 and x2 as well as the return value are arrays. The returned array will be of type bool.

The relative tolerance rtol and absolute tolerance atol should have default values which are discussed below.

Status quo

All actively considered libraries already support closeness comparisons at least partially. In addition to the elementwise isclose operation, allclose is usually defined as well. Since allclose(a, b) == all(isclose(a, b)) and all is already part of the standard, I don't think adding allclose is helpful. Otherwise, we would also need to consider allequal and so on.
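To make the allclose(a, b) == all(isclose(a, b)) argument concrete, here is a minimal sketch on plain Python lists. It is not an array implementation: the list-based isclose and allclose helpers are hypothetical stand-ins, and the defaults are simply those of math.isclose.

```python
import math


def isclose(x1, x2, *, rtol=1e-9, atol=0.0):
    """Elementwise closeness for two equal-length sequences (illustrative only)."""
    if len(x1) != len(x2):
        raise ValueError("x1 and x2 must have the same length")
    # Delegate each pairwise comparison to the built-in scalar check.
    return [math.isclose(a, b, rel_tol=rtol, abs_tol=atol) for a, b in zip(x1, x2)]


def allclose(x1, x2, *, rtol=1e-9, atol=0.0):
    """Reduction of isclose with all(); nothing beyond that is needed."""
    return all(isclose(x1, x2, rtol=rtol, atol=atol))


expected = [0.3, 1.0, 2.5]
actual = [0.1 + 0.2, 1.0, 2.5]  # 0.1 + 0.2 is not bitwise equal to 0.3

mask = isclose(actual, expected)         # one bool per element
everything_close = allclose(actual, expected)
```

Here actual == expected is False because of the rounding in 0.1 + 0.2, while every entry of mask is True.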

| Library    | isclose                               | allclose                               |
|------------|---------------------------------------|----------------------------------------|
| NumPy      | numpy.isclose                         | numpy.allclose                         |
| TensorFlow | tensorflow.experimental.numpy.isclose | tensorflow.experimental.numpy.allclose |
| PyTorch    | torch.isclose                         | torch.allclose                         |
| MXNet      | -                                     | mxnet.contrib.ndarray.allclose         |
| JAX        | jax.numpy.isclose                     | jax.numpy.allclose                     |
| Dask       | dask.array.isclose                    | dask.array.allclose                    |
| CuPy       | cupy.isclose                          | cupy.allclose                          |

Closeness definition

All the libraries above define closeness like this:

abs(actual - expected) <= atol + rtol * abs(expected)

PEP485 states about this:

In this approach, the absolute and relative tolerance are added together, rather than the or method used in [math.isclose]. This is computationally more simple, and if relative tolerance is larger than the absolute tolerance, then the addition will have no effect. However, if the absolute and relative tolerances are of similar magnitude, then the allowed difference will be about twice as large as expected.
[...]
Even more critically, if the values passed in are small compared to the absolute tolerance, then the relative tolerance will be completely swamped, perhaps unexpectedly.
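The two effects described in the quote can be reproduced with a scalar sketch of the additive definition. The function name isclose_additive is hypothetical, and the defaults are the rtol=1e-5, atol=1e-8 values the libraries use.

```python
def isclose_additive(actual, expected, *, rtol=1e-5, atol=1e-8):
    """Scalar version of the additive definition: tolerances are summed."""
    return abs(actual - expected) <= atol + rtol * abs(expected)


# Swamping: both values are far below atol, so they count as close even
# though they differ by a factor of two.
swamped = isclose_additive(1e-10, 2e-10)

# Doubling: with rtol * abs(expected) == atol == 0.1 the allowed difference
# is 0.2, so a difference of about 0.15 passes although it exceeds each
# tolerance taken on its own.
doubled = isclose_additive(1.15, 1.0, rtol=0.1, atol=0.1)
```

Both swamped and doubled evaluate to True, which is the behavior the quote warns about.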

math.isclose overcomes this and additionally is symmetric:

abs(actual - expected) <= max(atol, rtol * max(abs(actual), abs(expected)))

Thus, in addition to adding the isclose operator, I think it should stick to the objectively better definition of math.isclose.
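As a sanity check, the max-based definition can be written as a scalar function and compared against math.isclose. This is only a sketch: isclose_symmetric is a hypothetical name, and the defaults mirror math.isclose.

```python
import math


def isclose_symmetric(actual, expected, *, rtol=1e-9, atol=0.0):
    """Scalar version of the max-based definition used by math.isclose."""
    return abs(actual - expected) <= max(atol, rtol * max(abs(actual), abs(expected)))


# Swapping the arguments does not change the result.
forward = isclose_symmetric(9.0, 10.0, rtol=0.1)
backward = isclose_symmetric(10.0, 9.0, rtol=0.1)

# Small values are no longer swamped: with atol=0.0 only the relative
# difference matters, and 1e-10 vs. 2e-10 differ by a factor of two.
tiny = isclose_symmetric(1e-10, 2e-10)

# The sketch agrees with the built-in for the same inputs.
builtin = math.isclose(9.0, 10.0, rel_tol=0.1)
```

With these inputs forward and backward are both True and match builtin, while tiny is False.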

Non-finite numbers

In addition to finite numbers, the standard should also define how non-finite numbers (NaN, inf, and -inf) are to be handled. Again, I propose to stick to the rationale of PEP485, which in turn is based on IEEE 754:

  • NaN is never close to anything. All library implementations add an equal_nan: bool = False flag to the functions. If True, two NaN values are considered close. Still, a comparison between NaN and any other value is never considered close.
  • inf and -inf are only close to themselves.
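The rules above could be layered on top of the finite comparison roughly as follows. This is a scalar sketch under the assumption that the max-based definition is used for finite values; isclose_scalar is a hypothetical name. The explicit infinity branch matters because inf <= rtol * inf would otherwise evaluate to True.

```python
import math


def isclose_scalar(a, b, *, rtol=1e-9, atol=0.0, equal_nan=False):
    """Closeness check with explicit handling of NaN and infinities."""
    if math.isnan(a) or math.isnan(b):
        # NaN is close only to another NaN, and only if equal_nan is set.
        return equal_nan and math.isnan(a) and math.isnan(b)
    if math.isinf(a) or math.isinf(b):
        # An infinity is close only to an infinity of the same sign.
        return a == b
    return abs(a - b) <= max(atol, rtol * max(abs(a), abs(b)))


nan, inf = math.nan, math.inf

nan_default = isclose_scalar(nan, nan)                   # NaN vs. NaN, flag off
nan_flagged = isclose_scalar(nan, nan, equal_nan=True)   # NaN vs. NaN, flag on
nan_mixed = isclose_scalar(nan, 1.0, equal_nan=True)     # NaN vs. finite value
inf_same = isclose_scalar(inf, inf)
inf_opposite = isclose_scalar(inf, -inf)
inf_finite = isclose_scalar(inf, 1e308, rtol=0.5)        # large rtol does not help
```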

Default tolerances

In addition to fixed default values (math.isclose: rtol=1e-9, atol=0.0, all libraries: rtol=1e-5, atol=1e-8), the default tolerances could also vary with the promoted dtype. For example, arrays of dtype float64 could use stricter default tolerances than float32.

For integer dtypes, I propose using rtol = atol = 0.0, which would be identical to comparing them for equality. For floating point dtypes, I would use the rationale of PEP485 as a basis:

  • rtol: Approximately half the precision of the promoted dtype
  • atol: 0.0
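One way such dtype-dependent defaults could look is sketched below. The concrete rule is an assumption, not part of the proposal: "half the precision" is read here as the square root of the machine epsilon (half the significand bits), and dtypes are identified by plain strings. Both default_tolerances and the lookup tables are hypothetical.

```python
import math

# Machine epsilon for the binary floating point formats (assumed lookup).
_FLOAT_EPS = {"float32": 2.0 ** -23, "float64": 2.0 ** -52}
_INTEGER_DTYPES = {"int8", "int16", "int32", "int64",
                   "uint8", "uint16", "uint32", "uint64"}


def default_tolerances(dtype):
    """Return (rtol, atol) defaults for a promoted dtype given by name."""
    if dtype in _INTEGER_DTYPES:
        # Zero tolerances reduce closeness to plain equality.
        return 0.0, 0.0
    if dtype in _FLOAT_EPS:
        # Assumption: sqrt(eps) stands in for "half the precision".
        return math.sqrt(_FLOAT_EPS[dtype]), 0.0
    raise ValueError(f"no default tolerances defined for dtype {dtype!r}")


rtol32, atol32 = default_tolerances("float32")  # roughly 3.5e-4
rtol64, atol64 = default_tolerances("float64")  # 2**-26, roughly 1.5e-8
```

Under this reading the float64 default is somewhat looser than the 1e-9 that math.isclose uses, so the exact constants would still need to be agreed on.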

Labels: API extension, Needs Discussion, RFC
