Skip to content

Rotated bboxes transforms #9084

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 11 commits into
base: main
Choose a base branch
from

Conversation

AntoineSimoulin
Copy link
Member

@AntoineSimoulin AntoineSimoulin commented May 26, 2025

Add Transforms support for Rotated Boxes

This PR is the first of a series to add support for rotated boxes transforms. In particular this PR implements the following functionalities:

  • Add support to draw rotated boxes by modifying the function draw_bounding_boxes to accept BoundingBoxes with format BoundingBoxFormat.XYXYXYXY;
  • Add utility function is_rotated_bounding_format to infer whether a BoundingBoxFormat corresponds to a rotated format or not. To preserve torchscript support, this function is not added as a method from the Enum definition of BoundingBoxFormat;
  • Create function to clamp rotated boxes. For now the function only return a clone of the input boxes. The clamping function is applied after each transform. To pass the test we created the function while the functionalities are not implemented yet;
  • Add support for vertical and horizontal flip for rotated boxes.

Test plan

Please run the following tests:

pytest test/test_tv_tensors.py -vvv -k "test_bbox_format"
pytest test/test_transforms_v2.py -vvv -k "TestHorizontalFlip and test_kernel_bounding_boxes"
pytest test/test_transforms_v2.py -vvv -k "TestHorizontalFlip and test_bounding_boxes_correctness"
pytest test/test_transforms_v2.py -vvv -k "TestVerticalFlip and test_kernel_bounding_boxes"
pytest test/test_transforms_v2.py -vvv -k "TestVerticalFlip and test_bounding_boxes_correctness"

Plot function

The plot function can be used as follow

import torch
from gallery.transforms.helpers import plot
from test.common_utils import make_bounding_boxes
from torchvision import transforms, tv_tensors

img = torch.ones(3, 360, 360)
boxes = make_bounding_boxes(
    canvas_size=(360, 360),
    format=tv_tensors.BoundingBoxFormat.XYXYXYXY,
    num_boxes=2,
)
plot([(img, boxes)])

image

Future work

This PR implements only two transforms for rotated boxes. However, it is intended to validate the implementation before releasing other transforms. Please also note that the clamping transforms is for now just a placeholder to make sure we do pass the tests for rotated boxes.

Test Plan:
Run unit tests:`pytest test/test_tv_tensors.py -vvv -k "test_bbox_format"`
Test Plan:
Run unit tests: `pytest test/test_transforms_v2.py -vvv -k "TestHorizontalFlip and test_kernel_bounding_boxes"` and `pytest test/test_transforms_v2.py -vvv -k "TestHorizontalFlip and test_bounding_boxes_correctness"`
Test Plan:
Run unit tests: `pytest test/test_transforms_v2.py -vvv -k "TestVerticalFlip and test_kernel_bounding_boxes"`
Copy link

pytorch-bot bot commented May 26, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/vision/9084

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure

As of commit a15a057 with merge base 966da7e (image):

NEW FAILURE - The following job has failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

Copy link
Member

@NicolasHug NicolasHug left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR @AntoineSimoulin , I made a first round of comments

@@ -205,7 +258,7 @@ def draw_bounding_boxes(
raise ValueError("Pass individual images, not batches")
elif image.size(0) not in {1, 3}:
raise ValueError("Only grayscale and RGB images are supported")
elif (boxes[:, 0] > boxes[:, 2]).any() or (boxes[:, 1] > boxes[:, 3]).any():
elif boxes.shape[-1] == 4 and ((boxes[:, 0] > boxes[:, 2]).any() or (boxes[:, 1] > boxes[:, 3]).any()):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: since we seem to use both boxes.shape[-1] == 4 and len(bbox) == 4 as condition checks, it might help to create a unified boolean variable e.g.

is_rotated = boxes.shape[-1] == 4

and use is_rotated for the remainder of the function. It also makes the conditions slightly more explicit about what they're checking against.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer not to create a variable before the testing at the beginning of the function. But I simplified the core of the function given your other comments!

# operation
dtype = output.dtype

return output.to(dtype=dtype, device=device)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't we cast back to the input dtype unconditionally? In general the transforms should preserve the input dtype, but here's it's not clear that we are?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this test, we are creating an intermediate tensor and therefore make sure to cast it to the correct dtype.

Copy link
Member Author

@AntoineSimoulin AntoineSimoulin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@NicolasHug thanks for reviewing! I should have addressed all comments here. Additional tests for visualization can be run with pytest test/test_utils.py -vvv -k "test_draw_rotatated_boxes". Let me know if you have additional comment to add in this review.

# operation
dtype = output.dtype

return output.to(dtype=dtype, device=device)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this test, we are creating an intermediate tensor and therefore make sure to cast it to the correct dtype.

@@ -38,6 +38,14 @@ class BoundingBoxFormat(Enum):
XYXYXYXY = "XYXYXYXY"


# TODO: Once torchscript supports Enums with staticmethod
# this can be put into BoundingBoxFormat as staticmethod
def is_rotated_bounding_format(format: BoundingBoxFormat) -> bool:
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @NicolasHug, I tried to do that. Unfortunately it is not compatible with TorchScript. Indeed BoundingBoxes is directly inheriting from torch.Tensor and TorchScript does not fully support inheritance from built-in PyTorch types like torch.Tensor and has specific rules and limitations regarding which methods and attributes are accessible when scripting custom classes that inherit from these types.

boxes (Tensor): Tensor of size (N, 4) containing bounding boxes in (xmin, ymin, xmax, ymax) format. Note that
the boxes are absolute coordinates with respect to the image. In other words: `0 <= xmin < xmax < W` and
`0 <= ymin < ymax < H`.
boxes (Tensor): Tensor of size (N, 4) or (N, 8) containing bounding boxes.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This sounds good. I propose we do address this in a subsequent PR!

@@ -205,7 +258,7 @@ def draw_bounding_boxes(
raise ValueError("Pass individual images, not batches")
elif image.size(0) not in {1, 3}:
raise ValueError("Only grayscale and RGB images are supported")
elif (boxes[:, 0] > boxes[:, 2]).any() or (boxes[:, 1] > boxes[:, 3]).any():
elif boxes.shape[-1] == 4 and ((boxes[:, 0] > boxes[:, 2]).any() or (boxes[:, 1] > boxes[:, 3]).any()):
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer not to create a variable before the testing at the beginning of the function. But I simplified the core of the function given your other comments!

Copy link
Member

@NicolasHug NicolasHug left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @AntoineSimoulin !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants