Diff PDFs by converting them to HTML and diffing that #9

Open
@Mr0grog

This should not necessarily replace a differ for actual PDF content, but it could be a lot more useful when it works well: instead of trying to diff two PDF files directly, convert each PDF to HTML (there are at least a few open-source libraries for this) and feed the resulting HTML through the HTML differ.
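
To make the idea concrete, here is a rough sketch of the pipeline, not a proposal for the actual implementation. Everything named here is an assumption for illustration: `diff_pdfs_via_html`, the `to_html` callable, and `fake_converter` are hypothetical, and a plain `difflib` text diff stands in for the real HTML differ. The converter is passed in as a parameter so that any PDF-to-HTML library could be plugged in later.

```python
import difflib
from typing import Callable, List

# Hypothetical converter signature: raw PDF bytes in, HTML string out.
PdfToHtml = Callable[[bytes], str]


def diff_pdfs_via_html(old_pdf: bytes, new_pdf: bytes, to_html: PdfToHtml) -> List[str]:
    """Convert both PDFs to HTML, then diff the HTML instead of the PDFs."""
    old_html = to_html(old_pdf).splitlines()
    new_html = to_html(new_pdf).splitlines()
    # Stand-in for the real HTML differ: a unified line-by-line text diff.
    return list(
        difflib.unified_diff(old_html, new_html, "old.pdf", "new.pdf", lineterm="")
    )


def fake_converter(pdf: bytes) -> str:
    # Placeholder only: treats the bytes as plain text and wraps each line
    # in a <p> tag. A real converter would parse the PDF itself.
    return "\n".join(f"<p>{line}</p>" for line in pdf.decode("utf-8").splitlines())


if __name__ == "__main__":
    for line in diff_pdfs_via_html(b"Intro\nOld text", b"Intro\nNew text", fake_converter):
        print(line)
```

Keeping the conversion step behind a callable means the choice of PDF-to-HTML library stays open, and the diffing side does not need to know that the input was ever a PDF.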

Not sure what the right name for this is.

Lizz brought this up in Slack and, though I remember having a short discussion about the idea before, I can’t find anywhere we’ve written it down, hence this issue.
