Performance: speed of getting blob.data for large files (as compared to GitPython) #752

Closed
@jnareb

I have compared the speed of the equivalent of git show <revision>:<pathname> in both pygit2 and GitPython (the pure-Python implementation). In all other cases that I have tested, pygit2 is faster, but for very large files the pygit2 equivalent of git show / git cat-file is slower than the GitPython one.

pygit2 code:

blob = repo.revparse_single(commit + ':' + path)
result = blob.data

GitPython code:

blob = repo.rev_parse(commit + ':' + path)
result = blob.data_stream.read()
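
For anyone who wants to reproduce the comparison, here is a minimal timing sketch. The helper name best_time and the repeat count are my own assumptions for illustration and are not part of either library; it only wraps the two snippets above in a wall-clock measurement.

```python
import time


def best_time(fn, repeat=5):
    """Return the fastest wall-clock time (in seconds) over `repeat` calls of fn().

    Hypothetical helper for illustration; not part of pygit2 or GitPython.
    """
    timings = []
    for _ in range(repeat):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)
    # The minimum is less sensitive to background noise than the mean.
    return min(timings)


# Assumed usage, with both repositories already opened on the same checkout
# (pygit2_repo and gitpython_repo are placeholder names):
#
#   t_pygit2 = best_time(
#       lambda: pygit2_repo.revparse_single(commit + ':' + path).data)
#   t_gitpython = best_time(
#       lambda: gitpython_repo.rev_parse(commit + ':' + path).data_stream.read())
```

Taking the best of several runs should reduce the influence of disk cache warm-up, though the first run may still be worth recording separately for very large files.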

Do you have any ideas why pygit2 is slower here?

P.S. would it be difficult to add streaming access?
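
To make the streaming request concrete, here is a rough sketch of what I mean on the consumer side: reading from a file-like object in fixed-size chunks instead of materializing the whole blob at once. The function iter_chunks and the 64 KiB default are assumptions for illustration only. It presumes the stream's read() accepts a size argument the way standard Python file objects do, and it does not show any existing pygit2 API.

```python
import io


def iter_chunks(stream, chunk_size=64 * 1024):
    """Yield successive byte chunks from a file-like object until it is exhausted.

    Hypothetical helper; `stream` only needs a read(size) method.
    """
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            # An empty result signals end of stream.
            break
        yield chunk


# Self-contained demonstration with an in-memory stream standing in for blob data.
demo = io.BytesIO(b"abcdefghij")
for piece in iter_chunks(demo, chunk_size=4):
    print(piece)
```

With access like this, peak memory would stay around one chunk rather than the full file size, which matters most for exactly the very large files where the slowdown shows up.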
