-
-
Notifications
You must be signed in to change notification settings - Fork 5.8k
[WIP] Proposal: mime type shortcuts #10172
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Codecov Report
@@ Coverage Diff @@
## master #10172 +/- ##
==========================================
+ Coverage 43.42% 43.65% +0.22%
==========================================
Files 577 577
Lines 79666 79740 +74
==========================================
+ Hits 34594 34807 +213
+ Misses 40795 40636 -159
- Partials 4277 4297 +20
Continue to review full report at Codecov.
|
I think this is a good idea, however, realistically dropzone should be doing this sort of matching as it's there that the mimetype variation is really being generated. I've not looked but it's dropzone another one of our out of date dead upstream projects that we have picked up? PS I couldn't see what the problem was causing your code to fail to build earlier!! |
But we need a MIME type list on the server side anyway, no matter what version of dropzone we use. This method would simplify building the server side list by providing aliases.
Just sloppiness from my part. 😅 |
I think the approach using MIME types is flawed. Users generally do not know mime types (which are ambiguous in many cases and different browsers send different mime types for the same filename) but file extensions (which are never ambiguous). So instead of building on a flawed concept, I'd propose we just support both like browsers already do. Dropzone is still maintained at https://github.com/enyo/dropzone but our version is out of date. I think it's acceptedFiles option basically functions the same as the Another change I'd like to see is to set |
Do we really need it? The server does see the filename as part of the multipart header, so the extension could just be extracted from there so we can match on both extension or MIME type:
|
@silverwind Sadly, the problem of accepting any kind of files is JavaScript injection. |
Yeah, I see those checks for
|
zip files alone can be any of the following
I'm not sure under what conditions any of the above are true, but it seems like MIME checking is somewhat fragile. 🙃 |
I think mime type lookup may in some cases even be OS or distribution-specific in the case of Linux. It really depends what Mime type database is used. IIRC, there is a official one from IANA, but it's so vastly lacking in data, everyone just uses their own databases. It's a failed "standard". |
That's why I think we should provide the option for a "curated" list. Bad as it is, MIME types are the only option ATM. |
Why? I see no issue supporting both extensions and mime (and in a later release, drop support for mime). |
This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs during the next 2 months. Thank you for your contributions. |
From the author of dropzone, just for context: https://stackoverflow.com/a/17275873/225979 |
I guess ideally we'd want to match Dropzone's implementation exactly on the backend. |
Sounds like a plan. I've been browsing the Dropzone docs and code and I have the impression that the actual MIME type is reported by the browser itself (not calculated by the library), which leaves us again the problem of validating whether the file matches the reported type. I mean, we could believe the UA about it, but not much else. Personally I could live with that; we're already serving files that say to be pictures and such and we're not really validating it's actual content. |
@guillep2k A user could send a POST that has whatever mime type header they choose, so you can't rely on the HTTP header mime type value for security. The reason for having a mime check in dropzone (or any upload script) is as mostly as a slight convenience for both the user and the server that it doesn't let normal users go through the process of uploading files that it already knows the server will then reject after validation. Malicious users can still send whatever mime type they want in a request so a check should still be done on the server and not just relying on what the user sent (and then ideally you'd want both checks to have the same results on legitimate files). https://github.com/gabriel-vasile/mimetype seems like it could be much nicer than what we use now on the backed too which is https://golang.org/pkg/net/http/#DetectContentType and seems limited in comparison. |
The pedantic expositionThe problem comes from mixing file extensions with MIME types. There are multiple file types that share the same file extension, and most of them specialized can only be matched to
There are two additional difficulties to consider:
These problems are related but no the same, and require different, even conflicting approaches. The pollWe should decide on what kind of service we want to give, so we can take a better decision on the approach to take. So I propose a vote on the IMHO most important aspects (please note that some choices are easy to implement and other very difficult):
|
This will always be error prone as mentioned already; many types share same extension. The only proper solution here is to show generic icon, otherwise whether we depend on Content-Type or extension, there is always potential to exploit it by having Gitea mislead user with icon.
An executable file is desired on Release; ZIP files are no less dangerous if properly crafted. Additionally we should never rely on Content-Type sent by browser, a file should be inspected after upload by us, to determine its Content-Type. |
#12465 should make this obsolete. It will allow combinations of mime type and file extensions like
That seecond example is straight from MDN and by supporing the same format as browsers do we can efficiently delegate the job of identifying mime types to the user without having tediously maintain shortcuts like proposed in this PR. |
- Add support for file extensions, matching the `accept` attribute of `<input type="file">` - Add support for type wildcard mime types, e.g. `image/*` - Create repository.release.ALLOWED_TYPES setting (default unrestricted) - Change default for attachment.ALLOWED_TYPES to a list of extensions - Split out POST /attachments into two endpoints for issue/pr and releases to prevent circumvention of allowed types check Fixes: go-gitea#10172 Fixes: go-gitea#7266 Fixes: go-gitea#12460 Ref: https://developer.mozilla.org/en-US/docs/Web/HTML/Element/input/file#Unique_file_type_specifiers
Codecov Report
@@ Coverage Diff @@
## master #10172 +/- ##
==========================================
+ Coverage 43.63% 43.65% +0.01%
==========================================
Files 576 577 +1
Lines 79716 79740 +24
==========================================
+ Hits 34783 34807 +24
+ Misses 40638 40636 -2
- Partials 4295 4297 +2
Continue to review full report at Codecov.
|
) * Attachments: Add extension support, allow all types for releases - Add support for file extensions, matching the `accept` attribute of `<input type="file">` - Add support for type wildcard mime types, e.g. `image/*` - Create repository.release.ALLOWED_TYPES setting (default unrestricted) - Change default for attachment.ALLOWED_TYPES to a list of extensions - Split out POST /attachments into two endpoints for issue/pr and releases to prevent circumvention of allowed types check Fixes: #10172 Fixes: #7266 Fixes: #12460 Ref: https://developer.mozilla.org/en-US/docs/Web/HTML/Element/input/file#Unique_file_type_specifiers * rename function * extract GET routes out of RepoMustNotBeArchived Co-authored-by: Lauris BH <[email protected]>
Related #5750, #2619 (comment), #7266 (comment)
Since users sometimes have a hard time configuring MIME types (e.g. different notations from dropzone, browsers, Gitea), I thought we could "pre-cook" a list of common cases. These can be specified in
ALLOWED_TYPES
as well as the normal MIME types:So:
Could be written as:
And:
Could be written as:
The idea is that these notations can be combined, so the following notation is valid:
The above example will expand to the list of pre-defined
compressed
andimage
types, and the three additional specific (formal) MIME types:application/java-archive
,application/vnd.google-earth.kml+xml
andapplication/vnd.google-earth.kmz
.This exploits the fact that MIME types are always formatted as "name/name" (or "name/", "/*"), so we could take advantage and use simple names for aliases.
This PR can be considered a work in progress because the small list of aliases I made should have some discussion. I've found that it was easier submitting a PR to showcase my idea.