Skip to content

Allow proxying to remote host #154

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 12 commits into from
Nov 15, 2019

Conversation

jacobtomlinson
Copy link
Contributor

This PR fixes #152 by allowing you to proxy to another host using the format:

/proxy/{host}:{port}/{path}

I appreciate that there may be some security concerns around this but considering that Jupyter already provides a remote execution environment and a bad actor could simply use nc along with the existing proxy to acheive port forwarding to any host within the network I don't think this increases the existing attack surface.

Feedback and discussion very welcome!

@ian-r-rose
Copy link
Collaborator

I can take a look at this later, but for the moment there is some discussion of the security implications of this in #128.

@yuvipanda
Copy link
Contributor

Thanks for opening this, @jacobtomlinson!

I'm mostly worried about bypassing cross origin protection in other third party websites through this from malicious js that can't otherwise run backend code.

I'd like us to whitelist domains with a configurable here to whitelist domains. How does that sound?

@jacobtomlinson
Copy link
Contributor Author

Ah I see.

My only concern is that the dask-kubernetes service names are random UUIDs and so will be unique for every cluster. Could there be a way to dynamically whitelist? A fixed whitelist wouldn't work for this use case.

@manics
Copy link
Member

manics commented Oct 1, 2019

How about allowing a regex or glob in the whitelist?

@jacobtomlinson
Copy link
Contributor Author

That would work, are you happy with that @yuvipanda?

@todo
Copy link

todo bot commented Oct 2, 2019

Get whitelist from config

# TODO Get whitelist from config
whitelist = [r'localhost']
return any([bool(re.match(pattern, host)) for pattern in whitelist])
@web.authenticated
async def proxy(self, host, port, proxied_path):


This comment was generated by todo based on a TODO comment in 87d4955 in #154. cc @jacobtomlinson.

@jacobtomlinson
Copy link
Contributor Author

I've quickly implemented this using a regex based whitelist. I'm not familiar enough with the project to have a feeling for the best way to get the config from within the handler.

https://github.com/jupyterhub/jupyter-server-proxy/pull/154/files#diff-bc4411a876f25fa78f564f274f779fd9R171

Coud someone please point me in the right direction here?

@mrocklin
Copy link

mrocklin commented Oct 7, 2019

Checking in, what is the current status on this? @jacobtomlinson is this blocking the dask-kubernetes rewrite and release?

@manics
Copy link
Member

manics commented Oct 7, 2019

I've just looked through some of the related issues, I'm a bit confused as to how in your use case (internally hosted dask dashboard) a user will know the /proxy/{host}:{port}/{path} to go to since it's an external process, not managed by jupyter-server-proxy's simpervisor.

For config I'd guess @yuvipanda meant adding a new top-level config property to

class ServerProxy(Configurable):

@jacobtomlinson
Copy link
Contributor Author

@manics Dask knows the address of the dashboard. This is provided programmatically to the Dask JL Extension and also the widget (with a link the user clicks).

Thanks for the pointer to the config!

@mrocklin yeah this is blocking the release.

@yuvipanda
Copy link
Contributor

When me and @ian-r-rose had this conversation last, we ended up with https://github.com/dask/dask-labextension/blob/master/dask_labextension/dashboardhandler.py#L16. You can proxy to remote hosts by overriding this in your own extension. Does your rewrite have your own extension to put that code in? If so, IMO that's the 'right' thing to do - you control the hosts, but you also control the URL path to be whatever you want. If it's in config, you still have to find a way to add it to the jupyter_notebook_config.py file.

If you do want this to be a config, I think having that be function that can dynamically take a hostname and return True / False is better than a RegEx.

Thank you for working on this! I'm sorry about the tardy responses. Happy to spend some time in a more immediate medium (chat / voice) if that'll help.

@yuvipanda
Copy link
Contributor

As for creating config, you'll have to use traitlets

nbresuse offers useful example:

  1. https://github.com/yuvipanda/nbresuse/blob/master/nbresuse/__init__.py#L56 creates a Config object, and defines the aspects that can be configurable
  2. https://github.com/yuvipanda/nbresuse/blob/master/nbresuse/__init__.py#L96 it's populated from jupyter_notebook_config.py (and other files Jupyter checks) and put in a place that web handlers can see
  3. Accessed in the handlers https://github.com/yuvipanda/nbresuse/blob/master/nbresuse/__init__.py#L17

@jacobtomlinson
Copy link
Contributor Author

@yuvipanda This isn't for an extension so I don't think I have anywhere to write a custom handler.

Not used traitlets yet, looks neat. I'll have a go at adding it, but if I struggle I may ping you for a sync.

@rcthomas
Copy link
Contributor

rcthomas commented Nov 1, 2019

@jacobtomlinson is the only thing needed here the traitlets config stuff for the whitelist function?

@jacobtomlinson
Copy link
Contributor Author

Yes but this has dropped down my priority list slightly.

@rcthomas
Copy link
Contributor

rcthomas commented Nov 5, 2019

@jacobtomlinson if you review+merge the PR I just sent you then I think that might address the configuration question and then @yuvipanda can review.

Make whitelist hook configurable
@jacobtomlinson
Copy link
Contributor Author

@rcthomas done, many thanks for pushing this along.

@rcthomas
Copy link
Contributor

rcthomas commented Nov 7, 2019

There might be another thing to fix here I realized. The handlers need to be swapped (make Remote handler first) or the regex match in the handlers refined more. When I was testing my fix I was using a setup that had a Docker network and a name like "service1:8000" and that matched the proper handler. But ordering matters, and when I tried late last night with "10.128.0.20:8787" it matched the Local one (because \d's matched). I did a live fix swapping the handlers and that made the behavior correct.

@rcthomas
Copy link
Contributor

rcthomas commented Nov 8, 2019

I think that looks good. @yuvipanda What do you think?

Copy link
Contributor

@yuvipanda yuvipanda left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Traitlet nitpicking, but otherwise +1 I think.

@yuvipanda
Copy link
Contributor

Thanks, @rcthomas and @jacobtomlinson! I think this is good to go once the merge conflict is resolved.

@rcthomas
Copy link
Contributor

I think it's just here https://github.com/jupyterhub/jupyter-server-proxy/blob/master/jupyter_server_proxy/config.py#L12 @jacobtomlinson do you want to handle that? I probably should have just submitted a different PR directly, but we're almost done here...

@yuvipanda yuvipanda merged commit 0eda26c into jupyterhub:master Nov 15, 2019
@yuvipanda
Copy link
Contributor

w00t, thanks @rcthomas and @jacobtomlinson!

I'll try make a release shortly

@mrocklin
Copy link

mrocklin commented Nov 15, 2019 via email

@jacobtomlinson
Copy link
Contributor Author

Thanks all!

@jacobtomlinson jacobtomlinson deleted the allow-remote-proxy branch November 15, 2019 18:41
@yuvipanda
Copy link
Contributor

Released!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Proxy to host other than localhost
6 participants