Commit 53c3a00

Merge pull request #501 from jwindgassen/native-proxy
Integrating native-proxy
2 parents 8c5906a + cb4766d commit 53c3a00

9 files changed

+876
-59
lines changed

docs/source/index.md

Lines changed: 1 addition & 0 deletions
@@ -30,6 +30,7 @@ install
 server-process
 launchers
 arbitrary-ports-hosts
+standalone
 ```
 
 ## Convenience packages for popular applications

docs/source/standalone.md

Lines changed: 194 additions & 0 deletions
@@ -0,0 +1,194 @@
(standalone)=

# Spawning and proxying a web service from JupyterHub

The `standalone` feature of Jupyter Server Proxy enables JupyterHub admins to launch and proxy arbitrary web services
directly, instead of JupyterLab or Notebook. You can use Jupyter Server Proxy to spawn a single proxy,
without it being attached to a Jupyter server. The proxy securely authenticates and restricts access to authorized
users through JupyterHub, providing a unified way to access arbitrary applications securely.

This works similarly to {ref}`proxying Server Processes <server-process>`, where a server process is started and proxied.
The proxy is usually started from the command line, often by modifying the `Spawner.cmd` in your
[JupyterHub Configuration](https://jupyterhub.readthedocs.io/en/stable/tutorial/getting-started/spawners-basics.html).

This feature builds upon the work of [Dan Lester](https://github.com/danlester), who originally developed it in the
[jhsingle-native-proxy](https://github.com/ideonate/jhsingle-native-proxy) package.

## Installation

This feature has a dependency on JupyterHub and must be explicitly installed via an optional dependency:

```shell
pip install "jupyter-server-proxy[standalone]"
```

## Usage

The standalone proxy is controlled with the `jupyter standaloneproxy` command. You always need to specify the
{ref}`command <server-process:cmd>` of the web service that will be launched and proxied. Let's use
[voilà](https://github.com/voila-dashboards/voila) as an example here:

```shell
jupyter standaloneproxy -- voila --no-browser --port={port} /path/to/some/Notebook.ipynb
```

Executing this command will spawn a new HTTP server, which launches the voilà dashboard and renders the notebook.
Any template strings (like `{port}` in `--port={port}`) inside the command will be automatically replaced when the
command is executed.

The CLI has multiple advanced options to customize the proxy behavior. Execute `jupyter standaloneproxy --help`
to get a complete list of all arguments.
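
To picture how template strings such as `{port}` in the command are filled in, here is a minimal sketch. The helper `render_command` and the port value are hypothetical and only illustrate the idea with `str.format`; the actual substitution in jupyter-server-proxy may support other placeholders and differ in detail.

```python
def render_command(command, **values):
    """Replace ``{name}`` placeholders in every argument of ``command``."""
    return [argument.format(**values) for argument in command]


command = ["voila", "--no-browser", "--port={port}", "/path/to/some/Notebook.ipynb"]
# The port is an arbitrary example value, not one chosen by the proxy
rendered = render_command(command, port=8866)
print(rendered)
```

Arguments without placeholders pass through unchanged, so only `--port={port}` is affected here.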

### Specify the address and port

The proxy will try to extract the address and port from the `JUPYTERHUB_SERVICE_URL` environment variable, which
is set by JupyterHub. If the variable is not set, the server will be launched on `127.0.0.1:8888`.
You can also explicitly override these values:

```shell
jupyter standaloneproxy --address=localhost --port=8000 ...
```
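
The lookup described in this section could be sketched as follows. The function `address_and_port` is a hypothetical illustration built on the standard library, not the code used by the standalone proxy; only the variable name and the default `127.0.0.1:8888` come from the text above.

```python
import os
from urllib.parse import urlparse


def address_and_port(environ=os.environ, default=("127.0.0.1", 8888)):
    """Return (address, port) from JUPYTERHUB_SERVICE_URL, or the default."""
    service_url = environ.get("JUPYTERHUB_SERVICE_URL")
    if not service_url:
        return default

    parsed = urlparse(service_url)
    # Fall back per component if the URL omits the host or the port
    return parsed.hostname or default[0], parsed.port or default[1]


print(address_and_port({"JUPYTERHUB_SERVICE_URL": "http://127.0.0.1:5555"}))
print(address_and_port({}))
```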

### Disable Authentication

For testing, it can be useful to disable the authentication with JupyterHub. When `--skip-authentication` is passed,
accessing the application will not trigger the login process.

```{warning} Disabling authentication will leave the application open to anyone! Be careful with it,
especially on multi-user systems.
```

### Configuration via traitlets

Instead of using the command line, a standalone proxy can also be configured via a `traitlets` configuration file.
The configuration file can be loaded by running `jupyter standaloneproxy --config path/to/config.py`.

The options mentioned above can also be configured in the config file:

```python
# Specify the command to execute
c.StandaloneProxyServer.command = [
    "voila", "--no-browser", "--port={port}", "/path/to/some/Notebook.ipynb"
]

# Specify address and port
c.StandaloneProxyServer.address = "localhost"
c.StandaloneProxyServer.port = 8000

# Disable authentication
c.StandaloneProxyServer.skip_authentication = True
```

A default config file can be emitted by running `jupyter standaloneproxy --generate-config`.

## Usage with JupyterHub

To launch a standalone proxy with JupyterHub, you need to customize the `Spawner` inside the configuration
using `traitlets`:

```python
c.Spawner.cmd = "jupyter-standaloneproxy"
c.Spawner.args = ["--", "voila", "--no-browser", "--port={port}", "/path/to/some/Notebook.ipynb"]
```

This will hard-code JupyterHub to launch voilà instead of `jupyterhub-singleuser`. In case you want to give the users
of JupyterHub the ability to select which application to launch (like selecting either JupyterLab or voilà),
you will want to make this configuration optional:

```python
# Let users select which application to start
c.Spawner.options_form = """
    <label for="select-application">Choose Application: </label>
    <select id="select-application" name="application" required>
        <option value="lab">JupyterLab</option>
        <option value="voila">voila</option>
    </select>
"""

def select_application(spawner):
    # Form values arrive as lists of strings; fall back to JupyterLab
    application = spawner.user_options.get("application", ["lab"])[0]
    if application == "voila":
        spawner.cmd = "jupyter-standaloneproxy"
        spawner.args = ["--", "voila", "--no-browser", "--port={port}", "/path/to/some/Notebook.ipynb"]

c.Spawner.pre_spawn_hook = select_application
```

```{note} This is only a very basic implementation to show a possible approach. For a production setup, you can create
a more rigorous implementation by creating a custom `Spawner` and overriding the appropriate methods and/or
creating a custom `spawner.html` page.
```

## Technical Overview

This section explains the standalone feature of jupyter-server-proxy to developers.
It outlines the basic functionality and describes the different components of the code in more depth.

### JupyterHub and jupyterhub-singleuser

By default, JupyterHub will use the `jupyterhub-singleuser` executable when launching a new instance for a user.
This executable is usually a wrapper around the `JupyterLab` or `Notebook` application, with some
additions regarding authentication and multi-user systems.
In the standalone feature, we try to mimic these additions, but instead of using `JupyterLab` or `Notebook`, we
will wrap them around an arbitrary web application.
This will ensure direct, authenticated access to the application, without needing a Jupyter server to be running
in the background. The different additions will be discussed in more detail below.

### Structure

The standalone feature is built on top of the `SuperviseAndProxyHandler`, which will spawn a process and proxy
requests to this server. While this process is called _Server_ in the documentation, the term _Application_ will be
used here, to avoid confusion with the server to which the `SuperviseAndProxyHandler` is attached.
When using jupyter-server-proxy, the proxies are attached to the Jupyter server and will proxy requests
to the application.
Since we do not want to use the Jupyter server here, we instead require an alternative server, which will be used
to attach the `SuperviseAndProxyHandler` and all the required additions from `jupyterhub-singleuser`.
For that, we use tornado's `HTTPServer`.

### Login and Authentication

One central component is the authentication with the JupyterHub Server.
Any client accessing the application will need to authenticate with the JupyterHub API, which will ensure only
the users themselves (or otherwise allowed users, e.g., admins) can access the application.
The login process is started by deriving our `StandaloneProxyHandler` from
[jupyterhub.services.auth.HubOAuthenticated](https://github.com/jupyterhub/jupyterhub/blob/5.0.0/jupyterhub/services/auth.py#L1541)
and decorating any methods we want to authenticate with `tornado.web.authenticated`.
For the proxy, we just decorate the `proxy` method with `web.authenticated`, which will authenticate all routes on all HTTP methods.
`HubOAuthenticated` will automatically provide the login URL for the authentication process and any
client accessing any path of our server will be redirected to the JupyterHub API.

After a client has been authenticated with the JupyterHub API, they will be redirected back to our server.
This redirect will be received on the `/oauth_callback` path, from where we need to redirect the client back to the
root of the application.
We use the [HubOAuthCallbackHandler](https://github.com/jupyterhub/jupyterhub/blob/5.0.0/jupyterhub/services/auth.py#L1547),
another handler from the JupyterHub package, for this.
It will also cache the received OAuth state from the login so that we can skip authentication for the next requests
and do not need to go through the whole login process for each request.

### SSL certificates

In some JupyterHub configurations, the launched application will be configured to use an SSL certificate for requests
between the JupyterLab / Notebook and the JupyterHub API. The paths of the key, certificate and client CA are given in the
`JUPYTERHUB_SSL_*` environment variables. We use these variables to create a new SSL Context for both
the `AsyncHTTPClient` (used for Activity Notification, see below) and the `HTTPServer`.
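
As a rough illustration of building an SSL context from the `JUPYTERHUB_SSL_*` variables, consider the following sketch. The function `make_ssl_context` and its fallback behavior are assumptions made for this example; it uses only the standard-library `ssl` module and is not the code from the standalone package.

```python
import os
import ssl


def make_ssl_context(environ=os.environ, purpose=ssl.Purpose.SERVER_AUTH):
    """Build an SSLContext from the JUPYTERHUB_SSL_* variables, or return None."""
    keyfile = environ.get("JUPYTERHUB_SSL_KEYFILE")
    certfile = environ.get("JUPYTERHUB_SSL_CERTFILE")
    client_ca = environ.get("JUPYTERHUB_SSL_CLIENT_CA")

    # Assumption: without a key and a certificate, SSL is treated as not configured
    if not (keyfile and certfile):
        return None

    context = ssl.create_default_context(purpose, cafile=client_ca)
    context.load_cert_chain(certfile, keyfile)
    return context


# Purpose.SERVER_AUTH suits an HTTP client; Purpose.CLIENT_AUTH suits a server
print(make_ssl_context({}))
```

Passing the purpose as a parameter lets the same helper produce one context for the outgoing client and one for the listening server.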

### Activity Notifications

The `jupyterhub-singleuser` will periodically send an activity notification to the JupyterHub API and inform it that
the currently running application is still active. Whether this information is used or not depends on the specific
configuration of this JupyterHub.
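
A notification of this kind could be prepared as in the sketch below. The helper `build_activity_request`, the example URL and token, the payload layout, and the `token` authorization scheme are assumptions based on how JupyterHub's activity endpoint is commonly used; check the JupyterHub REST API documentation before relying on them. The request is only constructed here, not sent.

```python
import json
from datetime import datetime, timezone
from urllib.request import Request


def build_activity_request(environ, now=None):
    """Prepare (but do not send) an activity notification for the Hub."""
    now = now or datetime.now(timezone.utc)
    timestamp = now.isoformat()
    server_name = environ.get("JUPYTERHUB_SERVER_NAME", "")

    # Assumed payload: last activity of the user and of this (named) server
    payload = {
        "servers": {server_name: {"last_activity": timestamp}},
        "last_activity": timestamp,
    }
    return Request(
        environ["JUPYTERHUB_ACTIVITY_URL"],
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"token {environ['JUPYTERHUB_API_TOKEN']}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder values for illustration only
request = build_activity_request(
    {
        "JUPYTERHUB_ACTIVITY_URL": "http://127.0.0.1:8081/hub/api/users/name/activity",
        "JUPYTERHUB_API_TOKEN": "secret-token",
        "JUPYTERHUB_SERVER_NAME": "",
    }
)
print(request.get_method(), request.full_url)
```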

### Environment Variables

JupyterHub uses a lot of environment variables to specify how the launched app should be run.
The following table gives an overview of the variables used here, what they contain, and what they are used for.

| Variable | Explanation | Typical Value |
| --- | --- | --- |
| `JUPYTERHUB_SERVICE_URL` | URL where the server should be listening. Used to find the address and port to start the server on. | `http://127.0.0.1:5555` |
| `JUPYTERHUB_SERVICE_PREFIX` | A URL prefix where the root of the launched application should be hosted. E.g., when set to `/user/name/`, the root of the proxied application should be `/user/name/index.html` | `/services/service-name/` or `/user/name/` |
| `JUPYTERHUB_ACTIVITY_URL` | URL to send activity notifications to. | `$JUPYTERHUB_API_URL/user/name/activity` |
| `JUPYTERHUB_API_TOKEN` | Authorization token for requests to the JupyterHub API. | |
| `JUPYTERHUB_SERVER_NAME` | A name given to all apps launched by the JupyterHub. | |
| `JUPYTERHUB_SSL_KEYFILE`, `JUPYTERHUB_SSL_CERTFILE`, `JUPYTERHUB_SSL_CLIENT_CA` | Paths to keyfile, certfile and client CA for the SSL configuration. | |
| `JUPYTERHUB_USER`, `JUPYTERHUB_GROUP` | Name and group of the user for this application. Required for authentication. | |

jupyter_server_proxy/config.py

Lines changed: 84 additions & 59 deletions
@@ -2,6 +2,8 @@
 Traitlets based configuration for jupyter_server_proxy
 """
 
+from __future__ import annotations
+
 import sys
 from textwrap import dedent, indent
 from warnings import warn
@@ -260,60 +262,83 @@ def cats_only(response, path):
         """,
     ).tag(config=True)
 
+    def get_proxy_base_class(self) -> tuple[type | None, dict]:
+        """
+        Return the appropriate ProxyHandler Subclass and its kwargs
+        """
+        if self.command:
+            return (
+                SuperviseAndRawSocketHandler
+                if self.raw_socket_proxy
+                else SuperviseAndProxyHandler
+            ), dict(state={})
+
+        if not (self.port or isinstance(self.unix_socket, str)):
+            warn(
+                f"""Server proxy {self.name} does not have a command, port number or unix_socket path.
+                    At least one of these is required."""
+            )
+            return None, dict()
+
+        return (
+            RawSocketHandler if self.raw_socket_proxy else NamedLocalProxyHandler
+        ), dict()
 
-def _make_proxy_handler(sp: ServerProcess):
-    """
-    Create an appropriate handler with given parameters
-    """
-    if sp.command:
-        cls = (
-            SuperviseAndRawSocketHandler
-            if sp.raw_socket_proxy
-            else SuperviseAndProxyHandler
-        )
-        args = dict(state={})
-    elif not (sp.port or isinstance(sp.unix_socket, str)):
-        warn(
-            f"Server proxy {sp.name} does not have a command, port "
-            f"number or unix_socket path. At least one of these is "
-            f"required."
-        )
-        return
-    else:
-        cls = RawSocketHandler if sp.raw_socket_proxy else NamedLocalProxyHandler
-        args = {}
-
-    # FIXME: Set 'name' properly
-    class _Proxy(cls):
-        kwargs = args
-
-        def __init__(self, *args, **kwargs):
-            super().__init__(*args, **kwargs)
-            self.name = sp.name
-            self.command = sp.command
-            self.proxy_base = sp.name
-            self.absolute_url = sp.absolute_url
-            if sp.command:
-                self.requested_port = sp.port
-                self.requested_unix_socket = sp.unix_socket
-            else:
-                self.port = sp.port
-                self.unix_socket = sp.unix_socket
-            self.mappath = sp.mappath
-            self.rewrite_response = sp.rewrite_response
-            self.update_last_activity = sp.update_last_activity
-
-        def get_request_headers_override(self):
-            return self._realize_rendered_template(sp.request_headers_override)
-
-        # these two methods are only used in supervise classes, but do no harm otherwise
-        def get_env(self):
-            return self._realize_rendered_template(sp.environment)
-
-        def get_timeout(self):
-            return sp.timeout
-
-    return _Proxy
+    def get_proxy_attributes(self) -> dict:
+        """
+        Return the required attributes, which will be set on the proxy handler
+        """
+        attributes = {
+            "name": self.name,
+            "command": self.command,
+            "proxy_base": self.name,
+            "absolute_url": self.absolute_url,
+            "mappath": self.mappath,
+            "rewrite_response": self.rewrite_response,
+            "update_last_activity": self.update_last_activity,
+            "request_headers_override": self.request_headers_override,
+        }
+
+        if self.command:
+            attributes["requested_port"] = self.port
+            attributes["requested_unix_socket"] = self.unix_socket
+            attributes["environment"] = self.environment
+            attributes["timeout"] = self.timeout
+        else:
+            attributes["port"] = self.port
+            attributes["unix_socket"] = self.unix_socket
+
+        return attributes
+
+    def make_proxy_handler(self) -> tuple[type | None, dict]:
+        """
+        Create an appropriate handler for this ServerProxy Configuration
+        """
+        cls, proxy_kwargs = self.get_proxy_base_class()
+        if cls is None:
+            return None, proxy_kwargs
+
+        # FIXME: Set 'name' properly
+        attributes = self.get_proxy_attributes()
+
+        class _Proxy(cls):
+            def __init__(self, *args, **kwargs):
+                super().__init__(*args, **kwargs)
+
+                for name, value in attributes.items():
+                    setattr(self, name, value)
+
+            def get_request_headers_override(self):
+                return self._realize_rendered_template(self.request_headers_override)
+
+            # these two methods are only used in supervise classes, but do no harm otherwise
+            def get_env(self):
+                return self._realize_rendered_template(self.environment)
+
+            def get_timeout(self):
+                return self.timeout
+
+        return _Proxy, proxy_kwargs
 
 
 def get_entrypoint_server_processes(serverproxy_config):
@@ -329,21 +354,21 @@ def get_entrypoint_server_processes(serverproxy_config):
     return sps
 
 
-def make_handlers(base_url, server_processes):
+def make_handlers(base_url: str, server_processes: list[ServerProcess]):
     """
     Get tornado handlers for registered server_processes
     """
     handlers = []
-    for sp in server_processes:
-        handler = _make_proxy_handler(sp)
+    for server in server_processes:
+        handler, kwargs = server.make_proxy_handler()
         if not handler:
             continue
-        handlers.append((ujoin(base_url, sp.name, r"(.*)"), handler, handler.kwargs))
-        handlers.append((ujoin(base_url, sp.name), AddSlashHandler))
+        handlers.append((ujoin(base_url, server.name, r"(.*)"), handler, kwargs))
+        handlers.append((ujoin(base_url, server.name), AddSlashHandler))
     return handlers
 
 
-def make_server_process(name, server_process_config, serverproxy_config):
+def make_server_process(name: str, server_process_config: dict, serverproxy_config):
     return ServerProcess(name=name, **server_process_config)
 
 
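
The refactored `make_proxy_handler` in `config.py` builds a subclass at runtime and copies configuration onto each instance with `setattr`, returning the class together with its constructor kwargs. A stripped-down sketch of that pattern follows; `BaseHandler` and `make_handler` are hypothetical stand-ins for illustration and are not part of jupyter-server-proxy.

```python
class BaseHandler:
    """Stand-in for a proxy handler base class such as SuperviseAndProxyHandler."""

    def __init__(self, **kwargs):
        self.init_kwargs = kwargs


def make_handler(base_class, attributes, handler_kwargs):
    """Return a subclass of ``base_class`` plus the kwargs to construct it with."""

    class _Proxy(base_class):
        def __init__(self, *args, **kwargs):
            super().__init__(*args, **kwargs)
            # Copy the configuration onto the instance, one attribute at a time
            for name, value in attributes.items():
                setattr(self, name, value)

        def get_timeout(self):
            return self.timeout

    return _Proxy, handler_kwargs


handler_class, kwargs = make_handler(
    BaseHandler, {"name": "demo", "timeout": 5}, {"state": {}}
)
handler = handler_class(**kwargs)
print(handler.name, handler.get_timeout(), handler.init_kwargs)
```

Returning the kwargs next to the class, rather than storing them as a class attribute, keeps the generated class free of construction details, which is the same shift the diff makes from `handler.kwargs` to a returned tuple.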
Lines changed: 9 additions & 0 deletions
@@ -0,0 +1,9 @@
from .app import StandaloneProxyServer


def main():
    StandaloneProxyServer.launch_instance()


if __name__ == "__main__":
    main()
