Allow compiling to assembly and friends in the orchestrated Docker container #914

Merged
merged 7 commits into rust-lang:main
Jun 23, 2023

Conversation

adwinwhite
Contributor

cc @shepmaster.
This finally makes it usable for at least one use case: Show MIR.

The coordinator works as follows (a rough sketch of the channel bridging appears after this list):

  • We have a global COORDINATOR which allocates a Container (spawning one if none is free) when a WebSocket connection comes in.
  • One Container consists of two tasks that encode/decode messages to/from its underlying worker process's stdio, plus an event loop task that is the center of its functionality.
  • The event loop receives string requests from the WebSocket, lowers them into low-level operations such as writing a file or executing a command, and sends them to the worker.
  • After collecting the operations' results from the worker, it aggregates them into a high-level response such as CompileResponse and sends the stringified response to the WebSocket.
  • Container only exposes two channels, which are used to bridge the WebSocket and the event loop.
  • Recycling is realized by storing these two channels in a collection in COORDINATOR after the WebSocket disconnects.
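As promised above, here is a rough sketch of that bridging. The type names (WsRequest, WsResponse) and the channel shapes are simplified stand-ins, not the exact types in this PR:

use tokio::sync::mpsc;

// Simplified stand-ins for the real request/response types (illustrative only).
#[derive(Debug)]
struct WsRequest(String);
#[derive(Debug)]
struct WsResponse(String);

// A Container exposes only two channels: one carrying requests from the
// WebSocket into the event loop, one carrying responses back out.
struct Container {
    to_event_loop: mpsc::Sender<WsRequest>,
    from_event_loop: mpsc::Receiver<WsResponse>,
}

impl Container {
    fn spawn() -> Self {
        let (req_tx, mut req_rx) = mpsc::channel::<WsRequest>(8);
        let (resp_tx, resp_rx) = mpsc::channel::<WsResponse>(8);

        // Event loop task: lower each high-level request into low-level
        // operations for the worker, then aggregate the results into a response.
        tokio::spawn(async move {
            while let Some(req) = req_rx.recv().await {
                // (lowering and worker I/O elided)
                let _ = resp_tx.send(WsResponse(format!("handled {req:?}"))).await;
            }
        });

        Container { to_event_loop: req_tx, from_event_loop: resp_rx }
    }
}

#[tokio::main]
async fn main() {
    let mut container = Container::spawn();
    container.to_event_loop.send(WsRequest("compile".into())).await.unwrap();
    println!("{:?}", container.from_event_loop.recv().await);
}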

Some notes:

  • Use of the container pool can be enabled with the websocket URL flag.
  • Concurrent requests work by one canceling the other. This may be changed to handle multiple requests at the same time, but I haven't found a use case.
  • The worker doesn't currently clean up temporary files after recycling. Another TODO. Easy to fix.
  • Only the Docker image for the stable channel is created, and it contains no external crates.
  • WebSocket requests have a different schema from responses when dealing with nested enums. Another TODO. Easy to fix.
  • Streaming stdio works in the backend but lacks frontend support, so currently we cannot run code. Another TODO.

@shepmaster
Member

Before I even get into the code, I'll respond to your overview — this means I might misunderstand something that is explained by the code I haven't read yet.

  • We have a global COORDINATOR which allocates a Container (spawning one if none is free) when a WebSocket connection comes in.

Seems reasonable.

Will this immediately start the Docker container? If so, it wouldn't surprise me if we need to decouple that in the future. I expect (but have not tested) that starting Docker even when we do nothing will incur noticeable resource usage, and there are likely many people just idling on the playground page without needing an active container.

In addition to lazily starting the Docker container, I could see a world where we have a pool of idle Docker containers that a WebSocket connection grabs from when it is first needed — that way the initial response can be even quicker. In that case, the pool of coordinators may not be needed; I'm unclear on what resource the pools are attempting to conserve.

In the future, we may also need to support a cap on concurrent workers to be able to throttle.

  • One Container consists of two tasks that encode/decode messages to/from its underlying worker process's stdio, plus an event loop task that is the center of its functionality.

Fits with my expectations.

  • The event loop receives string requests from the WebSocket, lowers them into low-level operations such as writing a file or executing a command, and sends them to the worker.

"string requests" sounds odd. My expectation is that we will send structured JSON requests (e.g.

let msg = serde_json::from_str(&txt).context(crate::DeserializationSnafu);
). Those are sent as WebSocket "text" payloads, so perhaps that's what you mean.
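To make that concrete, here is a minimal sketch of deserializing such a text payload; the HighLevelRequest shape below is an illustrative assumption, not the PR's actual schema:

use serde::Deserialize;

// Illustrative request type (assumption, not the PR's actual types).
#[derive(Debug, Deserialize)]
#[serde(tag = "type")]
enum HighLevelRequest {
    Compile { code: String, target: String },
}

fn handle_text_payload(txt: &str) -> Result<HighLevelRequest, serde_json::Error> {
    // The WebSocket "text" frame carries structured JSON, not an opaque string.
    serde_json::from_str(txt)
}

fn main() {
    let txt = r#"{"type":"Compile","code":"fn main() {}","target":"mir"}"#;
    println!("{:?}", handle_text_payload(txt));
}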

Lowering to smaller operations makes sense. My hope is that the worker doesn't know anything about cargo or rustc.
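Conceptually, something like this is what I'd hope the worker sees; the names and the example paths are illustrative, not the PR's actual message types:

use std::path::PathBuf;

// The worker only knows how to read/write files and run commands; it stays
// ignorant of cargo and rustc. (Illustrative sketch, not the real protocol.)
#[derive(Debug)]
enum WorkerOperation {
    WriteFile { path: PathBuf, content: Vec<u8> },
    ReadFile { path: PathBuf },
    ExecuteCommand { cmd: String, args: Vec<String> },
}

fn main() {
    // A "show MIR" request lowers into a small sequence of such operations.
    let ops = vec![
        WorkerOperation::WriteFile {
            path: "src/main.rs".into(),
            content: b"fn main() {}".to_vec(),
        },
        WorkerOperation::ExecuteCommand {
            cmd: "cargo".into(),
            args: vec!["rustc".into(), "--".into(), "--emit=mir".into()],
        },
        WorkerOperation::ReadFile { path: "compilation".into() },
    ];
    println!("{ops:#?}");
}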

  • After collecting the operations' results from the worker, it aggregates them into a high-level response such as CompileResponse and sends the stringified response to the WebSocket.

Hopefully this is flexible. Certain things will want to wait and others will not. For example, it makes sense to stream the compiler's warnings before we show the process output (which may take several seconds to compute).

However, for our backwards compatibility purposes (e.g. the existing HTTP endpoints), we will definitely need to aggregate everything.
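As a rough sketch of that aggregation (the fragment and response types here are assumptions for illustration):

// Streamed fragments arrive incrementally over the WebSocket; for the legacy
// HTTP endpoints we fold them into a single, complete response.
#[derive(Debug)]
enum Fragment {
    Stdout(String),
    Stderr(String),
    Finished { success: bool },
}

#[derive(Default, Debug)]
struct CompileResponse {
    stdout: String,
    stderr: String,
    success: bool,
}

fn aggregate(fragments: impl IntoIterator<Item = Fragment>) -> CompileResponse {
    let mut resp = CompileResponse::default();
    for f in fragments {
        match f {
            Fragment::Stdout(s) => resp.stdout.push_str(&s),
            Fragment::Stderr(s) => resp.stderr.push_str(&s),
            Fragment::Finished { success } => resp.success = success,
        }
    }
    resp
}

fn main() {
    let resp = aggregate([
        Fragment::Stderr("warning: unused variable".into()),
        Fragment::Stdout("Hello, world!\n".into()),
        Fragment::Finished { success: true },
    ]);
    println!("{resp:#?}");
}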

  • Container only exposes two channels, which are used to bridge the WebSocket and the event loop.
  • Recycling is realized by storing these two channels in a collection in COORDINATOR after the WebSocket disconnects.

I don't think there should be any recycling. If I run a program that writes to disk, then there should not be a way for you to read that file later.

Recall that the current implementation of the playground creates a brand new Docker container for every request, so it's not terribly expensive.

  • Use of the container pool can be enabled with the websocket URL flag.

  • Concurrent requests work by one canceling the other. This may be changed to handle multiple requests at the same time, but I haven't found a use case.

I think that it makes sense to have classes of requests.

For example, performing two MIR requests back-to-back is useless and cancelling makes sense. In the future, we could even be smarter and detect if the code has changed and skip the second request (although this becomes tricky with non-pure programs).

However, it does make sense to be able to view the MIR and ASM concurrently, or run the program and look at the LLVM IR.
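A minimal sketch of what I mean by classes of requests, where a new request only cancels the in-flight request of the same class (all names here are assumptions, not code from this PR):

use std::collections::HashMap;
use tokio::task::JoinHandle;

// Requests are grouped into classes; submitting a new request aborts only the
// previous request of the *same* class, so MIR and ASM can proceed concurrently.
#[derive(Hash, PartialEq, Eq, Clone, Copy, Debug)]
enum RequestClass {
    Mir,
    Asm,
    LlvmIr,
    Execute,
}

#[derive(Default)]
struct InFlight {
    tasks: HashMap<RequestClass, JoinHandle<()>>,
}

impl InFlight {
    fn submit<F>(&mut self, class: RequestClass, work: F)
    where
        F: std::future::Future<Output = ()> + Send + 'static,
    {
        // Cancel the previous request of this class, if any.
        if let Some(old) = self.tasks.insert(class, tokio::spawn(work)) {
            old.abort();
        }
    }
}

#[tokio::main]
async fn main() {
    let mut in_flight = InFlight::default();
    in_flight.submit(RequestClass::Mir, async { /* build MIR */ });
    in_flight.submit(RequestClass::Asm, async { /* build ASM */ }); // leaves the MIR task alone
    in_flight.submit(RequestClass::Mir, async { /* newer MIR request */ }); // aborts the first one
}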

  • The worker doesn't currently clean up temporary files after recycling. Another TODO. Easy to fix.

Sounds like this can be addressed by trashing the entire container when the server is done with it, as mentioned above.

  • Only the Docker image for the stable channel is created, and it contains no external crates.

I do the exact same thing when developing locally — building hundreds of crates is not fun for a quick cycle time!

  • WebSocket requests have a different schema from responses when dealing with nested enums. Another TODO. Easy to fix.

I'm not really following this point. Do note that some of my separate work changes how the JS <-> backend messages are formed, so we'll likely need to synchronize anyway.

  • Streaming stdio works in the backend but lacks frontend support, so currently we cannot run code. Another TODO.

Yep, I've got some rough UI for that.

@shepmaster shepmaster left a comment

Ok, I read through the production code, but haven't started on the test code yet. My browser is acting up a bit with the comments in the queue, and it's lunch time, so I'll submit for now and try to circle back soon.

@shepmaster shepmaster left a comment

Some higher-level thoughts; I didn't look too deeply at the exact code implementation of the tests.

Comment on lines 1002 to 1005
match server_msg {
    None => {
        panic!("Server disconnected");
    }
Member

This is Option::expect
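i.e. the None arm (together with its Some counterpart) can be replaced by something like:

let server_msg = server_msg.expect("Server disconnected");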

@adwinwhite
Contributor Author

adwinwhite commented Mar 25, 2023

In addition to lazily starting the Docker container, I could see a world where we have a pool of idle Docker containers that a WebSocket connection grabs from when it is first needed — that way the initial response can be even quicker. In that case, the pool of coordinators may not be needed; I'm unclear on what resource the pools are attempting to conserve.

We do have a pool of idle Docker containers, and the coordinator is unique globally.
Recycling can save some Docker container start time; I had to set a longer timeout to pass tests after switching to Docker containers.
But as long as there is an idle container in the pool, start time does not matter, since we can start containers asynchronously in advance.
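Roughly, the pool behaves like this sketch (the Container type and its startup are stand-ins; this is not the PR's actual code):

use std::collections::VecDeque;
use std::sync::{Arc, Mutex};

struct Container; // stand-in for the real container handle

impl Container {
    async fn start() -> Container {
        // In reality this runs the Docker container and waits for the worker to be ready.
        Container
    }
}

#[derive(Clone, Default)]
struct Pool {
    idle: Arc<Mutex<VecDeque<Container>>>,
}

impl Pool {
    // Hand out an idle container if one exists, otherwise start a fresh one.
    // Either way, kick off a replacement in the background so the next
    // WebSocket connection doesn't pay the startup cost.
    async fn acquire(&self) -> Container {
        let reused = self.idle.lock().unwrap().pop_front();
        let pool = self.clone();
        tokio::spawn(async move {
            let fresh = Container::start().await;
            pool.idle.lock().unwrap().push_back(fresh);
        });
        match reused {
            Some(container) => container,
            None => Container::start().await,
        }
    }
}

#[tokio::main]
async fn main() {
    let pool = Pool::default();
    let _container = pool.acquire().await;
}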

Hopefully this is flexible. Certain things will want to wait and others will not. For example, it makes sense to stream the compiler's warnings before we show the process output (which may take several seconds to compute).

It works in this case. cargo run emits the compiler's warnings via stderr and the process output via stdout, and these two streams are sent to the frontend separately.
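For reference, forwarding the two streams independently looks roughly like this (a simplified sketch using tokio, not the code in this PR):

use tokio::io::{AsyncBufRead, AsyncBufReadExt, BufReader, Lines};
use tokio::process::Command;

#[tokio::main]
async fn main() -> std::io::Result<()> {
    let mut child = Command::new("cargo")
        .arg("run")
        .stdout(std::process::Stdio::piped())
        .stderr(std::process::Stdio::piped())
        .spawn()?;

    let stdout = BufReader::new(child.stdout.take().unwrap()).lines();
    let stderr = BufReader::new(child.stderr.take().unwrap()).lines();

    // Forward each stream on its own task so warnings (stderr) can reach the
    // frontend before the program's output (stdout) is complete.
    let out_task = tokio::spawn(forward(stdout, "stdout"));
    let err_task = tokio::spawn(forward(stderr, "stderr"));

    let _ = tokio::join!(out_task, err_task);
    child.wait().await?;
    Ok(())
}

async fn forward<R>(mut lines: Lines<R>, tag: &'static str)
where
    R: AsyncBufRead + Unpin,
{
    while let Ok(Some(line)) = lines.next_line().await {
        println!("[{tag}] {line}");
    }
}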

However, it does make sense to be able to view the MIR and ASM concurrently, or run the program and look at the LLVM IR.

Good use cases. We can queue requests and process them one by one, since there might be conflicts when two compilation commands work in the same project directory at the same time.
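One way that queuing could be sketched, assuming a single shared project directory guarded by an async mutex (illustrative only, not this PR's implementation):

use std::sync::Arc;
use tokio::sync::Mutex;

// The shared scratch project inside the container; only one compilation may
// touch it at a time, so requests naturally queue up on the mutex.
struct ProjectDir;

async fn compile_one(project: Arc<Mutex<ProjectDir>>, label: &str) {
    let _guard = project.lock().await;
    // write src/main.rs, run the compiler, collect output ...
    println!("compiling {label}");
}

#[tokio::main]
async fn main() {
    let project = Arc::new(Mutex::new(ProjectDir));
    // A MIR request and an ASM request run one after the other, never
    // concurrently in the same directory.
    tokio::join!(
        compile_one(project.clone(), "mir"),
        compile_one(project.clone(), "asm"),
    );
}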

#[derive(Debug)]
pub enum PlaygroundMessage {
    Request(RequestId, HighLevelRequest),
    StdinPacket(CommandId, String),
Member

I don't think that we need stdin support at all for compilation, so that should probably be removed from this PR and saved until later.

Member

I decided against actually removing this. It's unused and untested, but probably won't be too much trouble.

@shepmaster shepmaster left a comment

Re-skimming; haven't made it through sandbox / worker / message yet.

#[snafu::report]
#[tokio::main(flavor = "current_thread")]
pub async fn main() {
Member

We aren't actually looking at the stderr on the coordinator side yet. It's being streamed to the coordinator's stderr, but that won't be useful in a web context.

) -> Result<ActiveCompilation, CompileError> {
    use compile_error::*;

    let output_path: &str = "compilation";
Member

The current playground has to do some ugly tricks to find the right file; I wonder if we can avoid that now.

@shepmaster
Member

All right... I finally had a chance to get the last bit of this first step to what I think is a mergeable and deployable state. I'm going to let CI run on this a little and maybe let it sit for a few days to see if any reviews come in.

@shepmaster shepmaster changed the title Use container pool Allow compiling to assembly and friends in the orchestrated Docker container Jun 15, 2023
@shepmaster shepmaster added the CI: approved Allowed access to CI secrets label Jun 21, 2023
shepmaster and others added 6 commits June 21, 2023 21:33
The orchestrator allows starting up a Docker container and then
communicating with it while it's running. We can then request that files be
read or written in the container, as well as run processes.

Just enough has been done to connect the pieces needed to support
compiling to assembly / LLVM / MIR, etc. Future work will add other
operations including the ability to run the code.

Co-authored-by: Jake Goulding <[email protected]>
@shepmaster
Member

Since this PR has changes to the CI (which I disallow for security reasons), I pushed a trusted branch and it passed CI.

🚀

@shepmaster shepmaster merged commit 486c431 into rust-lang:main Jun 23, 2023