Allow compiling to assembly and friends in the orchestrated Docker container #914

Merged
merged 7 commits into rust-lang:main
Jun 23, 2023

Conversation

adwinwhite
Contributor

cc @shepmaster.
This finally makes it usable for at least one use case: Show MIR.

The coordinator works as follows (a rough sketch of the channel bridging appears after this list):

  • We have a global COORDINATOR which allocates a Container (spawning one if none is free) when a WebSocket connection comes in.
  • One Container consists of two tasks that encode/decode messages to/from its underlying worker process's stdio, plus an event loop task that is the center of its functionality.
  • The event loop receives string requests from the WebSocket, lowers them into low-level operations such as writing a file or executing a command, and sends them to the worker.
  • After collecting the operations' results from the worker, it aggregates them into a high-level response such as CompileResponse and sends the stringified response to the WebSocket.
  • Container only exposes two channels, which are used to bridge the WebSocket and the event loop.
  • Recycling is realized by storing these two channels in a collection in COORDINATOR after the WebSocket disconnects.
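As promised above, here is a rough sketch of that bridging. The type names (WsRequest, WsResponse) and the channel shapes are simplified stand-ins, not the exact types in this PR:

use tokio::sync::mpsc;

// Simplified stand-ins for the real request/response types (illustrative only).
#[derive(Debug)]
struct WsRequest(String);
#[derive(Debug)]
struct WsResponse(String);

// A Container exposes only two channels: one carrying requests from the
// WebSocket into the event loop, one carrying responses back out.
struct Container {
    to_event_loop: mpsc::Sender<WsRequest>,
    from_event_loop: mpsc::Receiver<WsResponse>,
}

impl Container {
    fn spawn() -> Self {
        let (req_tx, mut req_rx) = mpsc::channel::<WsRequest>(8);
        let (resp_tx, resp_rx) = mpsc::channel::<WsResponse>(8);

        // Event loop task: lower each high-level request into low-level
        // operations for the worker, then aggregate the results into a response.
        tokio::spawn(async move {
            while let Some(req) = req_rx.recv().await {
                // (lowering and worker I/O elided)
                let _ = resp_tx.send(WsResponse(format!("handled {req:?}"))).await;
            }
        });

        Container { to_event_loop: req_tx, from_event_loop: resp_rx }
    }
}

#[tokio::main]
async fn main() {
    let mut container = Container::spawn();
    container.to_event_loop.send(WsRequest("compile".into())).await.unwrap();
    println!("{:?}", container.from_event_loop.recv().await);
}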

Some notes:

  • Use of the container pool can be enabled with the websocket URL flag.
  • Concurrent requests work by one canceling the other. This may be changed to handle multiple requests at the same time, but I haven't found a use case.
  • The worker doesn't currently clean up temporary files after recycling. Another TODO. Easy to fix.
  • Only the Docker image for the stable channel is created, and it contains no external crates.
  • WebSocket requests have a different schema from responses when dealing with nested enums. Another TODO. Easy to fix.
  • Streaming stdio works in the backend but lacks frontend support, so currently we cannot run code. Another TODO.

@shepmaster
Member

Before I even get into the code, I'll respond to your overview — this means I might misunderstand something that is explained by the code I haven't read yet.

  • We have a global COORDINATOR which allocates a Container (spawning one if none is free) when a WebSocket connection comes in.

Seems reasonable.

Will this immediately start the Docker container? If so, it wouldn't surprise me if we need to decouple that in the future. I expect (but have not tested) that starting Docker even when we do nothing will incur noticeable resource usage, and there are likely many people just idling on the playground page without needing an active container.

In addition to lazily starting the Docker container, I could see a world where we have a pool of idle Docker containers that a WebSocket connection grabs from when it is first needed — that way the initial response can be even quicker. In that case, the pool of coordinators may not be needed; I'm unclear on what resource the pools are attempting to conserve.

In the future, we may also need to support a cap on concurrent workers to be able to throttle.

  • One Container consists of two tasks that encode/decode messages to/from its underlying worker process's stdio, plus an event loop task that is the center of its functionality.

Fits with my expectations.

  • The event loop receives string requests from the WebSocket, lowers them into low-level operations such as writing a file or executing a command, and sends them to the worker.

"string requests" sounds odd. My expectation is that we will send structured JSON requests (e.g.

let msg = serde_json::from_str(&txt).context(crate::DeserializationSnafu);
). Those are sent as WebSocket "text" payloads, so perhaps that's what you mean.
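To make that concrete, here is a minimal sketch of deserializing such a text payload; the HighLevelRequest shape below is an illustrative assumption, not the PR's actual schema:

use serde::Deserialize;

// Illustrative request type (assumption, not the PR's actual types).
#[derive(Debug, Deserialize)]
#[serde(tag = "type")]
enum HighLevelRequest {
    Compile { code: String, target: String },
}

fn handle_text_payload(txt: &str) -> Result<HighLevelRequest, serde_json::Error> {
    // The WebSocket "text" frame carries structured JSON, not an opaque string.
    serde_json::from_str(txt)
}

fn main() {
    let txt = r#"{"type":"Compile","code":"fn main() {}","target":"mir"}"#;
    println!("{:?}", handle_text_payload(txt));
}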

Lowering to smaller operations makes sense. My hope is that the worker doesn't know anything about cargo or rustc.
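Conceptually, something like this is what I'd hope the worker sees; the names and the example paths are illustrative, not the PR's actual message types:

use std::path::PathBuf;

// The worker only knows how to read/write files and run commands; it stays
// ignorant of cargo and rustc. (Illustrative sketch, not the real protocol.)
#[derive(Debug)]
enum WorkerOperation {
    WriteFile { path: PathBuf, content: Vec<u8> },
    ReadFile { path: PathBuf },
    ExecuteCommand { cmd: String, args: Vec<String> },
}

fn main() {
    // A "show MIR" request lowers into a small sequence of such operations.
    let ops = vec![
        WorkerOperation::WriteFile {
            path: "src/main.rs".into(),
            content: b"fn main() {}".to_vec(),
        },
        WorkerOperation::ExecuteCommand {
            cmd: "cargo".into(),
            args: vec!["rustc".into(), "--".into(), "--emit=mir".into()],
        },
        WorkerOperation::ReadFile { path: "compilation".into() },
    ];
    println!("{ops:#?}");
}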

  • After collecting the operations' results from the worker, it aggregates them into a high-level response such as CompileResponse and sends the stringified response to the WebSocket.

Hopefully this is flexible. Certain things will want to wait and others will not. For example, it makes sense to stream the compiler's warnings before we show the process output (which may take several seconds to compute).

However, for our backwards compatibility purposes (e.g. the existing HTTP endpoints), we will definitely need to aggregate everything.
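As a rough sketch of that aggregation (the fragment and response types here are assumptions for illustration):

// Streamed fragments arrive incrementally over the WebSocket; for the legacy
// HTTP endpoints we fold them into a single, complete response.
#[derive(Debug)]
enum Fragment {
    Stdout(String),
    Stderr(String),
    Finished { success: bool },
}

#[derive(Default, Debug)]
struct CompileResponse {
    stdout: String,
    stderr: String,
    success: bool,
}

fn aggregate(fragments: impl IntoIterator<Item = Fragment>) -> CompileResponse {
    let mut resp = CompileResponse::default();
    for f in fragments {
        match f {
            Fragment::Stdout(s) => resp.stdout.push_str(&s),
            Fragment::Stderr(s) => resp.stderr.push_str(&s),
            Fragment::Finished { success } => resp.success = success,
        }
    }
    resp
}

fn main() {
    let resp = aggregate([
        Fragment::Stderr("warning: unused variable".into()),
        Fragment::Stdout("Hello, world!\n".into()),
        Fragment::Finished { success: true },
    ]);
    println!("{resp:#?}");
}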

  • Container only exposes two channels, which are used to bridge the WebSocket and the event loop.
  • Recycling is realized by storing these two channels in a collection in COORDINATOR after the WebSocket disconnects.

I don't think there should be any recycling. If I run a program that writes to disk, then there should not be a way for you to read that file later.

Recall that the current implementation of the playground creates a brand new Docker container for every request, so it's not terribly expensive.

  • Use of the container pool can be enabled with the websocket URL flag.

  • Concurrent requests work by one canceling the other. This may be changed to handle multiple requests at the same time, but I haven't found a use case.

I think that it makes sense to have classes of requests.

For example, performing two MIR requests back-to-back is useless and cancelling makes sense. In the future, we could even be smarter and detect if the code has changed and skip the second request (although this becomes tricky with non-pure programs).

However, it does make sense to be able to view the MIR and ASM concurrently, or run the program and look at the LLVM IR.
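A minimal sketch of what I mean by classes of requests, where a new request only cancels the in-flight request of the same class (all names here are assumptions, not code from this PR):

use std::collections::HashMap;
use tokio::task::JoinHandle;

// Requests are grouped into classes; submitting a new request aborts only the
// previous request of the *same* class, so MIR and ASM can proceed concurrently.
#[derive(Hash, PartialEq, Eq, Clone, Copy, Debug)]
enum RequestClass {
    Mir,
    Asm,
    LlvmIr,
    Execute,
}

#[derive(Default)]
struct InFlight {
    tasks: HashMap<RequestClass, JoinHandle<()>>,
}

impl InFlight {
    fn submit<F>(&mut self, class: RequestClass, work: F)
    where
        F: std::future::Future<Output = ()> + Send + 'static,
    {
        // Cancel the previous request of this class, if any.
        if let Some(old) = self.tasks.insert(class, tokio::spawn(work)) {
            old.abort();
        }
    }
}

#[tokio::main]
async fn main() {
    let mut in_flight = InFlight::default();
    in_flight.submit(RequestClass::Mir, async { /* build MIR */ });
    in_flight.submit(RequestClass::Asm, async { /* build ASM */ }); // leaves the MIR task alone
    in_flight.submit(RequestClass::Mir, async { /* newer MIR request */ }); // aborts the first one
}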

  • The worker doesn't currently clean up temporary files after recycling. Another TODO. Easy to fix.

Sounds like this can be addressed by trashing the entire container when the server is done with it, as mentioned above.

  • Only the Docker image for the stable channel is created, and it contains no external crates.

I do the exact same thing when developing locally — building hundreds of crates is not fun for a quick cycle time!

  • WebSocket requests have a different schema from responses when dealing with nested enums. Another TODO. Easy to fix.

I'm not really following this point. Do note that some of my separate work changes how the JS <-> backend messages are formed, so we'll likely need to synchronize anyway.

  • Streaming stdio works in the backend but lacks frontend support, so currently we cannot run code. Another TODO.

Yep, I've got some rough UI for that.

@shepmaster shepmaster left a comment

Ok, I read through the production code, but haven't started on the test code yet. My browser is acting up a bit with the comments in the queue, and it's lunch time, so I'll submit for now and try to circle back soon.

@shepmaster shepmaster left a comment

Some higher-level thoughts; I didn't look too deeply at the exact code implementation of the tests.

Comment on lines 1002 to 1005
match server_msg {
    None => {
        panic!("Server disconnected");
    }
Member

This is Option::expect
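i.e. the None arm (together with its Some counterpart) can be replaced by something like:

let server_msg = server_msg.expect("Server disconnected");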

@adwinwhite
Contributor Author

adwinwhite commented Mar 25, 2023

In addition to lazily starting the Docker container, I could see a world where we have a pool of idle Docker containers that a WebSocket connection grabs from when it is first needed — that way the initial response can be even quicker. In that case, the pool of coordinators may not be needed; I'm unclear on what resource the pools are attempting to conserve.

We do have a pool of idle Docker containers, and the coordinator is unique globally.
Recycling can save some Docker container start time; I had to set a longer timeout to pass tests after switching to Docker containers.
But as long as there is an idle container in the pool, start time does not matter, since we can start containers asynchronously in advance.
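Roughly, the pool behaves like this sketch (the Container type and its startup are stand-ins; this is not the PR's actual code):

use std::collections::VecDeque;
use std::sync::{Arc, Mutex};

struct Container; // stand-in for the real container handle

impl Container {
    async fn start() -> Container {
        // In reality this runs the Docker container and waits for the worker to be ready.
        Container
    }
}

#[derive(Clone, Default)]
struct Pool {
    idle: Arc<Mutex<VecDeque<Container>>>,
}

impl Pool {
    // Hand out an idle container if one exists, otherwise start a fresh one.
    // Either way, kick off a replacement in the background so the next
    // WebSocket connection doesn't pay the startup cost.
    async fn acquire(&self) -> Container {
        let reused = self.idle.lock().unwrap().pop_front();
        let pool = self.clone();
        tokio::spawn(async move {
            let fresh = Container::start().await;
            pool.idle.lock().unwrap().push_back(fresh);
        });
        match reused {
            Some(container) => container,
            None => Container::start().await,
        }
    }
}

#[tokio::main]
async fn main() {
    let pool = Pool::default();
    let _container = pool.acquire().await;
}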

Hopefully this is flexible. Certain things will want to wait and others will not. For example, it makes sense to stream the compiler's warnings before we show the process output (which may take several seconds to compute).

It works in this case. cargo run emits the compiler's warnings via stderr and the process output via stdout, and these two streams are sent to the frontend separately.
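For reference, forwarding the two streams independently looks roughly like this (a simplified sketch using tokio, not the code in this PR):

use tokio::io::{AsyncBufRead, AsyncBufReadExt, BufReader, Lines};
use tokio::process::Command;

#[tokio::main]
async fn main() -> std::io::Result<()> {
    let mut child = Command::new("cargo")
        .arg("run")
        .stdout(std::process::Stdio::piped())
        .stderr(std::process::Stdio::piped())
        .spawn()?;

    let stdout = BufReader::new(child.stdout.take().unwrap()).lines();
    let stderr = BufReader::new(child.stderr.take().unwrap()).lines();

    // Forward each stream on its own task so warnings (stderr) can reach the
    // frontend before the program's output (stdout) is complete.
    let out_task = tokio::spawn(forward(stdout, "stdout"));
    let err_task = tokio::spawn(forward(stderr, "stderr"));

    let _ = tokio::join!(out_task, err_task);
    child.wait().await?;
    Ok(())
}

async fn forward<R>(mut lines: Lines<R>, tag: &'static str)
where
    R: AsyncBufRead + Unpin,
{
    while let Ok(Some(line)) = lines.next_line().await {
        println!("[{tag}] {line}");
    }
}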

However, it does make sense to be able to view the MIR and ASM concurrently, or run the program and look at the LLVM IR.

Good use cases. We can queue requests and process them one by one, since there might be conflicts when two compilation commands work in the same project directory at the same time.
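One way that queuing could be sketched, assuming a single shared project directory guarded by an async mutex (illustrative only, not this PR's implementation):

use std::sync::Arc;
use tokio::sync::Mutex;

// The shared scratch project inside the container; only one compilation may
// touch it at a time, so requests naturally queue up on the mutex.
struct ProjectDir;

async fn compile_one(project: Arc<Mutex<ProjectDir>>, label: &str) {
    let _guard = project.lock().await;
    // write src/main.rs, run the compiler, collect output ...
    println!("compiling {label}");
}

#[tokio::main]
async fn main() {
    let project = Arc::new(Mutex::new(ProjectDir));
    // A MIR request and an ASM request run one after the other, never
    // concurrently in the same directory.
    tokio::join!(
        compile_one(project.clone(), "mir"),
        compile_one(project.clone(), "asm"),
    );
}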

#[derive(Debug)]
pub enum PlaygroundMessage {
    Request(RequestId, HighLevelRequest),
    StdinPacket(CommandId, String),
Member

I don't think that we need stdin support at all for compilation, so that should probably be removed from this PR and saved until later.

Member

I decided against actually removing this. It's unused and untested, but probably won't be too much trouble.

@shepmaster shepmaster left a comment

Re-skimming; haven't made it through sandbox / worker / message yet.

#[snafu::report]
#[tokio::main(flavor = "current_thread")]
pub async fn main() {
Member

We aren't actually looking at the stderr on the coordinator side yet. It's being streamed to the coordinator's stderr, but that won't be useful in a web context.

) -> Result<ActiveCompilation, CompileError> {
    use compile_error::*;

    let output_path: &str = "compilation";
Member

The current playground has to do some ugly tricks to find the right file; I wonder if we can avoid that now.

@shepmaster
Member

All right... I finally had a chance to get the last bit of this first step to what I think is a mergeable and deployable state. I'm going to let CI run on this a little and maybe let it sit for a few days to see if any reviews come in.

@shepmaster shepmaster changed the title Use container pool Allow compiling to assembly and friends in the orchestrated Docker container Jun 15, 2023
@shepmaster shepmaster added the CI: approved Allowed access to CI secrets label Jun 21, 2023
shepmaster and others added 6 commits June 21, 2023 21:33
The orchestrator allows starting up a Docker container and then
communicating with it while it's running. We can then request that files be
read or written in the container, as well as run processes.

Just enough has been done to connect the pieces needed to support
compiling to assembly / LLVM / MIR, etc. Future work will add other
operations including the ability to run the code.

Co-authored-by: Jake Goulding <[email protected]>
@shepmaster
Member

Since this PR has changes to the CI (which I disallow for security reasons), I pushed a trusted branch and it passed CI.

🚀

@shepmaster shepmaster merged commit 486c431 into rust-lang:main Jun 23, 2023