Commit aacbbb4

committed
--amend
1 parent a265d38 commit aacbbb4

1 file changed

+108
-3
lines changed

reps/2022-08-31-actor-affinity-apis.md

Lines changed: 108 additions & 3 deletions
@@ -21,7 +21,7 @@ Yes, this will be a complement to ray core's ability to flexibly schedule actors
2121
## Stewardship
2222
### Required Reviewers
2323

24-
@wumuzi520 SenlinZhu @WangTaoTheTonic @scv119 (Chen Shen) @jjyao (Jiajun Yao)
24+
@wumuzi520 SenlinZhu @Chong Li @scv119 (Chen Shen) @jjyao (Jiajun Yao)
2525
### Shepherd of the Proposal (should be a senior committer)
2626

2727

@@ -335,7 +335,7 @@ ActorHandle<Counter> actor6 =
335335
### Implementation plan
336336
Now there are two modes of scheduling: GCS mode scheduling and raylet scheduling.
337337
It will be simpler to implement in GCS mode.
338-
#### GCS Scheduling Mode Implementation plan
338+
#### 1. GCS Scheduling Mode Implementation plan
339339

340340
1. Add a Labels property to the actor, stored in the GcsActor structure.
340340
2. Add a GcsLabelManager to the GCS server. After each actor finishes scheduling, add its labels->node information to the GcsLabelManager.
@@ -397,10 +397,107 @@ Main data structure :
397397
Map<label_key, Map<label_value, Set<node_id>>> label_to_nodes_
398398
Map<node_id, Set<GcsActor>> node_to_actors_
399399
```
400-
#### Raylet Scheduling Mode Implementation plan
400+
#### 2. Raylet Scheduling Mode Implementation plan
401401
The implementation of Raylet scheduling mode is largely the same as the GCS scheduling mode above.
402402
The main difference is that the Labels information must additionally be synchronized to all Raylet nodes.
403403

404+
1. Add the actor_labels data structure to the resource synchronization data structures (ResourcesData and NodeResources).
405+
```
406+
message ResourcesData {
407+
// Node id.
408+
bytes node_id = 1;
409+
// Resource capacity currently available on this node manager.
410+
map<string, double> resources_available = 2;
411+
// Indicates whether available resources is changed. Only used when light
412+
// heartbeat enabled.
413+
bool resources_available_changed = 3;
414+
415+
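// Labels of the actors scheduled to this node: key -> (value -> reference count).
// protobuf map values cannot themselves be maps, so the inner map is wrapped in a
// message, e.g. a hypothetical LabelValueCounts { map<string, int32> counts = 1; }.
416+
map<string, LabelValueCounts> actor_labels = 15;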
417+
// Whether the actor labels on this node have changed.
418+
bool actor_labels_changed = 16;
419+
}
420+
421+
422+
NodeResources {
423+
ResourceRequest total;
424+
ResourceRequest available;
425+
/// Only used by light resource report.
426+
ResourceRequest load;
427+
/// Resources owned by normal tasks.
428+
ResourceRequest normal_task_resources;
429+
/// Actors scheduled to this node and actor labels information
430+
absl::flat_hash_map<string, absl::flat_hash_map<string, int>> actor_labels;
431+
}
432+
```
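The `key -> value -> reference_count` layout of `actor_labels` can be sketched as follows. This is a minimal Python illustration rather than Ray code: the class name `NodeActorLabels` and its methods are hypothetical. The reference count keeps a label visible for as long as at least one actor on the node still carries it.

```python
from collections import defaultdict


class NodeActorLabels:
    """Hypothetical per-node view: label key -> label value -> reference count."""

    def __init__(self):
        self.counts = defaultdict(dict)

    def add_actor_labels(self, labels):
        # Called when an actor carrying these labels is placed on the node.
        for key, value in labels.items():
            self.counts[key][value] = self.counts[key].get(value, 0) + 1

    def remove_actor_labels(self, labels):
        # Called when the actor is destroyed; entries that reach zero are dropped.
        for key, value in labels.items():
            values = self.counts.get(key)
            if not values or value not in values:
                continue
            values[value] -= 1
            if values[value] == 0:
                del values[value]
            if not values:
                del self.counts[key]

    def has(self, key, value=None):
        # With value=None this answers "does any actor here have this key?".
        values = self.counts.get(key, {})
        return bool(values) if value is None else value in values


node = NodeActorLabels()
node.add_actor_labels({"location": "dc_1"})
node.add_actor_labels({"location": "dc_1", "version": "1"})
node.remove_actor_labels({"location": "dc_1"})
```

After these calls the node still reports `location=dc_1`, because the second actor keeps the count above zero.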
433+
434+
2. Adapt the places where ResourcesData is constructed and consumed in the resource synchronization mechanism.
435+
a. NodeManager::HandleRequestResourceReport
436+
b. NodeManager::HandleUpdateResourceUsage
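One possible way for the two handlers to treat the new fields is sketched below, mirroring how `resources_available_changed` gates `resources_available`. This is an assumption for illustration only: plain dicts stand in for `ResourcesData`, and `build_report` / `apply_report` are hypothetical names, not the actual handler code.

```python
import copy


def build_report(node_id, available, actor_labels, last_sent_labels):
    """Reporting side (stands in for HandleRequestResourceReport)."""
    report = {"node_id": node_id, "resources_available": dict(available)}
    changed = actor_labels != last_sent_labels
    report["actor_labels_changed"] = changed
    if changed:
        # Send a full snapshot of key -> value -> reference count.
        report["actor_labels"] = copy.deepcopy(actor_labels)
    return report


def apply_report(cluster_view, report):
    """Receiving side (stands in for HandleUpdateResourceUsage)."""
    node = cluster_view.setdefault(
        report["node_id"], {"resources_available": {}, "actor_labels": {}}
    )
    node["resources_available"] = report["resources_available"]
    if report["actor_labels_changed"]:
        node["actor_labels"] = report["actor_labels"]


cluster_view = {}
labels = {"location": {"dc_1": 1}}
first = build_report("node_a", {"CPU": 4.0}, labels, last_sent_labels={})
apply_report(cluster_view, first)
second = build_report("node_a", {"CPU": 3.0}, labels, last_sent_labels=labels)
apply_report(cluster_view, second)
```

The second report omits `actor_labels` because nothing changed, so the receiver keeps its cached copy while still updating the available resources.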
437+
438+
439+
3. Add ActorLabels information to NodeResources during Actor scheduling
440+
441+
a. When the Raylet successfully schedules an actor, the ActorLabels information is added to the target remote node in the ClusterResourceManager.
442+
```
443+
void ClusterTaskManager::ScheduleAndDispatchTasks() {
444+
auto scheduling_node_id = cluster_resource_scheduler_->GetBestSchedulableNode(/* ... */);
445+
ScheduleOnNode(scheduling_node_id, work);
446+
cluster_resource_scheduler_->AllocateRemoteTaskResources(scheduling_node_id, resources);
447+
cluster_resource_scheduler_->GetClusterResourceManager().AddActorLabels(scheduling_node_id, actor);
448+
```
449+
b. Add the ActorLabels information to the LocalResourceManager when the actor is dispatched to a worker.
450+
```
451+
LocalTaskManager::DispatchScheduledTasksToWorkers()
452+
cluster_resource_scheduler_->GetLocalResourceManager().AllocateLocalTaskResources
453+
cluster_resource_scheduler_->GetLocalResourceManager().AddActorLabels(actor)
454+
worker_pool_.PopWorker()
455+
```
456+
457+
c. When the actor is destroyed, its ActorLabels information is also removed from the LocalResourceManager.
458+
```
459+
NodeManager::HandleReturnWorker
460+
local_task_manager_->ReleaseWorkerResources(worker);
461+
local_resource_manager_->RemoveActorLabels(actor_id);
462+
```
463+
464+
Actor scheduling flowchart:
465+
![Actor scheduling flowchart](https://user-images.githubusercontent.com/11072802/202128385-f72609c5-308d-4210-84ff-bf3ba6df381c.png)
466+
467+
Node Resources synchronization mechanism:
468+
![Node Resources synchronization mechanism](https://user-images.githubusercontent.com/11072802/202128406-b4745e6e-3565-41a2-bfe3-78843379bf09.png)
469+
470+
4. Scheduling optimization through ActorLabels
471+
At this point, the raylet on every node has the ActorLabels information for all nodes.
472+
However, if ActorAffinity scheduling traverses the labels of every actor on every node, the cost grows with the total number of actors in the cluster, and performance will be poor.
473+
<b>Therefore, it is necessary to build a cluster-wide ActorLabels index table to improve scheduling performance.</b>
474+
475+
```
476+
class GcsLabelManager {
477+
public:
478+
absl::flat_hash_set<NodeID> GetNodesByKeyAndValue(const std::string &ray_namespace,
479+
const std::string &key, const absl::flat_hash_set<std::string> &values) const;
480+
481+
absl::flat_hash_set<NodeID> GetNodesByKey(const std::string &ray_namespace,
482+
const std::string &key) const;
483+
484+
void AddActorLabels(const std::shared_ptr<GcsActor> &actor);
485+
486+
void RemoveActorLabels(const std::shared_ptr<GcsActor> &actor);
487+
488+
private:
489+
<namespace, <label_key, <label_value, [node_id]>>> labels_to_nodes_;
490+
<node_id, <namespace, [actor]>> nodes_to_actors_;
491+
}
492+
```
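The index above can be sketched in Python to show why lookups avoid scanning every actor. This is a minimal illustration under assumptions, not the proposed C++ implementation: `LabelIndex` and its method names are hypothetical stand-ins for `GcsLabelManager`, and actors are reduced to an id plus a label dict.

```python
from collections import defaultdict


class LabelIndex:
    """Hypothetical Python stand-in for the GcsLabelManager index."""

    def __init__(self):
        # namespace -> label key -> label value -> set of node ids
        self.labels_to_nodes = defaultdict(
            lambda: defaultdict(lambda: defaultdict(set))
        )
        # node id -> namespace -> list of (actor id, labels)
        self.nodes_to_actors = defaultdict(lambda: defaultdict(list))

    def add_actor_labels(self, namespace, node_id, actor_id, labels):
        self.nodes_to_actors[node_id][namespace].append((actor_id, dict(labels)))
        for key, value in labels.items():
            self.labels_to_nodes[namespace][key][value].add(node_id)

    def remove_actor_labels(self, namespace, node_id, actor_id):
        actors = self.nodes_to_actors[node_id][namespace]
        removed = [labels for aid, labels in actors if aid == actor_id]
        actors[:] = [(aid, labels) for aid, labels in actors if aid != actor_id]
        for labels in removed:
            for key, value in labels.items():
                # Keep the node indexed if another actor on it still has the label.
                if not any(other.get(key) == value for _, other in actors):
                    self.labels_to_nodes[namespace][key][value].discard(node_id)

    def get_nodes_by_key_and_value(self, namespace, key, values):
        by_value = self.labels_to_nodes[namespace][key]
        found = set()
        for value in values:
            found |= by_value.get(value, set())
        return found

    def get_nodes_by_key(self, namespace, key):
        by_value = self.labels_to_nodes[namespace][key]
        return set().union(*by_value.values())


index = LabelIndex()
index.add_actor_labels("ns", "node_a", "actor_1", {"location": "dc_1"})
index.add_actor_labels("ns", "node_b", "actor_2", {"location": "dc_2"})
```

Each lookup touches only the entries for one namespace and key, so its cost depends on the number of matching values and nodes rather than on the number of actors in the cluster.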
493+
494+
<b>Advantages:</b>
495+
1. Like the alternative of putting Labels into custom resources, this scheme reuses the resource synchronization mechanism, but without distorting the concept of a custom resource.
496+
497+
<b>Defect:</b>
498+
1. Resource synchronization is inevitably delayed under raylet scheduling, so scheduling decisions may be inaccurate when actor affinity uses soft semantics.
499+
500+
404501
### Failures and Special Scenarios
405502
#### 1. If the Match Expression Cannot Be Satisfied
406503
If the match expressions cannot be satisfied, the actor is added to the pending actor queue and stays there until all of the match expressions are satisfied.
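The pending-queue behavior can be sketched as a retry loop. This is a simplified illustration under assumptions: `schedule_pending` and the `can_place` callback are hypothetical names, and the toy cluster state below is not how Ray tracks labels.

```python
from collections import deque


def schedule_pending(pending, can_place):
    """Try every pending actor once; keep the ones that still cannot be placed.

    `pending` is a deque of actor ids. `can_place` is a hypothetical callback
    returning a node id, or None when the match expressions are unsatisfied.
    """
    placed = {}
    for _ in range(len(pending)):
        actor_id = pending.popleft()
        node_id = can_place(actor_id)
        if node_id is None:
            pending.append(actor_id)  # stays queued until a later retry
        else:
            placed[actor_id] = node_id
    return placed


labeled_nodes = {}  # toy cluster state: label value -> node id
wanted = {"actor_b": "dc_1"}  # actor_b needs a node hosting a "dc_1" actor


def can_place(actor_id):
    return labeled_nodes.get(wanted[actor_id])


pending = deque(["actor_b"])
first = schedule_pending(pending, can_place)   # no matching node yet
labeled_nodes["dc_1"] = "node_a"               # a matching actor appears
second = schedule_pending(pending, can_place)  # retry now succeeds
```

The first pass leaves `actor_b` in the queue; once a node with the required label exists, the next pass places it.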
@@ -428,3 +525,11 @@ All APIs will be fully unit tested. All specifications in this documentation wil
428525
429526
## (Optional) Follow-on Work
430527
528+
### Expression of "OR" semantics.
529+
If needed later, "OR" semantics can be supported by adding an "is_or_semantics" parameter to ActorAffinitySchedulingStrategy.
530+
```
531+
class ActorAffinitySchedulingStrategy:
532+
    def __init__(self, match_expressions: List[ActorAffinityMatchExpression], is_or_semantics=False):
533+
        self.match_expressions = match_expressions
534+
        self.is_or_semantics = is_or_semantics
535+
```
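How such a flag might change evaluation can be sketched as follows. This is an assumption for illustration: `Strategy` and `label_in` are hypothetical simplifications, with each match expression modeled as a predicate over a node's actor labels rather than as an `ActorAffinityMatchExpression`.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Set

# A match expression is modeled as a predicate over a node's actor labels.
MatchExpression = Callable[[Dict[str, Set[str]]], bool]


def label_in(key: str, values: List[str]) -> MatchExpression:
    """True when the node has an actor whose label `key` is one of `values`."""
    return lambda node_labels: bool(node_labels.get(key, set()) & set(values))


@dataclass
class Strategy:
    match_expressions: List[MatchExpression]
    is_or_semantics: bool = False

    def matches(self, node_labels: Dict[str, Set[str]]) -> bool:
        # AND semantics requires every expression; OR semantics requires one.
        combine = any if self.is_or_semantics else all
        return combine(expr(node_labels) for expr in self.match_expressions)


node = {"location": {"dc_1"}, "version": {"1"}}
exprs = [label_in("location", ["dc_1"]), label_in("version", ["2"])]
and_result = Strategy(exprs).matches(node)
or_result = Strategy(exprs, is_or_semantics=True).matches(node)
```

With the default AND semantics the node is rejected because the `version` expression fails; with `is_or_semantics=True` the `location` match alone is enough.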
