The task pipeline is described using JSON format and consists of several nodes. Each node contains the following core attributes:
-
Task Triggering
- Start the task by specifying the entry node through the
tasker.post_task
interface
- Start the task by specifying the entry node through the
-
Sequential Detection:
- Perform sequential detection on the current node's
next
list - Attempt to identify the recognition features configured for each sub-node in sequence
- Perform sequential detection on the current node's
-
Interruption Mechanism:
- When a sub-node is successfully matched, immediately terminate subsequent node detections
- Execute the operation defined by the matched node's
action
-
Successor Processing:
- After the operation is completed, switch the active node to the current node
- Repeat the above detection process
The task process terminates when any of the following conditions are met:
- The current node's
next
list is empty - All subsequent node detections continuously fail until timeout
In a game scene, fruits (apple/orange/banana) appear randomly and require a click operation. When an orange is clicked, an additional animal (cat/dog) appears that also needs to be clicked.
{
"StartFruit": {
"next": [
"Apple",
"Orange",
"Banana"
]
},
"Apple": {
"recognition": XXX,
"action": "Click",
// ...
},
"Orange": {
"recognition": XXX,
"action": "Click",
"next": [
"Cat",
"Dog"
]
},
"Banana": {
// ...
},
"Cat": {
"action": "Swipe",
"next": []
},
// ...
}
- Start StartFruit
- Recognize Apple -> Failure
- Recognize Orange -> Success -> Execute click operation
- Switch to the
next
list of Orange - Recognize Cat -> Success -> Execute swipe operation
- Switch to the
next
list of Cat - The
next
list is empty, task completed
- Start StartFruit
- Recognize Apple -> Failure
- Recognize Orange -> Failure
- Recognize Banana -> Failure
- Check if recognition has timed out -> Not timed out
- Recognize Apple -> Success -> Execute click operation
- Switch to the
next
list of Apple - The
next
list is empty, task completed
Note: For mandatory fields, they may remain empty in the Pipeline JSON file, but must be set via the interface before actual execution.
-
recognition
: string
Recognition algorithm type. Optional, default isDirectHit
.
Possible values:DirectHit
|TemplateMatch
|FeatureMatch
|ColorMatch
|OCR
|NeuralNetworkClassify
|NeuralNetworkDetect
|Custom
.
See Algorithm Types for details. -
action
: string
Action to execute. Optional, default isDoNothing
.
Possible values:DoNothing
|Click
|Swipe
|MultiSwipe
|Key
|InputText
|StartApp
|StopApp
|StopTask
|Command
|Custom
.
See Action Types for details. -
next
: string | list<string, >
List of nodes to execute next. Optional, default is empty.
It recognizes each node in sequence and executes the first one it recognizes. -
interrupt
: string | list<string, >
The list of candidate nodes when all nodes innext
are not recognized, and similar interrupt operations will be performed. Optional, empty by default.
If all nodes innext
are not recognized, each node in the interrupt list will be recognized in order, and the first recognized one will be executed. After all subsequent nodes are executed, jump back to this node to try to recognize it again.
For example: A: { next: [B, C], interrupt: [D, E] }
When B and C are not recognized and D is recognized, D and D.next will be fully executed. But when the pipeline of D is fully executed. It will return to node A again and continue to try to recognize B, C, D, E.
This field is mostly used for exception handling. For example, D is to recognize the "network disconnection prompt box". After clicking confirm and waiting for the network connection to succeed, continue the previous node flow. -
is_sub
: bool
(Deprecated in version 2.x, but retains compatibility.interrupt
is recommended instead.)
Whether it is a sub-node. Optional, default is false.
If it's a sub-node, after completing this node (and subsequent nodes such as "next"), it will return to re-recognize the "next" list of this node.
For example: A.next = [B, Sub_C, D], where Sub_C.is_sub = true. If Sub_C is matched, after fully executing Sub_C and subsequent nodes, it will return to re-recognize [B, Sub_C, D] and execute the matching items and subsequent nodes. -
rate_limit
: uint
Identification rate limit, in milliseconds. Optional, default 1000.
Each round of identification "next" + "interrupt" consumes at leastrate_limit
milliseconds, and sleep will wait if the time is less than that. -
timeout
: uint
Timeout for recognizing "next" + "interrupt" nodes, in milliseconds. Optional, Default is 20,000 milliseconds (20 seconds).
The detailed logic iswhile(!timeout) { foreach(next + interrupt); sleep_until(rate_limit); }
-
on_error
: string | list<string, >
When recognition timeout or the action fails to execute, the nodes in this list will be executed next. Optional, empty by default. -
timeout_next
: string | list<string, >
(Deprecated in version 2.x, but retains compatibility.on_error
is recommended instead.)
List of nodes to execute after a timeout. Optional, default is empty. -
inverse
: bool
Reverse the recognition result: recognized as not recognized, and not recognized as recognized. Optional, default is false.
Please note that nodes recognized through this setting will have their own clicking actions disabled (because nothing was actually recognized). If there is a need, you can set thetarget
separately. -
enabled
: bool
Whether to enable this node. Optional, default is true.
If set to false, this node will be skipped when it appears in the "next" lists of other nodes, meaning it won't be recognized or executed. -
pre_delay
: uint
Delay in milliseconds between recognizing a node and executing the action. Optional, default is 200 milliseconds.
It is recommended to add intermediate nodes whenever possible and use less delay to maintain both speed and stability. -
post_delay
: uint
Delay in milliseconds between executing the action and recognizing the "next" nodes. Optional, default is 200 milliseconds.
It is recommended to add intermediate nodes whenever possible and use less delay to maintain both speed and stability. -
pre_wait_freezes
: uint | object
Time in milliseconds to wait for the screen to stop changing between recognizing a node and executing the action. Optional, default is 0 (no waiting).
It will exit the action only when the screen has not had significant changes for "pre_wait_freezes" milliseconds in a row.
If it's an object, more parameters can be set, see Waiting for the Screen to Stabilize for details. The specific order ispre_wait_freezes
-pre_delay
-action
-post_wait_freezes
-post_delay
. -
post_wait_freezes
: uint | object
Time in milliseconds to wait for the screen to stop changing between executing the action and recognizing the "next" nodes. Optional, default is 0 (no waiting).
Other logic is the same aspre_wait_freezes
. -
focus
: any
Focus on the node, resulting in additional callback messages. Optional, default is null (no messages).
See Node Notifications for details.
Please refer to default_pipeline.json
Default
can set the default values of all fields. And the object of algorithm/action name can set the default parameter value of the corresponding algorithm/action.
Direct hit, meaning no recognition is performed, and the action is executed directly.
Template matching, also known as "find image."
This algorithm property requires additional fields:
-
roi
: array<int, 4> | string
Recognition area coordinates. Optional, default [0, 0, 0, 0], i.e. full screen.- array<int, 4>: Recognition area coordinates, [x, y, w, h], if you want full screen, you can set it to [0, 0, 0, 0].
- string: Fill in the node name, and identify within the target range identified by a previously executed node.
-
roi_offset
: array<int, 4>
Move additionally based onroi
as the range, and add the four values separately. Optional, default [0, 0, 0, 0]. -
template
: string | list<string, >
Path to the template image, relative to the "image" folder. Required. The images used need to be cropped from the lossless original image and scaled to 720p. Reference to here. -
threshold
: double | list<double, >
Template matching threshold. Optional, default is 0.7.
If it's an array, its length should match the length of thetemplate
array. -
order_by
: string
How the results are sorted. Optional, default isHorizontal
Possible values:Horizontal
|Vertical
|Score
|Random
You can use it with theindex
field. -
index
: int
Index to hit. Optional, default is0
.
If there are N results in total, the value range ofindex
is [-N, N - 1], where negative numbers are converted to N - index using Python-like rules. If it exceeds the range, it is considered that there is no result in the current identification. -
method
: int
Template matching algorithm, equivalent to cv::TemplateMatchModes. Optional, default is 5.
Only supports 1, 3, and 5, with higher values providing greater accuracy but also taking more time.
For more details, refer to the OpenCV official documentation. -
green_mask
: bool
Whether to apply a green mask. Optional, default is false.
If set to true, you can paint the unwanted parts in the image green with RGB: (0, 255, 0), and those green parts won't be matched.
Feature matching, a more powerful "find image" with better generalization, resistant to perspective and size changes.
This algorithm property requires additional fields:
-
roi
: array<int, 4> | string
Same asTemplateMatch
.roi
. -
roi_offset
: array<int, 4>
Same asTemplateMatch
.roi_offset
. -
template
: string | list<string, >
Path to the template image, relative to the "image" folder. Required. -
count
: int
The number of required matching feature points (threshold), default is 4. -
order_by
: string
How the results are sorted. Optional, default isHorizontal
Possible values:Horizontal
|Vertical
|Score
|Area
|Random
You can use it with theindex
field. -
index
: int
Index to hit. Optional, default is0
.
If there are N results in total, the value range ofindex
is [-N, N - 1], where negative numbers are converted to N - index using Python-like rules. If it exceeds the range, it is considered that there is no result in the current identification. -
green_mask
: bool
Whether to apply a green mask. Optional, default is false.
If set to true, you can paint the unwanted parts in the image green with RGB: (0, 255, 0), and those green parts won't be matched. -
detector
: string
Feature detector. Optional, default isSIFT
.
Currently, it supports the following algorithms:- SIFT
High computational complexity, scale invariance, and rotation invariance. Best performance. - KAZE
Suitable for 2D and 3D images, scale invariance, and rotation invariance. - AKAZE
Faster computation speed, scale invariance, and rotation invariance. - BRISK
Very fast computation speed, scale invariance, and rotation invariance. - ORB
Very fast computation speed, rotation invariance, but lacks scale invariance.
You can look up detailed characteristics of each algorithm on your own.
- SIFT
-
ratio
: double
The distance ratio for KNN matching, [0 - 1.0], where larger values make the matching more lenient (easier to connect). Optional, default is 0.6.
Color matching, also known as "find color."
This algorithm property requires additional fields:
-
roi
: array<int, 4> | string
Same asTemplateMatch
.roi
. -
roi_offset
: array<int, 4>
Same asTemplateMatch
.roi_offset
. -
method
: int
Color matching method, equivalent to cv::ColorConversionCodes. Optional, default is 4 (RGB).
Common values are 4 (RGB, 3 channels), 40 (HSV, 3 channels), and 6 (GRAY, 1 channel).
For more details, refer to the OpenCV official documentation. -
lower
: list<int, > | list<list<int, >>
Lower bound for colors. Required. The innermost list length should match the number of channels in themethod
. -
upper
: list<int, > | list<list<int, >>
Upper bound for colors. Required. The innermost list length should match the number of channels in themethod
. -
count
: int
The threshold for the number of matching points required. Optional, default is 1. -
order_by
: string
How the results are sorted. Optional, default isHorizontal
Possible values:Horizontal
|Vertical
|Score
|Area
|Random
You can use it with theindex
field. -
index
: int
Index to hit. Optional, default is0
.
If there are N results in total, the value range ofindex
is [-N, N - 1], where negative numbers are converted to N - index using Python-like rules. If it exceeds the range, it is considered that there is no result in the current identification. -
connected
: bool
Whether to count only connected points. Optional, default is false.
If set to true, after applying color filtering, it will only count the maximum connected block of pixels. If set to false, it won't consider whether these pixels are connected.
Text recognition.
This algorithm property requires additional fields:
-
roi
: array<int, 4> | string
Same asTemplateMatch
.roi
. -
roi_offset
: array<int, 4>
Same asTemplateMatch
.roi_offset
. -
expected
: string | list<string, >
The expected results, supports regular expressions. Required. -
threshold
: double
Model confidence threshold. Optional, default is 0.3. -
replace
: array<string, 2> | list<array<string, 2>>
Some text recognition results may not be accurate, so replacements are performed. Optional. -
order_by
: string
How the results are sorted. Optional, default isHorizontal
Possible values:Horizontal
|Vertical
|Area
|Length
|Random
You can use it with theindex
field. -
index
: int
Index to hit. Optional, default is0
.
If there are N results in total, the value range ofindex
is [-N, N - 1], where negative numbers are converted to N - index using Python-like rules. If it exceeds the range, it is considered that there is no result in the current identification. -
only_rec
: bool
Whether to recognize only (without detection, requires preciseroi
). Optional, default is false. -
model
: string
Model folder path. Use a relative path to the "model/ocr" folder. Optional, default is empty.
If empty, it will use the models in the root of the "model/ocr" folder. The folder should include three files:rec.onnx
,det.onnx
, andkeys.txt
.
Deep learning classification, to determine if the image in a fixed position matches the expected "category."
This algorithm property requires additional fields:
-
roi
: array<int, 4> | string
Same asTemplateMatch
.roi
. -
roi_offset
: array<int, 4>
Same asTemplateMatch
.roi_offset
. -
labels
: list<string, >
Labels, meaning the names of each category. Optional.
It only affects debugging images and logs. If not filled, it will be filled with "Unknown." -
model
: string
Model file path. Use a relative path to the "model/classify" folder. Required.
Currently, only ONNX models are supported. -
expected
: int | list<int, >
The expected category index. -
order_by
: string
How the results are sorted. Optional, default isHorizontal
Possible values:Horizontal
|Vertical
|Random
You can use it with theindex
field. -
index
: int
Index to hit. Optional, default is0
.
If there are N results in total, the value range ofindex
is [-N, N - 1], where negative numbers are converted to N - index using Python-like rules. If it exceeds the range, it is considered that there is no result in the current identification.
For example, if you want to recognize whether a cat or a mouse appears in a fixed position in the image, and you've trained a model that supports this three-category classification, and you want to click when it recognizes a cat or a mouse but not when it recognizes a dog, the relevant fields would be:
{
"labels": ["Cat", "Dog", "Mouse"],
"expected": [0, 2]
}
Please note that these values should match the actual model output.
Deep learning object detection, an advanced version of "find image."
The main difference from classification is the flexibility to find objects at arbitrary positions. However, this often requires more complex models, more training data, longer training times, and significantly higher resource usage during inference.
This algorithm property requires additional fields:
-
roi
: array<int, 4> | string
Same asTemplateMatch
.roi
. -
roi_offset
: array<int, 4>
Same asTemplateMatch
.roi_offset
. -
labels
: list<string, >
Labels, meaning the names of each category. Optional.
It only affects debugging images and logs. If not filled, it will be filled with "Unknown." -
model
: string
Model file path. Use a relative path to the "model/detect" folder. Required.
Currently, only YoloV8 ONNX models are supported. -
expected
: int | list<int, >
The expected category index. -
threshold
: double | list<double, >
Model confidence threshold. Optional, default is 0.3.
If it's an array, its length should match the length of theexpected
array. -
order_by
: string
How the results are sorted. Optional, default isHorizontal
Possible values:Horizontal
|Vertical
|Area
|Random
You can use it with theindex
field. -
index
: int
Index to hit. Optional, default is0
.
If there are N results in total, the value range ofindex
is [-N, N - 1], where negative numbers are converted to N - index using Python-like rules. If it exceeds the range, it is considered that there is no result in the current identification.
For example, if you want to detect cats, dogs, and mice in an image and only click when a cat or a mouse is detected but not when a dog is detected, the relevant fields would be:
{
"labels": ["Cat", "Dog", "Mouse"],
"expected": [0, 2]
}
Please note that these values should match the actual model output.
Execute the recognition handle passed in through the MaaResourceRegisterCustomRecognition
interface
This algorithm property requires additional fields:
-
custom_recognition
: string
Recognition name, same as the one passed in through the registration interface. It will also be passed throughMaaCustomRecognitionCallback
.custom_recognition_name
. Required. -
custom_recognition_param
: any
Recognition parameter, any type, will be passed throughMaaCustomRecognitionCallback
.custom_recognition_param
. Optional, default empty json, i.e.{}
-
roi
: array<int, 4> | string
Same asTemplateMatch
.roi
, will be passed throughMaaCustomRecognitionCallback
.roi
. Optional, default [0, 0, 0, 0]. -
roi
: array<int, 4> | string
Same asTemplateMatch
.roi
.
Does nothing.
Clicks.
Additional properties for this action:
-
target
: true | string | array<int, 4>
The position to click. Optional, default is true.- true: Clicks the target just recognized in this node (i.e., clicks itself).
- string: Enter the node name to click a target recognized by a previously executed node.
- array<int, 4>: Clicks a random point within a fixed coordinate area [x, y, w, h]. To click the entire screen, set it to [0, 0, 0, 0].
-
target_offset
: array<int, 4>
Additional movement from thetarget
before clicking, where the four values are added together. Optional, default is [0, 0, 0, 0].
Linear swipe.
Additional properties for this action:
-
begin
: true | string | array<int, 4>
The starting point of the swipe. Optional, default is true. The values are the same asClick
.target
. -
begin_offset
: array<int, 4>
Additional movement from thebegin
before swiping, where the four values are added together. Optional, default is [0, 0, 0, 0]. -
end
: true | string | array<int, 4>
The end point of the swipe. Optional, default is true. The values are the same asClick
.target
. -
end_offset
: array<int, 4>
Additional movement from theend
before swiping, where the four values are added together. Optional, default is [0, 0, 0, 0]. -
duration
: uint
Duration of the swipe in milliseconds. Optional, default is 200.
Multi-finger linear swipe.
Additional properties for this action:
-
swipes
: list<object,>
Multi swipe array. Required.
The order of the array elements has no effect, only based onstarting
.-
starting
: uint
Swipe start time, in milliseconds. Optional, default 0.
MultiSwipe
additional field, the swipe will start pressing at thestarting
millisecond in this action. -
begin
: true | string | array<int, 4>
The starting point of the swipe. Optional, default is true. The values are the same asClick
.target
. -
begin_offset
: array<int, 4>
Additional movement from thebegin
before swiping, where the four values are added together. Optional, default is [0, 0, 0, 0]. -
end
: true | string | array<int, 4>
The end point of the swipe. Optional, default is true. The values are the same asClick
.target
. -
end_offset
: array<int, 4>
Additional movement from theend
before swiping, where the four values are added together. Optional, default is [0, 0, 0, 0]. -
duration
: uint
Duration of the swipe in milliseconds. Optional, default is 200.
-
For example:
{
"A": {
"action": "MultiSwipe",
"swipes": [
{
"begin": xxx,
"end": xxx
},
{
"starting": 500,
"begin": xxx,
"end": xxx
}
]
}
}
Presses a key.
-
key
: int | list<int, >
The key(s) to press, supporting only virtual key code of corresponding controller.
Inputs text.
input_text
: string
The text to input, some controller only supports ascii.
Starts an app.
Additional properties for this action:
package
: string
Launch entry. Required.
You need to enter the package name or activity, for example,com.hypergryph.arknights
orcom.hypergryph.arknights/com.u8.sdk.U8UnityContext
.
Closes an app.
Additional properties for this action:
package
: string
The app to close. Required.
You need to enter the package name, for example,com.hypergryph.arknights
.
Stops the current task chain (the individual task chain passed to MaaTaskerPostTask).
Execute a command.
This action attribute requires additional fields:
-
exec
: string
The path of the program to be executed. Required. -
args
: list<string,>
The arguments to be executed. Optional.
supports runtime parameters replacement:{ENTRY}
: Entry name.{NODE}
: Node name.{IMAGE}
: The path to the file where the screenshot is saved. The file is deleted before the process exits. Please copy it by yourself if you want to save it permanently.{BOX}
: Identify the hit target, the format is[x, y, w, h]
.{RESOURCE_DIR}
: The path of the resource folder loaded last time.{LIBRARY_DIR}
: The path of the folder where the MaaFW library is located.
-
detach
: bool
Detach the child process, that is, do not wait for the child process to complete, and continue directly to the next task. Optional, default false.
Example:
{
"NodeA": {
"action": "Command",
"exec": "Python",
"args": [
"{RESOURCE_DIR}/my_script/test.py"
"Haha",
"{IMAGE}",
"{NODE}",
"{BOX}"
]
},
"NodeB": {
"action": "Command",
"exec": "{RESOURCE_DIR}/my_exec/my_exec.exe"
}
}
The actual command is:
# NodeA
Python C:/MaaXXX/resource/my_script/test.py Haha C:/temp/123.png NodeA [0,0,0,0]
# NodeB
C:/MaaXXX/resource/my_exec/my_exec.exe
Execute the action handle passed in through the MaaResourceRegisterCustomAction
interface
This action attribute requires additional fields:
-
custom_action
: string
Action name, same as the identifier name passed in the registration interface. It will also be passed throughMaaCustomActionCallback
.custom_action_name
. Required. -
custom_action_param
: any
Action parameter, any type, will be passed throughMaaCustomActionCallback
.custom_action_param
. Optional, default empty json, i.e.{}
-
target
: true | string | array<int, 4>
Same asClick
.target
, will be passed throughMaaCustomActionCallback
.box
. Optional, default true. -
target_offset
: array<int, 4>
Same asClick
.target_offset
.
Waits for the screen to stabilize. It exits the action only when there is no significant change in the screen for a certain continuous time.
The field value can be a uint or an object. For example:
{
"NodeA": {
"pre_wait_freezes": 500
},
"NodeB": {
"pre_wait_freezes": {
// more properties ...
}
}
}
If the value is an object, you can set additional fields:
-
time
: uint
It exits the action only when there has been no significant change in the screen for "time" milliseconds in a row. Optional, default is 1. -
target
: true | string | array<int, 4>
The target to wait for. Optional, default is true. The values are the same asClick
.target
. -
target_offset
: array<int, 4>
Additional movement from thetarget
to be used as the waiting target, where the four values are added together. Optional, default is [0, 0, 0, 0]. -
threshold
: double
The template matching threshold to determine "no significant change." Optional, default is 0.95. -
method
: int
The template matching algorithm to determine "no significant change," i.e., cv::TemplateMatchModes. Optional, default is 5. The same asTemplateMatch
.method
. -
rate_limit
: uint
Identification rate limit, in milliseconds. Optional, default 1000.
Each identification consumes at leastrate_limit
milliseconds, and sleep will be executed if the time is less than that. -
timeout
: uint
Timeout for recognizing, in milliseconds. Optional, default is 20,000 milliseconds (20 seconds).
See Callback Protocol (not written yet).