Workflows
This guide provides a comprehensive overview of Temporal Workflows.
In day-to-day conversations, the term Workflow frequently denotes either a Workflow Type, a Workflow Definition, or a Workflow Execution.
Workflow Definition
A Workflow Definition is the code that defines the constraints of a Workflow Execution.
A Workflow Definition is often also referred to as a Workflow Function. In Temporal's documentation, a Workflow Definition refers to the source for the instance of a Workflow Execution, while a Workflow Function refers to the source for the instance of a Workflow Function Execution.
A Workflow Execution effectively executes once to completion, while a Workflow Function Execution occurs many times during the life of a Workflow Execution.
We strongly recommend that you write a Workflow Definition in a language that has a corresponding Temporal SDK.
Deterministic constraints
A critical aspect of developing Workflow Definitions is ensuring they exhibit certain deterministic traits – that is, making sure that the same Commands are emitted in the same sequence, whenever a corresponding Workflow Function Execution (instance of the Function Definition) is re-executed.
The execution semantics of a Workflow Execution include the re-execution of a Workflow Function.
The use of Workflow APIs in the function is what generates Commands.
For example, using an SDK's "Execute Activity" API generates the ScheduleActivityTask Command. When this API is called upon re-execution, that Command is compared with the Event that is in the same location within the sequence. The Event in the sequence must be an ActivityTaskScheduled Event, where the Activity name is the same as what is in the Command.
If a generated Command doesn't match what it needs to in the existing Event History, then the Workflow Execution returns a non-deterministic error.
The following are the two reasons why a Command might be generated out of sequence or the wrong Command might be generated altogether:
- Code changes are made to a Workflow Definition that is in use by a running Workflow Execution.
- There is intrinsic non-deterministic logic (such as inline random branching).
Code changes can cause non-deterministic behavior
The Workflow Definition can change in very limited ways once there is a Workflow Execution depending on it. To alleviate non-deterministic issues that arise from code changes, we recommend using Workflow Versioning.
For example, let's say we have a Workflow Definition that defines the following sequence:
- Start and wait on a Timer/sleep.
- Spawn and wait on an Activity Execution.
- Complete.
We start a Worker and spawn a Workflow Execution that uses that Workflow Definition. The Worker would emit the StartTimer Command and the Workflow Execution would become suspended.
Before the Timer is up, we change the Workflow Definition to the following sequence:
- Spawn and wait on an Activity Execution.
- Start and wait on a Timer/sleep.
- Complete.
When the Timer fires, the next Workflow Task will cause the Workflow Function to re-execute. The first Command the Worker sees would be the ScheduleActivityTask Command, which wouldn't match the expected TimerStarted Event.
The Workflow Execution would fail and return a non-deterministic error.
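The mismatch in this example can be sketched with a small, self-contained Python model. This is a toy illustration, not Temporal SDK code: `replay`, `NonDeterminismError`, and the `EXPECTED_EVENT` table are hypothetical names, and only the two Command types from the example are modeled.

```python
# Toy model of replay checking; not the Temporal SDK.
# Maps each Command type to the Event type it is expected to have produced.
EXPECTED_EVENT = {
    "StartTimer": "TimerStarted",
    "ScheduleActivityTask": "ActivityTaskScheduled",
}


class NonDeterminismError(Exception):
    """Raised when a replayed Command does not match the recorded Event."""


def replay(commands, history):
    """Compare Commands emitted on re-execution with recorded Events, in order."""
    for position, (command, event) in enumerate(zip(commands, history)):
        if EXPECTED_EVENT[command] != event:
            raise NonDeterminismError(
                f"position {position}: {command} does not match {event}"
            )
    return True


# History recorded before the code change: only the Timer has been started.
history = ["TimerStarted"]

original_definition = ["StartTimer", "ScheduleActivityTask"]
changed_definition = ["ScheduleActivityTask", "StartTimer"]

replay_ok = replay(original_definition, history)

try:
    replay(changed_definition, history)
    mismatch_detected = False
except NonDeterminismError:
    mismatch_detected = True
```

Replaying the original Command order against the recorded History succeeds, while the reordered definition fails at the first position.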
The following are examples of minor changes that would not result in non-determinism errors when re-executing a History that already contains the Events:
- Changing the duration of a Timer.
- Changing the arguments to:
  - The Activity Options in a call to spawn an Activity Execution (local or nonlocal).
  - The Child Workflow Options in a call to spawn a Child Workflow Execution.
  - Call to Signal an External Workflow Execution.
Intrinsic non-deterministic logic
Intrinsic non-determinism is when a Workflow Function Execution might emit a different sequence of Commands on re-execution, regardless of whether all the input parameters are the same.
For example, a Workflow Definition cannot have inline logic that branches (emits a different Command sequence) based on a local time setting or a random number.
In the representative pseudocode below, the local_clock() function returns the local time, rather than Temporal-defined time:
    fn your_workflow() {
        if local_clock().is_before("12pm") {
            await workflow.sleep(duration_until("12pm"))
        } else {
            await your_afternoon_activity()
        }
    }
Each Temporal SDK offers APIs that enable Workflow Definitions to have logic that gets and uses time, random numbers, and data from unreliable resources. When those APIs are used, the results are stored as part of the Event History, which means that a re-executed Workflow Function will issue the same sequence of Commands, even if there is branching involved.
In other words, all operations that do not purely mutate the Workflow Execution's state should occur through a Temporal SDK API.
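The idea of storing such results in the Event History can be sketched as follows. This is a toy Python model under stated assumptions, not the API of any Temporal SDK: `WorkflowContext` and `side_effect` are hypothetical names, and a plain list stands in for the Event History.

```python
import random


class WorkflowContext:
    """Toy stand-in for SDK-provided time/random APIs; not the Temporal SDK."""

    def __init__(self, recorded=None):
        # On first execution `recorded` is empty; on replay it holds prior results.
        self.recorded = list(recorded or [])
        self.cursor = 0

    def side_effect(self, fn):
        """Run fn once and record its result; on replay, return the recorded value."""
        if self.cursor < len(self.recorded):
            value = self.recorded[self.cursor]
        else:
            value = fn()
            self.recorded.append(value)
        self.cursor += 1
        return value


def your_workflow(ctx):
    # The branch depends on a random value, but the value comes from the context,
    # so a replay sees the same value and emits the same Command.
    roll = ctx.side_effect(lambda: random.randint(1, 6))
    return "StartTimer" if roll <= 3 else "ScheduleActivityTask"


first = WorkflowContext()
first_command = your_workflow(first)

# Replay with the values recorded during the first execution.
replayed = WorkflowContext(recorded=first.recorded)
replayed_command = your_workflow(replayed)
```

Because the replay reads the recorded value instead of rolling again, both executions emit the same Command even though the branch depends on a random number.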
Workflow Versioning
The Workflow Versioning feature enables the creation of logical branching inside a Workflow Definition based on a developer-specified version identifier. This feature is useful when Workflow Definition logic needs to be updated but there are running Workflow Executions that currently depend on it. A practical way to handle different versions of Workflow Definitions without using the versioning API is to run the different versions on separate Task Queues.
- How to version Workflow Definitions in Go
- How to version Workflow Definitions in Java
- How to version Workflow Definitions in TypeScript
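As a rough illustration of version-based branching, the following toy Python model shows how a recorded version identifier could keep old and new Workflow Executions on separate code paths. The names `get_version`, `DEFAULT_VERSION`, and the change identifier are hypothetical, and the behavior is an assumption for illustration; the actual versioning APIs differ per SDK and are described in the guides listed above.

```python
DEFAULT_VERSION = -1  # assumption: marks code paths that predate the change


def get_version(markers, change_id, max_supported, replaying):
    """Return the version recorded for change_id, recording one if executing fresh."""
    if change_id in markers:
        return markers[change_id]
    if replaying:
        # The History was produced before the versioned branch existed.
        return DEFAULT_VERSION
    markers[change_id] = max_supported
    return max_supported


def your_workflow(markers, replaying):
    version = get_version(markers, "swap-timer-and-activity", 1, replaying)
    if version == DEFAULT_VERSION:
        return ["StartTimer", "ScheduleActivityTask"]  # original order
    return ["ScheduleActivityTask", "StartTimer"]  # new order


# A Workflow Execution started before the change replays with no marker.
old_commands = your_workflow(markers={}, replaying=True)

# A Workflow Execution started after the change records the new version.
new_markers = {}
new_commands = your_workflow(new_markers, replaying=False)
```

In this sketch, executions that started on the old code keep emitting the original Command order, while new executions record the version and take the new branch.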
Handling unreliable Worker Processes
You do not handle Worker Process failure or restarts in a Workflow Definition.
Workflow Function Executions are completely oblivious to the Worker Process in terms of failures or downtime. The Temporal Platform ensures that the state of a Workflow Execution is recovered and progress resumes if there is an outage of either Worker Processes or the Temporal Cluster itself. The only reason a Workflow Execution might fail is due to the code throwing an error or exception, not because of underlying infrastructure outages.
Workflow Type
A Workflow Type is a name that maps to a Workflow Definition.
- A single Workflow Type can be instantiated as multiple Workflow Executions.
- A Workflow Type is scoped by a Task Queue. It is acceptable to have the same Workflow Type name map to different Workflow Definitions if they are using completely different Workers.
Workflow Type cardinality with Workflow Definitions and Workflow Executions
Workflow Execution
A Temporal Workflow Execution is a durable, reliable, and scalable function execution. It is the main unit of execution of a Temporal Application.
Each Temporal Workflow Execution has exclusive access to its local state.
It executes concurrently to all other Workflow Executions, and communicates with other Workflow Executions through Signals and the environment through Activities.
Durability
Durability is the absence of an imposed time limit.
A Workflow Execution is durable because it executes a Temporal Workflow Definition (also called a Temporal Workflow Function), your application code, effectively once and to completion—whether your code executes for seconds or years.
Reliability
Reliability is responsiveness in the presence of failure.
A Workflow Execution is reliable, because it is fully recoverable after a failure. The Temporal Platform ensures the state of the Workflow Execution persists in the face of failures and outages and resumes execution from the latest state.
Scalability
Scalability is responsiveness in the presence of load.
A single Workflow Execution is limited in size and throughput but is scalable because it can Continue-As-New.
Commands and awaitables
A Workflow Execution does two things:
- Issue Commands.
- Wait on Awaitables (often called Futures).
Command generation and waiting
Commands are issued and Awaitables are provided by the use of Workflow APIs in the Workflow Definition.
Commands are generated whenever the Workflow Function is executed. The Worker Process supervises the Command generation and makes sure that it maps to the current Event History. (For more information, see Deterministic constraints.) The Worker Process batches the Commands and then suspends progress to send the Commands to the Cluster whenever the Workflow Function reaches a place where it can no longer progress without a result from an Awaitable.
A Workflow Execution may only ever block progress on an Awaitable that is provided through a Temporal SDK API. Awaitables are provided when using APIs for the following:
- Awaiting: Progress can block using explicit "Await" APIs.
- Requesting cancellation of another Workflow Execution: Progress can block on confirmation that the other Workflow Execution is cancelled.
- Sending a Signal: Progress can block on confirmation that the Signal was sent.
- Spawning a Child Workflow Execution: Progress can block on confirmation that the Child Workflow Execution started, and on the result of the Child Workflow Execution.
- Spawning an Activity Execution: Progress can block on the result of the Activity Execution.
- Starting a Timer: Progress can block until the Timer fires.
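The batching behavior described in this section can be sketched as a toy Python model. This is an illustration under stated assumptions, not Temporal SDK code: `run_workflow_task`, the step tuples, and the keys are hypothetical, and a Workflow Function is reduced to a flat list of "command" and "await" steps.

```python
def run_workflow_task(steps, results, already_sent):
    """Re-execute the Workflow Function and batch new Commands until blocked.

    steps: ("command", name, key) or ("await", key) tuples, in program order.
    results: keys whose Awaitables have resolved.
    already_sent: keys of Commands sent by earlier Workflow Tasks.
    Returns (batch, blocked_on); blocked_on is None when the function finished.
    """
    batch = []
    for step in steps:
        if step[0] == "command":
            _, name, key = step
            if key not in already_sent:
                batch.append(name)
                already_sent.add(key)
        else:
            _, key = step
            if key not in results:
                return batch, key  # cannot progress without this result
    return batch, None


steps = [
    ("command", "StartTimer", "timer"),
    ("await", "timer"),
    ("command", "ScheduleActivityTask", "activity"),
    ("await", "activity"),
]

sent = set()
first = run_workflow_task(steps, set(), sent)
second = run_workflow_task(steps, {"timer"}, sent)
third = run_workflow_task(steps, {"timer", "activity"}, sent)
```

Each call stands for one Workflow Task: the function runs from the top, collects the Commands it has not sent before, and stops at the first Awaitable that has no result yet.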
Status
A Workflow Execution can be either Open or Closed.
Workflow Execution statuses
Open
- Running: The only Open status for a Workflow Execution. When the Workflow Execution is Running, it is either actively progressing or is waiting on something.
Closed
A Closed status means that the Workflow Execution cannot make further progress because of one of the following reasons:
- Cancelled: The Workflow Execution successfully handled a cancellation request.
- Completed: The Workflow Execution has completed successfully.
- Continued-As-New: The Workflow Execution Continued-As-New.
- Failed: The Workflow Execution returned an error and failed.
- Terminated: The Workflow Execution was terminated.
- Timed Out: The Workflow Execution reached a timeout limit.
Workflow Execution Chain
A Workflow Execution Chain is a sequence of Workflow Executions that share the same Workflow Id. Each link in the Chain is often called a Workflow Run. Each Workflow Run in the sequence is connected by one of the following:
- Continue-As-New
- Retries
- Temporal Cron Job
A Workflow Execution is uniquely identified by its Namespace, Workflow Id, and Run Id.
The Workflow Execution Timeout applies to a Workflow Execution Chain, while the Workflow Run Timeout applies to a single Workflow Execution (Workflow Run).
Event loop
A Workflow Execution is made up of a sequence of Events called an Event History. Events are created by the Temporal Cluster in response to external occurrences and Commands generated by a Workflow Execution.
Workflow Execution
Time constraints
Is there a limit to how long Workflows can run?
No, there is no time constraint on how long a Workflow Execution can be Running.
However, Workflow Executions intended to run indefinitely should be written with some care. The Temporal Cluster stores the complete Event History for the entire lifecycle of a Workflow Execution. There is a hard limit of 50,000 Events in a Workflow Execution Event History, as well as a hard limit of 50 MB in terms of size. The Temporal Cluster logs a warning at every 10,000 Events. When the Event History reaches 50,000 Events or the size limit of 50 MB, the Workflow Execution is forcefully terminated.
To prevent runaway Workflow Executions, you can use the Workflow Execution Timeout, the Workflow Run Timeout, or both. A Workflow Execution Timeout can be used to limit the duration of a Workflow Execution Chain, and a Workflow Run Timeout can be used to limit the duration of an individual Workflow Execution (Run).
You can use the Continue-As-New feature to complete the current Workflow Execution and start a new one, with a fresh Event History, before the limits are reached.
Command
A Command is a requested action issued by a Worker to the Temporal Cluster after a Workflow Task Execution completes.
The action that the Cluster takes is recorded in the Workflow Execution's Event History as an Event.
Commands are generated by the use of Workflow APIs in your code. During a Workflow Task Execution there may be several Commands that are generated. The Commands are batched and sent to the Cluster as part of the Workflow Task Execution completion request, after the Workflow Task has progressed as far as it can with the Workflow function. There will always be WorkflowTaskStarted and WorkflowTaskCompleted Events in the Event History when there is a Workflow Task Execution completion request.
Commands are generated by the use of Workflow APIs in your code
Commands are described in the Command reference and are defined in the Temporal gRPC API.
Event
Events are created by the Temporal Cluster in response to external occurrences and Commands generated by a Workflow Execution. Each Event corresponds to an enum that is defined in the Server API.
All Events are recorded in the Event History.
A list of all possible Events that could appear in a Workflow Execution Event History is provided in the Event reference.
Event History
An Event History is an append-only log of Events that represents the full state of a Workflow Execution.
- Event History is durably persisted by the Temporal service, enabling seamless recovery of your application state from crashes or failures.
- It also serves as an audit log for debugging.
Event History limits
The Temporal Cluster stores the complete Event History for the entire lifecycle of a Workflow Execution. There is a hard limit of 50,000 Events in a Workflow Execution Event History, as well as a hard limit of 50 MB in terms of size. The Temporal Cluster logs a warning at every 10,000 Events. When the Event History reaches 50,000 Events or the size limit of 50 MB, the Workflow Execution is forcefully terminated.
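The limits stated above can be expressed as a small check. This is a hedged Python sketch, not Cluster code: `history_check` is a hypothetical helper, and reading "50 MB" as 50 × 1024 × 1024 bytes is an assumption.

```python
MAX_EVENTS = 50_000
MAX_BYTES = 50 * 1024 * 1024  # assumption: 50 MB interpreted as 50 MiB
WARN_EVERY = 10_000


def history_check(event_count, size_bytes):
    """Classify an Event History against the limits stated above."""
    if event_count >= MAX_EVENTS or size_bytes >= MAX_BYTES:
        return "terminate"
    if event_count > 0 and event_count % WARN_EVERY == 0:
        return "warn"
    return "ok"
```

A long-running Workflow could use a similar threshold, set well below the hard limit, to decide when to Continue-As-New.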
Continue-As-New
Continue-As-New is a mechanism by which the latest relevant state is passed to a new Workflow Execution, with a fresh Event History.
As a precautionary measure, the Temporal Platform limits the total Event History to 50,000 Events or 50 MB.
All values passed to a Workflow Execution through parameters or returned through a result value are recorded into the Event History. A Temporal Cluster stores the full Event History of a Workflow Execution for the duration of a Namespace's retention period. A Workflow Execution that periodically executes many Activities has the potential of hitting the size limit.
A very large Event History can adversely affect the performance of a Workflow Execution. For example, in the case of a Workflow Worker failure, the full Event History must be pulled from the Temporal Cluster and given to another Worker via a Workflow Task. If the Event History is very large, it may take some time to load it.
The Continue-As-New feature enables developers to complete the current Workflow Execution and start a new one atomically.
The new Workflow Execution has the same Workflow Id, but a different Run Id, and has its own Event History.
In the case of Temporal Cron Jobs, Continue-As-New is used internally for the same effect.
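The shape of a chain produced by Continue-As-New can be sketched with a toy Python model. This is an illustration only, not the Temporal SDK: `run_chain`, the dictionary layout, and the Workflow Id "order-42" are hypothetical, and each Run simply processes one batch of items before handing the rest to the next Run.

```python
import uuid


def run_chain(workflow_id, items, batch_size):
    """Process items in Runs that each handle one batch, then continue as new."""
    runs = []
    remaining = list(items)
    while True:
        batch, remaining = remaining[:batch_size], remaining[batch_size:]
        more = bool(remaining)
        runs.append({
            "workflow_id": workflow_id,  # unchanged across the chain
            "run_id": str(uuid.uuid4()),  # new for every Run
            "events": [f"processed:{item}" for item in batch],  # fresh history
            "status": "Continued-As-New" if more else "Completed",
        })
        if not more:
            return runs


chain = run_chain("order-42", items=range(5), batch_size=2)
```

Every Run in the resulting chain shares the Workflow Id, has its own Run Id, and carries only the Events of its own batch, so no single Event History grows without bound.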
Run Id
A Run Id is a globally unique, platform-level identifier for a Workflow Execution.
Temporal guarantees that only one Workflow Execution with a given Workflow Id can be Open at any given time.
A Run Id uniquely identifies a Workflow Execution even if it shares a Workflow Id with other Workflow Executions.
Don't rely on storing the current Run Id or using it for any logical choices. A Workflow Retry changes the Run Id. Because the current Run Id is mutable, relying on it might produce non-determinism issues.
Workflow Id
A Workflow Id is a customizable, application-level identifier for a Workflow Execution that is unique to an Open Workflow Execution within a Namespace.
A Workflow Id is meant to be a business-process identifier such as customer identifier or order identifier.
A Workflow Id Reuse Policy determines whether a Workflow Execution is allowed to spawn with a particular Workflow Id, if that Workflow Id has been used with a previous, and now Closed, Workflow Execution.
It is not possible for a new Workflow Execution to spawn with the same Workflow Id as another Open Workflow Execution, regardless of the Workflow Id Reuse Policy.
An attempt to spawn a Workflow Execution with a Workflow Id that is the same as the Id of a currently Open Workflow Execution results in a "Workflow execution already started" error.
A Workflow Execution can be uniquely identified across all Namespaces by its Namespace, Workflow Id, and Run Id.
Workflow Id Reuse Policy
A Workflow Id Reuse Policy determines whether a Workflow Execution is allowed to spawn with a particular Workflow Id, if that Workflow Id has been used with a previous, and now Closed, Workflow Execution.
It is not possible for a new Workflow Execution to spawn with the same Workflow Id as another Open Workflow Execution.
An attempt to spawn a Workflow Execution with a Workflow Id that is the same as the Id of a currently Open Workflow Execution results in a "Workflow execution already started" error.
A Workflow Id Reuse Policy has three possible values:
- Allow Duplicate: The Workflow Execution is allowed to exist regardless of the Closed status of a previous Workflow Execution with the same Workflow Id. This is the default policy, if one is not specified. Use this when it is OK to have a Workflow Execution with the same Workflow Id as a previous, but now Closed, Workflow Execution.
- Allow Duplicate Failed Only: The Workflow Execution is allowed to exist only if a previous Workflow Execution with the same Workflow Id does not have a Completed status. Use this policy when there is a need to re-execute a Failed, Timed Out, Terminated or Cancelled Workflow Execution and guarantee that the Completed Workflow Execution will not be re-executed.
- Reject Duplicate: The Workflow Execution cannot exist if a previous Workflow Execution has the same Workflow Id, regardless of the Closed status. Use this when there can only be one Workflow Execution per Workflow Id within a Namespace for the given retention period.
A Workflow Id Reuse Policy applies only if a Closed Workflow Execution with the same Workflow Id exists within the Retention Period of the associated Namespace. For example, if the Namespace's retention period is 30 days, a Workflow Id Reuse Policy can only compare the Workflow Id of the spawning Workflow Execution against the Closed Workflow Executions for the last 30 days.
If there is an attempt to spawn a Workflow Execution with a Workflow Id Reuse Policy that won't allow it, the Server will prevent the Workflow Execution from spawning.
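The three policy values described in this section can be summarized as a decision function. The following Python sketch is a toy model and not the Server's implementation: `ReusePolicy` and `may_spawn` are hypothetical names, and statuses are passed as plain strings.

```python
from enum import Enum


class ReusePolicy(Enum):
    ALLOW_DUPLICATE = "Allow Duplicate"
    ALLOW_DUPLICATE_FAILED_ONLY = "Allow Duplicate Failed Only"
    REJECT_DUPLICATE = "Reject Duplicate"


def may_spawn(policy, previous_status):
    """Decide whether a new Workflow Execution may reuse a Workflow Id.

    previous_status is the status of the most recent execution with the same
    Workflow Id inside the retention period, or None if there is none.
    """
    if previous_status is None:
        return True
    if previous_status == "Running":
        return False  # an Open execution always blocks reuse
    if policy is ReusePolicy.ALLOW_DUPLICATE:
        return True
    if policy is ReusePolicy.ALLOW_DUPLICATE_FAILED_ONLY:
        return previous_status != "Completed"
    return False  # REJECT_DUPLICATE
```

For example, `may_spawn(ReusePolicy.ALLOW_DUPLICATE_FAILED_ONLY, "Failed")` allows the spawn, while the same policy with a previous "Completed" status does not.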
Workflow Execution Timeout
A Workflow Execution Timeout is the maximum time that a Workflow Execution can be executing (have an Open status) including retries and any usage of Continue-As-New.
Workflow Execution Timeout period
The default value is ∞ (infinite).
If this timeout is reached, the Workflow Execution changes to a Timed Out status.
This timeout is different from the Workflow Run Timeout, which limits a single Workflow Run.
Workflow Run Timeout
A Workflow Run Timeout is the maximum amount of time that a single Workflow Run is restricted to.
Workflow Run Timeout period
The default is set to the same value as the Workflow Execution Timeout.
If the Workflow Run Timeout is reached, the Workflow Execution changes to a Timed Out status.
Workflow Task Timeout
A Workflow Task Timeout is the maximum amount of time allowed for a Worker to execute a Workflow Task after the Worker has pulled that Workflow Task from the Task Queue.
Workflow Task Timeout period
The default value is 10 seconds. This timeout is primarily available to recognize whether a Worker has gone down so that the Workflow Execution can be recovered on a different Worker. The main reason for increasing the default value would be to accommodate a Workflow Execution that has a very long Workflow Execution History that could take longer than 10 seconds for the Worker to load.
Signal
A Signal is an asynchronous request to a Workflow Execution.
A Signal delivers data to a running Workflow Execution. It cannot return data to the caller; to do so, use a Query instead. The Workflow code that handles a Signal can mutate Workflow state. A Signal can be sent from a Temporal Client or a Workflow. When a Signal is sent, it is received by the Cluster and recorded as an Event to the Workflow Execution Event History. A successful response from the Cluster means that the Signal has been persisted and will be delivered at least once to the Workflow Execution. The next scheduled Workflow Task will contain the Signal Event.
A Signal must include a destination (Namespace and Workflow Id) and name. It can include a list of arguments.
Signal handlers are Workflow functions that listen for Signals by the Signal name. Signals are delivered in the order they are received by the Cluster. If multiple deliveries of a Signal would be a problem for your Workflow, add idempotency logic to your Signal handler that checks for duplicates.
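The duplicate check mentioned above can be sketched as follows. This is a toy Python model, not Temporal SDK code: `OrderWorkflowState` and `handle_add_item` are hypothetical names, and it assumes that the sender includes a unique identifier in the Signal arguments.

```python
class OrderWorkflowState:
    """Toy Workflow state with an idempotent Signal handler; not the Temporal SDK."""

    def __init__(self):
        self.seen_signal_ids = set()
        self.items = []

    def handle_add_item(self, signal_id, item):
        """Apply the Signal once, even if it is delivered more than once."""
        if signal_id in self.seen_signal_ids:
            return False  # duplicate delivery, ignore
        self.seen_signal_ids.add(signal_id)
        self.items.append(item)
        return True


state = OrderWorkflowState()
deliveries = [("sig-1", "book"), ("sig-2", "pen"), ("sig-1", "book")]
applied = [state.handle_add_item(sid, item) for sid, item in deliveries]
```

The third delivery repeats the identifier of the first, so the handler skips it and the Workflow state contains each item only once.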
Query
A Query is a synchronous operation that is used to get the state of a Workflow Execution.
Queries are sent from a Temporal Client to a Workflow Execution. The API call is synchronous. The Query is identified at both ends by a Query name. The Workflow must have a Query handler that is developed to handle that Query and provide data that represents the state of the Workflow Execution.
Queries are strongly consistent and are guaranteed to return the most recent state. This means that the data reflects the state of all confirmed Events that came in before the Query was sent. An Event is considered confirmed if the call creating the Event returned success. Events that are created while the Query is outstanding may or may not be reflected in the Workflow state the Query result is based on.
A Query can carry arguments to specify the data it is requesting. And each Workflow can expose data to multiple types of Queries.
A Query must never mutate the state of the Workflow Execution—that is, Queries are read-only and cannot contain any blocking code. This means, for example, that Query handling logic cannot schedule Activity Executions.
Sending Queries to completed Workflow Executions is supported, though Query reject conditions can be configured per Query.
Stack Trace Query
In many SDKs, the Temporal Client exposes a predefined __stack_trace Query that returns the stack trace of all the threads owned by that Workflow Execution.
This is a great way to troubleshoot a Workflow Execution in production.
For example, if a Workflow Execution has been stuck at a state for longer than an expected period of time, you can send a __stack_trace Query to return the current call stack.
The __stack_trace Query name does not require special handling in your Workflow code.
Stack Trace Queries are available only for running Workflow Executions.
Child Workflow
A Child Workflow Execution is a Workflow Execution that is spawned from within another Workflow.
A Workflow Execution can be both a Parent and a Child Workflow Execution because any Workflow can spawn another Workflow.
Parent and Child Workflow Execution entity relationship
A Parent Workflow Execution must await on the Child Workflow Execution to spawn.
The Parent can optionally await on the result of the Child Workflow Execution.
Consider the Child's Parent Close Policy, which determines what happens to the Child Workflow Execution if its Parent Workflow Execution changes to a Closed status (Completed, Failed, or Timed Out).
When a Parent Workflow Execution reaches a Closed status, the Cluster propagates Cancellation Requests or Terminations to Child Workflow Executions depending on the Child's Parent Close Policy.
If a Child Workflow Execution uses Continue-As-New, from the Parent Workflow Execution's perspective the entire chain of Runs is treated as a single execution.
Parent and Child Workflow Execution entity relationship with Continue As New
When to use Child Workflows
Consider Workflow Execution Event History size limits.
An individual Workflow Execution has an Event History size limit.
On one hand, because Child Workflow Executions have their own Event Histories, they are often used to partition large workloads into smaller chunks.
For example, a single Workflow Execution does not have enough space in its Event History to spawn 100,000 Activity Executions.
However, because a Parent Workflow Execution Event History contains Events that correspond to each Child Workflow Execution, spawning a very large number of Child Workflows can itself push the Parent toward the Event History limits.
In general, however, Child Workflow Executions result in more overall Events recorded in Event Histories than Activities. Because each entry in an Event History is a cost in terms of compute resources, this could become a factor in very large workloads. Therefore, we recommend starting with a single Workflow implementation that uses Activities until there is a clear need for Child Workflows.
Consider each Child Workflow Execution as a separate service.
Because a Child Workflow Execution can be processed by a completely separate set of Workers than the Parent Workflow Execution, it can act as an entirely separate service.
Consider that a single Child Workflow Execution can represent a single resource.
Like all Workflow Executions, a Child Workflow Execution can create a one-to-one mapping with a resource. For example, a Workflow that manages host upgrades could spawn a Child Workflow Execution per host.
Parent Close Policy
A Parent Close Policy determines what happens to a Child Workflow Execution if its Parent Workflow Execution changes to a Closed status (Completed, Failed, or Timed Out).
There are three possible values:
- Abandon: the Child Workflow Execution is not affected.
- Request Cancel: a Cancellation request is sent to the Child Workflow Execution.
- Terminate (default): the Child Workflow Execution is forcefully Terminated.
See the ParentClosePolicy proto definition.
Each Child Workflow Execution may have its own Parent Close Policy. This policy applies only to Child Workflow Executions and has no effect otherwise.
Parent Close Policy entity relationship
You can set policies per child, which means you can opt out of propagating Terminations and Cancellation requests on a per-child basis. This is useful for starting Child Workflows asynchronously (see the corresponding SDK docs).
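As a rough illustration of the three values, the following Python sketch models what happens to each Child when its Parent closes. The names `ParentClosePolicy`, `Child`, and `on_parent_close` are hypothetical and are not part of any Temporal SDK; the sketch only mirrors the behavior described above.

```python
from dataclasses import dataclass
from enum import Enum


class ParentClosePolicy(Enum):
    # Hypothetical enum mirroring the three values described above.
    ABANDON = "abandon"
    REQUEST_CANCEL = "request_cancel"
    TERMINATE = "terminate"


@dataclass
class Child:
    workflow_id: str
    # Terminate is the default policy.
    policy: ParentClosePolicy = ParentClosePolicy.TERMINATE


def on_parent_close(children):
    """Return the action applied to each Child when the Parent closes."""
    actions = {}
    for child in children:
        if child.policy is ParentClosePolicy.ABANDON:
            actions[child.workflow_id] = "keep running"
        elif child.policy is ParentClosePolicy.REQUEST_CANCEL:
            actions[child.workflow_id] = "send cancellation request"
        else:
            actions[child.workflow_id] = "terminate"
    return actions


children = [
    Child("audit-log", ParentClosePolicy.ABANDON),
    Child("cleanup", ParentClosePolicy.REQUEST_CANCEL),
    Child("host-upgrade"),  # no policy given, so the default applies
]
print(on_parent_close(children))
```

Because the policy is stored per Child, one Parent can abandon some Children while terminating others.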
Temporal Cron Job
A Temporal Cron Job is the series of Workflow Executions that occur when a Cron Schedule is provided in the call to spawn a Workflow Execution.
Temporal Cron Job timeline
A Temporal Cron Job is similar to a classic Unix cron job. Just as a Unix cron job accepts a command and a schedule on which to execute that command, a Cron Schedule can be provided with the call to spawn a Workflow Execution. If a Cron Schedule is provided, the Temporal Server will spawn an execution for the associated Workflow Type per the schedule.
Each Workflow Execution within the series is considered a Run.
- Each Run receives the same input parameters as the initial Run.
- Each Run inherits the same Workflow Options as the initial Run.
The Temporal Server spawns the first Workflow Execution in the chain of Runs immediately.
However, it calculates and applies a backoff (firstWorkflowTaskBackoff) so that the first Workflow Task of the Workflow Execution does not get placed into a Task Queue until the scheduled time.
After each Run Completes, Fails, or reaches the Workflow Run Timeout, the next Run is spawned with a new firstWorkflowTaskBackoff that is calculated based on the current Server time and the defined Cron Schedule.
The Temporal Server spawns the next Run only after the current Run has Completed, Failed, or has reached the Workflow Run Timeout.
This means that, if a Retry Policy has also been provided, and a Run Fails or reaches the Workflow Run Timeout, the Run will first be retried per the Retry Policy until the Run Completes or the Retry Policy has been exhausted.
If the next Run, per the Cron Schedule, is due to spawn while the current Run is still Open (including retries), the Server automatically starts the new Run after the current Run completes successfully.
The start time for this new Run and the Cron definitions are used to calculate the firstWorkflowTaskBackoff that is applied to the new Run.
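The backoff calculation described above can be sketched in a few lines of Python. This is not the Server's implementation: the helper names are hypothetical, the schedule is limited to a fixed daily time in UTC to keep the example short, and treating "next" as strictly after the current time is an assumption.

```python
from datetime import datetime, timedelta, timezone


def next_daily_run(now, hour, minute):
    """Next UTC time strictly after `now` that matches a daily HH:MM schedule."""
    candidate = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    if candidate <= now:
        candidate += timedelta(days=1)
    return candidate


def first_workflow_task_backoff(now, hour, minute):
    """Delay between a Run being spawned and its first Workflow Task."""
    return next_daily_run(now, hour, minute) - now


# A Run for the schedule "15 8 * * *" is spawned at 06:00 UTC.
spawned_at = datetime(2024, 1, 1, 6, 0, tzinfo=timezone.utc)
backoff = first_workflow_task_backoff(spawned_at, hour=8, minute=15)
print(backoff)  # 2:15:00
```

The Run exists from 06:00, but its first Workflow Task is held back for 2 hours and 15 minutes so that work begins at 08:15.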
A Workflow Execution Timeout is the maximum time that a Workflow Execution can be executing (have an Open status), including retries and any usage of Continue As New.
Temporal Cron Job Run Failure with a Retry Policy
Cron Schedules
Cron Schedules are interpreted in UTC time by default.
The Cron Schedule is provided as a string and must follow one of two specifications:
Classic specification
This is what the "classic" specification looks like:
┌───────────── minute (0 - 59)
│ ┌───────────── hour (0 - 23)
│ │ ┌───────────── day of the month (1 - 31)
│ │ │ ┌───────────── month (1 - 12)
│ │ │ │ ┌───────────── day of the week (0 - 6) (Sunday to Saturday)
│ │ │ │ │
│ │ │ │ │
* * * * *
For example, 15 8 * * * causes a Workflow Execution to spawn daily at 8:15 AM UTC.
Use the crontab guru site to test your cron expressions.
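To make the five fields concrete, here is a minimal Python sketch that checks whether a given UTC time matches a classic specification. It is an illustration only and not the parser Temporal uses: `expand` and `matches` are hypothetical helpers, only numeric values, lists, ranges, and steps are handled, and the day-of-month/day-of-week rule follows the usual classic cron convention, which is an assumption here.

```python
from datetime import datetime

# (low, high) bounds for minute, hour, day of month, month, day of week.
BOUNDS = [(0, 59), (0, 23), (1, 31), (1, 12), (0, 6)]


def expand(field, low, high):
    """Expand one field (*, lists, ranges, steps) into a set of integers."""
    values = set()
    for part in field.split(","):
        rng, _, step = part.partition("/")
        if rng == "*":
            start, end = low, high
        elif "-" in rng:
            start, end = (int(x) for x in rng.split("-"))
        else:
            start = int(rng)
            end = high if step else start
        values.update(range(start, end + 1, int(step or 1)))
    return values


def matches(spec, when):
    """Return True if `when` (treated as UTC) matches the five-field spec."""
    fields = spec.split()
    if len(fields) != 5:
        raise ValueError("expected five fields")
    minute, hour, dom, month, dow = (
        expand(f, lo, hi) for f, (lo, hi) in zip(fields, BOUNDS)
    )
    cron_weekday = (when.weekday() + 1) % 7  # cron counts Sunday as 0
    dom_ok = when.day in dom
    dow_ok = cron_weekday in dow
    # Assumed convention: if both day fields are restricted, either may match.
    if fields[2] != "*" and fields[4] != "*":
        day_ok = dom_ok or dow_ok
    else:
        day_ok = dom_ok and dow_ok
    return (
        when.minute in minute
        and when.hour in hour
        and when.month in month
        and day_ok
    )


print(matches("15 8 * * *", datetime(2024, 1, 1, 8, 15)))  # True
print(matches("15 8 * * *", datetime(2024, 1, 1, 8, 16)))  # False
```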
robfig predefined schedules and intervals
You can also pass any of the predefined schedules or intervals described in the robfig/cron documentation.
| Schedules | Description | Equivalent To |
| ---------------------- | ------------------------------------------ | ------------- |
| @yearly (or @annually) | Run once a year, midnight, Jan. 1st | 0 0 1 1 * |
| @monthly | Run once a month, midnight, first of month | 0 0 1 * * |
| @weekly | Run once a week, midnight between Sat/Sun | 0 0 * * 0 |
| @daily (or @midnight) | Run once a day, midnight | 0 0 * * * |
| @hourly | Run once an hour, beginning of hour | 0 * * * * |
For example, "@weekly" causes a Workflow Execution to spawn once a week at midnight between Saturday and Sunday.
Intervals use the form @every <duration>, where <duration> is a string that Go's time.ParseDuration accepts (for example, @every 1h30m).
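The predefined schedules and the interval form can be sketched as a lookup table plus a small parser. The Python below is illustrative only: `PREDEFINED` copies the table above, while `parse_every` is a hypothetical helper that accepts a simplified subset of the duration grammar (h, m, s, and ms units), not everything time.ParseDuration supports.

```python
import re
from datetime import timedelta

# Predefined schedules and their five-field equivalents, from the table above.
PREDEFINED = {
    "@yearly": "0 0 1 1 *",
    "@annually": "0 0 1 1 *",
    "@monthly": "0 0 1 * *",
    "@weekly": "0 0 * * 0",
    "@daily": "0 0 * * *",
    "@midnight": "0 0 * * *",
    "@hourly": "0 * * * *",
}

# Seconds per unit; only a subset of the units a Go duration string allows.
UNIT_SECONDS = {"h": 3600, "m": 60, "s": 1, "ms": 0.001}


def parse_every(spec):
    """Parse '@every 1h30m' into a timedelta (simplified duration grammar)."""
    prefix, _, duration = spec.partition(" ")
    if prefix != "@every" or not duration:
        raise ValueError("expected '@every <duration>'")
    parts = re.findall(r"(\d+(?:\.\d+)?)(ms|h|m|s)", duration)
    # Reject anything the simplified grammar did not fully consume.
    if "".join(number + unit for number, unit in parts) != duration:
        raise ValueError(f"unsupported duration: {duration!r}")
    seconds = sum(float(number) * UNIT_SECONDS[unit] for number, unit in parts)
    return timedelta(seconds=seconds)


print(PREDEFINED["@weekly"])        # 0 0 * * 0
print(parse_every("@every 1h30m"))  # 1:30:00
```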
Time zones
This feature is available only in Temporal 1.15 and later.
You can change the time zone that a Cron Schedule is interpreted in by prefixing the specification with CRON_TZ=America/New_York (or your desired time zone from the tz database). CRON_TZ=America/New_York 15 8 * * * therefore spawns a Workflow Execution every day at 8:15 AM New York time, subject to caveats listed below.
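The prefix is simply part of the schedule string. As a small sketch (the helper `split_cron_tz` is hypothetical and not part of any Temporal SDK), the time zone and the specification can be separated like this:

```python
def split_cron_tz(schedule, default_tz="UTC"):
    """Split an optional CRON_TZ=<zone> prefix from a Cron Schedule string."""
    prefix = "CRON_TZ="
    if schedule.startswith(prefix):
        zone, _, spec = schedule[len(prefix):].partition(" ")
        return zone, spec.strip()
    # No prefix: the schedule is interpreted in UTC by default.
    return default_tz, schedule.strip()


print(split_cron_tz("CRON_TZ=America/New_York 15 8 * * *"))
# ('America/New_York', '15 8 * * *')
print(split_cron_tz("15 8 * * *"))
# ('UTC', '15 8 * * *')
```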
Consider that using time zones in production introduces a surprising amount of complexity and failure modes! If at all possible, we recommend specifying Cron Schedules in UTC (the default).
If you need to use time zones, here are a few edge cases to keep in mind:
- Beware Daylight Saving Time: If a Temporal Cron Job is scheduled around the time when daylight saving time (DST) begins or ends (for example, 30 2 * * *), it might run zero, one, or two times in a day! The Cron library that we use does not do any special handling of DST transitions. Avoid schedules that include times that fall within DST transition periods.
  - For example, in the US, DST begins and ends at 2 AM. When you "fall back," the clock goes 1:59 … 1:00 … 1:01 … 1:59 … 2:00 … 2:01 AM and any Cron jobs that fall in that 1 AM hour are fired again. The inverse happens when clocks "spring forward" for DST, and Cron jobs that fall in the 2 AM hour are skipped.
  - In other time zones like Chile and Iran, DST "spring forward" is at midnight. 11:59 PM is followed by 1 AM, which means 00:00:00 never happens.
- Self Hosting note: If you manage your own Temporal Cluster, you are responsible for ensuring that it has access to current tzdata files. The official Docker images are built with tzdata installed (provided by Alpine Linux), but ultimately you should be aware of how tzdata is deployed and updated in your infrastructure.
- Updating Temporal: If you use the official Docker images, note that an upgrade of the Temporal Cluster may include an update to the tzdata files, which may change the meaning of your Cron Schedule. You should be aware of upcoming changes to the definitions of the time zones you use, particularly around daylight saving time start/end dates.
- Absolute Time Fixed at Start: The absolute start time of the next Run is computed and stored in the database when the previous Run completes, and is not recomputed. This means that if you have a Cron Schedule that runs very infrequently, and the definition of the time zone changes between one Run and the next, the Run might happen at the wrong time. For example, CRON_TZ=America/Los_Angeles 0 12 11 11 * means "noon in Los Angeles on November 11" (normally not in DST). If at some point the government makes any changes (for example, moving the end of DST one week later, or staying on permanent DST year-round), the meaning of that specification changes. In that first year, the Run happens at the wrong time, because it was computed using the older definition.
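The DST edge cases above can be reproduced with Python's standard zoneinfo module. This sketch is independent of Temporal and only demonstrates the wall-clock behavior; it assumes the host has a tz database available (the same tzdata dependency mentioned in the self-hosting note), and the 2024 transition dates for America/New_York are used as an example.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo  # requires a tz database on the host

NEW_YORK = ZoneInfo("America/New_York")


def normalize(local):
    """Round-trip a local wall-clock time through UTC."""
    return local.astimezone(timezone.utc).astimezone(NEW_YORK)


# Spring forward (2024-03-10): 2:30 AM never appears on the wall clock.
missing = datetime(2024, 3, 10, 2, 30, tzinfo=NEW_YORK)
print(normalize(missing).strftime("%H:%M"))  # 03:30

# Fall back (2024-11-03): 1:30 AM happens twice, one hour apart.
first = datetime(2024, 11, 3, 1, 30, tzinfo=NEW_YORK, fold=0)
second = datetime(2024, 11, 3, 1, 30, tzinfo=NEW_YORK, fold=1)
gap = second.astimezone(timezone.utc) - first.astimezone(timezone.utc)
print(gap)  # 1:00:00
```

A schedule such as 30 2 * * * lands in the skipped hour in March, and a schedule in the 1 AM hour matches two distinct instants in November, which is why such times are best avoided.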
How to stop a Temporal Cron Job
A Temporal Cron Job does not stop spawning Runs until it has been Terminated or until the Workflow Execution Timeout is reached.
A Cancellation Request affects only the current Run.
Use the Workflow Id in any requests to Cancel or Terminate.
Implementation guides:
- How to set a Cron Schedule in Go
- How to set a Cron Schedule in Java
- How to set a Cron Schedule in PHP
- How to set a Cron Schedule in TypeScript
Schedule
A Schedule contains instructions for starting a Workflow Execution at specific times.
A Schedule has an identity and is independent of a Workflow Execution. This differs from a Temporal Cron Job, which relies on a cron schedule as a property of the Workflow Execution.
Action
The Action of a Schedule is where the Workflow Execution properties are established, such as Workflow Type, Task Queue, parameters, and timeouts.
Workflow Executions started by a Schedule have the following additional properties:
- The Action's timestamp is appended to the Workflow Id.
- The TemporalScheduledStartTime Search Attribute is added to the Workflow Execution. The value is the Action's timestamp.
- The TemporalScheduledById Search Attribute is added to the Workflow Execution. The value is the Schedule Id.
Spec
The Schedule Spec describes when the Action is taken. There are two kinds of Schedule Spec:
- A simple interval, like "every 30 minutes" (aligned to start at the Unix epoch, and optionally including a phase offset).
- A calendar-based expression, similar to the "cron expressions" supported by lots of software, including the older Temporal Cron feature.
These two kinds have multiple representations, depending on the interface or SDK you're using, but they all support the same features.
In tctl, for example, an interval is specified as a string like 45m to mean every 45 minutes, or 6h/5h to mean every 6 hours but at the start of the fifth hour within each period.
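One way to picture an epoch-aligned interval with a phase offset is as a grid of times starting at the Unix epoch plus the phase. The Python sketch below is an interpretation of the description above, not Temporal's implementation: `next_interval_action` is a hypothetical helper, and it reads the value after the slash in 6h/5h as a 5-hour offset into each 6-hour period.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def next_interval_action(now, interval, phase=timedelta(0)):
    """Next time strictly after `now` on the epoch-aligned interval grid."""
    elapsed = now - EPOCH - phase
    periods = elapsed // interval + 1  # whole periods completed, plus one
    return EPOCH + phase + periods * interval


now = datetime(2024, 1, 1, 7, 30, tzinfo=timezone.utc)

# "6h/5h": every 6 hours, offset 5 hours into each period.
print(next_interval_action(now, timedelta(hours=6), timedelta(hours=5)))
# 2024-01-01 11:00:00+00:00

# "45m": every 45 minutes, aligned to the epoch with no offset.
print(next_interval_action(now, timedelta(minutes=45)))
# 2024-01-01 08:15:00+00:00
```

Because the grid is anchored to the epoch rather than to the Schedule's creation time, two Schedules with the same interval and phase take Actions at the same instants.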
In tctl, a calendar expression can be specified as either a traditional cron string with five (or six or seven) positional fields, or as JSON with named fields:
{
  "year": "2022",
  "month": "Jan,Apr,Jul,Oct",
  "dayOfMonth": "1,15",
  "hour": "11-14"
}
The following calendar JSON fields are available:
year
month
dayOfMonth
dayOfWeek
hour
minute
second
comment
Each field can contain a comma-separated list of ranges (or the * wildcard), and each range can include a slash followed by a skip value.
The hour, minute, and second fields default to 0 while the others default to *, so you can describe many useful specs with only a few fields.
For month, names of months may be used instead of integers (case-insensitive, abbreviations permitted).
For dayOfWeek, day-of-week names may be used.
The comment field is optional and can be used to include a free-form description of the intent of the calendar spec, useful for complicated specs.
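The field defaults and range syntax can be sketched as follows. This Python is illustrative only and is not how tctl or the Server parses calendar specs: `with_defaults` and `expand_ranges` are hypothetical helpers, and month or day-of-week names and other forms are deliberately not handled.

```python
# Defaults described above: time-of-day fields are 0, the rest are wildcards.
DEFAULTS = {
    "year": "*", "month": "*", "dayOfMonth": "*", "dayOfWeek": "*",
    "hour": "0", "minute": "0", "second": "0",
}


def with_defaults(spec):
    """Fill in unspecified calendar fields; a comment is carried through."""
    return {**DEFAULTS, **spec}


def expand_ranges(field, low, high):
    """Expand numeric 'a-b/skip' ranges (or '*') into a sorted list."""
    values = set()
    for part in field.split(","):
        rng, _, skip = part.partition("/")
        if rng == "*":
            start, end = low, high
        else:
            start, _, end = rng.partition("-")
            start, end = int(start), int(end or start)
        values.update(range(start, end + 1, int(skip or 1)))
    return sorted(values)


spec = with_defaults({"dayOfMonth": "1,15", "hour": "11-14"})
print(spec["minute"], spec["month"])             # 0 *
print(expand_ranges(spec["hour"], 0, 23))        # [11, 12, 13, 14]
print(expand_ranges(spec["dayOfMonth"], 1, 31))  # [1, 15]
print(expand_ranges("*/6", 0, 23))               # [0, 6, 12, 18]
```

With the defaults applied, this spec means 11:00:00, 12:00:00, 13:00:00, and 14:00:00 on the 1st and 15th of every month.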
No matter which form you supply, calendar and interval specs are converted to canonical representations. What you see when you "describe" or "list" a Schedule might not look exactly like what you entered, but it has the same meaning.
Other Spec features:
Multiple intervals/calendar expressions: A Spec can have combinations of multiple intervals and/or calendar expressions to define a specific Schedule.
Time bounds: Provide an absolute start or end time (or both) with a Spec to ensure that no actions are taken before the start time or after the end time.
Exclusions: A Spec can contain exclusions in the form of zero or more calendar expressions. This can be used to express scheduling like "each Monday at noon except for holidays." You'll have to provide your own set of exclusions and include it in each Schedule; there are no predefined sets. (This feature isn't currently exposed in tctl or the Temporal Web UI.)
Jitter: If given, a random offset between zero and the maximum jitter is added to each Action time (but bounded by the time until the next scheduled Action).
Time zones: By default, calendar-based expressions are interpreted in UTC. Temporal recommends using UTC to avoid various surprising properties of time zones. If you don't want to use UTC, you can provide the name of a time zone. The time zone definition is loaded on the Temporal Server Worker Service from either disk or the fallback embedded in the binary.
For more operational control, embed the contents of the time zone database file in the Schedule Spec itself. (Note: this isn't currently exposed in tctl or the web UI.)
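The Jitter rule above (a random offset up to the maximum, bounded by the time until the next scheduled Action) can be sketched in Python. The helper `apply_jitter` is hypothetical and the use of a uniform distribution is an assumption; the source only says the offset is random.

```python
import random
from datetime import datetime, timedelta, timezone


def apply_jitter(action_time, next_action_time, max_jitter, rng=random):
    """Add a random offset in [0, max_jitter], capped at the gap to the next Action."""
    bound = min(max_jitter, next_action_time - action_time)
    offset = timedelta(seconds=rng.uniform(0, bound.total_seconds()))
    return action_time + offset


rng = random.Random(7)  # seeded so the example is repeatable
t0 = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
t1 = t0 + timedelta(minutes=30)  # next scheduled Action

# A one-hour maximum is capped at 30 minutes, the time until the next Action.
jittered = apply_jitter(t0, t1, timedelta(hours=1), rng)
print(t0 <= jittered <= t1)  # True
```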
Pause
A Schedule can be Paused. When a Schedule is Paused, the Spec has no effect. However, you can still force manual actions by using the tctl schedule trigger command.
To assist communication among developers and operators, a “notes” field can be updated on pause or resume to store an explanation for the current state.
Backfill
A Schedule can be Backfilled.
When a Schedule is Backfilled, all the Actions that would have been taken over a specified time period are taken now (in parallel if the AllowAll Overlap Policy is used; sequentially if BufferAll is used).
You might use this to fill in runs from a time period when the Schedule was paused due to an external condition that's now resolved, or a period before the Schedule was created.
Limit number of Actions
A Schedule can be limited to a certain number of scheduled Actions (that is, Actions that are not manually triggered to run immediately). After that, it acts as if it were Paused.
Policies
A Schedule supports a set of Policies that enable customizing behavior.
Overlap Policy
The Overlap Policy controls what happens when it is time to start a Workflow Execution but a previously started Workflow Execution is still running. The following options are available:
- Skip: Default. Nothing happens; the Workflow Execution is not started.
- BufferOne: Starts the Workflow Execution as soon as the current one completes. The buffer is limited to one. If another Workflow Execution is supposed to start, but one is already in the buffer, only the one in the buffer eventually starts.
- BufferAll: Allows an unlimited number of Workflows to buffer. They are started sequentially.
- CancelOther: Cancels the running Workflow Execution, and then starts the new one after the old one completes cancellation.
- TerminateOther: Terminates the running Workflow Execution and starts the new one immediately.
- AllowAll: Starts any number of concurrent Workflow Executions. With this policy (and only this policy), more than one Workflow Execution, started by the Schedule, can run simultaneously.
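The options above amount to a decision taken each time an Action is due. The following Python sketch models that decision; it is not Temporal's scheduler, and `OverlapPolicy` and `on_action` are hypothetical names used only for illustration.

```python
from enum import Enum


class OverlapPolicy(Enum):
    SKIP = "Skip"
    BUFFER_ONE = "BufferOne"
    BUFFER_ALL = "BufferAll"
    CANCEL_OTHER = "CancelOther"
    TERMINATE_OTHER = "TerminateOther"
    ALLOW_ALL = "AllowAll"


def on_action(policy, running, buffered):
    """Decide what to do when an Action is due.

    `running` says whether a previously started Workflow Execution is still
    running; `buffered` is the number of starts already waiting.
    Returns (decision, new_buffered_count).
    """
    if not running or policy is OverlapPolicy.ALLOW_ALL:
        return "start now", buffered
    if policy is OverlapPolicy.SKIP:
        return "skip", buffered
    if policy is OverlapPolicy.BUFFER_ONE:
        # The buffer holds at most one pending start.
        return ("drop", 1) if buffered >= 1 else ("buffer", 1)
    if policy is OverlapPolicy.BUFFER_ALL:
        return "buffer", buffered + 1
    if policy is OverlapPolicy.CANCEL_OTHER:
        return "cancel running, then start", buffered
    return "terminate running, start now", buffered


for policy in OverlapPolicy:
    print(policy.value, on_action(policy, running=True, buffered=1))
```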
Catchup Window
The Temporal Cluster might be down or unavailable at the time when a Schedule should take an Action. When it comes back up, the Catchup Window controls which missed Actions should be taken at that point. The default is one minute, which means that the Schedule attempts to take any Actions that wouldn't be more than one minute late. An outage that lasts longer than the Catchup Window could lead to missed Actions. (But you can always Backfill.)
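The effect of the Catchup Window can be sketched as a filter over the Actions missed during an outage. The helper `actions_to_catch_up` is hypothetical, and treating an Action that is exactly one window late as still eligible is an assumption.

```python
from datetime import datetime, timedelta, timezone


def actions_to_catch_up(missed, now, catchup_window=timedelta(minutes=1)):
    """Keep missed Action times that are no more than `catchup_window` late."""
    return [t for t in missed if now - t <= catchup_window]


def utc(hour, minute, second=0):
    return datetime(2024, 1, 1, hour, minute, second, tzinfo=timezone.utc)


# Actions were due at 12:00, 12:30, and 13:00; the Cluster recovers at 13:00:30.
missed = [utc(12, 0), utc(12, 30), utc(13, 0)]
now = utc(13, 0, 30)

print(len(actions_to_catch_up(missed, now)))                      # 1
print(len(actions_to_catch_up(missed, now, timedelta(hours=1))))  # 2
```

With the default window only the 13:00 Action is taken; widening the window to one hour also recovers the 12:30 Action, while 12:00 would need a Backfill.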
Pause-on-failure
If this policy is set, a Workflow Execution started by a Schedule that ends with a failure or timeout (but not Cancellation or Termination) causes the Schedule to automatically pause.
Note that with the AllowAll Overlap Policy, this pause might not apply to the next Workflow Execution, because the next Workflow Execution might have started before the failed one finished.
It applies only to Workflow Executions that were scheduled to start after the failed one finished.
Last completion result
A Workflow started by a Schedule can obtain the completion result from the most recent successful run. (How you do this depends on the SDK you're using.)
For overlap policies that don't allow overlap, “the most recent successful run” is straightforward to define.
For the AllowAll policy, it refers to the run that completed most recently, at the time that the run in question is started.
Consider the following overlapping runs:
time -------------------------------------------->
A  |----------------------|
B        |-------|
C                    |---------------|
D                            |--------------T
If D asks for the last completion result at time T, it gets the result of A. Not B, even though B started more recently, because A completed later. And not C, even though C completed after A, because the result for D is captured when D is started, not when it's queried.
Failures and timeouts do not affect the last completion result.
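The rule for overlapping runs can be checked with a small Python model. The `Run` class, the `last_completion_result` helper, and the numeric times are hypothetical; the times are chosen only so that A, B, C, and D relate to each other as described above.

```python
from dataclasses import dataclass


@dataclass
class Run:
    name: str
    start: int
    end: int
    succeeded: bool = True


def last_completion_result(runs, starting_at):
    """Name of the successful Run that completed most recently by `starting_at`."""
    done = [r for r in runs if r.succeeded and r.end <= starting_at]
    return max(done, key=lambda r: r.end).name if done else None


# A ends after B; C is still running when D starts.
runs = [Run("A", 0, 23), Run("B", 6, 14), Run("C", 18, 34)]
d_start = 26
print(last_completion_result(runs, d_start))  # A

# A failed run that ended just before D started does not change the answer.
runs.append(Run("E", 20, 25, succeeded=False))
print(last_completion_result(runs, d_start))  # A
```

The result is fixed at D's start time, so C finishing later has no effect on what D sees.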
Last failure
A Workflow started by a Schedule can obtain the details of the failure of the most recent run that ended at the time when the Workflow in question was started. Unlike last completion result, a successful run does reset the last failure.
Limitations
The Scheduled Workflows feature is available in Temporal Server version 1.18.
Internally, a Schedule is implemented as a Workflow. If you're using Advanced Visibility (Elasticsearch), these Workflow Executions are hidden from normal views. If you're using Standard Visibility, they are visible, though there's no need to interact with them directly.
Native support for Schedules in language SDKs is coming soon.
For now, tctl and the web UI are the main interfaces to Schedules.
For advanced use, you can also use the gRPC API by getting a WorkflowServiceClient object from the SDK and calling methods such as CreateSchedule.
- The Cluster usually deduplicates Signals, but does not guarantee deduplication: During shard migration, two Signal Events (and therefore two deliveries to the Workflow Execution) can be recorded for a single Signal because the deduping info is stored only in memory.