In this blog, we’ll discuss how Matillion’s variables can help ease the management of metadata configurations across components, jobs, and environments. With the help of variables, we can simplify the management of projects that can quickly become cumbersome to manage in Matillion.
What is Matillion ETL?
Matillion ETL is a platform designed to help you speed up your data pipeline development by connecting it to many different data sources, enabling teams to rapidly integrate and build sophisticated data transformations in a cloud environment with a very intuitive low-code/no-code GUI.
Why Do Variables Matter in Matillion?
Variables can be used in various parameters and expressions, allowing users to pass and centralize environment-specific configurations. This allows code in Matillion to be consistent and repeatable.Â
There are two primary types of variables in Matillion. They can be scalar values configured by environment and job variables. They can also take on a 2D-Array form used in Grid Variables.
Variables in Matillion are name-value pairs that can pass configurations across jobs and environments. They are also accessible through Python and Bash scripts, which can be substituted at run time.Â
There are four main variable types available for use in Matillion. They are as follows:Â
Grid Variables
Job Variables
Environment Variables
Automatic Variables
Job Variables
As you might expect, job variables are defined within an individual job. They will also override environment variables of the same name within that specific job. They will be included when jobs are imported and exported.
Creating Job Variables
We start by navigating the job variables window to configure a job variable. We can achieve this in two ways. First, we can navigate to the menu by right-clicking a job, clicking Variables, and clicking Manage Job Variables.
Alternatively, you can click into a job, right-click anywhere within the job’s design grid, and navigate to the Manage Job Variables option on the menu.
Job Variables Menu
Within the Job Variables Menu is some configurations that must be defined. They are as follows:
Environment Variables
Environment Variables are defined across a given environment. Key-value (or values) pairs can act as job variables passed to other jobs. To navigate the menu, click on the Project icon in the top left corner, then Manage Environment Variables in the Environment Variables Menu. Like the Job Variables Menu, some configurations must be filled in.
What makes the Environment Variables menu unique is that there are columns that represent the different environments available in a given project.Â
For each of these columns, an additional default value can be set. This can be particularly useful if you have Matillion jobs consistent across environments, such as Development, QA, and Production.Â
Rather than having to hard code each individual variable in each job, you can set them per environment all in one place and then call the variables in the job consistently. This helps maintain consistency in your jobs.
Automatic Variables
Another beneficial aspect of Environment Variables is the preset Automatic Variables. These are pre-defined in Matillion, and their values cannot be changed.
Variable | Type | Description |
| Text | Name of the current project group. It can be set via Rename Project Group from the Project menu. |
| Numeric | Internal ID of the current project group. |
| Text | Name of the current project. It can be set via Manage Project from the Project menu. |
| Numeric | Internal ID of the current project. |
| Text | Name of the current version. Versions can be renamed via Rename Version from the Project menu unless locked. Read Version Control for more information. |
| Numeric | The internal ID of the current version. |
| Text | Name of the current environment. This can be set by right-clicking the environment in the Environments panel and selecting Edit Environment. Read Environments for more information. |
| Text | The name of the default schema for the current environment. |
| Text | The name of the database for the current environment. |
| Text | URL of the Matillion ETL instance with the current environment. |
| Numeric | The internal ID of the current environment. |
| Numeric | The port number of the current environment. |
| Text | Username for the environment connection. |
| Text | The name of the current job. This can be set by right-clicking the job in the Project panel and selecting Manage Job. |
| Numeric | The internal ID of the current job. All jobs have a unique ID that can be used to refer to it within a project. Note that this is not the ID of a particular run of a job. |
| Text | The name of the current component, as defined by the user. Components can be renamed by selecting them and editing the Name property. |
| Numeric | The internal ID of a given component in Matillion ETL. |
| Text | An error message returned by a component can be used in job error handling. |
| Numeric | The ID of a task in Matillion ETL. These can also be viewed via Task History. |
| Text | A detailed error description that contains the |
| Numeric | The task ID is associated with the current running job. |
Grid Variables
Grid Variables are a special type of job variable. They are 2D arrays that hold scalar values in named columns. They can be used in many different components, mainly using the Use Grid Variable checkbox in the component property menu.
How to set up a Grid Variable
1. In a transformation job, right-click and select Manage Grid Variables on the dropdown menu.
2. This is the Grid Variables management section. Click the + sign to set up a new grid variable.
When the Use Grid Variable checkbox is selected, you are moved to a menu that displays the configuration settings that need to be filled out.
Populating Grid Variables
There are a couple of ways to populate the values for a grid variable. First, we can set default values like other variable types, like the example below.
Using the Append to Grid job component, we can also populate data into a grid variable. This component can either populate a grid variable with fixed inputs or grab data from another grid variable.
The following are the settings for this component.
Name | Name of the Component |
Append/Prepend | Determining whether to append or prepend data to the grid |
Target Grid Variable | Name of the grid variable being populated |
Fixed / Grid | Choose between fixed inputs or populating from a different grid variable |
Values Grid (for Fixed input) | Fixed user inputs that will populate a grid variable |
Source Grid Variable (for Grid Input) | Source Grid Variable if Grid is chosen |
Column Mapping (for Grid Input) | Mapping columns from source to target grid variable |
Values Grid Example:
Column Mapping Example:
Variable Behavior
It’s essential to understand how different types of variables behave in Matillion. Variable information can be passed onto other components or jobs in some cases.Â
Behavior can be broken down into two different sets of categories. The first is shared and copied. The second is job and environment.
Shared vs. Copied Variables
Copied variables can be updated within individual branches of a job and can do so without affecting other branches. A branch using a copied variable will begin by taking its default value, even if this variable has been updated in another parallel branch.Â
Shared variables, in contrast, are updated across all branches of a job. Think of this like local and global variables in other systems.
To demonstrate how this behavior works, let’s use a simple example.Â
For this example, we’ve created two simple variables, phData_copy_example
and phData_shared_example
, and initialized them with the values 2 and 1.
We’ve then created a simple fixed-flow population, using the values from these variables to populate a simple table. For this example, the results are truncated each run.
The syntax in which we call variables is ${variable_name}. Below is an example of calling the two variables we established and using them in our fixed flow.
If we run this job locally, we can see the two columns populated with the above values.
To demonstrate how the shared and copied variables are affected, we’ll create a simple job that runs the fixed flow with an iterator that updates the variables. We’ll first do this.
We’ll update the copy and shared values in this iteration to 4 and 3.
After running the job, we see the table populated with the new values 4 and 3.
Now, let’s add a second iteration of the run, but this time, without its own iterator that updates the variable values.
When we check the table’s contents, we see that the shared value is still 3, and the copied value is 1. This is because the shared variable from our iterator was updated for both components, which means the shared value was populated with a 3 twice.Â
Conversely, the copied variable was only updated in the first iteration of the job. So, when the job ran a second time, the Copied column populated with the original default value of 1.
This is an odd example, but it shows the difference between copied and shared variables.
Environment vs. Job Variables
Environment and job variables have a similar parallel to global and local variables. Environment variables are global. They can be used across Matillion ETL across many components and jobs.Â
Job variables are local. They are defined within the scope of a job. Importantly, they also override environment variables of the same name.
To demonstrate this, we’ll create a simple variable at the environment level with the default value environment
.
We’ll then create a simple data flow that populates a table with the variable value.
When we run the job, we see that the Variable_Type
column is populated with environment
.
Now, let’s make a job variable with the same name and give it a different default value of job
.
When we look at the new output, we see that the variable_type
column is now populated with job
. This is because the job variable took precedence over the environment variable, similar to how a local variable overrides a global variable in other programming languages.
Closing
Within this blog, we learned about the different variable types available in Matillion and saw examples of how they work. They can be used in various components and jobs to pass data along and make job development more consistent and efficient for your organization.
Are you looking for more Matillion assistance? phData excels at assisting organizations in achieving success with Matillion. Contact us today for help, advice, and actionable strategies.