
You're reading from Pentaho Data Integration Quick Start Guide Create ETL processes using Pentaho

Product type Paperback

Published in Aug 2018

Publisher Packt

ISBN-13 9781789343328

Length 178 pages

Edition 1st Edition

Languages

Java

Tools

Pentaho

Concepts

Business Intelligence

Author (1):

Carina Roldán


Defining and using Kettle variables

In PDI, you can define and use variables, just as you do when you code in any computer language. We already defined a couple of variables when we created the kettle.properties file in Chapter 1, Getting Started with PDI. Now, we will see where and how to use them.

It's simple: any time you see a dollar sign by the side of a textbox, you can use a variable:

Sample textboxes that allow variables

You can reference a variable by enclosing its name in curly braces, preceded by a dollar sign (for example, ${INPUT_FOLDER}).

Note

A less used notation for a variable is as follows: %%<variable name>%% (for example, %%INPUT_FOLDER%%).
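To make the two notations concrete, here is a minimal Python sketch of how such references could be resolved. It is an illustration only, not PDI's actual implementation: the `substitute` function and the sample folder are hypothetical, and leaving undefined variables untouched is an assumption of this sketch.

```python
import re

# Matches ${NAME} or %%NAME%%; the name is captured in group 1 or group 2.
_VARIABLE = re.compile(r"\$\{([^}]+)\}|%%([^%]+)%%")


def substitute(text, variables):
    """Replace ${NAME} and %%NAME%% references with values from `variables`.

    References to undefined variables are left unchanged
    (an assumption of this sketch).
    """
    def replace(match):
        name = match.group(1) or match.group(2)
        return variables.get(name, match.group(0))

    return _VARIABLE.sub(replace, text)


# Hypothetical value, for illustration only.
variables = {"INPUT_FOLDER": "/home/user/input"}

print(substitute("${INPUT_FOLDER}/sales.csv", variables))
print(substitute("%%INPUT_FOLDER%%/sales.csv", variables))
```

Both calls resolve to the same path, because the two notations refer to the same variable.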

Let's go back to the transformation created in the previous section. Instead of a fixed value for the location of the output file, we will use variables. The following describes how to do it:

Open the transformation (if you had closed it). You can do this from Main Menu or from Main Toolbar.
Double-click on the Text file output step. Replace the full path for the location of the file with the following: ${OUTPUT_FOLDER}/${FILENAME}.

Note

Note that you can combine variables, and can also mix variable names with static text.

Close the window and press F9 to run the transformation.

In the window that appears, select the Variables tab. You will see the names of both variables – OUTPUT_FOLDER and FILENAME:

Variables in the Execute a Transformation window

The OUTPUT_FOLDER variable already has a value, which is taken from the kettle.properties file. The FILENAME variable doesn't have a value yet.

To the right of the name, type the name that you want to give to the output file, as shown in the following screenshot:

Entering values for variables

Click on Run.
Browse the filesystem to make sure that a file with the name you provided was generated.
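The steps above combine two sources of values: OUTPUT_FOLDER comes from the kettle.properties file, and FILENAME is typed in at run time. The following Python sketch mimics that resolution. It is a simplified illustration, not how PDI is implemented: the folder, the file name, and the helper functions are made up for the example, and the parser only handles plain KEY=VALUE lines.

```python
import re

# Hypothetical contents of a kettle.properties file.
KETTLE_PROPERTIES = """\
# sample entries
OUTPUT_FOLDER=/home/user/output
"""


def parse_properties(text):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def resolve(text, variables):
    """Replace each ${NAME} with its value; unknown names stay unchanged."""
    return re.sub(
        r"\$\{([^}]+)\}",
        lambda m: variables.get(m.group(1), m.group(0)),
        text,
    )


variables = parse_properties(KETTLE_PROPERTIES)
# Value typed in the Variables tab at run time (hypothetical).
variables["FILENAME"] = "zips_report.txt"

output_path = resolve("${OUTPUT_FOLDER}/${FILENAME}", variables)
print(output_path)  # /home/user/output/zips_report.txt
```

The static "/" between the two references is kept as is, which is what lets you mix variables with fixed text.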

Besides the user-defined variables (those created by you, either in the kettle.properties file or inside Spoon), PDI has a list of predefined variables that you can also use. The list mainly includes variables related to the environment: for example, ${os.name}, which holds the name of the operating system on which you are working, or ${Internal.Entry.Current.Directory}, which references the directory where the current job or transformation is saved. To see the full list of variables, both predefined and user-defined, position the cursor inside any textbox where a variable is allowed and press Ctrl + Spacebar.

If you click on any of the variables for a second, the actual value of the variable will be shown, as indicated in the following screenshot:

PDI variables

If you double-click on a variable name, the name will be transcribed into the textbox.

Using named parameters

In the last exercise, you used two variables: one created in the kettle.properties file, and the other created inside of Spoon at runtime. There are still more ways to define variables. One of them is to create a named parameter. Named parameters are variables that you define in a transformation, and they can have a default value. You only have to supply a value if it differs from the default. Let's look at how it works, as follows:

Open the last transformation (if you had closed it).
Double-click anywhere in the work area except over the steps or hops. This will open the Transformation properties window.

Click on the Parameters tab. This is where we define the named parameters.
Fill in the grid as shown, replacing the path in the example with the real path where you have PDI installed:

Defining a named parameter

Close the window.
Double-click on the CSV file input step. Replace the full path of the location of the file with the following: ${SAMPLES_DIR}/Zipssortedbycitystate.csv.
Close the window and save the transformation.

Press F9 to run the transformation. The Parameters tab in the Run Options window will show the named parameter that we just defined:

Running a transformation with a named parameter

Click on Run. PDI will replace the variable with its value, exactly as it did before.

Note that this time, we didn't supply a value for the variable, as it already had a proper value. Now, suppose that we move the samples folder to a different location. The following describes how we can provide the new value:

Press F9 to run the transformation.
In the Parameters tab, fill in the Value column with the proper value, as shown in the following screenshot:

Supplying a value for a named parameter

Click on Run. PDI will replace the value of the variable with the value that you provided, and will read the file from that location.
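The behavior of named parameters described in this section can be summarized in a few lines of Python. This is a conceptual sketch, not PDI code: the `NamedParameter` class, the `effective_value` function, and both paths are hypothetical. Treating an empty value as "not supplied" is also an assumption made for this sketch.

```python
from dataclasses import dataclass


@dataclass
class NamedParameter:
    name: str
    default: str


def effective_value(parameter, supplied=None):
    """Return the supplied value, or the default when nothing was supplied."""
    return supplied if supplied else parameter.default


# Hypothetical default path, for illustration only.
samples_dir = NamedParameter("SAMPLES_DIR", "/opt/pdi/samples/transformations/files")

# First run: no value supplied, so the default is used.
first = effective_value(samples_dir)
# Second run: the samples folder was moved, so a new value is supplied.
second = effective_value(samples_dir, "/data/samples")

print(first + "/Zipssortedbycitystate.csv")
print(second + "/Zipssortedbycitystate.csv")
```

The default keeps the common case free of extra input, while the second call shows that a supplied value takes its place for that run only; the parameter definition itself is not changed.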
