TensorFlow core is the lower-level library on which the higher-level TensorFlow modules are built. Its concepts are important to understand before we go deeper into advanced TensorFlow. In this section, we will quickly recap those core concepts.
Code warm-up - Hello TensorFlow
As a customary tradition when learning any new programming language, library, or platform, let's write the simple Hello TensorFlow code as a warm-up exercise before we dive deeper.
Note
We assume that you have already installed TensorFlow. If you have not, refer to the TensorFlow installation guide at https://www.tensorflow.org/install/ for detailed instructions to install TensorFlow.
Open the file ch-01_TensorFlow_101.ipynb in Jupyter Notebook to follow and run the code as you study the text.
- Import the TensorFlow Library with the following code:
import tensorflow as tf
- Get a TensorFlow session. TensorFlow offers two kinds of sessions: Session() and InteractiveSession(). We will create an interactive session with the following code:
tfs = tf.InteractiveSession()
Note
The only difference between Session() and InteractiveSession() is that the session created with InteractiveSession() becomes the default session. Thus, we do not need to specify the session context to execute session-related commands later. For example, say that we have a session object, tfs, and a constant object, hello. If tfs is an InteractiveSession() object, then we can evaluate hello with the code hello.eval(). If tfs is a Session() object, then we have to use either hello.eval(session=tfs) or a with block. The most common practice is to use the with block, which will be shown later in this chapter.
- Define a TensorFlow constant, hello:
hello = tf.constant("Hello TensorFlow !!")
- Execute the constant in a TensorFlow session and print the output:
print(tfs.run(hello))
- You will get the following output:
'Hello TensorFlow !!'
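The note above on Session() versus InteractiveSession() can be sketched as follows. This is a minimal illustration, assuming a TensorFlow 2.x installation where the 1.x API used in this chapter is reached through tensorflow.compat.v1 (on a 1.x installation, the plain import tensorflow as tf from the text serves the same purpose). The variable names out_interactive, out_explicit, and out_with are made up for this example. Under Python 3 the string constant is returned as bytes.

```python
# Assumption: TensorFlow 2.x; the compat.v1 module exposes the 1.x session API.
import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()

hello = tf.constant("Hello TensorFlow !!")

# An InteractiveSession installs itself as the default session,
# so eval() needs no session argument.
isess = tf.InteractiveSession()
out_interactive = hello.eval()
isess.close()

# A plain Session is not the default; pass it to eval() explicitly...
sess = tf.Session()
out_explicit = hello.eval(session=sess)
sess.close()

# ...or use a with block, which makes the session the default inside the block.
with tf.Session() as sess:
    out_with = hello.eval()

print(out_interactive, out_explicit, out_with)
```

The with block also closes the session automatically, which is one reason it is the most common practice.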
Now that you have written and executed your first TensorFlow code, let's look at the basic ingredients of TensorFlow.
Tensors are the basic elements of computation and the fundamental data structure in TensorFlow; they are probably the only data structure that you need to learn in order to use TensorFlow. A tensor is an n-dimensional collection of data, identified by rank, shape, and type.
Rank is the number of dimensions of a tensor, and shape is the list denoting the size in each dimension. A tensor can have any number of dimensions. You may be already familiar with quantities that are a zero-dimensional collection (scalar), a one-dimensional collection (vector), a two-dimensional collection (matrix), and a multidimensional collection.
A scalar value is a tensor of rank 0 and thus has an empty shape, []. A vector or a one-dimensional array is a tensor of rank 1 and has a shape of [columns] or [rows]. A matrix or a two-dimensional array is a tensor of rank 2 and has a shape of [rows, columns]. A three-dimensional array would be a tensor of rank 3, and in the same manner, an n-dimensional array would be a tensor of rank n.
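To make rank and shape concrete, the following sketch builds a scalar, a vector, and a matrix and reads their static rank and shape. It assumes a TensorFlow 2.x installation with the 1.x API reached through tensorflow.compat.v1; the names scalar, vector, matrix, and ranks_and_shapes are chosen only for this illustration.

```python
# Assumption: TensorFlow 2.x; the compat.v1 module exposes the 1.x API.
import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()

scalar = tf.constant(5.0)                       # rank 0
vector = tf.constant([1.0, 2.0, 3.0])           # rank 1
matrix = tf.constant([[1.0, 2.0], [3.0, 4.0]])  # rank 2

# shape.ndims is the static rank; shape.as_list() is the size in each dimension.
ranks_and_shapes = [(t.shape.ndims, t.shape.as_list())
                    for t in (scalar, vector, matrix)]
print(ranks_and_shapes)  # [(0, []), (1, [3]), (2, [2, 2])]
```

Note that no session is needed here: rank and shape are properties of the graph definition, not of the evaluated values.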
Note
Refer to the following resources to learn more about tensors and their mathematical underpinnings:
A tensor can store data of one type in all its dimensions, and the data type of its elements is known as the data type of the tensor.
At the time of writing this book, the TensorFlow had the following data types defined:
Note
We recommend that you avoid the Python native data types and use TensorFlow data types instead when defining tensors.
Tensors can be created in the following ways:
- By defining constants, operations, and variables, and passing the values to their constructors.
- By defining placeholders and passing the values to session.run().
- By converting Python objects such as scalar values, lists, and NumPy arrays with the tf.convert_to_tensor() function.
Let's examine different ways of creating Tensors.
Constant-valued tensors are created using the tf.constant() function, which has the following signature:
tf.constant(
value,
dtype=None,
shape=None,
name='Const',
verify_shape=False
)
Let's look at the example code provided in the Jupyter Notebook with this book:
c1=tf.constant(5,name='x')
c2=tf.constant(6.0,name='y')
c3=tf.constant(7.0,tf.float32,name='z')
Let's look into the code in detail:
- The first line defines a constant tensor c1, gives it value 5, and names it x.
- The second line defines a constant tensor c2, stores value 6.0, and names it y.
- When we print these tensors, we see that the data types of c1 and c2 are automatically deduced by TensorFlow.
- To specifically define a data type, we can use the dtype parameter or place the data type as the second argument. In the preceding code example, we define the data type as tf.float32 for c3.
Let's print the constants c1, c2, and c3:
print('c1 (x): ',c1)
print('c2 (y): ',c2)
print('c3 (z): ',c3)
When we print these constants, we get the following output:
c1 (x): Tensor("x:0", shape=(), dtype=int32)
c2 (y): Tensor("y:0", shape=(), dtype=float32)
c3 (z): Tensor("z:0", shape=(), dtype=float32)
In order to print the values of these constants, we have to execute them in a TensorFlow session with the tfs.run() command:
print('run([c1,c2,c3]) : ',tfs.run([c1,c2,c3]))
We see the following output:
run([c1,c2,c3]) : [5, 6.0, 7.0]
TensorFlow provides us with many operations that can be applied on Tensors. An operation is defined by passing values and assigning the output to another tensor. For example, in the provided Jupyter Notebook file, we define two operations, op1 and op2:
op1 = tf.add(c2,c3)
op2 = tf.multiply(c2,c3)
When we print op1 and op2, we find that they are defined as Tensors:
print('op1 : ', op1)
print('op2 : ', op2)
The output is as follows:
op1 : Tensor("Add:0", shape=(), dtype=float32)
op2 : Tensor("Mul:0", shape=(), dtype=float32)
To print the value of these operations, we have to run them in our TensorFlow session:
print('run(op1) : ', tfs.run(op1))
print('run(op2) : ', tfs.run(op2))
The output is as follows:
run(op1) : 13.0
run(op2) : 42.0
The following table lists some of the built-in operations:
While constants allow us to provide a value at the time of defining the tensor, placeholders allow us to create tensors whose values can be provided at runtime. TensorFlow provides the tf.placeholder() function with the following signature to create placeholders:
tf.placeholder(
dtype,
shape=None,
name=None
)
As an example, let's create two placeholders and print them:
p1 = tf.placeholder(tf.float32)
p2 = tf.placeholder(tf.float32)
print('p1 : ', p1)
print('p2 : ', p2)
We see the following output:
p1 : Tensor("Placeholder:0", dtype=float32)
p2 : Tensor("Placeholder_1:0", dtype=float32)
Now let's define an operation using these placeholders:
op4 = p1 * p2
TensorFlow allows using shorthand symbols for various operations. In the preceding example, p1 * p2 is shorthand for tf.multiply(p1,p2). Let's run op4, supplying values for the two placeholders:
print('run(op4,{p1:2.0, p2:3.0}) : ',tfs.run(op4,{p1:2.0, p2:3.0}))
The preceding command runs op4 in the TensorFlow session, feeding the Python dictionary (the second argument to the run() operation) with values for p1 and p2.
The output is as follows:
run(op4,{p1:2.0, p2:3.0}) : 6.0
We can also specify the dictionary using the feed_dict parameter in the run() operation:
print('run(op4,feed_dict = {p1:3.0, p2:4.0}) : ',
tfs.run(op4, feed_dict={p1: 3.0, p2: 4.0}))
The output is as follows:
run(op4,feed_dict = {p1:3.0, p2:4.0}) : 12.0
Let's look at one last example, with a vector being fed to the same operation:
print('run(op4,feed_dict = {p1:[2.0,3.0,4.0], p2:[3.0,4.0,5.0]}) : ',
tfs.run(op4,feed_dict = {p1:[2.0,3.0,4.0], p2:[3.0,4.0,5.0]}))
The output is as follows:
run(op4,feed_dict = {p1:[2.0,3.0,4.0], p2:[3.0,4.0,5.0]}) : [ 6. 12. 20.]
The elements of the two input vectors are multiplied in an element-wise fashion.
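The placeholders above were created without a shape, so they accept scalars and vectors alike. As a hedged sketch of the shape parameter from the tf.placeholder() signature, the following example constrains a placeholder to rows of three values and shows that feeding an incompatible value is rejected. It assumes a TensorFlow 2.x installation with the 1.x API reached through tensorflow.compat.v1; the names p, doubled, result, and shape_error are hypothetical.

```python
# Assumption: TensorFlow 2.x; the compat.v1 module exposes the 1.x API.
import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()

# None leaves the number of rows open; each row must hold exactly 3 values.
p = tf.placeholder(tf.float32, shape=[None, 3], name='p')
doubled = p * 2.0

with tf.Session() as sess:
    # A 1x3 input is compatible with shape [None, 3].
    result = sess.run(doubled, feed_dict={p: [[1.0, 2.0, 3.0]]})

    # A vector of length 2 is not compatible, so run() raises ValueError.
    try:
        sess.run(doubled, feed_dict={p: [1.0, 2.0]})
        shape_error = None
    except ValueError as err:
        shape_error = err

print(result)
print('shape error raised:', shape_error is not None)
```

Declaring the shape up front trades some flexibility for earlier, clearer errors when the fed data does not match what the model expects.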
Creating tensors from Python objects
We can create tensors from Python objects such as lists and NumPy arrays, using the tf.convert_to_tensor() operation with the following signature:
tf.convert_to_tensor(
value,
dtype=None,
name=None,
preferred_dtype=None
)
Let's create some tensors and print them for practice:
- Create and print a 0-D Tensor:
tf_t=tf.convert_to_tensor(5.0,dtype=tf.float64)
print('tf_t : ',tf_t)
print('run(tf_t) : ',tfs.run(tf_t))
The output is as follows:
tf_t : Tensor("Const_1:0", shape=(), dtype=float64)
run(tf_t) : 5.0
- Create and print a 1-D Tensor:
import numpy as np
a1dim = np.array([1,2,3,4,5.99])
print("a1dim Shape : ",a1dim.shape)
tf_t=tf.convert_to_tensor(a1dim,dtype=tf.float64)
print('tf_t : ',tf_t)
print('tf_t[0] : ',tf_t[0])
print('tf_t[2] : ',tf_t[2])
print('run(tf_t) : \n',tfs.run(tf_t))
The output is as follows:
a1dim Shape : (5,)
tf_t : Tensor("Const_2:0", shape=(5,), dtype=float64)
tf_t[0] : Tensor("strided_slice:0", shape=(), dtype=float64)
tf_t[2] : Tensor("strided_slice_1:0", shape=(), dtype=float64)
run(tf_t) :
[ 1. 2. 3. 4. 5.99]
- Create and print a 2-D Tensor:
a2dim = np.array([(1,2,3,4,5.99),
(2,3,4,5,6.99),
(3,4,5,6,7.99)
])
print("a2dim Shape : ",a2dim.shape)
tf_t=tf.convert_to_tensor(a2dim,dtype=tf.float64)
print('tf_t : ',tf_t)
print('tf_t[0][0] : ',tf_t[0][0])
print('tf_t[1][2] : ',tf_t[1][2])
print('run(tf_t) : \n',tfs.run(tf_t))
The output is as follows:
a2dim Shape : (3, 5)
tf_t : Tensor("Const_3:0", shape=(3, 5), dtype=float64)
tf_t[0][0] : Tensor("strided_slice_3:0", shape=(), dtype=float64)
tf_t[1][2] : Tensor("strided_slice_5:0", shape=(), dtype=float64)
run(tf_t) :
[[ 1. 2. 3. 4. 5.99]
[ 2. 3. 4. 5. 6.99]
[ 3. 4. 5. 6. 7.99]]
- Create and print a 3-D Tensor:
a3dim = np.array([[[1,2],[3,4]],
[[5,6],[7,8]]
])
print("a3dim Shape : ",a3dim.shape)
tf_t=tf.convert_to_tensor(a3dim,dtype=tf.float64)
print('tf_t : ',tf_t)
print('tf_t[0][0][0] : ',tf_t[0][0][0])
print('tf_t[1][1][1] : ',tf_t[1][1][1])
print('run(tf_t) : \n',tfs.run(tf_t))
The output is as follows:
a3dim Shape : (2, 2, 2)
tf_t : Tensor("Const_4:0", shape=(2, 2, 2), dtype=float64)
tf_t[0][0][0] : Tensor("strided_slice_8:0", shape=(), dtype=float64)
tf_t[1][1][1] : Tensor("strided_slice_11:0", shape=(), dtype=float64)
run(tf_t) :
[[[ 1. 2.]
  [ 3. 4.]]
 [[ 5. 6.]
  [ 7. 8.]]]
Note
TensorFlow can seamlessly convert a NumPy ndarray to a TensorFlow tensor and vice versa.
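The round trip mentioned in the note can be sketched as follows: a NumPy array is converted to a tensor, and running that tensor in a session hands back a NumPy array. This assumes a TensorFlow 2.x installation with the 1.x API reached through tensorflow.compat.v1; the names arr, t, and back are chosen for this illustration.

```python
# Assumption: TensorFlow 2.x; the compat.v1 module exposes the 1.x API.
import numpy as np
import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()

arr = np.array([[1, 2], [3, 4]], dtype=np.float64)

# NumPy -> TensorFlow: the dtype is carried over from the ndarray.
t = tf.convert_to_tensor(arr)

# TensorFlow -> NumPy: run() returns the evaluated tensor as an ndarray.
with tf.Session() as sess:
    back = sess.run(t)

print(t.dtype, type(back), np.array_equal(arr, back))
```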
So far, we have seen how to create tensor objects of various kinds: constants, operations, and placeholders. While working with TensorFlow to build and train models, you will often need to hold the values of parameters in a memory location that can be updated at runtime. That memory location is identified by variables in TensorFlow.
In TensorFlow, variables are tensor objects that hold values that can be modified during the execution of the program.
While tf.Variable appears similar to tf.placeholder, there are subtle differences between the two:
In TensorFlow, a variable can be created with tf.Variable(). Let's see an example of placeholders and variables with a linear model:
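- We define the model parameters w and b as variables with initial values of [.3] and [-.3], respectively. The data type is passed by keyword, because the second positional argument of tf.Variable() is trainable, not dtype:
w = tf.Variable([.3], dtype=tf.float32)
b = tf.Variable([-.3], dtype=tf.float32)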
- The input x is defined as a placeholder and the output y is defined as an operation:
x = tf.placeholder(tf.float32)
y = w * x + b
- Let's print w, b, x, and y and see what we get:
print("w:",w)
print("x:",x)
print("b:",b)
print("y:",y)
We get the following output:
w: <tf.Variable 'Variable:0' shape=(1,) dtype=float32_ref>
x: Tensor("Placeholder_2:0", dtype=float32)
b: <tf.Variable 'Variable_1:0' shape=(1,) dtype=float32_ref>
y: Tensor("add:0", dtype=float32)
The output shows that x is a placeholder tensor and y is an operation tensor, while w and b are variables with shape (1,) and data type float32.
Before you can use the variables in a TensorFlow session, they have to be initialized. You can initialize a single variable by running its initializer operation.
For example, let's initialize the variable w:
tfs.run(w.initializer)
However, in practice, we use a convenience function provided by TensorFlow to initialize all the variables:
tfs.run(tf.global_variables_initializer())
Note
You can also use the tf.variables_initializer() function to initialize only a set of variables.
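As a sketch of initializing only a subset, the following example initializes w but not b and then asks TensorFlow which variables are still uninitialized. It assumes a TensorFlow 2.x installation with the 1.x API reached through tensorflow.compat.v1. The explicit variable names 'w' and 'b' and the Python names init_w_only, w_val, and uninit are choices made for this illustration, not part of the chapter's notebook.

```python
# Assumption: TensorFlow 2.x; the compat.v1 module exposes the 1.x API.
import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()

w = tf.Variable([.3], dtype=tf.float32, name='w')
b = tf.Variable([-.3], dtype=tf.float32, name='b')

# An initializer op covering only the listed variables.
init_w_only = tf.variables_initializer([w])
# An op that reports the names of variables not yet initialized.
report = tf.report_uninitialized_variables()

with tf.Session() as sess:
    sess.run(init_w_only)
    w_val = sess.run(w)
    uninit = [name.decode() for name in sess.run(report)]

print('w =', w_val)
print('still uninitialized:', uninit)
```

Partial initialization of this kind is mostly useful when some variables are restored from a checkpoint and only the remaining ones need fresh values.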
The global initializer convenience function can also be invoked in the following manner, instead of being invoked inside the run() function of a session object:
tf.global_variables_initializer().run()
After initializing the variables, let's run our model to give the output for values of x = [1,2,3,4]:
print('run(y,{x:[1,2,3,4]}) : ',tfs.run(y,{x:[1,2,3,4]}))
We get the following output:
run(y,{x:[1,2,3,4]}) : [ 0. 0.30000001 0.60000002 0.90000004]
Tensors generated from library functions
Tensors can also be generated from various TensorFlow functions. These generated tensors can either be assigned to a constant or a variable, or provided to their constructor at the time of initialization.
As an example, the following code generates a vector of 100 zeroes and prints it:
a=tf.zeros((100,))
print(tfs.run(a))
TensorFlow provides different types of functions to populate the tensors at the time of their definition:
- Populating all elements with the same values
- Populating elements with sequences
- Populating elements with a random probability distribution, such as the normal distribution or the uniform distribution
Populating tensor elements with the same values
The following table lists some of the tensor generating library functions to populate all the elements of the tensor with the same values:
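A few functions of this kind can be sketched as follows, using tf.zeros(), tf.ones(), and tf.fill(). The example assumes a TensorFlow 2.x installation with the 1.x API reached through tensorflow.compat.v1; the names zeros, ones, sevens and z, o, s are chosen for this illustration.

```python
# Assumption: TensorFlow 2.x; the compat.v1 module exposes the 1.x API.
import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()

zeros = tf.zeros([2, 3])                  # 2x3 tensor of 0.0 (float32 by default)
ones = tf.ones([2, 3], dtype=tf.int32)    # 2x3 tensor of integer 1
sevens = tf.fill([2, 3], 7.0)             # 2x3 tensor filled with 7.0

with tf.Session() as sess:
    z, o, s = sess.run([zeros, ones, sevens])

print(z)
print(o)
print(s)
```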
Populating tensor elements with sequences
The following table lists some of the tensor generating functions to populate elements of the tensor with sequences:
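Two commonly used sequence generators are tf.range() and tf.linspace(); the sketch below shows both. It assumes a TensorFlow 2.x installation with the 1.x API reached through tensorflow.compat.v1; the names evens, steps, evens_val, and steps_val are chosen for this illustration.

```python
# Assumption: TensorFlow 2.x; the compat.v1 module exposes the 1.x API.
import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()

# Integers from 0 up to (but not including) 10, in steps of 2.
evens = tf.range(0, 10, 2)

# 5 evenly spaced float values from 0.0 to 1.0, both ends included.
steps = tf.linspace(0.0, 1.0, 5)

with tf.Session() as sess:
    evens_val, steps_val = sess.run([evens, steps])

print(evens_val)
print(steps_val)
```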
Populating tensor elements with a random distribution
TensorFlow provides us with functions to generate tensors filled with values drawn from random distributions.
The values generated are affected by the graph-level and the operation-level seed. The graph-level seed is set using tf.set_random_seed, while the operation-level seed is given as the argument seed in all of the random distribution functions. If no seed is specified, then a random seed is used.
The following table lists some of the tensor generating functions to populate elements of the tensor with random valued distributions:
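The effect of the two seeds can be sketched with tf.random_normal(): building the same graph twice with the same graph-level and operation-level seeds yields the same values. The example assumes a TensorFlow 2.x installation with the 1.x API reached through tensorflow.compat.v1; the helper sample() and the names a, b, and c are hypothetical and exist only for this illustration.

```python
# Assumption: TensorFlow 2.x; the compat.v1 module exposes the 1.x API.
import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()

def sample(graph_seed, op_seed):
    """Build a fresh graph with the given seeds and draw 3 normal values."""
    graph = tf.Graph()
    with graph.as_default():
        tf.set_random_seed(graph_seed)                    # graph-level seed
        x = tf.random_normal([3], mean=0.0, stddev=1.0,
                             seed=op_seed)                # operation-level seed
        with tf.Session(graph=graph) as sess:
            return sess.run(x)

a = sample(42, 1)
b = sample(42, 1)   # same seeds as a, so the same values are drawn
c = sample(42, 2)   # different operation-level seed

print(a)
print(b)
print(c)
```

Fixing both seeds is a simple way to make experiments repeatable while still using random initial values.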
Getting Variables with tf.get_variable()
Calling tf.Variable() always creates a new variable; if the requested name has been used before, TensorFlow appends a suffix to make the name unique instead of returning the existing variable. Hence, when variables need to be looked up by name, it is convenient to use the tf.get_variable() function instead of tf.Variable(). The function tf.get_variable() creates the variable with the specified shape and initializer if it does not exist and, when reuse is enabled for the enclosing variable scope, returns the existing variable with the same name. For example:
w = tf.get_variable(name='w',dtype=tf.float32,initializer=[.3])
b = tf.get_variable(name='b',dtype=tf.float32,initializer=[-.3])
The initializer can be a tensor or a list of values, as in the preceding example (in which case the shape is inferred from the value and must not be passed as well), or one of the built-in initializers:
tf.constant_initializer
tf.random_normal_initializer
tf.truncated_normal_initializer
tf.random_uniform_initializer
tf.uniform_unit_scaling_initializer
tf.zeros_initializer
tf.ones_initializer
tf.orthogonal_initializer
In distributed TensorFlow, where we can run the code across machines, tf.get_variable() gives us global variables. To get local variables, TensorFlow has a function with a similar signature: tf.get_local_variable().
Sharing or Reusing Variables: Getting already-defined variables promotes reuse. However, an exception will be thrown if the reuse flag is not set, either by calling tf.get_variable_scope().reuse_variables() or by entering the scope with tf.variable_scope(name, reuse=True).
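Variable sharing can be sketched as follows: the first call creates the variable inside a scope, re-entering the scope with reuse=True returns the same object, and asking for the same name again without the reuse flag raises an exception. The example assumes a TensorFlow 2.x installation with the 1.x API reached through tensorflow.compat.v1; the scope name 'model' and the names w1, w2, and reuse_error are hypothetical.

```python
# Assumption: TensorFlow 2.x; the compat.v1 module exposes the 1.x API.
import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()

# First use: the variable model/w does not exist yet, so it is created.
with tf.variable_scope('model'):
    w1 = tf.get_variable('w', shape=[1],
                         initializer=tf.constant_initializer(0.3))

# Re-entering the scope with reuse=True returns the existing variable.
with tf.variable_scope('model', reuse=True):
    w2 = tf.get_variable('w')

# Without the reuse flag, requesting an existing name raises ValueError.
try:
    with tf.variable_scope('model'):
        tf.get_variable('w', shape=[1])
    reuse_error = None
except ValueError as err:
    reuse_error = err

print(w1.name, w1 is w2, reuse_error is not None)
```

Making reuse explicit guards against accidentally sharing parameters between two parts of a model that merely happen to pick the same name.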
Now that you have learned how to define tensors, constants, operations, placeholders, and variables, let's learn about the next level of abstraction in TensorFlow, which combines these basic elements to form a basic unit of computation: the data flow graph, or computational graph.