Parallelizing TensorFlow
To extend our reach for parallelizing TensorFlow, we can also perform separate operations from our graph on entirely different machines in a distributed manner. This recipe will show you how.
Getting ready
A few months after the release of TensorFlow, Google released Distributed TensorFlow, a major upgrade to the TensorFlow ecosystem that allows a TensorFlow cluster to be set up across separate worker machines, sharing the computational work of training and evaluating models. Using Distributed TensorFlow is as easy as setting up parameters for the workers and then assigning different jobs to different workers.
In this recipe, we will set up two local workers and assign them to different jobs.
How to do it...
- To start, we load TensorFlow and define our two local workers with a cluster specification dictionary (ports 2222 and 2223) as follows:

```python
import tensorflow as tf

# Cluster for 2 local workers (tasks 0 and 1):
cluster = tf.train.ClusterSpec({'local': ['localhost:2222', 'localhost:2223']})
```
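To make the "assign different jobs to different workers" idea concrete, here is a minimal sketch of the next steps: starting a server for each task in the cluster and pinning operations to specific tasks with `tf.device`. This assumes the TF1-style distributed API (`tf.compat.v1` under TensorFlow 2) and that ports 2222 and 2223 are free on localhost; the constants and the addition are illustrative placeholders, not part of the original recipe.

```python
import tensorflow as tf

# TF1-style graph execution (assumption: running under TF 2 via compat.v1).
tf1 = tf.compat.v1
tf1.disable_eager_execution()

# Same cluster spec as above: one 'local' job with two tasks.
cluster = tf1.train.ClusterSpec({'local': ['localhost:2222', 'localhost:2223']})

# One in-process server per task; each binds its own port.
server0 = tf1.train.Server(cluster, job_name='local', task_index=0)
server1 = tf1.train.Server(cluster, job_name='local', task_index=1)

# Pin different ops to different tasks (hypothetical example ops).
with tf1.device('/job:local/task:0'):
    a = tf1.constant([1.0, 2.0])
with tf1.device('/job:local/task:1'):
    b = tf1.constant([3.0, 4.0])
total = a + b

# Connect a session to task 0's server and run the distributed graph.
with tf1.Session(server0.target) as sess:
    result = sess.run(total)
    print(result)
```

Each `tf.device` scope routes the enclosed ops to the named task, so the session transparently coordinates work across both workers even though, here, both run inside one process.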