Stefano Cappellini

AI, Deep Learning, Machine Learning, Software Engineering

Variables sharing in TensorFlow: Variable vs get_variable


What are the differences between variables created through the Variable constructor and those created using the get_variable function? How does variable sharing work in TensorFlow? These are the two questions addressed in this post.

Note: This “post” assumes some previous knowledge of TensorFlow. You can read about it in the official documentation. It also assumes some knowledge of TensorFlow scopes. You can read about them here.

Note2: this “post” is associated with this Notebook

Fact 1

Variables created using the Variable constructor CANNOT be shared. Every time you call this constructor, it will create a brand new variable (and this is exactly what you should expect from a constructor!)

import tensorflow as tf  # the examples in this post use the TensorFlow 1.x graph API

def sharing_1():
    tf.reset_default_graph()
    with tf.name_scope("first"):
        x = tf.Variable(10, name="x")
    with tf.name_scope("first/"): # note the trailing slash
        x2 = tf.Variable(10, name="x")
    print(x.name, x2.name, x == x2)

sharing_1()
# first/x:0 first/x_1:0 False

Fact 2

Variables created using the Variable constructor and those created using the get_variable function are different beasts. Thus, you cannot reuse variables created using the constructor.

def sharing_2():
    tf.reset_default_graph()
    with tf.variable_scope("first"):
        x = tf.Variable(10, name="x")
        x2 = tf.get_variable("x", [])

    with tf.variable_scope("first", reuse=tf.AUTO_REUSE):
        with tf.name_scope("first/"): # note the trailing slash
            x3 = tf.Variable(10, name="x")
            x4 = tf.get_variable("x", [])
    print(x.name, x2.name, x3.name, x4.name)
    print(x2 == x4)
    print(x == x4)

sharing_2()
# first/x:0 first/x_1:0 first/x_2:0 first/x_1:0
# True
# False

This example is surprising, isn’t it? When you call get_variable the first time, it creates another variable, named “x_1” in the graph to prevent a name clash with the variable the constructor created earlier. However, get_variable looks variables up by the scoped name you asked for, not by the final graph name, so you can still access this variable simply by using the name you chose, that is, “x”. It’s magic!
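One way to picture this is as two independent namespaces: the graph hands out unique names, while a separate store remembers what get_variable returned for each requested name. The sketch below is a toy model in plain Python, not TensorFlow's actual implementation. The classes ToyGraph and ToyStore are hypothetical, and for brevity the toy get_variable always behaves as if reuse were set to AUTO_REUSE.

```python
# Toy model only: ToyGraph and ToyStore are hypothetical names, not TensorFlow APIs.

class ToyGraph:
    """Hands out unique graph names by appending _1, _2, ... on a clash."""

    def __init__(self):
        self._used = {}

    def unique_name(self, name):
        count = self._used.get(name, 0)
        self._used[name] = count + 1
        return name if count == 0 else "%s_%d" % (name, count)


class ToyStore:
    """Remembers get_variable results under the name the caller asked for."""

    def __init__(self, graph):
        self._graph = graph
        self._vars = {}

    def variable(self, scope, name):
        # Constructor-style creation: always new, never recorded in the store.
        return self._graph.unique_name(scope + "/" + name)

    def get_variable(self, scope, name):
        # Lookup uses the requested key, not the (possibly suffixed) graph name.
        key = scope + "/" + name
        if key not in self._vars:
            self._vars[key] = self._graph.unique_name(key)
        return self._vars[key]


graph = ToyGraph()
store = ToyStore(graph)
x = store.variable("first", "x")
x2 = store.get_variable("first", "x")
x3 = store.variable("first", "x")
x4 = store.get_variable("first", "x")
print(x, x2, x3, x4)
# first/x first/x_1 first/x_2 first/x_1
```

The printed names follow the same pattern as the output of sharing_2 (minus the “:0” suffix): the second get_variable call returns the entry stored under “first/x”, even though its graph name ended up being “first/x_1”.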

Thus: to create variables use the get_variable function. Stop using the Variable constructor!

Fact 3

There are many different ways of sharing a variable. Let’s see them:

First method: this is the most flexible way of sharing. You can both reuse existing variables and create new ones.

def sharing_method1():
    tf.reset_default_graph()
    with tf.variable_scope("first"):
        x = tf.get_variable("x", [])

    with tf.variable_scope("first", reuse=tf.AUTO_REUSE):
        # You can reuse existing variables
        x2 = tf.get_variable("x", [])

        # And create new variables
        x3 = tf.get_variable("x3", [])

    print(x.name, x2.name, x3.name, x == x2)

sharing_method1()
# first/x:0 first/x:0 first/x3:0 True

Second method: this is somewhat more limited. You can only reuse existing variables; you cannot create new ones.

def sharing_method2():
    tf.reset_default_graph()
    with tf.variable_scope("first"):
        x = tf.get_variable("x", [])

    with tf.variable_scope("first", reuse=True):
        # You can only reuse existing variables
        x2 = tf.get_variable("x", [])

    print(x.name, x2.name, x == x2)

sharing_method2()
# first/x:0 first/x:0 True

Third method: somewhere between the two. You can decide when to start reusing the existing variables. From that point on, you won’t be able to create new variables.

def sharing_method3():
    tf.reset_default_graph()
    with tf.variable_scope("first"):
        x = tf.get_variable("x", [])

    with tf.variable_scope("first") as scope:
        # You can create new variables before
        x3 = tf.get_variable("x3", [])

        # And decide when to start reusing the existing variables
        scope.reuse_variables()
        x2 = tf.get_variable("x", [])

    print(x.name, x2.name, x3.name, x == x2)

sharing_method3()
# first/x:0 first/x:0 first/x3:0 True

What’s the best sharing method?

Whenever possible, I prefer the second method, that is, using the reuse=True argument. This helps detect bugs and makes the code a lot easier to read. Why?

  • With AUTO_REUSE, if you misspell the name of an existing variable, a new variable will be silently created. Detecting the bug will thus be a lot harder.
  • If you go for the reuse=True argument, you will detect the typo immediately, simply because you cannot create new variables in a scope with reuse set to True.
  • The reuse_variables solution may be a good compromise, but I find it confusing: you have to search your code to find where the reuse starts.

Obviously, if you really want to both create new variables and reuse some existing ones, then the reuse=True approach won’t work. In this case, I find AUTO_REUSE to be the better option.
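The trade-off can be sketched with a small toy model in plain Python. This is an illustration under stated assumptions, not TensorFlow code: ToyScope and the AUTO_REUSE constant below are hypothetical stand-ins, and the error messages are made up. TensorFlow 1.x also raises a ValueError in these two situations, but with its own wording.

```python
# Toy model only: ToyScope and this AUTO_REUSE constant are hypothetical stand-ins.
AUTO_REUSE = "auto"


class ToyScope:
    """Mimics the three reuse modes of a variable scope over a shared dict."""

    def __init__(self, name, store, reuse=None):
        self.name = name
        self.store = store
        self.reuse = reuse  # None (create only), True (reuse only) or AUTO_REUSE

    def get_variable(self, name):
        key = self.name + "/" + name
        exists = key in self.store
        if self.reuse is True and not exists:
            # Reuse-only mode: unknown names are an error (made-up message).
            raise ValueError("Variable %s does not exist" % key)
        if not self.reuse and exists:
            # Create-only mode: duplicate names are an error (made-up message).
            raise ValueError("Variable %s already exists" % key)
        if not exists:
            self.store[key] = object()
        return self.store[key]


store = {}
ToyScope("first", store).get_variable("weights")

# With AUTO_REUSE, a misspelled name silently creates a second variable.
auto = ToyScope("first", store, reuse=AUTO_REUSE)
auto.get_variable("wieghts")
print(sorted(store))
# ['first/weights', 'first/wieghts']

# With reuse=True, the same kind of typo fails immediately.
strict = ToyScope("first", store, reuse=True)
try:
    strict.get_variable("weihgts")
    typo_caught = False
except ValueError as err:
    typo_caught = True
    print("caught:", err)
# caught: Variable first/weihgts does not exist
```

In the toy, the AUTO_REUSE typo leaves an extra entry in the store that nothing else refers to, which is the kind of bug that is hard to spot later. The reuse=True typo stops at the faulty call.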
