This is the first notebook in a series that works through the exercises from the d2l.ai deep learning curriculum in order to solve and understand them. The corresponding lesson reference for this notebook is this link.

The posts in this series may reuse exercises and content provided by d2l.ai; I write them to get good hands-on practice in deep learning.

import tensorflow as tf
tf.__version__
'2.4.1'

Setup for exercises

Problem 1

X = tf.reshape(tf.range(12, dtype=tf.float32), (3, 4))
Y = tf.constant([[2.0, 1, 4, 3], [1, 2, 3, 4], [4, 3, 2, 1]])
X, Y
(<tf.Tensor: shape=(3, 4), dtype=float32, numpy=
 array([[ 0.,  1.,  2.,  3.],
        [ 4.,  5.,  6.,  7.],
        [ 8.,  9., 10., 11.]], dtype=float32)>,
 <tf.Tensor: shape=(3, 4), dtype=float32, numpy=
 array([[2., 1., 4., 3.],
        [1., 2., 3., 4.],
        [4., 3., 2., 1.]], dtype=float32)>)

Problem 2

a = tf.reshape(tf.range(3), (3, 1))
b = tf.reshape(tf.range(2), (1, 2))
a, b
(<tf.Tensor: shape=(3, 1), dtype=int32, numpy=
 array([[0],
        [1],
        [2]], dtype=int32)>,
 <tf.Tensor: shape=(1, 2), dtype=int32, numpy=array([[0, 1]], dtype=int32)>)

Solutions

Problem 1

X == Y
<tf.Tensor: shape=(3, 4), dtype=bool, numpy=
array([[False,  True, False,  True],
       [False, False, False, False],
       [False, False, False, False]])>
X < Y
<tf.Tensor: shape=(3, 4), dtype=bool, numpy=
array([[ True, False,  True, False],
       [False, False, False, False],
       [False, False, False, False]])>
X > Y
<tf.Tensor: shape=(3, 4), dtype=bool, numpy=
array([[False, False, False, False],
       [ True,  True,  True,  True],
       [ True,  True,  True,  True]])>

The results are what we would expect from elementwise comparisons. Let's check whether the greater-than and less-than comparisons are opposites of each other by applying a logical NOT to one of them.

(X > Y) == tf.math.logical_not(X < Y)
<tf.Tensor: shape=(3, 4), dtype=bool, numpy=
array([[ True, False,  True, False],
       [ True,  True,  True,  True],
       [ True,  True,  True,  True]])>

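We can see that, apart from the two positions where the values are equal (1 and 3), all the other entries match. At those two positions neither X > Y nor X < Y holds, so the exact opposite of X < Y is X >= Y rather than X > Y.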
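A quick way to double-check this observation is a minimal sketch that assumes X and Y are defined exactly as in the setup. It counts the tied positions and confirms that X > Y is the complement of X <= Y. The names num_ties and complement_holds are only illustrative.

```python
import tensorflow as tf

# Same tensors as in the setup for Problem 1.
X = tf.reshape(tf.range(12, dtype=tf.float32), (3, 4))
Y = tf.constant([[2.0, 1, 4, 3], [1, 2, 3, 4], [4, 3, 2, 1]])

# Count the positions where the two tensors hold the same value.
num_ties = tf.reduce_sum(tf.cast(tf.equal(X, Y), tf.int32))

# X > Y should be the elementwise complement of X <= Y (there are no NaNs here).
complement_holds = tf.reduce_all(
    tf.equal(X > Y, tf.math.logical_not(X <= Y))
)

print(int(num_ties), bool(complement_holds))
```

If the reasoning above is right, the tie count should equal the number of False entries in the previous output.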

Problem 2

a = tf.reshape(tf.range(16), (2, 4, -1))
b = tf.reshape(tf.range(16), (4, 2, -1))
a, b
(<tf.Tensor: shape=(2, 4, 2), dtype=int32, numpy=
 array([[[ 0,  1],
         [ 2,  3],
         [ 4,  5],
         [ 6,  7]],
 
        [[ 8,  9],
         [10, 11],
         [12, 13],
         [14, 15]]], dtype=int32)>,
 <tf.Tensor: shape=(4, 2, 2), dtype=int32, numpy=
 array([[[ 0,  1],
         [ 2,  3]],
 
        [[ 4,  5],
         [ 6,  7]],
 
        [[ 8,  9],
         [10, 11]],
 
        [[12, 13],
         [14, 15]]], dtype=int32)>)
a + b
---------------------------------------------------------------------------
InvalidArgumentError                      Traceback (most recent call last)
<ipython-input-12-bd58363a63fc> in <module>()
----> 1 a + b

/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/math_ops.py in binary_op_wrapper(x, y)
   1162     with ops.name_scope(None, op_name, [x, y]) as name:
   1163       try:
-> 1164         return func(x, y, name=name)
   1165       except (TypeError, ValueError) as e:
   1166         # Even if dispatching the op failed, the RHS may be a tensor aware

/usr/local/lib/python3.6/dist-packages/tensorflow/python/util/dispatch.py in wrapper(*args, **kwargs)
    199     """Call target, and fall back on dispatchers if there is a TypeError."""
    200     try:
--> 201       return target(*args, **kwargs)
    202     except (TypeError, ValueError):
    203       # Note: convert_to_eager_tensor currently raises a ValueError, not a

/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/math_ops.py in _add_dispatch(x, y, name)
   1484     return gen_math_ops.add(x, y, name=name)
   1485   else:
-> 1486     return gen_math_ops.add_v2(x, y, name=name)
   1487 
   1488 

/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/gen_math_ops.py in add_v2(x, y, name)
    470       return _result
    471     except _core._NotOkStatusException as e:
--> 472       _ops.raise_from_not_ok_status(e, name)
    473     except _core._FallbackException:
    474       pass

/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ops.py in raise_from_not_ok_status(e, name)
   6860   message = e.message + (" name: " + name if name is not None else "")
   6861   # pylint: disable=protected-access
-> 6862   six.raise_from(core._status_to_exception(e.code, message), None)
   6863   # pylint: enable=protected-access
   6864 

/usr/local/lib/python3.6/dist-packages/six.py in raise_from(value, from_value)

InvalidArgumentError: Incompatible shapes: [2,4,2] vs. [4,2,2] [Op:AddV2]

I tried using tensors with 3-D shapes, thinking they might broadcast, but they didn't. So I searched for the rules that determine whether two arrays can be broadcast together and found this documentation, where the conditions are explained. The shapes are compared dimension by dimension, starting from the trailing (rightmost) one, and two dimensions are compatible if they

  • are equal, or

  • one of them is 1

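In the above case we had shapes [2,4,2] vs. [4,2,2]: the trailing dimensions match (2 and 2), but the remaining pairs (4 vs. 2 and 2 vs. 4) are neither equal nor 1. Let's try a different pair of shapes.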

a = tf.reshape(tf.range(12), (6, 2, -1))
b = tf.reshape(tf.range(16), (1, -1))
a, b
(<tf.Tensor: shape=(6, 2, 1), dtype=int32, numpy=
 array([[[ 0],
         [ 1]],
 
        [[ 2],
         [ 3]],
 
        [[ 4],
         [ 5]],
 
        [[ 6],
         [ 7]],
 
        [[ 8],
         [ 9]],
 
        [[10],
         [11]]], dtype=int32)>, <tf.Tensor: shape=(1, 16), dtype=int32, numpy=
 array([[ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15]],
       dtype=int32)>)
a + b
<tf.Tensor: shape=(6, 2, 16), dtype=int32, numpy=
array([[[ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15],
        [ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16]],

       [[ 2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16, 17],
        [ 3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16, 17, 18]],

       [[ 4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19],
        [ 5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20]],

       [[ 6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21],
        [ 7,  8,  9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22]],

       [[ 8,  9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23],
        [ 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24]],

       [[10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25],
        [11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26]]],
      dtype=int32)>

Similar to the examples in the link, the shapes above follow the rules, so the operation (addition) goes through with the help of broadcasting. Aligning the shapes on the right shows how the result shape is formed:

a     - 6 x 2 x  1
b     -     1 x 16
a + b - 6 x 2 x 16
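
The right-aligned comparison used in this problem can also be written as a small pure-Python helper. This is only a sketch of the rule as described above, not how TensorFlow implements it; the function name broadcast_shape is hypothetical, and a missing leading dimension is treated as 1.

```python
from itertools import zip_longest


def broadcast_shape(shape_a, shape_b):
    """Return the broadcast result shape, or raise ValueError if incompatible."""
    result = []
    # Walk both shapes from the trailing dimension; a missing dimension counts as 1.
    pairs = zip_longest(reversed(shape_a), reversed(shape_b), fillvalue=1)
    for dim_a, dim_b in pairs:
        if dim_a == dim_b or dim_b == 1:
            result.append(dim_a)
        elif dim_a == 1:
            result.append(dim_b)
        else:
            raise ValueError(f"Incompatible shapes: {shape_a} vs. {shape_b}")
    return tuple(reversed(result))


print(broadcast_shape((6, 2, 1), (1, 16)))  # (6, 2, 16)
print(broadcast_shape((3, 1), (1, 2)))      # (3, 2)
```

Calling broadcast_shape((2, 4, 2), (4, 2, 2)) raises a ValueError, mirroring the failed addition earlier in this problem.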