## Understanding one-hot vectors, GRU in Keras TF

# Understanding one-hot vectors

Here you will learn to generate one-hot encoded vectors from words. One-hot encoding is a common transformation that represents words numerically. You will use the Keras `to_categorical()` function to create one-hot vectors. `to_categorical()` expects a sequence of integers as input, so a `word2index` dictionary is provided to convert each word to an integer.

To complete this exercise you will also need the built-in Python `zip()` function, which lets you iterate over several sequences at once. For example, given two lists `xx` and `yy` of the same length, `for x, y in zip(xx, yy)` gives you the corresponding elements `x` and `y` of the two lists on each iteration.

```python
from tensorflow.keras.utils import to_categorical

# word2index is provided by the exercise workspace and maps each word to an integer ID

# Create a list of words and convert them to indices
words = ["I", "like", "cats"]
word_ids = [word2index[w] for w in words]
print(word_ids)

# Create onehot vectors using to_categorical function
onehot_1 = to_categorical(word_ids)
# Print words and their corresponding onehot vectors
print([(w, ohe.tolist()) for w, ohe in zip(words, onehot_1)])

# Create onehot vectors with a fixed number of classes and print the result
onehot_2 = to_categorical(word_ids, num_classes=5)
print([(w, ohe.tolist()) for w, ohe in zip(words, onehot_2)])
```

# Part 1: Exploring the to_categorical() function

In real-world problems, the vocabulary size can grow very large (e.g. more than a hundred thousand words). This exercise is broken into two parts, in which you will learn why it is important to set the `num_classes` argument of `to_categorical()`. In part 1, you will implement the function `compute_onehot_length()`, which generates one-hot vectors for a given list of words and computes the length of those vectors. The `to_categorical()` function has already been imported.
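To see why `num_classes` matters before working through the exercise, here is a minimal NumPy sketch. It is a stand-in for `to_categorical()`, not the Keras implementation, and it assumes that when `num_classes` is omitted the vector length is taken to be the largest ID plus one. The `one_hot()` helper and the `vocab` dictionary are hypothetical names used only for illustration.

```python
import numpy as np

def one_hot(ids, num_classes=None):
    """NumPy stand-in for to_categorical(): returns one row per ID."""
    ids = np.asarray(ids, dtype=int)
    if num_classes is None:
        # Without num_classes, the length is inferred as (largest ID + 1)
        num_classes = ids.max() + 1
    # Row i of the identity matrix is the one-hot vector for ID i
    return np.eye(num_classes)[ids]

# Hypothetical vocabulary, for illustration only
vocab = {"I": 0, "like": 1, "cats": 2, "We": 3, "dogs": 4}

short = one_hot([vocab[w] for w in ["I", "like", "cats"]])
longer = one_hot([vocab[w] for w in ["We", "like", "dogs"]])
fixed = one_hot([vocab[w] for w in ["I", "like", "cats"]], num_classes=len(vocab))

print(short.shape)   # (3, 3): the largest ID seen is 2
print(longer.shape)  # (3, 5): the largest ID seen is 4
print(fixed.shape)   # (3, 5): length fixed to the vocabulary size
```

Two sentences drawn from the same vocabulary end up with vectors of different lengths unless the number of classes is fixed, which is the behaviour the exercise below explores.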
```python
def compute_onehot_length(words, word2index):
    # Create word IDs for words
    word_ids = [word2index[w] for w in words]
    # Convert word IDs to onehot vectors
    onehot = to_categorical(word_ids)
    # Return the length of a single one-hot vector
    return onehot.shape[1]

word2index = {"He": 0, "drank": 1, "milk": 2}
# Compute and print onehot length of a list of words
print(compute_onehot_length(["He", "drank", "milk"], word2index))

# NOTE: the word lists below contain words that are missing from the
# three-word dictionary above. They assume the larger word2index provided
# by the exercise workspace is in scope; otherwise the lookup raises a KeyError.
words_1 = ["I", "like", "cats", "We", "like", "dogs", "He", "hates", "rabbits"]
# Call compute_onehot_length on words_1
length_1 = compute_onehot_length(words_1, word2index)

words_2 = ["I", "like", "cats", "We", "like", "dogs", "We", "like", "cats"]
# Call compute_onehot_length on words_2
length_2 = compute_onehot_length(words_2, word2index)

# Print length_1 and length_2
print("length_1 =>", length_1, " and length_2 => ", length_2)
```

# Part 1: Text reversing model - Encoder

Creating a simple text reversing model is a great way to understand the mechanics of encoder-decoder models and how the two parts connect. You will now implement the encoder part of a text reversing model. The implementation of the encoder is split over two exercises. In this exercise, you will define the `words2onehot()` helper function. `words2onehot()` should take in a list of words and a dictionary `word2index`, and convert the list of words to an array of one-hot vectors. The `word2index` dictionary is available in the workspace.
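The `word2index` dictionary is supplied by the workspace, so its construction is not shown. As a rough sketch of how such a mapping (and its inverse, which the decoder relies on later) could be built, assuming IDs are assigned in order of first appearance, consider the following. The names `build_vocab`, `toy_word2index` and `toy_index2word` are hypothetical, and the IDs need not match the course dictionary.

```python
def build_vocab(sentences):
    """Assign each distinct word an integer ID in order of first appearance."""
    word2index = {}
    for sentence in sentences:
        for word in sentence.split(" "):
            if word not in word2index:
                # The next free ID equals the current vocabulary size
                word2index[word] = len(word2index)
    # Invert the mapping so IDs can be turned back into words
    index2word = {i: w for w, i in word2index.items()}
    return word2index, index2word

toy_word2index, toy_index2word = build_vocab(["I like cats", "We like dogs"])
print(toy_word2index)
print(toy_index2word)
```

Separate names are used here on purpose so that the toy dictionaries do not overwrite the `word2index` the exercises expect.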
![](https://assets.datacamp.com/production/repositories/4609/datasets/c6fb0acf4665a26c1ff6e65d926036aa9e934722/12_encoder_decoder_2.png)

```python
import numpy as np

def words2onehot(word_list, word2index):
    # Convert words to word IDs
    word_ids = [word2index[w] for w in word_list]
    # Convert word IDs to onehot vectors and return the onehot array
    onehot = to_categorical(word_ids, num_classes=3)
    return onehot

words = ["I", "like", "cats"]
# Convert words to onehot vectors using words2onehot
onehot = words2onehot(words, word2index)
# Print the result as (<word>, <onehot>) tuples
print([(w, ohe.tolist()) for w, ohe in zip(words, onehot)])
```

# Part 2: Text reversing model - Encoder

You will now implement the rest of the encoder of the text reversing model. The encoder consumes the one-hot vectors produced by the `words2onehot()` function you implemented previously. Here you will implement the `encoder()` function, which takes in a set of one-hot vectors and converts them to a list of word IDs. For this exercise, the `words2onehot()` function and the `word2index` dictionary (containing the words `We`, `like` and `dogs`) have been provided.

![](https://assets.datacamp.com/production/repositories/4609/datasets/c6fb0acf4665a26c1ff6e65d926036aa9e934722/12_encoder_decoder_2.png)

```python
def encoder(onehot):
    # Get word IDs from onehot vectors and return the IDs
    word_ids = np.argmax(onehot, axis=1)
    return word_ids

# Define "We like dogs" as words
words = "We like dogs".split(" ")
# Convert words to onehot vectors using words2onehot
onehot = words2onehot(words, word2index)
# Get the context vector by using the encoder function
context = encoder(onehot)
print(context)
```

# Complete text reversing model

You will now implement the decoder part of the text reversing model, which converts the context vector from the encoder to reversed words. You will define two functions, `onehot2words()` and `decoder()`.
The `onehot2words()` function takes in an array of one-hot vectors and a dictionary `index2word`, and converts the one-hot vectors to a list of words. The `decoder()` function takes in the context vector (i.e., the list of word IDs) and converts it to one-hot vectors in reversed order, which `onehot2words()` then turns into the reversed list of words. For this exercise, the `index2word` dictionary, the context vector `context`, and the `encoder()` and `words2onehot()` functions are provided.

```python
# Define the onehot2words function that returns words for a set of onehot vectors
def onehot2words(onehot, index2word):
    ids = np.argmax(onehot, axis=1)
    res = [index2word[i] for i in ids]
    return res

# Define the decoder function that returns reversed onehot vectors
def decoder(context_vector):
    word_ids_rev = context_vector[::-1]
    onehot_rev = to_categorical(word_ids_rev, num_classes=3)
    return onehot_rev

# Convert context to reversed onehot vectors using decoder
onehot_rev = decoder(context)
# Get the reversed words using the onehot2words function
reversed_words = onehot2words(onehot_rev, index2word)
print(reversed_words)
```

# Part 1: Understanding GRU models

GRU models can retain information over much longer sequences (reportedly up to thousands of time steps) than standard recurrent neural networks, which can usually remember fewer than a hundred time steps. Understanding GRU models is essential for using them effectively in machine translation models. In this exercise, you will implement a simple model that has an input layer and a GRU layer. You will then use the model to produce output values for a random input array. Don't be discouraged by the use of random data: the objective of this exercise is to understand the shape of the outputs produced by the GRU layer. In later chapters, you will feed actual sentences to GRU layers to perform translation.
```python
from tensorflow import keras
import numpy as np

# Define an input layer: batch size 2, 3 time steps, 4 features per step
inp = keras.layers.Input(batch_shape=(2, 3, 4))
# Define a GRU layer that takes in the input
gru_out = keras.layers.GRU(10)(inp)
# Define a model that outputs the GRU output
model = keras.models.Model(inputs=inp, outputs=gru_out)

x = np.random.normal(size=(2, 3, 4))
# Get the output of the model and print the result
y = model.predict(x)
print("shape (y) =", y.shape, "\ny = \n", y)
```

# Understanding sequential model output

In this exercise you will learn to use the `keras.layers.GRU` layer, which wraps the functionality of a GRU in a `Layer` object. You will explore what the shape of a GRU layer's output looks like and how it changes when different arguments are provided. It is rare to inspect the numerical vectors produced by a GRU in practice, but to use these layers in more complex models you need a good understanding of the output shapes and of which arguments produce the output you want. Here, `keras` and `numpy` (as `np`) are already loaded. You can access layers by calling `keras.layers.<Layer>` or a model by calling `keras.models.Model`.

```python
# Define the Input layer
inp = keras.layers.Input(batch_shape=(3, 20, 5))

# Define a GRU layer that takes in inp as the input
gru_out1 = keras.layers.GRU(10)(inp)
print("gru_out1.shape = ", gru_out1.shape)

# Define the second GRU and print the shape of the outputs
gru_out2, gru_state2 = keras.layers.GRU(10, return_state=True)(inp)
print("gru_out2.shape = ", gru_out2.shape)
print("gru_state.shape = ", gru_state2.shape)

# Define the third GRU layer which will return all the outputs
gru_out3 = keras.layers.GRU(10, return_sequences=True)(inp)
print("gru_out3.shape = ", gru_out3.shape)
```

Output:

![Output](https://i.ibb.co/CwC3LQG/Capture.png)
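To make the three output shapes above more concrete, here is a minimal NumPy sketch of a GRU-style recurrence. It uses one common textbook formulation with random weights and is not Keras's exact implementation (which differs in details such as bias handling). The function name `gru_forward` and its parameters are hypothetical. The point is only to show where the shapes come from: one hidden state of size `units` per sample at every time step.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def gru_forward(x, units, seed=0):
    """Run a toy GRU over x of shape (batch, timesteps, features).

    Returns (all_outputs, last_state). Illustrative only.
    """
    rng = np.random.default_rng(seed)
    batch, timesteps, features = x.shape
    # One input matrix W, recurrent matrix U and bias b per gate:
    # z = update gate, r = reset gate, h = candidate state
    W = {g: rng.normal(scale=0.1, size=(features, units)) for g in "zrh"}
    U = {g: rng.normal(scale=0.1, size=(units, units)) for g in "zrh"}
    b = {g: np.zeros(units) for g in "zrh"}

    h = np.zeros((batch, units))  # initial hidden state
    outputs = []
    for t in range(timesteps):
        x_t = x[:, t, :]
        z = sigmoid(x_t @ W["z"] + h @ U["z"] + b["z"])
        r = sigmoid(x_t @ W["r"] + h @ U["r"] + b["r"])
        h_cand = np.tanh(x_t @ W["h"] + (r * h) @ U["h"] + b["h"])
        # Blend the previous state with the candidate state
        h = z * h + (1.0 - z) * h_cand
        outputs.append(h)
    # Stack the per-step states along a new time axis
    all_outputs = np.stack(outputs, axis=1)
    return all_outputs, h

x = np.random.normal(size=(3, 20, 5))
all_outputs, last_state = gru_forward(x, units=10)

print(all_outputs.shape)            # (3, 20, 10): like return_sequences=True
print(all_outputs[:, -1, :].shape)  # (3, 10): like the default output
print(last_state.shape)             # (3, 10): like the state from return_state=True
```

In this sketch the last slice of the full sequence and the final state are the same array of values, which matches the identical shapes printed for `gru_out2` and `gru_state2` above.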

-dipeshpal, Oct. 24, 2020, 12:25 p.m.