Cannot reproduce the official TensorFlow text tutorial
Hi Apple, I tried to reproduce the results of the official TensorFlow text tutorial at https://www.tensorflow.org/text/tutorials/text_generation, but it seems that tensorflow-macos cannot train the GRU layer properly: after 200 epochs of training I still cannot get a reasonable result, even when evaluating on the training dataset. The trained model just predicts nonsense characters. It looks like this:

# -----------------------------------------------
# ----------- model definition -------------
# -----------------------------------------------
class MyModel(tf.keras.Model):
  def __init__(self, vocab_size, embedding_dim, rnn_units):
    super().__init__(self)
    self.embedding = tf.keras.layers.Embedding(vocab_size, embedding_dim)
    self.gru = tf.keras.layers.GRU(rnn_units,
                                   return_sequences=True,
                                   return_state=True)
    self.dense = tf.keras.layers.Dense(vocab_size)

  def call(self, inputs, states=None, return_state=False, training=False):
    x = inputs
    x = self.embedding(x, training=training)
    if states is None:
      states = self.gru.get_initial_state(x)
    x, states = self.gru(x, initial_state=states, training=training)
    x = self.dense(x, training=training)

    if return_state:
      return x, states
    else:
      return x

# -----------------------------------------------
# ----------- model training output -------------
# -----------------------------------------------
...
Epoch 200/200
172/172 [==============================] - 19s 108ms/step - loss: 0.0771

# -----------------------------------------------
# -------- model prediction helper class --------
# -----------------------------------------------
class OneStep(tf.keras.Model):
  def __init__(self, model, chars_from_ids, ids_from_chars, temperature=1.0):
    super().__init__()
    self.temperature = temperature
    self.model = model
    self.chars_from_ids = chars_from_ids
    self.ids_from_chars = ids_from_chars

    # Create a mask to prevent "[UNK]" from being generated.
    skip_ids = self.ids_from_chars(['[UNK]'])[:, None]
    sparse_mask = tf.SparseTensor(
        # Put a -inf at each bad index.
        values=[-float('inf')]*len(skip_ids),
        indices=skip_ids,
        # Match the shape to the vocabulary
        dense_shape=[len(ids_from_chars.get_vocabulary())])
    self.prediction_mask = tf.sparse.to_dense(sparse_mask)

  @tf.function
  def generate_one_step(self, inputs, states=None):
    # Convert strings to token IDs.
    input_chars = tf.strings.unicode_split(inputs, 'UTF-8')
    input_ids = self.ids_from_chars(input_chars).to_tensor()

    # Run the model.
    # predicted_logits.shape is [batch, char, next_char_logits]
    predicted_logits, states = self.model(inputs=input_ids, states=states,
                                          return_state=True)
    # Only use the last prediction.
    predicted_logits = predicted_logits[:, -1, :]
    predicted_logits = predicted_logits/self.temperature
    # Apply the prediction mask: prevent "[UNK]" from being generated.
    predicted_logits = predicted_logits + self.prediction_mask

    # Sample the output logits to generate token IDs.
    predicted_ids = tf.random.categorical(predicted_logits, num_samples=1)
    predicted_ids = tf.squeeze(predicted_ids, axis=-1)

    # Convert from token ids to characters
    predicted_chars = self.chars_from_ids(predicted_ids)

    # Return the characters and model state.
    return predicted_chars, states

# -----------------------------------------------
# --- model prediction on training data output ---
# -----------------------------------------------
2022-01-15 06:11:42.090613: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:112] Plugin optimizer for device_type GPU is enabled.
2022-01-15 06:11:42.165460: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:112] Plugin optimizer for device_type GPU is enabled.
2022-01-15 06:11:42.223806: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:112] Plugin optimizer for device_type GPU is enabled.
2022-01-15 06:11:42.285768: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:112] Plugin optimizer for device_type GPU is enabled.
ROMEO:O:O: : : : Js s s s s s s s s s s s s s s s s s s s s s s s s hihininininininininininininingnglglgly

I also tested the tutorial notebook on Colab, which gives a reasonable prediction, like:

ROMEO: Thy vaith is hure as foul with your estate with their hellish: it will compound me lapging Neath, To Laurh may die this content of her person? ...

For your information, I also tried the TensorFlow image tutorial on my Mac. Surprisingly, that model trains normally and outputs reasonable predictions. So I think there may be an issue with the RNN module in tensorflow-macos?

Spec:
MacBook Pro (16-inch, 2021)
Chip: Apple M1 Max
Memory: 64 GB
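One way to narrow this down is to run the same GRU layer on the same input on both the CPU and the GPU and compare the outputs. The following is only a minimal sketch of that idea, not something taken from the tutorial: the function names `max_abs_diff` and `gru_cpu_vs_gpu` and all tensor sizes are hypothetical illustration values, and it only exercises the forward pass. A large difference would point at the GPU path, but a small one would not rule out a problem that only appears during training.

```python
def max_abs_diff(a, b):
    """Return the largest absolute element-wise difference of two flat sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return max((abs(x - y) for x, y in zip(a, b)), default=0.0)


def gru_cpu_vs_gpu(batch=4, steps=16, features=8, units=32, seed=0):
    """Run one GRU layer on identical input on CPU and GPU and compare outputs.

    All sizes are arbitrary illustration values (assumptions), not the
    tutorial's settings. Requires TensorFlow to be installed.
    """
    import tensorflow as tf  # imported lazily so max_abs_diff works without TF

    tf.random.set_seed(seed)
    x = tf.random.normal([batch, steps, features])

    with tf.device("/CPU:0"):
        gru = tf.keras.layers.GRU(units, return_sequences=True)
        cpu_out = gru(x)  # the first call builds the layer weights
    with tf.device("/GPU:0"):
        gpu_out = gru(x)  # same layer object, so the same weights

    # Flatten both outputs to plain Python lists for the comparison helper.
    return max_abs_diff(cpu_out.numpy().ravel().tolist(),
                        gpu_out.numpy().ravel().tolist())
```

Calling `gru_cpu_vs_gpu()` in the same environment as the tutorial returns a single number. Float32 rounding alone should give a very small value, so anything much larger would be worth including in the report.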
Jan ’22