Hi Apple team,
I tried to reproduce the results from the official TensorFlow text generation tutorial: https://www.tensorflow.org/text/tutorials/text_generation.
However, it seems that tensorflow-macos cannot properly train the GRU layer (or something related to it), because I cannot get reasonable results (evaluated on the training dataset) even after 200 epochs of training. The trained model just predicts nonsense characters. Here is my setup and output:
# -----------------------------------------------
# ----------- model definition -------------
# -----------------------------------------------
import tensorflow as tf

class MyModel(tf.keras.Model):
    def __init__(self, vocab_size, embedding_dim, rnn_units):
        super().__init__()
        self.embedding = tf.keras.layers.Embedding(vocab_size, embedding_dim)
        self.gru = tf.keras.layers.GRU(rnn_units,
                                       return_sequences=True,
                                       return_state=True)
        self.dense = tf.keras.layers.Dense(vocab_size)

    def call(self, inputs, states=None, return_state=False, training=False):
        x = inputs
        x = self.embedding(x, training=training)
        if states is None:
            states = self.gru.get_initial_state(x)
        x, states = self.gru(x, initial_state=states, training=training)
        x = self.dense(x, training=training)
        if return_state:
            return x, states
        else:
            return x
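For completeness, training follows the compile/fit steps from the tutorial; this is roughly what I ran (a minimal sketch, assuming the tutorial's `dataset` of (input, label) character-ID sequences and its default embedding_dim=256 / rnn_units=1024):
# -----------------------------------------------
# ----------- model training (sketch) -----------
# -----------------------------------------------
model = MyModel(
    vocab_size=len(ids_from_chars.get_vocabulary()),
    embedding_dim=256,
    rnn_units=1024)

loss = tf.losses.SparseCategoricalCrossentropy(from_logits=True)
model.compile(optimizer='adam', loss=loss)

history = model.fit(dataset, epochs=200)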
# -----------------------------------------------
# ----------- model training output -------------
# -----------------------------------------------
...
Epoch 200/200
172/172 [==============================] - 19s 108ms/step - loss: 0.0771
# -----------------------------------------------
# -------- model prediction helper class --------
# -----------------------------------------------
class OneStep(tf.keras.Model):
    def __init__(self, model, chars_from_ids, ids_from_chars, temperature=1.0):
        super().__init__()
        self.temperature = temperature
        self.model = model
        self.chars_from_ids = chars_from_ids
        self.ids_from_chars = ids_from_chars

        # Create a mask to prevent "[UNK]" from being generated.
        skip_ids = self.ids_from_chars(['[UNK]'])[:, None]
        sparse_mask = tf.SparseTensor(
            # Put a -inf at each bad index.
            values=[-float('inf')]*len(skip_ids),
            indices=skip_ids,
            # Match the shape to the vocabulary
            dense_shape=[len(ids_from_chars.get_vocabulary())])
        self.prediction_mask = tf.sparse.to_dense(sparse_mask)

    @tf.function
    def generate_one_step(self, inputs, states=None):
        # Convert strings to token IDs.
        input_chars = tf.strings.unicode_split(inputs, 'UTF-8')
        input_ids = self.ids_from_chars(input_chars).to_tensor()

        # Run the model.
        # predicted_logits.shape is [batch, char, next_char_logits]
        predicted_logits, states = self.model(inputs=input_ids, states=states,
                                              return_state=True)
        # Only use the last prediction.
        predicted_logits = predicted_logits[:, -1, :]
        predicted_logits = predicted_logits/self.temperature
        # Apply the prediction mask: prevent "[UNK]" from being generated.
        predicted_logits = predicted_logits + self.prediction_mask

        # Sample the output logits to generate token IDs.
        predicted_ids = tf.random.categorical(predicted_logits, num_samples=1)
        predicted_ids = tf.squeeze(predicted_ids, axis=-1)

        # Convert from token ids to characters.
        predicted_chars = self.chars_from_ids(predicted_ids)

        # Return the characters and model state.
        return predicted_chars, states
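For reference, I drive generation the same way the tutorial does; roughly (a minimal sketch, assuming the trained `model` and the `chars_from_ids` / `ids_from_chars` StringLookup layers from the notebook):
# -----------------------------------------------
# -------- text generation loop (sketch) --------
# -----------------------------------------------
one_step_model = OneStep(model, chars_from_ids, ids_from_chars)

states = None
next_char = tf.constant(['ROMEO:'])
result = [next_char]

for _ in range(1000):
    next_char, states = one_step_model.generate_one_step(next_char, states=states)
    result.append(next_char)

print(tf.strings.join(result)[0].numpy().decode('utf-8'))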
# -----------------------------------------------
# --- model prediction on training data output ---
# -----------------------------------------------
2022-01-15 06:11:42.090613: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:112] Plugin optimizer for device_type GPU is enabled.
2022-01-15 06:11:42.165460: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:112] Plugin optimizer for device_type GPU is enabled.
2022-01-15 06:11:42.223806: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:112] Plugin optimizer for device_type GPU is enabled.
2022-01-15 06:11:42.285768: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:112] Plugin optimizer for device_type GPU is enabled.
ROMEO:O:O:
:
:
:
Js s s s s s s s s s s s s s s s s s s s s s s s s hihininininininininininininingnglglgly
I also tested the same tutorial notebook on Colab, and there it gives reasonable predictions, like:
ROMEO:
Thy vaith is hure as foul with your estate with their hellish: it
will compound me lapging Neath,
To Laurh may die this content of her person?
...
For your information, I also tried the TensorFlow image tutorial on my Mac; surprisingly, that model trains normally and produces reasonable predictions.
So I think there may be an issue with the RNN (GRU) layers in tensorflow-macos?
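If it helps narrow this down, my next step is to re-run the same notebook with the GPU hidden, so the "Plugin optimizer for device_type GPU" path is not used; if training and sampling then behave like on Colab, that would point at the GPU path of the plugin. A minimal sketch (just my assumption about how to force a CPU-only run; tf.config.set_visible_devices is a standard TensorFlow API):
# -----------------------------------------------
# ------ force a CPU-only run for comparison ----
# -----------------------------------------------
import tensorflow as tf

# Hide the GPU before any ops are created so all kernels run on the CPU.
tf.config.set_visible_devices([], 'GPU')
print(tf.config.get_visible_devices())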
Spec.
MacBook Pro (16-inch, 2021)
Chip Apple M1 Max
Memory 64 GB