I have pinpointed the issue further. The core of the problem is the dimensionality of the weight vector provided to the dataset. In my non-M1 implementations it was (None,) and worked well; on M1 I need to change it to (None, 1). That said, this only happens when:
- we work with 3-dimensional data (possibly only the output needs to be 3-dimensional, i.e. the input can be of any dimension -- but I did not test that; possibly the weight dimensionality must be increased further as the dimensionality of the data increases -- also not tested), and
- we use either a custom loss function or wrap the built-in one in a custom loss wrapper (a class-based and a def-based wrapper have the same effect).
The odd part is that both my initial explorations and my research code run fine on the M1 CPU without any modification, while the code below fails under the above conditions on both the M1 CPU and GPU. I have not investigated further.
I also tried TF 2.8 and saw the same behavior.
Thanks for looking into this. The expected resolution is either aligning the behavior across environments, or investigating the required structure of the weight vector further and updating the documentation.
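For reference, the change that makes my code run on M1 is simply giving the weight vector an explicit trailing dimension (the same thing the dw_type == 2 branch in the reproduction below does). A minimal sketch, with n_samples standing in for x_train.shape[0]:

import numpy as np

n_samples = 100                          # placeholder for x_train.shape[0]
weight = np.ones(n_samples)              # shape (N,)   -- works in my non-M1 environments
weight = np.expand_dims(weight, axis=1)  # shape (N, 1) -- needed on M1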
Here is the full reproduction code:
# TEST CONDITIONS
# breaking combinations, in the order of the flags below
# (dataset_weight, dw_type, data_shape, gpu, loss): 1,1,3,1,1 and 1,1,3,1,2
dataset_weight = 1  # 0 = no weights in the dataset, 1 = pass weights into the dataset
dw_type = 1         # 1 = 1-D weight vector (N,), 2 = 2-D weight vector (N, 1)
data_shape = 3      # 2 = two-dimensional data, 3 = three-dimensional data
gpu = 1             # 0 = hide the GPU, 1 = allow the GPU
loss = 1            # 0 = built-in loss, 1 = custom loss, 2 = pseudo-custom loss (wrapped built-in)
import numpy as np
import pandas as pd
import sys
"""
if 'tensorflow' in sys.modules:
print("tensorflow uploaded")
del sys.modules["tensorflow"]
del tf
import tensorflow as tf
else:
print("tensorflow not uploaded")
import tensorflow as tf
if gpu == 1:
pass
else:
tf.config.set_visible_devices([], 'GPU')
#print("GPUs:", tf.config.list_physical_devices('GPU'))
print("GPUs:", tf.config.list_logical_devices('GPU'))
#print("CPUs:", tf.config.list_physical_devices('CPU'))
print("CPUs:", tf.config.list_logical_devices('CPU'))
"""
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras import backend as K
import tensorflow as tf
print("TensorFlow version:", tf.version)
batch = 128
url = 'http://archive.ics.uci.edu/ml/machine-learning-databases/auto-mpg/auto-mpg.data'
column_names = ['MPG', 'Displacement', 'Horsepower', 'Weight']
dataset = pd.read_csv(url, names=column_names,
                      na_values='?', comment='\t',
                      sep=' ', skipinitialspace=True).dropna()
if data_shape == 2:
    x_train = np.array(dataset[['Horsepower', 'Weight']]).reshape(-1, 2)
    y_train = np.array(dataset[['MPG', 'Displacement']]).reshape(-1, 2)
else:
    # pack pairs of rows into a third axis to get 3-D samples of shape (2, 2)
    x_train = np.array(dataset[['Horsepower', 'Weight']]).reshape(-1, 2, 2)
    y_train = np.array(dataset[['MPG', 'Displacement']]).reshape(-1, 2, 2)
if dw_type == 2:
    weight = np.expand_dims(np.ones(x_train.shape[0]), axis=1)  # shape (N, 1)
else:
    weight = np.ones(x_train.shape[0])  # shape (N,)
#print(dataset)
print(x_train.shape)
print(y_train.shape)
print(weight.shape)
if dataset_weight == 0:
    train_data = (tf.data.Dataset.from_tensor_slices((x_train, y_train))
                  .cache().shuffle(x_train.shape[0]).batch(batch)
                  .repeat().prefetch(tf.data.experimental.AUTOTUNE))
else:
    train_data = (tf.data.Dataset.from_tensor_slices((x_train, y_train, weight))
                  .cache().shuffle(x_train.shape[0]).batch(batch)
                  .repeat().prefetch(tf.data.experimental.AUTOTUNE))
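# Illustrative check (optional): element_spec shows the per-batch shapes Keras
# receives; with dw_type == 2 the weight component is (None, 1) rather than (None,).
print(train_data.element_spec)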
model = Sequential([
    Dense(64, activation='relu'),
    Dense(32, activation='relu'),
    Dense(2)])
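# Note: Dense layers act on the last axis only, so with data_shape == 3 the
# 3-D input (None, 2, 2) yields a 3-D output (None, 2, 2), which is what makes
# the required weight shape relevant here.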
loss_tf = tf.keras.losses.MeanSquaredError()
def custom_loss(y_true, y_pred):
    # root-mean-squared error built from Keras backend ops
    error = y_true - y_pred
    sqr_error = K.square(error)
    mean_sqr_error = K.mean(sqr_error)
    sqrt_mean_sqr_error = K.sqrt(mean_sqr_error)
    return sqrt_mean_sqr_error

def pseudo_custom_loss(y_true, y_pred):
    # thin wrapper around the built-in MSE loss
    return loss_tf(y_true, y_pred)
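# Illustrative sanity check (optional): both losses evaluate fine eagerly on
# toy tensors; note custom_loss is RMSE while pseudo_custom_loss is MSE.
_yt = tf.constant([[1.0, 2.0], [3.0, 4.0]])
_yp = tf.constant([[1.5, 2.5], [3.5, 4.5]])
print(custom_loss(_yt, _yp).numpy())         # 0.5  (RMSE)
print(pseudo_custom_loss(_yt, _yp).numpy())  # 0.25 (MSE)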
if loss == 0:
    model.compile(optimizer='adam', loss=loss_tf, run_eagerly=True)
elif loss == 1:
    model.compile(optimizer='adam', loss=custom_loss, run_eagerly=True)
else:
    model.compile(optimizer='adam', loss=pseudo_custom_loss, run_eagerly=True)
model.fit(train_data, epochs=2, steps_per_epoch=3)
model.summary()  # summary() prints directly; wrapping it in print() also prints None
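As a possible cross-check (not something I have run), the same weights could be passed to fit() directly, bypassing the tf.data pipeline, to see whether the failure is specific to dataset-supplied weights:

model.fit(x_train, y_train, sample_weight=weight,
          batch_size=batch, epochs=2)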