Comment on Tensorflow MobileNetV3Small model not training on custom image classification task
@venkatg I ran the unmodified "CIFAR10 dataset" command from the unofficial MobileNetV3 implementation you mentioned, both with and without tensorflow-metal. I only modified the code slightly for TensorFlow 2 compatibility. On my system, the run without tensorflow-metal looked good (declining loss, increasing accuracy). The run with tensorflow-metal definitely did not look good: the loss was exactly constant and the accuracy roughly constant over the epochs. There currently seem to be some serious issues with tensorflow-metal v0.3.
Topic: Graphics & Games SubTopic: General Tags:
Dec ’21
Comment on Tensorflow MobileNetV3Small model not training on custom image classification task
@venkatg I can confirm that training with metal works as expected for some operations and not for others. In your example it might be the "inception" or the "conv padded" operations, which, at a quick glance, could be the delta between inception v3 and the CNN example you provided (see https://arxiv.org/pdf/1512.00567.pdf, Table 1). In my case (https://developer.apple.com/forums/thread/696474) it could, for example, be the LSTM operation (see Table 1 here: https://arxiv.org/pdf/1507.05717.pdf). Of course, all this is highly speculative. Maybe the best (but time-consuming) way to pin down the issue would be to write per-operator test cases and run them with and without metal. I hope Apple is doing this anyway. One thing we know for sure is that there is an issue with the random number generator in metal.
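A per-operator test along these lines could be sketched as follows. This is only a rough sketch, not anything Apple or TensorFlow ships: the helper names `outputs_agree` and `compare_op_on_devices` and the tolerance are made up for illustration, and it assumes the metal plugin exposes its device as `/GPU:0`.

```python
import math
from typing import Callable, Sequence


def outputs_agree(reference: Sequence[float],
                  candidate: Sequence[float],
                  tol: float = 1e-4) -> bool:
    """Return True if both flat result lists match element-wise within tol.

    A length mismatch or any NaN counts as a disagreement.
    """
    if len(reference) != len(candidate):
        return False
    return all(
        math.isclose(r, c, rel_tol=0.0, abs_tol=tol)
        for r, c in zip(reference, candidate)
    )


def compare_op_on_devices(op: Callable, inputs: Sequence, tol: float = 1e-4) -> bool:
    """Run one operator on the CPU and on the GPU and compare the results.

    Hypothetical helper: assumes TensorFlow is installed and that the
    metal plugin registers its device as "/GPU:0".
    """
    import tensorflow as tf  # imported lazily so the helper above works without it

    with tf.device("/CPU:0"):
        cpu_out = op(*[tf.constant(x) for x in inputs])
    with tf.device("/GPU:0"):
        gpu_out = op(*[tf.constant(x) for x in inputs])

    # Flatten both results to plain Python lists before comparing.
    cpu_flat = tf.reshape(cpu_out, [-1]).numpy().tolist()
    gpu_flat = tf.reshape(gpu_out, [-1]).numpy().tolist()
    return outputs_agree(cpu_flat, gpu_flat, tol)
```

With TensorFlow imported as `tf`, a call such as `compare_op_on_devices(tf.nn.relu, [[-1.0, 2.0]])` would cover one operator; repeating this for each layer type used in the model might narrow down which operation diverges.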
Topic: Graphics & Games SubTopic: General Tags:
Dec ’21
Comment on Wrong results when using tensor flow-metal
Thanks for looking into it. The issue persists after upgrading to macOS 12.1. I am not sure whether this is related, but random number generation also seems to have an issue if and only if tensorflow-metal is installed. The test is described in this thread: https://developer.apple.com/forums/thread/696835. Repeated calls to e.g. tf.random.uniform((10,)) produce exactly the same results, but only if tensorflow-metal is installed.
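For reference, a minimal sketch of that kind of check is shown below. The helper names `draws_repeat` and `check_tensorflow_rng` are my own for illustration and are not taken from the linked thread; the only TensorFlow call used is `tf.random.uniform`.

```python
from typing import Callable, Sequence


def draws_repeat(draw: Callable[[], Sequence[float]], n_calls: int = 5) -> bool:
    """Return True if n_calls consecutive draws all return identical values."""
    if n_calls < 2:
        raise ValueError("need at least two calls to compare draws")
    first = list(draw())
    return all(list(draw()) == first for _ in range(n_calls - 1))


def check_tensorflow_rng() -> bool:
    """Hypothetical wrapper: True means tf.random.uniform looks stuck.

    Assumes TensorFlow is installed; run it once with and once without
    tensorflow-metal in the environment to compare.
    """
    import tensorflow as tf  # imported lazily so draws_repeat works without it

    return draws_repeat(lambda: tf.random.uniform((10,)).numpy().tolist())
```

A healthy generator should make `check_tensorflow_rng()` return `False`; a return value of `True` would match the behaviour described above.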
Topic: Machine Learning & AI SubTopic: General Tags:
Dec ’21