Post

Replies

Boosts

Views

Activity

Reply to Performance issue on Macbook Pro M1
October 2023 and the issue is still there -- after my upgrade to Sonoma OS I can't get my tensorflow metal to behave well with batch-size of 128 -- I used to run at 64 just fine (it was speedy) and now with higher batches I do see some (not great) performance improvements but the model overfits with large batch sizes. I have read the blogs for all sort of suggestions, reverting back to older version of TF for MAC (I don't want to do that). One suggestion I saw from some postings is to disable GPU alltogether -- anyone had any succces with that?
Topic: Machine Learning & AI SubTopic: General Tags:
Oct ’23
Reply to Tensorflow on M1 Macbook Pro, error when model fit executes
So - after downgrading to an older version of TensorFlow-Metal and TensorFlow-MacOs it works back in the desired speed (although a few annoying warning messages) conda install -c apple tensorflow-deps==2.9.0 python -m pip install tensorflow-macos==2.9.0 python -m pip install tensorflow-metal==0.5.0 That did the trick But the annoying warning messages are as follows - if anyone has an idea how to fix I am running on Mac OS Sonoma 819/819 [==============================] - ETA: 0s - loss: 0.7380 - auc: 0.7107 - prc: 0.63482023-09-28 22:20:35.837423: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:113] Plugin optimizer for device_type GPU is enabled. loc("mps_select"("(mpsFileLoc): /AppleInternal/Library/BuildRoots/75428952-3aa4-11ee-8b65-46d450270006/Library/Caches/com.apple.xbs/Sources/MetalPerformanceShadersGraph/mpsgraph/MetalPerformanceShadersGraph/Core/Files/MPSGraphUtilities.mm":294:0)): error: 'anec.gain_offset_control' op result #0 must be 4D/5D memref of 16-bit float or 8-bit signed integer or 8-bit unsigned integer values, but got 'memref<1x64x1x1xi1>' loc("mps_select"("(mpsFileLoc): /AppleInternal/Library/BuildRoots/75428952-3aa4-11ee-8b65-46d450270006/Library/Caches/com.apple.xbs/Sources/MetalPerformanceShadersGraph/mpsgraph/MetalPerformanceShadersGraph/Core/Files/MPSGraphUtilities.mm":294:0)): error: 'anec.gain_offset_control' op result #0 must be 4D/5D memref of 16-bit float or 8-bit signed integer or 8-bit unsigned integer values, but got 'memref<1x64x1x1xi1>' loc("mps_select"("(mpsFileLoc): /AppleInternal/Library/BuildRoots/75428952-3aa4-11ee-8b65-46d450270006/Library/Caches/com.apple.xbs/Sources/MetalPerformanceShadersGraph/mpsgraph/MetalPerformanceShadersGraph/Core/Files/MPSGraphUtilities.mm":294:0)): error: 'anec.gain_offset_control' op result #0 must be 4D/5D memref of 16-bit float or 8-bit signed integer or 8-bit unsigned integer values, but got 'memref<1x1x1x1999xi1>' loc("mps_select"("(mpsFileLoc): /AppleInternal/Library/BuildRoots/75428952-3aa4-11ee-8b65-46d450270006/Library/Caches/com.apple.xbs/Sources/MetalPerformanceShadersGraph/mpsgraph/MetalPerformanceShadersGraph/Core/Files/MPSGraphUtilities.mm":294:0)): error: 'anec.gain_offset_control' op result #0 must be 4D/5D memref of 16-bit float or 8-bit signed integer or 8-bit unsigned integer values, but got 'memref<1x40x1x1xi1>' loc("mps_select"("(mpsFileLoc): /AppleInternal/Library/BuildRoots/75428952-3aa4-11ee-8b65-46d450270006/Library/Caches/com.apple.xbs/Sources/MetalPerformanceShadersGraph/mpsgraph/MetalPerformanceShadersGraph/Core/Files/MPSGraphUtilities.mm":294:0)): error: 'anec.gain_offset_control' op result #0 must be 4D/5D memref of 16-bit float or 8-bit signed integer or 8-bit unsigned integer values, but got 'memref<1x40x1x1xi1>'
Topic: Machine Learning & AI SubTopic: General Tags:
Sep ’23
Reply to Tensorflow on M1 Macbook Pro, error when model fit executes
After downgrading to legacy version the speed came back and works again. I then tried to upgrade back to the latest libraries and plug iin and I am getting the following warnings: 2023-09-28 22:02:25.250960: I metal_plugin/src/device/metal_device.cc:1154] Metal device set to: Apple M1 Pro 2023-09-28 22:02:25.250990: I metal_plugin/src/device/metal_device.cc:296] systemMemory: 16.00 GB 2023-09-28 22:02:25.250999: I metal_plugin/src/device/metal_device.cc:313] maxCacheSize: 5.33 GB 2023-09-28 22:02:25.251066: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:303] Could not identify NUMA node of platform GPU ID 0, defaulting to 0. Your kernel may not have been built with NUMA support. 2023-09-28 22:02:25.251101: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:269] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 0 MB memory) -> physical PluggableDevice (device: 0, name: METAL, pci bus id: ) Every epoc -- or every Fit or Predict displays this line now: 2023-09-28 22:05:11.530865: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:114] Plugin optimizer for device_type GPU is enabled. Any idea what's going on and to get things back to optimal performance? Its is still about 5X the speed of the legacy versions
Topic: Machine Learning & AI SubTopic: General Tags:
Sep ’23
Reply to Tensorflow-metal runs extremely slow
It seems like there is no consensus as to how to resolve this. I have upgraded my OS to Sonoma on Mac - latest OS to date and it seems like my tensorflow needed to be updated along with all dependent libraries and at that point it runs EXTREMELY slow. I have been searching all over to find a solution but there isnt' one that I was able to find. Any help / direction from you would be greatly appreciated
Topic: Machine Learning & AI SubTopic: General Tags:
Sep ’23
Reply to Tensorflow on M1 Macbook Pro, error when model fit executes
I just upgraded to Sonoma the latest Mac OS and my fit method start giving me weird warning and errors and the training is very slow! Unusable 2023-09-28 10:50:29.393958: I metal_plugin/src/device/metal_device.cc:1154] Metal device set to: Apple M1 Pro 2023-09-28 10:50:29.394708: I metal_plugin/src/device/metal_device.cc:296] systemMemory: 16.00 GB 2023-09-28 10:50:29.394896: I metal_plugin/src/device/metal_device.cc:313] maxCacheSize: 5.33 GB 2023-09-28 10:50:29.395661: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:303] Could not identify NUMA node of platform GPU ID 0, defaulting to 0. Your kernel may not have been built with NUMA support. 2023-09-28 10:50:29.395694: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:269] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 0 MB memory) -> physical PluggableDevice (device: 0, name: METAL, pci bus id: ) Training a brand new model - [13, 23, 42, 70]-Long 2023-09-28 10:50:35.795666: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:114] Plugin optimizer for device_type GPU is enabled. 2023-09-28 10:50:35.803057: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:114] Plugin optimizer for device_type GPU is enabled. Epoch 1/10 2023-09-28 10:50:37.043038: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:114] Plugin optimizer for device_type GPU is enabled.
Topic: Machine Learning & AI SubTopic: General Tags:
Sep ’23