Thanks to mmmetal and roserg! I did misunderstand the SIMDgroup usage; you are right.
By the way, anyone who intends to use this feature can refer to the implementation in TF-Lite:
https://github.com/alpa-projects/tensorflow-alpa/blob/ee8f6612b515ada4509fa53491c5ba5b3ef8524a/tensorflow/lite/delegates/gpu/common/tasks/conv_metal_simd.cc
Topic: Graphics & Games
SubTopic: General