Silly question, but does anyone know where I can find the code for the demos in this video?
https://developer.apple.com/videos/play/wwdc2022/10063/
I am trying to replicate the distributed training demo but am running into version errors with Horovod.
Hi, I am trying to write the SWIN Transformer for Image Classification using MPSGraph. The code essentially runs two nested for-loops.
At one point inside the inner loop, a matrix multiplication is carried out. In total, this operation runs 12 times; the first 10 times it gives an accurate answer, but the 11th time the answer is completely inaccurate.
I even tried saving the tensors as bytes just before the 11th matrix multiplication and loading them into a separate project; there the answer was accurate.
I thought this was a memory issue because I have an M1 with just 8 GB of RAM, but I also tried running my code on an M1 Ultra Studio and the result was the same.
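In case anyone wants to try to reproduce this, below is a stripped-down, self-contained sketch of the kind of MPSGraph matrix multiplication the loop performs. The 4x4 shapes and the ramp data are placeholders I picked for illustration, not the actual SWIN tensors:
import Foundation
import MetalPerformanceShadersGraph
let graph = MPSGraph()
let device = MTLCreateSystemDefaultDevice()!
// Placeholder shapes standing in for one of the SWIN matmuls (not the real sizes).
let shape: [NSNumber] = [4, 4]
let a = graph.placeholder(shape: shape, dataType: .float32, name: nil)
let b = graph.placeholder(shape: shape, dataType: .float32, name: nil)
let product = graph.matrixMultiplication(primary: a, secondary: b, name: nil)
// Simple ramp data so the expected result is easy to check by hand.
var aValues: [Float] = (0..<16).map { Float($0) }
var bValues: [Float] = (0..<16).map { Float($0) }
let aArray = MPSNDArray(device: device, descriptor: MPSNDArrayDescriptor(dataType: .float32, shape: shape))
aArray.writeBytes(&aValues, strideBytes: nil)
let bArray = MPSNDArray(device: device, descriptor: MPSNDArrayDescriptor(dataType: .float32, shape: shape))
bArray.writeBytes(&bValues, strideBytes: nil)
let results = graph.run(feeds: [a: MPSGraphTensorData(aArray),
                                b: MPSGraphTensorData(bArray)],
                        targetTensors: [product],
                        targetOperations: nil)
var output = [Float](repeating: 0, count: 16)
results[product]!.mpsndarray().readBytes(&output, strideBytes: nil)
print("Output: \(output)")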
I am trying to run distributed training with TF-Metal and two M1 Ultra Studios.
After connecting them via a Thunderbolt cable, I can see the other device listed in "About This Mac", but TensorFlow doesn't pick up its GPU.
Command I am using:
mirrored_strategy = tf.distribute.MirroredStrategy()
Output:
INFO:tensorflow:Using MirroredStrategy with devices ('/job:localhost/replica:0/task:0/device:GPU:0',)
TF-Metal version: 2.9.2
I am trying to implement a gather operation similar to TensorFlow's; my goal is to select multiple columns/rows along a chosen axis.
But I am getting errors and the result isn't the same. I also can't understand the difference between the three kinds of gather operations, since the documentation is so sparse.
TF code:
import tensorflow as tf
tensor = tf.constant([[1, 2, 3, 4], [5, 6, 7, 8]], dtype=tf.float32)
columns = [1, 3]
print(tf.gather(tensor, columns, axis=1).numpy())
# [[2. 4.]
# [6. 8.]]
Swift code:
import Foundation
import MetalPerformanceShadersGraph
let graph = MPSGraph()
let device = MTLCreateSystemDefaultDevice()!
let b = 2
let w = 4
let inputShape = [NSNumber(value: b), NSNumber(value: w)]
let inputTensor = graph.placeholder(shape: inputShape, dataType: .float32, name: nil)
let desc = MPSNDArrayDescriptor(dataType: .float32, shape: inputShape)
let inputNDArray = MPSNDArray(device: device, descriptor: desc)
var inputValues: [Float] = []
for i in 1...b*w {
    inputValues.append(Float(i))
}
print("Input: \(inputValues)")
print("Input Shape: \(inputShape)")
inputNDArray.writeBytes(&inputValues, strideBytes: nil)
let inputs = MPSGraphTensorData(inputNDArray)
var indices: [Int32] = [1, 3]
var remainingIn: [Int32] = [2, 3, 4]
let indicesTensor = graph.constant(Data(bytes: &indices, count: indices.count * 4), shape: [2, 4], dataType: .int32)
print("Indices Shape: \(indicesTensor.shape)")
let gather = graph.gatherAlongAxis(0, updates: inputTensor, indices: indicesTensor, name: nil)
//let gather = graph.gather(withUpdatesTensor: inputTensor, indicesTensor: indicesTensor, axis: 1, batchDimensions: 0, name: nil)
//let gather = graph.gatherND(withUpdatesTensor: inputTensor, indicesTensor: indicesTensor, batchDimensions: 0, name: nil)
print("Gather: \(gather.shape)")
let results = graph.run(feeds: [inputTensor: inputs],
                        targetTensors: [gather],
                        targetOperations: nil)
let outputNDArray = results[gather]!.mpsndarray()
var outputValues: [Float32] = .init(repeating: 0, count: Int(truncating: gather.shape![0]) * Int(truncating: gather.shape![1]))
outputNDArray.readBytes(&outputValues, strideBytes: nil)
print("Output: \(outputValues)")
//Output: [5.0, 0.0]
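My current reading of the three variants (which may well be wrong, given how sparse the documentation is) is: gatherAlongAxis expects an indices tensor of the same rank as the updates and returns a result shaped like the indices; gatherND treats the last dimension of the indices as coordinates into the leading dimensions of the updates; and the plain gather(withUpdatesTensor:...) with batchDimensions: 0 is the one that should mirror tf.gather. Here is a standalone sketch of what I would expect the tf.gather equivalent to look like, using a rank-1 indices tensor of shape [2] instead of the [2, 4] one above:
import Foundation
import MetalPerformanceShadersGraph
let graph = MPSGraph()
let device = MTLCreateSystemDefaultDevice()!
let inputShape: [NSNumber] = [2, 4]
let inputTensor = graph.placeholder(shape: inputShape, dataType: .float32, name: nil)
// Input is [[1, 2, 3, 4], [5, 6, 7, 8]], matching the TF example.
var inputValues: [Float] = (1...8).map { Float($0) }
let inputNDArray = MPSNDArray(device: device, descriptor: MPSNDArrayDescriptor(dataType: .float32, shape: inputShape))
inputNDArray.writeBytes(&inputValues, strideBytes: nil)
// Rank-1 indices of shape [2]: pick columns 1 and 3.
var indices: [Int32] = [1, 3]
let indicesTensor = graph.constant(Data(bytes: &indices, count: indices.count * MemoryLayout<Int32>.size),
                                   shape: [2], dataType: .int32)
let gathered = graph.gather(withUpdatesTensor: inputTensor,
                            indicesTensor: indicesTensor,
                            axis: 1,
                            batchDimensions: 0,
                            name: nil)
let results = graph.run(feeds: [inputTensor: MPSGraphTensorData(inputNDArray)],
                        targetTensors: [gathered],
                        targetOperations: nil)
var outputValues = [Float](repeating: 0, count: 4)
results[gathered]!.mpsndarray().readBytes(&outputValues, strideBytes: nil)
print("Output: \(outputValues)") // hoping for [2.0, 4.0, 6.0, 8.0]
Is that the intended mapping, or should one of the other two variants be used here?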