Post

Replies

Boosts

Views

Created

error during training Mask RCNN TF2
I cloned the repository (https://github.com/akTwelve/Mask_RCNN) then I executed a training script then it gives me an error: Does it look like TensorFlow cannot find GPU? How to resolve the problem? And the training process is not working at all. M1 pro, tensorflow==2.10.0, keras==2.10.0 Metal device set to: Apple M1 Pro systemMemory: 32.00 GB maxCacheSize: 10.67 GB 2022-09-23 14:20:01.478879: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:306] Could not identify NUMA node of platform GPU ID 0, defaulting to 0. Your kernel may not have been built with NUMA support. 2022-09-23 14:20:01.479476: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:272] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 0 MB memory) -> physical PluggableDevice (device: 0, name: METAL, pci bus id: <undefined>) 2022-09-23 14:20:01.547229: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:354] MLIR V1 optimization pass is not enabled 2022-09-23 14:20:01.566461: W tensorflow/core/common_runtime/forward_type_inference.cc:332] Type inference failed. This indicates an invalid graph that escaped type checking. Error message: INVALID_ARGUMENT: expected compatible input types, but input 1: type_id: TFT_OPTIONAL args { type_id: TFT_PRODUCT args { type_id: TFT_TENSOR args { type_id: TFT_FLOAT } } } is neither a subtype nor a supertype of the combined inputs preceding it: type_id: TFT_OPTIONAL args { type_id: TFT_PRODUCT args { type_id: TFT_TENSOR args { type_id: TFT_INT32 } } } while inferring type of node 'proposal_targets/cond/output/_16' 2022-09-23 14:20:01.603163: W tensorflow/core/platform/profile_utils/cpu_utils.cc:128] Failed to get CPU frequency: 0 Hz 2022-09-23 14:20:01.618797: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:114] Plugin optimizer for device_type GPU is enabled. Starting at epoch 0. LR=0.001 Checkpoint Path: ./kangaroo_cfg20220923T1420/mask_rcnn_kangaroo_cfg_{epoch:04d}.h5 Selecting layers to train fpn_c5p5 (Conv2D) fpn_c4p4 (Conv2D) fpn_c3p3 (Conv2D) fpn_c2p2 (Conv2D) fpn_p5 (Conv2D) fpn_p2 (Conv2D) fpn_p3 (Conv2D) fpn_p4 (Conv2D) rpn_model (Functional) mrcnn_mask_conv1 (TimeDistributed) mrcnn_mask_bn1 (TimeDistributed) mrcnn_mask_conv2 (TimeDistributed) mrcnn_mask_bn2 (TimeDistributed) mrcnn_class_conv1 (TimeDistributed) mrcnn_class_bn1 (TimeDistributed) mrcnn_mask_conv3 (TimeDistributed) mrcnn_mask_bn3 (TimeDistributed) mrcnn_class_conv2 (TimeDistributed) mrcnn_class_bn2 (TimeDistributed) mrcnn_mask_conv4 (TimeDistributed) mrcnn_mask_bn4 (TimeDistributed) mrcnn_bbox_fc (TimeDistributed) mrcnn_mask_deconv (TimeDistributed) mrcnn_class_logits (TimeDistributed) mrcnn_mask (TimeDistributed) The full error message is in the text file (attached) mask_rcnn_error_message.txt
0
0
1.4k
Sep ’22