Help! In TensorFlow, a CNN hangs when run on the GPU, and then the kernel crashes
import tensorflow as tf
import numpy as np
import os

os.environ['KMP_DUPLICATE_LIB_OK'] = "TRUE"

def getCNNmodel():
    input_layer = tf.keras.layers.Input(shape=(513, 32, 1), name="input")
    trunk_0 = tf.keras.layers.Conv2D(filters=3, kernel_size=3, activation='relu', padding='same', name="trunk0")(input_layer)
    trunk_0 = tf.keras.layers.MaxPooling2D(pool_size=(2, 2))(trunk_0)
    trunk_0 = tf.keras.layers.Flatten()(trunk_0)
    branch_1 = tf.keras.layers.Dense(units=128, activation='relu', name="branch1_1")(trunk_0)
    branch_1 = tf.keras.layers.Dense(units=10, activation='softmax', name="branch1_2")(branch_1)
    #branch_2 = tf.keras.layers.Dense(units=128, activation='relu', name="branch2_1")(trunk_0)
    #branch_2 = tf.keras.layers.Dense(units=10, activation='softmax', name="branch2_2")(branch_2)
    #merged_output = tf.keras.layers.concatenate([branch_1, branch_2], name="merged")
    model = tf.keras.models.Model(inputs=input_layer, outputs=branch_1, name="modelForCFF")
    # compile the model
    model.compile(optimizer='adam',
                  loss='mse',
                  metrics=['accuracy'])
    return model
def getDNNmodel():
    # note: shape must be a tuple, so (513,) rather than (513)
    input_layer = tf.keras.layers.Input(shape=(513,), name="input")
    branch_1 = tf.keras.layers.Dense(units=128, activation='relu', name="branch1_1")(input_layer)
    branch_1 = tf.keras.layers.Dense(units=10, activation='softmax', name="branch1_2")(branch_1)
    #branch_2 = tf.keras.layers.Dense(units=128, activation='relu', name="branch2_1")(trunk_0)
    #branch_2 = tf.keras.layers.Dense(units=10, activation='softmax', name="branch2_2")(branch_2)
    #merged_output = tf.keras.layers.concatenate([branch_1, branch_2], name="merged")
    model = tf.keras.models.Model(inputs=input_layer, outputs=branch_1, name="modelForCFF")
    # compile the model
    model.compile(optimizer='adam',
                  loss='mse',
                  metrics=['accuracy'])
    return model

#os.environ['KMP_DUPLICATE_LIB_OK'] = "TRUE"
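As an aside, a common mitigation when TensorFlow hangs or crashes while initializing cuDNN on the GPU is to enable memory growth, so TensorFlow allocates GPU memory on demand instead of grabbing it all at startup. This is a generic configuration sketch, not the fix the author ultimately used (see the link at the end of the post):

```python
import tensorflow as tf

# Generic mitigation sketch (assumption, not the author's final fix):
# enable on-demand GPU memory allocation before building any model.
gpus = tf.config.list_physical_devices('GPU')
for gpu in gpus:
    tf.config.experimental.set_memory_growth(gpu, True)
print("GPUs visible:", gpus)
```

This must run before any op touches the GPU, which is why it belongs right after the import.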
The model is defined above. I then run:

modelForCFF = getCNNmodel()
X_train = np.zeros([1, 513, 32, 1])
Y_train = np.zeros([100, 10])
modelForCFF.summary()
#modelForCFF.fit(X_train, Y_train, epochs=10, verbose=1)
with tf.device('/GPU:0'):
    a = modelForCFF.predict(X_train)

The summary prints successfully:

Model: "modelForCFF"
_________________________________________________________________
 Layer (type)                 Output Shape              Param #
=================================================================
 input (InputLayer)           [(None, 513, 32, 1)]      0
 trunk0 (Conv2D)              (None, 513, 32, 3)        30
 max_pooling2d (MaxPooling2D) (None, 256, 16, 3)        0
 flatten (Flatten)            (None, 12288)              0
 branch1_1 (Dense)            (None, 128)                1572992
 branch1_2 (Dense)            (None, 10)                 1290
=================================================================
Total params: 1,574,312
Trainable params: 1,574,312
Non-trainable params: 0
_________________________________________________________________

but then predict() hangs. However, if I change it to:

with tf.device('/CPU:0'):
    a = modelForCFF.predict(X_train)

it runs to completion, printing the same summary as above followed by:

1/1 [==============================] - 0s 129ms/step

I tried the following, none of which helped:
1. Creating a fresh environment and reinstalling TensorFlow 2.10, to rule out package conflicts.
2. Updating the GPU driver.
3. Checking the CUDA/cuDNN installation (11.5 was installed correctly), then uninstalling it and installing a different version (11.6) that matches the driver requirements.

The CNN still hangs on the GPU and the kernel dies, while the same code runs fine on the CPU.

The problem is now solved; for the solution, see this link: 解决Loaded cuDNN version 8400 Could not load library cudnn_cnn_infer64_8.dll. 问题_—Xi—的博客-CSDN博客
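The parameter counts in the summary above can be verified by hand; a quick sketch of the arithmetic:

```python
# Conv2D "trunk0": 3 filters, each a 3x3 kernel over 1 input channel, plus 3 biases
conv_params = 3 * (3 * 3 * 1) + 3      # 30

# MaxPooling2D(2, 2) floor-divides 513x32 down to 256x16, keeping 3 channels;
# Flatten then produces a vector of this length
flat = (513 // 2) * (32 // 2) * 3      # 12288

# Dense "branch1_1": 12288 inputs -> 128 units, plus 128 biases
d1 = flat * 128 + 128                  # 1572992

# Dense "branch1_2": 128 inputs -> 10 units, plus 10 biases
d2 = 128 * 10 + 10                     # 1290

total = conv_params + d1 + d2
print(total)                           # 1574312, matching "Total params: 1,574,312"
```

The Conv2D layer contributes almost nothing; nearly all of the 1.57M parameters sit in the first Dense layer after Flatten.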