El entrenamiento después de 15 épocas en el conjunto de datos CIFAR-10 parece hacer que la pérdida de validación ya no disminuya, permaneciendo alrededor de 1.4 (con una precisión de validación del 60%). He barajado el conjunto de entrenamiento, dividido entre 255 e importado como float32. He probado numerosas arquitecturas, con y sin abandono en las capas Conv2D y nada parece funcionar. La misma arquitectura logra una precisión del 99.7% en los conjuntos de prueba para MNIST. Por favor vea la arquitectura a continuación:
(Nota: he intentado aumentar el abandono y aumentar / disminuir la tasa de aprendizaje del optimizador Adam para evitar el sobreajuste, todo esto es evitar el sobreajuste, pero ahora tanto el entrenamiento como el conjunto de pruebas tienen una precisión baja similar alrededor del 60%).
with tf.device('/gpu:0'):
tf.placeholder(tf.float32, shape=(None, 20, 64))
#placeholder initialized (pick /cpu:0 or /gpu:0)
seed = 6
np.random.seed(seed)
modelnn = Sequential()
neurons = x_train_reduced.shape[1:]
modelnn.add(Convolution2D(32, 3, 3, input_shape=neurons, activation='relu', border_mode='same'))
modelnn.add(Convolution2D(32, 3, 3, activation='relu', border_mode='same'))
modelnn.add(MaxPooling2D(pool_size=(2, 2)))
modelnn.add(Dropout(0.2))
modelnn.add(Convolution2D(64, 3, 3, activation='relu', border_mode='same'))
modelnn.add(Convolution2D(64, 3, 3, activation='relu', border_mode='same'))
modelnn.add(MaxPooling2D(pool_size=(2, 2)))
modelnn.add(Dropout(0.2))
modelnn.add(Convolution2D(128, 3, 3, activation='relu', border_mode='same'))
modelnn.add(Convolution2D(128, 3, 3, activation='relu', border_mode='same'))
modelnn.add(MaxPooling2D(pool_size=(2, 2)))
modelnn.add(Dropout(0.2))
#modelnn.add(Convolution2D(256, 3, 3, activation='relu', border_mode='same'))
#modelnn.add(Convolution2D(256, 3, 3, activation='relu', border_mode='same'))
#modelnn.add(MaxPooling2D(pool_size=(2, 2)))
modelnn.add(Flatten())
#modelnn.add(Dropout(0.5))
modelnn.add(Dense(1024, activation='relu', W_constraint=maxnorm(3)))
modelnn.add(Dropout(0.5))
modelnn.add(Dense(512, activation='relu', W_constraint=maxnorm(3)))
modelnn.add(Dropout(0.5))
modelnn.add(Dense(10, activation='softmax'))
modelnn.compile(loss='categorical_crossentropy', optimizer=optimizer_input, metrics=['accuracy'])
y_train = to_categorical(y_train)
modelnn.fit(x_train_reduced, y_train, nb_epoch=nb_epoch_count, shuffle=True, batch_size=bsize,
validation_split=0.1)
Resultados:
44100/44100 [==============================] - 22s - loss: 2.1453 - acc: 0.2010 - val_loss: 1.9812 - val_acc: 0.2959
Epoch 2/50
44100/44100 [==============================] - 24s - loss: 1.9486 - acc: 0.3089 - val_loss: 1.8685 - val_acc: 0.3567
Epoch 3/50
44100/44100 [==============================] - 18s - loss: 1.8599 - acc: 0.3575 - val_loss: 1.7822 - val_acc: 0.3982
Epoch 4/50
44100/44100 [==============================] - 18s - loss: 1.7925 - acc: 0.3933 - val_loss: 1.7272 - val_acc: 0.4229
Epoch 5/50
44100/44100 [==============================] - 18s - loss: 1.7425 - acc: 0.4195 - val_loss: 1.6806 - val_acc: 0.4459
Epoch 6/50
44100/44100 [==============================] - 18s - loss: 1.6998 - acc: 0.4440 - val_loss: 1.6436 - val_acc: 0.4682
Epoch 7/50
44100/44100 [==============================] - 18s - loss: 1.6636 - acc: 0.4603 - val_loss: 1.6156 - val_acc: 0.4837
Epoch 8/50
44100/44100 [==============================] - 18s - loss: 1.6333 - acc: 0.4781 - val_loss: 1.6351 - val_acc: 0.4776
Epoch 9/50
44100/44100 [==============================] - 18s - loss: 1.6086 - acc: 0.4898 - val_loss: 1.5732 - val_acc: 0.5063
Epoch 10/50
44100/44100 [==============================] - 18s - loss: 1.5776 - acc: 0.5065 - val_loss: 1.5411 - val_acc: 0.5227
Epoch 11/50
44100/44100 [==============================] - 18s - loss: 1.5585 - acc: 0.5145 - val_loss: 1.5485 - val_acc: 0.5212
Epoch 12/50
44100/44100 [==============================] - 18s - loss: 1.5321 - acc: 0.5288 - val_loss: 1.5354 - val_acc: 0.5316
Epoch 13/50
44100/44100 [==============================] - 18s - loss: 1.5082 - acc: 0.5402 - val_loss: 1.5022 - val_acc: 0.5427
Epoch 14/50
44100/44100 [==============================] - 18s - loss: 1.4945 - acc: 0.5438 - val_loss: 1.4916 - val_acc: 0.5490
Epoch 15/50
44100/44100 [==============================] - 192s - loss: 1.4762 - acc: 0.5535 - val_loss: 1.5159 - val_acc: 0.5394
Epoch 16/50
44100/44100 [==============================] - 18s - loss: 1.4577 - acc: 0.5620 - val_loss: 1.5389 - val_acc: 0.5257
Epoch 17/50
44100/44100 [==============================] - 18s - loss: 1.4425 - acc: 0.5671 - val_loss: 1.4590 - val_acc: 0.5667
Epoch 18/50
44100/44100 [==============================] - 18s - loss: 1.4258 - acc: 0.5766 - val_loss: 1.4552 - val_acc: 0.5763
Epoch 19/50
44100/44100 [==============================] - 18s - loss: 1.4113 - acc: 0.5805 - val_loss: 1.4439 - val_acc: 0.5767
Epoch 20/50
44100/44100 [==============================] - 18s - loss: 1.3971 - acc: 0.5879 - val_loss: 1.4473 - val_acc: 0.5769
Epoch 21/50
44100/44100 [==============================] - 18s - loss: 1.3850 - acc: 0.5919 - val_loss: 1.4251 - val_acc: 0.5871
Epoch 22/50
44100/44100 [==============================] - 18s - loss: 1.3668 - acc: 0.6006 - val_loss: 1.4203 - val_acc: 0.5910
Epoch 23/50
44100/44100 [==============================] - 18s - loss: 1.3549 - acc: 0.6051 - val_loss: 1.4207 - val_acc: 0.5939
Epoch 24/50
44100/44100 [==============================] - 18s - loss: 1.3373 - acc: 0.6111 - val_loss: 1.4516 - val_acc: 0.5784
Epoch 25/50
44100/44100 [==============================] - 18s - loss: 1.3285 - acc: 0.6149 - val_loss: 1.4146 - val_acc: 0.5922
Epoch 26/50
44100/44100 [==============================] - 18s - loss: 1.3134 - acc: 0.6205 - val_loss: 1.4090 - val_acc: 0.6024
Epoch 27/50
44100/44100 [==============================] - 18s - loss: 1.3043 - acc: 0.6239 - val_loss: 1.4307 - val_acc: 0.5959
Epoch 28/50
44100/44100 [==============================] - 18s - loss: 1.2862 - acc: 0.6297 - val_loss: 1.4241 - val_acc: 0.5978
Epoch 29/50
44100/44100 [==============================] - 18s - loss: 1.2706 - acc: 0.6340 - val_loss: 1.4046 - val_acc: 0.6067
Epoch 30/50
44100/44100 [==============================] - 18s - loss: 1.2634 - acc: 0.6405 - val_loss: 1.4120 - val_acc: 0.6037
Epoch 31/50
44100/44100 [==============================] - 18s - loss: 1.2473 - acc: 0.6446 - val_loss: 1.4067 - val_acc: 0.6045
Epoch 32/50
44100/44100 [==============================] - 18s - loss: 1.2411 - acc: 0.6471 - val_loss: 1.4083 - val_acc: 0.6098
Epoch 33/50
44100/44100 [==============================] - 18s - loss: 1.2241 - acc: 0.6498 - val_loss: 1.4091 - val_acc: 0.6076
Epoch 34/50
44100/44100 [==============================] - 18s - loss: 1.2121 - acc: 0.6541 - val_loss: 1.4209 - val_acc: 0.6127
Epoch 35/50
44100/44100 [==============================] - 18s - loss: 1.1995 - acc: 0.6582 - val_loss: 1.4230 - val_acc: 0.6131
Epoch 36/50
44100/44100 [==============================] - 18s - loss: 1.1884 - acc: 0.6622 - val_loss: 1.4024 - val_acc: 0.6124
Epoch 37/50
44100/44100 [==============================] - 18s - loss: 1.1778 - acc: 0.6657 - val_loss: 1.4328 - val_acc: 0.6080
Epoch 38/50
44100/44100 [==============================] - 18s - loss: 1.1612 - acc: 0.6683 - val_loss: 1.4246 - val_acc: 0.6159
Epoch 39/50
44100/44100 [==============================] - 18s - loss: 1.1466 - acc: 0.6735 - val_loss: 1.4282 - val_acc: 0.6122
Epoch 40/50
44100/44100 [==============================] - 18s - loss: 1.1325 - acc: 0.6783 - val_loss: 1.4311 - val_acc: 0.6157
Epoch 41/50
44100/44100 [==============================] - 18s - loss: 1.1213 - acc: 0.6806 - val_loss: 1.4647 - val_acc: 0.6047
Epoch 42/50
44100/44100 [==============================] - 18s - loss: 1.1064 - acc: 0.6842 - val_loss: 1.4631 - val_acc: 0.6047
Epoch 43/50
44100/44100 [==============================] - 18s - loss: 1.0967 - acc: 0.6870 - val_loss: 1.4535 - val_acc: 0.6106
Epoch 44/50
44100/44100 [==============================] - 18s - loss: 1.0822 - acc: 0.6893 - val_loss: 1.4532 - val_acc: 0.6149
Epoch 45/50
44100/44100 [==============================] - 18s - loss: 1.0659 - acc: 0.6941 - val_loss: 1.4691 - val_acc: 0.6108
Epoch 46/50
44100/44100 [==============================] - 18s - loss: 1.0610 - acc: 0.6956 - val_loss: 1.4751 - val_acc: 0.6106
Epoch 47/50
44100/44100 [==============================] - 18s - loss: 1.0397 - acc: 0.6981 - val_loss: 1.4857 - val_acc: 0.6041
Epoch 48/50
44100/44100 [==============================] - 18s - loss: 1.0208 - acc: 0.7039 - val_loss: 1.4901 - val_acc: 0.6106
Epoch 49/50
44100/44100 [==============================] - 18s - loss: 1.0187 - acc: 0.7036 - val_loss: 1.4994 - val_acc: 0.6106
Epoch 50/50
44100/44100 [==============================] - 18s - loss: 1.0024 - acc: 0.7070 - val_loss: 1.5078 - val_acc: 0.6039
Time: 1109.7512991428375
Neural Network now trained from dimensions (49000, 3, 32, 32)
Actualización: Pruebas adicionales que incluyen BatchNormalization con y sin MaxNorm -
Nueva arquitectura:
modelnn.add(Convolution2D(32, 3, 3, input_shape=neurons, activation='relu', border_mode='same'))
modelnn.add(Convolution2D(32, 3, 3, activation='relu', border_mode='same'))
modelnn.add(MaxPooling2D(pool_size=(2, 2)))
modelnn.add(BatchNormalization())
modelnn.add(Dropout(0.2))
modelnn.add(Convolution2D(64, 3, 3, activation='relu', border_mode='same'))
modelnn.add(Convolution2D(64, 3, 3, activation='relu', border_mode='same'))
modelnn.add(MaxPooling2D(pool_size=(2, 2)))
modelnn.add(BatchNormalization())
modelnn.add(Dropout(0.2))
modelnn.add(Convolution2D(128, 3, 3, activation='relu', border_mode='same'))
modelnn.add(Convolution2D(128, 3, 3, activation='relu', border_mode='same'))
modelnn.add(BatchNormalization())
modelnn.add(MaxPooling2D(pool_size=(2, 2)))
modelnn.add(Dropout(0.2))
# modelnn.add(Convolution2D(256, 3, 3, activation='relu', border_mode='same'))
# modelnn.add(Convolution2D(256, 3, 3, activation='relu', border_mode='same'))
# modelnn.add(MaxPooling2D(pool_size=(2, 2)))
modelnn.add(Flatten())
modelnn.add(Dense(1024, activation='relu', W_constraint=maxnorm(3)))
modelnn.add(BatchNormalization())
modelnn.add(Dropout(0.5))
modelnn.add(Dense(512, activation='relu', W_constraint=maxnorm(3)))
modelnn.add(BatchNormalization())
modelnn.add(Dropout(0.5))
modelnn.add(Dense(10, activation='softmax'))