How to recognize the handwritten digit dataset with a convolutional neural network (CNN)?
A few days ago I used a CNN to recognize the handwritten digit dataset, and then I noticed that Kaggle has a competition on exactly this task. It has been running for over a year, with 1179 valid submissions so far, and the top score is 100%. I gave it a try with Keras. Starting with the simplest MLP I only got 98.19% accuracy, then kept improving and am now at 99.78% — but seeing first place at 100% was heartbreaking. So I made another round of improvements; here I record the best results so far, and I will update this post if I improve further.
Everyone is probably familiar with the handwritten digit dataset by now — this program is the "Hello World" of learning a new framework, or the "WordCount" of MapReduce. I will not introduce it at length here; a quick look:
# Author: Charlotte
# Plot mnist dataset
from keras.datasets import mnist
import matplotlib.pyplot as plt
# load the MNIST dataset
(X_train, y_train), (X_test, y_test) = mnist.load_data()
# plot 4 images as grayscale
plt.subplot(221)
plt.imshow(X_train[0], cmap=plt.get_cmap('PuBuGn_r'))
plt.subplot(222)
plt.imshow(X_train[1], cmap=plt.get_cmap('PuBuGn_r'))
plt.subplot(223)
plt.imshow(X_train[2], cmap=plt.get_cmap('PuBuGn_r'))
plt.subplot(224)
plt.imshow(X_train[3], cmap=plt.get_cmap('PuBuGn_r'))
# show the plot
plt.show()
Figure: (plot of the first four MNIST training digits)
1. Baseline version
At first I did not plan to use a CNN, because it is relatively time-consuming, so I wanted to see whether a simpler algorithm could already give good results.
I had previously run classical machine-learning algorithms on this dataset; the best was an SVM at 96.8% (default parameters, untuned), so this time I decided to use a neural network.
The baseline version uses a Multi-Layer Perceptron (MLP).
The network structure is simple: input ---> hidden ---> output.
The hidden layer uses the rectified linear unit (ReLU) activation, and the output layer uses softmax directly for multi-class classification.
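To make the two activations concrete, here is a minimal pure-Python sketch of the math behind ReLU and softmax (for illustration only — this is not the Keras internals):

```python
import math

def relu(x):
    # rectified linear unit: pass positives through, clamp negatives to 0
    return max(0.0, x)

def softmax(scores):
    # subtract the max for numerical stability, then normalize the exponentials
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# softmax turns raw scores into a probability distribution over the 10 digits
probs = softmax([2.0, 1.0, 0.1])
```

The softmax outputs sum to 1, which is why the last layer's activations can be read directly as class probabilities.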
Network structure:
Code:
# coding: utf-8
# Baseline MLP for MNIST dataset
import numpy
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import Dropout
from keras.utils import np_utils

seed = 7
numpy.random.seed(seed)
# load the data
(X_train, y_train), (X_test, y_test) = mnist.load_data()

# flatten each 28*28 image into a 784-dimensional vector
num_pixels = X_train.shape[1] * X_train.shape[2]
X_train = X_train.reshape(X_train.shape[0], num_pixels).astype('float32')
X_test = X_test.reshape(X_test.shape[0], num_pixels).astype('float32')

# normalize inputs from 0-255 to 0-1
X_train = X_train / 255
X_test = X_test / 255

# one-hot encode the outputs
y_train = np_utils.to_categorical(y_train)
y_test = np_utils.to_categorical(y_test)
num_classes = y_test.shape[1]

# define the MLP model
def baseline_model():
    model = Sequential()
    model.add(Dense(num_pixels, input_dim=num_pixels, init='normal', activation='relu'))
    model.add(Dense(num_classes, init='normal', activation='softmax'))
    model.summary()
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model

# build the model
model = baseline_model()

# fit the model
model.fit(X_train, y_train, validation_data=(X_test, y_test), nb_epoch=10, batch_size=200, verbose=2)

# final evaluation
scores = model.evaluate(X_test, y_test, verbose=0)
print("Baseline Error: %.2f%%" % (100 - scores[1] * 100))  # print the error rate
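As an aside, the np_utils.to_categorical call above one-hot encodes the integer labels; a minimal pure-Python equivalent (for illustration only, not the Keras implementation) looks like this:

```python
def to_one_hot(labels, num_classes):
    # turn each integer label into a 0/1 indicator vector of length num_classes
    vectors = []
    for label in labels:
        row = [0] * num_classes
        row[label] = 1
        vectors.append(row)
    return vectors

# each row has a single 1 at the index of the original label
encoded = to_one_hot([0, 2, 1], 3)
```

This is what lets categorical cross-entropy compare the softmax output vector against the true class.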
Result:
Layer (type)                     Output Shape          Param #     Connected to
====================================================================================================
dense_1 (Dense)                  (None, 784)           615440      dense_input_1[0][0]
____________________________________________________________________________________________________
dense_2 (Dense)                  (None, 10)            7850        dense_1[0][0]
====================================================================================================
Total params: 623290
____________________________________________________________________________________________________
Train on 60000 samples, validate on 10000 samples
Epoch 1/10
3s - loss: 0.2791 - acc: 0.9203 - val_loss: 0.1420 - val_acc: 0.9579
Epoch 2/10
3s - loss: 0.1122 - acc: 0.9679 - val_loss: 0.0992 - val_acc: 0.9699
Epoch 3/10
3s - loss: 0.0724 - acc: 0.9790 - val_loss: 0.0784 - val_acc: 0.9745
Epoch 4/10
3s - loss: 0.0509 - acc: 0.9853 - val_loss: 0.0774 - val_acc: 0.9773
Epoch 5/10
3s - loss: 0.0366 - acc: 0.9898 - val_loss: 0.0626 - val_acc: 0.9794
Epoch 6/10
3s - loss: 0.0265 - acc: 0.9930 - val_loss: 0.0639 - val_acc: 0.9797
Epoch 7/10
3s - loss: 0.0185 - acc: 0.9956 - val_loss: 0.0611 - val_acc: 0.9811
Epoch 8/10
3s - loss: 0.0150 - acc: 0.9967 - val_loss: 0.0616 - val_acc: 0.9816
Epoch 9/10
4s - loss: 0.0107 - acc: 0.9980 - val_loss: 0.0604 - val_acc: 0.9821
Epoch 10/10
4s - loss: 0.0073 - acc: 0.9988 - val_loss: 0.0611 - val_acc: 0.9819
Baseline Error: 1.81%
As you can see, the result is quite good: 98.19% accuracy, an error rate of only 1.81%, after just ten epochs.
At this point I still had not thought of a CNN; instead I wondered whether 100 epochs would do better.
So I ran 100 epochs, with the following result:
Epoch 100/100
8s - loss: 4.6181e-07 - acc: 1.0000 - val_loss: 0.0982 - val_acc: 0.9854
Baseline Error: 1.46%
As the result shows, 100 epochs only cut the error by 0.35% and still did not break 99% accuracy, so I turned to a CNN.
2. Simple CNN
Keras's CNN module is quite complete. Since the focus here is on the CNN results, I will not expand on CNN basics.
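One detail worth sketching, though, is the 'valid' (unpadded) convolution used below: a k×k kernel shrinks an n×n input to (n−k+1)×(n−k+1), which is why a 5×5 kernel turns the 28×28 images into 24×24 feature maps in the model summary. A minimal pure-Python sketch of a single-channel valid convolution (for illustration only):

```python
def conv2d_valid(image, kernel):
    # 'valid' 2D convolution (no padding): output shrinks by kernel_size - 1
    n = len(image)
    k = len(kernel)
    out_size = n - k + 1
    output = []
    for i in range(out_size):
        row = []
        for j in range(out_size):
            # dot product of the kernel with the k x k patch at (i, j)
            acc = 0.0
            for di in range(k):
                for dj in range(k):
                    acc += image[i + di][j + dj] * kernel[di][dj]
            row.append(acc)
        output.append(row)
    return output

# a 28x28 input with a 5x5 kernel gives a 24x24 output
image = [[1.0] * 28 for _ in range(28)]
kernel = [[1.0 / 25] * 5 for _ in range(5)]
feature_map = conv2d_valid(image, kernel)
```

Keras of course runs this over 32 learned kernels at once; the sketch only shows where the 24×24 shape comes from.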
Network structure:
Code:
# coding: utf-8
# Simple CNN
import numpy
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import Dropout
from keras.layers import Flatten
from keras.layers.convolutional import Convolution2D
from keras.layers.convolutional import MaxPooling2D
from keras.utils import np_utils

seed = 7
numpy.random.seed(seed)

# load the data
(X_train, y_train), (X_test, y_test) = mnist.load_data()
# reshape to be [samples][channels][width][height]
X_train = X_train.reshape(X_train.shape[0], 1, 28, 28).astype('float32')
X_test = X_test.reshape(X_test.shape[0], 1, 28, 28).astype('float32')

# normalize inputs from 0-255 to 0-1
X_train = X_train / 255
X_test = X_test / 255

# one-hot encode the outputs
y_train = np_utils.to_categorical(y_train)
y_test = np_utils.to_categorical(y_test)
num_classes = y_test.shape[1]

# define a simple CNN model
def baseline_model():
    # create the model
    model = Sequential()
    model.add(Convolution2D(32, 5, 5, border_mode='valid', input_shape=(1, 28, 28), activation='relu'))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Dropout(0.2))
    model.add(Flatten())
    model.add(Dense(128, activation='relu'))
    model.add(Dense(num_classes, activation='softmax'))
    # compile the model
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model

# build the model
model = baseline_model()

# fit the model
model.fit(X_train, y_train, validation_data=(X_test, y_test), nb_epoch=10, batch_size=128, verbose=2)

# final evaluation of the model
scores = model.evaluate(X_test, y_test, verbose=0)
print("CNN Error: %.2f%%" % (100 - scores[1] * 100))
Result:
____________________________________________________________________________________________________
Layer (type)                     Output Shape          Param #     Connected to
====================================================================================================
convolution2d_1 (Convolution2D)  (None, 32, 24, 24)    832         convolution2d_input_1[0][0]
____________________________________________________________________________________________________
maxpooling2d_1 (MaxPooling2D)    (None, 32, 12, 12)    0           convolution2d_1[0][0]
____________________________________________________________________________________________________
dropout_1 (Dropout)              (None, 32, 12, 12)    0           maxpooling2d_1[0][0]
____________________________________________________________________________________________________
flatten_1 (Flatten)              (None, 4608)          0           dropout_1[0][0]
____________________________________________________________________________________________________
dense_1 (Dense)                  (None, 128)           589952      flatten_1[0][0]
____________________________________________________________________________________________________
dense_2 (Dense)                  (None, 10)            1290        dense_1[0][0]
====================================================================================================
Total params: 592074
____________________________________________________________________________________________________
Train on 60000 samples, validate on 10000 samples
Epoch 1/10
32s - loss: 0.2412 - acc: 0.9318 - val_loss: 0.0754 - val_acc: 0.9766
Epoch 2/10
32s - loss: 0.0726 - acc: 0.9781 - val_loss: 0.0534 - val_acc: 0.9829
Epoch 3/10
32s - loss: 0.0497 - acc: 0.9852 - val_loss: 0.0391 - val_acc: 0.9858
Epoch 4/10
32s - loss: 0.0413 - acc: 0.9870 - val_loss: 0.0432 - val_acc: 0.9854
Epoch 5/10
34s - loss: 0.0323 - acc: 0.9897 - val_loss: 0.0375 - val_acc: 0.9869
Epoch 6/10
36s - loss: 0.0281 - acc: 0.9909 - val_loss: 0.0424 - val_acc: 0.9864
Epoch 7/10
36s - loss: 0.0223 - acc: 0.9930 - val_loss: 0.0328 - val_acc: 0.9893
Epoch 8/10
36s - loss: 0.0198 - acc: 0.9939 - val_loss: 0.0381 - val_acc: 0.9880
Epoch 9/10
36s - loss: 0.0156 - acc: 0.9954 - val_loss: 0.0347 - val_acc: 0.9884
Epoch 10/10
36s - loss: 0.0141 - acc: 0.9955 - val_loss: 0.0318 - val_acc: 0.9893
CNN Error: 1.07%
In the epoch logs, loss and acc are on the training set, while val_loss and val_acc are on the validation set.
The results look good: the CNN's error rate is 1.07%, a 0.39% improvement over the 100-epoch MLP (1.46%).
This CNN's structure is still fairly simple; if we add a few more layers and make it a bit more complex, can the result improve further?
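Before moving on, the parameter counts in the model summary above can be checked by hand — a convolutional layer has out_channels * (kernel_h * kernel_w * in_channels + 1) parameters (weights plus one bias per filter), and a dense layer has inputs * outputs + outputs. A quick sketch of the arithmetic:

```python
def conv_params(in_channels, out_channels, kernel_h, kernel_w):
    # each filter has kernel_h*kernel_w*in_channels weights plus one bias
    return out_channels * (kernel_h * kernel_w * in_channels + 1)

def dense_params(inputs, outputs):
    # a weight per input-output pair plus one bias per output
    return inputs * outputs + outputs

conv1 = conv_params(1, 32, 5, 5)          # 832, as in the summary
dense1 = dense_params(32 * 12 * 12, 128)  # 589952: 4608 flattened inputs into 128 units
dense2 = dense_params(128, 10)            # 1290
total = conv1 + dense1 + dense2           # 592074, the summary's total params
```

Note that almost all of the parameters sit in the first dense layer, which is typical for small CNNs.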
3. Larger CNN
This time I added a few more convolutional layers. Code:
# Larger CNN
import numpy
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import Dropout
from keras.layers import Flatten
from keras.layers.convolutional import Convolution2D
from keras.layers.convolutional import MaxPooling2D
from keras.utils import np_utils

seed = 7
numpy.random.seed(seed)
# load the data
(X_train, y_train), (X_test, y_test) = mnist.load_data()
# reshape to be [samples][channels][width][height]
X_train = X_train.reshape(X_train.shape[0], 1, 28, 28).astype('float32')
X_test = X_test.reshape(X_test.shape[0], 1, 28, 28).astype('float32')
# normalize inputs from 0-255 to 0-1
X_train = X_train / 255
X_test = X_test / 255
# one-hot encode the outputs
y_train = np_utils.to_categorical(y_train)
y_test = np_utils.to_categorical(y_test)
num_classes = y_test.shape[1]