OpenCV comes with a data file, letter-recognition.data, in the opencv/samples/cpp/ folder. If you open it, you will see 20000 lines which may, at first sight, look like garbage. Actually, in each row the first column is a letter, which is our label, and the 16 numbers following it are its features. These features were obtained from the UCI Machine Learning Repository, where you can find their details.
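To make the row layout concrete, here is a minimal sketch of how one such line is parsed; the sample row below is hypothetical, not copied from the actual file:

```python
# Each row is: one label letter, then 16 integer features, comma-separated.
row = "T,2,8,3,5,1,8,13,0,6,6,10,8,0,8,0,8"  # hypothetical sample row

fields = row.split(',')
label = ord(fields[0]) - ord('A')          # map 'A'..'Z' to 0..25, so 'T' -> 19
features = [float(x) for x in fields[1:]]  # the 16 feature values

print(label, len(features))
```

This is the same letter-to-number mapping the converter in the full script applies while loading the file.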
The script first reads letter-recognition.data in, then splits it vertically into two halves: the upper half is used as the training set and the lower half as the test set. In each row, the first column is the label and the remaining 16 columns are the feature values. These two lines do exactly that:
responses, trainData = np.hsplit(train,[1])
labels, testData = np.hsplit(test,[1])
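The splitting pattern above can be demonstrated on a tiny toy array (4 rows instead of 20000) to show the resulting shapes:

```python
import numpy as np

# Toy demonstration of the vsplit/hsplit pattern used above.
data = np.arange(4 * 17, dtype=np.float32).reshape(4, 17)  # 4 rows: 1 label + 16 features each

train, test = np.vsplit(data, 2)              # split rows into two equal halves
responses, trainData = np.hsplit(train, [1])  # column 0 = labels, columns 1..16 = features

print(train.shape, responses.shape, trainData.shape)  # (2, 17) (2, 1) (2, 16)
```

Note that np.hsplit(train, [1]) cuts before column 1, so responses keeps shape (N, 1) rather than (N,); that matters later when comparing against result element-wise.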
This line trains the classifier: trainData holds the feature rows and responses the labels to be recognized:
knn.train(trainData, cv2.ml.ROW_SAMPLE, responses)
This line runs the test set through the classifier and puts the recognized labels into result:
ret, result, neighbours, dist = knn.findNearest(testData, k=5)
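Conceptually, findNearest computes the distance from each test sample to every training sample and takes a majority vote among the k closest. A minimal NumPy sketch of that idea for a single sample (not OpenCV's actual implementation) looks like this:

```python
import numpy as np

def find_nearest(trainData, responses, sample, k=5):
    # Euclidean distance from the sample to every training row
    dist = np.linalg.norm(trainData - sample, axis=1)
    idx = np.argsort(dist)[:k]                   # indices of the k nearest neighbours
    votes = responses[idx].ravel().astype(int)   # their labels
    return np.bincount(votes).argmax()           # majority label

# Two clusters: label 0 near the origin, label 1 near (5, 5)
trainData = np.array([[0, 0], [0, 1], [5, 5], [6, 5]], dtype=np.float32)
responses = np.array([[0], [0], [1], [1]], dtype=np.float32)

print(find_nearest(trainData, responses, np.array([5, 4], dtype=np.float32), k=3))  # -> 1
```

OpenCV's version additionally returns the neighbours' labels and distances, which is why the call unpacks four values.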
This line compares the predicted labels against the true ones:
correct = np.count_nonzero(result == labels)
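Since both result and labels are (N, 1) float arrays, the element-wise == yields a boolean array, and count_nonzero counts the matches. A tiny example:

```python
import numpy as np

# Predicted vs. true labels for four hypothetical test samples
result = np.array([[0.], [1.], [2.], [3.]], dtype=np.float32)
labels = np.array([[0.], [1.], [5.], [3.]], dtype=np.float32)

correct = np.count_nonzero(result == labels)
print(correct)  # -> 3 (one of the four predictions is wrong)
```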
Source code:
import cv2
import numpy as np
# Load the data, converters convert the letter to a number
data= np.loadtxt('letter-recognition.data', dtype= 'float32', delimiter = ',',
converters= {0: lambda ch: ord(ch)-ord('A')})
# split the data to two, 10000 each for train and test
train, test = np.vsplit(data,2)
# split trainData and testData to features and responses
responses, trainData = np.hsplit(train,[1])
labels, testData = np.hsplit(test,[1])
# Initiate the kNN, classify, measure accuracy.
knn = cv2.ml.KNearest_create()
knn.train(trainData, cv2.ml.ROW_SAMPLE, responses)
ret, result, neighbours, dist = knn.findNearest(testData, k=5)
correct = np.count_nonzero(result == labels)
accuracy = correct * 100.0 / result.size  # result.size == 10000 test samples
print(accuracy)
Reference : http://docs.opencv.org/3.0-beta/doc/py_tutorials/py_ml/py_knn/py_knn_opencv/py_knn_opencv.html#knn-opencv