### k-Nearest Neighbors algorithm

K-Nearest Neighbors (KNN) classification divides data into a test set and a training set. For each row of the test set, the K nearest (in Euclidean distance) training set objects are found, and the classification is determined by majority vote with ties broken at random. If there are ties for the Kth nearest vector, all candidates are included in the vote.

 Pros High accuracy, insensitive to outliers, no assumptions about data Cons Computationally expensive, requires a lot of memory Works with Numeric values, nominal values

In the following diagram, Blue dots are already planted on a Cartesian Coordinates system and we want to know the neighbors of new fired red dot

#### Example

```from numpy import array, tile  # for createDataSetForGroups() functions
import operator

def createDataSetForGroups():
group = array([[ 1 , 4], [ 2 , 5 ], [ 3 , 1 ], [ 4 , 2]])
return group

def createDataSetForLabels():
labels = ['A', 'A', 'B', 'B']
return labels

def classify0(inX, dataSet, labels, k):
#-----------calculating distance start-------------
dataSetSize = dataSet.shape[0]  # size of input data set

diffMat = tile(inX, (dataSetSize, 1)) - dataSet  # subtracting distance
sqDiffMat = diffMat ** 2  # powers calculating
sqDistances = sqDiffMat.sum(axis=1)
distances = sqDistances ** 0.5  # square root
sortedDistIndicies = distances.argsort()  # sorting distance
#-----------calculating distance end-------------

classCount = {}  # dictionary

# sorting distances
for i in range(k):
voteIlabel = labels[sortedDistIndicies[i]]
classCount[voteIlabel] = classCount.get(voteIlabel, 0) + 1

sortedClassCount = sorted(classCount.iteritems(), key=operator.itemgetter(1), reverse=True)

return sortedClassCount[0][0]  # return first element

grp = createDataSetForGroups()
lbls = createDataSetForLabels()

print classify0([4, 5], grp, lbls, 3)
```

A