Revision 129:a87db93d93f6, 2.0 kB (checked in by Tarek Ziadé <tarek@…>, 16 months ago)

fixed tag calculation

=======
neartag
=======

This module implements the k-nearest neighbours (k-NN) algorithm, which
computes the distance between elements described by a set of values. Each
value is a dimension, and the set gives the coordinates of the element in
a multidimensional space.

For tags, the idea is to find the neighbours of a given user, depending on
the tags she uses. The `NearestByTag` class is instantiated with those tags::

    >>> from neartag import NearestByTag
    >>> tags = ["django", "python", "zen", "fun", "scary"]
    >>> solver = NearestByTag(tags)

Then each user is added with her name and the tags she uses. Each tag is a
boolean value: a user either has it or not::

    >>> user_1 = 'user 1', ["django", "python"]
    >>> user_2 = 'user 2', ["zen", "fun", "scary"]
    >>> user_3 = 'user 3', ["django"]
    >>> user_4 = 'user 4', ["django", "python"]
    >>> for user, user_tags in (user_1, user_2, user_3, user_4):
    ...     solver.add_user(user, user_tags)

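As a rough illustration of what "boolean tag values" means, the sketch below
turns a user's tag list into a vector with one 0/1 coordinate per known tag and
measures a distance between two such vectors. The helper names (`to_vector`,
`jaccard_distance`) are hypothetical, and the Jaccard distance is a stand-in
chosen for simplicity: it is not necessarily the metric `NearestByTag` uses, so
its values will not match the doctest output in this file.

```python
def to_vector(all_tags, user_tags):
    """Return one 0/1 coordinate per known tag (hypothetical helper)."""
    wanted = set(user_tags)
    return [1 if tag in wanted else 0 for tag in all_tags]


def jaccard_distance(a, b):
    """Stand-in metric: 1 - |shared tags| / |tags used by either user|."""
    shared = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    if either == 0:
        return 0.0  # two users without any tag are treated as identical
    return 1.0 - shared / either


tags = ["django", "python", "zen", "fun", "scary"]
user_1 = to_vector(tags, ["django", "python"])
user_2 = to_vector(tags, ["zen", "fun", "scary"])
user_3 = to_vector(tags, ["django"])

print(user_1)                            # [1, 1, 0, 0, 0]
print(jaccard_distance(user_1, user_3))  # 0.5
print(jaccard_distance(user_1, user_2))  # 1.0
```

Whatever the metric, the point is the same: users with no tag in common end up
at the maximum distance, and users sharing most of their tags end up close.
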
The class then returns a list of the neighbours of a given user, sorted by
increasing distance::

    >>> solver.neighbours('user 1')
    [(0.16..., 'user 4'), (0.33..., 'user 3'), (1.0, 'user 2')]
    >>> solver.neighbours('user 2')
    [(0.83..., 'user 3'), (1.0, 'user 1'), (1.0, 'user 4')]
    >>> solver.neighbours('user 3')
    [(0.33..., 'user 1'), (0.33..., 'user 4'), (0.83..., 'user 2')]
    >>> solver.neighbours('user 4')
    [(0.16..., 'user 1'), (0.33..., 'user 3'), (1.0, 'user 2')]

The smaller the returned value, the closer the user.

By default, `neighbours` returns at most 10 neighbours. Pass a second
argument to change this limit::

    >>> solver.neighbours('user 1', 1)
    [(0.16..., 'user 4')]

This class works in memory, since the loaded values are small enough to fit.

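The effect of the size argument can be pictured with `heapq.nsmallest`, which
keeps only the k smallest `(distance, name)` pairs and returns them in
ascending order. This is a sketch under assumptions: the `nearest` function and
the hard-coded distances are illustrative only, and the module may well sort
and truncate its results differently.

```python
import heapq


def nearest(distances, size=10):
    """Return the `size` closest (distance, name) pairs, closest first.

    `distances` maps a user name to its distance from the reference user.
    Hypothetical helper, not part of neartag.
    """
    pairs = ((dist, name) for name, dist in distances.items())
    return heapq.nsmallest(size, pairs)


# Made-up distances from some reference user, for illustration only.
distances = {"user 2": 1.0, "user 3": 0.4, "user 4": 0.2}

print(nearest(distances))     # [(0.2, 'user 4'), (0.4, 'user 3'), (1.0, 'user 2')]
print(nearest(distances, 1))  # [(0.2, 'user 4')]
```
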
How to use it with an application
=================================

Tags change all the time in an application. The best approach is to
instantiate the class over data retrieved from a database, compute the
distances, and save them in a dedicated table. Since the computation can
take time, a worker thread can update those distances periodically in the
background.

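The pattern described above could look roughly like the following sketch,
written against `sqlite3` and `threading` from the standard library. Everything
specific in it is an assumption made for illustration: the table and column
names, the `refresh` and `worker` functions, the one-hour interval, and the
`compute_distances` stand-in (a plain Jaccard distance), which an application
would replace with calls to `NearestByTag`.

```python
import sqlite3
import threading

# Hypothetical schema: one row per (user, tag) and one per computed distance.
SCHEMA = """
CREATE TABLE user_tags (user_name TEXT, tag TEXT);
CREATE TABLE distances (user_name TEXT, neighbour TEXT, distance REAL);
"""


def compute_distances(user_tags):
    """Stand-in for NearestByTag: Jaccard distance between every pair of users."""
    rows = []
    for a, tags_a in user_tags.items():
        for b, tags_b in user_tags.items():
            if a == b:
                continue
            union = tags_a | tags_b
            dist = 1.0 - len(tags_a & tags_b) / len(union) if union else 0.0
            rows.append((a, b, dist))
    return rows


def refresh(conn):
    """Read the tags, recompute all distances and replace the stored ones."""
    user_tags = {}
    for name, tag in conn.execute("SELECT user_name, tag FROM user_tags"):
        user_tags.setdefault(name, set()).add(tag)
    rows = compute_distances(user_tags)
    with conn:  # single transaction: commit on success, roll back on error
        conn.execute("DELETE FROM distances")
        conn.executemany("INSERT INTO distances VALUES (?, ?, ?)", rows)


def worker(db_path, stop, interval=3600.0):
    """Refresh the distances every `interval` seconds until `stop` is set."""
    conn = sqlite3.connect(db_path)  # each thread opens its own connection
    try:
        while not stop.is_set():
            refresh(conn)
            stop.wait(interval)
    finally:
        conn.close()


# One synchronous refresh on an in-memory database, to show the data flow.
conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.executemany(
    "INSERT INTO user_tags VALUES (?, ?)",
    [("user 1", "django"), ("user 1", "python"),
     ("user 2", "zen"), ("user 3", "django")],
)
refresh(conn)
closest = conn.execute(
    "SELECT neighbour, distance FROM distances "
    "WHERE user_name = 'user 1' ORDER BY distance"
).fetchall()
print(closest)  # [('user 3', 0.5), ('user 2', 1.0)]
```

In an application, the worker would typically be started once with something
like `threading.Thread(target=worker, args=(db_path, stop), daemon=True)`,
where `stop` is a `threading.Event` that is set on shutdown. Request handlers
then only read the `distances` table and never pay for the computation.
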