Nonparametric Decision Making

 

Pattern Recognition and Image Analysis, by Earl Gose, Richard Johnsonbaugh, and Steve Jost, Prentice Hall, 1996, pages 149-193.

 

4.1  Introduction

In most real-world problems we do not know what type of density function the quantity of interest follows. In such cases an arbitrary density is fitted to the available samples; this approach is called nonparametric decision making. It is used when there is not enough evidence to guess even the general form of the densities.

Rather than estimating exact parameters from the sample data and characterizing the distribution of each class through a parametric decision function, classification is performed directly, without estimating any statistical parameters, even though the distribution type of each class is unknown.

When the distribution type is unknown, histograms and kernels provide approximate substitutes for a parametric decision function; an entirely different approach is nearest neighbor classification.

4.2  Histograms

A histogram is a graph of a frequency distribution, also called a bar graph or column chart. It displays the shape of the distribution of the observed data at a glance as a set of columns: points marking the class intervals are laid out along the horizontal axis, and over each interval a column is erected whose height is proportional to the frequency of that class.

When the probability density function p(x|C) of each class is unknown, an approximation \hat{p}(x|C) can be obtained by dividing the range of the variable x into a finite number of intervals that contain all the data and drawing a bar graph. Each interval of x is called a cell or bin. If the histogram is to be used as an estimate of the density function, the total area under it must be 1. If n_i samples fall in interval i and the total number of samples is N, the area of bin i is n_i/N; dividing by the interval width w_i gives the density height n_i/(N w_i). Once the approximate density function is available, decisions are made with Bayes' theorem.

Whether the variable x is continuous or discrete, its range is divided into intervals and the same method is used. The fraction of samples taking each value of x serves as the estimate \hat{P}(x) of the distribution P(x), and these fractions sum to 1.

Choosing an appropriate bin size is important: if the bins are too large the graph becomes too coarse, and if they are too small it fluctuates too wildly. A well-constructed histogram can come close to recovering the original probability density function.
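To make the bookkeeping concrete, the following Python sketch (the function name and data layout are ours, not the book's) computes the bin heights n_i/(N w_i) for bins of possibly unequal width:

```python
import numpy as np

def histogram_density(samples, edges):
    """Density-histogram estimate of p(x).

    edges lists the bin boundaries (bins may have unequal widths).
    Returns one height per bin, chosen so that the total area under
    the histogram is 1 (the last boundary is treated as exclusive)."""
    samples = np.asarray(samples, dtype=float)
    edges = np.asarray(edges, dtype=float)
    counts = np.array([((samples >= lo) & (samples < hi)).sum()
                       for lo, hi in zip(edges[:-1], edges[1:])])
    widths = np.diff(edges)
    return counts / (len(samples) * widths)   # height_i = n_i / (N * w_i)

# For equal-width bins, np.histogram(samples, bins, density=True)
# performs the same computation.
```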

 

 

(a) The true normal density from which 50 random numbers were drawn. (b) Histogram of the 50 normally distributed random numbers using 6 intervals. (c) The same data with 3 intervals. (d) With 24 intervals.


Example 4.1  Constructing a density histogram with intervals of different sizes

Let us find a histogram approximation to the density function for grapefruit volumes. The volumes of 100 grapefruit were measured, giving the following data.

Interval of x    Interval width    Number of samples    Fraction of samples (= area)    Height of rectangle
[0, 4)           4                 10                   0.1                             0.025
[4, 6)           2                 30                   0.3                             0.150
[6, 7)           1                 30                   0.3                             0.300
[7, 8)           1                 20                   0.2                             0.200
[8, 10]          2                 10                   0.1                             0.050

The height of each rectangle equals the fraction of samples divided by the interval width. For example, for the interval from 0 to 4 the height is 0.1/(4 - 0) = 0.025. The resulting density is shown in the figure. Because this is a density histogram, the rectangle areas must sum to 1.

Since the rectangle heights are only estimates, the result is written \hat{p}(x) rather than p(x). Note that the bins were chosen narrow where samples are plentiful and wide where they are scarce.


Example 4.2  Classification using histograms and Bayes' theorem

Using the following data, and assuming equal prior probabilities P(A) = P(B) = 0.5, classify a sample with x = 7.5 as belonging to class A or class B.

The following data are the values of feature x for 60 random samples drawn from class A.

0.80  0.91  0.93  0.95  1.32  1.53  1.57  1.63  1.67  1.74
2.01  2.18  2.27  2.31  2.40  2.61  2.64  2.64  2.67  2.85
2.96  2.97  3.17  3.17  3.38  3.67  3.73  3.83  3.99  4.06
4.10  4.12  4.18  4.20  4.23  4.27  4.27  4.39  4.40  4.46
4.47  4.61  4.64  4.89  4.96  5.12  5.15  5.33  5.33  5.47
5.64  5.85  5.99  6.29  6.42  6.53  6.70  6.78  7.18  7.22

The following data are the values of feature x for 60 random samples drawn from class B.

3.54  3.88  4.24  4.30  4.30  4.70  4.78  4.97  5.21  5.42
5.60  5.77  5.87  5.94  5.95  6.04  6.05  6.15  6.19  6.21
6.33  6.41  6.43  6.49  6.52  6.58  6.60  6.63  6.65  6.75
6.90  6.92  7.03  7.08  7.18  7.29  7.33  7.41  7.41  7.46
7.61  7.67  7.68  7.68  7.78  7.96  8.09  8.12  8.20  8.22
8.33  8.36  8.44  8.45  8.49  8.75  8.76  9.14  9.20  9.86

 

(a) Histogram of feature x for class A.  (b) The same for class B.

The figure above shows histograms of the number of samples in each unit interval of x for classes A and B. To convert them to density functions, these counts must be divided by the total number of samples (60) and by the interval width (1).

To classify the sample with x = 7.5, compare the heights of the two histograms at 7.5; the interval containing 7.5 is [7, 8) for both classes A and B. Two of the 60 class A samples and 14 of the 60 class B samples fall in this interval, so \hat{p}(7.5|A) = 2/60 and \hat{p}(7.5|B) = 14/60. Using Bayes' theorem,

P(A|7.5) = \hat{p}(7.5|A) P(A) / [\hat{p}(7.5|A) P(A) + \hat{p}(7.5|B) P(B)] = (2/60)(0.5) / [(2/60)(0.5) + (14/60)(0.5)] = 2/16 = 0.125.

Similarly, P(B|7.5) = (14/60)(0.5) / [(2/60)(0.5) + (14/60)(0.5)] = 14/16 = 0.875.

Therefore the sample should be classified as class B.
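The arithmetic of this example is easy to verify in a few lines of Python (a sketch; the variable names are ours):

```python
# Heights of the class-conditional density estimates at x = 7.5:
# 2 of the 60 class A samples and 14 of the 60 class B samples lie in [7, 8).
p_x_given_A = 2 / 60
p_x_given_B = 14 / 60
P_A = P_B = 0.5

# Bayes' theorem: P(A|x) = p(x|A) P(A) / [p(x|A) P(A) + p(x|B) P(B)]
evidence = p_x_given_A * P_A + p_x_given_B * P_B
P_A_given_x = p_x_given_A * P_A / evidence    # 2/16  = 0.125
P_B_given_x = p_x_given_B * P_B / evidence    # 14/16 = 0.875
print("class A" if P_A_given_x > P_B_given_x else "class B")   # -> class B
```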



4.3  Kernel and Window Estimators

A very rough approximation to the true density function can be obtained by placing at each sample value a spike, as with a set of delta functions, of very narrow width and great height, with the spike areas summing to 1. The area of each spike is the number of samples at that point divided by the total number of samples.

Consider the example (Figure 1) of delta functions estimating the density of samples lying at three points. Such an approximation to a continuous density function is not useful for decision making. If, however, each delta function is replaced by another function called a kernel, such as a rectangular, triangular, or normal density, again with total area 1, a smoother and more satisfactory estimate is obtained (smoothing).


Example 4.3  Using triangular kernels

Figure 2 shows the density function estimated with triangular kernels for these samples; Figure 3 shows triangular, rectangular, and normal kernels, each scaled to have a standard deviation of 1.


Figure 1: Approximate density obtained from 3 delta functions, where the height and width of each bar are chosen so that the areas sum to 1.

Figure 2: (a) Computing the estimated probability density function \hat{p}(x) by the kernel method; the kernels (dotted) and their sum (solid) are shown. The area under the solid curve is 1. (b) Computing \hat{p} at a point by the window method.

Figure 3: Three kernel functions: rectangular (solid), triangular (dotted), and normal (dashed). Each has area 1 and standard deviation 1.

To classify a sample at x we do not need the entire density \hat{p}(x), only its value at that x. Suppose \hat{p} is wanted at the point marked in Figure 2(b). The result is, of course, the sum of the heights at x of all the kernel functions. Another way to obtain the same result is to construct a window function, mirroring the kernel function, centered at x, and to add up its heights at each sample point falling inside the window; symmetric kernel functions are normally used, so the two views agree. In Figure 2(b) the window contains two samples, each contributing a height of 2/27, so the estimate \hat{p} is 4/27, the same value given by the kernel method in Figure 2(a).

As with histograms, choosing an appropriate width or standard deviation is important when using kernel or window methods. If the width is too large, fine structure of the density is lost; if it is too small, the final approximation will not be sufficiently smooth.
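A minimal sketch of a triangular-kernel estimator may make the equivalence of the two views concrete (the function and parameter names are ours; `width` is the kernel half-width, i.e. the smoothing parameter discussed above):

```python
import numpy as np

def kernel_density(x, samples, width=1.0):
    """Kernel estimate of p(x): one triangular kernel of area 1/N is
    centered on each sample, and their heights at x are summed.
    Equivalently (the window view), a triangular window centered at x
    is evaluated at every sample falling within `width` of x."""
    samples = np.asarray(samples, dtype=float)
    u = np.abs(x - samples) / width             # scaled distance to each sample
    heights = np.where(u < 1.0, (1.0 - u) / width, 0.0)  # each triangle has area 1
    return heights.sum() / len(samples)
```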

Figure 4: Estimated density functions using (a) a triangular kernel and (b) a normal kernel. The normal kernel gives a smoother estimate than the triangular kernel.

4.4  Nearest Neighbor Classification Techniques

Nearest neighbor techniques are used when the classes to be distinguished are known but the probability density function of each is not. Instead of estimating statistical parameters for the samples, each sample is plotted directly by its feature values, and an unknown sample is assigned to the class of the reference-set sample that is most similar, that is, nearest in distance.

What does nearest mean? Distance can be measured as the smallest Euclidean distance, absolute difference, maximum distance, or Minkowski distance.

The Single Nearest Neighbor Technique

(1) Euclidean distance

If the feature space is n-dimensional, the geometric distance between two points x and y is given below; it is the Pythagorean theorem extended to n dimensions.

D(x, y) = \sqrt{ \sum_{i=1}^{n} (x_i - y_i)^2 }

This is the most commonly used distance measure, but it is not always the best. Because each dimension is squared before the sum is taken, the computation is relatively expensive and large dissimilarities are emphasized.

(2) absolute difference

The differences between the two points are used directly, so it is easy to compute. It is also called the city block distance, Manhattan metric, or taxi-cab distance.

D(x, y) = \sum_{i=1}^{n} |x_i - y_i|

(3) maximum distance

The least similar pair of features, the one with the largest separation, dominates this measure.

D(x, y) = \max_{i} |x_i - y_i|

(4) Minkowski distance

The Minkowski distance generalizes measures (1), (2), and (3). Here m is an adjustable parameter: m = 1 gives the absolute difference, m = 2 gives the Euclidean distance, and as m grows the measure approaches the maximum distance.

D(x, y) = \left( \sum_{i=1}^{n} |x_i - y_i|^m \right)^{1/m}


The figure below shows a feature space containing three samples from class A and two from class B. If a sample of unknown class lies at the coordinates (1, 1), its nearest neighbor by Euclidean distance is the class A sample at (1, 3), so the unknown sample is assigned to class A.
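A sketch of single nearest neighbor classification under the four distance measures above; the class B coordinates, and the class A points other than (1, 3), are invented here for illustration, since the text does not list them:

```python
import numpy as np

def nearest_neighbor(x, reference, labels, metric="euclidean", m=2):
    """Classify x by the label of its nearest reference sample."""
    diffs = np.abs(np.asarray(reference, dtype=float) - np.asarray(x, dtype=float))
    if metric == "euclidean":
        d = np.sqrt((diffs ** 2).sum(axis=1))
    elif metric == "cityblock":              # absolute difference
        d = diffs.sum(axis=1)
    elif metric == "max":                    # maximum distance
        d = diffs.max(axis=1)
    elif metric == "minkowski":              # m = 1 cityblock, m = 2 euclidean
        d = (diffs ** m).sum(axis=1) ** (1.0 / m)
    else:
        raise ValueError(metric)
    return labels[int(np.argmin(d))]

ref = np.array([[1, 3], [5, 5], [6, 4],      # class A (only (1, 3) is from the text)
                [3, 2], [2, 3]])             # class B (hypothetical coordinates)
labels = np.array(["A", "A", "A", "B", "B"])
print(nearest_neighbor([1, 1], ref, labels))  # -> A, the sample at (1, 3)
```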


Nearest Neighbor Error Rates

The performance of the nearest neighbor classifier is never better than that of the Bayesian classifier (which assumes the density functions are known), because the Bayesian classifier always chooses the most probable class. There are two main reasons:

1. Because a sample is assigned to the class of its nearest neighbor, the assignment can differ from the class the sample actually belongs to.

2. The coordinates of the previously classified reference samples are usually determined by experts, and even those may not be accurate.

For a class C_i and feature vector x, let P(C_i|x) be the probability that a sample at x is correctly assigned to C_i, and let p(x|C_i) be the class-conditional density. Asymptotically, the nearest neighbor of a sample from C_i at x belongs to C_i with probability P(C_i|x), so the expected probability that a member of C_i is correctly classified is, integrating over the n-dimensional feature space,

P(correct | C_i) = \int P(C_i|x) \, p(x|C_i) \, dx

Rewriting P(C_i|x) with Bayes' theorem,

P(C_i|x) = \frac{p(x|C_i) P(C_i)}{p(x)}

The following expression is the mixture density over all classes, that is, the density of the entire population:

p(x) = \sum_j p(x|C_j) P(C_j)

Therefore the probability that a member of C_i is classified incorrectly, the error probability of each class, is

E_i = 1 - \int P(C_i|x) \, p(x|C_i) \, dx

Another useful expression follows from Bayes' theorem, since

p(x|C_i) = \frac{P(C_i|x) \, p(x)}{P(C_i)}

Therefore

E_i = 1 - \frac{1}{P(C_i)} \int [P(C_i|x)]^2 \, p(x) \, dx

To obtain the overall error probability, the class error probabilities found above are weighted by the prior probabilities of their classes and summed:

E_{NN} = \sum_i P(C_i) E_i = 1 - \int \sum_i [P(C_i|x)]^2 \, p(x) \, dx

Thus, if the density function and prior probability of each class are known, the expected error rate of the nearest neighbor method can be computed. Of course, when the densities really are known, one would classify with Bayes' theorem; but it is interesting to compare the error rate obtained with Bayes' theorem to that of the nearest neighbor method, which does not require the densities.

Example 4.5  Estimation of error rates for nearest neighbor and Bayesian classification for two classes with equal prior probabilities.

Figure 4.9: Two uniform density functions and the mixture density. The mixture density is dashed.

Suppose, as in the figure, that classes A and B have prior probabilities of 0.5 each, that class A is uniformly distributed on [0, 2], and that class B is uniformly distributed on [1, 5], so p(x|A) = 1/2 on [0, 2] and p(x|B) = 1/4 on [1, 5]. What is the error rate of each class using the nearest neighbor method, and how does it compare with the Bayesian error rate? The mixture density p(x) is shown in the figure: p(x) = 1/4 from 0 to 1, 3/8 from 1 to 2, 1/8 from 2 to 5, and 0 elsewhere. On [1, 2] the posteriors are P(A|x) = (0.5)(1/2)/(3/8) = 2/3 and P(B|x) = 1/3. Substituting the densities into the per-class error probabilities derived above gives

E_A = 1 - \left[ \int_0^1 (1)(1/2) \, dx + \int_1^2 (2/3)(1/2) \, dx \right] = 1 - (1/2 + 1/3) = 1/6

E_B = 1 - \left[ \int_1^2 (1/3)(1/4) \, dx + \int_2^5 (1)(1/4) \, dx \right] = 1 - (1/12 + 3/4) = 1/6

The overall error rate is

E = (0.5)(1/6) + (0.5)(1/6) = 1/6.

Bayesian classification always chooses the most probable class, so a sample is assigned to class A if x < 2 and to class B if x > 2. Therefore E_A = 0 and E_B = 1/4, the fraction of class B lying between 1 and 2. The overall Bayesian error rate is (0.5)(0) + (0.5)(1/4) = 1/8.

In this example, then, the nearest neighbor error rate is 4/3 times the Bayesian error rate.

Example 4.5 could also have been solved by inspection of Figure 4.9: the half of the class A samples lying between x = 0 and x = 1 will be classified correctly, and the half of the class A samples that overlap the density of class B have a 1/3 probability of being called B (because P(B|x) = 1/3 there), so E_A = (1/2)(1/3) = 1/6. Also, the left 1/4 of the class B samples have a 2/3 probability of being called A, so E_B = (1/4)(2/3) = 1/6. Thus E = 1/6.
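The asymptotic rates in this example can also be checked numerically. The following Monte Carlo sketch (ours, not the book's) draws labeled samples from the two uniform densities and classifies each point by its nearest neighbor among the remaining points:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 200_000

# P(A) = P(B) = 0.5; A ~ uniform on [0, 2], B ~ uniform on [1, 5].
is_A = rng.random(N) < 0.5
x = np.where(is_A, rng.uniform(0, 2, N), rng.uniform(1, 5, N))

# Nearest neighbor in one dimension: sort once, then each interior
# point's nearest neighbor is whichever sorted neighbor is closer.
order = np.argsort(x)
xs, ls = x[order], is_A[order]
closer_left = (xs[1:-1] - xs[:-2]) < (xs[2:] - xs[1:-1])
nn_label = np.where(closer_left, ls[:-2], ls[2:])
e_nn = np.mean(nn_label != ls[1:-1])

# Bayesian rule for this problem: choose A when x < 2, else B.
e_bayes = np.mean((x < 2) != is_A)
print(e_nn, e_bayes)    # ~0.167 and ~0.125, i.e. about 1/6 and 1/8
```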


Example 4.6

Figure 4.10: (a) Density functions for Example 4.6. (b) The density functions multiplied by the prior probabilities of their classes.

A Bound on the Nearest Neighbor Error Rate

Let C_m be the class with the largest posterior probability at x, the class a Bayesian classifier would choose there. The local Bayesian error rate is

E_B(x) = 1 - P(C_m|x),    (10)

and, from the preceding derivation, the local nearest neighbor error rate is

E_{NN}(x) = 1 - \sum_i [P(C_i|x)]^2.    (11)

Substituting from (10),

E_{NN}(x) = 1 - [1 - E_B(x)]^2 - \sum_{i \neq m} [P(C_i|x)]^2.    (12)

We now bound the remaining sum in (12). Since

\sum_{i \neq m} P(C_i|x) = E_B(x),

and the sum of squares of the c - 1 remaining posteriors with this fixed total is smallest when they are all equal,

\sum_{i \neq m} [P(C_i|x)]^2 \geq \frac{[E_B(x)]^2}{c - 1}.    (13)

Substituting (13) for the sum in (12) produces the inequality

E_{NN}(x) \leq 1 - [1 - E_B(x)]^2 - \frac{[E_B(x)]^2}{c - 1} = 2 E_B(x) - \frac{c}{c-1} [E_B(x)]^2.    (14)

The term c/(c - 1) equals 2 when there are two classes and it approaches 1 as the number of classes becomes large.
  Since (14) is true at all values of x, it can be used to compare the overall error rates

E_{NN} = \int E_{NN}(x) \, p(x) \, dx    (15)

and

E_B = \int E_B(x) \, p(x) \, dx.    (16)

Multiplying (14) by p(x) and integrating gives

E_{NN} \leq 2 E_B - \frac{c}{c-1} \int [E_B(x)]^2 \, p(x) \, dx.    (17)

The integral of a nonnegative quantity must be greater than or equal to 0, so

\int [E_B(x) - E_B]^2 \, p(x) \, dx \geq 0,

thus

\int [E_B(x)]^2 \, p(x) \, dx \geq E_B^2.    (18)

Substituting the integral from (18) into (17) gives

E_{NN} \leq 2 E_B - \frac{c}{c-1} E_B^2,    (19)

or

E_{NN} \leq E_B \left( 2 - \frac{c}{c-1} E_B \right).

A Lower Bound on the Bayesian Error Rate from Nearest Neighbor Results

Solving (19) for E_B yields a lower bound on the Bayesian error rate in terms of an observed nearest neighbor error rate:

E_B \geq \frac{c-1}{c} \left( 1 - \sqrt{1 - \frac{c}{c-1} E_{NN}} \right).    (20)

When c = 2, this becomes

E_B \geq \frac{1}{2} \left( 1 - \sqrt{1 - 2 E_{NN}} \right),

so when E_{NN} = 1/6, as in Example 4.5,

E_B \geq \frac{1}{2} \left( 1 - \sqrt{2/3} \right) \approx 0.092,

which is consistent with the Bayesian error rate of 1/8 found there.
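Bounds (19) and (20) are easy to evaluate; a small sketch (our function names) using the numbers from Example 4.5:

```python
import math

def nn_upper_bound(e_bayes, c=2):
    """Upper bound (19) on the asymptotic nearest neighbor error rate."""
    return e_bayes * (2 - c / (c - 1) * e_bayes)

def bayes_lower_bound(e_nn, c=2):
    """Lower bound (20) on the Bayesian error rate from an observed
    nearest neighbor error rate."""
    a = c / (c - 1)
    return (1 - math.sqrt(1 - a * e_nn)) / a

print(nn_upper_bound(1/8))      # 0.21875 >= 1/6, as (19) requires
print(bayes_lower_bound(1/6))   # ~0.0918 <= 1/8, as (20) requires
```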

The k-nearest Neighbor Technique

The commonly used variant of the nearest neighbor method classifies an unknown sample not by its single nearest neighbor but by the "votes" of its k nearest neighbors. This k-nearest neighbor classification procedure is often abbreviated k-NN. If the costs of error are the same for each class, the unknown sample is assigned to the class most commonly represented among its k nearest neighbors. For example, with three neighbors as in Figure 4.8, the unknown sample at (1, 1) is classified as belonging to class B, because its 3 nearest neighbors consist of the sample (1, 3) from class A and two samples from class B.
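A sketch of k-NN voting under equal error costs, reusing the invented coordinates from the earlier nearest neighbor sketch:

```python
from collections import Counter
import numpy as np

def knn_classify(x, reference, labels, k=3):
    """Assign x to the class most commonly represented among the
    labels of its k nearest reference samples (Euclidean distance)."""
    diffs = np.asarray(reference, dtype=float) - np.asarray(x, dtype=float)
    d = np.sqrt((diffs ** 2).sum(axis=1))
    nearest = np.argsort(d)[:k]
    return Counter(np.asarray(labels)[nearest]).most_common(1)[0][0]

ref = np.array([[1, 3], [5, 5], [6, 4],     # class A
                [3, 2], [2, 3]])            # class B (hypothetical)
labels = np.array(["A", "A", "A", "B", "B"])
print(knn_classify([1, 1], ref, labels, k=1))   # -> A (single nearest neighbor)
print(knn_classify([1, 1], ref, labels, k=3))   # -> B (two of three votes)
```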

Figure 4.11: Upper bounds on the k-NN error rate as a function of the Bayesian error rate for two classes.


Scale Factors

When the features have very different numeric ranges, a feature that varies over, say, 0 to 1,000,000 will dominate the distance computation over one that varies only from 0 to 1. Each feature is therefore usually divided by a scale factor, chosen, for example, so that the features have comparable average absolute values or standard deviations, before distances are computed.

Example 4.7

Figure 4.12: Performance of variants of the nearest neighbor decision rule [Wu].

Other Nearest Neighbor Techniques

 

4.5  Adaptive Decision Boundaries

Adaptive decision boundary techniques assume that the functional form of the decision boundary between the classes (for example, that it is linear) is known, and then search for the boundary of that form that best separates the classes. For example, if a linear decision boundary is used to separate two classes and each sample has n features, the discriminant function D has the form

D(x) = w_0 + w_1 x_1 + \cdots + w_n x_n

In this equation, D = 0 is the decision boundary that separates the two classes. The weights w_0, ..., w_n are chosen to give good performance on the training set. Say that a sample with feature vector x is assigned to class 1 if D(x) > 0 and to class -1 if D(x) < 0. Then any sample x with D(x) \neq 0 can be classified.

Geometrically, D = 0 is the equation of a decision boundary that divides the n-dimensional feature space into two regions. If a decision boundary exists for which D > 0 for every sample of class 1 and D < 0 for every sample of class -1, the two classes are said to be linearly separable.


The adaptive decision boundary algorithm consists of the following steps:

  1. Initialize the weights w_0, ..., w_n to zero or to small random values. A good initial choice speeds convergence to a perfect solution, if one exists.
  2. Choose the next sample x from the training set, and let d, either +1 or -1, be the desired class, that is, the true class of x.
  3. Compute D(x) = w_0 + w_1 x_1 + ... + w_n x_n.
  4. If the sample is misclassified, that is, if d D \leq 0, replace w_i by w_i + c d x_i for i = 1, ..., n, where c is a positive constant that controls the step size of the weight adjustment, and replace w_0 by w_0 + c d. If d D > 0, leave the weights unchanged.
  5. Repeat steps 2 through 4 for each sample in the training set; when the set is exhausted, pass through it again. Stop when every sample is classified correctly during a complete pass. If the two classes are not linearly separable, this process never terminates, so an additional stopping rule is needed: either fix a maximum number of passes, or stop when the average error rate ceases to decrease.

To see that the weights are adjusted in the proper direction, suppose x was misclassified. The new value of D will be

D_{new} = (w_0 + c d) + \sum_{i=1}^{n} (w_i + c d x_i) x_i = D_{old} + c d \left( 1 + \sum_{i=1}^{n} x_i^2 \right)

Since 1 + \sum x_i^2 is positive, the new D is larger than the old when d = 1 and smaller when d = -1; that is, D moves toward the sign of the desired class.

Adjusting the weights in this algorithm thus amounts to moving the decision boundary so that a misclassified sample comes closer to being classified correctly. After the adaptation for one sample, the new D need not be recomputed; the algorithm simply moves on to the next sample.
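The whole procedure fits in a few lines of Python. This sketch (the names are ours) trains on the data of Example 4.8 below and reproduces the boundary found there:

```python
import numpy as np

def train_adaptive_boundary(X, d, c=1.0, max_passes=100):
    """Adaptive decision boundary training.

    X: (m, n) training feature vectors; d: desired classes, +1 or -1.
    On each misclassification (d * D <= 0), w0 becomes w0 + c*d and
    wi becomes wi + c*d*xi; training stops after an error-free pass."""
    X = np.asarray(X, dtype=float)
    w = np.zeros(X.shape[1] + 1)             # w[0] is the constant term w0
    for _ in range(max_passes):
        errors = 0
        for xk, dk in zip(X, d):
            D = w[0] + w[1:] @ xk
            if dk * D <= 0:                  # misclassified (or on the boundary)
                w[0] += c * dk
                w[1:] += c * dk * xk
                errors += 1
        if errors == 0:                      # a full pass with no errors
            break
    return w

# Example 4.8: x = -4 has class -1, x = -1 has class +1.
print(train_adaptive_boundary([[-4], [-1]], [-1, 1]))
# -> [2. 1.], i.e. D = 2 + x, decision boundary at x = -2
```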


Example 4.8  Finding a decision boundary that classifies samples using a single numeric feature x. The training samples are as follows:

k     x     d
1    -4    -1
2    -1     1

Here k is the sample number, x is the one-dimensional feature, and d is the desired class. The constant c is set to 1 (any suitable positive value would do). Starting from the weights w_0 = w_1 = 0, the first sample x = -4 gives D = w_0 + w_1 x = 0. We agreed to call a sample class 1 if D > 0 and class -1 if D < 0; but here D = 0 while the desired class is d = -1, so the sample is misclassified (an error) and new weights must be computed:

w_0 = 0 + (1)(-1) = -1,    w_1 = 0 + (1)(-1)(-4) = 4.

These new weights appear in the n = 1 row of the following table, where n is the iteration number. The algorithm continues to iterate, and only at n = 7 and 8 are both samples classified correctly, since the decision boundary is then obtained from D = 2 + x = 0, that is, x = -2. The final decision boundary is D = 2 + x. The data show that any satisfactory decision boundary lies between x = -4 and x = -1.

n    k    x    d    Old w_0    Old w_1     D     Error?    New w_0    New w_1
1    1   -4   -1        0          0        0     Yes          -1          4
2    2   -1    1       -1          4       -5     Yes           0          3
3    1   -4   -1        0          3      -12     No            0          3
4    2   -1    1        0          3       -3     Yes           1          2
5    1   -4   -1        1          2       -7     No            1          2
6    2   -1    1        1          2       -1     Yes           2          1
7    1   -4   -1        2          1       -2     No            2          1
8    2   -1    1        2          1        1     No            2          1


Example 4.9  Finding a decision boundary that classifies samples with two numeric features. The training samples are as follows:

x_1     2     3     5    50    65    35
x_2    10     8     2    25    30    40
d       1     1     1    -1    -1    -1

 

The figures below show the positions of the decision boundaries obtained for this example.

(a) With the raw, unnormalized features, the decision boundaries obtained after 20, 40, 60, and 63 passes of the algorithm through the training set. Iteration continued until the sample at (5, 2) was classified correctly. Without normalizing the features, 381 weight-update steps, about 63 passes, were needed to obtain correct weights.

(b) With each feature divided by its average absolute value, the boundaries obtained after 1, 2, 3, and 4 passes. The algorithm converges after only 4 passes.



4.6  Adaptive Discriminant Functions

When there are more than two classes, a separate linear discriminant function can be derived for each class, and a sample is assigned to the class whose discriminant function is largest. With c classes and n variables, the set of linear discriminant functions is

D_i(x) = w_{i0} + w_{i1} x_1 + \cdots + w_{in} x_n,    i = 1, \ldots, c.

The variables x_j may themselves be nonlinear functions of the measured features, so nonlinear discriminant functions can be used as long as each D_i remains a linear function of its set of weights.

To classify a sample x, the discriminant functions above are computed and the sample is assigned to the class C_i with the largest D_i(x). The method of adapting the weights of these discriminant functions is the same as that used to find the weights of a linear decision boundary, and the technique is likewise guaranteed to converge to a set of weights that classifies all of the data perfectly, if such a solution exists.

If a sample x that should have been classified into class C_i is misclassified into class C_j, new weights are computed for the two relevant discriminant functions, D_i and D_j, as follows:

w_{i0} \leftarrow w_{i0} + c,    w_{ik} \leftarrow w_{ik} + c x_k

w_{j0} \leftarrow w_{j0} - c,    w_{jk} \leftarrow w_{jk} - c x_k

for k = 1, \ldots, n.
That is, the weights of D_i are increased so that its value moves toward being the maximum, while the weights of D_j are decreased to reduce its value. The other discriminant functions need not be changed, since they were not involved in the misclassification.

This simple weight-update scheme can achieve perfect classification when one is possible. When no combination of weights classifies the data without error, it is not guaranteed to find the best possible weights, but experience shows that it usually reaches a good compromise.
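A sketch of this multi-class update rule (our names; the classes are taken to be the integers 0, ..., c-1):

```python
import numpy as np

def train_discriminants(X, y, c=1.0, max_passes=100):
    """Adaptive linear discriminant functions D_i(x) = w_i0 + w_i . x.

    When a sample of class i is misclassified as class j, the weights
    of D_i are raised by c (bias) and c*x (feature weights), those of
    D_j are lowered by the same amounts, and the rest are untouched."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=int)
    n_classes = y.max() + 1
    Xa = np.hstack([np.ones((len(X), 1)), X])   # prepend 1 for the bias term
    W = np.zeros((n_classes, X.shape[1] + 1))   # row i holds w_i0, ..., w_in
    for _ in range(max_passes):
        errors = 0
        for xk, i in zip(Xa, y):
            j = int(np.argmax(W @ xk))          # class with the largest D
            if j != i:                          # misclassified: adjust D_i, D_j
                W[i] += c * xk
                W[j] -= c * xk
                errors += 1
        if errors == 0:
            break
    return W

def classify(W, x):
    return int(np.argmax(W @ np.concatenate(([1.0], np.asarray(x, dtype=float)))))
```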

A set of linear discriminant functions partitions the feature space into regions, each consisting of the points where one discriminant function is largest. To classify data it is not necessary to compute the locations of the decision boundaries between these regions, but it is interesting to see what shapes the regions take. The next example shows the decision regions for three classes.


Example 4.10  Finding the decision regions resulting from three discriminant functions.



Figure 4.15

Figure 4.16

4.7  Minimum Squared Error Discriminant Functions

Although the adaptive decision boundary and adaptive discriminant function methods are quite attractive, the classes must be linearly separable in the feature space for these algorithms to classify perfectly, and even when the classes are separable by a hyperplane, many iterations may be required to find the decision boundary. Moreover, the adaptive algorithm terminates at the first set of weights that classifies the training data correctly, which may not match the intuitive notion of a good decision boundary; that is, it can stop at a boundary that is not a good one. In panel (b) of the figure below, for example, a vertical decision boundary between the +'s and -'s is intuitively better than the adaptive decision boundary obtained after 4 passes.

The minimum squared error classification method requires neither iteration nor linear separability. It tends to find a more intuitively plausible decision boundary than the adaptive decision boundary method, although it is not guaranteed to find a perfect solution even when one exists. Regardless of the number of classes, minimum squared error classification uses only a single discriminant function.

If there are m samples with n features each, there are m feature vectors

x_k = (x_{k1}, x_{k2}, \ldots, x_{kn}),    k = 1, \ldots, m.

Let d_k be the true class of x_k; it may be assigned any numeric value. We want a set of weights w_0, w_1, \ldots, w_n for a single linear discriminant function

D(x_k) = w_0 + w_1 x_{k1} + \cdots + w_n x_{kn}    (24)

such that D(x_k) = d_k for all k. In practice such weights rarely exist, so instead the weights are chosen to make D(x_k) as close to d_k as possible, that is, to minimize the sum of squared differences between the desired values d_k and the actual values D(x_k). That squared error E is

E = \sum_{k=1}^{m} [D(x_k) - d_k]^2.    (25)


Example 4.11  The minimum squared error procedure.

x_1    0    0    1    1
x_2    0    1    0    1
d     -1   -1    1    1

Inserting this data and (24) into (25) produces

E = (w_0 + 1)^2 + (w_0 + w_2 + 1)^2 + (w_0 + w_1 - 1)^2 + (w_0 + w_1 + w_2 - 1)^2.

Computing the partial derivatives of E with respect to w_0, w_1, and w_2 and setting each one equal to zero produces

2(w_0 + 1) + 2(w_0 + w_2 + 1) + 2(w_0 + w_1 - 1) + 2(w_0 + w_1 + w_2 - 1) = 0,

2(w_0 + w_1 - 1) + 2(w_0 + w_1 + w_2 - 1) = 0,

and

2(w_0 + w_2 + 1) + 2(w_0 + w_1 + w_2 - 1) = 0.

Solving for w_0, w_1, and w_2 results in w_0 = -1, w_1 = 2, and w_2 = 0. Substituting these weights into equation (24) gives the discriminant function

D(x) = -1 + 2 x_1,

or, setting D(x) = 0, the decision boundary x_1 = 1/2.
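Because E is quadratic in the weights, setting the partial derivatives to zero gives linear normal equations, which a least-squares solver handles directly. This sketch reproduces Example 4.11:

```python
import numpy as np

# Example 4.11 data: two features, desired outputs d = -1, -1, 1, 1.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
d = np.array([-1, -1, 1, 1], dtype=float)

# Minimize E = sum_k (D(x_k) - d_k)^2 with D(x) = w0 + w1*x1 + w2*x2
# by least squares on the augmented data matrix [1 | X].
Xa = np.hstack([np.ones((len(X), 1)), X])
w, *_ = np.linalg.lstsq(Xa, d, rcond=None)
print(w)    # [-1.  2.  0.]  ->  D(x) = -1 + 2*x1, boundary at x1 = 1/2
```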


Example 4.12  Comparison of the minimum squared error decision boundary with the adaptive decision boundary.


Figure 4.17: The minimum squared error decision boundary for Example 4.11.


4.8  Choosing a Decision Making Technique

 

In panel (a) of the figure above, the minimum squared error (MSE) method gives a better result than the adaptive decision boundary (ADB) method. In panel (b), the ADB method gives a better boundary than MSE.

 

In panel (a) of the figure above, the classes are separated perfectly using one adaptive linear decision boundary per class; only the boundaries that are needed are shown. In (b), the classes are separated perfectly using one adaptive discriminant function per class. In (c), they are separated perfectly by minimum squared error classification using the desired values 1, 3, 5, and 9 for the four classes.
