ÀÚ¿¬¾ð¾î ó¸®

(Natural-Language Processing)

 

C ÀΰøÁö´É ÇÁ·Î±×·¡¹Ö : Herbert Schildt ÁöÀ½, ½Å°æ¼÷.·ù¼º·Ä ¿Å±è, ¼¼¿õ, 1991 (¿ø¼­ : Artificial Intelligence using C, McGraw-Hill, 1987), page 155~214

 

1. ÀÚ¿¬¾ð¾î 󸮶õ ¹«¾ùÀΰ¡? (WHAT IS NATURAL-LANGUAGE PROCESSING ?)

2. ÀÚ¿¬¾ð¾î ó¸® Á¢±Ù¹æ½Ä (APPROACHES TO NATURAL-LANGUAGE PROCESSING)

3. ¾ð¾î Á¦ÇÑ (RESTRICTING LANGUAGE)

4. »óÅÂ-¸Ó½Å NLP Æļ­ (THE STATE-MACHINE NLP PARSER)

     (1) »óŸӽŠNLP Æļ­ÀÇ ºÐ¼® (Analysis the state-machine NLP parser)

5. ¹®¸ÆÀÚÀ¯ µÇºÎ¸§ ÇÏÇâ Æļ­ (THE CONTEXT-FREE RECURSIVE-DESCENT NLP PARSER)

     (1) ¹®¸ÆÀÚÀ¯ Recursive-Descent NLP Æļ­ÀÇ ºÐ¼®
         (Analysis of the Context-Free Recursive-Descent NLP Parser)

6. ÀâÀ½Á¦°Å Æļ­ (THE NOISE-DISPOSAL PARSER)

    (1) ÀâÀ½Á¦°Å Æļ­ÀÇ ºÐ¼® (Analysis of the Noise-Disposal Parser)

 

Á¦ 1 Àå¿¡¼­ ¾ð±ÞÇßµíÀÌ, ¸¹Àº AI Àü¹®°¡µéÀº AI °¡ ÇØ°áÇÒ ¼ö ÀÖ´Â °¡Àå Áß¿äÇÑ ¾÷¹«´Â ÀÚ¿¬¾ð¾î 󸮶ó°í ¹Ï´Â´Ù. ÀÌ·¸°Ô ¹Ï´Â ÀÌÀ¯´Â ÀÏ´Ü ¿Ï¼º¸¸ µÇ¸é ÀÚ¿¬¾ð¾î 󸮴 »ç¶÷°ú ÄÄÇ»ÅÍ »çÀÌ¿¡ Á÷Á¢ÀûÀÎ ´ëÈ­ÀÇ ¹®À» ¿¬´Ù´Â °ÍÀ̸ç, ÀÌ°ÍÀº º¸Åë »ç¿ëµÇ´Â ÇÁ·Î±×·¡¹Ö°ú ¿î¿µÃ¼Á¦ ÇÁ·ÎÅäÄÝÀ» ÇÇÇÒ °ÍÀÌ´Ù ; ¸¸¾à ÄÄÇ»ÅÍ°¡ Àΰ£ÀÇ ¾ð¾î¸¦ ÀÌÇØÇÏ°í ¸»ÇÒ ¼ö ÀÖ´Ù¸é, ´ëºÎºÐÀÇ ÀÏÀÌ ¼ÒÇÁÆ®¿þ¾î ±â¼úÀڵ鿡 ÀÇÇØ ÇÁ·Î±×·¥À¸·Î ÀÛ¼ºµÉ ÇÊ¿ä°¡ ´õ ÀÌ»ó ¾ø°Ô µÉ °ÍÀÌ´Ù.

ÀÌ Àå¿¡¼­ º¸°ÚÁö¸¸, ÀÚ¿¬¾ð¾î 󸮴 "ÇàÇÒ ¼ö ÀÖ´Â (do-able)" ÀÏÀÌ´Ù. ±×·¯³ª Àΰ£ÀÇ ¾ð¾îÀÇ ¾öû³­ Å©±â¿Í º¹À⼺ ¶§¹®¿¡ ÀÌ·ç¾îÁöÁö ¸øÇß´Ù. ±×·¯³ª, ÀÚ¿¬¾ð¾î 󸮰¡ ¶§¶§·Î ¾ó¸¶³ª °£´ÜÇÒ ¼ö ÀÖ´ÂÁö ³î¶ö °ÍÀÌ´Ù. ¶ÇÇÑ ÄÄÇ»ÅÍ°¡ ¹®ÀåÀÇ ¶æÀ» ÀÌÇØÇϵµ·Ï ÇÏ´Â ¾î·Á¿òµµ ¾Ë°Ô µÉ °ÍÀÌ´Ù. ÀÌ Àå¿¡¼­´Â ÀÚ¿¬¾ð¾î 󸮷ÎÀÇ ¼¼°¡Áö ÀϹÝÀûÀÎ Á¢±Ù ¹æ½Ä¿¡ ´ëÇÏ¿© ¼³¸íÇÏ°í °¢ ¹æ¹ýÀÇ ¿¹¸¦ °³¹ßÇÑ´Ù.

1. ÀÚ¿¬¾ð¾î 󸮶õ ¹«¾ùÀΰ¡? (WHAT IS NATURAL-LANGUAGE PROCESSING ?)

¾Õ¿¡¼­ ¾ð±ÞÇßµíÀÌ, º¸Åë NLP ·Î »ý·«ÇÏ¿© ºÎ¸£´Â ÀÚ¿¬¾ð¾î 󸮴 ÄÄÇ»ÅÍ·Î ÇÏ¿©±Ý Àΰ£ÀÇ Ç¥ÁØ ¾ð¾î·Î ¾²¿©Áø ¸í·ÉµéÀ» ÀÌÇØÇÒ ¼ö ÀÖ°Ô ÇÏ·Á°í ½ÃµµÇÑ´Ù (ÀÌ Ã¥ÀÇ ³ª¸ÓÁö¿¡¼­´Â ¿µ¾î°¡ ó¸®µÉ Àΰ£ÀÇ ¾ð¾î¶ó°í °¡Á¤ÇÏÁö¸¸, Á¦½ÃµÈ ¸ðµç °³³äµéÀ» ´Ù¸¥ ¾î¶² ¾ð¾î¿¡µµ Àû¿ëÇÒ ¼ö ÀÖ´Ù). NLP ÀÇ ´Ù¼Ò ´ú Áß¿äÇÑ ºÎºÐÀº ÄÄÇ»ÅÍ·Î ÇÏ¿©±Ý ÀÚ¿¬¾ð¾î¿Í ºñ½ÁÇÑ ¹ÝÀÀÀ» ±¸¼ºÇÏ°Ô ÇÏ´Â °ÍÀÌ´Ù. ÄÄÇ»ÅÍ·Î ÇÏ¿©±Ý ÀÚ¿¬¾ð¾î¸¦ ÀÌÇØÇÏ°Ô ÇÑ ÈÄ¿¡, ¹ÝÀÀÀ» »ý¼ºÇÏ´Â °ÍÀº ÀÛÀº ´Ü°èÀÌ´Ù. ÀÌ Àå¿¡¼­´Â ¹ÝÀÀÀ» ±¸¼ºÇÏ´Â µ¥¿¡ °ü½ÉÀÌ ¾ø´Ù.

´ëºÎºÐ, ¾ð¾î ÇÕ¼º°ú ÀνÄÀº ½ÇÁ¦·Î NLP ÀÇ ºÐ¾ß°¡ ¾Æ´Ï´Ù. ÀÚ¿¬¾ð¾î 󸮱â´Â ¹®ÀåÀÌ ÄÄÇ»ÅÍ¿¡ ¾î¶»°Ô ÀԷµǴ Áö¿¡ °ü½ÉÀÌ ¾ø´Ù. ÀÏÀº ±× ¹®ÀåÀ¸·ÎºÎÅÍ Á¤º¸¸¦ »Ì¾Æ³»´Â °ÍÀÌ´Ù. ÀÌ Àå¿¡¼­´Â ¸ðµç ´ëÈ­°¡ Å͹̳ο¡¼­ ¹ß»ýÇÑ´Ù°í °¡Á¤ÇÑ´Ù.

¾Æ¸¶µµ »ý°¢ÇÒ ¼ö ÀÖµíÀÌ ÀÚ¿¬¾ð¾î 󸮱â´Â ¿¬±¸¸¦ À§ÇÑ °Í¸¸À» Á¦¿ÜÇÏ°í ´Üµ¶À¸·Î »ç¿ëµÇÁö ¾Ê´Â´Ù. ±×·¯³ª NLP ´Â ´Ù¸¥ ÄÄÇ»ÅÍ ÇÁ·Î±×·¥ - ƯÈ÷ µ¥ÀÌÅͺ£À̽º ¸Å´ÏÀú¿Í ¹ü¿ë ¹®Á¦ÇØ°á±â - ¸¦ À§ÇÑ ÀüÈÄ󸮱â (front - ends) ·Î Á¦°øÇÒ ¼ö ÀÖ´Ù. ¶ÇÇÑ, ¸¹Àº ÇÁ·Î±×·¡¸ÓµéÀº, °á±¹ ÄÄÇ»Å͸¦ »ç¿ëÇÏ´Â ¹ýÀ» ¹è¿ì´Âµ¥ °É¸®´Â ½Ã°£À» ¾ø¾ÖÁÙ NLP À§ÁÖÀÇ ¿î¿µÃ¼Á¦¿¡ °ü½ÉÀ» °®´Â´Ù. ±¸¹® ¹Î°¨ (Context-sensitive) ÀÎ ¿Ü±¹¾î ¹ø¿ª±âµéÀº Á¤È®ÇÑ ¹ø¿ªÀ» Çϱâ À§Çؼ­ ÀÚ¿¬¾ð¾î 󸮸¦ ÇÊ¿ä·Î ÇÑ´Ù. ¸¶Áö¸·À¸·Î, NLP °¡ °á±¹Àº Àΰ£¼¼°è¿Í »óÈ£ÀÛ¿ëÀ» ÇؾßÇÏ´Â ÀÚÀ²ÀûÀÎ ·Îº¿¿¡ ÇʼöÀûÀ̶ó´Â °ÍÀº ÀǽÉÇÒ ¹Ù°¡ ¾ø´Ù.

2. ÀÚ¿¬¾ð¾î ó¸® Á¢±Ù¹æ½Ä (APPROACHES TO NATURAL-LANGUAGE PROCESSING)

NLP ¶ó´Â ÁÖÁ¦´Â ¸Å¿ì Å©±â ¶§¹®¿¡, ÇÑ Àå¿¡ ±× ¸ðµÎ¸¦ ´Ù·ç·Á°í ÇÏ´Â °ÍÀº ¹Ù¶÷Á÷ÇÏÁö ¸øÇÏ´Ù. ´ë½Å, ÀÌ Àå¿¡¼­´Â ÀÚ¿¬¾ð¾î 󸮿¡ °üÇÑ ¼¼°¡Áö Á¢±Ù ¹æ½Ä¿¡ ÃÊÁ¡À» µÐ´Ù.

¾î¶² NLP ½Ã½ºÅÛÀÌ¶óµµ ±× ÇÙÀº Æļ­ (parser) ÀÌ´Ù. Æļ­´Â ¹¹°¡¹ºÁö °áÁ¤Çϱâ À§Çؼ­ °¢ ¹®ÀåÀ» ÇÑ ´Ü¾î¾¿ Àд ÄÚµå ºÎºÐÀÌ´Ù. °øºÎÇÒ ¼¼°¡Áö Æļ­´Â ´ÙÀ½°ú °°´Ù.

°¢ Æļ­´Â ¹®ÀåÀ» ´Ù¸£°Ô º¸°í ÀÚ½ÅÀÇ Æ¯º°ÇÑ ÀÀ¿ëÀ» °®´Â´Ù.

ÀÚ¿¬¾ð¾î 󸮷ÎÀÇ µÎ °¡Áö ¹Ý´ë Á¢±Ù¹æ½ÄÀÌ ÀÖ´Ù. Çϳª´Â Àΰ£¼¼°è¿Í ¶È°°ÀÌ, ¹®Àå¿¡ ÀÖ´Â ¸ðµç Á¤º¸¸¦ »ç¿ëÇÏ·Á°í ½ÃµµÇÑ´Ù. ÀÌ ¹æ½ÄÀÇ ¸ñÇ¥´Â ÄÄÇ»ÅÍ·Î ÇÏ¿©±Ý ´ëÈ­¸¦ ÇÒ ¼ö ÀÖ°Ô ¸¸µå´Â °ÍÀÌ´Ù. ±×·¯³ª, À̸¦ ÀÌ·ç±â´Â ¸Å¿ì ¾î·Æ´Ù. ´Ù¸¥ Á¢±Ù¹æ½ÄÀº ÄÄÇ»ÅÍ·Î ÇÏ¿©±Ý ÀÚ¿¬¾ð¾î ¸í·É¾î¸¦ ¹Þ¾ÆµéÀÏ ¼ö ÀÖ°Ô ÇÏ°í, ±×·¸Áö¸¸ ±× ¸í·É¾î¿¡ ÇʼöÀûÀÎ Á¤º¸¸¸ ²ø¾î³¾·Á°í ÇÑ´Ù - ÀÌ°ÍÀº ÇÁ·Î±×·¥ ÇϱⰡ ÈξÀ ´õ ½¬¿î ÀÏÀÌ´Ù. ÀÌ Àå¿¡¼­, ´Ü ÇÑ°¡Áö Æļ­¸¸ÀÌ Ã¹ ¹ø° ¸ñÇ¥¿¡ µµ´ÞÇÒ ±âȸ¸¦ °®°Ô µÇÁö¸¸ µÎ ¹ø° ¸ñÇ¥¸¦ ¼öÇàÇϱâ À§Çؼ­µµ Á¦½ÃµÈ ¸ðµç Æļ­µéÀ» »ç¿ëÇÒ ¼ö ÀÖ´Ù´Â °ÍÀ» ¾Ë°Ô µÉ °ÍÀÌ´Ù. ÀÌ Àå¿¡¼­´Â ÀÚ¿¬¾ð¾î°¡ ÄÄÇ»ÅÍ°¡ »ç¿ëÇÒ ¼ö ÀÖ´Â ÇüÅ·Πº¯È¯µÇ°Ô ÇÏ´Â ±â¹ýµé¸¸À» ´Ù·é´Ù.

3. ¾ð¾î Á¦ÇÑ (RESTRICTING LANGUAGE)

NLP Áß½ÉÀÇ ½Ã½ºÅÛÀ» ±¸¼ºÇÏ´Â °¡Àå ¾î·Á¿î ¸éµé ÁßÀÇ Çϳª´Â Àΰ£ÀÇ ¾ð¾îÀÇ º¹À⼺°ú À¶Å뼺ÀÌ´Ù. ÀÚ¿¬¾ð¾î 󸮱⸦ ±¸ÇöÇÒ ¶§, 󸮱Ⱑ ÀÌÇØÇÒ ¹®ÀåÀÇ À¯ÇüÀ» ÀÚ¿¬¾ð¾îÀÇ ÀϺηΠÁ¦ÇÑ ÇÏ·Á´Â À¯È¤À» ¹Þ°Ô µÈ´Ù. ¸¸¾à 󸮱Ⱑ ¹Þ¾Æµé¿©¾ß ÇÏ´Â ¹®¹ýÀ» Á¦ÇÑÇϸé, ÀÏÀº ÈÙ¾À ´õ ½¬¿öÁö°í, ¿Ã¹Ù·Î ÇàÇØÁø´Ù¸é, ±× Á¦ÇÑÀº °ÅÀÇ ´«¿¡ ¶çÁö ¾Ê´Â´Ù. ¾î¶² °æ¿ì¿¡µµ, ÀÌ Àå¿¡ Àִ ó¸®±âµéÀÌ ¹Þ¾ÆµéÀÏ ¼ö ÀÖ´Â ¹®¹ýÀ» Á¦ÇÑÇÏ´Â °ÍÀÌ ÇÊ¿äÇÏ´Ù. ÀÌ°ÍÀÌ ÇàÇØÁöÁö ¾ÊÀ¸¸é, °¢ ¿¹·ÎÀÇ ÄÚµå´Â ³Ê¹« ±æ °ÍÀÌ´Ù.

±×·¯¹Ç·Î ÀÌ Àå¿¡ ÀÖ´Â ´ëºÎºÐÀÇ ¿¹µé¿¡ ´ëÇÏ¿©, ¸ðµç ¹®ÀåÀº ¼±¾ðÀû (declarative) ÀÌ°í, Àǹ®Çü (interrogative) ÀÌ ¾Æ´Ï¶ó´Â °Í°ú, Ç¥ÁØÇü ÁÖ¾î, µ¿»ç, ¸ñÀû¾î (subject, verb, object) ¸¦ µû¸¥´Ù´Â °ÍÀ» °¡Á¤ÇÑ´Ù. ¶ÇÇÑ ¸ðµç Çü¿ë»ç´Â ¼ö½ÄÇÏ´Â ¸í»ç ¾Õ¿¡ ¿À´Â ¹Ý¸é¿¡, ¸ðµç ºÎ»ç´Â ¼ö½ÄÇÏ´Â µ¿»ç ´ÙÀ½¿¡ ¿Â´Ù´Â °ÍÀ» °¡Á¤ÇØ¾ß ÇÑ´Ù. ±×·¯¹Ç·Î ´ÙÀ½ ¹®ÀåÀº Ÿ´çÇÏ´Ù :

The child runs to the house.

The large child runs quickly to the window.

±×·¯³ª ÀÌ Àå¿¡ ÀÖ´Â Æļ­µéÀº ´ÙÀ½°ú °°Àº ¹®ÀåÀ» ºÎ»ç quickly °¡ µ¿»ç runs ¾Õ¿¡ ¿À±â ¶§¹®¿¡ Ÿ´çÇÏÁö ¸øÇÏ´Ù°í °áÁ¤ÇÒ °ÍÀÌ´Ù.

The child quickly runs to the house.

ÀÌ ÀåÀÇ ³ª¸ÓÁö ºÎºÐ¿¡ ´ëÇÏ¿©, ÀÌ Á¦ÇÑµÈ ¹®¹ýÀº G1 ¹®¹ýÀ¸·Î ÀÏÄþî Áú °ÍÀÌ´Ù. ÀÌ·¯ÇÑ ±ÔÄ¢µé ¿Ü¿¡, »ç¿ëÇÒ ¾îÈÖ°¡ ÇÊ¿äÇÒ °ÍÀÌ´Ù. ¿¹¸¦ °£´ÜÈ÷ Çϱâ À§Çؼ­, ÀÌ Àå¿¡¼­´Â ´Ü¾îÀÇ ¼ö¸¦ ÃÖ¼Ò·Î À¯ÁöÇÏ°ÚÁö¸¸, ¿øÇÑ´Ù¸é ¸®½ºÆ®¿¡ ÀÚÀ¯·Ó°Ô µ¡ºÙÀÏ ¼ö ÀÖ´Ù. ÀÌ Àå¿¡ ÀÖ´Â ¿¹¿¡ ´ëÇÏ¿© Æļ­´Â À¯Çü (type) °ú ÇÔ²² ´ÙÀ½¿¡ ÀÖ´Â ´Ü¾îµéÀ» ÀνÄÇÒ °ÍÀÌ´Ù.

´Ü¾î                                                                            À¯Çü

¹® (door), â¹® (window), Áý (house), ¾ÆÀÌ (child)             ¸í»ç (noun)

°¡Áö´Ù (has), ´Þ¸®´Ù (runs), ³î´Ù (plays)                         µ¿»ç (verb)

Å« (large)                                                                   Çü¿ë»ç (adjective)

»¡¸® (quickly)                                                              ºÎ»ç (adverb)

±× (the), ÇϳªÀÇ (a)                                                      ÇÑÁ¤»ç (determiner)

~·Î (to)                                                                     ÀüÄ¡»ç (preposition)

4. »óÅÂ-¸Ó½Å NLP Æļ­ (THE STATE-MACHINE NLP PARSER)

»óÅÂ-¸Ó½Å Æļ­´Â ¾î¶² À¯ÇüÀÇ ´Ü¾î°¡ ¿Ã¹Ù¸£°Ô ³ª¿Ã ¼ö ÀÖ´ÂÁö ¿¹ÃøÇϱâ À§Çؼ­ ¹®ÀåÀÇ ÇöÀç »óŸ¦ »ç¿ëÇÑ´Ù. ±×¸² 1 ÀÌ ÀÌ Àå¿¡¼­ »ç¿ëÇÒ »óÅÂ-¸Ó½ÅÀ» º¸¿©ÁØ´Ù. »óÅÂ-¸Ó½ÅÀº ÇÑ »óÅ¿¡¼­ ´Ù¸¥ »óÅ·ÎÀÇ À¯È¿ÇÑ º¯ÀÌ (transition) À» º¸¿©ÁÖ´Â ¹æÇ⼺ ±×·¡ÇÁÀÌ´Ù. ¿¹¸¦µé¾î, ¸í»ç ´ÙÀ½¿¡´Â µ¿»ç³ª ÀüÄ¡»ç°¡ ¿Ã ¼ö ÀÖ´Ù.ÀÌ »óŸӽÅÀº ¾Õ¿¡¼­ ¼³¸íµÈ G1 ¹®¹ýÀ» ¹Ý¿µÇÑ´Ù. ÀÌ »óŸӽÅÀº C ·Î ±¸ÇöÇÔÀ¸·Î½á, ±¸¼º¿ä¼Ò·Î ¹®ÀåÀ» ³ª´©±â À§ÇÏ¿© ±×°ÍÀ» »ç¿ëÇÒ ¼ö ÀÖ´Ù. ¶ÇÇÑ ¾î¶² ¹®ÀåÀÌ G1 ¹®¹ýÀÇ Á¦ÇÑ ³»¿¡¼­ ¿Ã¹Ù·Î ±¸¼ºµÇ¾ú´ÂÁö °áÁ¤Çϱâ À§ÇÏ¿© »ç¿ëÇÑ´Ù.

±×¸² 1  Á¦ÇÑµÈ G1 ¹®¹ýÀÇ »óŸӽÅ

»óŸӽÅÀ» ±¸ÇöÇϱâ Àü¿¡, Æļ­°¡ ÀνÄÇÒ ¼ö ÀÖ´Â ¾îÈÖ¿Í ´Ü¾îÀÇ À¯ÇüÀ» À¯ÁöÇÒ µ¥ÀÌÅͺ£À̽º¸¦ Á¤ÀÇÇØ¾ß ÇÑ´Ù. ¿©±â¼­ ÀÌ µ¥ÀÌÅͺ£À̽º´Â ±¸Á¶ (structures) ¹è¿­·Î Á¤ÀÇÇÑ´Ù.

    /* structure of the word database (wdb) */

    struct word {

      char word[20];

      char type;

      }

    struct word wdb[MAX];    /* array of db structure */

µ¥ÀÌÅͺ£À̽º´Â ¾ÕÀÇ ¿©·¯ Àå¿¡¼­ °³¹ßµÈ °Í°ú °°Àº À¯ÇüÀÇ ±â´ÉÀ» °®´Â ´Ù¸¥ µ¥ÀÌÅͺ£À̽ºµé°ú ¸Å¿ì À¯»çÇÏ°Ô ¸¸µé¾ú´Ù. ¶ÇÇÑ ¸Ó½ÅÀÇ ÇöÀç »óŸ¦ À¯ÁöÇÒ ÃÑ°ý º¯¼ö¸¦ ÇÊ¿ä·Î ÇÑ´Ù. ÀÌ º¯¼ö´Â state ¶ó°í ºÎ¸¥´Ù. ¼öÇàÀ» ½ÃÀÛÇÒ ¶§, ÇÁ·Î±×·¥Àº ¾îÈÖ¸¦ °®´Â word µ¥ÀÌÅͺ£À̽º¸¦ ·ÎµåÇÏ°í, ±×¸®°í ³ª¼­ state ¸¦ ƯÁ¤ÇÑ ÃʱⰪÀ» Æ÷ÇÔÇϵµ·Ï ÃʱâÈ­ ½ÃŲ´Ù.

ÇÁ·Î±×·¥ÀÌ Å°º¸µå·ÎºÎÅÍ ÇÑ ¹®ÀåÀ» ÀÐÀ» ¶§ ÇÔ¼ö parse ¸¦ È£ÃâÇϴµ¥ ¹®ÀåÀ» ±¸¼º ¼ººÐÀ¸·Î ÂÉ°³°í »óź¯À̸¦ °ü¸®ÇÑ´Ù. parse °¡ true ¸¦ ¸®ÅÏÇÏ¸é ±× ¹®ÀåÀº Á¦ÇÑµÈ ¹®¹ý¿¡ µû¶ó ¿Ç´Ù. parse °¡ false ¸¦ ¸®ÅÏÇÏ¸é ±× ¹®ÀåÀº Ÿ´çÇÏÁö ¸øÇÏ´Ù. main() °ú ÇÔ²² ¸ðµç Á¤ÀÇ¿Í ÃÑ°ýº¯¼ö ¼±¾ðÀÌ ´ÙÀ½¿¡ ÀÖ´Ù.

    /* state-machine NLP example */

    #include "stdio.h"

    #define MAX 100

    #define NOUN 1

    #define VERB 2

    #define ADJ 3

    #define ADV 4

    #define DET 5

    #define PREP 6

    #define TERM 7

    #define STARTUP-1

    /* structure of the word database (wdb) */

    struct word {

      char word[20];

      char type;

    };

    struct word wdb[MAX];  /* array of db structure */

    int db_pos=0;               /* number of entries in wdb */

    char state=STARTUP;    /* holds the current state of the machine */

    char s[80];                 /* holds the sentence */

    char *t_pos=0;             / *points into the sentence */

    char token[80];           /* contains the word */

    main()

      {

        setup();

        printf("Enter Sentence : ");

        gets(s);

        t_pos=s;

        if (parse())  printf("Sentence OK¡¬n");  

      }

ÃÑ°ýº¯¼ö t_pos ´Â ÀÔ·Â ¹®Àå¿¡ ´ëÇÑ Æ÷ÀÎÅ͸¦ °®°í, °¢ ´Ü¾î°¡ ÀÐÇôÁú ¶§¸¶´Ù ¾ÕÀ¸·Î ³ª¾Æ°£´Ù. Æļ­´Â Çѹø¿¡ ÇÑ ´Ü¾î¾¿ ó¸®ÇÑ´Ù. µû¶ó¼­ ÇÑ ¹®ÀåÀ» ±× ±¸¼º¼ººÐÀ¸·Î ÂÉ°¶ ·çƾÀÌ ÇÊ¿äÇÏ´Ù. ÇÔ¼ö get_token() Àº ´ÙÀ½ ÀÏÀ» ÇÑ´Ù : get_token() Àº ½ºÆäÀ̽º (ºóÄ­) À» ¸¸³¯ ¶§±îÁö ÇÑ ¹ø¿¡ ÇÑ ¹®ÀÚ¾¿ ÀÔ·Â ¹®ÀåÀ» Àд´Ù. ÀÐÇôÁö´Â ¹®ÀÚµéÀº ¹®ÀåÀ¸·ÎºÎÅÍ ´ÙÀ½ ´Ü¾î¸¦ Çü¼ºÇÑ´Ù. ±×¸®°í ³ª¼­, get_token() Àº ÀÌ ´Ü¾î¸¦ ÃÑ°ýº¯¼ö ½ºÆ®¸µÀÎ token ¿¡ ³Ö´Â´Ù. ÀÌ ¿¬»êÀº Á¦ 3 ÀåÀÇ Àü¹®°¡½Ã½ºÅÛÀÌ »ç¿ëÇÑ °Í°ú ºñ½ÁÇÏ´Ù. ¿©±â¿¡ get_token() ÀÌ ÀÖ´Ù.

    /* return one token from the input stream */

    get_token()

    {

      char *p;

      p=token;

      /* skip spaces */

      while(*t_pos==' ') t_pos++;

      if(t_pos=='.') {

        *p++='.';

        *p='¡¬0';

        return;

      }

      /* read word until a space or period */

      while(*t_pos!='' && *t_pos!='.'){

        *p=*t_pos++;

        p++;

      }

      *p='¡¬0';

    }

get_token() ÀÌ ´ÙÀ½ ¹®ÀåÀ» Àаí ÀÖ´Ù°í »ý°¢ÇØ º¸ÀÚ. get_token() À¸·ÎºÎÅÍ ¸®ÅÏÇÑ ÈÄ, token Àº this ¸¦ Æ÷ÇÔÇÒ °ÍÀÌ°í, t_pos ´Â this ¹Ù·Î µÚ¿¡ ¿À´Â ºóÄ­À» °¡¸®Å³ °ÍÀÌ´Ù. ÇÊ¿äÇÑ ¸ðµç Áö¿ø ÇÔ¼ö°¡ ÀûÀýÇÑ °÷¿¡ ÀÖÀ¸¹Ç·Î, ´ÙÀ½¿¡ ÀÖ´Â °Í°ú °°ÀÌ, ÁÖ¿ä Æļ­ ·çƾÀ» ÀÛ¼ºÇÒ ¼ö ÀÖ´Ù.

    /* state-machine parser */

    parse()

    {

      char type;

      do{

        get_token();

        /* transition to new state */

        if (!(state=is_legal(token, state))) {

          printf("Error in sentence. ¡¬n");

          return 0;

        }

      } while(*token!='.');

      return 1;

    }

ÀÌ°ÍÀÌ º¸¿©ÁÖµíÀÌ, parse() ´Â »óź¯À̸¦ ¼öÇàÇϱâ À§ÇÏ¿© is_legal() ÇÔ¼ö¸¦ »ç¿ëÇÑ´Ù. Á¤»óÀûÀÎ º¯ÀÌ¿¡¼­, is_legal Àº ¸Ó½ÅÀÇ »õ·Î¿î »óŸ¦ ¸®ÅÏÇÏ°í; ºÒÇÕ¸®ÇÑ º¯ÀÌ¿¡¼­, false ¸¦ ¸®ÅÏÇÑ´Ù - ¹®ÀåÀÇ ±¸Á¶ÀÇ ¿¡·¯¸¦ ÀǹÌÇÑ´Ù. ¿©±â¿¡ is_legal() ÀÌ ÀÖ´Ù.

    /* check for valid state transitions */

    is_legal(word, state)

    char *word;

    int state;

    {

      int type;

      type=find_type(word);

      if (type==DET) return state;   /* skip */

      if (type==TERM) return TERM;   /* end of sentence */

      switch(state) {

        case STARTUP : if (type!=DET) return type;

          else return STARTUP;

        case NOUN :

          if (type==VERB) return VERB;

          if (type==PREP) return PREP;

          break;

        case VERB :

          if (type==PREP) return PREP;

          if (type==NOUM) return NOUN;

          if (type==ADV) return ADV;

          if (type==ADJ) return ADJ;

          break;

        case ADV :

          if (type==NOUN) return NOUN;

          if (type==PREP) return PREP;

          break;

        case ADJ :

          if (type==NOUN) return NOUN;

          break;

        case PREP :

          if (type==ADJ) return ADJ;

          if (type==NOUN) return NOUN;

          break;

      }

      return 0;

    }

is_legal() ÇÔ¼ö´Â ´ÙÀ½°ú °°ÀÌ ÀÛµ¿ÇÑ´Ù. ´ÙÀ½ ´Ü¾î°¡ a ³ª the ¿Í °°Àº ÇÑÁ¤»çÀ̸é, »óź¯ÀÌ´Â ¹ß»ýÇÏÁö ¾Ê´Â´Ù. ¸¶Ä§Ç¥¿Í °°Àº Á¾°áÀÚ´Â ¹®ÀåÀÇ ³¡À» °¡¸®Å°°í 󸮸¦ ¸ØÃß°Ô ÇÑ´Ù. ´ÙÀ½ ´Ü¾î°¡ ÇÑÁ¤»ç°¡ ¾Æ´Ï¸é is_legal() Àº ´ÙÀ½ ÅäÅ«ÀÇ À¯ÇüÀ» ¸Ó½ÅÀÇ ÇöÀç »óÅÂ¿Í ºñ±³ÇÏ°í - ¸¸¾à ±× À¯ÇüÀÌ ¿Ã¹Ù¸£¸é Àû´çÇÑ º¯À̸¦ ÇÑ´Ù. »óŸӽÅÀº Ãʱ⠻óÅ·μ­ óÀ½ ¹àÇôÁø ÅäÅ«ÀÇ À¯ÇüÀ» »ç¿ëÇÑ´Ù. »óŸӽŠÆļ­ ÇÁ·Î±×·¥ Àüü°¡ ´ÙÀ½¿¡ ÀÖ´Ù. ÀÌÁ¦ ÄÄÇ»ÅÍ¿¡ ÀÔ·ÂÇØ¾ß ÇÑ´Ù.

    /* state-machine NLP example */

    #include "stdio.h"

    #define MAX 100

    #define NOUN 1

    #define VERB 2

    #define ADJ 3

    #define ADV 4

    #define DET 5

    #define PREP 6

    #define TERM 7

    #define STARTUP-1

    /* structure of the word database (wdb) */

    struct word {

      char word[20];

      char type;

    };

    struct word wdb[MAX];  /* array of db structure */

    int db_pos=0;               /* number of entries in wdb */

    char state=STARTUP;    /* holds the current state of the machine */

    char s[80];                 /* holds the sentence */

    char *t_pos=0;             / *points into the sentence */

    char token[80];           /* contains the word */

    main()

    {

      setup();

      printf("Enter Sentence : ");

      gets(s);

      t_pos=s;

      if (parse())  printf("Sentence OK¡¬n");  

    }

    setup()

    {

      assert_wdb("door", NOUN);

      assert_wdb("window", NOUN);

      assert_wdb("house", NOUN);

      assert_wdb("child", NOUN);

      assert_wdb("has", VERB);

      assert_wdb("runs", VERB);

      assert_wdb("plays", VERB);

      assert_wdb("large", ADJ);

      assert_wdb("quickly", ADV);

      assert_wdb("the", DET);

      assert_wdb("a", DET);

      assert_wdb("to", PREP);

      assert_wdb(".", TERM);

    }

    /* place facts into database */

    assert_wdb(word, type)

    char *word;

    int type;

    {

      if (db_pos<MAX) {

        strcpy(wdb[db_pos].word, word);

        wdb[db_pos].type=type;

        db_pos++;

      }

      else printf("Word database full. ¡¬n");

    }

    /* state-machine parser */

    parse()

    {

      char type;

      do{

        get_token();

        /* transition to new state */

        if (!(state=is_legal(token, state))) {

          printf("Error in sentence. ¡¬n");

          return 0;

        }

      } while(*token!='.');

      return 1;

    }

    /* check for valid state transitions */

    is_legal(word, state)

    char *word;

    int state;

    {

      int type;

      type=find_type(word);

      if (type==DET) return state;   /* skip */

      if (type==TERM) return TERM;   /* end of sentence */

      switch(state) {

        case STARTUP : if (type!=DET) return type;

          else return STARTUP;

        case NOUN :

          if (type==VERB) return VERB;

          if (type==PREP) return PREP;

          break;

        case VERB :

          if (type==PREP) return PREP;

          if (type==NOUM) return NOUN;

          if (type==ADV) return ADV;

          if (type==ADJ) return ADJ;

          break;

        case ADV :

          if (type==NOUN) return NOUN;

          if (type==PREP) return PREP;

          break;

        case ADJ :

          if (type==NOUN) return NOUN;

          break;

        case PREP :

          if (type==ADJ) return ADJ;

          if (type==NOUN) return NOUN;

          break;

      }

      return 0;

    }

    /* find the type G1ven the word */

    find_type(word)

    char *word;

    {

      int t;

      for(t=0 ; t<db_pos ; t++)

        if(!strcmp(word, wdb[t].word))

          return wdb[t].type;

        return 0;

    }

    /* return one token from the input stream */

    get_token()

    {

      char *p;

      p=token;

      /* skip spaces */

      while(*t_pos==' ') t_pos++;

      if(t_pos=='.') {

        *p++='.';

        *p='¡¬0';

        return;

      }

      /* read word until a space or period */

      while(*t_pos!='' && *t_pos!='.'){

        *p=*t_pos++;

        p++;

      }

      *p='¡¬0';

    }

ÀÌ ÇÁ·Î±×·¥À» ¼öÇàÇϸé, G1 ¹®¹ýÀ» °¡Á¤ÇÏ¿© ¾îÈÖ words ¸¦ »ç¿ëÇÏ´Â ¿Ã¹Ù·Î ±¸¼ºµÈ ¹®ÀåÀ» ¹Þ¾ÆµéÀÏ °ÍÀ̶ó´Â °ÍÀ» ¾Ë°Ô µÈ´Ù. ¾î¶»°Ô µ¿ÀÛÇÏ´ÂÁö ¾Ë±â À§Çؼ­, ´ÙÀ½ ¹®ÀåÀ¸·Î °£´ÜÇÑ ¿¹¸¦ ÅëÇØ ½ÇÇàÇØ º¸ÀÚ.

the child runs quickly to the large house.

¹®Àå¿¡ ´ëÇÑ ´ÙÀ̾Ʊ׷¥ (diagram) Àº G1 ¹®¹ýÀ» µû¸¥´Ù´Â °ÍÀ» Áõ¸íÇÑ´Ù.

the

child

runs

quickly

to

the

large

house.

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

ÇÑÁ¤»ç

¸í»ç

µ¿»ç

ºÎ»ç

ÀüÄ¡»ç

ÇÑÁ¤»ç

Çü¿ë»ç

¸í»ç

(DET)

(NOUN)

(VERB)

(ADVERB)

(PREP)

(DET)

(ADJ)

(NOUN)

»óŸӽŠÆļ­°¡ ÇÏ´Â °Í°ú °°Àº ¹æ¹ýÀ¸·Î ÀÌ ¹®ÀåÀ» ¼öÇàÇÒ ¶§ ±×¸² 1 ¿¡¼­ Á¦½ÃÇÑ »óŸӽŠ´ÙÀ̾î±×·¥À» ±â¾ïÇϸé ÁÁ´Ù. ÃßÃâµÈ ù ¹ø° ´Ü¾î´Â the ÀÌ´Ù. the ´Â ÇÑÁ¤»ç À̱⠶§¹®¿¡, Æļ­´Â ±×°ÍÀ» ¹ö¸®°í »óź¯È­´Â ÀϾÁö ¾Ê´Â´Ù. ´ÙÀ½ ´Ü¾î´Â ¸í»çÀÎ child ÀÌ°í ÀÌ°ÍÀº ¸Ó½ÅÀÇ ÇöÀç »óÅ°¡ NOUN ÀÌ µÇ°Ô ÇÑ´Ù. »óŸӽÅÀÌ ÀÌÁ¦ ½ÃÀ۵Ǿú´Ù. ´ÙÀ½ ´Ü¾î´Â µ¿»çÀÎ runs ÀÌ´Ù. ±×¸² 1 ¿¡¼­ º¸¿©ÁÖ¾úµíÀÌ, ¸í»ç·ÎºÎÅÍ µÎ°¡Áö °¡´ÉÇÑ º¯ÀÌ°¡ ÀÖ´Ù : µ¿»ç·ÎÀÇ º¯ÀÌ ¶Ç´Â ÀüÄ¡»ç·ÎÀÇ º¯ÀÌ. runs °¡ µ¿»çÀ̱⠶§¹®¿¡, ÀÌ º¯ÀÌ´Â ¼º°øÇÏ°í »õ·Î¿î »óÅ VERB ¸¦ state ¿¡ ³õ´Â´Ù ; ±×·¯¹Ç·Î, »óŸӽÅÀº ÀÌÁ¦ ³ëµå VERB ¿¡ ÀÖ´Ù. ´ÙÀ½ ´Ü¾î´Â ºÎ»ç quickly ÀÌ´Ù. »óŸӽŠ´ÙÀ̾î±×·¥À» º¸¸é, VERB ·Î ºÎÅÍÀÇ ÇÑ °¡Áö À¯È¿ÇÑ º¯ÀÌ´Â ADVERB ·Î ¶ó´Â °ÍÀ» ¾Ë ¼ö ÀÖ´Ù ; µû¶ó¼­ ½Ã½ºÅÛÀº »óÅ ADVERB ·Î °£´Ù. ADVERB ·Î ºÎÅÍ, ´ÙÀ½¿¡ ¿Ã ¼ö ÀÖ´Â »óÅ´ µÎ °¡ÁöÀÌ´Ù : NOUN ¶Ç´Â PRPOSITION. ¹®ÀåÀÇ ´ÙÀ½ ´Ü¾î´Â ÀüÄ¡»ç to À̱⠶§¹®¿¡ Æļ­´Â »óÅ PREPOSITION À¸·Î °£´Ù. ¿©±â¼­´Â ¸í»ç³ª Çü¿ë»ç°¡ µû¶ó¿Í¾ß ÇÑ´Ù. ´ÙÀ½ ´Ü¾î´Â ÇÑÁ¤»ç the À̹ǷΠÆļ­´Â ±×°ÍÀ» ¹ö¸®¹Ç·Î »óź¯È­´Â ÀϾÁö ¾Ê´Â´Ù. ±×¸®°í ³ª¼­, Çü¿ë»ç large °¡ ¿À°í, ±×°ÍÀº ADJECTIVE ·ÎÀÇ º¯À̸¦ ÀÏÀ¸Å²´Ù. »óÅ ADJECTIVE ´Â NOUN ¸¸ÀÌ µÚ¿¡ ¿Ã ¼ö Àִµ¥, ÀÌ°ÍÀº house °¡ ÆĽºµÉ ¶§ ¹ß»ýÇÏ´Â °ÍÀÌ´Ù. ¸¶Áö¸·À¸·Î, Æļ­´Â ¸¶Ä§Ç¥¸¦ ÀÐ°í ¹®ÀåÀ» ¿ÏÀüÈ÷ ÆĽºÇÑ´Ù.

À߸ø ¾²¿©Áø ¹®Àå¿¡¼­ ¹«½¼ ÀÏÀÌ ÀϾ´ÂÁö º¸±â À§ÇÏ¿© ´ÙÀ½À» »ý°¢ÇØ º¸ÀÚ.

the house child runs to the.

¿©±â´Â µÎ °³ÀÇ ¸í»ç°¡ ¾ÕµÚ¿¡ ÀÖ´Ù. Æļ­°¡ ù ¹ø° ¸í»ç¸¦ ¸¸³¯ ¶§, »óŸӽÅÀº NOUN »óÅ°¡ µÈ´Ù. »óŸӽÅÀÇ G1 ¹®¹ý¿¡ µû¶ó, NOUN À¸·ÎºÎÅÍÀÇ À¯ÀÏÇÑ º¯ÀÌ´Â VERB ³ª PREPOSITION À¸·ÎÀÇ º¯ÀÌÀÌ´Ù. ±×·¯¹Ç·Î, ¸í»ç child °¡ ³ª¿Ã ¶§, »óź¯ÀÌ´Â ¼º°øÇÒ ¼ö ¾ø°í, is_legal() Àº false ¸¦ ¸®ÅÏÇϹǷΠÀÌ°ÍÀº parse() ·Î ÇÏ¿©±Ý ¿¡·¯¸¦ º¸°íÇÏ°Ô ÇÑ´Ù.

°á°ú¸¦ º¸±â À§ÇÏ¿© ¿©·¯ ¹®ÀåÀ» ½ÃµµÇØ º¸¶ó. ÀÌ »óŸӽÅÀº ¸Å¿ì ¹Ì¼÷ÇÏ°í ½ÇÁ¦ ¿µ¾îÀÇ ¸ðµç ¼¼¹ÐÇÑ ºÎºÐµéÀ» ÀνÄÇÏÁö´Â ¸øÇϱ⠶§¹®¿¡ ½±°Ô È¥µ¿ÇÒ ¼ö ÀÖ´Ù´Â °ÍÀ» ±â¾ïÇØ¾ß ÇÑ´Ù.

(1) »óŸӽŠNLP Æļ­ÀÇ ºÐ¼® (Analysis the state-machine NLP parser)

ÀÚ¿¬¾ð¾î¿¡ Àû¿ëµÉ ¶§, »óŸӽŠÆļ­°¡ °®´Â °¡Àå ³ª»Û ¹®Á¦´Â º¹À⼺ÀÌ´Ù. °£´ÜÇÑ G1 ¹®¹ý¿¡ ´ëÇÏ¿©¼­ Á¶Â÷µµ, »óź¯ÀÌÀÇ ÀûÇÕ¼ºÀ» °áÁ¤Çϱâ ÀÌÇؼ­ µ¶¸³µÈ Á¶°Ç¹®À» ¿©·¯°³ ÇÊ¿ä·Î ÇÑ´Ù. ¶§¶§·Î ÇÑ°¡Áö »óÅ¿¡¼­ ´Ù¸¥ »óÅ·ΠÇÕÄ¥ ¼ö ÀÖÁö¸¸, Àüü ¿µ¾î ¹®¹ý¿¡ ´ëÇؼ­ ¾ó¸¶³ª ¸¹Àº »óŵéÀÌ ÀÖÀ» °ÍÀÎÁö »ó»óÇØ º¸¶ó. ÀÌ ÀÌÀ¯ ¶§¹®¿¡, »óŸӽŠÆļ­´Â ¹®¹ýÀÇ ±ØÈ÷ ÀϺθ¦ ÀÌ¿ëÇÒ ¼ö ÀÖ´Â »óȲÀ» Á¦¿ÜÇÏ°í´Â °ÅÀÇ »ç¿ëµÇÁö ¾Ê´Â´Ù.

»óŸӽŠÆļ­°¡ °®´Â ¶Ç ´Ù¸¥ ¹®Á¦´Â, Æļ­°¡ ¾î¶² ƯÁ¤ÇÑ »óÅ¿¡¼­ ¾î¶»°Ô µµ´ÞÇß´ÂÁö ¸ð¸¥´Ù´Â °ÍÀÌ´Ù. ¿¹¸¦µé¸é, ÇÑÁ¤¼ö (¼ö½Ä±¸) ¸¦ ƯÁ¤ÇÑ ¸í»ç¿Í °ü·Ã½Ãų ¼ö ¾ø´Ù. ÀÌ°ÍÀº ÇöÀç»óÅ ÀÌ¿ÜÀÇ ¾î¶² Á¤º¸¶óµµ °ø±ÞÇϵµ·Ï »óŸӽŠÆļ­¿¡°Ô ¿ä±¸ÇÒ ¼ö ¾ø´Ù´Â °ÍÀ» ÀǹÌÇÑ´Ù.

±àÁ¤ÀûÀÎ ¸é¿¡¼­, »óŸӽŠÆļ­´Â ¿î¿µÃ¼Á¦ ÀÛ¾÷Á¦¾î ¾ð¾î¿Í ¾î¶² µ¥ÀÌÅͺ£À̽º ÀÀ¿ë°ú °°Àº ƯÁ¤ ÀÀ¿ë¿¡´Â ÀÌ»óÀûÀÌ´Ù. ÀÌ·¯ÇÑ È¯°æ¿¡¼­, »ç¿ëÀÚ°¡ ¸í·ÉÀ» À¯È¿ÇÑ Æ÷¸ËÀ¸·Î ³Ö¾î¾ß ÇÏ´Â °Í°ú ÄÄÇ»ÅÍ°¡ °¢ ´Ü¾î¸¦ ¾È´Ù´Â °Í¸¸À» º¸ÁõÇÒ ÇÊ¿ä°¡ ÀÖ´Ù. »óŸӽŠÆļ­´Â ¹®ÀåÀÇ À¯È¿ÇÑ À¯ÇüÀÌ ¾ó¸¶ ¾øÀ¸¹Ç·Î, »óÅ°¡ ¸î °³ ¾ø±â ¶§¹®¿¡ ÀÌ·± »óȲ¿¡¼­´Â ÀÛµ¿ÇÒ ¼ö ÀÖ´Ù.

5. ¹®¸ÆÀÚÀ¯ µÇºÎ¸§ ÇÏÇâ Æļ­ (THE CONTEXT-FREE RECURSIVE-DESCENT NLP PARSER)

¹®¸ÆÀÚÀ¯ Æļ­¸¦ ÀÌÇØÇϱâ À§Çؼ­, ¹®ÀåÀÇ ±¸¼ºÀ» »óŸӽŠ¸ðµ¨¿¡¼­ º» °Í°ú ¿ÏÀüÈ÷ ´Ù¸¥ ½Ã°¢¿¡¼­ ºÁ¾ßÇÑ´Ù. ¹®¸ÆÀÚÀ¯ Æļ­¿¡ ´ëÇÏ¿©, ¹®ÀåÀÌ ¿©·¯ Ç׸ñµé·Î ±¸¼ºµÇ¾î ÀÖ°í, °¢ Ç׸ñµéÀº ´Ù¸¥ Ç׸ñµé·Î ±¸¼ºµÇ¾î ÀÖ´Â µî, ¹®ÀåÀ» ÀÛÀº ¿ä¼Ò - ¸í»ç, µ¿»ç, Çü¿ë»ç µî - ·Î ÂÉ°¶ ¶§ ±îÁö ±×·¸°Ô ±¸¼ºµÇ¾î ÀÖ´Â °ÍÀ¸·Î »ý°¢ÇÏÀÚ. °¢ ºÎºÐÀÌ ¾î¶»°Ô ±¸¼ºµÉ ¼ö Àִ°¡¸¦ Áö¹èÇÏ´Â ±ÔÄ¢À» ¹®¹ýÀÇ »ý¼º ±ÔÄ¢À̶ó°í ºÎ¸¥´Ù. ¹®¸ÆÀÚÀ¯ Æļ­´Â ¹®ÀåÀ» ºÐ¼®Çϱâ À§ÇÏ¿© ÀÌ »ý¼º ±ÔÄ¢µéÀ» »ç¿ëÇÑ´Ù.

¹®Àå ¡æ ¸í»ç±¸ + µ¿»ç±¸

¸í»ç±¸ ¡æ ÇÑÁ¤»ç + ¸í»ç,    ¸í»ç±¸ ¡æ ÇÑÁ¤»ç + Çü¿ë»ç + ¸í»ç,   ¸í»ç±¸ ¡æ ÀüÄ¡»ç + ¸í»ç±¸

µ¿»ç±¸ ¡æ µ¿»ç + ¸í»ç±¸,    µ¿»ç±¸ ¡æ µ¿»ç + ºÎ»ç + ¸í»ç±¸,   µ¿»ç±¸ ¡æ µ¿»ç + ºÎ»ç,   µ¿»ç±¸ ¡æ µ¿»ç

±×¸² 2  G1 ¹®¹ýÀÇ »ý¼º ±ÔÄ¢

±×¸² 2 °¡ G1 ¹®¹ý¿¡ ´ëÇÑ »ý¼º ±ÔÄ¢À» º¸¿©ÁØ´Ù. ÀÌ ±ÔÄ¢µéÀ» °øºÎÇÒ ¶§, ¿À¸¥ÂÊ È­»ìÇ¥¸¦ "»ý¼ºÇÑ´Ù (produces)" ¶ó°í Àоî¾ß ÇÑ´Ù. ÀÌ ±×¸²¿¡¼­, NP ´Â "¸í»ç±¸ (noun phrase)" ¸¦ ³ªÅ¸³»°í, VP ´Â "µ¿»ç±¸ (verb phrase)" ¸¦ ³ªÅ¸³½´Ù.

¸í»ç±¸´Â ÀüÄ¡»ç±¸¸¦ ¸¸³¯ ¶§, ÀÚ±â È£ÃâÀû (recursive) ÀÌ´Ù. ±×¸®°í µ¿»ç±¸´Â ¸í»ç±¸¸¦ È£ÃâÇÒ ¶§ °£Á¢ ÀÚ±â È£ÃâÀû (indirectly recursive) ÀÌ´Ù.

ÀÌ ±ÔÄ¢µéÀ» ¹®Àå¿¡ ¾î¶»°Ô Àû¿ëÇÒ ¼ö ÀÖÀ»Áö º¸±â À§ÇÏ¿©, ´ÙÀ½ ´ÙÀ̾Ʊ׷¥À» »ý°¢ÇØ º¸ÀÚ.

 

»ý¼º ±ÔÄ¢Àº ÀÏÁ¾ÀÇ Æ®¸®¸¦ Çü¼ºÇÑ´Ù. ÀÌ Æ®¸®´Â Æļ­°¡ ¹®ÀåÀ» º¸´Â ¹æ¹ýÀ» Ç¥ÇöÇϱ⠶§¹®¿¡ ÆĽº Æ®¸® (parse tree) ¶ó°í ÇÑ´Ù. ÀÌ·± À¯ÇüÀÇ ÆĽº Æ®¸®¸¦ »ý¼ºÇÏ´Â Æļ­´Â, Æ®¸®°¡ °¢ ¿ä¼ÒÀÇ ¹®¸Æ¿¡ ±âÃÊÇÑ °ÍÀÌ ¾Æ´Ï±â ¶§¹®¿¡ ¹®¸ÆÀÚÀ¯ (context-free) ¶ó°í ºÎ¸¥´Ù ; ±ÔÄ¢µéÀº °¢ ±¸ (phrase) ÀÇ ¹®¸Æ¿¡ °ü°è¾øÀÌ, G1 ¹®¹ý¿¡ µû¸£´Â ¾î¶² ¹®Àå¿¡ ´ëÇؼ­µµ ÀÛµ¿ÇÒ °ÍÀÌ´Ù.

¿©±â¿¡, ¹®¸ÆÀÚÀ¯ Æļ­ÀÇ Àǹ̸¦ ÀÌÇØÇϴµ¥ µµ¿òÀ» ÁÙ ¸î°¡Áö Á¡ÀÌ ÀÖ´Ù. ¸ÕÀú, ¹®¸ÆÀÚÀ¯ ÆĽÌÀº AI NLP ÇÁ·Î±×·¥ »Ó¸¸ ¾Æ´Ï¶ó °á±¹ ¸ðµç ÄÄÇ»ÅÍ ¾ð¾î¿¡¼­µµ »ç¿ëµÈ´Ù. ¿¹¸¦µé¾î, Pascal, BASIC, C, Modula-2, ±×¸®°í ´Ù¸¥ ¾ð¾îµéÀ» ¹®¸ÆÀÚÀ¯ Æļ­·Î ÆĽºÇÒ ¼ö ÀÖ´Ù. ¹®¸ÆÀÚÀ¯ Æļ­°¡ ÆĽºÇÒ ¼ö ÀÖ´Â »ý¼º ±ÔÄ¢À» »ç¿ëÇÏ¿© ¿µ¾îÀÇ ÀÏºÎ¶óµµ Ç¥ÇöÇÒ ¼ö ÀÖ´Ù´Â »ç½ÇÀº »ó´çÇÑ Àǹ̸¦ °®´Â´Ù. ¸ÕÀú, ¾î¶² ¹æ¹ýÀ¸·Î, ¿µ¾î´Â ¾ö¹ÐÇÑ ±ÔÄ¢µéÀ» µû¸¥´Ù´Â °ÍÀ» º¸ÁõÇÑ´Ù - Áï, ¿µ¾î´Â ´Ü¼øÈ÷ ¾ÕµÚ°¡ ¸ÂÁö ¾Ê´Â ÀÓÀÇÀÇ Á¦ÇÑÀÌ È¥ÇÕµÈ °ÍÀÌ ¾Æ´Ï´Ù. µÎ ¹ø°, ÄÄÇ»ÅÍ ¾ð¾î¸¦ À§ÇÏ¿© °³¹ßµÈ ÀÌÇØÇϱ⠽¬¿î ÆḺ̌â¹ý ¸î°¡Áö¸¦ ÀÚ¿¬¾ð¾î¿¡ Àû¿ëÇÏ°Ô ÇÑ´Ù - ¹ÙÄû¸¦ Àç¹ß¸íÇÒ ÇÊ¿ä°¡ ¾ø´Ù. ¸¶Áö¸·À¸·Î, ¹®¸ÆÀÚÀ¯ »ý¼º ±ÔÄ¢Àº ±¸¿¡¼­ºÎÅÍ ½ÇÁ¦·Î ±×°ÍµéÀ» ±¸¼ºÇÏ´Â ´Ü¾î·Î ±¸¼ºµÇ±â ¶§¹®¿¡, °³°³ÀÇ ´Ü¾î»Ó¸¸ ¾Æ´Ï¶ó Àüü ±¸¸¦ À̲ø¾î³»´Â °ÍÀº ½±´Ù. ±×·¯¹Ç·Î, °³°³ÀÇ ´Ü¾î»Ó ¾Æ´Ï¶ó ±¸¸¦ ÆĽºÇÒ ¼ö ÀÖ°í - °¢ ±¸°¡ ¾îµð¼­ ¿Ô´ÂÁö ¾Ë ¼ö ÀÖ´Ù ; ÀÇ¹Ì Á¤º¸ (semantic information) °¡ ¸ð¾ÆÁú ¼ö ÀÖ´Â ±âÃÊ. ÀÌ ¸ðµç Á¡µéÀÌ ¹®¸ÆÀÚÀ¯ Æļ­¸¦ »óŸӽŠº¸´Ù ÇÑ ´Ü°è ³´°Ô ¸¸µç´Ù.

¾Õ¿¡¼­ ÁÖ¾îÁø »ý¼º ±ÔÄ¢À» »ç¿ëÇÏ´Â ¹®¸ÆÀÚÀ¯ Æļ­¸¦ ±¸ÇöÇÒ ¸¹Àº ¹æ¹ýµéÀÌ ÀÖ´Ù. ƯÈ÷ C ¸¦ »ç¿ëÇÒ ¶§ °¡Àå ½¬¿î °ÍÀº, µÇºÎ¸§ ÇÏÇâ Æļ­¸¦ ¸¸µå´Â °ÍÀε¥, ÀÌ°ÍÀº ¹®ÀåÀÌ ¿ÏÀüÈ÷ ÆĽºµÉ ¶§ ±îÁö »ý¼º ±ÔÄ¢À» µû¶ó ÇÏÇâ (descend) ÇÏ´Â »óÈ£ ÀÚ±â È£ÃâÀûÀÎ (recursive) ·çƾµéÀÇ ¸ðÀÓÀ» »ç¿ëÇÑ´Ù.

¹®¸ÆÀÚÀ¯ µÇºÎ¸§ ÇÏÇâ Æļ­¸¦ ¸¸µå´Â °ÍÀº »óŸӽŠÆļ­°¡ »ç¿ëÇß´ø °Í°ú, °°Àº ¾îÈÖ µ¥ÀÌÅͺ£À̽º¿Í ¹®ÀåÀ¸·ÎºÎÅÍ ´Ü¾î¸¦ ÃßÃâÇØ ³»´Â Áö¿øÇÔ¼ö¸¦ ¿ä±¸ÇÑ´Ù. ÀÌ ·çƾµéÀ» °¡Á¤ÇÑ, ¹®¸ÆÀÚÀ¯ µÇºÎ¸§ ÇÏÇâÆļ­°¡ ´ÙÀ½¿¡ ÀÖ´Ù.

    /* Context-free recursive-descent NLP parser */

    parse()

    {

        if (!nounphrase()) return 0;

        if (!verbphrase()) return 0;

        if (!terminator ()) return 0;

        return 1;

    }

    /* read a noun phrase from the input stream */

    nounphrase()

    {

        char type;

        get_token();

        type=find_type(token);

        switch(type) {

            case DET :

                get_token();

                type=find_type(token);

                if (type=NOUN) return 1;

                else if (type==ADJ) {

                    get_token();

                    type=find_type(token);

                    if (type==NOUN) return 1;

                }

                break;

            case PREP :

                return nounphrase();

            }

            return 0;

    }

    /* read a verb phrase */

    verbphrase()

    {

        char type, *pos:

        get_token();

        type=find_type(token);

        if (type!=VERB) return 0;    /* must start with a verb */

        pos=t_pos;                     /* save current position for backtracking */

        /* verb + adverb + NP */

        if (verb_adv_np()) return 1;

        /* verb + NP */

        t_pos=pos;                     /* back up */

        if (verb_np()) return 1;

        /* verb + adverb -- no NP */

        t_pos=pos;

        if (verb_adv()) return 1;

        /* just adverb */

        return 1;

    }

    verb_np()

    {

        /* verb + NP */

        return nounphrase();

    }

    verb_adv_np()

    {

        char type;

        get_token();

        type=find_type(token);

        if (type==ADV && nounphrase()) return 1;

        return 0;

    }

    verb_adv()

    {

        char type;

        get_token();

        type=find_type(token);

        return(type==ADV);

    }

    terminator()

    {

        get_token();

        return(find_type(token)==TERM);

    }

Æļ­´Â ´ÙÀ½°ú °°ÀÌ ÀÛµ¿ÇÑ´Ù. ÃÖ»óÀ§ ¼öÁØ¿¡¼­, ¹®ÀåÀº ¸í»ç±¸, µ¿»ç±¸, ±×¸®°í ÀÌ °æ¿ì Á¾°áÀÚ (terminator) ·Î¼­ ¸¶Ä§Ç¥·Î ±¸¼ºµÈ´Ù. ±×·¯¹Ç·Î ÇÔ¼ö parse() ´Â ·çƾ nounphrase() ¸¦ È£ÃâÇÏ°í, ´ÙÀ½¿¡´Â verbphrase() ¸¦ È£ÃâÇÑ´Ù. ÀÌ°ÍÀÌ ¼º°øÇÑ´Ù°í °¡Á¤Çϸé - ¿©±â¼­ ¼º°øÇÑ´Ù´Â °ÍÀº ¹®ÀåÀÌ G1 ¹®¹ý±ÔÄ¢À» ¸¸Á·ÇÑ´Ù´Â °ÍÀ» ÀǹÌÇÑ´Ù - Æļ­´Â ¹®ÀåÀÌ ¸¶Ä§Ç¥·Î ³¡³ª´Â °ÍÀ» È®ÀÎÇϱâ À§ÇÏ¿© terminator ¸¦ È£ÃâÇÑ´Ù. nounphrase() ¿Í verbphrase() ÇÔ¼ö´Â ¾Õ¿¡¼­ ¼³¸íµÈ ¹æ½ÄÀ¸·Î ¹®¸ÆÀÚÀ¯ ±ÔÄ¢À» ±¸ÇöÇϱâ À§Çؼ­ ¿©·¯ °¡Áö Áö¿ø ÇÔ¼ö¸¦ »ç¿ëÇÑ´Ù. ¾î¶² ¹®ÀåÀÌ ÀÌ ±ÔÄ¢¿¡ µû¸£Áö ¾ÊÀ¸¸é, nounphrase() ¿Í verbphrase() ´Â ½ÇÆÐÇÒ °ÍÀÌ°í, ¹®ÀåÀº °ÅÀý´çÇÒ (reject) °ÍÀÌ´Ù.

Æļ­°¡ ¾î¶»°Ô ÀÛµ¿ÇÏ´ÂÁö Á¤È®È÷ º¸±â À§Çؼ­, ´ÙÀ½ ¹®ÀåÀ» ÆĽºÇÏ´Â ¹æ¹ýÀ» µû¶ó°¡ º¸ÀÚ.

the child runs quickly to the large house.

¸ÕÀú, parse() ´Â nounphrase() ¸¦ È£ÃâÇÏ°í, ÀÌ°ÍÀº ÇÑÁ¤»ç the ¿Í ¸í»ç child ¸¦ ¹ß°ßÇϱ⠶§¹®¿¡ ¼º°øÇÑ´Ù. ±×·¯°í³ª¼­ parse() ´Â verbphrase() ¸¦ È£ÃâÇÏ°í, ÀÌ°ÍÀº ºÎ»ç°¡ µû¶ó ³ª¿À°í Â÷·Ê·Î ¸í»ç±¸°¡ µû¶ó³ª¿À´Â µ¿»ç·Î ±¸¼ºµÇ¾î ÀÖ´ÂÁö ¾Ë±âÀ§ÇÏ¿© verb_adv_np() ¸¦ È£ÃâÇÑ´Ù. ÀÌ °æ¿ì¿¡, µ¿»ç±¸´Â ´ÙÀ½À» Æ÷ÇÔÇÑ´Ù : µ¿»ç runs ´ÙÀ½¿¡´Â ºÎ»ç quickly °¡ ¿À°í, ÀÌ°Í ´ÙÀ½¿¡´Â ÀüÄ¡»çÀû ¸í»ç±¸ (prepositional noun phrase) to the large house °¡ ¿Â´Ù. ´ÙÀ½, ÀÌ ÀüÄ¡»çÀû ¸í»ç±¸´Â nounphrase() ·Î ÇÏ¿©±Ý ÀÚ±â ÀÚ½ÅÀ» (recursively) È£ÃâÇÏ°Ô ÇÑ´Ù. ¸¶Áö¸· ±¸°¡ ÀÐÇôÁø ÈÄ, ¸ðµç ÀÚ±â È£ÃâÀº Á¾°áµÇ°í verbphrase() ´Â ¼º°øÇÏ¿© phrase() ·Î ¸®ÅÏÇÑ´Ù. Æļ­´Â terminator() ¸¦ »ç¿ëÇÏ¿© ¸¶Ä§Ç¥¸¦ È®ÀÎÇÑ´Ù. ¸¶Áö¸·À¸·Î, parse() °¡ ¼º°øÇϴµ¥, ÀÌ´Â ±× ¹®ÀåÀÌ ½ÇÁ¦·Î G1 ¹®¹ý ±ÔÄ¢À» ¸¸Á·½ÃŲ´Ù´Â °ÍÀ» ÀǹÌÇÑ´Ù.

¹®¸ÆÀÚÀ¯ µÇºÎ¸§ ÇÏÇâ Æļ­ ÇÁ·Î±×·¥ Àüü°¡ ´ÙÀ½¿¡ ÀÖ´Ù. ÄÄÇ»ÅÍ¿¡ ³Ö¾î, ÀÛµ¿À» Àß ÀÌÇØÇϵµ·Ï ¿©·¯ ¹®ÀåÀ» °¡Áö°í ½ÃµµÇØ º¸¾Æ¾ß ÇÑ´Ù.

    /* Recursive-descent NLP example */

    #include "stdio.h"

    #define MAX 100

    #define NOUN 1

    #define VERB 2

    #define ADJ 3

    #define ADV 4

    #define DET 5

    #define PREP 6

    #define TERM 7

    /* structure of the word database (wdb) */

    struct word {

      char word[20];

      char type;

    };

    struct word wdb[MAX];  /* array of db structure */

    int db_pos=0;               /* number of entries in wdb */

    char s[80];                 /* holds the sentence */

    char *t_pos=0;             / *points into the sentence */

    char token[80];           /* contains the word */

    main()

    {

      setup();

      printf("Enter Sentence : ");

      gets(s);

      t_pos=s;

      if (parse())  printf("Sentence OK¡¬n");  

      else printf("Error in sentence¡¬n");

    }

    setup()

    {

      assert_wdb("door", NOUN);

      assert_wdb("window", NOUN);

      assert_wdb("house", NOUN);

      assert_wdb("child", NOUN);

      assert_wdb("has", VERB);

      assert_wdb("runs", VERB);

      assert_wdb("plays", VERB);

      assert_wdb("large", ADJ);

      assert_wdb("quickly", ADV);

      assert_wdb("the", DET);

      assert_wdb("a", DET);

      assert_wdb("to", PREP);

      assert_wdb(".", TERM);

    }

    /* place facts into database */

    assert_wdb(word, type)

    char *word;

    int type;

    {

      if (db_pos<MAX) {

        strcpy(wdb[db_pos].word, word);

        wdb[db_pos].type=type;

        db_pos++;

      }

      else printf("Word database full. ¡¬n");

    }

     

    /* Context-free recursive-descent NLP parser */

    parse()

    {

        if (!nounphrase()) return 0;

        if (!verbphrase()) return 0;

        if (!terminator ()) return 0;

        return 1;

    }

    /* read a noun phrase from the input stream */

    nounphrase()

    {

        char type;

        get_token();

        type=find_type(token);

        switch(type) {

            case DET :

                get_token();

                type=find_type(token);

                if (type=NOUN) return 1;

                else if (type==ADJ) {

                    get_token();

                    type=find_type(token);

                    if (type==NOUN) return 1;

                }

                break;

            case PREP :

                return nounphrase();

            }

            return 0;

    }

    /* read a verb phrase */

    verbphrase()

    {

        char type, *pos:

        get_token();

        type=find_type(token);

        if (type!=VERB) return 0;    /* must start with a verb */

        pos=t_pos;                     /* save current position for backtracking */

        /* verb + adverb + NP */

        if (verb_adv_np()) return 1;

        /* verb + NP */

        t_pos=pos;                     /* back up */

        if (verb_np()) return 1;

        /* verb + adverb -- no NP */

        t_pos=pos;

        if (verb_adv()) return 1;

        /* just adverb */

        return 1;

    }

    verb_np()

    {

        /* verb + NP */

        return nounphrase();

    }

    verb_adv_np()

    {

        char type;

        get_token();

        type=find_type(token);

        if (type==ADV && nounphrase()) return 1;

        return 0;

    }

    verb_adv()

    {

        char type;

        get_token();

        type=find_type(token);

        return(type==ADV);

    }

    terminator()

    {

        get_token();

        return(find_type(token)==TERM);

    }

    /* find the type G1ven the word */

    find_type(word)

    char *word;

    {

        int t;

        for (t=0; t<db_pos; t++)

            if (!strcmp(word, wdb[t].word))

                return wdb[t].type;

            return 0;

    }

    /* return one token from the input stream */

    get_token()

    {

        char *p;

        p=token;

        /* skip spaces */

        while(*t_pos==' ') t_pos++;

        if (*t_pos=='.') {

        *p++='.';

        *p='¡¬0';

        return;

        }

    /* read word until a space or period */

    while (*t_pos != ' ' && *t_pos != '.') {

        *p=*t_pos++;

        p++;

        }

        *p='¡¬0';

    }

parse(), nounphrase(), verbphrase() ±×¸®°í Áö¿øÇÔ¼öµéÀ» ¾à°£ º¯ÇüÇÏ¿©, ¹®ÀåÀ» ±× ¼ººÐ±¸µé·Î ÂÉ°³±â À§ÇÏ¿© ÀÌ ·çƾµéÀ» »ç¿ëÇÒ ¼ö ÀÖ´Ù. ´Ü¾î¸¸À» ´Ù·ê ¼ö ÀÖ´Â »óŸӽŠÆļ­¿Í´Â ´Þ¸®, ±¸ (phrase) ¸¦ ÃßÃâÇØ ³»±â À§ÇÏ¿© ¹®¸ÆÀÚÀ¯ µÇºÎ¸§ ÇÏÇâ Æļ­¸¦ »ç¿ëÇÒ ¼ö ÀÖ´Ù. ÀÌ°ÍÀº, ±× Æļ­°¡ ÄÄÇ»ÅÍ·Î ÇÏ¿©±Ý ¹®ÀåÀ» ´Ü¼øÈ÷ °ËÁõÇÏÁö ¾Ê°í ½ÇÁúÀûÀ¸·Î ÀÌÇØÇÒ ¼ö ÀÖ´Â ±æÀ» ¿­¾îÁֱ⠶§¹®¿¡ Áß¿äÇÑ ±â´ÉÀÌ´Ù : ±×·¯¹Ç·Î, ÄÄÇ»ÅÍ·Î ÇÏ¿©±Ý °ü·ÃµÈ ´Ü¾îµéÀÇ ±×·ìÀ» °¡Áö°í ÀÛµ¿ÇÒ ¼ö ÀÖ°Ô ÇÑ´Ù. ¿©±â¿¡ ÀÖ´Â °³Á¤µÈ Æļ­´Â ÇÑ ¹®ÀåÀÇ ¸í»ç±¸¿Í µ¿»ç±¸¸¦ ¸®ÅÏÇÑ´Ù.

    /* Context-free recursive-descent NLP parser that displays phrases */

    parse()

    {

        char noun[80], verb[80];

        noun[0]='¡¬0';    verb[0]='¡¬0';

        if (!nounphrase(noun))  return 0;

        if (!verbphrase(verb))  return 0;

        if (!terminator())  return 0;

        printf("noun phrase : %s¡¬n", noun);

        printf("verb phrase : %s¡¬n", verb);

        return 1;

    }

    /* read a noun phrase from the input stream */

    nounphrase(s)

    char *s;

    {

        char type;

        get_token();

        type=find_type(token);

        switch(type) {

            case DET :

        strcat(s, token);

        strcat(s, " ");

        get_token();

        type=find_type(token);

        strcat(s, token);

        strcat(s, " ");

        if (type==NOUN) return 1;

        else if (type==ADJ) {

          get_token();

          strcat(s, token);

          strcat(s, " ");

          type=find_type(token);

          if (type==NOUN) return 1;

        }

        break;

      case PREP :

        strcat(s, token);

        strcat(s, " ");

        return nounphrase(s);

      }

      return 0;

    }

    /* read a verb phrase */

    verbphrase(s)

    char *s;

    {

        char type, *pos, temp[80];

        get_token();

        type=find_type(token);

        if (type!=VERB) return 0;    /* must start with a verb */

        strcat(s, token);

        strcat(s, " ");

        strcpy(temp, s);              /* save for backtracking */

        pos=t_pos;                     /* save current position for backtracking */

        /* verb + adverb + NP */

        if (verb_adv_np()) return 1;

        /* verb + NP */

        t_pos=pos;                     /* back up */

        strcpy(s, temp);

        if (verb_np()) return 1;

        /* verb + adverb -- no NP */

        t_pos=pos;

        strcpy(s, temp);

        if (verb_adv(s)) return 1;

        /* just adverb */

        return 1;

    }

    verb_np(s)

    char *s;

    {

        /* verb + NP */

        return nounphrase();

    }

    verb_adv_np(s)

    char *s;

    {

        char type, temp[80];

        get_token();

        type=find_type(token);

        strcat(s, token);

        strcat(s, " ");

        temp[0]='¡¬0';

        if (type==ADV && nounphrase(temp)) {

            strcat(s, temp);

            return 1;

        }

        return 0;

    }

    verb_adv(s)

    char *s;

    {

        char type;

        get_token();

        type=find_type(token);

        strcat(s, token);

        strcat(s, " ");

        return(type==ADV);

    }

    terminator()

    {

        get_token();

        return(find_type(token)==TERM);

    }

¿©±â¼­ ÀÌ ¹öÀüÀÇ Æļ­¸¦ ÀÌ¿ëÇÏ´Â ¿ÏÀüÇÑ ÇÁ·Î±×·¥À» Á¦½ÃÇÑ´Ù.

    /* Recursive-descent NLP example that reports the phrases */

    #include "stdio.h"

    #define MAX 100

    #define NOUN 1

    #define VERB 2

    #define ADJ 3

    #define ADV 4

    #define DET 5

    #define PREP 6

    #define TERM 7

    /* structure of the word database (wdb) */

    struct word {

      char word[20];

      char type;

    };

    struct word wdb[MAX];  /* array of db structure */

    int db_pos=0;               /* number of entries in wdb */

    char s[80];                 /* holds the sentence */

    char *t_pos=0;             / *points into the sentence */

    char token[80];           /* contains the word */

    main()

    {

      setup();

      printf("Enter Sentence : ");

      gets(s);

      t_pos=s;

      if (parse())  printf("Sentence OK¡¬n");  

      else printf("Error in sentence¡¬n");

    }

    setup()

    {

      assert_wdb("door", NOUN);

      assert_wdb("window", NOUN);

      assert_wdb("house", NOUN);

      assert_wdb("child", NOUN);

      assert_wdb("has", VERB);

      assert_wdb("runs", VERB);

      assert_wdb("plays", VERB);

      assert_wdb("large", ADJ);

      assert_wdb("quickly", ADV);

      assert_wdb("the", DET);

      assert_wdb("a", DET);

      assert_wdb("to", PREP);

      assert_wdb(".", TERM);

    }

    /* place facts into database */

    assert_wdb(word, type)

    char *word;

    int type;

    {

      if (db_pos<MAX) {

        strcpy(wdb[db_pos].word, word);

        wdb[db_pos].type=type;

        db_pos++;

      }

      else printf("Word database full. ¡¬n");

    }

    /* Context-free recursive-descent NLP parser that displays phrases */

    parse()

    {

        char noun[80], verb[80];

        noun[0]='¡¬0';    verb[0]='¡¬0';

        if (!nounphrase(noun))  return 0;

        if (!verbphrase(verb))  return 0;

        if (!terminator())  return 0;

        printf("noun phrase : %s¡¬n", noun);

        printf("verb phrase : %s¡¬n", verb);

        return 1;

    }

    /* read a noun phrase from the input stream */

    nounphrase(s)

    char *s;

    {

        char type;

        get_token();

        type=find_type(token);

        switch(type) {

            case DET :

        strcat(s, token);

        strcat(s, " ");

        get_token();

        type=find_type(token);

        strcat(s, token);

        strcat(s, " ");

        if (type==NOUN) return 1;

        else if (type==ADJ) {

          get_token();

          strcat(s, token);

          strcat(s, " ");

          type=find_type(token);

          if (type==NOUN) return 1;

        }

        break;

      case PREP :

        strcat(s, token);

        strcat(s, " ");

        return nounphrase(s);

      }

      return 0;

    }

    /* read a verb phrase */

    verbphrase(s)

    char *s;

    {

        char type, *pos, temp[80];

        get_token();

        type=find_type(token);

        if (type!=VERB) return 0;    /* must start with a verb */

        strcat(s, token);

        strcat(s, " ");

        strcpy(temp, s);              /* save for backtracking */

        pos=t_pos;                     /* save current position for backtracking */

        /* verb + adverb + NP */

        if (verb_adv_np()) return 1;

        /* verb + NP */

        t_pos=pos;                     /* back up */

        strcpy(s, temp);

        if (verb_np()) return 1;

        /* verb + adverb -- no NP */

        t_pos=pos;

        strcpy(s, temp);

        if (verb_adv(s)) return 1;

        /* just adverb */

        return 1;

    }

    verb_np(s)

    char *s;

    {

        /* verb + NP */

        return nounphrase();

    }

    verb_adv_np(s)

    char *s;

    {

        char type, temp[80];

        get_token();

        type=find_type(token);

        strcat(s, token);

        strcat(s, " ");

        temp[0]='¡¬0';

        if (type==ADV && nounphrase(temp)) {

            strcat(s, temp);

            return 1;

        }

        return 0;

    }

    verb_adv(s)

    char *s;

    {

        char type;

        get_token();

        type=find_type(token);

        strcat(s, token);

        strcat(s, " ");

        return(type==ADV);

    }

    terminator()

    {

        get_token();

        return(find_type(token)==TERM);

    }

    /* find the type G1ven the word */

    find_type(word)

    char *word;

    {

        int t;

        for (t=0; t<db_pos; t++)

            if (!strcmp(word, wdb[t].word))

                return wdb[t].type;

            return 0;

    }

    /* return one token from the input stream */

    get_token()

    {

        char *p;

        p=token;

        /* skip spaces */

        while(*t_pos==' ') t_pos++;

        if (*t_pos=='.') {

        *p++='.';

        *p='¡¬0';

        return;

        }

    /* read word until a space or period */

    while (*t_pos != ' ' && *t_pos != '.') {

        *p=*t_pos++;

        p++;

        }

        *p='¡¬0';

    }

ÀÌ ¹öÀüÀÇ Æļ­¸¦ ÀÌ¿ëÇϸé, ´ÙÀ½ ÀÔ·Â ¹®Àå¿¡ ´ëÇÏ¿©

the child runs quickly to the house.

½ºÅ©¸°¿¡ ´ÙÀ½ Ãâ·ÂÀ» º¸°Ô µÉ °ÍÀÌ´Ù.

noun phrase : the child

verb phrase : runs quickly to the house

(1)  ¹®¸ÆÀÚÀ¯ Recursive-Descent NLP Æļ­ÀÇ ºÐ¼® (Analysis of the Context-Free Recursive-Descent NLP Parser)

¹®¸ÆÀÚÀ¯ ÆĽÌÀº ¸¹Àº ÀåÁ¡À» °®´Â´Ù. ù°, C ·Î ±¸ÇöÇϱⰡ ½±´Ù. µÑ°, ´Ü¾î ¼öÁØ°ú ±¸ ¼öÁØ¿¡¼­ ¹®ÀåÀ» ´Ù·ç±â À§ÇÏ¿© »ç¿ëÇÒ ¼ö ÀÖ´Ù. ¼Â°, Ç×»ó ¹®ÀåÀÇ ¾îµð¿¡ ÀÖ´ÂÁö "¾È´Ù (knows)". ÀÌ°ÍÀº ÀڱⰡ ¹®ÀåÀÇ ¾îµð¿¡ ÀÖ¾ú´ÂÁö ¸ð¸£´Â »óŸӽÅÀÇ ¿¹¿Í ´Ù¸£´Ù.

¹®¸ÆÀÚÀ¯ Æļ­°¡ °®´Â ÁÖ¿ä ´ÜÁ¡Àº ¿µ¾î ¹®ÀåÀÌ ±¸¼ºµÉ ¼ö ÀÖ´Â ¸¹Àº À¯È¿ÇÑ ¹æ¹ýµéÀ» ´Ù·ê ¼ö ¾øÀ»·±Áöµµ ¸ð¸¥´Ù´Â °ÍÀÌ´Ù. ¾Õ¿¡¼­ ÁÖ¾îÁø °£´ÜÇÑ G1 ¹®¹ýÀ» »ç¿ëÇßÀ» ¶§, ¹®¹ýÀ» ÃæºÐÈ÷ ±â¼úÇÑ ÀÏ´ÜÀÇ »ý¼º ±ÔÄ¢µéÀ» Á¤ÀÇÇÏ´Â °ÍÀº ½¬¿ü´Ù. ±×·¯³ª, ½Ç¼¼°èÀÇ ¿µ¾î (¶Ç´Â ´Ù¸¥ ¾ð¾î¶óµµ) ¿¡¼­, ±ÔÄ¢Àº ¸Å¿ì º¹ÀâÇÒ °ÍÀÌ°í, ÀÌ°ÍÀº Æø¹ßÀû Áõ°¡ (combinatoric explosion) °¡ ÀϾ °ÍÀÌ°í, ÀÌ ¹æ¹ýÀ» »ç¿ëÇÒ ¼ö ¾ø°Ô ¸¸µç´Ù. ±×·¯³ª, ÀÌ°ÍÀÌ À¯ÀÏÇÑ °¡´É¼ºÀÌ´Ù - ¾ÆÁ÷ ¾Æ¹«µµ À̸¦ Áõ¸íÇÏÁö ¸øÇß´Ù.

6. ÀâÀ½Á¦°Å Æļ­ (THE NOISE-DISPOSAL PARSER)

¾î¶² ÀÀ¿ë ÇÁ·Î±×·¥¿¡¼­´Â ¹®ÀåÀÌ Æ÷ÇԵǴ ¸î°¡Áö Å°¿öµå¿¡¸¸ °ü°èµÇ°í, ¾ð¾î¸¦ ±¸¼ºÇÏ´Â ¸ðµç °ü·Ã ´Ü¾î¿¡ °ü·ÃÀÖ´Â °ÍÀº ¾Æ´Ï´Ù. º»ÁúÀûÀ¸·Î, ÀÌ·± À¯ÇüÀÇ ÀÀ¿ëµéÀº ¹®ÀåÀÌ Æ÷ÇÔÇÏ´Â Á¤º¸¿¡¸¸ °ü½ÉÀÌ ÀÖ´Ù. ÀÌ »ý°¢Àº ÀâÀ½Á¦°Å Æļ­ ¶ó°íÇÏ´Â ¹®¸ÆÀÚÀ¯ Æļ­ÀÇ º¯ÇüÀ» ¸¸µé°Ô ÇÑ´Ù. ÀÌ Æļ­´Â ¸ð¸£´Â ¸ðµç ´Ü¾î¸¦ ÀâÀ½À¸·Î ´Ù·ç°í ±×°ÍµéÀ» Á¦°ÅÇÑ´Ù. ÀüÇüÀûÀ¸·Î, ¸ðµç ¹®ÀåµéÀº ÀÚ¿¬¾ð¾î¸¦ ´àÀº °íÁ¤µÈ Æ÷¸ËÀ» µû¶ó¾ß ÇÑ´Ù.

ÀÌ·± À¯ÇüÀÇ Æļ­´Â ½ÇÁ¦·Î ¸í·É¾î 󸮱â¿Í °°Àº µ¥ÀÌÅͺ£À̽º À¯ÇüÀÇ ÀÀ¿ë¿¡¼­ ¾ÆÁÖ ÈçÇÏ´Ù. ¿¹¸¦µé¾î, ȸ»ç À̸§°ú Áֽİ¡°ÝÀ¸·Î ±¸¼ºµÇ´Â µ¥ÀÌÅͺ£À̽º¸¦ »ý°¢Çغ¸ÀÚ. µ¥ÀÌÅͺ£À̽º´Â ´ÙÀ½°ú °°Àº Áú¹®À» ¹ÞÀ» °ÍÀ̶ó°í °¡Á¤ÇÏÀÚ.

show me all companies with stock prices > 100.

show me all.

show me XYZ.

show me one with stock price < 100.

is XYZ stock selling at > 40 ?

¾Ë ¼ö ÀÖµíÀÌ, ÀÌ·± À¯ÇüÀÇ Áú¹®µéÀº ±âº» À¯Çü¿¡ ¸Â´Â´Ù.

¸í·É¾î <¼ö½Ä¾î> <À̸§> <¿¬»êÀÚ> <°ª>.

(command <modifier> <name> <operator> <value>.)

¿©±â¼­, ¸í·É¾î (command) ´Â Ç×»ó ³ªÅ¸³ª¾ß ÇÏÁö¸¸, ´Ù¸¥ ³× °³ÀÇ ¿ä¼Ò´Â ¼±ÅÃÀûÀÌ´Ù. ±×·¯³ª, ¸¸¾à ¿¬»êÀÚ°¡ ÀÖ´Ù¸é °ªµµ ¶ÇÇÑ ³ªÅ¸³ª¾ß ÇÑ´Ù´Â °ÍÀ» ÁÖ¸ñÇØ¾ß ÇÑ´Ù. ÇÊ¿äÇÑ ¿ä¼ÒµéÀÌ ³ªÅ¸³ª ÀÖ´Â ÇÑ, ¾ó¸¶³ª ¸¹Àº ÀâÀ½ ´Ü¾îµéÀÌ ³ªÅ¸³ª ÀÖ´ÂÁö´Â Áß¿äÇÏÁö ¾Ê´Ù. ¿Ö³ÄÇÏ¸é ±×°ÍµéÀº Æļ­ÀÇ ¿¬»ê¿¡ ¿µÇâÀ» ¹ÌÄ¡Áö ¾Ê±â ¶§¹®ÀÌ´Ù. ±×·¯³ª, Áú¹®ÀÇ °¢ ºÎºÐÀÇ ¼ø¼­´Â Á¤È®È÷ °°¾Æ¾ß ÇÑ´Ù.

±×·¯ÇÑ Æļ­ÀÇ ¿¹¸¦ º¸±â À§ÇÏ¿©, °£´ÜÇÑ µ¥ÀÌÅͺ£À̽º¿¡ ´ëÇÑ Áú¹® 󸮱â (query processor) ¸¦ ¸¸µé¾îº¸ÀÚ. ÀÌ ¿¹¿¡¼­, ȸ»ç¿Í ±× ȸ»çÀÇ ÇöÀç Áֽİ¡°ÝÀÌ µé¾îÀÖ´Â stock À̶ó´Â µ¥ÀÌÅͺ£À̽º¸¦ »ç¿ëÇÑ´Ù. stock ÀÇ Á¤ÀÇ´Â ´ÙÀ½°ú °°´Ù.

´ÙÀ½, À¯È¿ÇÑ ¸í·É¿¡¼­ »ç¿ëµÉ ¼ö ÀÖ´Â ´Ü¾î¿Í ±âÈ£µéÀÇ ¸®½ºÆ®¸¦ ÇÊ¿ä·Î ÇÒ °ÍÀÌ´Ù. ÀÌ ¿¹¿¡ ÀÖ´Â °Í°ú °°Àº ¸í·É¾î 󸮱⿡ ´ëÇؼ­, ´Ü¾î¿Í, ±× »ç¿ëÀ» ¹Ý¿µÇϱâ À§ÇÑ ´Ü¾îÀÇ À¯ÇüÀ» º¯°æ½ÃÄÑ¾ß ÇÑ´Ù. »õ·Î¿î ´Ü¾î¿Í À¯ÇüµéÀÌ ´ÙÀ½¿¡ ÀÖ´Ù.

¿©±â¼­, xyz, abc, ±×¸®°í ucl Àº ȸ»ç À̸§ÀÌ´Ù. ÀÌ ¿¹¿¡¼­´Â Áú¹®À» Çã¿ëÇϱ⠶§¹®¿¡, ¹°À½Ç¥·Î ¹®ÀåÀ» ³¡³¾ ¼ö ÀÖ´Ù.

ÇÁ·Î±×·¥À» ½ÃÀÛÇÒ ¶§, stock µ¥ÀÌÅͺ£À̽º´Â ´ÙÀ½ Á¤º¸¸¦ °¡Áö°í ·ÎµåµÉ °ÍÀÌ´Ù.

Æļ­´Â ¾Õ Àý¿¡¼­ °³¹ßµÈ ¹®¸ÆÀÚÀ¯ Æļ­¿Í À¯»çÇÏ´Ù ; ±×·¯³ª, ´ÜÁö ¸í·É¾î ¹®ÀåÀÇ ¼ø¼­¸¦ °­¿äÇÏ´Â ¹Ý¸é, ºÒÇÊ¿äÇÑ ¸ðµç ´Ü¾î¸¦ ¹ö¸°´Ù. ´ÙÀ½À» ±â¾ïÇØ¾ß ÇÑ´Ù : ¸¸¾à word µ¥ÀÌÅͺ£À̽º¿¡ ÀÖÁö ¾ÊÀº ´Ü¾î¸¦ ÀÐÀ¸¸é, ´Ü¼øÈ÷ ±× ´Ü¾î¸¦ ¹ö¸°´Ù. Æļ­°¡ ¿©±â¿¡ ÀÖ´Ù :

    /* Noise-disposal NLP parser */

    parser()

    {

      char com[80], mod[80], name[80], op;

      float quant;

      if (!get_com(com)) return 0;

      get_mod(mod);

      get_name(name);

      op=get_op();

      quant=get_quant();

      perform(com, mod, name, op, quant);

      return 1;

    }

ÀÌ¿Í°°ÀÌ, Æļ­´Â ¸¶Ä¡ ´Ù¼¸ ºÎºÐ, Áï ¸í·É¾î ÀÚü, ¼ö½Ä¾î, À̸§, ¿¬»êÀÚ, ±×¸®°í ÇÑÁ¤»ç·Î ±¸¼ºµÈ °Íó·³ Áú¹®À» ÆĽºÇÑ´Ù. ¸ðµç ¸í·É¾î´Â ¸¶Ä§Ç¥³ª ¹°À½Ç¥·Î ³¡³ª¾ß ÇÑ´Ù. ¸ðµç ¸í·É¾î°¡ ÆĽºµÈ ÈÄ¿¡, perform() Àº ÁöÁ¤µÈ ¸í·ÉÀ» ¼öÇàÇÑ´Ù. ´ÙÀ½¿¡ Áö¿øÇÔ¼öµéÀÌ ³ª¿ÍÀÖ´Ù.

    /* get the command */

    get_com(s)

    char *s;

    {

      get_noise();

      get_token();

      if (find_type(token)!=COMMAND)  {

        strcpy(s, "");

        return 0;

      }

      strcpy(s, token);

      return 1;

    }

    /* get a modifier or default to "all" */

    get_mod(s)

    char *s;

    {

      get_noise();

      get_token();

      if(find_type(token)!=MODIFIER)  {

        strcpy(s, "all");

        put_back();

        return;

      }

      strcpy(s, token);

    }

    /* get the name */

    get_name(s)

    char *s;

    {

      get_noise();

      get_token();

      if(find_type(token)!=NAME)  {

        strcpy(s, "");

        put_back();

        return;

      }

      strcpy(s, token);

    }

    /* get an operator */

    get_op()

    {

      get_noise();

      get_token();

      if(find_type(token)==OPERATOR)

        return *token;

      else return 0;

    }

    /* get a number */

    float get_quant()

    {

      float t;

      get_token();

      if (*token!='.')

        sscanf(token, "%f", &t);

      return t;

    }

    /* check for noise */

    get_noise()

    {

      do{

        get_token();

      } while(!find_type(token));

      put_back();         /* return a valid token to stream */

    }

°¢ ´Ü°è¿¡¼­, ÇÁ·Î±×·¥Àº ¸ÕÀú get_noise() ¸¦ È£ÃâÇÑ´Ù´Â °ÍÀ» ÁÖ¸ñÇØ¾ß ÇÑ´Ù. ÀÌ°ÍÀº Áß¿äÇÑ ´Ü¾î ¾Õ¿¡ ¿Ã ¼ö ÀÖ´Â ¾î¶² ÀâÀ½ ´Ü¾îµµ Á¦°ÅÇÑ´Ù. ¶ÇÇÑ ¸í·É¾î ¹®¹ýÀÇ ¼±ÅÃÀû ºÎºÐÀº ±âÁ¤°ª (default) ÀÎ °æ¿ì°¡ ÀÖÀ½À» ÁÖ¸ñÇØ¾ß ÇÑ´Ù. ÀÌ ±âÁ¤°ªÀÇ °æ¿ì´Â ¸í·ÉÀ» ¼öÇàÇÏ´Â ·çƾ°ú ÀÏ°üµÈ ÀÎÅÍÆäÀ̽º¸¦ À¯ÁöÇϱâ À§ÇÏ¿© ÇÊ¿äÇÏ´Ù. get_noise() °¡ Áß¿äÇÑ ´Ü¾î ¾ÕÀÇ ÀâÀ½À» Á¦°ÅÇÒ ¶§, ÇÔ¼ö put_back() Àº ÀÐÇôÁø ¸¶Áö¸· ÅäÅ«À» ÀÔ·Â ½ºÆ®¸µÀ¸·Î ¸®ÅÏÇÑ´Ù.

Áú¹®ÀÌ Ã³¸®µÈ ÈÄ¿¡, Áú¹®ÀÌ ¿ä±¸ÇÏ´Â °ÍÀ» ½ÇÁ¦·Î ÇÏ´Â °ÍÀº perform() ÀÌ´Ù. ÀÌ Ã¥Àº µ¥ÀÌÅͺ£À̽º°¡ ¾Æ´Ï¶ó ÀΰøÁö´É¿¡ °üÇÑ °ÍÀ̱⠶§¹®¿¡, perform() ÇÔ¼ö´Â ÃæºÐÇÑ ¹üÀ§ÀÇ Áú¹® °¡´É¼ºµé¿¡ ¹ÝÀÀÇÒ ¼ö ¾ø´Ù. ±×·¯³ª, ³íÁõ¿¡´Â ÀûÀýÇÏ´Ù (¾à°£ÀÇ ³ë·ÂÀ¸·Î, ±×°ÍÀ» È®Àå½Ãų ¼ö ÀÖ´Ù). perform() ÇÔ¼ö¿Í Áö¿ø ÇÔ¼ö, ±×¸®°í Àüü ÇÁ·Î±×·¥À» ´ÙÀ½¿¡ Á¦½ÃÇÑ´Ù.  

     /* Noise-disposal NLP database query example. */

    # include "stdio.h"

    # define MAX           100

    # define NOISE         0

    # define COMMAND   1

    # define NAME          2

    # define MODIFIER     3

    # define OPERATOR   4

    # define TERM           5

     

    /* structure of the word database (wdb) */

    struct word {

      char word[20];

      char type;

    };

    struct word wdb[MAX];              /* array of db structures */

     

    /* structure of the stock db */

    struct companies {

      char name[20];

      float price;

    };

    struct companies stock[MAX];       /* array of db structures */

     

    int db_pos=0;                               /* number of entries in wdb */

    int stock_pos=0;                          /* number fo entries in stock db */

    char s[80];                                 /* holds the sentence */

    char *t_pos=0;                            /* points into the sentence */

    char token[80];                           /* contains the word */

     

    float get_quant(), find_price(), find_name();

     

    main()

    {

      setup();

      printf("Enter query :  ");

      gets(s);

      printf("¡¬n");

      t_pos=s;

      if (!parse())   printf("Query not understand. ¡¬n");

    }

     

    setup()
    {

      assert_wdb("xyz", NAME);

      assert_wdb("ucl", NAME);

      assert_wdb("abc", NAME);

      assert_wdb("show", COMMAND);

      assert_wdb("is", COMMAND);

      assert_wdb("all", MODIFIER);

      assert_wdb("one", MODIFIER);

      assert_wdb(">", OPERATOR);

      assert_wdb("<", OPERATOR);

      assert_wdb("=", OPERATOR);

      assert_wdb(".", TERM);

      assert_wdb("?", TERM);

      assert_stock("ucl", 123.00);

      assert_stock("abc", 35.75);

      assert_stock("xyz", 100.0);

    }

     

    /* place facts into database */

    assert_wdb(word, type)

    char *word;

    int type;

    {

      if (db_pos < MAX)  {

        strcpy(wdb[db_pos].word, word);

        wdb[db_pos].type=type;

        db_pos++;

      }

      else printf("Word database full. ¡¬n");

    }

     

    /* place companies into database */

    assert_stock(name, price)

    char *name;

    float price;

    {

      if (stock_pos<MAX)  {

        strcpy(stock[stock_pos].name, name);

        stock[stock_pos].price=price;

        stock_pos++;

      }

      else printf("Stock database full. ¡¬n");

    }

     

    /* Noise-disposal NLP parser */

    parser()

    {

      char com[80], mod[80], name[80], op;

      float quant;

       

      if (!get_com(com)) return 0;

      get_mod(mod);

      get_name(name);

      op=get_op();

      quant=get_quant();

      perform(com, mod, name, op, quant);

      return 1;

    }

     

    /* get the command */

    get_com(s)

    char *s;

    {

      get_noise();

      get_token();

      if (find_type(token)!=COMMAND)  {

        strcpy(s, "");

        return 0;

      }

      strcpy(s, token);

      return 1;

    }

    /* get a modifier or default to "all" */

    get_mod(s)

    char *s;

    {

      get_noise();

      get_token();

      if (find_type(token)!=MODIFIER)  {

        strcpy(s, "all");

        put_back();

        return;

      }

      strcpy(s, token);

    }

     

    /* get the name */

    get_name(s)

    char *s;

    {

      get_noise();

      get_token();

      if (find_type(token)!=NAME)  {

        strcpy(s, "");

        put_back();

        return;

      }

      strcpy(s, token);

    }

     

    /* get an operator */

    get_op()

    {

      get_noise();

      get_token();

      if (find_type(token)==OPERATOR)

        return *token;

      else return 0;

    }

     

    /* get a number */

    float get_quant()

    {

      float t;

      get_token();

      if (*token!='.')

        sscanf(token, "%f", &t);

      return t;

    }

     

    /* check for noise */

    get_noise()

    {

      do{

        get_token();

      } while(!find_type(token));

      put_back();         /* return a valid token to stream */

    }

     

    perform(com, mod, name, op, quant)

    char *com, *mod, *name, op;

    float quant;

    {

      if (!strcmp("show", com) {

        if (*name)  printf("%f¡¬n", find_price(name));

        else search(mod, op, quant);

      }

      else {  /* is */

        switch(op) {

          case '=' :

            if (quant==find_price(name))

              printf(" yes¡¬n");

            else printf(" no¡¬n");

            break;

          case '<' :

            if (find_price(name)<quant)

              printf(" yes¡¬n");

            else printf(" no¡¬n");

            break;

           case '>' :

            if (find_price(name)<quant)

              printf(" yes¡¬n");

            else printf(" no¡¬n");

            break;

        }

      }

    }

     

    search(mod, op, quant)

    char *mod, op;

    float quant;

    {

      char name[80];

      float price;

      find_name("", quant, op, 1);         /* restart */

      if (!strcmp(mod, "all")) {

        while ((price=find_name(name, quant, op, 0))!=0) {

          printf("%s", name);

          printf(" %f¡¬n", price);

        }

      }

      else {

        price=find_name(name, quant, op, 0);

        printf("%s", name);

        printf(" %f¡¬n", price);

        }

      }

       

    /* return the last token to the input stream */

    put_back()

    {

      int t;

      for (t=0; t<strlen(token); t++) t_pos--;

    }

    terminator()

    {

      get_token();

      return(find_type(token)==TERM);

    }

     

    /* given name find price */

    float find_price(name)

    char *name;

    {

      int t;

      for (t=0; t<stock_pos; t++) {

        if (!strcmp(name, stock[t].name))

          return stock[t].price;

      }

      return -1.0;

    }

     

    /* given price and op find name */

    float find_name(name, price, op, restart)

    char *name, op, restart;

    float price;

    {

      static int t;

      if (restart) {

        t=0;

        return;

      }

      do {

        switch(op) {

          case '=' :

            if (stock[t].price==price) {

              strcpy(name, stock[t].name);

              t++;

              return stock[t-1].price;

            }

            break;

          case '<' :

            if (stock[t].price<price) {

              strcpy(name, stock[t].name);

              t++;

              return stock[t-1].price;

            }

            break;

          case '>' :

            if (stock[t].price>price) {

              strcpy(name, stock[t].name);

              t++;

              return stock[t-1].price;

            }

            break;

          case '¡¬0' :

            strcpy(name, stock[t].name);

            t++;

            return stock[t-1].price;

        }

        t++;

      } while(t<=stock_pos);

      return 0;

    }

     

    /* find the type given the word */

    find_type(word)

    char *word;

    {

      int t;

      for (t=0; t<db_pos; t++)

        if (!strcmp(word, wdb[t].word))

          return wdb[t].type;

      return 0;

    }

    get_token()

    {

      char *p;

      p=token;

      /* skip spaces */

      while (*t_pos==' ')  t_pos++;

        if (*t_pos=='.') {

          *p++='.';

          *p='¡¬0';

          return;

        }

      /* read word until a space or period */

      while (*t_pos!=' ' && *t_pos!='.') {

        *p=*t_pos++;

        p++;

      }

      *p='¡¬0';

    }

¼öÇàµÉ ¶§, ÀÌ ÇÁ·Î±×·¥Àº ´ÙÀ½ Áú¹®µéÀ» ¿Å°Ô Çؼ®Çؼ­ ¹ÝÀÀÇÒ °ÍÀÌ´Ù.

show me all companies with prices > 100.

ucl 123

show me all.

ucl 123

abc 35

xyz 100

please show me one with a price < 50

abc 35

is the price of ucl > 100 ?

yes

ºñ·Ï ÀÌ °£´ÜÇÑ ÇÁ·Î±×·¥Àº µÎ °³ÀÇ ¸í·É¾î¸¸À» °®Áö¸¸, ¼±ÅÃÇÑ´Ù¸é ´Ù¸¥ °ÍµéÀº ½±°Ô ÷°¡ÇÒ ¼ö ÀÖ´Ù.

(1) ÀâÀ½Á¦°Å Æļ­ÀÇ ºÐ¼® (Analysis of the Noise-Disposal Parser)

ÀâÀ½Á¦°Å Æļ­ÀÇ ÇÑ°¡Áö ´ÜÁ¡Àº ¹æ±Ý ÁÖ¾îÁø µ¥ÀÌÅͺ£À̽ºÀÇ ¿¹¿Í °°Àº Á¦ÇÑµÈ »óȲ ¹Û¿¡´Â À¯¿ëÇÏÁö ¸øÇѵ¥, ¿Ö³ÄÇÏ¸é µÎ°¡Áö °¡Á¤¿¡ ±âÃÊÇϱ⠶§¹®ÀÌ´Ù. ù ¹ø° °¡Á¤Àº ±× ¹®ÀåÀÌ ¾ö¹ÐÇÑ Æ÷¸ËÀ» µû¸¥´Ù´Â °ÍÀÌ´Ù. µÎ ¹ø° °¡Á¤Àº ´Ü ¸î °³ÀÇ Å°¿öµå³ª ±âÈ£µé¸¸ÀÌ Áß¿äÇÏ´Ù´Â °ÍÀÌ´Ù. ¾Ë°ÚÁö¸¸, ½ÇÁ¦ÀÇ ´ëÈ­¿¡¼­, ´ëºÎºÐÀÇ ´Ü¾îµéÀº ¾î¶² ´Ù¸¥ °Íº¸´Ù ´õ Áß¿äÇÏ´Ù.

µÎ ¹ø° ´ÜÁ¡Àº, ¸¹Àº »óȲ¿¡¼­ ´ÙÀ½°ú °°Àº ¹®ÀåÀ» ¹Þ¾ÆµéÀÌ°í, stock µ¥ÀÌÅͺ£À̽ºÀÇ ³»¿ëÀ» µð½ºÇ÷¹ÀÌÇÑ´Ù´Â °ÍÀÌ´Ù. ÀÌ·± À¯ÇüÀÇ ´ÜÁ¡Àº ÇÁ·Î±×·¥À» ½Å·ÚÇϱ⠾î·Æ°Ô ÇÑ´Ù.

the big pig show house all > 100.

ÀâÀ½Á¦°Å Æļ­ÀÇ ÁÖ¿äÇÑ ÀåÁ¡Àº ±¸ÇöÇϱⰡ °£´ÜÇÏ°í, ¹®Àå¿¡ ÀÖ´Â Á¤º¸¸¦ »¡¸® ¹Þ¾ÆµéÀδٴ °ÍÀÌ´Ù. ´õ¿íÀÌ, ´Ù¸¥ À¯ÇüÀÇ Æļ­¿Í ÇÔ²² »ç¿ëµÉ ¶§, ¹Þ¾Æµé¿©Á®¾ß ÇÏ´Â º¯È­ (variation) ÀÇ ¼ö¸¦ ÁÙÀÏ ¼ö ÀÖ´Ù. »ç½Ç, ¾î¶² ¼º°øÀûÀÎ ÀÚ¿¬¾ð¾î 󸮱⵵ ÃÖ¼ÒÇÑ ¾î¶² ÀâÀ½Á¦°Å ½Ã½ºÅÛÀ» °¡Áú °ÍÀ̶ó´Â °ÍÀº ÀǽÉÇÒ ¿©Áö ¾ø´Ù.