CYK 알고리즘

CYK 알고리즘 (Cocke-Younger-Kasami 알고리즘, CYK Algorithm) 또는 CKY 알고리즘 은 特定한 文字列 에 對해, 그 文字列이 特定한 文脈自由文法 에 屬하는지를 判斷하고, 또한 어떠한 方式으로 生成되는지를 判斷하는 파싱 알고리즘이다. 이 알고리즘은 動的計劃法 을 使用하며, 上向式 파싱 構造를 가지고 있다.

基本的으로 CYK 알고리즘에서는 촘스키 正規形式 으로 表現된 文脈自由文法을 使用하지만, 모든 文脈自由文法은 촘스키 正規形式으로 變換이 可能하기 때문에 CYK 알고리즘을 使用할 수 있다.

文字列의 길이가 $n$ 일 때, CYK 알고리즘은 Θ (n ³)의 時間複雜度를 가진다. 이것은 現在 모든 文脈自由文法을 파싱할 수 있는 가장 效率的인 알고리즘이다. (CYK 알고리즘보다 效率的으로 動作하는 몇몇 알고리즘이 있지만, 그 알고리즘은 특정한 文法의 境遇에만 使用이 可能하다.)

알고리즘 [ 編輯 ]

membership 問題를 푸는 CYK 알고리즘은 아래와 같다:

Let
 the input string consist of 
n
 letters, 
a
₁ ... 
a
_n.
Let
 the grammar contain 
r
 terminal and nonterminal symbols 
R
₁ ... 
R
_r.
This grammar contains the subset R
_s which is the set of start symbols.
Let
 P[n,n,r] be an array of booleans. Initialize all elements of P to false.
For each
 i = 1 to n
  
For each
 unit production R
_j -> a
_i, 
set
 P[i,1,j] = true.
For each
 i = 2 to n -- Length of span
  
For each
 j = 1 to n-i+1 -- Start of span
    
For each
 k = 1 to i-1 -- Partition of span
      
For each
 production R
_A -> R
_B R
_C
        If
 P[j,k,B] and P[j+k,i-k,C] 
then
 set P[j,i,A] = true
If
 any of P[1,n,x] is true (x is iterated over the set s, where s are all the indices for R
_s)
  
Then
 string is member of language
  
Else
 string is not member of language

예 [ 編輯 ]

例示文法은 다음과 같다.

{\begin{aligned}{\ce {S}}&\ {\ce {->NP\ VP}}\\{\ce {VP}}&\ {\ce {->VP\ PP}}\\{\ce {VP}}&\ {\ce {->V\ NP}}\\{\ce {VP}}&\ {\ce {->eats}}\\{\ce {PP}}&\ {\ce {->P\ NP}}\\{\ce {NP}}&\ {\ce {->Det\ N}}\\{\ce {NP}}&\ {\ce {->she}}\\{\ce {V}}&\ {\ce {->eats}}\\{\ce {P}}&\ {\ce {->with}}\\{\ce {N}}&\ {\ce {->fish}}\\{\ce {N}}&\ {\ce {->fork}}\\{\ce {Det}}&\ {\ce {->a}}\end{aligned}}

參考文獻 [ 編輯 ]

John Cocke and Jacob T. Schwartz (1970). Programming languages and their compilers: Preliminary notes. Technical report, Courant Institute of Mathematical Sciences, New York University.
T. Kasami (1965). An efficient recognition and syntax-analysis algorithm for context-free languages. Scientific report AFCRL-65-758, Air Force Cambridge Research Lab, Bedford, MA.
Daniel H. Younger (1967). Recognition and parsing of context-free languages in time n ³. Information and Control 10(2): 189?208.

알고리즘 [ 編輯 ]

예 [ 編輯 ]

參考 文獻 [ 編輯 ]

參考文獻 [ 編輯 ]