TL; DR:如何计算一个语法产品的模型,以便为同一个左手边存在不确定数量的产品?
我正在研究一个关于形式语言理论的项目,我正在尝试编写一个用于构建常规语法对象的类,这些对象可以传递给有限状态机.我天真的尝试是创建一个API,为每个允许的输入添加一个生产.我的尝试的精简版本如下(基于正式语法的正式定义G = (N, ?, P, S)
):
class ContextFreeGrammar: def __init__(self, variables, alphabet, production_rules, start_variable): self.variables = variables self.alphabet = alphabet self.production_rules = production_rules self.start_variable = start_variable def __repr__(self): return '{}({}, {}, {}, {})'.format( self.__class__.__name__, self.variables, self.alphabet, self.production_rules, self.start_variable ) class RegularGrammar(ContextFreeGrammar): _regular_expression_grammar = None # TODO @classmethod def from_regular_expression(cls, regular_expression): raise NotImplementedError()
我还没有达到实际编写有限状态自动机或下推自动机的程度.
正则表达式的语法是无上下文的,所以我在下面的WSN中包含了我的定义:
syntax = expression . expression = term "|" expression . expression = term . term = factor repetition term . term = factor term . term = . repetition = "*" . repetition = "+" . repetition = "?" . repetition = "{" nonnegative_integer "," nonnegative_integer "}" . repetition = "{" nonnegative_integer ",}" . repetition = "{," nonnegative_integer "}" . nonnegative_integer = nonzero_arabic_numeral arabic_numerals . nonnegative_integer = arabic_numeral . nonzero_arabic_numeral = "1" . nonzero_arabic_numeral = "2" . nonzero_arabic_numeral = "3" . nonzero_arabic_numeral = "4" . nonzero_arabic_numeral = "5" . nonzero_arabic_numeral = "6" . nonzero_arabic_numeral = "7" . nonzero_arabic_numeral = "8" . nonzero_arabic_numeral = "9" . arabic_numeral = nonzero_arabic_numeral . arabic_numeral = "0" . arabic_numerals = arabic_numeral . arabic_numerals = arabic_numeral arabic_numerals . factor = "(" expression ")" . factor = character_class . factor = character . escaped_character = "\\." . escaped_character = "\\(" . escaped_character = "\\)" . escaped_character = "\\+" . escaped_character = "\\*" . escaped_character = "\\?" . escaped_character = "\\[" . escaped_character = "\\]" . escaped_character = "\\\\" . escaped_character = "\\{" . escaped_character = "\\}" . escaped_character = "\\|" . character -> TODO ; character_class = TODO .
人们可以很容易地看到我明确地将交替分成单独的作品.我这样做是为了便于实施.但我仍然坚持如何进行角色等等.我想要production_rules
成为从左手边到每个相应右手边的一套地图.但现在这看起来不可行.