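I need to build a parser that can extract the logical structure from text input, so that I can construct queries for some web services.
I tried using regular expressions, but handling nested logic got very complicated, so I decided to ask for help; maybe I'm going about it the wrong way.
For example: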
( (foo1 and bar) or (foo2 and bar2) ) and ( (foo3 and bar3) or foo4 ) and "this is quoted"
The result should look like this:
{ { foo1 AND bar } OR { foo2 AND bar2 } } AND { { foo3 AND bar3 } OR foo4 } AND { "this is quoted" }
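The language I'm using is ActionScript 3, but I could adapt a Java version.
Well, parsers are easy ...
First you need a few things (I'll omit the constructors, since I guess you can write them yourself):
Expressions (the output):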
    class Expression {}
    class Operation extends Expression {
        public var operand1:Expression;
        public var operator:String;
        public var operand2:Expression;
    }
    class Atom extends Expression {
        public var ident:String;
    }
Tokens (the intermediate format):
    class Token {
        public var source:String; // the token text as it appeared in the input
        public var pos:uint;      // position in the input, used for error messages
    }
    class Identifier extends Token {
        public var ident:String;
    }
    class OpenParenthesis extends Token {}
    class CloseParenthesis extends Token {}
    class Operator extends Token {
        public var operator:String;
    }
    class Eof extends Token {}
and a tokenizer, which should implement this interface:
    interface TokenStream {
        function read():Token;
    }
I guess you'll figure out how to tokenize ...
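A minimal tokenizer sketch in Java (the question says a Java version could be adapted). It is only an illustration under assumptions: the names `SketchTokenizer`, `Tok` and `Kind` are hypothetical, a single class with a `kind` field stands in for the Token subclasses, `and`/`or` are assumed to be the only operators, and a quoted phrase is treated as one identifier.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: none of these names come from the answer.
final class SketchTokenizer {
    enum Kind { OPEN, CLOSE, OPERATOR, IDENT, EOF }

    static final class Tok {
        final Kind kind;
        final String source; // token text as it appeared in the input
        final int pos;       // zero-based offset in the input
        Tok(Kind kind, String source, int pos) {
            this.kind = kind;
            this.source = source;
            this.pos = pos;
        }
        @Override public String toString() { return kind + ":" + source + "@" + pos; }
    }

    static List<Tok> tokenize(String input) {
        List<Tok> out = new ArrayList<>();
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (Character.isWhitespace(c)) { i++; continue; }
            if (c == '(') { out.add(new Tok(Kind.OPEN, "(", i)); i++; continue; }
            if (c == ')') { out.add(new Tok(Kind.CLOSE, ")", i)); i++; continue; }
            int start = i;
            if (c == '"') {
                // Quoted phrase: read up to the closing quote and keep the quotes.
                int end = input.indexOf('"', i + 1);
                if (end < 0) {
                    throw new IllegalArgumentException("unterminated quote at position " + i);
                }
                i = end + 1;
                out.add(new Tok(Kind.IDENT, input.substring(start, i), start));
                continue;
            }
            // Bare word: runs until whitespace, a parenthesis, or a quote.
            while (i < input.length()
                    && !Character.isWhitespace(input.charAt(i))
                    && "()\"".indexOf(input.charAt(i)) < 0) {
                i++;
            }
            String word = input.substring(start, i);
            // Assumption: "and" and "or" (any case) are the only operators.
            boolean isOp = word.equalsIgnoreCase("and") || word.equalsIgnoreCase("or");
            out.add(new Tok(isOp ? Kind.OPERATOR : Kind.IDENT, word, start));
        }
        out.add(new Tok(Kind.EOF, "", input.length()));
        return out;
    }

    public static void main(String[] args) {
        for (Tok t : tokenize("(foo1 and bar) or \"this is quoted\"")) {
            System.out.println(t);
        }
    }
}
```

Wrapping the returned list in an object with a `read()` method that hands out one token per call would give something shaped like the TokenStream interface.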
So the path is: source --(tokenizer)--> tokens --(parser)--> expressions ...
Here is the parse routine, with one little helper:
    function parse(t:TokenStream):Expression {
        var tk:Token = t.read();
        // this is a really weird thing about AS3 ... you need to cast to Object
        // before you can access the constructor
        switch ((tk as Object).constructor) {
            case OpenParenthesis:
                var e1:Expression = parse(t);
                tk = t.read();
                switch ((tk as Object).constructor) {
                    case CloseParenthesis:
                        return e1;
                    case Operator:
                        var op:String = (tk as Operator).operator;
                        var e2:Expression = parse(t);
                        tk = t.read();
                        if (tk is CloseParenthesis)
                            return new Operation(e1, op, e2);
                        unexpected(tk);
                        break;
                    default:
                        unexpected(tk);
                }
                break;
            case Identifier:
                return new Atom((tk as Identifier).ident);
            default:
                unexpected(tk);
        }
        return null; // never reached: unexpected() always throws
    }

    function unexpected(tk:Token):void {
        throw new Error("unexpected token " + tk.source + " at position " + tk.pos);
    }
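For anyone adapting this to Java, here is a rough port of the same routine. It is a sketch, not a tested translation of the AS3 code: `SketchParser` and `parseSource` are made-up names, tokens are assumed to be whitespace-separated strings (so parentheses need spaces around them), and running out of tokens plays the role of Eof.

```java
import java.util.Arrays;
import java.util.Iterator;

// Hypothetical Java port of the parse routine; same grammar, same limits.
final class SketchParser {
    static abstract class Expression {}

    static final class Atom extends Expression {
        final String ident;
        Atom(String ident) { this.ident = ident; }
        @Override public String toString() { return ident; }
    }

    static final class Operation extends Expression {
        final Expression operand1;
        final String operator;
        final Expression operand2;
        Operation(Expression operand1, String operator, Expression operand2) {
            this.operand1 = operand1;
            this.operator = operator;
            this.operand2 = operand2;
        }
        @Override public String toString() {
            return "{ " + operand1 + " " + operator.toUpperCase() + " " + operand2 + " }";
        }
    }

    // Accepts: identifier | "(" expr ")" | "(" expr operator expr ")"
    static Expression parse(Iterator<String> t) {
        String tk = read(t);
        if (tk.equals("(")) {
            Expression e1 = parse(t);
            tk = read(t);
            if (tk.equals(")")) return e1;
            if (isOperator(tk)) {
                String op = tk;
                Expression e2 = parse(t);
                tk = read(t);
                if (tk.equals(")")) return new Operation(e1, op, e2);
            }
            throw unexpected(tk);
        }
        if (tk.isEmpty() || tk.equals(")") || isOperator(tk)) throw unexpected(tk);
        return new Atom(tk);
    }

    // An empty string stands in for Eof.
    static String read(Iterator<String> t) { return t.hasNext() ? t.next() : ""; }

    // Assumption: "and" and "or" are the only operators.
    static boolean isOperator(String tk) {
        return tk.equalsIgnoreCase("and") || tk.equalsIgnoreCase("or");
    }

    static IllegalArgumentException unexpected(String tk) {
        return new IllegalArgumentException("unexpected token '" + tk + "'");
    }

    // Convenience wrapper: split on whitespace, then parse.
    // Like the original, it does not check for leftover tokens.
    static Expression parseSource(String source) {
        return parse(Arrays.asList(source.trim().split("\\s+")).iterator());
    }

    public static void main(String[] args) {
        System.out.println(parseSource("( ( foo1 and bar ) or foo4 )"));
    }
}
```

With this sketch, `( ( foo1 and bar ) or foo4 )` comes out as `{ { foo1 AND bar } OR foo4 }`, while `( a and b and c )` is rejected, because each pair of parentheses may hold at most one operator.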
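This is not a particularly good parser, but it shows the basic principle of a parse routine ... Actually, I haven't tested the implementation, but it should work ... It is very primitive and unforgiving: it only accepts an identifier or a parenthesized group with at most one operator, so things like operator precedence are completely missing ... but if you want that, go for it ...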
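If you do want precedence, one common way is one function per precedence level. The Java sketch below is an illustration only, under assumptions that are not in the code above: `and` is taken to bind tighter than `or`, both are left-associative, a quoted phrase counts as one identifier, and the tree is rendered straight to the `{ a AND b }` notation instead of building Expression objects. `PrecedenceSketch` is a made-up name, and stray characters such as an unbalanced quote are silently skipped by the regular expression.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of a precedence-aware recursive descent parser.
final class PrecedenceSketch {
    // A token is a quoted phrase, a single parenthesis, or a bare word.
    private static final Pattern TOKEN =
            Pattern.compile("\"[^\"]*\"|[()]|[^\\s()\"]+");

    private final List<String> tokens = new ArrayList<>();
    private int next = 0;

    private PrecedenceSketch(String source) {
        Matcher m = TOKEN.matcher(source);
        while (m.find()) tokens.add(m.group());
    }

    // Parses the source and returns it in the "{ a AND b }" notation.
    static String parse(String source) {
        PrecedenceSketch p = new PrecedenceSketch(source);
        String result = p.orExpr();
        if (p.peek() != null) {
            throw new IllegalArgumentException("unexpected token '" + p.peek() + "'");
        }
        return result;
    }

    private String peek() { return next < tokens.size() ? tokens.get(next) : null; }

    // orExpr := andExpr ( "or" andExpr )*
    private String orExpr() {
        String left = andExpr();
        while ("or".equalsIgnoreCase(peek())) {
            next++;
            String right = andExpr();
            left = "{ " + left + " OR " + right + " }";
        }
        return left;
    }

    // andExpr := primary ( "and" primary )*   (assumed to bind tighter than "or")
    private String andExpr() {
        String left = primary();
        while ("and".equalsIgnoreCase(peek())) {
            next++;
            String right = primary();
            left = "{ " + left + " AND " + right + " }";
        }
        return left;
    }

    // primary := "(" orExpr ")" | identifier | quoted phrase
    private String primary() {
        String tk = peek();
        if (tk == null) throw new IllegalArgumentException("unexpected end of input");
        if (tk.equals(")") || tk.equalsIgnoreCase("and") || tk.equalsIgnoreCase("or")) {
            throw new IllegalArgumentException("unexpected token '" + tk + "'");
        }
        next++;
        if (!tk.equals("(")) return tk;
        String inner = orExpr();
        if (!")".equals(peek())) throw new IllegalArgumentException("expected ')'");
        next++;
        return inner;
    }

    public static void main(String[] args) {
        System.out.println(parse("a or b and c"));
    }
}
```

Under these assumptions `a or b and c` is grouped as `{ a OR { b AND c } }`. A chain such as `x and y and z` nests to the left as `{ { x AND y } AND z }`, so the output for the example in the question has more braces than the flat form shown there.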
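By the way: if you used haXe with its enums, the whole code would look much shorter and prettier ... you might want to have a look at it ...
Well then, good luck ... ;)
greetz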
back2dos