Understanding Pratt Parsing: Top‑Down Operator Precedence

How Pratt parsing uses binding power to build expression trees efficiently for expression evaluation, with a step‑by‑step Go implementation and extensions for unary minus, parentheses, exponentiation, and variables.

Parsing arithmetic expressions is a foundational problem in compiler design. Humans instantly know that 3 + 4 * 5 should be read as 3 + (4 * 5) rather than (3 + 4) * 5, but a machine needs explicit rules for operator precedence and associativity. The Pratt parsing algorithm, named after Vaughan Pratt, is a concise top‑down way to turn an infix expression into an abstract syntax tree (AST) using left and right binding powers (plus a binding power for prefix operators). Once you have an AST, evaluation is straightforward: recursively evaluate each subtree and apply the operator at the root.

This article explains the core idea behind Pratt parsing, walks through how operands “bind” to operators, and then implements a small recursive parser in Go. We extend it step by step to support unary minus, parentheses, exponentiation, and variables with assignment.

What you’ll see below: intuition and examples first, a short pseudocode sketch, then the Go code and incremental diffs for each feature.

Expressions can be written in several notations, depending on where operators sit relative to their operands:

NotationPatternExample for “3 plus 4”
Infixoperand -- operator -- operand3 + 4
Prefix (Polish)operator -- operands+ 3 4
Postfix (reverse Polish)operands -- operator3 4 +

Infix is what most programming languages use for arithmetic: the operator appears between the things it combines. That matches how people read math on paper, but it is the hardest of the three to parse from a flat stream of tokens, because the structure is not explicit: you have to decide where one subexpression ends and another begins. Precedence (* before +), associativity (a - b - c as (a - b) - c), and parentheses are the rules that disambiguate infix.

Prefix and postfix notations need no parentheses for the same expressions (the order of tokens fixes the tree), which is why some calculators and VMs use them internally. This article stays with infix: Pratt parsing is a compact way to turn infix into an AST despite that ambiguity.

Pratt parsing builds a tree: each internal node is an operator, and its left and right children are subexpressions (which may be single values or subtrees). The simplest case is a binary operator with two operands.

For example, the expression:

3 + 4
3 + 4

can be drawn as:

    +
   / \
  3   4
    +
   / \
  3   4

A denser example:

3 + 4 * 5 / 2
3 + 4 * 5 / 2

has this tree (multiplication and division bind tighter than addition):

      +
     / \
    3   /
       / \
      *   2
     / \
    4   5
      +
     / \
    3   /
       / \
      *   2
     / \
    4   5

Evaluating the expression means evaluating the children first, then applying the operator at the parent.

  • Atom: a value such as 1, 20, or a larger subtree treated as one unit.
  • Op (operator): a symbol such as +, -, *, /.

In a normal infix expression, atoms and operators alternate. For instance, atom 3 has operator + on its right, while atom 4 has + on its left and * on its right.

To convert an infix expression into an AST, we need to understand how operators bind to their operands. In infix notation, each binary operator sits between two operands. In 3 + 4, + sits between 3 and 4, on the other hand, 4 sits between + and *.

The question now is: which operator should the operand bind to?

Pratt parsing gives every operator two numbers:

  • Left binding power (LBP): how strongly an operator wants to grab the value on its left.
  • Right binding power (RBP): how strongly an operator wants to grab the value on its right.

So, the binding rule is: The operand binds to the operator with higher binding power on either side. If both adjacent operators have the same binding power, bind the atom to the operator on the left.

For left-associative operators such as +, -, *, and /, we choose a right binding power slightly greater than the left binding power. This prevents an operator of the same precedence from being absorbed into the recursive call, which gives us left associativity. For right-associative operators such as ^, we choose a right binding power slightly smaller than the left binding power. This prevents an operator of the same precedence from being absorbed into the recursive call, which gives us right associativity.

We can freely choose the left & right binding power for each operator. But we must retain the precedence of the operators. For example, * should have higher binding powers (both left & right) than + or -. Here's one way to do it:

OperatorLeftRight
=0.10.0
+, -11.1
*, /22.1
^3.13.0

Let's take this as an example:

Expression:         3     +       4       *        5
Binding power:        1.0    1.1     2.0   2.1
Expression:         3     +       4       *        5
Binding power:        1.0    1.1     2.0   2.1

Applying the binding rule, we have:

  • 3 binds to the + on its right.
  • 4 binds to the * on its right.
  • 5 binds to the * on its left.

Now * has 4 on its left and 5 on its right. It can form a subtree with the * as the root. We call this subtree as x.

The expression becomes:

Expression:         3      +       x
Binding power:        1.0    2.1
Expression:         3      +       x
Binding power:        1.0    2.1

Now + has 3 on its left and x on its right. It can form a subtree with the + as the root. Finally, we have this tree

      +
     / \
    3   *
       / \
      4   5
      +
     / \
    3   *
       / \
      4   5

Expression:         3     +       4       *        5        /       2
Binding power:        1.0    1.1     2.0   2.1         2.0    2.1
Expression:         3     +       4       *        5        /       2
Binding power:        1.0    1.1     2.0   2.1         2.0    2.1
  • 3 binds to the + on its right.
  • 4 binds to the * on its right.
  • 5 binds to the * on its left. Now * has 4 on its left and 5 on its right. It can form a subtree with the * as the root. We call this subtree as x.

The expresstion turns to:

Expression:         3      +       x       /       2
Binding power:        1.0    1.1      2.0    2.1
Expression:         3      +       x       /       2
Binding power:        1.0    1.1      2.0    2.1
  • 3 still binds to the + on its right.
  • x binds to the / on its right
  • 2 binds to the / on its left.

Now / has x on its left and 2 on its right. It can form a subtree with the / as the root.

So, the expression turns to:

Expression:         3      +      y
Binding power:        1.0    1.1
Expression:         3      +      y
Binding power:        1.0    1.1

Now + has 3 on its left and y on its right. It can form a subtree with the + as the root.

So finally, we have this tree

      +
     / \
    3   /
       / \
      *   2
     / \
    4   5
      +
     / \
    3   /
       / \
      *   2
     / \
    4   5

I hope that two examples above give you the intuition of how Pratt parsing works. Now let's see how to implement it.

Pay attention to the example #2, we can see that, after bind 4 and 5 to *, forming the subtree x, we can't just bind x to + on its left. Because we still have the / operator on its right. So, we need to bind x to / on its right. In that example, we know that 3 will be the left child of +, and our job is to find the right child of +. But how?

How much of the input belongs to the right child?

3 + 4 * 5 / 2
3 + 4 * 5 / 2

After reading 3 +, part of the picture is:

left child: 3
operator: +
right child: ?
left child: 3
operator: +
right child: ?

The right operand of + is not necessarily the single token 4. It might be 4, or 4 * 5, or 4 * 5 / 2. From the example, we observe that, after forming the subtree, we still look for the next operator on the right and bind to it if the operator on the right has higher binding power than the current operator. That leads us to the following rule:

The right child is the largest sub-expression which the right operator's binding power is higher than the current operator's binding power.

In the previous example, the right child of + is 4 * 5 / 2. As * and / all have greater binding power than +. If we have the expression 3 + 4 * 5 + 2, the right child of + is 4 * 5. As * has greater binding power than +, while the second + does not. Finding the largest sub-expression is a recursive problem, as in sub-expression, we still need to build the tree recursively. In the example above, 4 * 5 / 2 also forms to another tree.

The parser does not explicitly scan both sides of every operand. Instead, it builds the tree incrementally by parsing the left atom first, then recursively parsing the right-hand side while operator binding power allows.

minBinding is the minimum left binding required to keep looping. The implementation breaks when leftBindingPower(op) < minBinding.

function parse(tokens, minBinding):
  left = atom from tokens
  while tokens not empty:
    op = peek infix operator
    if leftBindingPower(op) < minBinding:
      break
    consume op
    left = Node(op, left, parse(tokens, rightBindingPower(op)))
  return left
function parse(tokens, minBinding):
  left = atom from tokens
  while tokens not empty:
    op = peek infix operator
    if leftBindingPower(op) < minBinding:
      break
    consume op
    left = Node(op, left, parse(tokens, rightBindingPower(op)))
  return left

Using the table from §2, parsing 3 + 4 * 5 + 2. Here's the call stack when we parse it.

parse("3 + 4 * 5 + 2", 0)

├── parse("4 * 5 + 2", 2)      // right child of first '+'
│   │
│   ├── parse("5 + 2", 3)      // right child of '*'
│   │   └── stop at '+', return 5
│   │
│   └── build (* 4 5), stop at '+', return (4 * 5)

├── build (+ 3 (4 * 5))

├── parse("2", 2)              // right child of second '+'
│   └── return 2

└── build (+ (3 + 4 * 5) 2)
parse("3 + 4 * 5 + 2", 0)

├── parse("4 * 5 + 2", 2)      // right child of first '+'
│   │
│   ├── parse("5 + 2", 3)      // right child of '*'
│   │   └── stop at '+', return 5
│   │
│   └── build (* 4 5), stop at '+', return (4 * 5)

├── build (+ 3 (4 * 5))

├── parse("2", 2)              // right child of second '+'
│   └── return 2

└── build (+ (3 + 4 * 5) 2)

A Lexer is a component that reads a string, breaks it into a list of tokens, each token is either Atom or Operator. Lexer exposes three methods: More(), Peek(), and Pop().

  • More(): returns true if there are more tokens left.
  • Peek(): returns the next token without removing it.
  • Pop(): returns the next token and removes it.
go
type Lexer interface {
	More() bool
	Peek() string
	Pop() string
}

type stackLexer struct {
	tokens []string
}

func NewLexer(input string) Lexer {
	tokens := tokenize(input)
	reverseStrings(tokens)
	return &stackLexer{tokens: tokens}
}

func tokenize(s string) []string {
	var ans []string
	for i := 0; i < len(s); {
		ch := s[i]
		if ch == ' ' {
			i++
			continue
		}
		if unicode.IsDigit(rune(ch)) {
			j := i + 1
			for j < len(s) && unicode.IsDigit(rune(s[j])) {
				j++
			}
			ans = append(ans, s[i:j])
			i = j
			continue
		}
		if unicode.IsLetter(rune(ch)) || ch == '_' {
			j := i + 1
			for j < len(s) && (unicode.IsLetter(rune(s[j])) || unicode.IsDigit(rune(s[j])) || s[j] == '_') {
				j++
			}
			ans = append(ans, s[i:j])
			i = j
			continue
		}
		ans = append(ans, string(ch))
		i++
	}
	return ans
}

func reverseStrings(a []string) {
	for i, j := 0, len(a)-1; i < j; i, j = i+1, j-1 {
		a[i], a[j] = a[j], a[i]
	}
}

func (l *stackLexer) More() bool {
	return len(l.tokens) > 0
}

func (l *stackLexer) Peek() string {
	if len(l.tokens) == 0 {
		return ""
	}
	return l.tokens[len(l.tokens)-1]
}

func (l *stackLexer) Pop() string {
	if len(l.tokens) == 0 {
		panic("lexer: Pop on empty stream")
	}
	v := l.tokens[len(l.tokens)-1]
	l.tokens = l.tokens[:len(l.tokens)-1]
	return v
}
type Lexer interface {
	More() bool
	Peek() string
	Pop() string
}

type stackLexer struct {
	tokens []string
}

func NewLexer(input string) Lexer {
	tokens := tokenize(input)
	reverseStrings(tokens)
	return &stackLexer{tokens: tokens}
}

func tokenize(s string) []string {
	var ans []string
	for i := 0; i < len(s); {
		ch := s[i]
		if ch == ' ' {
			i++
			continue
		}
		if unicode.IsDigit(rune(ch)) {
			j := i + 1
			for j < len(s) && unicode.IsDigit(rune(s[j])) {
				j++
			}
			ans = append(ans, s[i:j])
			i = j
			continue
		}
		if unicode.IsLetter(rune(ch)) || ch == '_' {
			j := i + 1
			for j < len(s) && (unicode.IsLetter(rune(s[j])) || unicode.IsDigit(rune(s[j])) || s[j] == '_') {
				j++
			}
			ans = append(ans, s[i:j])
			i = j
			continue
		}
		ans = append(ans, string(ch))
		i++
	}
	return ans
}

func reverseStrings(a []string) {
	for i, j := 0, len(a)-1; i < j; i, j = i+1, j-1 {
		a[i], a[j] = a[j], a[i]
	}
}

func (l *stackLexer) More() bool {
	return len(l.tokens) > 0
}

func (l *stackLexer) Peek() string {
	if len(l.tokens) == 0 {
		return ""
	}
	return l.tokens[len(l.tokens)-1]
}

func (l *stackLexer) Pop() string {
	if len(l.tokens) == 0 {
		panic("lexer: Pop on empty stream")
	}
	v := l.tokens[len(l.tokens)-1]
	l.tokens = l.tokens[:len(l.tokens)-1]
	return v
}

The Expression is the core data data structure which represent a node in the tree. It has the following fields:

  • value: In case the ndoe is a left, the value usually a number, and later we will extend it to also support variable names
  • varName: the name of the variable if it is a leaf.
  • lhs: the left child of the node.
  • rhs: the right child of the node.
  • isLeaf: whether the node is a leaf.
  • ope: the operator of the node.
go
type Expression struct {
	value   int64
	varName string
	lhs     *Expression
	rhs     *Expression
	isLeaf  bool
	ope     byte
}

func leafExpr(v int64) *Expression {
	return &Expression{value: v, isLeaf: true}
}

func leafVar(name string) *Expression {
	return &Expression{varName: name, isLeaf: true}
}

func nodeExpr(lhs, rhs *Expression, ope byte) *Expression {
	return &Expression{lhs: lhs, rhs: rhs, isLeaf: false, ope: ope}
}

func (e *Expression) Eval() int64 {
	if e.isLeaf {
		return e.value
	}
	leftValue := e.lhs.Eval()
	rightValue := e.rhs.Eval()
	switch e.ope {
	case '+':
		return leftValue + rightValue
	case '-':
		return leftValue - rightValue
	case '*':
		return leftValue * rightValue
	case '/':
		if rightValue == 0 {
			return 0
		}
		return leftValue / rightValue
	default:
		return 0
	}
}
type Expression struct {
	value   int64
	varName string
	lhs     *Expression
	rhs     *Expression
	isLeaf  bool
	ope     byte
}

func leafExpr(v int64) *Expression {
	return &Expression{value: v, isLeaf: true}
}

func leafVar(name string) *Expression {
	return &Expression{varName: name, isLeaf: true}
}

func nodeExpr(lhs, rhs *Expression, ope byte) *Expression {
	return &Expression{lhs: lhs, rhs: rhs, isLeaf: false, ope: ope}
}

func (e *Expression) Eval() int64 {
	if e.isLeaf {
		return e.value
	}
	leftValue := e.lhs.Eval()
	rightValue := e.rhs.Eval()
	switch e.ope {
	case '+':
		return leftValue + rightValue
	case '-':
		return leftValue - rightValue
	case '*':
		return leftValue * rightValue
	case '/':
		if rightValue == 0 {
			return 0
		}
		return leftValue / rightValue
	default:
		return 0
	}
}

go
func getBindingPower(ope byte) (float64, float64) {
	switch ope {
	case '+', '-':
		return 1.0, 1.1
	case '*', '/':
		return 2.0, 2.1
	default:
		return 0, 0
	}
func getBindingPower(ope byte) (float64, float64) {
	switch ope {
	case '+', '-':
		return 1.0, 1.1
	case '*', '/':
		return 2.0, 2.1
	default:
		return 0, 0
	}

go
func parse(l Lexer, minBinding int) *Expression {
	var lhs *Expression
	value, err := strconv.ParseInt(l.Pop(), 10, 64)
	if err != nil {
		panic(err)
	}
	lhs = leafExpr(value)

	for l.More() {
		opeTok := l.Peek()
    leftBindingPower, rightBindingPower := getBindingPower(opeTok[0])
		if leftBindingPower < minBinding {
			break
		}
		l.Pop()
		rhs := parse(l, rightBindingPower)
		lhs = nodeExpr(lhs, rhs, opeTok[0])
	}

	return lhs
}
func parse(l Lexer, minBinding int) *Expression {
	var lhs *Expression
	value, err := strconv.ParseInt(l.Pop(), 10, 64)
	if err != nil {
		panic(err)
	}
	lhs = leafExpr(value)

	for l.More() {
		opeTok := l.Peek()
    leftBindingPower, rightBindingPower := getBindingPower(opeTok[0])
		if leftBindingPower < minBinding {
			break
		}
		l.Pop()
		rhs := parse(l, rightBindingPower)
		lhs = nodeExpr(lhs, rhs, opeTok[0])
	}

	return lhs
}
  1. Parse the first token as an integer that becomes the left child of the root node.
  2. While there are more tokens, peek the next infix operator. If leftBindingPower(op) < minBinding, we break the loop.
  3. Otherwise consume op and set rhs := parse(l, rightBindingPower(op)). The RHS is parsed in a context where only operators stronger than that RBP bind inside; equal-LBP neighbors are excluded when rightBindingPower(op) == leftBindingPower(op) (left assoc), and a second ^ is still allowed when rightBindingPower(op) < leftBindingPower(op) (right assoc).
  4. Fold lhs = nodeExpr(lhs, rhs, op) and repeat.

Thin wrapper for testing:

go
type Parser struct{}

func (p *Parser) Parse(s string) int {
	lex := NewLexer(s)
	expr := parse(lex, -1)
	return int(expr.Eval())
}
type Parser struct{}

func (p *Parser) Parse(s string) int {
	lex := NewLexer(s)
	expr := parse(lex, -1)
	return int(expr.Eval())
}

parse(lex, -1) sets minBinding below every binary LBP so the first infix operator on the stream is always eligible. Parser.Parse builds the lexer, runs parse, and evaluates the tree.

The current parser assumes the first token is a number. Tokenization treats any non-digit single character as an operator, so -3+4 becomes ["-", "3", "+", "4"]. The leading - is not a binary operator here; parsing it like a digit causes a panic.

Two common fixes:

  • Prepend 0 so the expression becomes 0-3+4, or
  • Treat a leading - as unary minus, e.g. desugar to 0 - (parsed operand) with a precedence that binds the operand tightly.

We use the second approach:

In the recursive call for prefix -, we pass a very large minBinding so the right-hand side is parsed as a single tight sub-expression. In this parser, we choose to make unary minus bind more strongly than all currently supported infix operators by passing a very large minBinding.

diff
func parse(l Lexer, minBinding float64) *Expression {
 	var lhs *Expression
-	value, err := strconv.ParseInt(l.Pop(), 10, 64)
-	if err != nil {
-		panic(err)
-	}
-	lhs = leafExpr(value)
+	token := l.Peek()
+	if token == "-" {
+		l.Pop()
+		lhs = nodeExpr(leafExpr(0), parse(l, 99.9), '-')
+	} else {
+		v, err := strconv.ParseInt(l.Pop(), 10, 64)
+		if err != nil {
+			panic(err)
+		}
+		lhs = leafExpr(v)
+	}

 	for l.More() {
 		opeTok := l.Peek()
 		leftBindingPower, rightBindingPower := getBindingPower(opeTok[0])
 		if leftBindingPower < minBinding {
 			break
 		}
 		l.Pop()
 		rhs := parse(l, rightBindingPower)
 		lhs = nodeExpr(lhs, rhs, opeTok[0])
 	}

 	return lhs
 }
func parse(l Lexer, minBinding float64) *Expression {
 	var lhs *Expression
-	value, err := strconv.ParseInt(l.Pop(), 10, 64)
-	if err != nil {
-		panic(err)
-	}
-	lhs = leafExpr(value)
+	token := l.Peek()
+	if token == "-" {
+		l.Pop()
+		lhs = nodeExpr(leafExpr(0), parse(l, 99.9), '-')
+	} else {
+		v, err := strconv.ParseInt(l.Pop(), 10, 64)
+		if err != nil {
+			panic(err)
+		}
+		lhs = leafExpr(v)
+	}

 	for l.More() {
 		opeTok := l.Peek()
 		leftBindingPower, rightBindingPower := getBindingPower(opeTok[0])
 		if leftBindingPower < minBinding {
 			break
 		}
 		l.Pop()
 		rhs := parse(l, rightBindingPower)
 		lhs = nodeExpr(lhs, rhs, opeTok[0])
 	}

 	return lhs
 }

We also want grouped subexpressions. Assume valid input: every ( has a matching ).

  • On (, recurse into a nested parse with the same minBinding as the top level (-1 here), so grouping does not change relative binding.
  • Stop the main loop when the next token is ), so the closing paren ends the current subexpression instead of being treated as an operator.
diff
 func parse(l Lexer, minBinding float64) *Expression {
 	var lhs *Expression
 	t := l.Peek()
+	if len(t) > 0 && t[0] == '(' {
+		l.Pop()
+		lhs = parse(l, -1)
+		l.Pop()
+	} else if t == "-" {
 		l.Pop()
 		lhs = nodeExpr(leafExpr(0), parse(l, 99.9), '-')
 	} else {
 		v, err := strconv.ParseInt(l.Pop(), 10, 64)
 		if err != nil {
 			panic(err)
 		}
 		lhs = leafExpr(v)
 	}

-	for l.More() {
+	for l.More() && l.Peek()[0] != ')' {
 		opeTok := l.Peek()
 		leftBindingPower, rightBindingPower := getBindingPower(opeTok[0])
 		if leftBindingPower < minBinding {
 			break
 		}
 		l.Pop()
 		rhs := parse(l, rightBindingPower)
 		lhs = nodeExpr(lhs, rhs, opeTok[0])
 	}

 	return lhs
 }
 func parse(l Lexer, minBinding float64) *Expression {
 	var lhs *Expression
 	t := l.Peek()
+	if len(t) > 0 && t[0] == '(' {
+		l.Pop()
+		lhs = parse(l, -1)
+		l.Pop()
+	} else if t == "-" {
 		l.Pop()
 		lhs = nodeExpr(leafExpr(0), parse(l, 99.9), '-')
 	} else {
 		v, err := strconv.ParseInt(l.Pop(), 10, 64)
 		if err != nil {
 			panic(err)
 		}
 		lhs = leafExpr(v)
 	}

-	for l.More() {
+	for l.More() && l.Peek()[0] != ')' {
 		opeTok := l.Peek()
 		leftBindingPower, rightBindingPower := getBindingPower(opeTok[0])
 		if leftBindingPower < minBinding {
 			break
 		}
 		l.Pop()
 		rhs := parse(l, rightBindingPower)
 		lhs = nodeExpr(lhs, rhs, opeTok[0])
 	}

 	return lhs
 }

Register ^ with LBP 3.1 and RBP 3.0 (one less than LBP). That single RBP gap is what makes ^ right-associative: the inner parse(3.0) still accepts another ^ because 3.1 > 3.0, so 2^3^4 becomes 2^(3^4). Then teach Eval about ^ (here via math.Pow; production code may want integer exponentiation rules).

diff
func getBindingPower(ope byte) (float64, float64) {
	switch ope {
	case '+', '-':
		return 1.0, 1.1
	case '*', '/':
		return 2.0, 2.1
+  case '^':
+		return 3.1, 3.0
	default:
		return 0, 0
	}
}
func getBindingPower(ope byte) (float64, float64) {
	switch ope {
	case '+', '-':
		return 1.0, 1.1
	case '*', '/':
		return 2.0, 2.1
+  case '^':
+		return 3.1, 3.0
	default:
		return 0, 0
	}
}

Updating the Eval method to support exponentiation.

diff
 func (e *Expression) Eval() int64 {
 	if e.isLeaf {
 		return e.value
 	}
 	leftValue := e.lhs.Eval()
 	rightValue := e.rhs.Eval()
 	switch e.ope {
 	case '+':
 		return leftValue + rightValue
 	case '-':
 		return leftValue - rightValue
 	case '*':
 		return leftValue * rightValue
 	case '/':
 		if rightValue == 0 {
 			return 0
 		}
 		return leftValue / rightValue
+	case '^':
+		return int64(math.Pow(float64(leftValue), float64(rightValue)))
 	default:
 		return 0
 	}
 }
 func (e *Expression) Eval() int64 {
 	if e.isLeaf {
 		return e.value
 	}
 	leftValue := e.lhs.Eval()
 	rightValue := e.rhs.Eval()
 	switch e.ope {
 	case '+':
 		return leftValue + rightValue
 	case '-':
 		return leftValue - rightValue
 	case '*':
 		return leftValue * rightValue
 	case '/':
 		if rightValue == 0 {
 			return 0
 		}
 		return leftValue / rightValue
+	case '^':
+		return int64(math.Pow(float64(leftValue), float64(rightValue)))
 	default:
 		return 0
 	}
 }

To support identifiers and =, extend the expression type, keep a map of variable values on the parser, and give = the lowest LBP in the table. Use rightBindingPower('=') = 0.0: the RHS is parsed in a very loose context, so a = 1 + 2 keeps the whole addition on the right, and a = b = 2 chains because the inner assignment still sees leftBindingPower('=')=0.1 > 0.

diff
func getBindingPower(ope byte) (float64, float64) {
	switch ope {
+  case '=':
+		return 0.1, 0.0
	case '+', '-':
		return 1.0, 1.1
	case '*', '/':
		return 2.0, 2.1
  case '^':
		return 3.1, 3.0
	default:
		return 0, 0
	}
}
func getBindingPower(ope byte) (float64, float64) {
	switch ope {
+  case '=':
+		return 0.1, 0.0
	case '+', '-':
		return 1.0, 1.1
	case '*', '/':
		return 2.0, 2.1
  case '^':
		return 3.1, 3.0
	default:
		return 0, 0
	}
}

Variable leaves: varName marks an identifier leaf; leafVar constructs it. Numeric literals still use leafExpr, Eval will look up the name in a map.

diff
 type Expression struct {
+	varName string // non-empty => leaf variable reference
 	lhs     *Expression
 	rhs     *Expression
 	isLeaf  bool
 	ope     byte
 }

 func leafExpr(v int64) *Expression {
 	return &Expression{value: v, isLeaf: true}
 }

+func leafVar(name string) *Expression {
+	return &Expression{varName: name, isLeaf: true}
+}
+
 func nodeExpr(lhs, rhs *Expression, ope byte) *Expression {
 type Expression struct {
+	varName string // non-empty => leaf variable reference
 	lhs     *Expression
 	rhs     *Expression
 	isLeaf  bool
 	ope     byte
 }

 func leafExpr(v int64) *Expression {
 	return &Expression{value: v, isLeaf: true}
 }

+func leafVar(name string) *Expression {
+	return &Expression{varName: name, isLeaf: true}
+}
+
 func nodeExpr(lhs, rhs *Expression, ope byte) *Expression {

Threading the environment: Eval now takes vars so every subtree shares the same map. Identifier leaves read from it; the = case writes the evaluated RHS into vars under the LHS identifier and returns that value (expression-statement style).

diff
-func (e *Expression) Eval() int64 {
+func (e *Expression) Eval(vars map[string]int64) int64 {
 	if e.isLeaf {
+		if e.varName != "" {
+			return vars[e.varName]
+		}
 		return e.value
 	}
-	leftValue := e.lhs.Eval()
-	rightValue := e.rhs.Eval()
+	leftValue := e.lhs.Eval(vars)
+	rightValue := e.rhs.Eval(vars)
 	switch e.ope {

 	case '^':
 		return int64(math.Pow(float64(leftValue), float64(rightValue)))
+	case '=':
+		if e.lhs.varName == "" {
+			panic("assignment lhs must be an identifier")
+		}
+		vars[e.lhs.varName] = rightValue
+		return rightValue
 	default:
 		return 0
 	}
 }
-func (e *Expression) Eval() int64 {
+func (e *Expression) Eval(vars map[string]int64) int64 {
 	if e.isLeaf {
+		if e.varName != "" {
+			return vars[e.varName]
+		}
 		return e.value
 	}
-	leftValue := e.lhs.Eval()
-	rightValue := e.rhs.Eval()
+	leftValue := e.lhs.Eval(vars)
+	rightValue := e.rhs.Eval(vars)
 	switch e.ope {

 	case '^':
 		return int64(math.Pow(float64(leftValue), float64(rightValue)))
+	case '=':
+		if e.lhs.varName == "" {
+			panic("assignment lhs must be an identifier")
+		}
+		vars[e.lhs.varName] = rightValue
+		return rightValue
 	default:
 		return 0
 	}
 }
diff
 func parse(l Lexer, minBinding float64) *Expression {
 	var lhs *Expression
 	t := l.Peek()
 	if len(t) > 0 && t[0] == '(' {
 		l.Pop()
 		lhs = parse(l, -1)
 		l.Pop()
 	} else if t == "-" {
 		l.Pop()
 		lhs = nodeExpr(leafExpr(0), parse(l, prefixBindingPower), '-')
+	} else if len(t) > 0 && unicode.IsLetter(rune(t[0])) {
+		lhs = leafVar(l.Pop())
 	} else {
 		v, err := strconv.ParseInt(l.Pop(), 10, 64)
 		if err != nil {
 			panic(err)
 		}
 		lhs = leafExpr(v)
 	}

 	for l.More() && l.Peek()[0] != ')' {
 		opeTok := l.Peek()
+		if opeTok == "" {
+			break
+		}
 		leftBindingPower, rightBindingPower := getBindingPower(opeTok[0])
 		if leftBindingPower < minBinding {
 			break
 		}
 		l.Pop()
 		rhs := parse(l, rightBindingPower)
 		lhs = nodeExpr(lhs, rhs, opeTok[0])
 	}

 	return lhs
 }
 func parse(l Lexer, minBinding float64) *Expression {
 	var lhs *Expression
 	t := l.Peek()
 	if len(t) > 0 && t[0] == '(' {
 		l.Pop()
 		lhs = parse(l, -1)
 		l.Pop()
 	} else if t == "-" {
 		l.Pop()
 		lhs = nodeExpr(leafExpr(0), parse(l, prefixBindingPower), '-')
+	} else if len(t) > 0 && unicode.IsLetter(rune(t[0])) {
+		lhs = leafVar(l.Pop())
 	} else {
 		v, err := strconv.ParseInt(l.Pop(), 10, 64)
 		if err != nil {
 			panic(err)
 		}
 		lhs = leafExpr(v)
 	}

 	for l.More() && l.Peek()[0] != ')' {
 		opeTok := l.Peek()
+		if opeTok == "" {
+			break
+		}
 		leftBindingPower, rightBindingPower := getBindingPower(opeTok[0])
 		if leftBindingPower < minBinding {
 			break
 		}
 		l.Pop()
 		rhs := parse(l, rightBindingPower)
 		lhs = nodeExpr(lhs, rhs, opeTok[0])
 	}

 	return lhs
 }

Parser state: Vars persists across Parse calls if you reuse the same Parser, lazy make avoids nil maps.

diff
-type Parser struct{}
+type Parser struct {
+	Vars map[string]int64
+}

 func (p *Parser) Parse(s string) int {
+	if p.Vars == nil {
+		p.Vars = make(map[string]int64)
+	}
 	lex := NewLexer(s)
 	expr := parse(lex, -1)
-	return int(expr.Eval())
+	return int(expr.Eval(p.Vars))
 }
-type Parser struct{}
+type Parser struct {
+	Vars map[string]int64
+}

 func (p *Parser) Parse(s string) int {
+	if p.Vars == nil {
+		p.Vars = make(map[string]int64)
+	}
 	lex := NewLexer(s)
 	expr := parse(lex, -1)
-	return int(expr.Eval())
+	return int(expr.Eval(p.Vars))
 }

You can view the complete implementation here.

There are some extensions that you can implement to make the parser more powerful. For example:

  • Floating point support
  • Unary plus support
  • ...

Tagged:#Backend
0