## How Desmos uses Pratt Parsers

# What is Parsing?

When you read an expression, like `1/2+3.4`

, you can immediately understand some of its meaning. You recognize that there are three numbers, and that the numbers are combined with operators. You may also recall that division has higher precedence than addition, so you would divide `1/2`

before adding `+3.4`

when evaluating the expression.

Compare this to how you perceive `2H3SGKHJD`

. At first glance it appears to be a nonsense sequence of characters. If I told you that letters should be grouped in pairs with `G`

being a separator, your mental model might look closer to `2H 3S ; KH JD`

, which takes us a step towards understanding that this string represents hands in a card game.

Parsing is the process of taking a string of characters and converting them into an Abstract Syntax Tree (or, AST). This is a representation of the structure of the expression, for example:

```
OperatorNode(
operator: "+",
left: OperatorNode(
operator: "/",
left: 1,
right: 2
),
right: 3.4
)
```

or, as a diagram,

```
"+"
/ \
"/" 3.4
/ \
1 2
```

Such a tree is a first step towards computing the value of the expression, or rendering it beautifully.

So, how does one create an AST? At Desmos we use the approach described by Vaughan Pratt. This article will begin with what is hopefully a clear and concise explanation of how Pratt Parsing works. We will then explain our motivations for adopting this technique at Desmos and compare it to the jison parser generator, our previous approach.

Finally, we provide a sample implementation of the parser (and a lexer) in Typescript, integrated with CodeMirror. We hope this will be a useful reference and starting point for anyone interested in doing parsing in the browser.

# How does it work?

Our parse function will operate over a `tokens`

object. This is a sequence of tokens, like `[1, "/", 2, "+", 3.4]`

that is generated from our input through a process called lexing. We will not go into the details of lexing here, other than to point you at our sample implementation.

The `tokens`

object is a token stream, which allows us to `consume`

a token, returning the next token and advancing the stream. We can also peek a token, which gives us the next token without advancing the stream.

```
[1, "/", 2, "+", 3.4].consume() -> 1, ["/", 2, "+", 3.4]
[1, "/", 2, "+", 3.4].peek() -> 1, [1, "/", 2, "+", 3.4]
[].consume() -> empty token, []
[].peek() -> empty token, []
```

Let’s start with a recursive call and fill things out as we go along. We will present our approach in pseudocode, but you are welcome to reference the Typescript implementation as we go along.

```
function parse(tokens):
firstToken = tokens.consume()
if firstToken is a number
leftNode = NumberNode(firstToken)
otherwise
Error("Expected a number")
nextToken = tokens.consume()
if nextToken is an operator token
rightNode = parse(tokens)
return OperatorNode(
operator: nextToken,
leftNode,
rightNode
)
otherwise
return leftNode
```

We expect a number token followed by an optional operator. We then perform a recursive call to find the sub-expression to the right. So far so good – we start getting an idea of how parsing an expression like `3 * 2 + 1`

might work:

```
parse [3, "*", 2, "+", 1]
1 consume 3 as a NumberNode into leftNode
1 consume "*" to be nextToken
1 rightNode = parse [2, "+", 1]
2 consume 2 as a NumberNode into leftNode
2 consume "+" to be nextToken
2 rightNode = parse [1]
3 consume 1 as a NumberNode into leftNode
3 no next token
3 return:
1
2 combine 2, "+" and 1 into an OperatorNode
2 return:
"+"
/ \
2 1
1 combine 3, "*" and rightNode into an OperatorNode
1 return:
"*"
/ \
3 "+"
/ \
2 1
```