Parser
This package implements a basic parser combinator library for Roc, for transforming input into a more useful structure.
Example
For example, say we wanted to parse the following string from in to out:

    in = "Game 1: 3 blue, 4 red; 1 red, 2 green, 6 blue; 2 green"
    out = {
        id: 1,
        requirements: [
            [Blue 3, Red 4],
            [Red 1, Green 2, Blue 6],
            [Green 2],
        ]
    }
We could do this using the following:
    Requirement : [
        Green U64,
        Red U64,
        Blue U64
    ]
    RequirementSet : List Requirement
    Game : { id: U64, requirements: List RequirementSet }

    parseGame : Str -> Result Game [ParsingError]
    parseGame = \s ->
        green = const Green |> keep digits |> skip (string " green")
        red = const Red |> keep digits |> skip (string " red")
        blue = const Blue |> keep digits |> skip (string " blue")

        requirementSet : Parser _ RequirementSet
        requirementSet = (oneOf [green, red, blue]) |> sepBy (string ", ")

        requirements : Parser _ (List RequirementSet)
        requirements = requirementSet |> sepBy (string "; ")

        game : Parser _ Game
        game =
            const (\id -> \r -> { id, requirements: r })
            |> skip (string "Game ")
            |> keep digits
            |> skip (string ": ")
            |> keep requirements

        when parseStr game s is
            Ok g -> Ok g
            Err (ParsingFailure _) | Err (ParsingIncomplete _) -> Err ParsingError
Parser input a
Opaque type for a parser that will try to parse an a from an input.
As such, a parser can be considered a recipe for a function of the type
input -> Result {val: a, input: input} [ParsingFailure Str]
How a parser is actually implemented internally is not important, and this might change between versions, for instance to improve efficiency or error messages on parsing failures.
ParseResult input a
ParseResult input a : Result { val : a, input : input } [ParsingFailure Str]
buildPrimitiveParser : (input -> ParseResult input a) -> Parser input a
Write a custom parser without using the provided combinators.
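As a rough illustration, a primitive parser that consumes a single byte could look like the sketch below. The name anyByte and its error message are hypothetical and not part of this package; the sketch only relies on the buildPrimitiveParser signature and the ParseResult shape listed above, and assumes the package's modules are imported as in the other examples.

```roc
# Hypothetical example: `anyByte` is not provided by this package.
anyByte : Parser (List U8) U8
anyByte =
    buildPrimitiveParser \input ->
        when input is
            # Succeed with the first byte and hand the rest to the next parser.
            [first, .. as rest] -> Ok { val: first, input: rest }
            # Nothing left to consume: report a parsing failure.
            [] -> Err (ParsingFailure "expected a byte, but the input was empty")
```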
parsePartial : Parser input a, input -> ParseResult input a
Most general way of running a parser.
Can be thought of as turning the recipe of a parser into its actual parsing function and running this function on the given input.
Most parsers consume part of the input when they succeed. This allows you to string parsers together that run one after the other. The part of the input that the first parser did not consume is used by the next parser.
This is why a parser returns on success both the resulting value and the leftover part of the input.
This is mostly useful when creating your own internal parsing building blocks.
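A minimal sketch of what a partial run might look like, assuming the string parser works on List U8 input and yields the matched text as a Str (as its use elsewhere in this document suggests). The leftover input is returned next to the parsed value:

```roc
expect
    input = Str.toUtf8 "Game 1"

    when parsePartial (string "Game ") input is
        # The parser consumed "Game " and left "1" for whatever runs next.
        Ok { val, input: rest } -> val == "Game " && rest == Str.toUtf8 "1"
        Err _ -> Bool.false
```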
parse : Parser input a, input, (input -> Bool) -> Result a [ ParsingFailure Str, ParsingIncomplete input ]
Runs a parser on the given input, expecting it to fully consume the input.
The input -> Bool parameter is used to check whether parsing has 'completed', i.e. how to determine if all of the input has been consumed.
For most input types, a parsing run that leaves some unparsed input behind should be considered an error.
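For List U8 input, one plausible completion check is List.isEmpty. The sketch below assumes digits parses a run of decimal digits into a U64 and stops at the first non-digit, as the other examples in this document imply:

```roc
# All input is consumed, so the parsed value is returned.
expect
    input = Str.toUtf8 "123"
    parse digits input List.isEmpty == Ok 123

# Input is left over, so the run is reported as incomplete.
expect
    input = Str.toUtf8 "123abc"
    parse digits input List.isEmpty == Err (ParsingIncomplete (Str.toUtf8 "abc"))
```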
fail : Str -> Parser * *
Parser that can never succeed, regardless of the given input. It will always fail with the given error message.
This is mostly useful as a 'base case' if all other parsers in a oneOf or alt have failed, to provide some more descriptive error message.
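A sketch of that 'base case' pattern, with fail placed last in a oneOf. The parser name colorOrError and the message text are made up for illustration; how the message is combined with the errors of the earlier alternatives is not specified here, so no exact error output is shown.

```roc
# Hypothetical example: the last alternative only runs when all others failed.
colorOrError : Parser (List U8) [Red, Green, Blue]
colorOrError =
    oneOf [
        const Red |> skip (string "red"),
        const Green |> skip (string "green"),
        const Blue |> skip (string "blue"),
        fail "expected one of: red, green, blue",
    ]

expect parseStr colorOrError "blue" == Ok Blue
expect parseStr colorOrError "pink" |> Result.isErr
```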
const : a -> Parser * a
Parser that will always produce the given a, without looking at the actual input.
This is useful as a basic building block, especially in combination with map and apply.

    parseU32 : Parser (List U8) U32
    parseU32 =
        const Num.toU32
        |> keep digits

    expect parseStr parseU32 "123" == Ok 123u32
alt : Parser input a, Parser input a -> Parser input a
Try the first parser and (only) if it fails, try the second parser as fallback.
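As a small sketch, alt can combine two alternatives into one parser. The names yes, no and bool are hypothetical helpers introduced only for this example:

```roc
# Hypothetical helpers: each one recognises a keyword and produces a Bool.
yes = const Bool.true |> skip (string "true")
no = const Bool.false |> skip (string "false")

bool : Parser (List U8) Bool
bool = yes |> alt no

# `yes` fails on this input, so `no` is tried as the fallback.
expect parseStr bool "false" == Ok Bool.false
```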
apply : Parser input (a -> b), Parser input a -> Parser input b
Runs a parser building a function, then a parser building a value, and finally returns the result of calling the function with the value.
This is useful if you are building up a structure that requires more parameters than there are variants of map, map2, map3 etc. for.
For instance, the following two are the same:
    map3 String.digits String.digits String.digits (\x, y, z -> Triple x y z)

    const (\x -> \y -> \z -> Triple x y z)
    |> apply String.digits
    |> apply String.digits
    |> apply String.digits
Indeed, this is how map, map2, map3 etc. are implemented under the hood.
Currying
Be aware that when using apply, you need to explicitly 'curry' the parameters to the construction function.
This means that instead of writing \x, y, z -> ... you'll need to write \x -> \y -> \z -> ... .
This is because the parameters of the function will be applied one by one as parsing continues.
oneOf : List (Parser input a) -> Parser input a
Try a list of parsers in turn, until one of them succeeds.
    color : Parser Utf8 [Red, Green, Blue]
    color =
        oneOf [
            const Red |> skip (string "red"),
            const Green |> skip (string "green"),
            const Blue |> skip (string "blue"),
        ]

    expect parseStr color "green" == Ok Green
map : Parser input a, (a -> b) -> Parser input b
Transforms the result of parsing into something else, using the given transformation function.
map2 : Parser input a, Parser input b, (a, b -> c) -> Parser input c
Transforms the result of parsing into something else, using the given two-parameter transformation function.
map3 : Parser input a, Parser input b, Parser input c, (a, b, c -> d) -> Parser input d
Transforms the result of parsing into something else, using the given three-parameter transformation function.
If you need transformations with more inputs, take a look at apply.
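A short sketch of map and map2 following the signatures listed above. The parser names doubled and sum are hypothetical, and digits is assumed to yield a U64 as in the other examples:

```roc
# map: transform the single parsed value.
doubled : Parser (List U8) U64
doubled = digits |> map (\n -> n * 2)

expect parseStr doubled "21" == Ok 42

# map2: run two parsers in sequence and combine both results.
sum : Parser (List U8) U64
sum = map2 (digits |> skip (string "+")) digits (\a, b -> a + b)

expect parseStr sum "1+2" == Ok 3
```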
flatten : Parser input (Result a Str) -> Parser input a
Removes a layer of Result from running the parser.
Use this to map functions that return a result over the parser, where errors are turned into ParsingFailures.

    # Parse a number from a List U8
    u64 : Parser Utf8 U64
    u64 =
        string
        |> map \val ->
            when Str.toU64 val is
                Ok num -> Ok num
                Err _ -> Err "$(val) is not a U64."
        |> flatten
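Another sketch of the same idea, this time narrowing a parsed U64 to a U8. The name smallNumber and the error text are hypothetical; Num.toU8Checked comes from Roc's builtins, and digits is assumed to yield a U64.

```roc
# Hypothetical example: fail the parse when the number does not fit in a U8.
smallNumber : Parser (List U8) U8
smallNumber =
    digits
    |> map \n ->
        when Num.toU8Checked n is
            Ok small -> Ok small
            Err _ -> Err "$(Num.toStr n) does not fit in a U8."
    |> flatten

expect parseStr smallNumber "200" == Ok 200
expect parseStr smallNumber "300" |> Result.isErr
```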
lazy : ({} -> Parser input a) -> Parser input a
Runs a parser lazily.
This is (only) useful when dealing with a recursive structure.
For instance, consider a type Comment : { message: Str, responses: List Comment }.
Without lazy, you would ask the compiler to build an infinitely deep parser, resulting in a compiler error.
maybe : Parser input a -> Parser input (Result a [Nothing])
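Judging by its type, maybe wraps the result of an optional parser in a Result instead of failing. A sketch under that assumption, with a hypothetical parser named sign:

```roc
# Hypothetical example: an optional leading minus sign.
sign : Parser (List U8) (Result Str [Nothing])
sign = maybe (string "-")

# Assumed behaviour: a match is wrapped in Ok, a miss becomes Err Nothing.
expect parseStr sign "-" == Ok (Ok "-")
expect parseStr sign "" == Ok (Err Nothing)
```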
many : Parser input a -> Parser input (List a)
A parser which runs the element parser zero or more times on the input, returning a list containing all the parsed elements.
Also see Parser.oneOrMore.
oneOrMore : Parser input a -> Parser input (List a)
A parser which runs the element parser one or more times on the input, returning a list containing all the parsed elements.
Also see Parser.many.
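The difference between the two shows up on input where the element parser never matches. A small sketch, with the hypothetical names zeroOrMoreAb and atLeastOneAb:

```roc
zeroOrMoreAb : Parser (List U8) (List Str)
zeroOrMoreAb = many (string "ab")

expect parseStr zeroOrMoreAb "ababab" == Ok ["ab", "ab", "ab"]
# Zero repetitions are acceptable for `many`.
expect parseStr zeroOrMoreAb "" == Ok []

atLeastOneAb : Parser (List U8) (List Str)
atLeastOneAb = oneOrMore (string "ab")

# `oneOrMore` needs at least one match, so empty input is an error.
expect parseStr atLeastOneAb "" |> Result.isErr
```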
between : Parser input a, Parser input open, Parser input close -> Parser input a
Runs a parser for an 'opening' delimiter, then your main parser, then the 'closing' delimiter, and only returns the result of your main parser.
Useful to recognize structures surrounded by delimiters (like braces, parentheses, quotes, etc.)
betweenBraces = \parser -> parser |> between (scalar '[') (scalar ']')
sepBy1 : Parser input a, Parser input sep -> Parser input (List a)
sepBy : Parser input a, Parser input sep -> Parser input (List a)
    parseNumbers : Parser (List U8) (List U64)
    parseNumbers =
        digits |> sepBy (codeunit ',')

    expect parseStr parseNumbers "1,2,3" == Ok [1,2,3]
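The name sepBy1 suggests a variant that requires at least one element. The sketch below rests on that assumption, which this document does not spell out; parseNumbers1 is a hypothetical name.

```roc
# Assumption: `sepBy1` behaves like `sepBy` but rejects an empty list.
parseNumbers1 : Parser (List U8) (List U64)
parseNumbers1 =
    digits |> sepBy1 (codeunit ',')

expect parseStr parseNumbers1 "1,2,3" == Ok [1, 2, 3]
expect parseStr parseNumbers1 "" |> Result.isErr
```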
ignore : Parser input a -> Parser input {}
keep : Parser input (a -> b), Parser input a -> Parser input b
skip : Parser input a, Parser input * -> Parser input a
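As used in the examples throughout this document, keep feeds a parsed value into the function being built up, while skip runs a parser and discards its result. A sketch in the same style, with a hypothetical point parser:

```roc
# Hypothetical example: parse "(3,4)" into a record, dropping the punctuation.
point : Parser (List U8) { x : U64, y : U64 }
point =
    const (\x -> \y -> { x, y })
    |> skip (string "(")
    |> keep digits
    |> skip (string ",")
    |> keep digits
    |> skip (string ")")

expect parseStr point "(3,4)" == Ok { x: 3, y: 4 }
```

The construction function is curried, one parameter per keep, for the reason given in the Currying note above.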
chompUntil : a -> Parser (List a) (List a) where a implements Eq
Match zero or more codeunits until it reaches the given codeunit. The given codeunit is not included in the match.
This can be used with Parser.skip to ignore text.

    ignoreText : Parser (List U8) U64
    ignoreText =
        const (\d -> d)
        |> skip (chompUntil ':')
        |> skip (codeunit ':')
        |> keep digits

    expect parseStr ignoreText "ignore preceding text:123" == Ok 123

This can be used with Parser.keep to capture a list of U8 codeunits.

    captureText : Parser (List U8) (List U8)
    captureText =
        const (\codeunits -> codeunits)
        |> keep (chompUntil ':')
        |> skip (codeunit ':')

    expect parseStr captureText "Roc:" == Ok ['R', 'o', 'c']

Use String.strFromUtf8 to turn the results into a Str.
Also see Parser.chompWhile.
chompWhile : (a -> Bool) -> Parser (List a) (List a) where a implements Eq
Match zero or more codeunits until the check returns false. The codeunit that returned false is not included in the match. Note: a chompWhile parser always succeeds!
This can be used with Parser.skip to ignore text.
This is useful for chomping whitespace or variable names.

    ignoreNumbers : Parser (List U8) Str
    ignoreNumbers =
        const (\str -> str)
        |> skip (chompWhile \b -> b >= '0' && b <= '9')
        |> keep (string "TEXT")

    expect parseStr ignoreNumbers "0123456789876543210TEXT" == Ok "TEXT"

This can be used with Parser.keep to capture a list of U8 codeunits.

    captureNumbers : Parser (List U8) (List U8)
    captureNumbers =
        const (\codeunits -> codeunits)
        |> keep (chompWhile \b -> b >= '0' && b <= '9')
        |> skip (string "TEXT")

    expect parseStr captureNumbers "123TEXT" == Ok ['1', '2', '3']

Use String.strFromUtf8 to turn the results into a Str.
Also see Parser.chompUntil.