Tuesday, 15 July 2014

haskell - Constructing a tuple of length n from a sequence of parser combinators -



haskell - Constructing a tuple of length n from a sequence of parser combinators -

question. there way define operator such sequence of values separated operator yields flat tuple?

i have struggled find concise phrasing of question, read on details , examples...

description

i'm writing myself few helper operators parsec, starting following:

(<@) :: genparser tok st -> (a -> b) -> genparser tok st b p <@ f = { x <- p ; homecoming (f x) } (<>) :: genparser tok st -> genparser tok st b -> genparser tok st (a, b) p <> q = { x <- p ; y <- q ; homecoming (x, y) }

these work follows:

parsetest ((many upper) <@ length) "hello" 5 parsetest (many upper) <> (many lower) "abcdef" ("abc", "def")

unfortunately, sequence of parsers separated <> result in nested tuple, e.g.:

parsetest (subject <> verb <> object) "edmund learns haskell" (("edmund", "learns"), "haskell")

instead of relatively more cromulent:

("edmund", "learns", "haskell")

i'm looking way define <> that

p1 :: genparser tok st ; p2 :: genparser tok st b ; p3 :: genparser tok st c p1 <> p2 :: genparser tok st (a, b) p1 <> p2 <> p3:: genparser tok st (a, b, c) ...

i not think have ever seen haskell programme tuple type of length n (known @ compile time) constructed this. , suspect may hard define operator both types:

genparser tok st -> genparser tok st b) -> genparser tok st (a, b) genparser tok st (a, b) -> genparser tok st c) -> genparser tok st (a, b, c)

-- how 1 tell, @ compile time, difference between tuple resulting <>, , 1 intended homecoming type other parser? can speculate additional syntax required.

so, not @ sure it's thought or possible. curious know how if not thought case (and love know how if it's impossible!).

followup question (if crazy scheme possible). how 1 annotate 1 item in chain of <>s left out of result tuple?

for example, assuming postfix annotation <#:

p1 :: genparser tok st p2 :: genparser tok st b p1 <> keyword "is" <# <> p2 :: genparser tok st (a, b) background

circa 2006 learned parser combinators @ university. used library in <@, , believe <> operator, nowadays , performed attempts. not know library was; may have been written graduate pupil teaching our class. in case, not seem either parsec nor base of operations parser combinators in `text.parser.combinators.

bonus question. difference between base of operations parser combinators in text.parsercombinators.readp/readprec , ones in parsec?

i seem recall library nondeterministic, each parser invocation returning set of possible parses , remaining unparsed input each. (a successful, complete, unambiguous parse result in [(parseresult, "")].)

final question. if sounds you've heard of, allow me know (for nostalgia's sake) ?

functor

i’d direct attending something:

(<@) :: genparser tok st -> (a -> b) -> genparser tok st b flip fmap :: (functor f) => f -> (a -> b) -> f b

do notice similarity? if replace f genparser tok st in type of flip fmap, type of (<@). furthermore, not theoretical: genparser tok st instance of functor. fmap available in operator form, named <$>. so, can rewrite code in 2 ways (original first):

ghci> parsetest ((many upper) <@ length) "hello" 5 ghci> parsetest (fmap length (many upper)) "hello" 5 ghci> parsetest (length <$> (many upper)) "hello" 5 applicative

functors nice, aren’t powerful plenty subsume sec example. fortunately, there typeclass is: applicative. now, applicative not have function form monadic action yielding pair out of 2 monadic actions, does provide helpful building blocks. in particular, provides <*>:

(<*>) :: f (a -> b) -> f -> f b

it turns out can compose <$> rewrite sec example:

ghci> parsetest (many upper) <> (many lower) "abcdef" ("abc", "def") ghci> parsetest ((,) <$> many upper <*> many lower) "abcdef" ("abc", "def")

if you’re not familiar syntax, (,) function creates pair; has type a -> b -> (a, b).

but applicative not stop there. looking way flatten tuple; rather creating nested tuples , flattening them out, can utilize applicative create triple directly:

ghci> parsetest ((,,) <$> subject <*> verb <*> object) "edmund learns haskell" ("edmund", "learns", "haskell")

and happens applicative has another operator help lastly request, too: <* , *>, ignore result of sec or result of first operand, respectively. if want ignore verb:

ghci> parsetest ((,) <$> subject <* verb <*> object) "edmund learns haskell" ("edmund", "haskell") bonus question

if recall correctly, readp allows backtracking @ each step; parsec, on other hand, not allow backtracking unless straight or indirectly annotate in parser using try combinator or other combinator uses combinator. such, parsec parsers not backtrack much, , can have improve worst-case performance.

appendix: implementing <>…almost (not faint of heart)

i don’t know of way implement exact <>, or create work all sizes of tuples, if you’re willing enable few haskell extensions, can eliminate need counting, working arbitrary fixed size of tuples. first, we’ll want have type 1-tuple:

newtype 1 = 1 deriving (eq, read, show) -- derive more if want

now, want function these types:

(<>) :: (applicative f) => f () -> f -> f (one a) (<>) :: (applicative f) => f (one a) -> f b -> f (a, b) (<>) :: (applicative f) => f (a, b) -> f c -> f (a, b, c) (<>) :: (applicative f) => f (a, b, c) -> f d -> f (a, b, c, d) -- ...

how have function multiple types in haskell? typeclasses, of course! ordinary typeclass won’t it: need functional dependencies. also, can’t think of appropriate name, i’ll phone call c. (if can think of improve name, tell me in comments , i’ll edit.)

{-# language functionaldependencies #-} class c b c | b -> c, c -> b (<>) :: (applicative f) => f -> f b -> f c

then, implement our instances, need flexibleinstances:

{-# language flexibleinstances #-} instance c () (one a) (<>) _ = fmap 1 instance c (one a) b (a, b) (<>) = lifta2 $ \(one a) b -> (a, b) instance c (a, b) c (a, b, c) (<>) = lifta2 $ \(a, b) c -> (a, b, c) instance c (a, b, c) d (a, b, c, d) (<>) = lifta2 $ \(a, b, c) d -> (a, b, c, d) -- ...

now can write parsers this:

parsetest (return () <> subject <> verb <> object) "edmund learns haskell" ("edmund", "learns", "haskell")

we did have write return () <> before it, undesirable. retain prior implementation of <> , rename our new implementation <+>, , write parsers this:

parsetest (subject <+> verb <> object) "edmund learns haskell" ("edmund", "learns", "haskell")

(<+> used bring together parsers after first two) there might way create single operator using overlappinginstances, obvious solution violates functional dependencies.

the question needs asked

if you’re considering using latter approach, advise not to. first, if you’re going utilize tuples, it’s not hard count number of items want parse. second, won’t using tuples in first place, , approach works if trying tuples. makes more sense tuple? well, ast node. example, if writing parser programming language, might have type this:

data look = ... | ifexpression look expression look | ... deriving (...)

then, parse it, can utilize applicative:

parseifexpression = ifexpression <$> keyword "if" *> look <* keyword "then" <*> look <* keyword "else" <*> look

here, don’t need count number of items; it’s encapsulated in type of ifexpression type constructor. since you’re going parsing asts, tuples going irrelevant, , such complex solution seems unjustified, particularly since alternative when have utilize tuples uncomplicated (counting number of items , inserting appropriate tuple constructor).

haskell parser-combinators

No comments:

Post a Comment