Haskell Parsec Unexpected End of Input
Here's an example of the file I'm trying to parse:
xx00135 abcdefghij risk solutions page no : 7 beg per: 03/17/2014 current company 03/18/2014 end per: 03/18/2014 qa process - reject report 20:28:36 batch: 123456789 contrib: 987654321 - abcde fghi-san diego quote back: 1a23b45c79 code account no typ company name beg date end date err ------ -------------------- --- -------------------- -------- -------- --- 12345 1234567890001 ab abcde fghi products 20140314 20140914 059 xx00135 abcdefghij risk solutions page no : 8 beg per: 03/17/2014 current company 03/18/2014 end per: 03/18/2014 qa process - reject report 20:28:36 batch: 234567890 contrib: 987654321 - abcde fghi-san diego quote back: 5f7a657g87 code account no typ company name beg date end date err ------ -------------------- --- -------------------- -------- -------- --- 12346 2345678901 ab abcde fghi products 20140129 20140729 059 12346 3456789012 ab abcde fghi products 20140317 20140917 059 xx00135 abcdefghij risk solutions page no : 9 beg per: 03/17/2014 current company 03/18/2014 end per: 03/18/2014 qa process - reject report 20:28:36 batch: 345678901 contrib: 987654321 - abcde fghi-san diego quote back: 6k75l8791l code account no typ company name beg date end date err ------ -------------------- --- -------------------- -------- -------- --- 12346 4567890123 ab abcde fghi products 20140317 20140917 059 12346 4567890123 ab abcde fghi products 20140317 20140917 059 number of sets rejected : 13 total sets in batch: 16,940 *** end of report ***
And here is a collection of snippets from the module:

    module XX00135 (parseFile) where

    import Control.Applicative ((<$>), (<*>), (<*))
    import Text.ParserCombinators.Parsec hiding (Line)

    data Line = Line
      { code    :: String
      , account :: String
      , aType   :: String
      , company :: String
      , begDate :: String
      , endDate :: String
      , errCode :: String
      }

    data Page = Page
      { periodBeginning :: String
      , periodEnd       :: String
      , reportDate      :: String
      , batch           :: String
      , contrib         :: String
      , quoteBack       :: String
      , lineList        :: [Line]
      }

    data Report = Report { pages :: [Page] }

    parseReportDate :: Parser String
    parseReportDate = manyTill anyChar (string "current company") >> spaces >> count 10 anyChar

    headers :: Parser String
    headers = choice
      [ try (string "\n")
      , try (string "code account no typ company name beg date end date err")
      , try (string "------ -------------------- --- -------------------- -------- -------- ---")
      ]

    line :: Parser Line
    line = Line <$> count 6 anyChar  <* space
                <*> count 20 anyChar <* space
                <*> count 3 anyChar  <* space
                <*> count 20 anyChar <* space
                <*> count 8 anyChar  <* space
                <*> count 8 anyChar  <* space
                <*> count 3 anyChar  <* newline

    page :: Parser Page
    page = Page <$> (manyTill anyChar (string "beg per:") >> space >> count 10 anyChar)
                <*> parseReportDate
                <*> (manyTill anyChar (string "end per:") >> space >> count 10 anyChar)
                <*> (manyTill anyChar (string "batch:") >> space >> count 9 anyChar)
                <*> (space >> string "contrib:" >> space >> count 9 anyChar)
                <*> (manyTill anyChar (string "quote back:") >> space >> count 10 anyChar <* skipMany1 headers)
                <*> (manyTill line (twoNewlines <|> footer))

    report :: Parser Report
    report = Report <$> manyTill page (try footer)

    twoNewlines :: Parser ()
    twoNewlines = (count 2 newline) >> return ()

    footer :: Parser ()
    footer = (space >> string "number of sets rejected" >> manyTill anyChar (string "*** end of report ***") >> optional eof) >> return ()

    parseFile :: [(String, String)] -> String -> String
    parseFile errors text = let rs = case parse (manyTill report eof) "" text of ...
There are 115 lines in the full file. When I cat the file and pipe it to Haskell, I get:

    (line 116, column 1):
    unexpected end of input
    expecting "beg per:"
I had this working by ignoring the footer and everything that followed it. The full use case is to cat multiple files and pipe them to Haskell, meaning I cannot throw away the footer and everything that follows it. Once I started trying to skip just the footer instead of throwing everything away, the problems began. It's probably something simple, and I'm just confused and overlooking the obvious.
Let me know if you need more code. There are a few transformations after parsing, and I didn't want to clutter the code with unnecessary detail.
Thanks!
I've resolved the problem. The code is a little different, and I'm not sure exactly what solved it. I spent a lot of time staring at the code and making little changes here and there. I think, though, that it had to do with cat appending a newline to the file. I changed footer to:

    footer = space >> string "number of sets rejected" >> anyChar `manyTill` (string "*** end of report ***") >> newline >> string ""

Now footer consumes the newline at the end of the file and returns a String. I use footer in eop (end of page):

    eop = choice [ count 2 newline, footer ]

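A note on choice, as a minimal sketch: Parsec only moves on to the next alternative when the previous one fails without consuming input. Whether that matters for eop depends on the exact layout of the file, which is not fully visible here. The parsers blankLine and footerLine below are hypothetical stand-ins, not the parsers from the post.

```haskell
import Text.ParserCombinators.Parsec

-- Hypothetical end-of-page markers: a blank line, or a one-line footer.
blankLine :: Parser String
blankLine = count 2 newline

footerLine :: Parser String
footerLine = newline >> string "footer"

-- Without try: if blankLine reads one newline and then fails, that newline
-- stays consumed and choice does not fall through to footerLine.
eopNoTry :: Parser String
eopNoTry = choice [blankLine, footerLine]

-- With try: a failed alternative rewinds the input before the next is tried.
eopWithTry :: Parser String
eopWithTry = choice [try blankLine, try footerLine]
```

On the input "\nfooter", eopNoTry fails while eopWithTry succeeds; on "\n\n" both succeed.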
And I use eop in the last line of page:

    <*> line `manyTill` eop

report is now:

    report = count 2 newline >> Report <$> many page

I also changed page. I think it was consuming anyChar in unexpected ways. I now throw away the first line of each page:

    page = firstLine >> Page <$> (string "beg per:" >> space >> count 10 anyChar) ...

    firstLine = string "xx00135 abcdefghij risk solutions page no :" >> spaces >> many digit >> newline

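To illustrate what "consuming anyChar in unexpected ways" can look like, here is a small sketch. The helpers skipTo and expectHere and the "key:" marker are made up for this example and are not part of the original module. Scanning forward with manyTill anyChar silently skips whatever precedes the marker, while matching the marker directly fails at once if it is not the next thing in the input.

```haskell
import Text.ParserCombinators.Parsec

-- Skip ahead to a marker, discarding whatever comes before it,
-- then read the three characters that follow.
skipTo :: String -> Parser String
skipTo marker = manyTill anyChar (try (string marker)) >> count 3 anyChar

-- Require the marker right here; fail without consuming input otherwise.
expectHere :: String -> Parser String
expectHere marker = try (string marker) >> count 3 anyChar
```

With the input "footer\nkey:abc", skipTo "key:" reads past the footer text and returns "abc", whereas expectHere "key:" fails at the start of the input. With the input "footer\n", skipTo "key:" runs all the way to the end of the input before failing, which is the same kind of "unexpected end of input" error as in the question.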
I think that covers the important changes I made to get the parse to succeed. It now parses a single file piped through the cat command, as well as multiple files concatenated with the cat command. Yay! I love Haskell.
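For anyone hitting the same error, here is a reduced sketch of what I believe was going on. It is a guess at the mechanism, not the real module. If the footer parser stops right after the end-of-report marker, the newline that ends the file is still unread, so eof does not match, the outer manyTill starts another report, and that report runs off the end of the input. All names below (strictFooter, lenientFooter, reportWith, reportsWith, sample) are hypothetical.

```haskell
import Text.ParserCombinators.Parsec

footerMarker :: String
footerMarker = "*** end of report ***"

-- Footer that stops right after the marker, leaving a trailing newline unread.
strictFooter :: Parser ()
strictFooter = string footerMarker >> return ()

-- Footer that also swallows trailing newlines, so eof can match afterwards.
lenientFooter :: Parser ()
lenientFooter = string footerMarker >> skipMany newline

-- One "report": every character up to the footer (the footer is consumed too).
reportWith :: Parser () -> Parser String
reportWith footer = manyTill anyChar (try footer)

-- One or more concatenated reports, followed by the end of the input.
reportsWith :: Parser () -> Parser [String]
reportsWith footer = manyTill (reportWith footer) eof

-- Two tiny "files" concatenated, each ending in a newline.
sample :: String
sample = "body one\n*** end of report ***\nbody two\n*** end of report ***\n"
```

Running parse (reportsWith strictFooter) "" sample fails with an unexpected end of input error on the line after the last footer, just like the error in the question. Running parse (reportsWith lenientFooter) "" sample returns both report bodies.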