* Replace Parsec with Megaparsec (see #289)
This builds upon PR #289 by @rasendubi
* Revert renaming of parseWithState to parseWithCtx
* Fix doctests
* Update for Megaparsec 5
* Specialize parser to improve performance
* Pretty print errors
* Swap StateT and ParsecT
This is necessary to get the correct backtracking behavior, i.e. discard
state changes if the parsing fails.
With --debug=2, better information about assertions is printed.
Balance assertion errors now have a more standard and parseable layout.
The asserted balance is now shown with the diff, let's see if that's better.
There is a limitation/bug: disabling real mode in the transaction screen
won't show the non-real postings if it was entered from a real-mode
register screen.
The journal/timeclock/timedot parsers, instead of constructing (opaque)
journal update functions which are later applied to build the journal,
now construct the journal directly (by modifying the parser state). This
is easier to understand and debug. It also removes any possibility of
the journal updates being a space leak. (They weren't, in fact memory
usage is now slightly higher, but that will be addressed in other ways.)
Also:
Journal data and journal parse info have been merged into one type (for
now), and field names are more consistent.
The ParsedJournal type alias has been added to distinguish being-parsed
and finalised journals.
Journal is now a monoid.
stats: fixed an issue with ordering of include files
journal: fixed an issue with ordering of included same-date transactions
timeclock: sessions can no longer span file boundaries (unclocked-out
sessions will be auto-closed at the end of the file).
expandPath now throws a proper IO error (and requires the IO monad).
The commodity directive's format subdirective can now be used to
override the inferred style for a commodity, eg to increase or decrease
the precision. This doesn't fix the root cause of #295 but is at least a
good workaround.
Bracketed posting dates were fragile; they worked only if you wrote full
10-character dates. Also some semantics were a bit unclear. Now they
should be robust, and have been documented more clearly. This is a
legacy undocumented Ledger syntax, but it improves compatibility and
might be preferable to the more verbose "date:" tags if you write
posting dates often (as I do).
Internally, bracketed posting dates are no longer considered to be tags.
Journal comment, tag, and posting date parsers have been reworked, all
with doctests. Also the journal parser types generally have been
tightened up and clarified, making it much easier to know how to combine
and run them. There's now
-- | A parser of strings with generic user state, monad and return type.
type StringParser u m a = ParsecT String u m a
-- | A string parser with journal-parsing state.
type JournalParser m a = StringParser JournalContext m a
-- | A journal parser that runs in IO and can throw an error mid-parse.
type ErroringJournalParser a = JournalParser (ExceptT String IO) a
and corresponding convenience functions (and short aliases) for running them.
We now parse account directives, like Ledger's. We don't do anything
with them yet. The default parent account feature must now be spelled
"apply account"/"end apply account".
Amount display style canonicalisation code and terminology has been
clarified a bit. Individual amounts still have styles; from these we
derive the standard "commodity styles". In user docs, we might call
these "commodity formats" since a Ledger-compatible commodity directive
would use the "format" keyword.
Since market price amounts didn't contribute to the canonical commodity
styles, they were being reset to the null style. And this propagated to
the reported amounts when -V was in effect, causing much confusion.
Now, market prices contribute to canonicalisation and the expected
styles are preserved even with -V.
cf https://github.com/simonmichael/hledger/issues/131#issuecomment-133545140
print now always right-aligns the amounts in an entry, even when they
are wider than 12 characters.
If there is a price, it's considered part of the amount for
right-alignment. Maybe it would be nicer to put amounts and prices in
separate columns ? That will get a little complicated, needs more
discussion/design.
Also some cleanup of postingAsLines.
The print command wasn't lining up amounts with wide chars in account
names, fixed it properly this time. Transaction and Posting's Show instances
should also be wide-char-aware now.
Simple (non-multicolumn) balance reports containing wide characters
should now align correctly (in apps and fonts that show wide chars as
double width). Likewise, the print command.
Wide characters, eg chinese/japanese/korean characters, are typically
rendered wider than latin characters. In some applications (eg gnome
terminal or osx terminal) and fonts (eg monaco) they are exactly double
width. This is a start at making hledger aware of this. A register
report containing wide characters (in descriptions, account names, or
commodity symbols) should now align its columns correctly, when viewed
with a suitable font and application.
This adds a accountNameApplyAliasesMemo, which memoises the result of
applying a set of aliases (simple and regex) to an account name. In
theory this should reduce more repetitive work, but in practice it
doesn't seem to make a difference, so it's unused for now.
Roughly speaking, the time to apply regular expression account aliases
was O(aliases x transactions), and should now be O(aliases x accounts).
Also, the constant factor was reduced a lot by the recent commit
memoising toRegex. So now, regex aliases should be "free" like simple
aliases - use as many as you want, the slowdown shouldn't be noticeable.