* csv rules: Show prettier parsing errors
This goes from
hledger: user error ("ParseError {errorPos = SourcePos {sourceName = \"foo.csv.rules\",
sourceLine = Pos 20, sourceColumn = Pos 1} :| [], errorUnexpected =
fromList [Tokens (' ' :| \"\")], errorExpected = fromList [Label ('b' :| \"lank or comment
line\"),EndOfInput], errorCustom = fromList []}")
to
hledger: user error (foo.csv.rules:20:1:
unexpected space
expecting blank or comment line or end of input
)
* csv rules: Fix parsing of empty field values
A single line containing `account1 ` (note the space at the end) should
parse as assignment of the empty string to account1. At least it did
until commit 4141067.
The problem is that megaparsec's `space` parses multiple space
characters as opposed to parsec. So in the example above it would
incorrectly consume the newline.
This commit also adds a new test case for this bug.
Transactions are now numbered consistently during journal finalisation,
rather than just in the journal reader. Also transaction knot-tying has been
moved out of journalBalanceTransactions.
* Replace Parsec with Megaparsec (see #289)
This builds upon PR #289 by @rasendubi
* Revert renaming of parseWithState to parseWithCtx
* Fix doctests
* Update for Megaparsec 5
* Specialize parser to improve performance
* Pretty print errors
* Swap StateT and ParsecT
This is necessary to get the correct backtracking behavior, i.e. discard
state changes if the parsing fails.
The journal/timeclock/timedot parsers, instead of constructing (opaque)
journal update functions which are later applied to build the journal,
now construct the journal directly (by modifying the parser state). This
is easier to understand and debug. It also removes any possibility of
the journal updates being a space leak. (They weren't, in fact memory
usage is now slightly higher, but that will be addressed in other ways.)
Also:
Journal data and journal parse info have been merged into one type (for
now), and field names are more consistent.
The ParsedJournal type alias has been added to distinguish being-parsed
and finalised journals.
Journal is now a monoid.
stats: fixed an issue with ordering of include files
journal: fixed an issue with ordering of included same-date transactions
timeclock: sessions can no longer span file boundaries (unclocked-out
sessions will be auto-closed at the end of the file).
expandPath now throws a proper IO error (and requires the IO monad).
journal files can now include journal, timeclock or timedot files (but
not yet CSV files). Also timeclock/timedot files no longer support
default year directives.
The Hledger.Read.* modules have been reorganised for better reuse.
Hledger.Read.Utils has been renamed Hledger.Read.Common and holds
low-level parsers & utilities; high-level read utilities have moved to
Hledger.Read.
A transaction/posting status of ! (pending) was effectively equivalent
to * (cleared). Now it's a separate state, not matched by --cleared.
The new Ledger-compatible --pending flag matches it, and so does
--uncleared. The equivalent search queries are now status:*, status:!
and status: (the old status:1 and status:0 spellings are deprecated).
Since we interpret --uncleared and status: as "any state except cleared",
it's not currently possible to match things which are neither cleared
nor pending.
If the CSV records appear to have been in reverse date order,
we'll now reverse them all before also sorting by transaction date,
so that the original order of same-day transactions is preserved.
We detect this using a simple heuristic: if the first converted
transaction's date is later than the last's.
Can be helpful when reading Ledger files, where assertions may have
different semantics; or for getting some answers from your journal
to help you fix your assertions.
Could be called --no-assertions, but this might create surprise when it
has an effect contrary to --no-new-accounts.
I had to add another flag throughout the parsers & journal read
functions, ok for now.
- The CSV reader no longer writes a "(stdin).rules" file when reading
from stdin.
- Selection of reader(s) is now smarter when input is coming from stdin.
Previously, all readers were considered applicable for stdin. This
meant that when reading a CSV file from stdin, the journal and timelog
readers were always tried first, and if the CSV file was unparseable,
you'd see the first (journal) reader's error instead of the CSV
reader's. Now, the readers do some basic content sniffing when
reading stdin, so it generally tries only the one right reader and
we'll see the right errors.
- The read system now has more debug output.
The debug level set by `--debug[=N]` is now available to pure and
startup code as debugLevel, using unsafePerformIO.
`dbg LABEL ...` is now the go-to helper for tracing values on the
console; it produces output when the debug level is non-zero. `dbgExit`
is similar but exits immediately, avoiding further output. The
`dbgshow`, `dbgppshow` and `dbgpprint` variants allow control over the
pretty-printing method and required debug level, allowing more control
over what is displayed when.
Other cleanups: lstrace -> ltrace, pdbgAt -> pdbg, tracewith -> traceWith.
At long last. The main change is a new rules file format that aims to
be more powerful and more intuitive than v1 (hledger 0.19.x and older).
Existing rules files will need to be adapted manually to the new format.