parseAndFinaliseJournal' has been removed. In the unlikely event you
needed it in your code, you can replace it with:
parseAndFinaliseJournal' parser iopts fp t =>
initialiseAndParseJournal parser iopts fp t
>>= liftEither . journalApplyAliases (aliasesFromOpts iopts)
>>= journalFinalise iopts fp t
Some parsers have been generalised from JournalParser to TextParser.
(SourcePos, SourcePos).
This has been marked for possible removal for a while. We are keeping
strictly more information. Possible edge cases arise with Timeclock and
CsvReader, but I think these are covered.
The particular motivation for getting rid of this is that
GenericSourcePos is creating some awkward import considerations for
little gain. Removing this enables some flattening of the module
dependency tree.
instead of a list of Amounts. No longer export Mixed constructor, to
keep API clean (if you really need it, you can import it directly from
Hledger.Data.Types). We also ensure the JSON representation of
MixedAmount doesn't change: it is stored as a normalised list of
Amounts.
This commit improves performance. Here are some indicative results.
hledger reg -f examples/10000x1000x10.journal
- Maximum residency decreases from 65MB to 60MB (8% decrease)
- Total memory in use decreases from 178MiB to 157MiB (12% decrease)
hledger reg -f examples/10000x10000x10.journal
- Maximum residency decreases from 69MB to 60MB (13% decrease)
- Total memory in use decreases from 198MiB to 153MiB (23% decrease)
hledger bal -f examples/10000x1000x10.journal
- Total heap usage decreases from 6.4GB to 6.0GB (6% decrease)
- Total memory in use decreases from 178MiB to 153MiB (14% decrease)
hledger bal -f examples/10000x10000x10.journal
- Total heap usage decreases from 7.3GB to 6.9GB (5% decrease)
- Total memory in use decreases from 196MiB to 185MiB (5% decrease)
hledger bal -M -f examples/10000x1000x10.journal
- Total heap usage decreases from 16.8GB to 10.6GB (47% decrease)
- Total time decreases from 14.3s to 12.0s (16% decrease)
hledger bal -M -f examples/10000x10000x10.journal
- Total heap usage decreases from 108GB to 48GB (56% decrease)
- Total time decreases from 62s to 41s (33% decrease)
If you never directly use the constructor Mixed or pattern match against
it then you don't need to make any changes. If you do, then do the
following:
- If you really care about the individual Amounts and never normalise
your MixedAmount (for example, just storing `Mixed amts` and then
extracting `amts` as a pattern match, then use should switch to using
[Amount]. This should just involve removing the `Mixed` constructor.
- If you ever call `mixed`, `normaliseMixedAmount`, or do any sort of
amount arithmetic (+), (-), then you should replace the constructor
`Mixed` with the function `mixed`. To extract the list of Amounts, use
the function `amounts`.
- If you ever call `normaliseMixedAmountSquashPricesForDisplay`, you can
replace that with `mixedAmountStripPrices`. (N.B. this does something
slightly different from `normaliseMixedAmountSquashPricesForDisplay`,
but I don't think there's any use case for squashing prices and then
keeping the first of the squashed prices around. If you disagree let
me know.)
- Any remaining calls to `normaliseMixedAmount` can be removed, as that
is now the identity function.
orgstruct-mode was dropped from org 9.2, and I shouldn't have been
forcing it on anyway.
The new config allows its "replacement", outshine-mode, to do similar
code folding when you press tab on any of the lines matching
outline-regexp. But only if you patch it as mentioned at
https://github.com/alphapapa/outshine/issues/77.
Enable it by, eg: (add-hook 'haskell-mode-hook 'outshine-mode)
The include directive now tries just one reader, based on the file
extension and defaulting to journal, like the rest of hledger.
(It doesn't yet handle a reader prefix.)
Reader-finding utilities have moved from Hledger.Read to
Hledger.Read.JournalReader so the include directive can use them.
Reader changes:
- rExperimental flag removed
- old rParser renamed to rReadFn
- new rParser field provides the actual parser.
This seems to require making Reader a higher-kinded type, unfortunately.
Now, org headlines before the first day entry are ignored,
regardless of content.
Note, blank lines inside a day entry are not allowed, currently.
It's now easier to be both valid journal and valid timedot at the same
time, so guessing the format of stdin is unreliable, and some tests
are failing. See following commit.
We previously had another parser type, 'type ErroringJournalParser =
ExceptT String ...' for throwing parse errors without the possibility of
backtracking. This parser type was removed under the assumption that it
would be possible to write our parser without this capability. However,
after a hairy backtracking bug, we would now prefer to have the option
to prevent backtracking.
- Define a 'FinalParseError' type specifically for the 'ExceptT' layer
- Any parse error can be raised as a "final" parse error
- Tracks the stack of include files for parser errors, anticipating the
removal of the tracking of stacks of include files in megaparsec 7
- Although a stack of include files is also tracked in the 'StateT
Journal' layer of the parser, it seems easier to guarantee correct
error messages in the 'ExceptT FinalParserError' layer
- This does not make the 'StateT Journal' stack redundant because the
'ExceptT FinalParseError' stack cannot be used to detect cycles of
include files
base-compat-batteries provides the same API across more ghc versions
than base-compat does, at the cost of more dependencies. Eg it exports
Prelude.Compat ((<>)) with ghc 7.10/base 4.8, which we expect.
My belief is that several of our deps already require it so the added
cost is not too great. We should probably go back to base-compat when
possible though, eg when we stop supporting ghc 7.10.
The new version of our package set apparently contains both base-compat and
base-compat-batteries in its transitive closure. This breaks the doctest suite,
which just imports everything into scope when the tests are run, thereby making
module names like Prelude.Compat ambiguous.
Older megaparsec is still supported.
Also cleans up our custom parser types,
and some text (un)packing is done in different places
(possible performance impact).
When we don't know a file's format, instead of choosing a subset of
readers based on content sniffing, now we just try them all.
Also, LedgerReader is now used only as a last resort,
as it's not yet competitive with JournalReader.
* Replace Parsec with Megaparsec (see #289)
This builds upon PR #289 by @rasendubi
* Revert renaming of parseWithState to parseWithCtx
* Fix doctests
* Update for Megaparsec 5
* Specialize parser to improve performance
* Pretty print errors
* Swap StateT and ParsecT
This is necessary to get the correct backtracking behavior, i.e. discard
state changes if the parsing fails.
The journal/timeclock/timedot parsers, instead of constructing (opaque)
journal update functions which are later applied to build the journal,
now construct the journal directly (by modifying the parser state). This
is easier to understand and debug. It also removes any possibility of
the journal updates being a space leak. (They weren't, in fact memory
usage is now slightly higher, but that will be addressed in other ways.)
Also:
Journal data and journal parse info have been merged into one type (for
now), and field names are more consistent.
The ParsedJournal type alias has been added to distinguish being-parsed
and finalised journals.
Journal is now a monoid.
stats: fixed an issue with ordering of include files
journal: fixed an issue with ordering of included same-date transactions
timeclock: sessions can no longer span file boundaries (unclocked-out
sessions will be auto-closed at the end of the file).
expandPath now throws a proper IO error (and requires the IO monad).
journal files can now include journal, timeclock or timedot files (but
not yet CSV files). Also timeclock/timedot files no longer support
default year directives.
The Hledger.Read.* modules have been reorganised for better reuse.
Hledger.Read.Utils has been renamed Hledger.Read.Common and holds
low-level parsers & utilities; high-level read utilities have moved to
Hledger.Read.
The Journal, Timelog and Timedot readers' detectors now check
each line in the sample data, not just the first one. I think
the sample data is only about 30 chars right now, but even so
this fixed a format detection issue I was seeing.