Since 1.50.3, canonicalizePath was being called wastefully when
processing journals with many nested include files and/or many matches
for include glob paths. On a slow filesystem, with unusually
many includes, this might have been quite noticeable.
Now we canonicalise each file path just once as it is encountered,
avoiding the wasted IO work.
PeriodData's use of Int keys caused wrong results with periodic
reports involving dates outside the machine-specific limits of Int.
Those were:
64 bits: -25252734927764696-04-22..25252734927768413-06-12
32 bits: -5877752-05-08..5881469-05-27
16 bits: 1769-02-28..1948-08-04
8 bits: 1858-07-12..1859-03-24
32 bits is supported by MicroHS; 16 and 8 bits aren't supported by
any known haskell version, but that could change in future.
For example, on 64 bit machines we got:
25252734927768413-06-12 PeriodData's max date
(expenses) 1
25252734927768414-01-01 next year past PeriodData's max date
(expenses) 2
$ hledger reg -O csv --yearly
"txnidx","date","code","description","account","amount","total"
"0","-25252734927764696-11-10","","","expenses","1","1"
Now it uses Integer (like the time package), fixing the bug.
And benchmarking shows memory and time usage slightly improved
(surprisingly; tested with up to 500 subperiods, eg
hledger -f examples/10ktxns-1kaccts.journal reg -1 cur:A -D >/dev/null)
This allows us to guarantee that the report periods are well-formed and
don't contain errors (e.g. empty spans, spans not contiguous, spans not
a partition).
Note the underlying representation is now for disjoint spans, whereas
previously the end date of a span was equal to the start date of the
next span, and then was adjusted backwards one day when needed.
We no longer attempt to process timeclock entries in time order - that
was a wrong requirement, probably given by me, that can't work. Now
we just process them in parse order. This plus a little tweaking of
error checking fixes several ordering bugs with overlapping sessions
and also allows same-named overlapping sessions.
More cleanup will follow. More testing might show that --old-timeclock
is no longer needed.
Now we process entries in a more careful order: time, then parse
order, like journal format. This fixes the original issue and another
one mentioned at #2417.
The default timeclock parser (ie when not using --old-timeclock) has
the following changes, related to issues such as
[#2141], [#2365], [#2400], [#2417]:
- semicolon now always starts a comment; timeclock account names can't include semicolons
(though journal account names still can)
- clock-in and clock-out entries now have different syntax
- clock-ins now require an account name
- clock-outs now can have a comment and tags
- the doc has been rewritten, and now mentions the --old-timeclock flag
- lib: accountnamep and modifiedaccountnamep now take a flag to allow semicolons or not
This allows using the special string `%account` in auto posting rules.
When run, this will be substituted with the account name of the matched
posting.
This adds a safer version of spanDefaultsFrom that won't create spans
that end before they start, and updates all reports to use it.
The only related change noticed so far is that close now gives an
error instead of a malformed entry, when there's no data to close.
[#2409]
This upgrades Account to enable it to store a multiperiod balance, with
a separate balance for each date period. This enables it do the hard
work in MultiBalanceReport.
Some new types are created to enable convenient operation of accounts.
- `BalanceData` is a type which stores an exclusive balance, inclusive
balance, and number of postings. This was previously directly stored
in Account, but is now factored into a separate data type.
- `PeriodData` is a container which stores date-indexed data, as well as
pre-period data. In post cases, this represents the report spans,
along with the historical data.
- Account becomes polymorphic, allowing customisation of the type of
data it stores. This will usually be `BalanceData`, but in
`BudgetReport` it can use `These BalanceData BalanceData` to store
both actuals and budgets in the same structure. The data structure
changes to contain a `PeriodData`, allowing multiperiod accounts.
Some minor changes are made to behaviour for consistency:
- --declared treats parent accounts consistently.
- --flat --empty ensures that implied accounts with no postings are not displayed, but
accounts with zero balance and actual postings are.