- parse a period expression by first extracting words separated by
single spaces, then by "re-parsing" this text with 'periodexprp'
- this way, the period expression parsers do not need to know about
the single- or double-space rules
* journal: Get rid of `journalFinalise` and use granular functions
Complete the process started in 53b3e2bd. This gets rid of the
`journalFinalise` function and uses the smaller steps, in order to
have more granular control.
* journal: Change order of operations in finalization
We want to make sure that we add the filepath after the order is
reversed, so the added filepath is on the head and not the tail (as it
would be if it were reversed after it was added).
* journal: Refine granular finalization functions
This commit fixes two of the granular finalization functions:
1. Rename `journalSetTime` to `journalSetLastReadTime` and improve
documentation.
2. Remove `journalSetFilePath`. It's redundant with `journalAddFile`
currently in `Hledger.Read.Common`. The only difference between the
functions is where the file is added (we keep the one in which it
is added to the tail), so we change the position vis-a-vis
reversal.
`journalFinalise` is only used in the `parseAndFinaliseJournal`
functions, but it needs to be run differently at different stages when
transaction modifiers are applied. This change breaks it into smaller
functions, and uses those smaller parts in `parseAndFinaliseJournal`
as needed.
Previously we ran if `--auto` was set. But this adds a small
performance hit if `--auto` becomes default. Now we only run twice if
there are transactionModifiers AND `--auto` is set. So even if auto is
specified, there will be no penalty if there are no modifiers.
Currently, automated transactions are added before the journal is
finalized. This means that no inferred values will be picked up. We
change the procedure, if `auto_` is set, to
1. first run `journalFinalise` without assertion checking (assertions
might be wrong until automated transactions), but with reordering
2. Insert transaction modifiers
3. Run `journalFinalise` again, this time with assertion checking as
set in the options, and without reordering.
If `auto_` is not set, all works as before.
Closes: #893
Previously you had to use one of the standard english account names
(assets, liabilities..) for top-level accounts, if you wanted to use
the bs/bse/cf/is commands.
Now, account directives can specify which of the big five categories
an account belongs to - asset, liability, equity, revenue or expense -
by writing one of the letters A, L, E, R or X two or more spaces after
the account name (where the numeric account code used to be).
This might change. Some thoughts influencing the current syntax:
- easy to type and read
- does not require multiple lines
- does not depend on any particular account numbering scheme
- allows more types later if needed
- still anglocentric, but only a little
- could be treated as syntactic sugar for account tags later
- seems to be compatible with (ignored by) current Ledger
The current design permits unlimited account type declarations anywhere
in the account tree. So you could declare a liability account somewhere
under assets, and maybe a revenue account under that, and another asset
account even further down. In such cases you start to see oddities like
accounts appearing in multiple places in a tree-mode report. In theory
the reports will still behave reasonably, but this has not been tested
too hard. In any case this is clearly too much freedom. I have left it
this way, for now, in case it helps with:
- modelling contra accounts ?
- multiple files. I suspect the extra expressiveness may come in handy
when combining multiple files with account type declarations,
rewriting account names, apply parent accounts etc.
If we only allowed type declarations on top-level accounts, or
only allowed a single account of each type, complications seem likely.
- Parse errors encountered in include files are treated as "final" parse
errors in the parent file, preventing backtracking and fixing an issue
in #853
We previously had another parser type, 'type ErroringJournalParser =
ExceptT String ...' for throwing parse errors without the possibility of
backtracking. This parser type was removed under the assumption that it
would be possible to write our parser without this capability. However,
after a hairy backtracking bug, we would now prefer to have the option
to prevent backtracking.
- Define a 'FinalParseError' type specifically for the 'ExceptT' layer
- Any parse error can be raised as a "final" parse error
- Tracks the stack of include files for parser errors, anticipating the
removal of the tracking of stacks of include files in megaparsec 7
- Although a stack of include files is also tracked in the 'StateT
Journal' layer of the parser, it seems easier to guarantee correct
error messages in the 'ExceptT FinalParserError' layer
- This does not make the 'StateT Journal' stack redundant because the
'ExceptT FinalParseError' stack cannot be used to detect cycles of
include files
- Don't immediately throw custom parse errors into 'ParsecT'; rather,
just construct and return them
- This anticipates the re-implementation of an 'ExceptT' layer of the
parser, which should be able throw custom parse errors
- In anticipation of megaparsec 7, which removes support for stacks of
include files (as far as I can tell)
- Intended for the 'StateT Journal' layer of the parser
- A stack of include files would be better in a 'ReaderT' layer, but I
don't want to add another layer to the parser
- Intended for detecting cycles of include files
- Potential issue: for proper error messages for include file cycles,
we must remember to provide the filepath of the root journal file via
the initial journal state passed to a 'JournalParser'; I imagine
that we may forget to do so because in all other cases it is okay
not to do so.
A bunch of account sorting changes that got intermingled.
First, account codes have been dropped. They can still be parsed and
will be ignored, for now. I don't know if anyone used them.
Instead, account display order is now controlled by the order of account
directives, if any. From the mail list:
I'd like to drop account codes, introduced in hledger 1.9 to control
the display order of accounts. In my experience,
- they are tedious to maintain
- they duplicate/compete with the natural tendency to arrange account
directives to match your mental chart of accounts
- they duplicate/compete with the tree structure created by account
names
and it gets worse if you think about using them more extensively,
eg to classify accounts by type.
Instead, I plan to just let the position (parse order) of account
directives determine the display order of those declared accounts.
Undeclared accounts will be displayed after declared accounts,
sorted alphabetically as usual.
Second, the various account sorting modes have been implemented more
widely and more correctly. All sorting modes (alphabetically, by account
declaration, by amount) should now work correctly in almost all commands
and modes (non-tabular and tabular balance reports, tree and flat modes,
the accounts command). Sorting bugs have been fixed, eg #875.
Only the budget report (balance --budget) does not yet support sorting.
Comprehensive functional tests for sorting in the accounts and balance
commands have been added. If you are confused by some sorting behaviour,
studying these tests is recommended, as sorting gets tricky.
This makes skipping/unskipping tests easier, and improves readability
a bit.
Note it's also possible to just write the test name with no preceding
function, when the type is constrained (see Journal.hs).
Same-line & next-line comments of transactions, postings, etc.
are now parsed a bit more precisely. Previously parsing no comment
gave the same result as an empty comment (a single newline); now
it gives an empty string.
Also, and perhaps as a consequence of the above, when there's no
same-line comment but there is a next-line comment, we'll insert an
empty first line, otherwise next-line comments would get moved up to
the same line when rendered.
Some doctests have been added.
This removes transactionModifierToFunction's extra query parameter;
the rewrite command sets it in the TransactionModifier instead, which
I think is equivalent. I had to change one functional test, but it
seems correct now, so perhaps it wasn't working right before ?
I was negligent and did not test enough. This should ignore
transaction comments in auto posting rules more safely.
It also adds support for trailing comments on the first line of auto
posting rules, which previously were misparsed as part of the query.
'fail' will just terminate the current parse branch, whereas here
we have encountered a definite error. Also bring the code to
get the current working directory inside 'getFilePaths', as it
logically belongs there.
Field names are supposed to be case insensitive, but a field assignment like
fields ...,Transaction_Date,...
date %Transaction_Date
was failing, because of the capitalised letters. Fixed now.
- expands the set of expected tokens when e.g. parsing the invalid
posting `account $1 a`
- whitespace can affect parse errors because of the longest match rule
where errors that occur later take precedence over those that occur
earlier
- inline `spaceamountormissingp` into `postingp`
- combine `rightsymbolamountp` and `nosymbolamountp`
- the multiplier symbol '*' for an amount must now always preceed a sign '-'
[breaking change]
- make amount parser labels more generic to simplify error messages
For Data/Dates.hs in particular:
- Changed `SimpleTextParser` to `TextParser m` for all parsers
- Changed `string` to the case-insensitive `string'` to match the
behaviour of `T.toLower` found in `parsePeriodExpr`
- export `periodexprp` for "direct" use
base-compat-batteries provides the same API across more ghc versions
than base-compat does, at the cost of more dependencies. Eg it exports
Prelude.Compat ((<>)) with ghc 7.10/base 4.8, which we expect.
My belief is that several of our deps already require it so the added
cost is not too great. We should probably go back to base-compat when
possible though, eg when we stop supporting ghc 7.10.
The new version of our package set apparently contains both base-compat and
base-compat-batteries in its transitive closure. This breaks the doctest suite,
which just imports everything into scope when the tests are run, thereby making
module names like Prelude.Compat ambiguous.
We don't need to import Data.Monoid because Prelude.Compat exports "<>"
already. In fact, importing that module causes build failures:
Hledger/Read/Common.hs:725:62: error:
Ambiguous occurrence ‘<>’
It could refer to either ‘Sem.<>’,
imported from ‘Prelude.Compat’ at Hledger/Read/Common.hs:97:1-39
(and originally defined in ‘Data.Semigroup’)
or ‘Data.Monoid.<>’,
imported from ‘Data.Monoid’ at Hledger/Read/Common.hs:110:1-18
Fixes https://github.com/simonmichael/hledger/issues/794.
- Rationale:
- The information necessary for applying exponents to a number is more
explicitly represented in the inputs to `fromRawNumber` than in the outputs
- This way, `exponentp` may simply return an `Int`
- Purpose: to reduce the verbosity of the previous implementation
- Split off `AmbiguousNumber` into its own type
- Introduce a function `AmbiguousNumber -> RawNumber` explicitly capturing the
disambiguation logic
- Reduce the number of remaining constructors in `RawNumber` to just two,
`WithSeparator` and `NoSeparator`
- The choice to distinguish by the presence of digit separators is motivated
by the need for this information later on when disallowing exponents on
numbers with digit separators
- Extracts the handling of signs out of `fromRawNumber` and into `signp` itself
- Rationale: The sign can be applied independently from the logic in
`fromRawNumber`
A commodity directive that doesn't specify the decimal point character
increases ambiguity and the chance of misparsing numbers, especially
as it overrides all style information inferred from the journal amounts.
In some cases it caused amounts with a decimal point to be parsed as if
with a digit group separator so 1.234 became 1234.
We could augment it with extra info from the journal amounts, when available,
but it would still be possible to be ambiguous, and that won't be obvious.
A commodity directive is what we recommend to nail down the style.
It seems the simple and really only way to do this reliably is to require
an explicit decimal point character. Most folks probably do this already.
Unfortunately, it makes another potential incompatiblity with ledger and
beancount journals. But the error message will be clear and easy to
work around.
Trailing whitespace in the replacement part of a regular expression
account alias is now significant. Eg, slightly flattening some bank
accounts: --alias '/:somebank:/=somebank '
Older megaparsec is still supported.
Also cleans up our custom parser types,
and some text (un)packing is done in different places
(possible performance impact).
See the issue and linked mail list discussion. Ambiguity between the
uncleared state, and the "not cleared" --uncleared flag causes confusion
and friction. At this point it seems best to break with Ledger and
past hledger, pick a new name and drop --uncleared to put an end to it.
When generating a new posting as a multiple of an existing posting,
support conversion to a different commodity. For example, postings in
hours can be used to generate postings in USD.
Automatic transactions generated from rewrite rules use the commodity,
amount style, and transaction price if the rewrite defines a commodity.
* added showMarketPrice and Hledger.Data.MarketPrice module
* showMarketPrice implemented using showDate
* attempted to add tests to Hledger.Data.MarketPrice
* moved MarketPrice test to Hledger.Read.JournalReader; fixed documentation on MarketPrice; added MarketPrice module to package.yaml
* cli: fix bug in pivot for postings without tag
Without this fix for postings without tag query checked effective
account which is always empty text ("").
* rewrite: inherit dates, change application order
For budgeting it is important to inherit actual date of posting if it
differs from date of transaction. These dates will be added
as a separate line of comment.
More natural order of rewrites is when result of first defined one is
available for all next rewrites.
* rewrite: factor out Hledger.Data.AutoTransaction
* rewrite: add diff output
With this option you can modify your original files without loosing
inter-transaction comments etc. I.e. you can run:
hledger-rewrite --diff Agency \
--add-posting 'Expenses:Taxes *0.17' \
| patch
As result multiple files should be updated.
Also it is nice to review your changes using colordiff instead of
patch.
* lib: track source lines range for journal
* doc: auto entries and diff output for rewrite
* Changed behavior of `readJournalFiles` to be identical to `readJournalFile` for singleton lists
* Balance Assertions have to be simple Amounts
* Add 'isAssignment' and 'assignmentPostings' to Hledger.Data.Posting and Transaction
* Implemented 'balanceTransactionUpdate', a more general version of 'balanceTransaction' that takes an update function
* Fixed test cases.
* Implemented balance assignment ("resetting a balance")
* Add assertions to show function
* updated the comments
* numbering is not needed in journalCheckBalanceAssertions
* remove prices before balance checks
* rename functions
When we don't know a file's format, instead of choosing a subset of
readers based on content sniffing, now we just try them all.
Also, LedgerReader is now used only as a last resort,
as it's not yet competitive with JournalReader.
This reader is used by default for files with suffix .ledger or .l,
and tried along with the other readers for files of unknown type.
Currently only the bare minimum of the raw parsed data is used:
transaction dates/descriptions and posting accounts/amounts,
with the rest being ignored.
Amounts are parsed the same way as in the hledger journal format.
Malformed amounts might be ignored instead of error-reported.
* csv rules: Show prettier parsing errors
This goes from
hledger: user error ("ParseError {errorPos = SourcePos {sourceName = \"foo.csv.rules\",
sourceLine = Pos 20, sourceColumn = Pos 1} :| [], errorUnexpected =
fromList [Tokens (' ' :| \"\")], errorExpected = fromList [Label ('b' :| \"lank or comment
line\"),EndOfInput], errorCustom = fromList []}")
to
hledger: user error (foo.csv.rules:20:1:
unexpected space
expecting blank or comment line or end of input
)
* csv rules: Fix parsing of empty field values
A single line containing `account1 ` (note the space at the end) should
parse as assignment of the empty string to account1. At least it did
until commit 4141067.
The problem is that megaparsec's `space` parses multiple space
characters as opposed to parsec. So in the example above it would
incorrectly consume the newline.
This commit also adds a new test case for this bug.
Timeclock transaction ids now count up rather than down.
Also, remove old code for appending timeclock transactions to journal transactions,
a holdover from the days when both were allowed in one file.
Transactions are now numbered consistently during journal finalisation,
rather than just in the journal reader. Also transaction knot-tying has been
moved out of journalBalanceTransactions.
* Replace Parsec with Megaparsec (see #289)
This builds upon PR #289 by @rasendubi
* Revert renaming of parseWithState to parseWithCtx
* Fix doctests
* Update for Megaparsec 5
* Specialize parser to improve performance
* Pretty print errors
* Swap StateT and ParsecT
This is necessary to get the correct backtracking behavior, i.e. discard
state changes if the parsing fails.
The journal/timeclock/timedot parsers, instead of constructing (opaque)
journal update functions which are later applied to build the journal,
now construct the journal directly (by modifying the parser state). This
is easier to understand and debug. It also removes any possibility of
the journal updates being a space leak. (They weren't, in fact memory
usage is now slightly higher, but that will be addressed in other ways.)
Also:
Journal data and journal parse info have been merged into one type (for
now), and field names are more consistent.
The ParsedJournal type alias has been added to distinguish being-parsed
and finalised journals.
Journal is now a monoid.
stats: fixed an issue with ordering of include files
journal: fixed an issue with ordering of included same-date transactions
timeclock: sessions can no longer span file boundaries (unclocked-out
sessions will be auto-closed at the end of the file).
expandPath now throws a proper IO error (and requires the IO monad).
journal files can now include journal, timeclock or timedot files (but
not yet CSV files). Also timeclock/timedot files no longer support
default year directives.
The Hledger.Read.* modules have been reorganised for better reuse.
Hledger.Read.Utils has been renamed Hledger.Read.Common and holds
low-level parsers & utilities; high-level read utilities have moved to
Hledger.Read.
Bracketed posting dates were fragile; they worked only if you wrote full
10-character dates. Also some semantics were a bit unclear. Now they
should be robust, and have been documented more clearly. This is a
legacy undocumented Ledger syntax, but it improves compatibility and
might be preferable to the more verbose "date:" tags if you write
posting dates often (as I do).
Internally, bracketed posting dates are no longer considered to be tags.
Journal comment, tag, and posting date parsers have been reworked, all
with doctests. Also the journal parser types generally have been
tightened up and clarified, making it much easier to know how to combine
and run them. There's now
-- | A parser of strings with generic user state, monad and return type.
type StringParser u m a = ParsecT String u m a
-- | A string parser with journal-parsing state.
type JournalParser m a = StringParser JournalContext m a
-- | A journal parser that runs in IO and can throw an error mid-parse.
type ErroringJournalParser a = JournalParser (ExceptT String IO) a
and corresponding convenience functions (and short aliases) for running them.