Previously, hledger could read CSV files containing non-ascii
characters only if they are UTF8-encoded. Now there is a new CSV
rule, encoding ENCODING, which allows reading CSV files with other
encodings.
This adds a dependency on the encoding library, which supports fewer
encodings than text-icu but does not require a third-party C library.
To avoid build issues on various platforms, we require version 0.10+.
This adds some use of the ImplicitParams language extension, required
by encoding's API, but only in a small code region.
This also changes the type of Reader's rReadFn; it now takes
a `Handle` rather than a `Text`, allowing more flexibility.
CSV rules files can now be read directly, eg you have the option of
writing `hledger -f foo.csv.rules CMD`. By default this will read data
from foo.csv in the same directory. But you can also specify a
different data file with a new `source FILE` rule. This has some
convenience features:
- If the data file does not exist, it is treated as empty, not an
error.
- If FILE is a relative path, it is relative to the rules file's
directory. If it is just a file name with no path, it is relative
to ~/Downloads/.
- If FILE is a glob pattern, the most recently modified matched file
is used.
This helps remove some of the busywork of managing CSV downloads.
Most of your financial institutions's default CSV filenames are
different and can be recognised by a glob pattern. So you can put a
rule like `source Checking1*.csv` in foo-checking.csv.rules,
periodically download CSV from Foo's website accepting your browser's
defaults, and then run `hledger import checking.csv.rules` to import
any new transactions. The next time, if you have done no cleanup, your
browser will probably save it as something like Checking1-2.csv, and
hledger will still see that because of the * wild card. You can choose
whether to delete CSVs after import, or keep them for a while as
temporary backups, or archive them somewhere.
Inner empty lines were not being skipped automatically, contrary to
docs. Now all empty lines are skipped automatically, and the `skip`
rule is needed only for non-empty lines, as intended.
This may be a breaking change: it's possible that the `skip` count
might need to be adjusted in some CSV rules files.
Previously, CSV date-times with a different time zone from yours
(with or without explicit timezones in the CSV) could give off-by-one
dates, because the CSV timezone was ignored.
Now,
1. you can use the `timezone` rule to indicate which other
timezone a CSV is implicitly using
2. CSV date-times with a timezone - whether declared by rule or
parsed with %Z - are localised to the system time zone
(or another set with the TZ environment variable).
May also fix#1154, #1033, #708, #536, #73: testing is needed.
This aims to solve all problems where misconfigured locales lead to
parsers failing on utf8-encoded data. This should hopefully avoid
encoding issues, but since it fundamentally alters how encoding is dealt
with it may lead to unexpected outcomes. Widespread testing on a number
of different platforms would be useful.
This increases composability and avoids some ugly case handling. We
re-export runExceptT in Hledger.Read.
The final return types of the following functions has been changed from
IO (Either String a) to ExceptT String IO a. If this causes a problem,
you can get the old behaviour by calling runExceptT on the output:
readJournal, readJournalFiles, readJournalFile
Or, you can use the easy functions readJournal', readJournalFiles', and
readJournalFile', which assume default options and return in the IO
monad.
(SourcePos, SourcePos).
This has been marked for possible removal for a while. We are keeping
strictly more information. Possible edge cases arise with Timeclock and
CsvReader, but I think these are covered.
The particular motivation for getting rid of this is that
GenericSourcePos is creating some awkward import considerations for
little gain. Removing this enables some flattening of the module
dependency tree.
transactions are balanced possibly using explicit prices, but without
inferring any prices. This is included in --strict mode.
Renames check autobalanced to check balancedwithautoconversion.
Exceptions are for dealing with the pamount field, which is really just
dealing with an unnormalised list of amounts.
This creates an API for dealing with MixedAmount, so we never have to
access the internals outside of Hledger.Data.Amount.
Also remove a comment, since it looks like #1207 has been resolved.
supplant the old interface, which relied on the Num typeclass.
MixedAmount did not have a very good Num instance. The only functions
which were defined were fromInteger, (+), and negate. Furthermore, it
was not law-abiding, as 0 + a /= a in general. Replacements for used
functions are:
0 -> nullmixedamt / mempty
(+) -> maPlus / (<>)
(-) -> maMinus
negate -> maNegate
sum -> maSum
sumStrict -> maSum
Also creates some new constructors for MixedAmount:
mixedAmount :: Amount -> MixedAmount
maAddAmount :: MixedAmount -> Amount -> MixedAmount
maAddAmounts :: MixedAmount -> [Amount] -> MixedAmount
Add Semigroup and Monoid instances for MixedAmount.
Ideally we would remove the Num instance entirely.
The only change needed have nullmixedamt/mempty substitute for
0 without problems was to not squash prices in
mixedAmount(Looks|Is)Zero. This is correct behaviour in any case.
simplifySign now covers a few more sign combinations that might arise.
And in particular, it strips a standalone sign with no number,
which simplifies sign flipping with amount-in/amount-out.