49 Commits

Author SHA1 Message Date
Simon Michael
e796a00fc4 dev:import: drop archiving of original data
just archive clean data
2025-08-28 21:38:38 +01:00
Simon Michael
b64ddfe813 dev:rules reader: drop "fall back to reading latest archived" 2025-08-28 21:38:38 +01:00
Simon Michael
c60ec90756 dev:import: improve buggy detection of import command 2025-08-28 21:38:38 +01:00
Simon Michael
c515fedf70 feat:csv: support data cleaning scripts 2025-08-28 21:38:38 +01:00
Simon Michael
cb1d6a71a6 dev:import:archive: fix bugs in new code
Too hard to rebase
2025-08-14 19:22:52 +01:00
Simon Michael
7dfe2d84e7 dev:import: fix debug message 2025-08-14 17:52:54 +01:00
Simon Michael
88b451d6eb imp: when source rule finds no files, read the latest archived 2025-08-14 14:23:53 +01:00
Simon Michael
3dec0a8944 dev: indentation 2025-08-14 14:23:53 +01:00
Simon Michael
76dc6d089a feat:import:archive: archive data files, and process oldest first 2025-08-14 12:54:40 +01:00
Simon Michael
e360e50497 imp:csv: more --debug=2 output for if rules
Also, in debug output show records more like what matchers are seeing,
ie with quotes removed.
2025-05-22 17:05:45 -10:00
Simon Michael
98b40b2b0e ;dev: fix a warning 2025-04-22 12:26:55 -10:00
Simon Michael
9340b73aae imp: improve/format errors for various failures [#2367]
These now call error' and show errors in the standard style:

- reading a nonexistent data file
- reading an unsafe dotted file name on windows
- web: using --socket on windows
- demo: demo not found
- demo: error while running asciinema
- diff: bad arguments
- print --match: no match found
- register --match: no match found
- roi: no investment transactions found
2025-04-11 08:06:47 -10:00
Simon Michael
133560aa93 ;dev: csv: no need to test for unsupported feature [#2352] 2025-03-12 20:40:36 -10:00
Thomas Miedema
a8a0d3ee30 fix: csv: fix regression in parsing rules containing & (#2352) 2025-03-12 20:35:59 -10:00
Thomas Miedema
2faceb8e1b feat: csv: allow multiple matchers on the same line
`If blocks` and `If tables` now allow multiple matchers on the same line
separated by `&&` (AND) or `&& !` (AND NOT).

Example `if block` with two matchers on the same line:

	if %description amazon && %date 2025-02-22
	    account2 expenses:books

Example `if table` with two matchers on the same line:

	if,account2
	%description amazon && %date 2025-02-22, expenses:books
2025-03-01 11:21:00 -10:00
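As a rough illustration of the matcher semantics described in commit 2faceb8e1b, the sketch below evaluates one rules line whose matchers are joined by `&&` (AND) or `&& !` (AND NOT). It is not hledger's implementation. The function name `line_matches`, the dict-based record, case-insensitive matching, and treating an unknown field name as an empty value are all assumptions made for the example.

```python
import re


def line_matches(line, record):
    """Return True if every matcher on `line` accepts `record`.

    `record` maps CSV field names to values (a hypothetical representation).
    Patterns containing a literal "&&" are not handled by this sketch.
    """
    whole_record = ",".join(record.values())
    for part in line.split("&&"):
        part = part.strip()
        # A leading "!" negates the matcher (the "&& !" form).
        negate = part.startswith("!")
        if negate:
            part = part[1:].strip()
        field = re.match(r"%(\S+)\s+(.*)", part)
        if field:
            # "%FIELD REGEX" matches against a single field's value.
            target = record.get(field.group(1), "")  # assumption: unknown field is empty
            pattern = field.group(2)
        else:
            # A bare regex matches against the whole record.
            target, pattern = whole_record, part
        found = re.search(pattern, target, re.IGNORECASE) is not None
        if found == negate:
            return False
    return True
```

With a record such as `{"date": "2025-02-22", "description": "Amazon books"}`, the line `%description amazon && %date 2025-02-22` is accepted, while `%description amazon && ! %date 2025-02-22` is rejected.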
Joschua Kesper
5114962b2a feat:csv: add an encoding rule, allowing non-UTF8 CSV to be read [#2319]
Previously, hledger could read CSV files containing non-ascii
characters only if they are UTF8-encoded.  Now there is a new CSV
rule, encoding ENCODING, which allows reading CSV files with other
encodings.

This adds a dependency on the encoding library, which supports fewer
encodings than text-icu but does not require a third-party C library.
To avoid build issues on various platforms, we require version 0.10+.

This adds some use of the ImplicitParams language extension, required
by encoding's API, but only in a small code region.

This also changes the type of Reader's rReadFn; it now takes
a `Handle` rather than a `Text`, allowing more flexibility.
2025-02-15 14:48:30 -10:00
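The note in commit 5114962b2a about the read function taking a handle rather than already-decoded text can be illustrated with a small sketch: when the reader receives raw bytes, it can decode them with whatever encoding the rules name. This is not hledger's code. `read_csv_records` is a hypothetical helper, and the encoding names accepted here are Python's codec names, which need not match those supported by the Haskell encoding library.

```python
import csv
import io


def read_csv_records(handle, encoding="utf-8"):
    """Decode a binary file handle with `encoding` and parse it as CSV."""
    # Wrapping the byte handle defers decoding until the encoding is known,
    # which is not possible if the caller has already produced text.
    text = io.TextIOWrapper(handle, encoding=encoding, newline="")
    return list(csv.reader(text))
```

For example, `read_csv_records(io.BytesIO("café,1\n".encode("iso-8859-1")), "iso-8859-1")` returns the single record `["café", "1"]`.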
Simon Michael
29349458b3 imp:csv:if: go back to accepting unknown csv field names [#2289]
It makes life easier when reusing common rules with different CSVs.
2024-12-04 17:45:28 -10:00
Simon Michael
99fc4cd61f imp:csv:if: show the problematic field name when warning 2024-12-04 16:47:01 -10:00
Simon Michael
054a204aa0 imp:csv:if: support & ! (AND NOT) 2024-12-03 17:25:43 -10:00
Simon Michael
3d55f260b3 imp:csv:if: warn on invalid csv field names; improve doc [#2289] 2024-12-03 16:07:57 -10:00
Simon Michael
a47dce073d dev:csv: refactor/document isBlockActive, matcherMatches 2024-12-03 13:51:32 -10:00
Simon Michael
1010e3dee6 imp:csv: improve debug=7 output from isBlockActive 2024-12-03 10:45:16 -10:00
Simon Michael
c92b601028 dev: fix warnings with ghc 9.10 / base 4.20
Older ghc versions should also still build cleanly (tested with 9.8 so far).

I don't like enabling CPP in so many modules but it's easier than
figuring out how to do it with base-compat; hopefully no noticeable
compilation impact.
2024-09-30 17:20:13 -10:00
Simon Michael
823be7c565 fix: csv: tags on following lines, and posting dates, also work now [#2241]
Follow-on work from #2214.
2024-09-28 18:54:43 -10:00
Simon Michael
375fb07ede ;dev: cleanups 2024-08-29 10:07:02 +01:00
Henning Thielemann
14b5a1f82a imp: Hledger.Read.CsvUtils -> Write.Csv 2024-08-16 16:57:38 +02:00
Simon Michael
40620666f8 imp: cli: rename --rules-file to --rules; tweak options help
For brevity, and consistency with --conf.
--rules-file remains supported, as a hidden option.

hledger's main mode now supports the hidden legacy flags,
as the command modes do.
2024-06-25 18:37:55 +01:00
Simon Michael
f5c2ec681c dev: refactor: merge Text.Megaparsec.Custom into Hledger.Utils.Parse 2024-06-25 18:37:54 +01:00
Simon Michael
605f8446e5 fix:pkg: fix a doctest failure with ghc 8.10 2024-05-17 15:08:26 -10:00
Dmitry Astapov
b0b9e69e4f ;dev:lib allow comment lines in the "if" table body 2024-03-08 07:42:58 -10:00
Jonathan Dowland
3b416a76ef ;cln:import: clarify haddock for getEffectiveAssignment
Signed-off-by: Jonathan Dowland <jon@dow.land>
2024-03-02 20:56:35 +00:00
Jonathan Dowland
c5079d4f1e dev:import: call hledgerFieldValue rather than re-implementing it
Signed-off-by: Jonathan Dowland <jon@dow.land>
2024-03-01 18:02:29 +00:00
Jonathan Dowland
1424a1f2f1 ;cln:import: update some Haddock strings to reflect #2158
Signed-off-by: Jonathan Dowland <jon@dow.land>
2024-03-01 17:58:28 +00:00
Jonathan Dowland
71684f5611 ref:import: simplify renderTemplate and friends
renderTemplate and its ancillary functions did not need the
HledgerFieldName argument, so remove it.

Signed-off-by: Jonathan Dowland <jon@dow.land>
2024-02-29 21:44:42 +00:00
Jonathan Dowland
b7027c8bbb feat:import: resolve matchgroup references in ConditionalBlock scope (#2158)
Adjust getEffectiveAssignment to compute an intermediary form of the
active assignments (with an additional Either wrapper to distinguish
top-level and conditional assignments) and move the remaining work to
its only caller, hledgerField.

Rework hledgerFieldValue. Instead of calling hledgerField, call
getEffectiveAssignment and--in the conditional block case--construct
a CsvRules scoped just to the active ConditionalBlock before calling
renderTemplate.

Adjust regexMatchValue to use rconditionalblocks to access conditional
blocks from the CsvRules, rather than rblocksassigning, since we haven't
narrowed the scope of that field.

The result is match group references are only expanded for match groups
that occur within the in-scope ConditionalBlock. Fixes: #2158.

Signed-off-by: Jonathan Dowland <jon@dow.land>
2024-02-29 21:44:40 +00:00
Jonathan Dowland
ac7f726282 ;ref:import: consistently use hledgerField
hledgerField is an alias to the function getEffectiveAssignment: both
names are used in various parts of RulesReader.

Treat hledgerField as the canonical name, and getEffectiveAssignment
as an implementation detail of hledgerField.

Replace all uses of getEffectiveAssignment with hledgerField (except the
one in hledgerField.)

Signed-off-by: Jonathan Dowland <jon@dow.land>
2024-02-29 21:44:39 +00:00
Jonathan Dowland
8f514ac16d ;test:import: test case for match groups (#2158)
Add a test which captures the issue of overlapping scope described
in GitHub issue #2158.

Signed-off-by: Jonathan Dowland <jon@dow.land>
2024-02-29 21:44:36 +00:00
Jonathan Dowland
bd5652c1c2 ;cln:import: remove superfluous comment lines
Signed-off-by: Jonathan Dowland <jon@dow.land>
2024-02-29 21:44:32 +00:00
Simon Michael
8f1ae401f4 dev: fix some partial head/tails, silence ghc 9.8's new warnings
Note the headErr/tailErr calls will print stack traces if they fail
(small ones: five lines, one of which is the useful location info),
which may or may not be best UX.
2024-02-28 15:58:21 -10:00
Simon Michael
60a1adc5ba lib: refactor, extract parseBalanceAssertionType 2024-02-20 20:55:27 -10:00
Michael Rees
d4ecdb3fea imp: Support tsv and ssv prefixes (#2164) 2024-02-08 06:44:44 -10:00
Simon Michael
0cb382cf0e dev: rename AmountDisplayOpts -> AmountFormat, and related constants
noColour          -> defaultFmt
noCost            -> noCostFmt
oneLine           -> oneLineFmt
csvDisplay        -> machineFmt
2024-01-23 21:35:06 -10:00
Simon Michael
8b45d4ba8c fix:csv: fix %FIELD interpolation in assignments using \n [#2134]
In field assignment values we now parse %FIELD references, \MATCHGROUP references
and "\n" newline markers more carefully, so all can coexist.
Parsing these values might be slower than before, but hopefully not noticeably so.
2023-12-23 19:25:34 -10:00
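One way to let %FIELD references, \MATCHGROUP references and "\n" newline markers coexist in a field assignment value, as commit 8b45d4ba8c describes, is to tokenise the value before expanding anything. The sketch below shows that idea only. It is not the parser hledger uses, and the set of characters allowed in a field name is an assumption.

```python
import re

# Alternatives are tried left to right, so the two-character "\n" marker is
# recognised before a match group reference or plain text.
TOKEN = re.compile(
    r"(?P<newline>\\n)"
    r"|(?P<group>\\\d+)"
    r"|(?P<field>%[A-Za-z0-9_-]+)"  # assumption: characters allowed in field names
    r"|(?P<text>.)",
    re.DOTALL,
)


def tokenize(template):
    """Split a field assignment value into (kind, text) tokens."""
    tokens = []
    for m in TOKEN.finditer(template):
        kind, text = m.lastgroup, m.group()
        if kind == "text" and tokens and tokens[-1][0] == "text":
            # Merge consecutive plain characters into one text token.
            tokens[-1] = ("text", tokens[-1][1] + text)
        else:
            tokens.append((kind, text))
    return tokens
```

Each token kind can then be expanded independently, so a newline marker is never mistaken for a match group reference or for part of a field name.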
Simon Michael
20c299684b dev:csv: clarify renderTemplate [#2134] 2023-12-23 19:25:22 -10:00
Simon Michael
2b18715885 fix:csv: fix tag: queries on CSV data (#2114) 2023-11-20 21:55:11 -10:00
Jonathan Dowland
8bfa382c68 feat: import: interpolate regex matches in field templates (#2009)
Replace occurrences of '\N' (where N is a positive number) in field
templates with the corresponding regular expression match group, if it
exists.

E.g. Warp the date to the first of the month for the second posting

    if %date (....-..)-..
        comment2 date:\1-01

E.g. Strip a prefix from an imported account name

    if %account1 liabilities:jon:(.*)
        account1 \1

Fixes #2009.

Signed-off-by: Jonathan Dowland <jon@dow.land>
2023-11-08 13:49:39 -08:00
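The '\N' substitution described in commit 8bfa382c68 can be sketched in a few lines. This is an illustration, not the hledger code. The name `interpolate_groups` is hypothetical, matching is assumed to be case-insensitive, and leaving a reference untouched when the group does not exist is an assumption, since the commit message only says the group is substituted "if it exists".

```python
import re


def interpolate_groups(template, pattern, value):
    """Replace \\N in `template` with match group N of `pattern` applied to `value`."""
    match = re.search(pattern, value, re.IGNORECASE)
    if match is None:
        return template

    def expand(ref):
        n = int(ref.group(1))
        if 1 <= n <= match.re.groups:
            # A group that exists but did not participate yields "".
            return match.group(n) or ""
        # Assumption: a reference to a nonexistent group is left as written.
        return ref.group(0)

    return re.sub(r"\\(\d+)", expand, template)
```

Applied to the two examples in the commit message, `interpolate_groups(r"date:\1-01", r"(....-..)-..", "2023-11-08")` gives `date:2023-11-01`, and `interpolate_groups(r"\1", r"liabilities:jon:(.*)", "liabilities:jon:visa")` gives `visa`.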
Jonathan Dowland
c619e387ea ;fix: import: minor typo 2023-11-08 13:49:39 -08:00
bobobo1618
9fb5740045 Add support for negating a Matcher
https://github.com/simonmichael/hledger/issues/2054
2023-10-05 10:22:01 +01:00
Simon Michael
029b59093b feat: csv: rules files can be read directly; data file can be specified
CSV rules files can now be read directly, eg you have the option of
writing `hledger -f foo.csv.rules CMD`. By default this will read data
from foo.csv in the same directory.  But you can also specify a
different data file with a new `source FILE` rule. This has some
convenience features:

- If the data file does not exist, it is treated as empty, not an
  error.

- If FILE is a relative path, it is relative to the rules file's
  directory. If it is just a file name with no path, it is relative
  to ~/Downloads/.

- If FILE is a glob pattern, the most recently modified matched file
  is used.

This helps remove some of the busywork of managing CSV downloads.
Most of your financial institutions' default CSV filenames are
different and can be recognised by a glob pattern.  So you can put a
rule like `source Checking1*.csv` in foo-checking.csv.rules,
periodically download CSV from Foo's website accepting your browser's
defaults, and then run `hledger import foo-checking.csv.rules` to import
any new transactions. The next time, if you have done no cleanup, your
browser will probably save it as something like Checking1-2.csv, and
hledger will still see that because of the * wild card. You can choose
whether to delete CSVs after import, or keep them for a while as
temporary backups, or archive them somewhere.
2023-05-19 09:09:21 -10:00
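The `source FILE` lookup rules listed in commit 029b59093b can be summarised as a small function. The sketch below is an interpretation of that commit message, not hledger's implementation. `resolve_source` and its `downloads_dir` parameter are hypothetical, and the handling of absolute paths and `~` expansion is an assumption the message does not cover.

```python
import glob
import os
from pathlib import Path
from typing import Optional


def resolve_source(rules_file: str, spec: str,
                   downloads_dir: Optional[str] = None) -> Optional[Path]:
    """Return the data file a `source FILE` rule would select.

    None means no file was found, which the caller treats as empty data
    rather than as an error.
    """
    downloads = Path(downloads_dir) if downloads_dir else Path.home() / "Downloads"
    spec = os.path.expanduser(spec)  # assumption: "~" is expanded
    if os.path.isabs(spec):
        pattern = spec  # assumption: absolute paths are used as given
    elif "/" not in spec and os.sep not in spec:
        # A bare file name is looked up in the downloads directory.
        pattern = str(downloads / spec)
    else:
        # A relative path is relative to the rules file's directory.
        pattern = str(Path(rules_file).parent / spec)
    # A pattern without wildcards matches at most the file itself.
    matches = [p for p in glob.glob(pattern) if os.path.isfile(p)]
    if not matches:
        return None
    # Among several glob matches, take the most recently modified file.
    return Path(max(matches, key=os.path.getmtime))
```

With `source Checking1*.csv` in foo-checking.csv.rules and both Checking1.csv and a newer Checking1-2.csv in the downloads directory, this sketch selects Checking1-2.csv.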