From 35fbad37c46050eb441e11a31cd472bbfcd36c93 Mon Sep 17 00:00:00 2001
From: Simon Michael
Date: Wed, 19 Nov 2025 10:05:55 -1000
Subject: [PATCH] ;doc:csv: Regular expressions in CSV rules

---
 hledger/hledger.m4.md | 30 +++++++++++++++++++++++++++++-
 1 file changed, 29 insertions(+), 1 deletion(-)

diff --git a/hledger/hledger.m4.md b/hledger/hledger.m4.md
index bd04420cc..4f49c893b 100644
--- a/hledger/hledger.m4.md
+++ b/hledger/hledger.m4.md
@@ -4002,7 +4002,7 @@ Eg `! whole foods`, `! %3 whole foods`, `!%description whole foods` will match i

 The pattern is, as usual in hledger, a POSIX extended regular expression
 that also supports GNU word boundaries (`\b`, `\B`, `\<`, `\>`) and nothing else.
-If you have trouble with it, see "Regular expressions" in the hledger manual ().
+For more details and tips, see [Regular expressions in CSV rules](#regular-expressions-in-csv-rules) below.

 ### Multiple matchers

@@ -4273,6 +4273,34 @@ data. See:
 -
 - -> data import/conversion

+### Regular expressions in CSV rules
+
+Regular expressions in `if` conditions (AKA matchers) are as described at -
+a POSIX extended regular expression, that also supports GNU word boundaries (`\b`, `\B`, `\<`, `\>`), and nothing else.
+
+Here are some examples that might be useful in CSV rules:
+
+- Is field "foo" truly empty ? `if %foo ^$`
+- Is it empty or containing only whitespace ? `if %foo ^ *$`
+- Is it non-empty ? `if %foo .`
+- Does it contain non-whitespace ? `if %foo [^ ]`
+
+Testing the value of numeric fields is a little harder.
+You can't use hledger queries like `amt:0` or `amt:>10` in CSV rules.
+But you can often achieve the same thing with a regular expression.
+
+Also, remember the content and layout of number fields in CSV can vary a lot.
+And can change if you switch data providers in future.
+So it's a good idea to write defensive, robust regexps for numeric fields.
+
+Here are some examples; you may need to adapt them to your data:
+
+- Does foo contain a non-zero number ? `if %foo [1-9]`
+- Is it negative ? `if %foo -`
+- Is it non-negative ? `if ! %foo -`
+- Is it >= 10 ? `if %foo [1-9][0-9]+\.|` (assuming a decimal period and no leading zeroes)
+- Is it >=10 and < 20 ? `if %foo \b1[0-9]\.|`
+
 ### Setting amounts

 Continuing from [amount field](#amount-field) above, here are more tips for amount-setting:
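
Review note: one rough way to sanity-check the numeric patterns added by this patch is to run them against a few sample field values. The sketch below does that with Python's `re` module, which is only an approximation: hledger matches with POSIX extended regular expressions, not Python's engine, although these particular patterns use only syntax the two share. The `matches` helper and the sample values are hypothetical. The trailing `|` on the last two patterns is left out here, on the assumption that it is unintended, because in Python's engine an empty alternative matches every input; how hledger treats it has not been checked.

```python
import re

# Patterns copied from the numeric examples in the patch. The last two omit
# the trailing "|" (assumption: it is not intended, since an empty
# alternative matches any input in Python's regex engine).
PATTERNS = {
    "non-zero": r"[1-9]",
    "negative": r"-",
    ">= 10": r"[1-9][0-9]+\.",
    ">= 10 and < 20": r"\b1[0-9]\.",
}


def matches(name: str, value: str) -> bool:
    """Return True if the named pattern is found anywhere in value."""
    return re.search(PATTERNS[name], value) is not None


# Hypothetical field values, assuming a decimal period and no leading zeroes.
SAMPLES = ["0.00", "5.25", "-12.50", "15.00", "120.00"]

for value in SAMPLES:
    hits = [name for name in PATTERNS if matches(name, value)]
    print(f"{value!r}: {hits}")
```

In this sketch `-12.50` also satisfies the two magnitude patterns, since they only look at digits; a sign check such as the `-` pattern would have to be combined with them if negative amounts can occur in the data.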