;doc: update embedded manuals

Simon Michael 2025-11-19 22:25:44 -10:00
parent d5ceb7bba3
commit 64b97b2658
3 changed files with 1345 additions and 1062 deletions


@@ -4712,6 +4712,59 @@ when parsing numbers (cf Amounts).
 However if any numbers in the CSV contain digit group marks, such as
 thousand\-separating commas, you should declare the decimal mark
 explicitly with this rule, to avoid misparsed numbers.
+.SS CSV fields and hledger fields
+This can be confusing, so let\[aq]s start with an overview:
+.IP \[bu] 2
+\f[B]CSV fields\f[R] are provided by your data file.
+They are named by their position in the CSV record, starting with 1.
+You can also give them a readable name.
+.IP \[bu] 2
+\f[B]hledger fields\f[R] are predefined; \f[CR]date\f[R],
+\f[CR]description\f[R], \f[CR]account1\f[R], \f[CR]amount1\f[R],
+\f[CR]account2\f[R] are some of them.
+They correspond to parts of a transaction\[aq]s journal entry, mostly.
+.IP \[bu] 2
+The CSV fields and hledger fields are the only fields you\[aq]ll be
+working with; you can\[aq]t define new fields, or variables as in a
+programming language.
+(But you could add extra CSV fields to the data in preprocessing, before
+running the rules.)
+.IP \[bu] 2
+For each CSV record, you\[aq]ll assign values to one or more of the
+hledger fields to build up a transaction (journal entry).
+Values can be static text, CSV field values from the current record, or
+a combination of these.
+.IP \[bu] 2
+For simple cases, you can give a CSV field the same name as one of the
+hledger fields, then its value will be automatically assigned to that
+hledger field.
+.IP \[bu] 2
+CSV fields can only be read, not written to.
+They\[aq]ll be on the right hand side, with a % prefix.
+Eg
+.RS 2
+.IP \[bu] 2
+testing a CSV field\[aq]s value: \f[CR]if %CSVFIELD ...\f[R]
+.IP \[bu] 2
+interpolating its value: \f[CR]HLEDGERFIELD %CSVFIELD\f[R]
+.RE
+.IP \[bu] 2
+hledger fields can only be written to, not read.
+They\[aq]ll be on the left hand side (or in a fields list), with no
+prefix.
+Eg
+.RS 2
+.IP \[bu] 2
+setting the transaction\[aq]s description to a value:
+\f[CR]description VALUE\f[R]
+.IP \[bu] 2
+setting the transaction\[aq]s description to the second CSV field\[aq]s
+value:
+.PD 0
+.P
+.PD
+\f[CR]fields date, description, amount\f[R]
+.RE
 .SS \f[CR]fields\f[R] list
 .IP
 .EX
@@ -5081,8 +5134,7 @@ The pattern is, as usual in hledger, a POSIX extended regular expression
 that also supports GNU word boundaries (\f[CR]\[rs]b\f[R],
 \f[CR]\[rs]B\f[R], \f[CR]\[rs]<\f[R], \f[CR]\[rs]>\f[R]) and nothing
 else.
-If you have trouble with it, see \[dq]Regular expressions\[dq] in the
-hledger manual (https://hledger.org/hledger.html#regular\-expressions).
+For more details and tips, see Regular expressions in CSV rules below.
 .SS Multiple matchers
 When an if block has multiple matchers, each on its own line,
 .IP \[bu] 2
@@ -5381,6 +5433,55 @@ See:
 https://hledger.org/cookbook.html#setups\-and\-workflows
 .IP \[bu] 2
 https://plaintextaccounting.org \-> data import/conversion
+.SS Regular expressions in CSV rules
+Regular expressions in \f[CR]if\f[R] conditions (AKA matchers) are POSIX
+extended regular expressions, that also support GNU word boundaries
+(\f[CR]\[rs]b\f[R], \f[CR]\[rs]B\f[R], \f[CR]\[rs]<\f[R],
+\f[CR]\[rs]>\f[R]), and nothing else.
+(For more detail, see Regular expressions.)
+.PP
+Here are some examples that might be useful in CSV rules:
+.IP \[bu] 2
+Is field \[dq]foo\[dq] truly empty ?
+\f[CR]if %foo \[ha]$\f[R]
+.IP \[bu] 2
+Is it empty or containing only whitespace ?
+\f[CR]if %foo \[ha] *$\f[R]
+.IP \[bu] 2
+Is it non\-empty ?
+\f[CR]if %foo .\f[R]
+.IP \[bu] 2
+Does it contain non\-whitespace ?
+\f[CR]if %foo [\[ha] ]\f[R]
+.PP
+Testing the value of numeric fields is a little harder.
+You can\[aq]t use hledger queries like \f[CR]amt:0\f[R] or
+\f[CR]amt:>10\f[R] in CSV rules.
+But you can often achieve the same thing with a regular expression.
+.PP
+Note the content and layout of number fields in CSV varies, and can
+change over time (eg if you switch data providers).
+So numeric regexps are always somewhat specific to your particular CSV
+data; and it\[aq]s a good idea to make them defensive and robust if you
+can.
+.PP
+Here are some examples:
+.IP \[bu] 2
+Does foo contain a non\-zero number ?
+\f[CR]if %foo [1\-9]\f[R]
+.IP \[bu] 2
+Is it negative ?
+\f[CR]if %foo \-\f[R]
+.IP \[bu] 2
+Is it non\-negative ?
+\f[CR]if ! %foo \-\f[R]
+.IP \[bu] 2
+Is it >= 10 ?
+\f[CR]if %foo [1\-9][0\-9]+\[rs].\f[R] (assuming a decimal period and no
+leading zeros)
+.IP \[bu] 2
+Is it >= 10 and < 20 ?
+\f[CR]if %foo \[rs]b1[0\-9]\[rs].\f[R]
 .SS Setting amounts
 Continuing from amount field above, here are more tips for
 amount\-setting:
@@ -9802,6 +9903,7 @@ Show journal and performance statistics.
 .IP
 .EX
 Flags:
+\-1 show a single line of output
 \-v \-\-verbose show more detailed output
 \-o \-\-output\-file=FILE write output to FILE.
 .EE
@@ -9810,26 +9912,27 @@ The stats command shows summary information for the whole journal, or a
 matched part of it.
 With a reporting interval, it shows a report for each report period.
 .PP
-The default output is fairly impersonal, though it reveals the main file
-name.
-With \f[CR]\-v/\-\-verbose\f[R], more details are shown, like file
-paths, included files, and commodity names.
+It also shows some performance statistics:
+.IP \[bu] 2
+how long the program ran for
+.IP \[bu] 2
+the number of transactions processed per second
+.IP \[bu] 2
+the peak live memory in use by the program to do its work
+.IP \[bu] 2
+the peak allocated memory as seen by the program
 .PP
-It also shows some run time statistics:
-.IP \[bu] 2
-elapsed time
-.IP \[bu] 2
-throughput: the number of transactions processed per second
-.IP \[bu] 2
-live: the peak memory in use by the program to do its work
-.IP \[bu] 2
-alloc: the peak memory allocation from the OS as seen by GHC.
-Measuring this externally, eg with GNU time, is more accurate; usually
-that will be a larger number; sometimes (with swapping?)
-smaller.
+By default, the output is reasonably discreet; it reveals the main file
+name, your activity level, and the speed of your machine.
 .PP
-The \f[CR]stats\f[R] command\[aq]s run time is similar to that of a
-balance report.
+With \f[CR]\-v/\-\-verbose\f[R], more details are shown: the full paths
+of all files, and the names of the commodities you work with.
+.PP
+With \f[CR]\-1\f[R], only one line of output is shown, in a
+machine\-friendly tab\-separated format: the program version, the main
+journal file name, and the performance stats.
+.PP
+The run time of \f[CR]stats\f[R] is similar to that of a balance report.
 .PP
 Example:
 .IP
@@ -9848,6 +9951,11 @@ Commodities : 26
 Market prices : 1000
 Runtime stats : 0.12 s elapsed, 8266 txns/s, 4 MB live, 16 MB alloc
 .EE
+.IP
+.EX
+$ hledger stats \-1 \-f examples/10ktxns\-1kaccts.journal
+1.50.99\-g0835a2485\-20251119, mac\-aarch64 10ktxns\-1kaccts.journal 0.66 s elapsed 15244 txns/s 28 MB live 86 MB alloc
+.EE
 .PP
 This command supports the \-o/\-\-output\-file option (but not
 \-O/\-\-output\-format).

File diff suppressed because it is too large


@@ -3614,6 +3614,47 @@ CSV
 should declare the decimal mark explicitly with this rule, to avoid
 misparsed numbers.
+CSV fields and hledger fields
+This can be confusing, so let's start with an overview:
+o CSV fields are provided by your data file. They are named by their
+position in the CSV record, starting with 1. You can also give them
+a readable name.
+o hledger fields are predefined; date, description, account1, amount1,
+account2 are some of them. They correspond to parts of a transac-
+tion's journal entry, mostly.
+o The CSV fields and hledger fields are the only fields you'll be work-
+ing with; you can't define new fields, or variables as in a program-
+ming language. (But you could add extra CSV fields to the data in
+preprocessing, before running the rules.)
+o For each CSV record, you'll assign values to one or more of the
+hledger fields to build up a transaction (journal entry). Values can
+be static text, CSV field values from the current record, or a combi-
+nation of these.
+o For simple cases, you can give a CSV field the same name as one of
+the hledger fields, then its value will be automatically assigned to
+that hledger field.
+o CSV fields can only be read, not written to. They'll be on the right
+hand side, with a % prefix. Eg
+o testing a CSV field's value: if %CSVFIELD ...
+o interpolating its value: HLEDGERFIELD %CSVFIELD
+o hledger fields can only be written to, not read. They'll be on the
+left hand side (or in a fields list), with no prefix. Eg
+o setting the transaction's description to a value: description VALUE
+o setting the transaction's description to the second CSV field's
+value:
+fields date, description, amount
 fields list
 fields FIELDNAME1, FIELDNAME2, ...
@@ -3920,9 +3961,8 @@ CSV
 The pattern is, as usual in hledger, a POSIX extended regular expres-
 sion that also supports GNU word boundaries (\b, \B, \<, \>) and noth-
-ing else. If you have trouble with it, see "Regular expressions" in
-the hledger manual (https://hledger.org/hledger.html#regular-expres-
-sions).
+ing else. For more details and tips, see Regular expressions in CSV
+rules below.
 Multiple matchers
 When an if block has multiple matchers, each on its own line,
@@ -4179,6 +4219,43 @@ CSV
 o https://plaintextaccounting.org -> data import/conversion
+Regular expressions in CSV rules
+Regular expressions in if conditions (AKA matchers) are POSIX extended
+regular expressions, that also support GNU word boundaries (\b, \B, \<,
+\>), and nothing else. (For more detail, see Regular expressions.)
+Here are some examples that might be useful in CSV rules:
+o Is field "foo" truly empty ? if %foo ^$
+o Is it empty or containing only whitespace ? if %foo ^ *$
+o Is it non-empty ? if %foo .
+o Does it contain non-whitespace ? if %foo [^ ]
+Testing the value of numeric fields is a little harder. You can't use
+hledger queries like amt:0 or amt:>10 in CSV rules. But you can often
+achieve the same thing with a regular expression.
+Note the content and layout of number fields in CSV varies, and can
+change over time (eg if you switch data providers). So numeric regexps
+are always somewhat specific to your particular CSV data; and it's a
+good idea to make them defensive and robust if you can.
+Here are some examples:
+o Does foo contain a non-zero number ? if %foo [1-9]
+o Is it negative ? if %foo -
+o Is it non-negative ? if ! %foo -
+o Is it >= 10 ? if %foo [1-9][0-9]+\. (assuming a decimal period and
+no leading zeros)
+o Is it >= 10 and < 20 ? if %foo \b1[0-9]\.
 Setting amounts
 Continuing from amount field above, here are more tips for amount-set-
 ting:
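One way to sanity-check matchers like the ones in this section before putting them in a rules file is to try them on sample field values. The following is a minimal Python sketch, not part of hledger. It assumes that Python's `re` module agrees with hledger's POSIX extended regexps for these simple patterns, and that a matcher is an unanchored, case-insensitive search. The helper name `matches` and the sample values are made up for illustration.

```python
import re

def matches(pattern: str, value: str) -> bool:
    """Approximate a CSV rules field matcher: unanchored, case-insensitive search.
    Assumption: Python's re agrees with hledger's regex engine for these patterns."""
    return re.search(pattern, value, re.IGNORECASE) is not None

# Patterns copied from the examples in the manual section above.
TRULY_EMPTY = r"^$"
BLANK = r"^ *$"
NON_EMPTY = r"."
NON_WHITESPACE = r"[^ ]"
NON_ZERO = r"[1-9]"
NEGATIVE = r"-"
AT_LEAST_10 = r"[1-9][0-9]+\."   # assumes a decimal period and no leading zeros
TEENS = r"\b1[0-9]\."            # >= 10 and < 20

checks = {
    "truly empty": TRULY_EMPTY,
    "blank": BLANK,
    "non-empty": NON_EMPTY,
    "non-whitespace": NON_WHITESPACE,
    "non-zero": NON_ZERO,
    "negative": NEGATIVE,
    ">= 10": AT_LEAST_10,
    "10..20": TEENS,
}

# Hypothetical field values, as they might appear in a bank's CSV.
for value in ["", "  ", "0.00", "-5.25", "12.50", "250.00"]:
    hits = [name for name, pattern in checks.items() if matches(pattern, value)]
    print(repr(value), "->", ", ".join(hits))
```

Under these assumptions the numeric patterns look only at digits, so a value such as `-12.50` also satisfies the "10..20" pattern. Combining it with the negative test is one way to make such a rule more defensive.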
@@ -7615,6 +7692,7 @@ Basic report commands
 Show journal and performance statistics.
 Flags:
+-1 show a single line of output
 -v --verbose show more detailed output
 -o --output-file=FILE write output to FILE.
@@ -7622,23 +7700,27 @@ Basic report commands
 matched part of it. With a reporting interval, it shows a report for
 each report period.
-The default output is fairly impersonal, though it reveals the main
-file name. With -v/--verbose, more details are shown, like file paths,
-included files, and commodity names.
+It also shows some performance statistics:
-It also shows some run time statistics:
-o elapsed time
-o throughput: the number of transactions processed per second
-o live: the peak memory in use by the program to do its work
-o alloc: the peak memory allocation from the OS as seen by GHC. Mea-
-suring this externally, eg with GNU time, is more accurate; usually
-that will be a larger number; sometimes (with swapping?) smaller.
+o how long the program ran for
+o the number of transactions processed per second
+o the peak live memory in use by the program to do its work
+o the peak allocated memory as seen by the program
+By default, the output is reasonably discreet; it reveals the main file
+name, your activity level, and the speed of your machine.
-The stats command's run time is similar to that of a balance report.
+With -v/--verbose, more details are shown: the full paths of all files,
+and the names of the commodities you work with.
+With -1, only one line of output is shown, in a machine-friendly
+tab-separated format: the program version, the main journal file name,
+and the performance stats.
+The run time of stats is similar to that of a balance report.
 Example:
@@ -7656,6 +7738,9 @@ Basic report commands
 Market prices : 1000
 Runtime stats : 0.12 s elapsed, 8266 txns/s, 4 MB live, 16 MB alloc
+$ hledger stats -1 -f examples/10ktxns-1kaccts.journal
+1.50.99-g0835a2485-20251119, mac-aarch64 10ktxns-1kaccts.journal 0.66 s elapsed 15244 txns/s 28 MB live 86 MB alloc
 This command supports the -o/--output-file option (but not -O/--out-
 put-format).