;doc: regen manuals

[ci skip]
Simon Michael 2019-11-06 13:10:17 -08:00
parent d92351e21a
commit 7ecc42f142
3 changed files with 1224 additions and 566 deletions


@ -18,8 +18,8 @@ These do several things:
.IP \[bu] 2 .IP \[bu] 2
they describe the layout and format of the CSV data they describe the layout and format of the CSV data
.IP \[bu] 2 .IP \[bu] 2
they can customize the generated journal entries using a simple they can customize the generated journal entries (transactions) using a
templating language simple templating language
.IP \[bu] 2 .IP \[bu] 2
they can add refinements based on patterns in the CSV data, eg they can add refinements based on patterns in the CSV data, eg
categorizing transactions with more detailed account names. categorizing transactions with more detailed account names.
@ -44,70 +44,142 @@ skip 1
\f[R] \f[R]
.fi .fi
.PP .PP
A more complete example: More examples in the EXAMPLES section below.
.SH CSV RULES
.PP
The following kinds of rule can appear in the rules file, in any order
(except for \f[C]end\f[R] which can appear only inside a conditional
block).
Blank lines and lines beginning with \f[C]#\f[R] or \f[C];\f[R] are
ignored.
.SS \f[C]skip\f[R]
.IP .IP
.nf .nf
\f[C] \f[C]
# hledger CSV rules for amazon.com order history skip N
# sample:
# \[dq]Date\[dq],\[dq]Type\[dq],\[dq]To/From\[dq],\[dq]Name\[dq],\[dq]Status\[dq],\[dq]Amount\[dq],\[dq]Fees\[dq],\[dq]Transaction ID\[dq]
# \[dq]Jul 29, 2012\[dq],\[dq]Payment\[dq],\[dq]To\[dq],\[dq]Adapteva, Inc.\[dq],\[dq]Completed\[dq],\[dq]$25.00\[dq],\[dq]$0.00\[dq],\[dq]17LA58JSK6PRD4HDGLNJQPI1PB9N8DKPVHL\[dq]
# skip one header line
skip 1
# name the csv fields (and assign the transaction\[aq]s date, amount and code)
fields date, _, toorfrom, name, amzstatus, amount, fees, code
# how to parse the date
date-format %b %-d, %Y
# combine two fields to make the description
description %toorfrom %name
# save these fields as tags
comment status:%amzstatus, fees:%fees
# set the base account for all transactions
account1 assets:amazon
# flip the sign on the amount
amount -%amount
\f[R] \f[R]
.fi .fi
.PP .PP
For more examples, see Convert CSV files. The word \[dq]skip\[dq] followed by a number (or no number, meaning 1)
.SH CSV RULES tells hledger to ignore this many non-empty lines preceding the CSV
.PP data.
The following seven kinds of rule can appear in the rules file, in any
order.
Blank lines and lines beginning with \f[C]#\f[R] or \f[C];\f[R] are
ignored.
.SS skip
.PP
\f[C]skip\f[R]\f[I]\f[CI]N\f[I]\f[R]
.PP
Skip this many non-empty lines preceding the CSV data.
(Empty/blank lines are skipped automatically.) You\[aq]ll need this (Empty/blank lines are skipped automatically.) You\[aq]ll need this
whenever your CSV data contains header lines. whenever your CSV data contains header lines.
.PP
It also has a second purpose: it can be used to ignore certain CSV
records, see conditional blocks below.
.SS \f[C]fields\f[R]
.IP
.nf
\f[C]
fields FIELDNAME1, FIELDNAME2, ...
\f[R]
.fi
.PP
A fields list (\[dq]fields\[dq] followed by one or more comma-separated
field names) is the quick way to assign CSV field values to hledger
fields.
It (a) names the CSV fields, in order (names may not contain whitespace;
fields you don\[aq]t care about can be left unnamed), and (b) assigns
them to hledger fields if you use standard hledger field names.
Here\[aq]s an example:
.IP
.nf
\f[C]
# use the 1st, 2nd and 4th CSV fields as the transaction\[aq]s date, description and amount,
# ignore the 3rd, 5th and 6th fields,
# and name the 7th and 8th fields for later reference:
# 1 2 3 4 5 6 7 8
fields date, description, , amount1, , , somefield, anotherfield
\f[R]
.fi
.PP
Here are the standard hledger field names:
.SS Transaction fields
.PP
\f[C]date\f[R], \f[C]date2\f[R], \f[C]status\f[R], \f[C]code\f[R],
\f[C]description\f[R], \f[C]comment\f[R] can be used to form the
transaction\[aq]s first line.
Only \f[C]date\f[R] is required.
(See also date-format below.)
.SS Posting fields
.PP
\f[C]accountN\f[R], where N is 1 to 9, sets the Nth posting\[aq]s
account name.
Most often there are two postings, so you\[aq]ll want to set
\f[C]account1\f[R] and \f[C]account2\f[R].
.PP
A number of field/pseudo-field names are available for setting posting
amounts:
.IP \[bu] 2
\f[C]amountN\f[R] sets posting N\[aq]s amount
.IP \[bu] 2
\f[C]amountN-in\f[R] and \f[C]amountN-out\f[R] can be used instead, if
the CSV has separate fields for debits and credits
.IP \[bu] 2
\f[C]currencyN\f[R] sets a currency symbol to be left-prefixed to the
amount, useful if the CSV provides that as a separate field
.IP \[bu] 2
\f[C]balanceN\f[R] sets a (separate) balance assertion amount (or when
no posting amount is set, a balance assignment)
.PP
If you write these with no number (\f[C]amount\f[R],
\f[C]amount-in\f[R], \f[C]amount-out\f[R], \f[C]currency\f[R],
\f[C]balance\f[R]), it means posting 1.
Also, if you set an amount for posting 1 only, a second posting that
balances the transaction will be generated automatically.
This helps support CSV rules created before hledger 1.16.
.PP
Finally, \f[C]commentN\f[R] sets a comment on the Nth posting.
Comments can of course contain tags.
.SS \f[C](field assignment)\f[R]
.IP
.nf
\f[C]
HLEDGERFIELDNAME FIELDVALUE
\f[R]
.fi
.PP
Instead of or in addition to a fields list, you can assign a value to a
hledger field by writing its name (any of the standard names above)
followed by a text value.
The value may contain interpolated CSV fields, referenced by their
1-based position in the CSV record (\f[C]%N\f[R]), or by the name they
were given in the fields list (\f[C]%CSVFIELDNAME\f[R]).
Eg: Eg:
.IP .IP
.nf .nf
\f[C] \f[C]
# ignore the first CSV line # set the amount to the 4th CSV field, with \[dq] USD\[dq] appended
skip 1 amount %4 USD
\f[R]
.fi
.IP
.nf
\f[C]
# combine three fields to make a comment, containing note: and date: tags
comment note: %somefield - %anotherfield, date: %1
\f[R] \f[R]
.fi .fi
.SS date-format
.PP .PP
\f[C]date-format\f[R]\f[I]\f[CI]DATEFMT\f[I]\f[R] Interpolation strips any outer whitespace, so a CSV value like
\f[C]\[dq] 1 \[dq]\f[R] becomes \f[C]1\f[R] when interpolated (#1051).
Note you can only interpolate CSV fields, not the hledger fields being
assigned to; for more on this, see TIPS.
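The interpolation described above can be pictured with a small Python sketch. This is not hledger's implementation: the `interpolate` function, the sample record and the decision to leave unknown references untouched are all assumptions made for illustration. It only shows `%N` (1-based position) and `%CSVFIELDNAME` references being replaced by whitespace-stripped CSV values.

```python
import re

def interpolate(template, record, names):
    """Replace %N (1-based position) and %NAME references with CSV values."""
    def lookup(match):
        ref = match.group(1)
        if ref.isdigit():
            index = int(ref) - 1
        elif ref in names:
            index = names.index(ref)
        else:
            # Unknown reference: left untouched (a simplification of this sketch).
            return match.group(0)
        if 0 <= index < len(record):
            # Outer whitespace is stripped, as the text above describes.
            return record[index].strip()
        return match.group(0)
    return re.sub(r"%([A-Za-z0-9_-]+)", lookup, template)

# Hypothetical CSV record and the names a fields list might give it.
record = ["2019-11-06", "coffee", "x", " 4.50 ", "", "", "memo", "ref42"]
names = ["date", "description", "", "amount1", "", "", "somefield", "anotherfield"]

print(interpolate("%4 USD", record, names))
print(interpolate("note: %somefield - %anotherfield, date: %1", record, names))
```

The two calls mirror the `amount %4 USD` and `comment note: ...` assignments shown above.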
.SS \f[C]date-format\f[R]
.IP
.nf
\f[C]
date-format DATEFMT
\f[R]
.fi
.PP .PP
When your CSV date fields are not formatted like \f[C]YYYY/MM/DD\f[R] This is a helper for the \f[C]date\f[R] (and \f[C]date2\f[R]) fields.
(or \f[C]YYYY-MM-DD\f[R] or \f[C]YYYY.MM.DD\f[R]), you\[aq]ll need to If your CSV dates are not formatted like \f[C]YYYY-MM-DD\f[R],
specify the format. \f[C]YYYY/MM/DD\f[R] or \f[C]YYYY.MM.DD\f[R], you\[aq]ll need to specify
DATEFMT is a strptime-like date parsing pattern, which must parse the the format by writing \[dq]date-format\[dq] followed by a strptime-like
date field values completely. date parsing pattern, which must parse the date field values completely.
Examples: Examples:
.IP .IP
.nf .nf
@ -119,7 +191,7 @@ date-format %m/%d/%Y
.IP .IP
.nf .nf
\f[C] \f[C]
# for dates like \[dq]6/11/2013\[dq] (note the - to make leading zeros optional): # for dates like \[dq]6/11/2013\[dq]. The - allows leading zeros to be optional.
date-format %-d/%-m/%Y date-format %-d/%-m/%Y
\f[R] \f[R]
.fi .fi
@ -137,90 +209,47 @@ date-format %Y-%h-%d
date-format %-m/%-d/%Y %l:%M %p date-format %-m/%-d/%Y %l:%M %p
\f[R] \f[R]
.fi .fi
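The requirement that the pattern "must parse the date field values completely" can be illustrated with Python's own strptime. This is only an analogy: Python's parser is a different implementation from the one hledger uses and directive support differs between them, so the sketch uses just the `%m/%d/%Y` pattern from the examples above. The helper name `parse_csv_date` is hypothetical.

```python
from datetime import datetime

def parse_csv_date(value, fmt):
    """Parse value with fmt; return ISO date text, or None if fmt does not match it fully."""
    try:
        return datetime.strptime(value, fmt).date().isoformat()
    except ValueError:
        # Raised both for mismatches and for leftover, unparsed text.
        return None

print(parse_csv_date("11/06/2013", "%m/%d/%Y"))        # whole value consumed
print(parse_csv_date("11/06/2013 10:30", "%m/%d/%Y"))  # trailing time is not covered
```

The second call fails because the pattern leaves the time of day unparsed, which is the situation a fuller pattern such as the `%l:%M %p` example is meant to handle.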
.SS field list .SS \f[C]if\f[R]
.PP
\f[C]fields\f[R]\f[I]\f[CI]FIELDNAME1\f[I]\f[R],
\f[I]\f[CI]FIELDNAME2\f[I]\f[R]...
.PP
This (a) names the CSV fields, in order (names may not contain
whitespace; uninteresting names may be left blank), and (b) assigns them
to journal entry fields if you use any of these standard field names:
\f[C]date\f[R], \f[C]date2\f[R], \f[C]status\f[R], \f[C]code\f[R],
\f[C]description\f[R], \f[C]comment\f[R], \f[C]account1\f[R],
\f[C]account2\f[R], \f[C]amount\f[R], \f[C]amount-in\f[R],
\f[C]amount-out\f[R], \f[C]currency\f[R], \f[C]balance\f[R],
\f[C]balance1\f[R], \f[C]balance2\f[R].
Eg:
.IP .IP
.nf .nf
\f[C] \f[C]
# use the 1st, 2nd and 4th CSV fields as the entry\[aq]s date, description and amount, if PATTERN
# and give the 7th and 8th fields meaningful names for later reference: RULE
#
# CSV field: if
# 1 2 3 4 5 6 7 8 PATTERN
# entry field: PATTERN
fields date, description, , amount, , , somefield, anotherfield PATTERN
\f[R] RULE
.fi RULE
.SS field assignment
.PP
\f[I]\f[CI]ENTRYFIELDNAME\f[I]\f[R] \f[I]\f[CI]FIELDVALUE\f[I]\f[R]
.PP
This sets a journal entry field (one of the standard names above) to the
given text value, which can include CSV field values interpolated by
name (\f[C]%CSVFIELDNAME\f[R]) or 1-based position (\f[C]%N\f[R]).
Eg:
.IP
.nf
\f[C]
# set the amount to the 4th CSV field with \[dq]USD \[dq] prepended
amount USD %4
\f[R]
.fi
.IP
.nf
\f[C]
# combine three fields to make a comment (containing two tags)
comment note: %somefield - %anotherfield, date: %1
\f[R] \f[R]
.fi .fi
.PP .PP
Field assignments can be used instead of or in addition to a field list. Conditional blocks apply one or more rules to CSV records which are
matched by any of the PATTERNs.
This allows transactions to be customised or categorised based on
patterns in the data.
.PP .PP
Note, interpolation strips any outer whitespace, so a CSV value like A single pattern can be written on the same line as the \[dq]if\[dq]; or
\f[C]\[dq] 1 \[dq]\f[R] becomes \f[C]1\f[R] when interpolated (#1051). multiple patterns can be written on the following lines, non-indented.
.SS conditional block
.PP .PP
\f[C]if\f[R] \f[I]\f[CI]PATTERN\f[I]\f[R] Patterns are case-insensitive regular expressions which try to match any
.PD 0 part of the whole CSV record.
.P It\[aq]s not yet possible to match within a specific field.
.PD Note the CSV record they see is close but not identical to the one in
\ \ \ \ \f[I]\f[CI]FIELDASSIGNMENTS\f[I]\f[R]... the CSV file; eg double quotes are removed, and the separator character
becomes comma.
.PP .PP
\f[C]if\f[R] After the patterns, there should be one or more rules to apply, all
.PD 0 indented by at least one space.
.P Three kinds of rule are allowed in conditional blocks:
.PD .IP \[bu] 2
\f[I]\f[CI]PATTERN\f[I]\f[R] field assignments (to set a field\[aq]s value)
.PD 0 .IP \[bu] 2
.P skip (to skip the matched CSV record)
.PD .IP \[bu] 2
\f[I]\f[CI]PATTERN\f[I]\f[R]... end (to skip all remaining CSV records).
.PD 0
.P
.PD
\ \ \ \ \f[I]\f[CI]FIELDASSIGNMENTS\f[I]\f[R]...
.PP .PP
This applies one or more field assignments, only to those CSV records
matched by one of the PATTERNs.
The patterns are case-insensitive regular expressions which match
anywhere within the whole CSV record (it\[aq]s not yet possible to match
within a specific field).
When there are multiple patterns they can be written on separate lines,
unindented.
The field assignments are on separate lines indented by at least one
space.
Examples: Examples:
.IP .IP
.nf .nf
@ -242,112 +271,319 @@ banking thru software
comment XXX deductible ? check it comment XXX deductible ? check it
\f[R] \f[R]
.fi .fi
.SS include .SS \f[C]end\f[R]
.PP .PP
\f[C]include\f[R]\f[I]\f[CI]RULESFILE\f[I]\f[R] As mentioned above, this rule can be used inside conditional blocks
.PP (only) to cause hledger to stop reading CSV records and proceed with
Include another rules file at this point. command execution.
\f[C]RULESFILE\f[R] is either an absolute file path or a path relative
to the current file\[aq]s directory.
Eg: Eg:
.IP .IP
.nf .nf
\f[C] \f[C]
# rules reused with several CSV files # ignore everything following the first empty record
include common.rules if ,,,,
end
\f[R]
.fi
.SS \f[C]include\f[R]
.IP
.nf
\f[C]
include RULESFILE
\f[R] \f[R]
.fi .fi
.SS newest-first
.PP .PP
\f[C]newest-first\f[R] Include another CSV rules file at this point, as if it were written
inline.
\f[C]RULESFILE\f[R] is an absolute file path or a path relative to the
current file\[aq]s directory.
.PP .PP
Consider adding this rule if all of the following are true: you might be This can be useful eg for reusing common rules in several rules files:
processing just one day of data, your CSV records are in reverse .IP
chronological order (newest first), and you care about preserving the .nf
order of same-day transactions. \f[C]
It usually isn\[aq]t needed, because hledger autodetects the CSV order, # someaccount.csv.rules
but when all CSV records have the same date it will assume they are
oldest first. ## someaccount-specific rules
.SH CSV TIPS fields date,description,amount
.SS CSV ordering account1 some:account
account2 some:misc
## common rules
include categorisation.rules
\f[R]
.fi
.SS \f[C]newest-first\f[R]
.PP .PP
The generated journal entries will be sorted by date. hledger always sorts the generated transactions by date.
The order of same-day entries will be preserved (except in the special Transactions on the same date should appear in the same order as their
case where you might need \f[C]newest-first\f[R], see above). CSV records, as hledger can usually auto-detect whether the CSV\[aq]s
.SS CSV accounts normal order is oldest first or newest first.
.PP But if all of the following are true:
Each journal entry will have two postings, to \f[C]account1\f[R] and
\f[C]account2\f[R] respectively.
It\[aq]s not yet possible to generate entries with more than two
postings.
It\[aq]s conventional and recommended to use \f[C]account1\f[R] for the
account whose CSV we are reading.
.SS CSV amounts
.PP
A transaction amount must be set, in one of these ways:
.IP \[bu] 2 .IP \[bu] 2
with an \f[C]amount\f[R] field assignment, which sets the first the CSV might sometimes contain just one day of data (all records having
posting\[aq]s amount the same date)
.IP \[bu] 2 .IP \[bu] 2
(When the CSV has debit and credit amounts in separate fields:) the CSV records are normally in reverse chronological order (newest
.PD 0 first)
.P
.PD
with field assignments for the \f[C]amount-in\f[R] and
\f[C]amount-out\f[R] pseudo fields (both of them).
Whichever one has a value will be used, with appropriate sign.
If both contain a value, it might not work so well.
.IP \[bu] 2 .IP \[bu] 2
or implicitly by means of a balance assignment (see below). and you care about preserving the order of same-day transactions
.PP
you should add the \f[C]newest-first\f[R] rule as a hint.
Eg:
.IP
.nf
\f[C]
# tell hledger explicitly that the CSV is normally newest-first
newest-first
\f[R]
.fi
.SH EXAMPLES
.PP
A more complete example, generating three-posting transactions:
.IP
.nf
\f[C]
# hledger CSV rules for amazon.com order history
# sample:
# \[dq]Date\[dq],\[dq]Type\[dq],\[dq]To/From\[dq],\[dq]Name\[dq],\[dq]Status\[dq],\[dq]Amount\[dq],\[dq]Fees\[dq],\[dq]Transaction ID\[dq]
# \[dq]Jul 29, 2012\[dq],\[dq]Payment\[dq],\[dq]To\[dq],\[dq]Adapteva, Inc.\[dq],\[dq]Completed\[dq],\[dq]$25.00\[dq],\[dq]$0.00\[dq],\[dq]17LA58JSK6PRD4HDGLNJQPI1PB9N8DKPVHL\[dq]
# skip one header line
skip 1
# name the csv fields (and assign the transaction\[aq]s date, amount and code)
fields date, _, toorfrom, name, amzstatus, amount1, fees, code
# how to parse the date
date-format %b %-d, %Y
# combine two fields to make the description
description %toorfrom %name
# save these fields as tags
comment status:%amzstatus
# set the base account for all transactions
account1 assets:amazon
# flip the sign on the amount
amount -%amount
# Put fees in a separate posting
amount3 %fees
comment3 fees
\f[R]
.fi
.PP
For more examples, see Convert CSV files.
.SH TIPS
.SS Reading multiple CSV files
.PP
You can read multiple CSV files at once using multiple \f[C]-f\f[R]
arguments on the command line.
hledger will look for a correspondingly-named rules file for each CSV
file.
If you use the \f[C]--rules-file\f[R] option, that rules file will be
used for all the CSV files.
.SS Deduplicating, importing
.PP
When you download a CSV file repeatedly, eg to get your latest bank
transactions, the new file may contain some of the same records as the
old one.
The print --new command is one simple way to detect just the new
transactions.
Or better still, the import command appends those new transactions to
your main journal.
This is the easiest way to import CSV data.
Eg, after downloading your latest CSV files:
.IP
.nf
\f[C]
$ hledger import *.csv [--dry]
\f[R]
.fi
.SS Other import methods
.PP
A number of other tools and workflows, hledger-specific and otherwise,
exist for converting, deduplicating, classifying and managing CSV data.
See:
.IP \[bu] 2
https://hledger.org -> sidebar -> real world setups
.IP \[bu] 2
https://plaintextaccounting.org -> data import/conversion
.SS Valid CSV
.PP
hledger accepts CSV conforming to RFC 4180.
Some things to note when values are enclosed in quotes:
.IP \[bu] 2
you must use double quotes (not single quotes)
.IP \[bu] 2
spaces outside the quotes are not allowed
.SS Other separator characters
.PP
With the \f[C]--separator \[aq]CHAR\[aq]\f[R] option, hledger will
expect the separator to be CHAR instead of a comma.
Ie it will read other \[dq]Character Separated Values\[dq] formats, such
as TSV (Tab Separated Values).
Note: on the command line, use a real tab character in quotes, not Eg:
.IP
.nf
\f[C]
$ hledger -f foo.tsv --separator \[aq] \[aq] print
\f[R]
.fi
.PP
(Experimental.)
.SS Setting amounts
.PP
A posting amount can be set in one of these ways:
.IP \[bu] 2
by assigning (with a fields list or field assignment) to
\f[C]amountN\f[R] (posting N\[aq]s amount) or \f[C]amount\f[R] (posting
1\[aq]s amount)
.IP \[bu] 2
by assigning to \f[C]amountN-in\f[R] and \f[C]amountN-out\f[R] (or
\f[C]amount-in\f[R] and \f[C]amount-out\f[R]).
For each CSV record, whichever of these has a non-zero value will be
used, with appropriate sign.
If both contain a non-zero value, this may not work.
.IP \[bu] 2
by assigning to \f[C]balanceN\f[R] (or \f[C]balance\f[R]) instead of the
above, setting the amount indirectly via a balance assignment.
.PP .PP
There is some special handling for sign in amounts: There is some special handling for sign in amounts:
.IP \[bu] 2 .IP \[bu] 2
If an amount value is parenthesised, it will be de-parenthesised and If an amount value is parenthesised, it will be de-parenthesised and
sign-flipped. sign-flipped.
.IP \[bu] 2 .IP \[bu] 2
If an amount value begins with a double minus sign, those will cancel If an amount value begins with a double minus sign, those cancel out and
out and be removed. are removed.
.PP .PP
If the currency/commodity symbol is provided as a separate CSV field, If the currency/commodity symbol is provided as a separate CSV field,
assign it to the \f[C]currency\f[R] pseudo field; the symbol will be you can assign it to \f[C]currency\f[R] (affects all posting amounts) or
prepended to the amount (TODO: when there is an amount). \f[C]currencyN\f[R] (affects just posting N\[aq]s amount).
Or, you can use an \f[C]amount\f[R] field assignment for more control, The symbol will be prepended to the amount.
eg: Or for more control, you can set both currency symbol and amount with a
field assignment, eg:
.IP .IP
.nf .nf
\f[C] \f[C]
fields date,description,currency,amount fields date,description,currency,amount
# add currency symbol on the right:
amount %amount %currency amount %amount %currency
\f[R] \f[R]
.fi .fi
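The special sign handling described in this section can be sketched in Python as follows. This is a reading of the two rules as stated, not hledger's code: the function name `normalize_sign` and the order in which the two rules are applied are assumptions.

```python
def normalize_sign(amount):
    """Apply the two sign rules to an amount given as text."""
    text = amount.strip()
    # Rule 1: a parenthesised value is de-parenthesised and sign-flipped.
    if text.startswith("(") and text.endswith(")"):
        text = "-" + text[1:-1].strip()
    # Rule 2: a leading double minus sign cancels out and is removed.
    if text.startswith("--"):
        text = text[2:]
    return text

for sample in ["(25.00)", "--25.00", "(-25.00)", "25.00"]:
    print(sample, "->", normalize_sign(sample))
```

Applying the parenthesis rule first means a value like `(-25.00)` becomes `--25.00` and then `25.00`, which is one reason the double-minus rule is convenient alongside an assignment such as `amount -%amount`.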
.SS CSV balance assertions/assignments .SS Referencing other fields
.PP .PP
If the CSV includes a running balance, you can assign that to one of the In field assignments, you can interpolate only CSV fields, not hledger
pseudo fields \f[C]balance\f[R] (or \f[C]balance1\f[R]) or fields.
\f[C]balance2\f[R]. In the example below, there\[aq]s both a CSV field and a hledger field
This will generate a balance assertion (or if the amount is left empty, named amount1, but %amount1 always means the CSV field, not the hledger
a balance assignment), on the first or second posting, whenever the field:
running balance field is non-empty. .IP
(TODO: #1000) .nf
.SS Reading multiple CSV files \f[C]
# Name the third CSV field \[dq]amount1\[dq]
fields date,description,amount1
# Set hledger\[aq]s amount1 to the CSV amount1 field followed by USD
amount1 %amount1 USD
# Set comment to the CSV amount1 (not the amount1 assigned above)
comment %amount1
\f[R]
.fi
.PP .PP
You can read multiple CSV files at once using multiple \f[C]-f\f[R] Here, since there\[aq]s no CSV amount1 field, %amount1 will produce a
arguments on the command line, and hledger will look for a literal \[dq]amount1\[dq]:
correspondingly-named rules file for each. .IP
Note if you use the \f[C]--rules-file\f[R] option, this one rules file .nf
will be used for all the CSV files being read. \f[C]
.SS Valid CSV fields date,description,csvamount
amount1 %csvamount USD
# Can\[aq]t interpolate amount1 here
comment %amount1
\f[R]
.fi
.PP .PP
hledger follows RFC 4180, with the addition of a customisable separator When there are multiple field assignments to the same hledger field,
character. only the last one takes effect.
Here, comment\[aq]s value will be B, or C if \[dq]something\[dq] is
matched, but never A:
.IP
.nf
\f[C]
comment A
comment B
if something
comment C
\f[R]
.fi
.SS How CSV rules are evaluated
.PP .PP
Some things to note: Here\[aq]s how to think of CSV rules being evaluated (if you really need
.PP to).
When quoting fields, First,
.IP \[bu] 2 .IP \[bu] 2
you must use double quotes, not single quotes include - all includes are inlined, from top to bottom, depth first.
(At each include point the file is inlined and scanned for further
includes, before proceeding.)
.PP
Then \[dq]global\[dq] rules are evaluated, top to bottom.
If a rule is repeated, the last one wins:
.IP \[bu] 2 .IP \[bu] 2
spaces outside the quotes are not allowed. skip (at top level)
.IP \[bu] 2
date-format
.IP \[bu] 2
newest-first
.IP \[bu] 2
fields - names the CSV fields, optionally sets up initial assignments to
hledger fields
.PP
Then for each CSV record in turn:
.IP \[bu] 2
test all \f[C]if\f[R] blocks.
If any of them contain an \f[C]end\f[R] rule, skip all remaining CSV
records.
Otherwise if any of them contain a \f[C]skip\f[R] rule, skip that many
CSV records.
If there are multiple matched skip rules, the first one wins.
.IP \[bu] 2
collect all field assignments at top level and in matched if blocks.
When there are multiple assignments for a field, keep only the last one.
.IP \[bu] 2
compute a value for each hledger field - either the one that was
assigned to it (and interpolate the %CSVFIELDNAME references), or a
default
.IP \[bu] 2
generate a synthetic hledger transaction from these values, which
becomes part of the input to the hledger command that has been selected
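The per-record steps above can be sketched in Python. This is a simplified model, not hledger's implementation: the `evaluate_record` function and its data shapes are hypothetical, top-level assignments are assumed to come before all `if` blocks, and a matched `skip` is reported without modelling a skip count.

```python
import re

def evaluate_record(record, top_level, if_blocks):
    """Return the field assignments for one CSV record, or "skip" / "end".

    top_level: list of (field, value) assignments outside any if block.
    if_blocks: list of (patterns, rules); each rule is ("assign", field, value),
               ("skip",) or ("end",).
    """
    # Patterns see the whole record, with fields joined by commas.
    line = ",".join(record)
    matched = [rules for patterns, rules in if_blocks
               if any(re.search(p, line, re.IGNORECASE) for p in patterns)]
    kinds = [rule[0] for rules in matched for rule in rules]
    if "end" in kinds:
        return "end"
    if "skip" in kinds:
        return "skip"
    # Later assignments to the same field overwrite earlier ones.
    assignments = dict(top_level)
    for rules in matched:
        for rule in rules:
            if rule[0] == "assign":
                assignments[rule[1]] = rule[2]
    return assignments

top_level = [("comment", "A"), ("comment", "B")]
if_blocks = [
    (["something"], [("assign", "comment", "C")]),
    ([",,,,"], [("end",)]),
]

print(evaluate_record(["2019-11-06", "Something nice", "5"], top_level, if_blocks))
print(evaluate_record(["2019-11-06", "coffee", "5"], top_level, if_blocks))
print(evaluate_record(["", "", "", "", ""], top_level, if_blocks))
```

The sample data reuses the `comment A` / `comment B` / `if something` rules and the `if ,,,,` / `end` block shown earlier: the comment is never A, and an all-empty record stops processing.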
.SS Valid transactions
.PP
hledger currently does not post-process and validate transactions
generated from CSV as thoroughly as transactions read from a journal
file.
This means that if your rules are wrong, you can generate invalid
transactions.
Or, amounts may not be displayed with a canonical display style.
.PP
So when setting up or adjusting CSV rules, you should check your results
visually with the print command.
You can pipe print\[aq]s output through hledger once more to validate
and canonicalise fully.
Eg:
.IP
.nf
\f[C]
$ hledger -f some.csv print | hledger -f- print -I
\f[R]
.fi
.PP
(The -I/--ignore-assertions flag disables balance assertion checks,
usually needed when re-parsing print output.)
.SH "REPORTING BUGS" .SH "REPORTING BUGS"


@ -14,8 +14,8 @@ transaction. (To learn about _writing_ CSV, see CSV output.)
rules. These do several things: rules. These do several things:
* they describe the layout and format of the CSV data * they describe the layout and format of the CSV data
* they can customize the generated journal entries using a simple * they can customize the generated journal entries (transactions)
templating language using a simple templating language
* they can add refinements based on patterns in the CSV data, eg * they can add refinements based on patterns in the CSV data, eg
categorizing transactions with more detailed account names. categorizing transactions with more detailed account names.
@ -33,93 +33,164 @@ fields date, _, _, amount
date-format %d/%m/%Y date-format %d/%m/%Y
skip 1 skip 1
A more complete example: More examples in the EXAMPLES section below.
# hledger CSV rules for amazon.com order history
# sample:
# "Date","Type","To/From","Name","Status","Amount","Fees","Transaction ID"
# "Jul 29, 2012","Payment","To","Adapteva, Inc.","Completed","$25.00","$0.00","17LA58JSK6PRD4HDGLNJQPI1PB9N8DKPVHL"
# skip one header line
skip 1
# name the csv fields (and assign the transaction's date, amount and code)
fields date, _, toorfrom, name, amzstatus, amount, fees, code
# how to parse the date
date-format %b %-d, %Y
# combine two fields to make the description
description %toorfrom %name
# save these fields as tags
comment status:%amzstatus, fees:%fees
# set the base account for all transactions
account1 assets:amazon
# flip the sign on the amount
amount -%amount
For more examples, see Convert CSV files.
* Menu: * Menu:
* CSV RULES:: * CSV RULES::
* CSV TIPS:: * EXAMPLES::
* TIPS::
 
File: hledger_csv.info, Node: CSV RULES, Next: CSV TIPS, Prev: Top, Up: Top File: hledger_csv.info, Node: CSV RULES, Next: EXAMPLES, Prev: Top, Up: Top
1 CSV RULES 1 CSV RULES
*********** ***********
The following seven kinds of rule can appear in the rules file, in any The following kinds of rule can appear in the rules file, in any order
order. Blank lines and lines beginning with '#' or ';' are ignored. (except for 'end' which can appear only inside a conditional block).
Blank lines and lines beginning with '#' or ';' are ignored.
* Menu: * Menu:
* skip:: * skip::
* date-format:: * fields::
* field list::
* field assignment:: * field assignment::
* conditional block:: * date-format::
* if::
* end::
* include:: * include::
* newest-first:: * newest-first::
 
File: hledger_csv.info, Node: skip, Next: date-format, Up: CSV RULES File: hledger_csv.info, Node: skip, Next: fields, Up: CSV RULES
1.1 skip 1.1 'skip'
======== ==========
'skip'_'N'_ skip N
Skip this many non-empty lines preceding the CSV data. (Empty/blank The word "skip" followed by a number (or no number, meaning 1) tells
lines are skipped automatically.) You'll need this whenever your CSV hledger to ignore this many non-empty lines preceding the CSV data.
data contains header lines. Eg: (Empty/blank lines are skipped automatically.) You'll need this
whenever your CSV data contains header lines.
# ignore the first CSV line It also has a second purpose: it can be used to ignore certain CSV
skip 1 records, see conditional blocks below.
 
File: hledger_csv.info, Node: date-format, Next: field list, Prev: skip, Up: CSV RULES File: hledger_csv.info, Node: fields, Next: field assignment, Prev: skip, Up: CSV RULES
1.2 date-format 1.2 'fields'
=============== ============
'date-format'_'DATEFMT'_ fields FIELDNAME1, FIELDNAME2, ...
When your CSV date fields are not formatted like 'YYYY/MM/DD' (or A fields list ("fields" followed by one or more comma-separated field
'YYYY-MM-DD' or 'YYYY.MM.DD'), you'll need to specify the format. names) is the quick way to assign CSV field values to hledger fields.
DATEFMT is a strptime-like date parsing pattern, which must parse the It (a) names the CSV fields, in order (names may not contain whitespace;
date field values completely. Examples: fields you don't care about can be left unnamed), and (b) assigns them
to hledger fields if you use standard hledger field names. Here's an
example:
# use the 1st, 2nd and 4th CSV fields as the transaction's date, description and amount,
# ignore the 3rd, 5th and 6th fields,
# and name the 7th and 8th fields for later reference:
# 1 2 3 4 5 6 7 8
fields date, description, , amount1, , , somefield, anotherfield
Here are the standard hledger field names:
* Menu:
* Transaction fields::
* Posting fields::

File: hledger_csv.info, Node: Transaction fields, Next: Posting fields, Up: fields
1.2.1 Transaction fields
------------------------
'date', 'date2', 'status', 'code', 'description', 'comment' can be used
to form the transaction's first line. Only 'date' is required. (See
also date-format below.)

File: hledger_csv.info, Node: Posting fields, Prev: Transaction fields, Up: fields
1.2.2 Posting fields
--------------------
'accountN', where N is 1 to 9, sets the Nth posting's account name.
Most often there are two postings, so you'll want to set 'account1' and
'account2'.
A number of field/pseudo-field names are available for setting
posting amounts:
* 'amountN' sets posting N's amount
* 'amountN-in' and 'amountN-out' can be used instead, if the CSV has
separate fields for debits and credits
* 'currencyN' sets a currency symbol to be left-prefixed to the
amount, useful if the CSV provides that as a separate field
* 'balanceN' sets a (separate) balance assertion amount (or when no
posting amount is set, a balance assignment)
If you write these with no number ('amount', 'amount-in',
'amount-out', 'currency', 'balance'), it means posting 1. Also, if you
set an amount for posting 1 only, a second posting that balances the
transaction will be generated automatically. This helps support CSV
rules created before hledger 1.16.
Finally, 'commentN' sets a comment on the Nth posting. Comments can
of course contain tags.

File: hledger_csv.info, Node: field assignment, Next: date-format, Prev: fields, Up: CSV RULES
1.3 '(field assignment)'
========================
HLEDGERFIELDNAME FIELDVALUE
Instead of or in addition to a fields list, you can assign a value to
a hledger field by writing its name (any of the standard names above)
followed by a text value. The value may contain interpolated CSV
fields, referenced by their 1-based position in the CSV record ('%N'),
or by the name they were given in the fields list ('%CSVFIELDNAME').
Eg:
# set the amount to the 4th CSV field, with " USD" appended
amount %4 USD
# combine three fields to make a comment, containing note: and date: tags
comment note: %somefield - %anotherfield, date: %1
Interpolation strips any outer whitespace, so a CSV value like '" 1
"' becomes '1' when interpolated (#1051). Note you can only interpolate
CSV fields, not the hledger fields being assigned to; for more on this,
see TIPS.

File: hledger_csv.info, Node: date-format, Next: if, Prev: field assignment, Up: CSV RULES
1.4 'date-format'
=================
date-format DATEFMT
This is a helper for the 'date' (and 'date2') fields. If your CSV
dates are not formatted like 'YYYY-MM-DD', 'YYYY/MM/DD' or 'YYYY.MM.DD',
you'll need to specify the format by writing "date-format" followed by a
strptime-like date parsing pattern, which must parse the date field
values completely. Examples:
# for dates like "11/06/2013": # for dates like "11/06/2013":
date-format %m/%d/%Y date-format %m/%d/%Y
# for dates like "6/11/2013" (note the - to make leading zeros optional): # for dates like "6/11/2013". The - allows leading zeros to be optional.
date-format %-d/%-m/%Y date-format %-d/%-m/%Y
# for dates like "2013-Nov-06": # for dates like "2013-Nov-06":
@ -129,73 +200,43 @@ date-format %Y-%h-%d
date-format %-m/%-d/%Y %l:%M %p date-format %-m/%-d/%Y %l:%M %p
 
File: hledger_csv.info, Node: field list, Next: field assignment, Prev: date-format, Up: CSV RULES File: hledger_csv.info, Node: if, Next: end, Prev: date-format, Up: CSV RULES
1.3 field list 1.5 'if'
============== ========
'fields'_'FIELDNAME1'_, _'FIELDNAME2'_... if PATTERN
RULE
This (a) names the CSV fields, in order (names may not contain if
whitespace; uninteresting names may be left blank), and (b) assigns them PATTERN
to journal entry fields if you use any of these standard field names: PATTERN
'date', 'date2', 'status', 'code', 'description', 'comment', 'account1', PATTERN
'account2', 'amount', 'amount-in', 'amount-out', 'currency', 'balance', RULE
'balance1', 'balance2'. Eg: RULE
# use the 1st, 2nd and 4th CSV fields as the entry's date, description and amount, Conditional blocks apply one or more rules to CSV records which are
# and give the 7th and 8th fields meaningful names for later reference: matched by any of the PATTERNs. This allows transactions to be
# customised or categorised based on patterns in the data.
# CSV field:
# 1 2 3 4 5 6 7 8
# entry field:
fields date, description, , amount, , , somefield, anotherfield
 A single pattern can be written on the same line as the "if"; or
File: hledger_csv.info, Node: field assignment, Next: conditional block, Prev: field list, Up: CSV RULES multiple patterns can be written on the following lines, non-indented.
1.4 field assignment Patterns are case-insensitive regular expressions which try to match
==================== any part of the whole CSV record. It's not yet possible to match within
a specific field. Note the CSV record they see is close but not
identical to the one in the CSV file; eg double quotes are removed, and
the separator character becomes comma.
_'ENTRYFIELDNAME'_ _'FIELDVALUE'_ After the patterns, there should be one or more rules to apply, all
indented by at least one space. Three kinds of rule are allowed in
conditional blocks:
This sets a journal entry field (one of the standard names above) to * field assignments (to set a field's value)
the given text value, which can include CSV field values interpolated by * skip (to skip the matched CSV record)
name ('%CSVFIELDNAME') or 1-based position ('%N'). Eg: * end (to skip all remaining CSV records).
# set the amount to the 4th CSV field with "USD " prepended Examples:
amount USD %4
# combine three fields to make a comment (containing two tags)
comment note: %somefield - %anotherfield, date: %1
Field assignments can be used instead of or in addition to a field
list.
Note, interpolation strips any outer whitespace, so a CSV value like
'" 1 "' becomes '1' when interpolated (#1051).

File: hledger_csv.info, Node: conditional block, Next: include, Prev: field assignment, Up: CSV RULES
1.5 conditional block
=====================
'if' _'PATTERN'_
_'FIELDASSIGNMENTS'_...
'if'
_'PATTERN'_
_'PATTERN'_...
_'FIELDASSIGNMENTS'_...
This applies one or more field assignments, only to those CSV records
matched by one of the PATTERNs. The patterns are case-insensitive
regular expressions which match anywhere within the whole CSV record
(it's not yet possible to match within a specific field). When there
are multiple patterns they can be written on separate lines, unindented.
The field assignments are on separate lines indented by at least one
space. Examples:
# if the CSV record contains "groceries", set account2 to "expenses:groceries" # if the CSV record contains "groceries", set account2 to "expenses:groceries"
if groceries if groceries
@ -210,176 +251,369 @@ banking thru software
comment XXX deductible ? check it comment XXX deductible ? check it
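
The matching described for `if` blocks can be sketched in Python as follows. This is an illustration under stated assumptions, not hledger's code: the function name `record_matches` is hypothetical, the record is modelled as unquoted field values joined by commas (per the description of what patterns see), and Python's regular expression dialect may differ from hledger's.

```python
import re


def record_matches(patterns: list[str], fields: list[str]) -> bool:
    """Return True if any pattern matches anywhere in the comma-joined record."""
    # Assumption: patterns see the field values without double quotes,
    # joined with commas regardless of the original separator.
    record = ",".join(fields)
    return any(re.search(p, record, re.IGNORECASE) for p in patterns)


fields = ["2019-11-06", "WHOLE FOODS Groceries #123", "-42.00"]
print(record_matches(["groceries"], fields))              # True
print(record_matches(["banking thru software"], fields))  # False
```

Because the whole record is searched, a pattern can match text in any field, which is why very short patterns may match more records than intended.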
 
File: hledger_csv.info, Node: include, Next: newest-first, Prev: conditional block, Up: CSV RULES File: hledger_csv.info, Node: end, Next: include, Prev: if, Up: CSV RULES
1.6 include 1.6 'end'
=========== =========
'include'_'RULESFILE'_ As mentioned above, this rule can be used inside conditional blocks
(only) to cause hledger to stop reading CSV records and proceed with
command execution. Eg:
Include another rules file at this point. 'RULESFILE' is either an # ignore everything following the first empty record
absolute file path or a path relative to the current file's directory. if ,,,,
Eg: end
# rules reused with several CSV files 
include common.rules File: hledger_csv.info, Node: include, Next: newest-first, Prev: end, Up: CSV RULES
1.7 'include'
=============
include RULESFILE
Include another CSV rules file at this point, as if it were written
inline. 'RULESFILE' is an absolute file path or a path relative to the
current file's directory.
This can be useful eg for reusing common rules in several rules
files:
# someaccount.csv.rules
## someaccount-specific rules
fields date,description,amount
account1 some:account
account2 some:misc
## common rules
include categorisation.rules
 
File: hledger_csv.info, Node: newest-first, Prev: include, Up: CSV RULES File: hledger_csv.info, Node: newest-first, Prev: include, Up: CSV RULES
1.7 newest-first 1.8 'newest-first'
================ ==================
'newest-first' hledger always sorts the generated transactions by date. Transactions
on the same date should appear in the same order as their CSV records,
as hledger can usually auto-detect whether the CSV's normal order is
oldest first or newest first. But if all of the following are true:
Consider adding this rule if all of the following are true: you might * the CSV might sometimes contain just one day of data (all records
be processing just one day of data, your CSV records are in reverse having the same date)
chronological order (newest first), and you care about preserving the * the CSV records are normally in reverse chronological order (newest
order of same-day transactions. It usually isn't needed, because first)
hledger autodetects the CSV order, but when all CSV records have the * and you care about preserving the order of same-day transactions
same date it will assume they are oldest first.
you should add the 'newest-first' rule as a hint. Eg:
# tell hledger explicitly that the CSV is normally newest-first
newest-first
 
File: hledger_csv.info, Node: CSV TIPS, Prev: CSV RULES, Up: Top File: hledger_csv.info, Node: EXAMPLES, Next: TIPS, Prev: CSV RULES, Up: Top
2 CSV TIPS 2 EXAMPLES
********** **********
A more complete example, generating three-posting transactions:
# hledger CSV rules for amazon.com order history
# sample:
# "Date","Type","To/From","Name","Status","Amount","Fees","Transaction ID"
# "Jul 29, 2012","Payment","To","Adapteva, Inc.","Completed","$25.00","$0.00","17LA58JSK6PRD4HDGLNJQPI1PB9N8DKPVHL"
# skip one header line
skip 1
# name the csv fields (and assign the transaction's date, amount and code)
fields date, _, toorfrom, name, amzstatus, amount1, fees, code
# how to parse the date
date-format %b %-d, %Y
# combine two fields to make the description
description %toorfrom %name
# save these fields as tags
comment status:%amzstatus
# set the base account for all transactions
account1 assets:amazon
# flip the sign on the amount
amount -%amount
# Put fees in a separate posting
amount3 %fees
comment3 fees
For more examples, see Convert CSV files.

File: hledger_csv.info, Node: TIPS, Prev: EXAMPLES, Up: Top
3 TIPS
******
* Menu: * Menu:
* CSV ordering::
* CSV accounts::
* CSV amounts::
* CSV balance assertions/assignments::
* Reading multiple CSV files:: * Reading multiple CSV files::
* Deduplicating importing::
* Other import methods::
* Valid CSV:: * Valid CSV::
* Other separator characters::
* Setting amounts::
* Referencing other fields::
* How CSV rules are evaluated::
* Valid transactions::
 
File: hledger_csv.info, Node: CSV ordering, Next: CSV accounts, Up: CSV TIPS File: hledger_csv.info, Node: Reading multiple CSV files, Next: Deduplicating importing, Up: TIPS
2.1 CSV ordering 3.1 Reading multiple CSV files
================ ==============================
The generated journal entries will be sorted by date. The order of You can read multiple CSV files at once using multiple '-f' arguments on
same-day entries will be preserved (except in the special case where you the command line. hledger will look for a correspondingly-named rules
might need 'newest-first', see above). file for each CSV file. If you use the '--rules-file' option, that
rules file will be used for all the CSV files.
 
File: hledger_csv.info, Node: CSV accounts, Next: CSV amounts, Prev: CSV ordering, Up: CSV TIPS File: hledger_csv.info, Node: Deduplicating importing, Next: Other import methods, Prev: Reading multiple CSV files, Up: TIPS
2.2 CSV accounts 3.2 Deduplicating, importing
================ ============================
Each journal entry will have two postings, to 'account1' and 'account2' When you download a CSV file repeatedly, eg to get your latest bank
respectively. It's not yet possible to generate entries with more than transactions, the new file may contain some of the same records as the
two postings. It's conventional and recommended to use 'account1' for old one. The print --new command is one simple way to detect just the
the account whose CSV we are reading. new transactions. Or better still, the import command appends those new
transactions to your main journal. This is the easiest way to import
CSV data. Eg, after downloading your latest CSV files:
$ hledger import *.csv [--dry]
 
File: hledger_csv.info, Node: CSV amounts, Next: CSV balance assertions/assignments, Prev: CSV accounts, Up: CSV TIPS File: hledger_csv.info, Node: Other import methods, Next: Valid CSV, Prev: Deduplicating importing, Up: TIPS
2.3 CSV amounts 3.3 Other import methods
=============== ========================
A transaction amount must be set, in one of these ways: A number of other tools and workflows, hledger-specific and otherwise,
exist for converting, deduplicating, classifying and managing CSV data.
See:
* with an 'amount' field assignment, which sets the first posting's * https://hledger.org -> sidebar -> real world setups
amount * https://plaintextaccounting.org -> data import/conversion
* (When the CSV has debit and credit amounts in separate fields:) 
with field assignments for the 'amount-in' and 'amount-out' pseudo File: hledger_csv.info, Node: Valid CSV, Next: Other separator characters, Prev: Other import methods, Up: TIPS
fields (both of them). Whichever one has a value will be used,
with appropriate sign. If both contain a value, it might not work
so well.
* or implicitly by means of a balance assignment (see below). 3.4 Valid CSV
=============
hledger accepts CSV conforming to RFC 4180. Some things to note when
values are enclosed in quotes:
* you must use double quotes (not single quotes)
* spaces outside the quotes are not allowed

File: hledger_csv.info, Node: Other separator characters, Next: Setting amounts, Prev: Valid CSV, Up: TIPS
3.5 Other separator characters
==============================
With the '--separator 'CHAR'' option, hledger will expect the separator
to be CHAR instead of a comma. Ie it will read other "Character
Separated Values" formats, such as TSV (Tab Separated Values). Note: on
the command line, use a real tab character in quotes, not
$ hledger -f foo.tsv --separator ' ' print
(Experimental.)

File: hledger_csv.info, Node: Setting amounts, Next: Referencing other fields, Prev: Other separator characters, Up: TIPS
3.6 Setting amounts
===================
A posting amount can be set in one of these ways:
* by assigning (with a fields list or field assignment) to 'amountN'
(posting N's amount) or 'amount' (posting 1's amount)
* by assigning to 'amountN-in' and 'amountN-out' (or 'amount-in' and
'amount-out'). For each CSV record, whichever of these has a
non-zero value will be used, with appropriate sign. If both
contain a non-zero value, this may not work.
* by assigning to 'balanceN' (or 'balance') instead of the above,
setting the amount indirectly via a balance assignment.
There is some special handling for sign in amounts: There is some special handling for sign in amounts:
* If an amount value is parenthesised, it will be de-parenthesised * If an amount value is parenthesised, it will be de-parenthesised
and sign-flipped. and sign-flipped.
* If an amount value begins with a double minus sign, those will * If an amount value begins with a double minus sign, those cancel
cancel out and be removed. out and are removed.
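
The two sign rules can be sketched like this. It is a hypothetical helper (`normalise_sign`) operating on the amount text only; the order in which the two rules are applied is an assumption of this sketch, not something the manual states.

```python
def normalise_sign(amount: str) -> str:
    """Apply the parenthesis and double-minus rules to an amount string."""
    s = amount.strip()
    # Parenthesised value: remove the parentheses and flip the sign.
    if s.startswith("(") and s.endswith(")"):
        s = "-" + s[1:-1].strip()
    # Leading double minus: the two signs cancel out.
    if s.startswith("--"):
        s = s[2:]
    return s


print(normalise_sign("(25.00)"))  # -25.00
print(normalise_sign("--25.00"))  # 25.00
print(normalise_sign("25.00"))    # 25.00
```

The double-minus rule is what makes an assignment like `amount -%amount` safe when the CSV value is already negative.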
If the currency/commodity symbol is provided as a separate CSV field, If the currency/commodity symbol is provided as a separate CSV field,
assign it to the 'currency' pseudo field; the symbol will be prepended you can assign it to 'currency' (affects all posting amounts) or
to the amount (TODO: when there is an amount). Or, you can use an 'currencyN' (affects just posting N's amount). The symbol will be
'amount' field assignment for more control, eg: prepended to the amount. Or for more control, you can set both currency
symbol and amount with a field assignment, eg:
fields date,description,currency,amount fields date,description,currency,amount
# add currency symbol on the right:
amount %amount %currency amount %amount %currency
 
File: hledger_csv.info, Node: CSV balance assertions/assignments, Next: Reading multiple CSV files, Prev: CSV amounts, Up: CSV TIPS File: hledger_csv.info, Node: Referencing other fields, Next: How CSV rules are evaluated, Prev: Setting amounts, Up: TIPS
2.4 CSV balance assertions/assignments 3.7 Referencing other fields
====================================== ============================
If the CSV includes a running balance, you can assign that to one of the In field assignments, you can interpolate only CSV fields, not hledger
pseudo fields 'balance' (or 'balance1') or 'balance2'. This will fields. In the example below, there's both a CSV field and a hledger
generate a balance assertion (or if the amount is left empty, a balance field named amount1, but %amount1 always means the CSV field, not the
assignment), on the first or second posting, whenever the running hledger field:
balance field is non-empty. (TODO: #1000)
# Name the third CSV field "amount1"
fields date,description,amount1
# Set hledger's amount1 to the CSV amount1 field followed by USD
amount1 %amount1 USD
# Set comment to the CSV amount1 (not the amount1 assigned above)
comment %amount1
Here, since there's no CSV amount1 field, %amount1 will produce a
literal "amount1":
fields date,description,csvamount
amount1 %csvamount USD
# Can't interpolate amount1 here
comment %amount1
When there are multiple field assignments to the same hledger field,
only the last one takes effect. Here, comment's value will be B, or
C if "something" is matched, but never A:
comment A
comment B
if something
comment C
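
A minimal Python sketch of the interpolation behaviour described in this section. The function name `interpolate` and the reference syntax accepted by the regular expression are assumptions for illustration; the treatment of unknown names follows the wording above (a literal field name is produced), and out-of-range positions are handled the same way here purely by assumption.

```python
import re


def interpolate(template: str, names: list[str], record: list[str]) -> str:
    """Replace %NAME and %N references with whitespace-stripped CSV values."""
    # Only CSV field names from the fields list are known; unnamed fields are skipped.
    by_name = {n: v for n, v in zip(names, record) if n}

    def substitute(m: re.Match) -> str:
        ref = m.group(1)
        if ref.isdigit():
            i = int(ref) - 1  # positions are 1-based
            return record[i].strip() if 0 <= i < len(record) else ref
        if ref in by_name:
            return by_name[ref].strip()
        return ref  # not a CSV field: yields the literal name

    return re.sub(r"%([A-Za-z0-9_-]+)", substitute, template)


record = ["2019-11-06", " coffee ", "3.50"]
print(interpolate("%amount1 USD", ["date", "description", "amount1"], record))
print(interpolate("%description (%1)", ["date", "description", "amount1"], record))
print(interpolate("%amount1", ["date", "description", "csvamount"], record))
```

Note that the lookup table is built from CSV field names only, so a hledger field set by an earlier assignment can never be referenced.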
 
File: hledger_csv.info, Node: Reading multiple CSV files, Next: Valid CSV, Prev: CSV balance assertions/assignments, Up: CSV TIPS File: hledger_csv.info, Node: How CSV rules are evaluated, Next: Valid transactions, Prev: Referencing other fields, Up: TIPS
2.5 Reading multiple CSV files 3.8 How CSV rules are evaluated
============================== ===============================
You can read multiple CSV files at once using multiple '-f' arguments on Here's how to think of CSV rules being evaluated (if you really need
the command line, and hledger will look for a correspondingly-named to). First,
rules file for each. Note if you use the '--rules-file' option, this
one rules file will be used for all the CSV files being read. * include - all includes are inlined, from top to bottom, depth
first. (At each include point the file is inlined and scanned for
further includes, before proceeding.)
Then "global" rules are evaluated, top to bottom. If a rule is
repeated, the last one wins:
* skip (at top level)
* date-format
* newest-first
* fields - names the CSV fields, optionally sets up initial
assignments to hledger fields
Then for each CSV record in turn:
* test all 'if' blocks. If any of them contain an 'end' rule, skip
all remaining CSV records. Otherwise if any of them contain a
'skip' rule, skip that many CSV records. If there are multiple
matched skip rules, the first one wins.
* collect all field assignments at top level and in matched if
blocks. When there are multiple assignments for a field, keep only
the last one.
* compute a value for each hledger field - either the one that was
assigned to it (and interpolate the %CSVFIELDNAME references), or a
default
* generate a synthetic hledger transaction from these values, which
becomes part of the input to the hledger command that has been
selected
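
The "collect all field assignments, keep only the last one" step above can be sketched as follows. This is not hledger's implementation; `effective_assignments` and its data shapes are hypothetical, and the sketch assumes top-level assignments are considered before those in matched `if` blocks, an ordering the text does not spell out.

```python
import re


def effective_assignments(top_level, if_blocks, record_text):
    """Pick one value per hledger field for a single CSV record.

    top_level: list of (field, value) pairs, in file order.
    if_blocks: list of (patterns, assignments) pairs, in file order.
    record_text: the CSV record as the patterns see it.
    """
    chosen = {}
    for field, value in top_level:
        chosen[field] = value  # later assignments overwrite earlier ones
    for patterns, assignments in if_blocks:
        if any(re.search(p, record_text, re.IGNORECASE) for p in patterns):
            for field, value in assignments:
                chosen[field] = value
    return chosen


top = [("comment", "A"), ("comment", "B")]
blocks = [(["something"], [("comment", "C")])]
print(effective_assignments(top, blocks, "2019-11-06,something nice,1"))  # {'comment': 'C'}
print(effective_assignments(top, blocks, "2019-11-06,other,1"))           # {'comment': 'B'}
```

This reproduces the A/B/C example from "Referencing other fields": the result is B, or C when the pattern matches, but never A.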
 
File: hledger_csv.info, Node: Valid CSV, Prev: Reading multiple CSV files, Up: CSV TIPS File: hledger_csv.info, Node: Valid transactions, Prev: How CSV rules are evaluated, Up: TIPS
2.6 Valid CSV 3.9 Valid transactions
============= ======================
hledger follows RFC 4180, with the addition of a customisable separator hledger currently does not post-process and validate transactions
character. generated from CSV as thoroughly as transactions read from a journal
file. This means that if your rules are wrong, you can generate invalid
transactions. Or, amounts may not be displayed with a canonical display
style.
Some things to note: So when setting up or adjusting CSV rules, you should check your
results visually with the print command. You can pipe print's output
through hledger once more to validate and canonicalise fully. Eg:
When quoting fields, $ hledger -f some.csv print | hledger -f- print -I
* you must use double quotes, not single quotes (The -I/--ignore-assertions flag disables balance assertion checks,
* spaces outside the quotes are not allowed. usually needed when re-parsing print output.)
 
Tag Table: Tag Table:
Node: Top72 Node: Top72
Node: CSV RULES2167 Node: CSV RULES1428
Ref: #csv-rules2275 Ref: #csv-rules1536
Node: skip2538 Node: skip1849
Ref: #skip2632 Ref: #skip1942
Node: date-format2857 Node: fields2312
Ref: #date-format2984 Ref: #fields2434
Node: field list3534 Node: Transaction fields3239
Ref: #field-list3671 Ref: #transaction-fields3379
Node: field assignment4401 Node: Posting fields3547
Ref: #field-assignment4556 Ref: #posting-fields3679
Node: conditional block5180 Node: field assignment4729
Ref: #conditional-block5334 Ref: #field-assignment4882
Node: include6230 Node: date-format5693
Ref: #include6360 Ref: #date-format5828
Node: newest-first6591 Node: if6440
Ref: #newest-first6705 Ref: #if6544
Node: CSV TIPS7116 Node: end7915
Ref: #csv-tips7210 Ref: #end8017
Node: CSV ordering7354 Node: include8246
Ref: #csv-ordering7472 Ref: #include8366
Node: CSV accounts7653 Node: newest-first8804
Ref: #csv-accounts7791 Ref: #newest-first8922
Node: CSV amounts8045 Node: EXAMPLES9594
Ref: #csv-amounts8203 Ref: #examples9701
Node: CSV balance assertions/assignments9283 Node: TIPS10607
Ref: #csv-balance-assertionsassignments9501 Ref: #tips10688
Node: Reading multiple CSV files9822 Node: Reading multiple CSV files10931
Ref: #reading-multiple-csv-files10022 Ref: #reading-multiple-csv-files11098
Node: Valid CSV10296 Node: Deduplicating importing11358
Ref: #valid-csv10419 Ref: #deduplicating-importing11550
Node: Other import methods11991
Ref: #other-import-methods12158
Node: Valid CSV12428
Ref: #valid-csv12576
Node: Other separator characters12778
Ref: #other-separator-characters12955
Node: Setting amounts13289
Ref: #setting-amounts13459
Node: Referencing other fields14702
Ref: #referencing-other-fields14891
Node: How CSV rules are evaluated15788
Ref: #how-csv-rules-are-evaluated15986
Node: Valid transactions17266
Ref: #valid-transactions17413
 
End Tag Table End Tag Table


@ -16,8 +16,8 @@ DESCRIPTION
o they describe the layout and format of the CSV data o they describe the layout and format of the CSV data
o they can customize the generated journal entries using a simple tem- o they can customize the generated journal entries (transactions) using
plating language a simple templating language
o they can add refinements based on patterns in the CSV data, eg cate- o they can add refinements based on patterns in the CSV data, eg cate-
gorizing transactions with more detailed account names. gorizing transactions with more detailed account names.
@ -36,63 +36,109 @@ DESCRIPTION
date-format %d/%m/%Y date-format %d/%m/%Y
skip 1 skip 1
A more complete example: More examples in the EXAMPLES section below.
# hledger CSV rules for amazon.com order history
# sample:
# "Date","Type","To/From","Name","Status","Amount","Fees","Transaction ID"
# "Jul 29, 2012","Payment","To","Adapteva, Inc.","Completed","$25.00","$0.00","17LA58JSK6PRD4HDGLNJQPI1PB9N8DKPVHL"
# skip one header line
skip 1
# name the csv fields (and assign the transaction's date, amount and code)
fields date, _, toorfrom, name, amzstatus, amount, fees, code
# how to parse the date
date-format %b %-d, %Y
# combine two fields to make the description
description %toorfrom %name
# save these fields as tags
comment status:%amzstatus, fees:%fees
# set the base account for all transactions
account1 assets:amazon
# flip the sign on the amount
amount -%amount
For more examples, see Convert CSV files.
CSV RULES CSV RULES
The following seven kinds of rule can appear in the rules file, in any The following kinds of rule can appear in the rules file, in any order
order. Blank lines and lines beginning with # or ; are ignored. (except for end which can appear only inside a conditional block).
Blank lines and lines beginning with # or ; are ignored.
skip skip
skipN skip N
Skip this many non-empty lines preceding the CSV data. (Empty/blank The word "skip" followed by a number (or no number, meaning 1) tells
lines are skipped automatically.) You'll need this whenever your CSV hledger to ignore this many non-empty lines preceding the CSV data.
data contains header lines. Eg: (Empty/blank lines are skipped automatically.) You'll need this when-
ever your CSV data contains header lines.
# ignore the first CSV line It also has a second purpose: it can be used to ignore certain CSV
skip 1 records, see conditional blocks below.
fields
fields FIELDNAME1, FIELDNAME2, ...
A fields list ("fields" followed by one or more comma-separated field
names) is the quick way to assign CSV field values to hledger fields.
It (a) names the CSV fields, in order (names may not contain white-
space; fields you don't care about can be left unnamed), and (b) as-
signs them to hledger fields if you use standard hledger field names.
Here's an example:
# use the 1st, 2nd and 4th CSV fields as the transaction's date, description and amount,
# ignore the 3rd, 5th and 6th fields,
# and name the 7th and 8th fields for later reference:
# 1 2 3 4 5 6 7 8
fields date, description, , amount1, , , somefield, anotherfield
Here are the standard hledger field names:
Transaction fields
date, date2, status, code, description, comment can be used to form the
transaction's first line. Only date is required. (See also date-for-
mat below.)
Posting fields
accountN, where N is 1 to 9, sets the Nth posting's account name. Most
often there are two postings, so you'll want to set account1 and ac-
count2.
A number of field/pseudo-field names are available for setting posting
amounts:
o amountN sets posting N's amount
o amountN-in and amountN-out can be used instead, if the CSV has sepa-
rate fields for debits and credits
o currencyN sets a currency symbol to be left-prefixed to the amount,
useful if the CSV provides that as a separate field
o balanceN sets a (separate) balance assertion amount (or when no post-
ing amount is set, a balance assignment)
If you write these with no number (amount, amount-in, amount-out, cur-
rency, balance), it means posting 1. Also, if you set an amount for
posting 1 only, a second posting that balances the transaction will be
generated automatically. This helps support CSV rules created before
hledger 1.16.
Finally, commentN sets a comment on the Nth posting. Comments can of
course contain tags.
(field assignment)
HLEDGERFIELDNAME FIELDVALUE
Instead of or in addition to a fields list, you can assign a value to a
hledger field by writing its name (any of the standard names above)
followed by a text value. The value may contain interpolated CSV
fields, referenced by their 1-based position in the CSV record (%N), or
by the name they were given in the fields list (%CSVFIELDNAME). Eg:
# set the amount to the 4th CSV field, with " USD" appended
amount %4 USD
# combine three fields to make a comment, containing note: and date: tags
comment note: %somefield - %anotherfield, date: %1
Interpolation strips any outer whitespace, so a CSV value like " 1 "
becomes 1 when interpolated (#1051). Note you can only interpolate CSV
fields, not the hledger fields being assigned to; for more on this, see
TIPS.
date-format date-format
date-formatDATEFMT date-format DATEFMT
When your CSV date fields are not formatted like YYYY/MM/DD (or YYYY- This is a helper for the date (and date2) fields. If your CSV dates
MM-DD or YYYY.MM.DD), you'll need to specify the format. DATEFMT is a are not formatted like YYYY-MM-DD, YYYY/MM/DD or YYYY.MM.DD, you'll
strptime-like date parsing pattern, which must parse the date field need to specify the format by writing "date-format" followed by a strp-
values completely. Examples: time-like date parsing pattern, which must parse the date field values
completely. Examples:
# for dates like "11/06/2013": # for dates like "11/06/2013":
date-format %m/%d/%Y date-format %m/%d/%Y
# for dates like "6/11/2013" (note the - to make leading zeros optional): # for dates like "6/11/2013". The - allows leading zeros to be optional.
date-format %-d/%-m/%Y date-format %-d/%-m/%Y
# for dates like "2013-Nov-06": # for dates like "2013-Nov-06":
@ -101,59 +147,41 @@ CSV RULES
# for dates like "11/6/2013 11:32 PM": # for dates like "11/6/2013 11:32 PM":
date-format %-m/%-d/%Y %l:%M %p date-format %-m/%-d/%Y %l:%M %p
field list if
fieldsFIELDNAME1, FIELDNAME2... if PATTERN
RULE
This (a) names the CSV fields, in order (names may not contain white- if
space; uninteresting names may be left blank), and (b) assigns them to PATTERN
journal entry fields if you use any of these standard field names: PATTERN
date, date2, status, code, description, comment, account1, account2, PATTERN
amount, amount-in, amount-out, currency, balance, balance1, balance2. RULE
Eg: RULE
# use the 1st, 2nd and 4th CSV fields as the entry's date, description and amount, Conditional blocks apply one or more rules to CSV records which are
# and give the 7th and 8th fields meaningful names for later reference: matched by any of the PATTERNs. This allows transactions to be cus-
# tomised or categorised based on patterns in the data.
# CSV field:
# 1 2 3 4 5 6 7 8
# entry field:
fields date, description, , amount, , , somefield, anotherfield
field assignment A single pattern can be written on the same line as the "if"; or multi-
ENTRYFIELDNAME FIELDVALUE ple patterns can be written on the following lines, non-indented.
This sets a journal entry field (one of the standard names above) to Patterns are case-insensitive regular expressions which try to match
the given text value, which can include CSV field values interpolated any part of the whole CSV record. It's not yet possible to match
by name (%CSVFIELDNAME) or 1-based position (%N). Eg: within a specific field. Note the CSV record they see is close but not
identical to the one in the CSV file; eg double quotes are removed, and
the separator character becomes comma.
# set the amount to the 4th CSV field with "USD " prepended After the patterns, there should be one or more rules to apply, all in-
amount USD %4 dented by at least one space. Three kinds of rule are allowed in con-
ditional blocks:
# combine three fields to make a comment (containing two tags) o field assignments (to set a field's value)
comment note: %somefield - %anotherfield, date: %1
Field assignments can be used instead of or in addition to a field o skip (to skip the matched CSV record)
list.
Note, interpolation strips any outer whitespace, so a CSV value like " o end (to skip all remaining CSV records).
1 " becomes 1 when interpolated (#1051).
conditional block Examples:
if PATTERN
FIELDASSIGNMENTS...
if
PATTERN
PATTERN...
FIELDASSIGNMENTS...
This applies one or more field assignments, only to those CSV records
matched by one of the PATTERNs. The patterns are case-insensitive reg-
ular expressions which match anywhere within the whole CSV record (it's
not yet possible to match within a specific field). When there are
multiple patterns they can be written on separate lines, unindented.
The field assignments are on separate lines indented by at least one
space. Examples:
# if the CSV record contains "groceries", set account2 to "expenses:groceries" # if the CSV record contains "groceries", set account2 to "expenses:groceries"
if groceries if groceries
@ -167,90 +195,250 @@ CSV RULES
account2 expenses:business:banking account2 expenses:business:banking
comment XXX deductible ? check it comment XXX deductible ? check it
end
As mentioned above, this rule can be used inside conditional blocks
(only) to cause hledger to stop reading CSV records and proceed with
command execution. Eg:
# ignore everything following the first empty record
if ,,,,
end
include include
includeRULESFILE include RULESFILE
Include another rules file at this point. RULESFILE is either an abso- Include another CSV rules file at this point, as if it were written in-
lute file path or a path relative to the current file's directory. Eg: line. RULESFILE is an absolute file path or a path relative to the
current file's directory.
# rules reused with several CSV files This can be useful eg for reusing common rules in several rules files:
include common.rules
# someaccount.csv.rules
## someaccount-specific rules
fields date,description,amount
account1 some:account
account2 some:misc
## common rules
include categorisation.rules
newest-first newest-first
newest-first hledger always sorts the generated transactions by date. Transactions
on the same date should appear in the same order as their CSV records,
as hledger can usually auto-detect whether the CSV's normal order is
oldest first or newest first. But if all of the following are true:
Consider adding this rule if all of the following are true: you might o the CSV might sometimes contain just one day of data (all records
be processing just one day of data, your CSV records are in reverse having the same date)
chronological order (newest first), and you care about preserving the
order of same-day transactions. It usually isn't needed, because
hledger autodetects the CSV order, but when all CSV records have the
same date it will assume they are oldest first.
CSV TIPS o the CSV records are normally in reverse chronological order (newest
CSV ordering first)
The generated journal entries will be sorted by date. The order of
same-day entries will be preserved (except in the special case where
you might need newest-first, see above).
CSV accounts o and you care about preserving the order of same-day transactions
Each journal entry will have two postings, to account1 and account2 re-
spectively. It's not yet possible to generate entries with more than
two postings. It's conventional and recommended to use account1 for
the account whose CSV we are reading.
CSV amounts you should add the newest-first rule as a hint. Eg:
A transaction amount must be set, in one of these ways:
o with an amount field assignment, which sets the first posting's # tell hledger explicitly that the CSV is normally newest-first
amount newest-first
o (When the CSV has debit and credit amounts in separate fields:) EXAMPLES
with field assignments for the amount-in and amount-out pseudo fields A more complete example, generating three-posting transactions:
(both of them). Whichever one has a value will be used, with appropri-
ate sign. If both contain a value, it might not work so well.
o or implicitly by means of a balance assignment (see below). # hledger CSV rules for amazon.com order history
# sample:
# "Date","Type","To/From","Name","Status","Amount","Fees","Transaction ID"
# "Jul 29, 2012","Payment","To","Adapteva, Inc.","Completed","$25.00","$0.00","17LA58JSK6PRD4HDGLNJQPI1PB9N8DKPVHL"
# skip one header line
skip 1
# name the csv fields (and assign the transaction's date, amount and code)
fields date, _, toorfrom, name, amzstatus, amount1, fees, code
# how to parse the date
date-format %b %-d, %Y
# combine two fields to make the description
description %toorfrom %name
# save these fields as tags
comment status:%amzstatus
# set the base account for all transactions
account1 assets:amazon
# flip the sign on the amount
amount -%amount
# Put fees in a separate posting
amount3 %fees
comment3 fees
For more examples, see Convert CSV files.
TIPS
Reading multiple CSV files
You can read multiple CSV files at once using multiple -f arguments on
the command line. hledger will look for a correspondingly-named rules
file for each CSV file. If you use the --rules-file option, that rules
file will be used for all the CSV files.
Deduplicating, importing
When you download a CSV file repeatedly, eg to get your latest bank
transactions, the new file may contain some of the same records as the
old one. The print --new command is one simple way to detect just the
new transactions. Or better still, the import command appends those
new transactions to your main journal. This is the easiest way to im-
port CSV data. Eg, after downloading your latest CSV files:
$ hledger import *.csv [--dry]
Other import methods
A number of other tools and workflows, hledger-specific and otherwise,
exist for converting, deduplicating, classifying and managing CSV data.
See:
o https://hledger.org -> sidebar -> real world setups
o https://plaintextaccounting.org -> data import/conversion
Valid CSV
hledger accepts CSV conforming to RFC 4180. Some things to note when
values are enclosed in quotes:
o you must use double quotes (not single quotes)
o spaces outside the quotes are not allowed
Other separator characters
With the --separator 'CHAR' option, hledger will expect the separator
to be CHAR instead of a comma. Ie it will read other "Character Sepa-
rated Values" formats, such as TSV (Tab Separated Values). Note: on
the command line, use a real tab character in quotes, not \t. Eg:
$ hledger -f foo.tsv --separator ' ' print
(Experimental.)
Setting amounts
A posting amount can be set in one of these ways:
o by assigning (with a fields list or field assignment) to amountN
(posting N's amount) or amount (posting 1's amount)
o by assigning to amountN-in and amountN-out (or amount-in and amount-
out). For each CSV record, whichever of these has a non-zero value
will be used, with appropriate sign. If both contain a non-zero
value, this may not work.
o by assigning to balanceN (or balance) instead of the above, setting
the amount indirectly via a balance assignment.
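As an illustration of the second method, here is a minimal rules sketch for a CSV that keeps debits and credits in separate columns. The column layout and the account name are assumptions made up for this example, not taken from any real bank export:

```
# Hypothetical CSV layout: date, description, debit, credit
# skip the header line
skip 1
# name the debit column amount-out and the credit column amount-in;
# for each record, whichever one is non-zero supplies posting 1's amount
fields date, description, amount-out, amount-in
# hypothetical account whose CSV is being read
account1 assets:bank:checking
```

If a record could carry a non-zero value in both columns, this approach is not a good fit, as noted above.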
There is some special handling for sign in amounts:
o If an amount value is parenthesised, it will be de-parenthesised and
sign-flipped.
- o If an amount value begins with a double minus sign, those will cancel
- out and be removed.
o If an amount value begins with a double minus sign, those cancel out
and are removed.
- If the currency/commodity symbol is provided as a separate CSV field,
- assign it to the currency pseudo field; the symbol will be prepended to
- the amount (TODO: when there is an amount). Or, you can use an amount
- field assignment for more control, eg:
If the currency/commodity symbol is provided as a separate CSV field,
you can assign it to currency (affects all posting amounts) or currencyN
(affects just posting N's amount). The symbol will be prepended to
the amount. Or for more control, you can set both currency symbol and
amount with a field assignment, eg:
fields date,description,currency,amount
# add currency symbol on the right:
amount %amount %currency
- CSV balance assertions/assignments
- If the CSV includes a running balance, you can assign that to one of
- the pseudo fields balance (or balance1) or balance2. This will generate
- a balance assertion (or if the amount is left empty, a balance
- assignment), on the first or second posting, whenever the running balance
- field is non-empty. (TODO: #1000)
- Reading multiple CSV files
- You can read multiple CSV files at once using multiple -f arguments on
- the command line, and hledger will look for a correspondingly-named
- rules file for each. Note if you use the --rules-file option, this one
- rules file will be used for all the CSV files being read.
- Valid CSV
- hledger follows RFC 4180, with the addition of a customisable separator
- character.
- Some things to note:
- When quoting fields,
- o you must use double quotes, not single quotes
- o spaces outside the quotes are not allowed.
Referencing other fields
In field assignments, you can interpolate only CSV fields, not hledger
fields. In the example below, there's both a CSV field and a hledger
field named amount1, but %amount1 always means the CSV field, not the
hledger field:
# Name the third CSV field "amount1"
fields date,description,amount1
# Set hledger's amount1 to the CSV amount1 field followed by USD
amount1 %amount1 USD
# Set comment to the CSV amount1 (not the amount1 assigned above)
comment %amount1
Here, since there's no CSV amount1 field, %amount1 will produce a
literal "amount1":
fields date,description,csvamount
amount1 %csvamount USD
# Can't interpolate amount1 here
comment %amount1
When there are multiple field assignments to the same hledger field,
only the last one takes effect. Here, comment's value will be B, or
C if "something" is matched, but never A:
comment A
comment B
if something
comment C
How CSV rules are evaluated
Here's how to think of CSV rules being evaluated (if you really need
to). First,
o include - all includes are inlined, from top to bottom, depth first.
(At each include point the file is inlined and scanned for further
includes, before proceeding.)
Then "global" rules are evaluated, top to bottom. If a rule is re-
peated, the last one wins:
o skip (at top level)
o date-format
o newest-first
o fields - names the CSV fields, optionally sets up initial assignments
to hledger fields
Then for each CSV record in turn:
o test all if blocks. If any of them contain an end rule, skip all
remaining CSV records. Otherwise if any of them contain a skip rule,
skip that many CSV records. If there are multiple matched skip
rules, the first one wins.
o collect all field assignments at top level and in matched if blocks.
When there are multiple assignments for a field, keep only the last
one.
o compute a value for each hledger field: either the value that was
assigned to it (with any %CSVFIELDNAME references interpolated), or a
default
o generate a synthetic hledger transaction from these values, which be-
comes part of the input to the hledger command that has been selected
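To make the order above concrete, here is a small annotated rules file. The field names, patterns and account names are invented for the example, and the comments only restate the steps listed above; they are not a description of hledger's internals:

```
# global rules: evaluated first, top to bottom
skip 1
date-format %Y-%m-%d
fields date, description, amount

# top-level field assignments: collected for every CSV record
account1 assets:bank:checking
account2 expenses:unknown

# if blocks: tested against each CSV record in turn
if supermarket
 account2 expenses:food

# a later matched assignment to the same field replaces earlier ones
if refund
 account2 income:refunds
```

Under the "keep only the last one" step, a record matching both patterns would end up with account2 set to income:refunds, a record matching neither would keep expenses:unknown.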
Valid transactions
hledger currently does not post-process and validate transactions gen-
erated from CSV as thoroughly as transactions read from a journal file.
This means that if your rules are wrong, you can generate invalid
transactions. Or, amounts may not be displayed with a canonical dis-
play style.
So when setting up or adjusting CSV rules, you should check your re-
sults visually with the print command. You can pipe print's output
through hledger once more to validate and canonicalise fully. Eg:
$ hledger -f some.csv print | hledger -f- print -I
(The -I/--ignore-assertions flag disables balance assertion checks,
usually needed when re-parsing print output.)