;doc: regen manuals

[ci skip]
Simon Michael 2019-11-06 13:10:17 -08:00
parent d92351e21a
commit 7ecc42f142
3 changed files with 1224 additions and 566 deletions

@ -18,8 +18,8 @@ These do several things:
.IP \[bu] 2
they describe the layout and format of the CSV data
.IP \[bu] 2
they can customize the generated journal entries using a simple
templating language
they can customize the generated journal entries (transactions) using a
simple templating language
.IP \[bu] 2
they can add refinements based on patterns in the CSV data, eg
categorizing transactions with more detailed account names.
@ -44,70 +44,142 @@ skip 1
\f[R]
.fi
.PP
A more complete example:
More examples in the EXAMPLES section below.
.SH CSV RULES
.PP
The following kinds of rule can appear in the rules file, in any order
(except for \f[C]end\f[R] which can appear only inside a conditional
block).
Blank lines and lines beginning with \f[C]#\f[R] or \f[C];\f[R] are
ignored.
.SS \f[C]skip\f[R]
.IP
.nf
\f[C]
# hledger CSV rules for amazon.com order history
# sample:
# \[dq]Date\[dq],\[dq]Type\[dq],\[dq]To/From\[dq],\[dq]Name\[dq],\[dq]Status\[dq],\[dq]Amount\[dq],\[dq]Fees\[dq],\[dq]Transaction ID\[dq]
# \[dq]Jul 29, 2012\[dq],\[dq]Payment\[dq],\[dq]To\[dq],\[dq]Adapteva, Inc.\[dq],\[dq]Completed\[dq],\[dq]$25.00\[dq],\[dq]$0.00\[dq],\[dq]17LA58JSK6PRD4HDGLNJQPI1PB9N8DKPVHL\[dq]
# skip one header line
skip 1
# name the csv fields (and assign the transaction\[aq]s date, amount and code)
fields date, _, toorfrom, name, amzstatus, amount, fees, code
# how to parse the date
date-format %b %-d, %Y
# combine two fields to make the description
description %toorfrom %name
# save these fields as tags
comment status:%amzstatus, fees:%fees
# set the base account for all transactions
account1 assets:amazon
# flip the sign on the amount
amount -%amount
skip N
\f[R]
.fi
.PP
For more examples, see Convert CSV files.
.SH CSV RULES
.PP
The following seven kinds of rule can appear in the rules file, in any
order.
Blank lines and lines beginning with \f[C]#\f[R] or \f[C];\f[R] are
ignored.
.SS skip
.PP
\f[C]skip\f[R]\f[I]\f[CI]N\f[I]\f[R]
.PP
Skip this many non-empty lines preceding the CSV data.
The word \[dq]skip\[dq] followed by a number (or no number, meaning 1)
tells hledger to ignore this many non-empty lines preceding the CSV
data.
(Empty/blank lines are skipped automatically.) You\[aq]ll need this
whenever your CSV data contains header lines.
.PP
It also has a second purpose: it can be used to ignore certain CSV
records, see conditional blocks below.
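.PP
As a minimal sketch, assuming a hypothetical export with two non-empty
header lines before the CSV data, the rule could be written like this:
.IP
.nf
\f[C]
# ignore the two header lines preceding the CSV data
skip 2
\f[R]
.fi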
.SS \f[C]fields\f[R]
.IP
.nf
\f[C]
fields FIELDNAME1, FIELDNAME2, ...
\f[R]
.fi
.PP
A fields list (\[dq]fields\[dq] followed by one or more comma-separated
field names) is the quick way to assign CSV field values to hledger
fields.
It (a) names the CSV fields, in order (names may not contain whitespace;
fields you don\[aq]t care about can be left unnamed), and (b) assigns
them to hledger fields if you use standard hledger field names.
Here\[aq]s an example:
.IP
.nf
\f[C]
# use the 1st, 2nd and 4th CSV fields as the transaction\[aq]s date, description and amount,
# ignore the 3rd, 5th and 6th fields,
# and name the 7th and 8th fields for later reference:
# 1 2 3 4 5 6 7 8
fields date, description, , amount1, , , somefield, anotherfield
\f[R]
.fi
.PP
Here are the standard hledger field names:
.SS Transaction fields
.PP
\f[C]date\f[R], \f[C]date2\f[R], \f[C]status\f[R], \f[C]code\f[R],
\f[C]description\f[R], \f[C]comment\f[R] can be used to form the
transaction\[aq]s first line.
Only \f[C]date\f[R] is required.
(See also date-format below.)
.SS Posting fields
.PP
\f[C]accountN\f[R], where N is 1 to 9, sets the Nth posting\[aq]s
account name.
Most often there are two postings, so you\[aq]ll want to set
\f[C]account1\f[R] and \f[C]account2\f[R].
.PP
A number of field/pseudo-field names are available for setting posting
amounts:
.IP \[bu] 2
\f[C]amountN\f[R] sets posting N\[aq]s amount
.IP \[bu] 2
\f[C]amountN-in\f[R] and \f[C]amountN-out\f[R] can be used instead, if
the CSV has separate fields for debits and credits
.IP \[bu] 2
\f[C]currencyN\f[R] sets a currency symbol to be left-prefixed to the
amount, useful if the CSV provides that as a separate field
.IP \[bu] 2
\f[C]balanceN\f[R] sets a (separate) balance assertion amount (or when
no posting amount is set, a balance assignment)
.PP
If you write these with no number (\f[C]amount\f[R],
\f[C]amount-in\f[R], \f[C]amount-out\f[R], \f[C]currency\f[R],
\f[C]balance\f[R]), it means posting 1.
Also, if you set an amount for posting 1 only, a second posting that
balances the transaction will be generated automatically.
This helps support CSV rules created before hledger 1.16.
.PP
Finally, \f[C]commentN\f[R] sets a comment on the Nth posting.
Comments can of course contain tags.
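.PP
To illustrate the posting field names, here is a minimal sketch for a
hypothetical CSV with separate money-out and money-in columns and a
running balance.
The column layout and the account name are assumptions, not taken from
any particular bank:
.IP
.nf
\f[C]
# assumed CSV columns: date, description, money out, money in, running balance
fields date, description, amount-out, amount-in, balance
# the account whose CSV is being read (hypothetical name)
account1 assets:bank:checking
\f[R]
.fi
.PP
Written without a number, these names all refer to posting 1.
Since only posting 1 gets an amount here, the balancing second posting
is generated automatically.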
.SS \f[C](field assignment)\f[R]
.IP
.nf
\f[C]
HLEDGERFIELDNAME FIELDVALUE
\f[R]
.fi
.PP
Instead of or in addition to a fields list, you can assign a value to a
hledger field by writing its name (any of the standard names above)
followed by a text value.
The value may contain interpolated CSV fields, referenced by their
1-based position in the CSV record (\f[C]%N\f[R]), or by the name they
were given in the fields list (\f[C]%CSVFIELDNAME\f[R]).
Eg:
.IP
.nf
\f[C]
# ignore the first CSV line
skip 1
# set the amount to the 4th CSV field, with \[dq] USD\[dq] appended
amount %4 USD
\f[R]
.fi
.IP
.nf
\f[C]
# combine three fields to make a comment, containing note: and date: tags
comment note: %somefield - %anotherfield, date: %1
\f[R]
.fi
.SS date-format
.PP
\f[C]date-format\f[R]\f[I]\f[CI]DATEFMT\f[I]\f[R]
Interpolation strips any outer whitespace, so a CSV value like
\f[C]\[dq] 1 \[dq]\f[R] becomes \f[C]1\f[R] when interpolated (#1051).
Note you can only interpolate CSV fields, not the hledger fields being
assigned to; for more on this, see TIPS.
.SS \f[C]date-format\f[R]
.IP
.nf
\f[C]
date-format DATEFMT
\f[R]
.fi
.PP
When your CSV date fields are not formatted like \f[C]YYYY/MM/DD\f[R]
(or \f[C]YYYY-MM-DD\f[R] or \f[C]YYYY.MM.DD\f[R]), you\[aq]ll need to
specify the format.
DATEFMT is a strptime-like date parsing pattern, which must parse the
date field values completely.
This is a helper for the \f[C]date\f[R] (and \f[C]date2\f[R]) fields.
If your CSV dates are not formatted like \f[C]YYYY-MM-DD\f[R],
\f[C]YYYY/MM/DD\f[R] or \f[C]YYYY.MM.DD\f[R], you\[aq]ll need to specify
the format by writing \[dq]date-format\[dq] followed by a strptime-like
date parsing pattern, which must parse the date field values completely.
Examples:
.IP
.nf
@ -119,7 +191,7 @@ date-format %m/%d/%Y
.IP
.nf
\f[C]
# for dates like \[dq]6/11/2013\[dq] (note the - to make leading zeros optional):
# for dates like \[dq]6/11/2013\[dq]. The - allows leading zeros to be optional.
date-format %-d/%-m/%Y
\f[R]
.fi
@ -137,90 +209,47 @@ date-format %Y-%h-%d
date-format %-m/%-d/%Y %l:%M %p
\f[R]
.fi
.SS field list
.PP
\f[C]fields\f[R]\f[I]\f[CI]FIELDNAME1\f[I]\f[R],
\f[I]\f[CI]FIELDNAME2\f[I]\f[R]...
.PP
This (a) names the CSV fields, in order (names may not contain
whitespace; uninteresting names may be left blank), and (b) assigns them
to journal entry fields if you use any of these standard field names:
\f[C]date\f[R], \f[C]date2\f[R], \f[C]status\f[R], \f[C]code\f[R],
\f[C]description\f[R], \f[C]comment\f[R], \f[C]account1\f[R],
\f[C]account2\f[R], \f[C]amount\f[R], \f[C]amount-in\f[R],
\f[C]amount-out\f[R], \f[C]currency\f[R], \f[C]balance\f[R],
\f[C]balance1\f[R], \f[C]balance2\f[R].
Eg:
.SS \f[C]if\f[R]
.IP
.nf
\f[C]
# use the 1st, 2nd and 4th CSV fields as the entry\[aq]s date, description and amount,
# and give the 7th and 8th fields meaningful names for later reference:
#
# CSV field:
# 1 2 3 4 5 6 7 8
# entry field:
fields date, description, , amount, , , somefield, anotherfield
\f[R]
.fi
.SS field assignment
.PP
\f[I]\f[CI]ENTRYFIELDNAME\f[I]\f[R] \f[I]\f[CI]FIELDVALUE\f[I]\f[R]
.PP
This sets a journal entry field (one of the standard names above) to the
given text value, which can include CSV field values interpolated by
name (\f[C]%CSVFIELDNAME\f[R]) or 1-based position (\f[C]%N\f[R]).
Eg:
.IP
.nf
\f[C]
# set the amount to the 4th CSV field with \[dq]USD \[dq] prepended
amount USD %4
\f[R]
.fi
.IP
.nf
\f[C]
# combine three fields to make a comment (containing two tags)
comment note: %somefield - %anotherfield, date: %1
if PATTERN
RULE
if
PATTERN
PATTERN
PATTERN
RULE
RULE
\f[R]
.fi
.PP
Field assignments can be used instead of or in addition to a field list.
Conditional blocks apply one or more rules to CSV records which are
matched by any of the PATTERNs.
This allows transactions to be customised or categorised based on
patterns in the data.
.PP
Note, interpolation strips any outer whitespace, so a CSV value like
\f[C]\[dq] 1 \[dq]\f[R] becomes \f[C]1\f[R] when interpolated (#1051).
.SS conditional block
A single pattern can be written on the same line as the \[dq]if\[dq]; or
multiple patterns can be written on the following lines, non-indented.
.PP
\f[C]if\f[R] \f[I]\f[CI]PATTERN\f[I]\f[R]
.PD 0
.P
.PD
\ \ \ \ \f[I]\f[CI]FIELDASSIGNMENTS\f[I]\f[R]...
Patterns are case-insensitive regular expressions which try to match any
part of the whole CSV record.
It\[aq]s not yet possible to match within a specific field.
Note the CSV record they see is close but not identical to the one in
the CSV file; eg double quotes are removed, and the separator character
becomes comma.
.PP
\f[C]if\f[R]
.PD 0
.P
.PD
\f[I]\f[CI]PATTERN\f[I]\f[R]
.PD 0
.P
.PD
\f[I]\f[CI]PATTERN\f[I]\f[R]...
.PD 0
.P
.PD
\ \ \ \ \f[I]\f[CI]FIELDASSIGNMENTS\f[I]\f[R]...
After the patterns, there should be one or more rules to apply, all
indented by at least one space.
Three kinds of rule are allowed in conditional blocks:
.IP \[bu] 2
field assignments (to set a field\[aq]s value)
.IP \[bu] 2
skip (to skip the matched CSV record)
.IP \[bu] 2
end (to skip all remaining CSV records).
.PP
This applies one or more field assignments, only to those CSV records
matched by one of the PATTERNs.
The patterns are case-insensitive regular expressions which match
anywhere within the whole CSV record (it\[aq]s not yet possible to match
within a specific field).
When there are multiple patterns they can be written on separate lines,
unindented.
The field assignments are on separate lines indented by at least one
space.
Examples:
.IP
.nf
@ -242,112 +271,319 @@ banking thru software
comment XXX deductible ? check it
\f[R]
.fi
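.PP
A conditional block can also contain \f[C]skip\f[R] instead of field
assignments.
A minimal sketch, assuming the CSV marks unsettled records with the
(hypothetical) word \[dq]pending\[dq]:
.IP
.nf
\f[C]
# ignore any CSV record containing \[dq]pending\[dq] (matching is case-insensitive)
if pending
 skip
\f[R]
.fi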
.SS include
.SS \f[C]end\f[R]
.PP
\f[C]include\f[R]\f[I]\f[CI]RULESFILE\f[I]\f[R]
.PP
Include another rules file at this point.
\f[C]RULESFILE\f[R] is either an absolute file path or a path relative
to the current file\[aq]s directory.
As mentioned above, this rule can be used inside conditional blocks
(only) to cause hledger to stop reading CSV records and proceed with
command execution.
Eg:
.IP
.nf
\f[C]
# rules reused with several CSV files
include common.rules
# ignore everything following the first empty record
if ,,,,
end
\f[R]
.fi
.SS \f[C]include\f[R]
.IP
.nf
\f[C]
include RULESFILE
\f[R]
.fi
.SS newest-first
.PP
\f[C]newest-first\f[R]
Include another CSV rules file at this point, as if it were written
inline.
\f[C]RULESFILE\f[R] is an absolute file path or a path relative to the
current file\[aq]s directory.
.PP
Consider adding this rule if all of the following are true: you might be
processing just one day of data, your CSV records are in reverse
chronological order (newest first), and you care about preserving the
order of same-day transactions.
It usually isn\[aq]t needed, because hledger autodetects the CSV order,
but when all CSV records have the same date it will assume they are
oldest first.
.SH CSV TIPS
.SS CSV ordering
This can be useful eg for reusing common rules in several rules files:
.IP
.nf
\f[C]
# someaccount.csv.rules
## someaccount-specific rules
fields date,description,amount
account1 some:account
account2 some:misc
## common rules
include categorisation.rules
\f[R]
.fi
.SS \f[C]newest-first\f[R]
.PP
The generated journal entries will be sorted by date.
The order of same-day entries will be preserved (except in the special
case where you might need \f[C]newest-first\f[R], see above).
.SS CSV accounts
.PP
Each journal entry will have two postings, to \f[C]account1\f[R] and
\f[C]account2\f[R] respectively.
It\[aq]s not yet possible to generate entries with more than two
postings.
It\[aq]s conventional and recommended to use \f[C]account1\f[R] for the
account whose CSV we are reading.
.SS CSV amounts
.PP
A transaction amount must be set, in one of these ways:
hledger always sorts the generated transactions by date.
Transactions on the same date should appear in the same order as their
CSV records, as hledger can usually auto-detect whether the CSV\[aq]s
normal order is oldest first or newest first.
But if all of the following are true:
.IP \[bu] 2
with an \f[C]amount\f[R] field assignment, which sets the first
posting\[aq]s amount
the CSV might sometimes contain just one day of data (all records having
the same date)
.IP \[bu] 2
(When the CSV has debit and credit amounts in separate fields:)
.PD 0
.P
.PD
with field assignments for the \f[C]amount-in\f[R] and
\f[C]amount-out\f[R] pseudo fields (both of them).
Whichever one has a value will be used, with appropriate sign.
If both contain a value, it might not work so well.
the CSV records are normally in reverse chronological order (newest
first)
.IP \[bu] 2
or implicitly by means of a balance assignment (see below).
and you care about preserving the order of same-day transactions
.PP
you should add the \f[C]newest-first\f[R] rule as a hint.
Eg:
.IP
.nf
\f[C]
# tell hledger explicitly that the CSV is normally newest-first
newest-first
\f[R]
.fi
.SH EXAMPLES
.PP
A more complete example, generating three-posting transactions:
.IP
.nf
\f[C]
# hledger CSV rules for amazon.com order history
# sample:
# \[dq]Date\[dq],\[dq]Type\[dq],\[dq]To/From\[dq],\[dq]Name\[dq],\[dq]Status\[dq],\[dq]Amount\[dq],\[dq]Fees\[dq],\[dq]Transaction ID\[dq]
# \[dq]Jul 29, 2012\[dq],\[dq]Payment\[dq],\[dq]To\[dq],\[dq]Adapteva, Inc.\[dq],\[dq]Completed\[dq],\[dq]$25.00\[dq],\[dq]$0.00\[dq],\[dq]17LA58JSK6PRD4HDGLNJQPI1PB9N8DKPVHL\[dq]
# skip one header line
skip 1
# name the csv fields (and assign the transaction\[aq]s date, amount and code)
fields date, _, toorfrom, name, amzstatus, amount1, fees, code
# how to parse the date
date-format %b %-d, %Y
# combine two fields to make the description
description %toorfrom %name
# save these fields as tags
comment status:%amzstatus
# set the base account for all transactions
account1 assets:amazon
# flip the sign on the amount
amount1 -%amount1
# Put fees in a separate posting
amount3 %fees
comment3 fees
\f[R]
.fi
.PP
For more examples, see Convert CSV files.
.SH TIPS
.SS Reading multiple CSV files
.PP
You can read multiple CSV files at once using multiple \f[C]-f\f[R]
arguments on the command line.
hledger will look for a correspondingly-named rules file for each CSV
file.
If you use the \f[C]--rules-file\f[R] option, that rules file will be
used for all the CSV files.
.SS Deduplicating, importing
.PP
When you download a CSV file repeatedly, eg to get your latest bank
transactions, the new file may contain some of the same records as the
old one.
The print --new command is one simple way to detect just the new
transactions.
Or better still, the import command appends those new transactions to
your main journal.
This is the easiest way to import CSV data.
Eg, after downloading your latest CSV files:
.IP
.nf
\f[C]
$ hledger import *.csv [--dry]
\f[R]
.fi
.SS Other import methods
.PP
A number of other tools and workflows, hledger-specific and otherwise,
exist for converting, deduplicating, classifying and managing CSV data.
See:
.IP \[bu] 2
https://hledger.org -> sidebar -> real world setups
.IP \[bu] 2
https://plaintextaccounting.org -> data import/conversion
.SS Valid CSV
.PP
hledger accepts CSV conforming to RFC 4180.
Some things to note when values are enclosed in quotes:
.IP \[bu] 2
you must use double quotes (not single quotes)
.IP \[bu] 2
spaces outside the quotes are not allowed
.SS Other separator characters
.PP
With the \f[C]--separator \[aq]CHAR\[aq]\f[R] option, hledger will
expect the separator to be CHAR instead of a comma.
Ie it will read other \[dq]Character Separated Values\[dq] formats, such
as TSV (Tab Separated Values).
Note: on the command line, use a real tab character in quotes, not
\[rs]t.
Eg:
.IP
.nf
\f[C]
$ hledger -f foo.tsv --separator \[aq] \[aq] print
\f[R]
.fi
.PP
(Experimental.)
.SS Setting amounts
.PP
A posting amount can be set in one of these ways:
.IP \[bu] 2
by assigning (with a fields list or field assignment) to
\f[C]amountN\f[R] (posting N\[aq]s amount) or \f[C]amount\f[R] (posting
1\[aq]s amount)
.IP \[bu] 2
by assigning to \f[C]amountN-in\f[R] and \f[C]amountN-out\f[R] (or
\f[C]amount-in\f[R] and \f[C]amount-out\f[R]).
For each CSV record, whichever of these has a non-zero value will be
used, with appropriate sign.
If both contain a non-zero value, this may not work.
.IP \[bu] 2
by assigning to \f[C]balanceN\f[R] (or \f[C]balance\f[R]) instead of the
above, setting the amount indirectly via a balance assignment.
.PP
There is some special handling for sign in amounts:
.IP \[bu] 2
If an amount value is parenthesised, it will be de-parenthesised and
sign-flipped.
.IP \[bu] 2
If an amount value begins with a double minus sign, those will cancel
out and be removed.
If an amount value begins with a double minus sign, those cancel out and
are removed.
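.PP
As a small sketch of the double-minus rule (the field layout here is an
assumption for illustration), a sign-flipping assignment stays correct
even when the CSV amount is already negative:
.IP
.nf
\f[C]
fields date, description, amount
# flip the sign of every CSV amount:
#   a CSV amount of 25.00 gives -25.00
#   a CSV amount of -25.00 gives --25.00, which is read as 25.00
amount -%amount
\f[R]
.fi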
.PP
If the currency/commodity symbol is provided as a separate CSV field,
assign it to the \f[C]currency\f[R] pseudo field; the symbol will be
prepended to the amount (TODO: when there is an amount).
Or, you can use an \f[C]amount\f[R] field assignment for more control,
eg:
you can assign it to \f[C]currency\f[R] (affects all posting amounts) or
\f[C]currencyN\f[R] (affects just posting N\[aq]s amount).
The symbol will be prepended to the amount.
Or for more control, you can set both currency symbol and amount with a
field assignment, eg:
.IP
.nf
\f[C]
fields date,description,currency,amount
# add currency symbol on the right:
amount %amount %currency
\f[R]
.fi
.SS CSV balance assertions/assignments
.SS Referencing other fields
.PP
If the CSV includes a running balance, you can assign that to one of the
pseudo fields \f[C]balance\f[R] (or \f[C]balance1\f[R]) or
\f[C]balance2\f[R].
This will generate a balance assertion (or if the amount is left empty,
a balance assignment), on the first or second posting, whenever the
running balance field is non-empty.
(TODO: #1000)
.SS Reading multiple CSV files
In field assignments, you can interpolate only CSV fields, not hledger
fields.
In the example below, there\[aq]s both a CSV field and a hledger field
named amount1, but %amount1 always means the CSV field, not the hledger
field:
.IP
.nf
\f[C]
# Name the third CSV field \[dq]amount1\[dq]
fields date,description,amount1
# Set hledger\[aq]s amount1 to the CSV amount1 field followed by USD
amount1 %amount1 USD
# Set comment to the CSV amount1 (not the amount1 assigned above)
comment %amount1
\f[R]
.fi
.PP
You can read multiple CSV files at once using multiple \f[C]-f\f[R]
arguments on the command line, and hledger will look for a
correspondingly-named rules file for each.
Note if you use the \f[C]--rules-file\f[R] option, this one rules file
will be used for all the CSV files being read.
.SS Valid CSV
Here, since there\[aq]s no CSV amount1 field, %amount1 will produce a
literal \[dq]amount1\[dq]:
.IP
.nf
\f[C]
fields date,description,csvamount
amount1 %csvamount USD
# Can\[aq]t interpolate amount1 here
comment %amount1
\f[R]
.fi
.PP
hledger follows RFC 4180, with the addition of a customisable separator
character.
When there are multiple field assignments to the same hledger field,
only the last one takes effect.
Here, comment\[aq]s value will be B, or C if \[dq]something\[dq] is
matched, but never A:
.IP
.nf
\f[C]
comment A
comment B
if something
comment C
\f[R]
.fi
.SS How CSV rules are evaluated
.PP
Some things to note:
.PP
When quoting fields,
Here\[aq]s how to think of CSV rules being evaluated (if you really need
to).
First,
.IP \[bu] 2
you must use double quotes, not single quotes
include - all includes are inlined, from top to bottom, depth first.
(At each include point the file is inlined and scanned for further
includes, before proceeding.)
.PP
Then \[dq]global\[dq] rules are evaluated, top to bottom.
If a rule is repeated, the last one wins:
.IP \[bu] 2
spaces outside the quotes are not allowed.
skip (at top level)
.IP \[bu] 2
date-format
.IP \[bu] 2
newest-first
.IP \[bu] 2
fields - names the CSV fields, optionally sets up initial assignments to
hledger fields
.PP
Then for each CSV record in turn:
.IP \[bu] 2
test all \f[C]if\f[R] blocks.
If any of them contain an \f[C]end\f[R] rule, skip all remaining CSV
records.
Otherwise if any of them contain a \f[C]skip\f[R] rule, skip that many
CSV records.
If there are multiple matched skip rules, the first one wins.
.IP \[bu] 2
collect all field assignments at top level and in matched if blocks.
When there are multiple assignments for a field, keep only the last one.
.IP \[bu] 2
compute a value for each hledger field - either the one that was
assigned to it (and interpolate the %CSVFIELDNAME references), or a
default
.IP \[bu] 2
generate a synthetic hledger transaction from these values, which
becomes part of the input to the hledger command that has been selected
.SS Valid transactions
.PP
hledger currently does not post-process and validate transactions
generated from CSV as thoroughly as transactions read from a journal
file.
This means that if your rules are wrong, you can generate invalid
transactions.
Or, amounts may not be displayed with a canonical display style.
.PP
So when setting up or adjusting CSV rules, you should check your results
visually with the print command.
You can pipe print\[aq]s output through hledger once more to validate
and canonicalise fully.
Eg:
.IP
.nf
\f[C]
$ hledger -f some.csv print | hledger -f- print -I
\f[R]
.fi
.PP
(The -I/--ignore-assertions flag disables balance assertion checks,
usually needed when re-parsing print output.)
.SH "REPORTING BUGS"

@ -14,8 +14,8 @@ transaction. (To learn about _writing_ CSV, see CSV output.)
rules. These do several things:
* they describe the layout and format of the CSV data
* they can customize the generated journal entries using a simple
templating language
* they can customize the generated journal entries (transactions)
using a simple templating language
* they can add refinements based on patterns in the CSV data, eg
categorizing transactions with more detailed account names.
@ -33,93 +33,164 @@ fields date, _, _, amount
date-format %d/%m/%Y
skip 1
A more complete example:
# hledger CSV rules for amazon.com order history
# sample:
# "Date","Type","To/From","Name","Status","Amount","Fees","Transaction ID"
# "Jul 29, 2012","Payment","To","Adapteva, Inc.","Completed","$25.00","$0.00","17LA58JSK6PRD4HDGLNJQPI1PB9N8DKPVHL"
# skip one header line
skip 1
# name the csv fields (and assign the transaction's date, amount and code)
fields date, _, toorfrom, name, amzstatus, amount, fees, code
# how to parse the date
date-format %b %-d, %Y
# combine two fields to make the description
description %toorfrom %name
# save these fields as tags
comment status:%amzstatus, fees:%fees
# set the base account for all transactions
account1 assets:amazon
# flip the sign on the amount
amount -%amount
For more examples, see Convert CSV files.
More examples in the EXAMPLES section below.
* Menu:
* CSV RULES::
* CSV TIPS::
* EXAMPLES::
* TIPS::

File: hledger_csv.info, Node: CSV RULES, Next: CSV TIPS, Prev: Top, Up: Top
File: hledger_csv.info, Node: CSV RULES, Next: EXAMPLES, Prev: Top, Up: Top
1 CSV RULES
***********
The following seven kinds of rule can appear in the rules file, in any
order. Blank lines and lines beginning with '#' or ';' are ignored.
The following kinds of rule can appear in the rules file, in any order
(except for 'end' which can appear only inside a conditional block).
Blank lines and lines beginning with '#' or ';' are ignored.
* Menu:
* skip::
* date-format::
* field list::
* fields::
* field assignment::
* conditional block::
* date-format::
* if::
* end::
* include::
* newest-first::

File: hledger_csv.info, Node: skip, Next: date-format, Up: CSV RULES
File: hledger_csv.info, Node: skip, Next: fields, Up: CSV RULES
1.1 skip
========
1.1 'skip'
==========
'skip'_'N'_
skip N
Skip this many non-empty lines preceding the CSV data. (Empty/blank
lines are skipped automatically.) You'll need this whenever your CSV
data contains header lines. Eg:
The word "skip" followed by a number (or no number, meaning 1) tells
hledger to ignore this many non-empty lines preceding the CSV data.
(Empty/blank lines are skipped automatically.) You'll need this
whenever your CSV data contains header lines.
# ignore the first CSV line
skip 1
It also has a second purpose: it can be used to ignore certain CSV
records, see conditional blocks below.

File: hledger_csv.info, Node: date-format, Next: field list, Prev: skip, Up: CSV RULES
File: hledger_csv.info, Node: fields, Next: field assignment, Prev: skip, Up: CSV RULES
1.2 date-format
===============
1.2 'fields'
============
'date-format'_'DATEFMT'_
fields FIELDNAME1, FIELDNAME2, ...
When your CSV date fields are not formatted like 'YYYY/MM/DD' (or
'YYYY-MM-DD' or 'YYYY.MM.DD'), you'll need to specify the format.
DATEFMT is a strptime-like date parsing pattern, which must parse the
date field values completely. Examples:
A fields list ("fields" followed by one or more comma-separated field
names) is the quick way to assign CSV field values to hledger fields.
It (a) names the CSV fields, in order (names may not contain whitespace;
fields you don't care about can be left unnamed), and (b) assigns them
to hledger fields if you use standard hledger field names. Here's an
example:
# use the 1st, 2nd and 4th CSV fields as the transaction's date, description and amount,
# ignore the 3rd, 5th and 6th fields,
# and name the 7th and 8th fields for later reference:
# 1 2 3 4 5 6 7 8
fields date, description, , amount1, , , somefield, anotherfield
Here are the standard hledger field names:
* Menu:
* Transaction fields::
* Posting fields::

File: hledger_csv.info, Node: Transaction fields, Next: Posting fields, Up: fields
1.2.1 Transaction fields
------------------------
'date', 'date2', 'status', 'code', 'description', 'comment' can be used
to form the transaction's first line. Only 'date' is required. (See
also date-format below.)

File: hledger_csv.info, Node: Posting fields, Prev: Transaction fields, Up: fields
1.2.2 Posting fields
--------------------
'accountN', where N is 1 to 9, sets the Nth posting's account name.
Most often there are two postings, so you'll want to set 'account1' and
'account2'.
A number of field/pseudo-field names are available for setting
posting amounts:
* 'amountN' sets posting N's amount
* 'amountN-in' and 'amountN-out' can be used instead, if the CSV has
separate fields for debits and credits
* 'currencyN' sets a currency symbol to be left-prefixed to the
amount, useful if the CSV provides that as a separate field
* 'balanceN' sets a (separate) balance assertion amount (or when no
posting amount is set, a balance assignment)
If you write these with no number ('amount', 'amount-in',
'amount-out', 'currency', 'balance'), it means posting 1. Also, if you
set an amount for posting 1 only, a second posting that balances the
transaction will be generated automatically. This helps support CSV
rules created before hledger 1.16.
Finally, 'commentN' sets a comment on the Nth posting. Comments can
of course contain tags.

File: hledger_csv.info, Node: field assignment, Next: date-format, Prev: fields, Up: CSV RULES
1.3 '(field assignment)'
========================
HLEDGERFIELDNAME FIELDVALUE
Instead of or in addition to a fields list, you can assign a value to
a hledger field by writing its name (any of the standard names above)
followed by a text value. The value may contain interpolated CSV
fields, referenced by their 1-based position in the CSV record ('%N'),
or by the name they were given in the fields list ('%CSVFIELDNAME').
Eg:
# set the amount to the 4th CSV field, with " USD" appended
amount %4 USD
# combine three fields to make a comment, containing note: and date: tags
comment note: %somefield - %anotherfield, date: %1
Interpolation strips any outer whitespace, so a CSV value like '" 1
"' becomes '1' when interpolated (#1051). Note you can only interpolate
CSV fields, not the hledger fields being assigned to; for more on this,
see TIPS.

File: hledger_csv.info, Node: date-format, Next: if, Prev: field assignment, Up: CSV RULES
1.4 'date-format'
=================
date-format DATEFMT
This is a helper for the 'date' (and 'date2') fields. If your CSV
dates are not formatted like 'YYYY-MM-DD', 'YYYY/MM/DD' or 'YYYY.MM.DD',
you'll need to specify the format by writing "date-format" followed by a
strptime-like date parsing pattern, which must parse the date field
values completely. Examples:
# for dates like "11/06/2013":
date-format %m/%d/%Y
# for dates like "6/11/2013" (note the - to make leading zeros optional):
# for dates like "6/11/2013". The - allows leading zeros to be optional.
date-format %-d/%-m/%Y
# for dates like "2013-Nov-06":
@ -129,73 +200,43 @@ date-format %Y-%h-%d
date-format %-m/%-d/%Y %l:%M %p

File: hledger_csv.info, Node: field list, Next: field assignment, Prev: date-format, Up: CSV RULES
File: hledger_csv.info, Node: if, Next: end, Prev: date-format, Up: CSV RULES
1.3 field list
==============
1.5 'if'
========
'fields'_'FIELDNAME1'_, _'FIELDNAME2'_...
if PATTERN
RULE
This (a) names the CSV fields, in order (names may not contain
whitespace; uninteresting names may be left blank), and (b) assigns them
to journal entry fields if you use any of these standard field names:
'date', 'date2', 'status', 'code', 'description', 'comment', 'account1',
'account2', 'amount', 'amount-in', 'amount-out', 'currency', 'balance',
'balance1', 'balance2'. Eg:
if
PATTERN
PATTERN
PATTERN
RULE
RULE
# use the 1st, 2nd and 4th CSV fields as the entry's date, description and amount,
# and give the 7th and 8th fields meaningful names for later reference:
#
# CSV field:
# 1 2 3 4 5 6 7 8
# entry field:
fields date, description, , amount, , , somefield, anotherfield
Conditional blocks apply one or more rules to CSV records which are
matched by any of the PATTERNs. This allows transactions to be
customised or categorised based on patterns in the data.

File: hledger_csv.info, Node: field assignment, Next: conditional block, Prev: field list, Up: CSV RULES
A single pattern can be written on the same line as the "if"; or
multiple patterns can be written on the following lines, non-indented.
1.4 field assignment
====================
Patterns are case-insensitive regular expressions which try to match
any part of the whole CSV record. It's not yet possible to match within
a specific field. Note the CSV record they see is close but not
identical to the one in the CSV file; eg double quotes are removed, and
the separator character becomes comma.
_'ENTRYFIELDNAME'_ _'FIELDVALUE'_
After the patterns, there should be one or more rules to apply, all
indented by at least one space. Three kinds of rule are allowed in
conditional blocks:
This sets a journal entry field (one of the standard names above) to
the given text value, which can include CSV field values interpolated by
name ('%CSVFIELDNAME') or 1-based position ('%N'). Eg:
* field assignments (to set a field's value)
* skip (to skip the matched CSV record)
* end (to skip all remaining CSV records).
# set the amount to the 4th CSV field with "USD " prepended
amount USD %4
# combine three fields to make a comment (containing two tags)
comment note: %somefield - %anotherfield, date: %1
Field assignments can be used instead of or in addition to a field
list.
Note, interpolation strips any outer whitespace, so a CSV value like
'" 1 "' becomes '1' when interpolated (#1051).

File: hledger_csv.info, Node: conditional block, Next: include, Prev: field assignment, Up: CSV RULES
1.5 conditional block
=====================
'if' _'PATTERN'_
_'FIELDASSIGNMENTS'_...
'if'
_'PATTERN'_
_'PATTERN'_...
_'FIELDASSIGNMENTS'_...
This applies one or more field assignments, only to those CSV records
matched by one of the PATTERNs. The patterns are case-insensitive
regular expressions which match anywhere within the whole CSV record
(it's not yet possible to match within a specific field). When there
are multiple patterns they can be written on separate lines, unindented.
The field assignments are on separate lines indented by at least one
space. Examples:
Examples:
# if the CSV record contains "groceries", set account2 to "expenses:groceries"
if groceries
@ -210,176 +251,369 @@ banking thru software
comment XXX deductible ? check it

File: hledger_csv.info, Node: include, Next: newest-first, Prev: conditional block, Up: CSV RULES
File: hledger_csv.info, Node: end, Next: include, Prev: if, Up: CSV RULES
1.6 include
===========
1.6 'end'
=========
'include'_'RULESFILE'_
As mentioned above, this rule can be used inside conditional blocks
(only) to cause hledger to stop reading CSV records and proceed with
command execution. Eg:
Include another rules file at this point. 'RULESFILE' is either an
absolute file path or a path relative to the current file's directory.
Eg:
# ignore everything following the first empty record
if ,,,,
end
# rules reused with several CSV files
include common.rules

File: hledger_csv.info, Node: include, Next: newest-first, Prev: end, Up: CSV RULES
1.7 'include'
=============
include RULESFILE
Include another CSV rules file at this point, as if it were written
inline. 'RULESFILE' is an absolute file path or a path relative to the
current file's directory.
This can be useful eg for reusing common rules in several rules
files:
# someaccount.csv.rules
## someaccount-specific rules
fields date,description,amount
account1 some:account
account2 some:misc
## common rules
include categorisation.rules

File: hledger_csv.info, Node: newest-first, Prev: include, Up: CSV RULES
1.7 newest-first
================
1.8 'newest-first'
==================
'newest-first'
hledger always sorts the generated transactions by date. Transactions
on the same date should appear in the same order as their CSV records,
as hledger can usually auto-detect whether the CSV's normal order is
oldest first or newest first. But if all of the following are true:
Consider adding this rule if all of the following are true: you might
be processing just one day of data, your CSV records are in reverse
chronological order (newest first), and you care about preserving the
order of same-day transactions. It usually isn't needed, because
hledger autodetects the CSV order, but when all CSV records have the
same date it will assume they are oldest first.
* the CSV might sometimes contain just one day of data (all records
having the same date)
* the CSV records are normally in reverse chronological order (newest
first)
* and you care about preserving the order of same-day transactions
you should add the 'newest-first' rule as a hint. Eg:
# tell hledger explicitly that the CSV is normally newest-first
newest-first

File: hledger_csv.info, Node: CSV TIPS, Prev: CSV RULES, Up: Top
File: hledger_csv.info, Node: EXAMPLES, Next: TIPS, Prev: CSV RULES, Up: Top
2 CSV TIPS
2 EXAMPLES
**********
A more complete example, generating three-posting transactions:
# hledger CSV rules for amazon.com order history
# sample:
# "Date","Type","To/From","Name","Status","Amount","Fees","Transaction ID"
# "Jul 29, 2012","Payment","To","Adapteva, Inc.","Completed","$25.00","$0.00","17LA58JSK6PRD4HDGLNJQPI1PB9N8DKPVHL"
# skip one header line
skip 1
# name the csv fields (and assign the transaction's date, amount and code)
fields date, _, toorfrom, name, amzstatus, amount1, fees, code
# how to parse the date
date-format %b %-d, %Y
# combine two fields to make the description
description %toorfrom %name
# save these fields as tags
comment status:%amzstatus
# set the base account for all transactions
account1 assets:amazon
# flip the sign on the amount
amount1 -%amount1
# Put fees in a separate posting
amount3 %fees
comment3 fees
For more examples, see Convert CSV files.

File: hledger_csv.info, Node: TIPS, Prev: EXAMPLES, Up: Top
3 TIPS
******
* Menu:
* CSV ordering::
* CSV accounts::
* CSV amounts::
* CSV balance assertions/assignments::
* Reading multiple CSV files::
* Deduplicating importing::
* Other import methods::
* Valid CSV::
* Other separator characters::
* Setting amounts::
* Referencing other fields::
* How CSV rules are evaluated::
* Valid transactions::

File: hledger_csv.info, Node: CSV ordering, Next: CSV accounts, Up: CSV TIPS
File: hledger_csv.info, Node: Reading multiple CSV files, Next: Deduplicating importing, Up: TIPS
2.1 CSV ordering
================
3.1 Reading multiple CSV files
==============================
The generated journal entries will be sorted by date. The order of
same-day entries will be preserved (except in the special case where you
might need 'newest-first', see above).
You can read multiple CSV files at once using multiple '-f' arguments on
the command line. hledger will look for a correspondingly-named rules
file for each CSV file. If you use the '--rules-file' option, that
rules file will be used for all the CSV files.

File: hledger_csv.info, Node: CSV accounts, Next: CSV amounts, Prev: CSV ordering, Up: CSV TIPS
File: hledger_csv.info, Node: Deduplicating importing, Next: Other import methods, Prev: Reading multiple CSV files, Up: TIPS
2.2 CSV accounts
================
3.2 Deduplicating, importing
============================
Each journal entry will have two postings, to 'account1' and 'account2'
respectively. It's not yet possible to generate entries with more than
two postings. It's conventional and recommended to use 'account1' for
the account whose CSV we are reading.
When you download a CSV file repeatedly, eg to get your latest bank
transactions, the new file may contain some of the same records as the
old one. The print --new command is one simple way to detect just the
new transactions. Or better still, the import command appends those new
transactions to your main journal. This is the easiest way to import
CSV data. Eg, after downloading your latest CSV files:
$ hledger import *.csv [--dry]

File: hledger_csv.info, Node: CSV amounts, Next: CSV balance assertions/assignments, Prev: CSV accounts, Up: CSV TIPS
File: hledger_csv.info, Node: Other import methods, Next: Valid CSV, Prev: Deduplicating importing, Up: TIPS
2.3 CSV amounts
===============
3.3 Other import methods
========================
A transaction amount must be set, in one of these ways:
A number of other tools and workflows, hledger-specific and otherwise,
exist for converting, deduplicating, classifying and managing CSV data.
See:
* with an 'amount' field assignment, which sets the first posting's
amount
* https://hledger.org -> sidebar -> real world setups
* https://plaintextaccounting.org -> data import/conversion
* (When the CSV has debit and credit amounts in separate fields:)
with field assignments for the 'amount-in' and 'amount-out' pseudo
fields (both of them). Whichever one has a value will be used,
with appropriate sign. If both contain a value, it might not work
so well.

File: hledger_csv.info, Node: Valid CSV, Next: Other separator characters, Prev: Other import methods, Up: TIPS
* or implicitly by means of a balance assignment (see below).
3.4 Valid CSV
=============
hledger accepts CSV conforming to RFC 4180. Some things to note when
values are enclosed in quotes:
* you must use double quotes (not single quotes)
* spaces outside the quotes are not allowed

File: hledger_csv.info, Node: Other separator characters, Next: Setting amounts, Prev: Valid CSV, Up: TIPS
3.5 Other separator characters
==============================
With the '--separator 'CHAR'' option, hledger will expect the separator
to be CHAR instead of a comma. Ie it will read other "Character
Separated Values" formats, such as TSV (Tab Separated Values). Note: on
the command line, use a real tab character in quotes, not \t. Eg:
$ hledger -f foo.tsv --separator ' ' print
(Experimental.)

File: hledger_csv.info, Node: Setting amounts, Next: Referencing other fields, Prev: Other separator characters, Up: TIPS
3.6 Setting amounts
===================
A posting amount can be set in one of these ways:
* by assigning (with a fields list or field assignment) to 'amountN'
(posting N's amount) or 'amount' (posting 1's amount)
* by assigning to 'amountN-in' and 'amountN-out' (or 'amount-in' and
'amount-out'). For each CSV record, whichever of these has a
non-zero value will be used, with appropriate sign. If both
contain a non-zero value, this may not work.
* by assigning to 'balanceN' (or 'balance') instead of the above,
setting the amount indirectly via a balance assignment.
There is some special handling for sign in amounts:
* If an amount value is parenthesised, it will be de-parenthesised
and sign-flipped.
* If an amount value begins with a double minus sign, those will
cancel out and be removed.
* If an amount value begins with a double minus sign, those cancel
out and are removed.
If the currency/commodity symbol is provided as a separate CSV field,
assign it to the 'currency' pseudo field; the symbol will be prepended
to the amount (TODO: when there is an amount). Or, you can use an
'amount' field assignment for more control, eg:
you can assign it to 'currency' (affects all posting amounts) or
'currencyN' (affects just posting N's amount). The symbol will be
prepended to the amount. Or for more control, you can set both currency
symbol and amount with a field assignment, eg:
fields date,description,currency,amount
# add currency symbol on the right:
amount %amount %currency
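When the CSV has separate debit and credit columns, the 'amount-in' and 'amount-out' approach listed above might look like the following. This is a minimal sketch, assuming a hypothetical four-column CSV (date, description, money out, money in) and hypothetical account names:

```rules
# hypothetical CSV: date, description, money out, money in
fields date, description, amount-out, amount-in

# hypothetical accounts
account1 assets:checking
account2 expenses:unknown
```

Each record should fill only one of the two columns, since a record with non-zero values in both may not convert properly.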

File: hledger_csv.info, Node: CSV balance assertions/assignments, Next: Reading multiple CSV files, Prev: CSV amounts, Up: CSV TIPS
File: hledger_csv.info, Node: Referencing other fields, Next: How CSV rules are evaluated, Prev: Setting amounts, Up: TIPS
2.4 CSV balance assertions/assignments
======================================
3.7 Referencing other fields
============================
If the CSV includes a running balance, you can assign that to one of the
pseudo fields 'balance' (or 'balance1') or 'balance2'. This will
generate a balance assertion (or if the amount is left empty, a balance
assignment), on the first or second posting, whenever the running
balance field is non-empty. (TODO: #1000)
In field assignments, you can interpolate only CSV fields, not hledger
fields. In the example below, there's both a CSV field and a hledger
field named amount1, but %amount1 always means the CSV field, not the
hledger field:
# Name the third CSV field "amount1"
fields date,description,amount1
# Set hledger's amount1 to the CSV amount1 field followed by USD
amount1 %amount1 USD
# Set comment to the CSV amount1 (not the amount1 assigned above)
comment %amount1
Here, since there's no CSV amount1 field, %amount1 will produce a
literal "amount1":
fields date,description,csvamount
amount1 %csvamount USD
# Can't interpolate amount1 here
comment %amount1
When there are multiple field assignments to the same hledger field,
only the last one takes effect. Here, comment's value will be B, or
C if "something" is matched, but never A:
comment A
comment B
if something
comment C

File: hledger_csv.info, Node: Reading multiple CSV files, Next: Valid CSV, Prev: CSV balance assertions/assignments, Up: CSV TIPS
File: hledger_csv.info, Node: How CSV rules are evaluated, Next: Valid transactions, Prev: Referencing other fields, Up: TIPS
2.5 Reading multiple CSV files
==============================
3.8 How CSV rules are evaluated
===============================
You can read multiple CSV files at once using multiple '-f' arguments on
the command line, and hledger will look for a correspondingly-named
rules file for each. Note if you use the '--rules-file' option, this
one rules file will be used for all the CSV files being read.
Here's how to think of CSV rules being evaluated (if you really need
to). First,
* include - all includes are inlined, from top to bottom, depth
first. (At each include point the file is inlined and scanned for
further includes, before proceeding.)
Then "global" rules are evaluated, top to bottom. If a rule is
repeated, the last one wins:
* skip (at top level)
* date-format
* newest-first
* fields - names the CSV fields, optionally sets up initial
assignments to hledger fields
Then for each CSV record in turn:
* test all 'if' blocks. If any of them contain an 'end' rule, skip
all remaining CSV records. Otherwise if any of them contain a
'skip' rule, skip that many CSV records. If there are multiple
matched skip rules, the first one wins.
* collect all field assignments at top level and in matched if
blocks. When there are multiple assignments for a field, keep only
the last one.
* compute a value for each hledger field - either the one that was
assigned to it (and interpolate the %CSVFIELDNAME references), or a
default
* generate a synthetic hledger transaction from these values, which
becomes part of the input to the hledger command that has been
selected
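The evaluation order above can be illustrated with a small annotated rules file. This is only a sketch: the CSV layout, account names and patterns are hypothetical, and the comments restate the description above rather than hledger's implementation.

```rules
# global rules: evaluated once, top to bottom
skip 1
date-format %d/%m/%Y
fields date, description, csvamount

# top-level field assignments: collected for every CSV record
account1 assets:checking
account2 expenses:unknown
amount1 %csvamount

# if blocks: tested against each CSV record in turn
if coffee
 account2 expenses:coffee

# this block comes later, so for a record matching both patterns
# its account2 assignment is the one that is kept
if refund
 account2 income:refunds
```

A record matching neither pattern keeps the top-level account2 value, since that is then the only assignment collected for that field.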

File: hledger_csv.info, Node: Valid CSV, Prev: Reading multiple CSV files, Up: CSV TIPS
File: hledger_csv.info, Node: Valid transactions, Prev: How CSV rules are evaluated, Up: TIPS
2.6 Valid CSV
=============
3.9 Valid transactions
======================
hledger follows RFC 4180, with the addition of a customisable separator
character.
hledger currently does not post-process and validate transactions
generated from CSV as thoroughly as transactions read from a journal
file. This means that if your rules are wrong, you can generate invalid
transactions. Or, amounts may not be displayed with a canonical display
style.
Some things to note:
So when setting up or adjusting CSV rules, you should check your
results visually with the print command. You can pipe print's output
through hledger once more to validate and canonicalise fully. Eg:
When quoting fields,
$ hledger -f some.csv print | hledger -f- print -I
* you must use double quotes, not single quotes
* spaces outside the quotes are not allowed.
(The -I/--ignore-assertions flag disables balance assertion checks,
usually needed when re-parsing print output.)

Tag Table:
Node: Top72
Node: CSV RULES2167
Ref: #csv-rules2275
Node: skip2538
Ref: #skip2632
Node: date-format2857
Ref: #date-format2984
Node: field list3534
Ref: #field-list3671
Node: field assignment4401
Ref: #field-assignment4556
Node: conditional block5180
Ref: #conditional-block5334
Node: include6230
Ref: #include6360
Node: newest-first6591
Ref: #newest-first6705
Node: CSV TIPS7116
Ref: #csv-tips7210
Node: CSV ordering7354
Ref: #csv-ordering7472
Node: CSV accounts7653
Ref: #csv-accounts7791
Node: CSV amounts8045
Ref: #csv-amounts8203
Node: CSV balance assertions/assignments9283
Ref: #csv-balance-assertionsassignments9501
Node: Reading multiple CSV files9822
Ref: #reading-multiple-csv-files10022
Node: Valid CSV10296
Ref: #valid-csv10419
Node: CSV RULES1428
Ref: #csv-rules1536
Node: skip1849
Ref: #skip1942
Node: fields2312
Ref: #fields2434
Node: Transaction fields3239
Ref: #transaction-fields3379
Node: Posting fields3547
Ref: #posting-fields3679
Node: field assignment4729
Ref: #field-assignment4882
Node: date-format5693
Ref: #date-format5828
Node: if6440
Ref: #if6544
Node: end7915
Ref: #end8017
Node: include8246
Ref: #include8366
Node: newest-first8804
Ref: #newest-first8922
Node: EXAMPLES9594
Ref: #examples9701
Node: TIPS10607
Ref: #tips10688
Node: Reading multiple CSV files10931
Ref: #reading-multiple-csv-files11098
Node: Deduplicating importing11358
Ref: #deduplicating-importing11550
Node: Other import methods11991
Ref: #other-import-methods12158
Node: Valid CSV12428
Ref: #valid-csv12576
Node: Other separator characters12778
Ref: #other-separator-characters12955
Node: Setting amounts13289
Ref: #setting-amounts13459
Node: Referencing other fields14702
Ref: #referencing-other-fields14891
Node: How CSV rules are evaluated15788
Ref: #how-csv-rules-are-evaluated15986
Node: Valid transactions17266
Ref: #valid-transactions17413

End Tag Table


@ -16,8 +16,8 @@ DESCRIPTION
o they describe the layout and format of the CSV data
o they can customize the generated journal entries using a simple tem-
plating language
o they can customize the generated journal entries (transactions) using
a simple templating language
o they can add refinements based on patterns in the CSV data, eg cate-
gorizing transactions with more detailed account names.
@ -36,63 +36,109 @@ DESCRIPTION
date-format %d/%m/%Y
skip 1
A more complete example:
# hledger CSV rules for amazon.com order history
# sample:
# "Date","Type","To/From","Name","Status","Amount","Fees","Transaction ID"
# "Jul 29, 2012","Payment","To","Adapteva, Inc.","Completed","$25.00","$0.00","17LA58JSK6PRD4HDGLNJQPI1PB9N8DKPVHL"
# skip one header line
skip 1
# name the csv fields (and assign the transaction's date, amount and code)
fields date, _, toorfrom, name, amzstatus, amount, fees, code
# how to parse the date
date-format %b %-d, %Y
# combine two fields to make the description
description %toorfrom %name
# save these fields as tags
comment status:%amzstatus, fees:%fees
# set the base account for all transactions
account1 assets:amazon
# flip the sign on the amount
amount -%amount
For more examples, see Convert CSV files.
More examples in the EXAMPLES section below.
CSV RULES
The following seven kinds of rule can appear in the rules file, in any
order. Blank lines and lines beginning with # or ; are ignored.
The following kinds of rule can appear in the rules file, in any order
(except for end which can appear only inside a conditional block).
Blank lines and lines beginning with # or ; are ignored.
skip
skipN
skip N
Skip this many non-empty lines preceding the CSV data. (Empty/blank
lines are skipped automatically.) You'll need this whenever your CSV
data contains header lines. Eg:
The word "skip" followed by a number (or no number, meaning 1) tells
hledger to ignore this many non-empty lines preceding the CSV data.
(Empty/blank lines are skipped automatically.) You'll need this when-
ever your CSV data contains header lines.
# ignore the first CSV line
skip 1
It also has a second purpose: it can be used to ignore certain CSV
records, see conditional blocks below.
fields
fields FIELDNAME1, FIELDNAME2, ...
A fields list ("fields" followed by one or more comma-separated field
names) is the quick way to assign CSV field values to hledger fields.
It (a) names the CSV fields, in order (names may not contain white-
space; fields you don't care about can be left unnamed), and (b) as-
signs them to hledger fields if you use standard hledger field names.
Here's an example:
# use the 1st, 2nd and 4th CSV fields as the transaction's date, description and amount,
# ignore the 3rd, 5th and 6th fields,
# and name the 7th and 8th fields for later reference:
# 1 2 3 4 5 6 7 8
fields date, description, , amount1, , , somefield, anotherfield
Here are the standard hledger field names:
Transaction fields
date, date2, status, code, description, comment can be used to form the
transaction's first line. Only date is required. (See also date-for-
mat below.)
Posting fields
accountN, where N is 1 to 9, sets the Nth posting's account name. Most
often there are two postings, so you'll want to set account1 and ac-
count2.
A number of field/pseudo-field names are available for setting posting
amounts:
o amountN sets posting N's amount
o amountN-in and amountN-out can be used instead, if the CSV has sepa-
rate fields for debits and credits
o currencyN sets a currency symbol to be left-prefixed to the amount,
useful if the CSV provides that as a separate field
o balanceN sets a (separate) balance assertion amount (or when no post-
ing amount is set, a balance assignment)
If you write these with no number (amount, amount-in, amount-out, cur-
rency, balance), it means posting 1. Also, if you set an amount for
posting 1 only, a second posting that balances the transaction will be
generated automatically. This helps support CSV rules created before
hledger 1.16.
Finally, commentN sets a comment on the Nth posting. Comments can of
course contain tags.
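As an illustration of the posting fields above, here is a sketch of rules generating a three-posting transaction. The CSV layout and account names are hypothetical, and posting 2 is left without an amount on the assumption that hledger infers the balancing amount, as it does for journal entries.

```rules
# hypothetical CSV: date, payee, total, fee
fields date, description, total, fee

# posting 1: the account whose CSV this is, with the sign flipped
account1 assets:bank
amount1 -%total

# posting 2: the purchase itself, with no amount set
account2 expenses:misc

# posting 3: the fee, with a posting comment
account3 expenses:fees
amount3 %fee
comment3 fee
```

The result is worth checking with the print command, since transactions generated from CSV are not validated as thoroughly as journal entries.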
(field assignment)
HLEDGERFIELDNAME FIELDVALUE
Instead of or in addition to a fields list, you can assign a value to a
hledger field by writing its name (any of the standard names above)
followed by a text value. The value may contain interpolated CSV
fields, referenced by their 1-based position in the CSV record (%N), or
by the name they were given in the fields list (%CSVFIELDNAME). Eg:
# set the amount to the 4th CSV field, with " USD" appended
amount %4 USD
# combine three fields to make a comment, containing note: and date: tags
comment note: %somefield - %anotherfield, date: %1
Interpolation strips any outer whitespace, so a CSV value like " 1 "
becomes 1 when interpolated (#1051). Note you can only interpolate CSV
fields, not the hledger fields being assigned to; for more on this, see
TIPS.
date-format
date-formatDATEFMT
date-format DATEFMT
When your CSV date fields are not formatted like YYYY/MM/DD (or YYYY-
MM-DD or YYYY.MM.DD), you'll need to specify the format. DATEFMT is a
strptime-like date parsing pattern, which must parse the date field
values completely. Examples:
This is a helper for the date (and date2) fields. If your CSV dates
are not formatted like YYYY-MM-DD, YYYY/MM/DD or YYYY.MM.DD, you'll
need to specify the format by writing "date-format" followed by a strp-
time-like date parsing pattern, which must parse the date field values
completely. Examples:
# for dates like "11/06/2013":
date-format %m/%d/%Y
# for dates like "6/11/2013" (note the - to make leading zeros optional):
# for dates like "6/11/2013". The - allows leading zeros to be optional.
date-format %-d/%-m/%Y
# for dates like "2013-Nov-06":
@ -101,59 +147,41 @@ CSV RULES
# for dates like "11/6/2013 11:32 PM":
date-format %-m/%-d/%Y %l:%M %p
field list
fieldsFIELDNAME1, FIELDNAME2...
This (a) names the CSV fields, in order (names may not contain white-
space; uninteresting names may be left blank), and (b) assigns them to
journal entry fields if you use any of these standard field names:
date, date2, status, code, description, comment, account1, account2,
amount, amount-in, amount-out, currency, balance, balance1, balance2.
Eg:
# use the 1st, 2nd and 4th CSV fields as the entry's date, description and amount,
# and give the 7th and 8th fields meaningful names for later reference:
#
# CSV field:
# 1 2 3 4 5 6 7 8
# entry field:
fields date, description, , amount, , , somefield, anotherfield
field assignment
ENTRYFIELDNAME FIELDVALUE
This sets a journal entry field (one of the standard names above) to
the given text value, which can include CSV field values interpolated
by name (%CSVFIELDNAME) or 1-based position (%N). Eg:
# set the amount to the 4th CSV field with "USD " prepended
amount USD %4
# combine three fields to make a comment (containing two tags)
comment note: %somefield - %anotherfield, date: %1
Field assignments can be used instead of or in addition to a field
list.
Note, interpolation strips any outer whitespace, so a CSV value like "
1 " becomes 1 when interpolated (#1051).
conditional block
if
if PATTERN
FIELDASSIGNMENTS...
RULE
if
PATTERN
PATTERN...
FIELDASSIGNMENTS...
PATTERN
PATTERN
RULE
RULE
This applies one or more field assignments, only to those CSV records
matched by one of the PATTERNs. The patterns are case-insensitive reg-
ular expressions which match anywhere within the whole CSV record (it's
not yet possible to match within a specific field). When there are
multiple patterns they can be written on separate lines, unindented.
The field assignments are on separate lines indented by at least one
space. Examples:
Conditional blocks apply one or more rules to CSV records which are
matched by any of the PATTERNs. This allows transactions to be cus-
tomised or categorised based on patterns in the data.
A single pattern can be written on the same line as the "if"; or multi-
ple patterns can be written on the following lines, non-indented.
Patterns are case-insensitive regular expressions which try to match
any part of the whole CSV record. It's not yet possible to match
within a specific field. Note the CSV record they see is close but not
identical to the one in the CSV file; eg double quotes are removed, and
the separator character becomes comma.
After the patterns, there should be one or more rules to apply, all in-
dented by at least one space. Three kinds of rule are allowed in con-
ditional blocks:
o field assignments (to set a field's value)
o skip (to skip the matched CSV record)
o end (to skip all remaining CSV records).
Examples:
# if the CSV record contains "groceries", set account2 to "expenses:groceries"
if groceries
@ -167,90 +195,250 @@ CSV RULES
account2 expenses:business:banking
comment XXX deductible ? check it
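Besides field assignments, a conditional block can contain skip. A minimal sketch, using hypothetical patterns, that drops matching records instead of converting them:

```rules
# ignore records mentioning a pending status or a card hold
if
pending
card hold
 skip
```

Here the two patterns are written on their own unindented lines, and skip with no number is taken to mean 1, ie just the matched record.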
end
As mentioned above, this rule can be used inside conditional blocks
(only) to cause hledger to stop reading CSV records and proceed with
command execution. Eg:
# ignore everything following the first empty record
if ,,,,
end
include
includeRULESFILE
include RULESFILE
Include another rules file at this point. RULESFILE is either an abso-
lute file path or a path relative to the current file's directory. Eg:
Include another CSV rules file at this point, as if it were written in-
line. RULESFILE is an absolute file path or a path relative to the
current file's directory.
# rules reused with several CSV files
include common.rules
This can be useful eg for reusing common rules in several rules files:
# someaccount.csv.rules
## someaccount-specific rules
fields date,description,amount
account1 some:account
account2 some:misc
## common rules
include categorisation.rules
newest-first
hledger always sorts the generated transactions by date. Transactions
on the same date should appear in the same order as their CSV records,
as hledger can usually auto-detect whether the CSV's normal order is
oldest first or newest first. But if all of the following are true:
o the CSV might sometimes contain just one day of data (all records
having the same date)
o the CSV records are normally in reverse chronological order (newest
first)
o and you care about preserving the order of same-day transactions
you should add the newest-first rule as a hint. Eg:
# tell hledger explicitly that the CSV is normally newest-first
newest-first
Consider adding this rule if all of the following are true: you might
be processing just one day of data, your CSV records are in reverse
chronological order (newest first), and you care about preserving the
order of same-day transactions. It usually isn't needed, because
hledger autodetects the CSV order, but when all CSV records have the
same date it will assume they are oldest first.
EXAMPLES
A more complete example, generating three-posting transactions:
CSV TIPS
CSV ordering
The generated journal entries will be sorted by date. The order of
same-day entries will be preserved (except in the special case where
you might need newest-first, see above).
# hledger CSV rules for amazon.com order history
CSV accounts
Each journal entry will have two postings, to account1 and account2 re-
spectively. It's not yet possible to generate entries with more than
two postings. It's conventional and recommended to use account1 for
the account whose CSV we are reading.
# sample:
# "Date","Type","To/From","Name","Status","Amount","Fees","Transaction ID"
# "Jul 29, 2012","Payment","To","Adapteva, Inc.","Completed","$25.00","$0.00","17LA58JSK6PRD4HDGLNJQPI1PB9N8DKPVHL"
CSV amounts
A transaction amount must be set, in one of these ways:
# skip one header line
skip 1
o with an amount field assignment, which sets the first posting's
amount
# name the csv fields (and assign the transaction's date, amount and code)
fields date, _, toorfrom, name, amzstatus, amount1, fees, code
o (When the CSV has debit and credit amounts in separate fields:)
with field assignments for the amount-in and amount-out pseudo fields
(both of them). Whichever one has a value will be used, with appropri-
ate sign. If both contain a value, it might not work so well.
# how to parse the date
date-format %b %-d, %Y
o or implicitly by means of a balance assignment (see below).
# combine two fields to make the description
description %toorfrom %name
# save these fields as tags
comment status:%amzstatus
# set the base account for all transactions
account1 assets:amazon
# flip the sign on the amount
amount1 -%amount1
# Put fees in a separate posting
amount3 %fees
comment3 fees
For more examples, see Convert CSV files.
TIPS
Reading multiple CSV files
You can read multiple CSV files at once using multiple -f arguments on
the command line. hledger will look for a correspondingly-named rules
file for each CSV file. If you use the --rules-file option, that rules
file will be used for all the CSV files.
Deduplicating, importing
When you download a CSV file repeatedly, eg to get your latest bank
transactions, the new file may contain some of the same records as the
old one. The print --new command is one simple way to detect just the
new transactions. Or better still, the import command appends those
new transactions to your main journal. This is the easiest way to im-
port CSV data. Eg, after downloading your latest CSV files:
$ hledger import *.csv [--dry]
Other import methods
A number of other tools and workflows, hledger-specific and otherwise,
exist for converting, deduplicating, classifying and managing CSV data.
See:
o https://hledger.org -> sidebar -> real world setups
o https://plaintextaccounting.org -> data import/conversion
Valid CSV
hledger accepts CSV conforming to RFC 4180. Some things to note when
values are enclosed in quotes:
o you must use double quotes (not single quotes)
o spaces outside the quotes are not allowed
Other separator characters
With the --separator 'CHAR' option, hledger will expect the separator
to be CHAR instead of a comma. Ie it will read other "Character Sepa-
rated Values" formats, such as TSV (Tab Separated Values). Note: on
the command line, use a real tab character in quotes, not \t. Eg:
$ hledger -f foo.tsv --separator ' ' print
(Experimental.)
Setting amounts
A posting amount can be set in one of these ways:
o by assigning (with a fields list or field assignment) to amountN
(posting N's amount) or amount (posting 1's amount)
o by assigning to amountN-in and amountN-out (or amount-in and amount-
out). For each CSV record, whichever of these has a non-zero value
will be used, with appropriate sign. If both contain a non-zero
value, this may not work.
o by assigning to balanceN (or balance) instead of the above, setting
the amount indirectly via a balance assignment.
There is some special handling for sign in amounts:
o If an amount value is parenthesised, it will be de-parenthesised and
sign-flipped.
o If an amount value begins with a double minus sign, those will cancel
out and be removed.
o If an amount value begins with a double minus sign, those cancel out
and are removed.
If the currency/commodity symbol is provided as a separate CSV field,
you can assign it to currency (affects all posting amounts) or
currencyN (affects just posting N's amount). The symbol will be
prepended to the amount. Or for more control, you can set both currency
symbol and amount with a field assignment, eg:
fields date,description,currency,amount
# add currency symbol on the right:
amount %amount %currency
CSV balance assertions/assignments
If the CSV includes a running balance, you can assign that to one of
the pseudo fields balance (or balance1) or balance2. This will gener-
ate a balance assertion (or if the amount is left empty, a balance as-
signment), on the first or second posting, whenever the running balance
field is non-empty. (TODO: #1000)
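For illustration, a sketch assuming a hypothetical CSV whose fourth
column is the running balance:

```rules
# hypothetical CSV layout: date, description, amount, running balance
fields date, description, amount, balance
```

With this fields list, each record with a non-empty balance value
would get a balance assertion on the first posting.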
Reading multiple CSV files
You can read multiple CSV files at once using multiple -f arguments on
the command line, and hledger will look for a correspondingly-named
rules file for each. Note if you use the --rules-file option, this one
rules file will be used for all the CSV files being read.
Referencing other fields
In field assignments, you can interpolate only CSV fields, not hledger
fields. In the example below, there's both a CSV field and a hledger
field named amount1, but %amount1 always means the CSV field, not the
hledger field:
# Name the third CSV field "amount1"
fields date,description,amount1
# Set hledger's amount1 to the CSV amount1 field followed by USD
amount1 %amount1 USD
# Set comment to the CSV amount1 (not the amount1 assigned above)
comment %amount1
Here, since there's no CSV amount1 field, %amount1 will produce a
literal "amount1":
fields date,description,csvamount
amount1 %csvamount USD
# Can't interpolate amount1 here
comment %amount1
When there are multiple field assignments to the same hledger field,
only the last one takes effect. Here, comment's value will be B, or
C if "something" is matched, but never A:
comment A
comment B
if something
 comment C
How CSV rules are evaluated
Here's how to think of CSV rules being evaluated (if you really need
to). First,
o include - all includes are inlined, from top to bottom, depth first.
(At each include point the file is inlined and scanned for further
includes, before proceeding.)
Then "global" rules are evaluated, top to bottom. If a rule is re-
peated, the last one wins:
o skip (at top level)
o date-format
o newest-first
o fields - names the CSV fields, optionally sets up initial assignments
to hledger fields
Then for each CSV record in turn:
o test all if blocks. If any of them contain an end rule, skip all
remaining CSV records. Otherwise if any of them contain a skip rule,
skip that many CSV records. If there are multiple matched skip
rules, the first one wins.
o collect all field assignments at top level and in matched if blocks.
When there are multiple assignments for a field, keep only the last
one.
o compute a value for each hledger field - either the one that was as-
signed to it (and interpolate the %CSVFIELDNAME references), or a de-
fault
o generate a synthetic hledger transaction from these values, which be-
comes part of the input to the hledger command that has been selected
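The evaluation order above can be sketched with a small annotated
rules file. The CSV layout, account names and patterns here are
hypothetical, chosen only to illustrate the steps:

```rules
# Global rules: evaluated once, top to bottom.
skip 1
date-format %Y-%m-%d
fields date, description, amount

# Top-level field assignments: collected for every CSV record.
account1 assets:bank:checking
account2 expenses:unknown

# Conditional assignment: for records matching "groceries" this comes
# later than the top-level account2 above, so it is the one kept.
if groceries
 account2 expenses:food

# From the first empty record onward, skip all remaining records.
if ,,
 end
```

Under these assumptions, a record matching "groceries" would get
account2 set to expenses:food, and every other record would keep
expenses:unknown.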
Valid transactions
hledger currently does not post-process and validate transactions gen-
erated from CSV as thoroughly as transactions read from a journal file.
This means that if your rules are wrong, you can generate invalid
transactions. Or, amounts may not be displayed with a canonical dis-
play style.
So when setting up or adjusting CSV rules, you should check your re-
sults visually with the print command. You can pipe print's output
through hledger once more to validate and canonicalise fully. Eg:
$ hledger -f some.csv print | hledger -f- print -I
(The -I/--ignore-assertions flag disables balance assertion checks,
usually needed when re-parsing print output.)