;doc: regen manuals

[ci skip]
2019-11-06 13:10:17 -08:00 · 2019-11-06 13:10:17 -08:00 · 7ecc42f142
commit 7ecc42f142
parent d92351e21a
3 changed files with 1224 additions and 566 deletions
--- a/hledger-lib/hledger_csv.5
+++ b/hledger-lib/hledger_csv.5
@@ -18,8 +18,8 @@ These do several things:
 .IP \[bu] 2
 they describe the layout and format of the CSV data
 .IP \[bu] 2
-they can customize the generated journal entries using a simple
-templating language
+they can customize the generated journal entries (transactions) using a
+simple templating language
 .IP \[bu] 2
 they can add refinements based on patterns in the CSV data, eg
 categorizing transactions with more detailed account names.
@@ -44,70 +44,142 @@ skip 1
 \f[R]
 .fi
 .PP
-A more complete example:
+More examples are in the EXAMPLES section below.
+.SH CSV RULES
+.PP
+The following kinds of rule can appear in the rules file, in any order
+(except for \f[C]end\f[R] which can appear only inside a conditional
+block).
+Blank lines and lines beginning with \f[C]#\f[R] or \f[C];\f[R] are
+ignored.
+.SS \f[C]skip\f[R]
 .IP
 .nf
 \f[C]
-# hledger CSV rules for amazon.com order history
-
-# sample:
-# \[dq]Date\[dq],\[dq]Type\[dq],\[dq]To/From\[dq],\[dq]Name\[dq],\[dq]Status\[dq],\[dq]Amount\[dq],\[dq]Fees\[dq],\[dq]Transaction ID\[dq]
-# \[dq]Jul 29, 2012\[dq],\[dq]Payment\[dq],\[dq]To\[dq],\[dq]Adapteva, Inc.\[dq],\[dq]Completed\[dq],\[dq]$25.00\[dq],\[dq]$0.00\[dq],\[dq]17LA58JSK6PRD4HDGLNJQPI1PB9N8DKPVHL\[dq]
-
-# skip one header line
-skip 1
-
-# name the csv fields (and assign the transaction\[aq]s date, amount and code)
-fields date, _, toorfrom, name, amzstatus, amount, fees, code
-
-# how to parse the date
-date-format %b %-d, %Y
-
-# combine two fields to make the description
-description %toorfrom %name
-
-# save these fields as tags
-comment     status:%amzstatus, fees:%fees
-
-# set the base account for all transactions
-account1    assets:amazon
-
-# flip the sign on the amount
-amount      -%amount
+skip N
 \f[R]
 .fi
 .PP
-For more examples, see Convert CSV files.
-.SH CSV RULES
-.PP
-The following seven kinds of rule can appear in the rules file, in any
-order.
-Blank lines and lines beginning with \f[C]#\f[R] or \f[C];\f[R] are
-ignored.
-.SS skip
-.PP
-\f[C]skip\f[R]\f[I]\f[CI]N\f[I]\f[R]
-.PP
-Skip this many non-empty lines preceding the CSV data.
+The word \[dq]skip\[dq] followed by a number (or no number, meaning 1)
+tells hledger to ignore this many non-empty lines preceding the CSV
+data.
 (Empty/blank lines are skipped automatically.) You\[aq]ll need this
 whenever your CSV data contains header lines.
+.PP
+It also has a second purpose: it can be used to ignore certain CSV
+records, see conditional blocks below.
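+.PP
+As a minimal sketch of that second use (the pattern \f[C]pending\f[R] is
+a hypothetical example, not something hledger defines), this would
+ignore any CSV record containing the word \[dq]pending\[dq]:
+.IP
+.nf
+\f[C]
+# skip CSV records that mention \[dq]pending\[dq] (hypothetical pattern)
+if pending
+ skip
+\f[R]
+.fi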
+.SS \f[C]fields\f[R]
+.IP
+.nf
+\f[C]
+fields FIELDNAME1, FIELDNAME2, ...
+\f[R]
+.fi
+.PP
+A fields list (\[dq]fields\[dq] followed by one or more comma-separated
+field names) is the quick way to assign CSV field values to hledger
+fields.
+It (a) names the CSV fields, in order (names may not contain whitespace;
+fields you don\[aq]t care about can be left unnamed), and (b) assigns
+them to hledger fields if you use standard hledger field names.
+Here\[aq]s an example:
+.IP
+.nf
+\f[C]
+# use the 1st, 2nd and 4th CSV fields as the transaction\[aq]s date, description and amount,
+# ignore the 3rd, 5th and 6th fields,
+# and name the 7th and 8th fields for later reference:
+#      1     2           3  4       5 6  7          8
+
+fields date, description, , amount1, , , somefield, anotherfield
+\f[R]
+.fi
+.PP
+Here are the standard hledger field names:
+.SS Transaction fields
+.PP
+\f[C]date\f[R], \f[C]date2\f[R], \f[C]status\f[R], \f[C]code\f[R],
+\f[C]description\f[R], \f[C]comment\f[R] can be used to form the
+transaction\[aq]s first line.
+Only \f[C]date\f[R] is required.
+(See also date-format below.)
+.SS Posting fields
+.PP
+\f[C]accountN\f[R], where N is 1 to 9, sets the Nth posting\[aq]s
+account name.
+Most often there are two postings, so you\[aq]ll want to set
+\f[C]account1\f[R] and \f[C]account2\f[R].
+.PP
+A number of field/pseudo-field names are available for setting posting
+amounts:
+.IP \[bu] 2
+\f[C]amountN\f[R] sets posting N\[aq]s amount
+.IP \[bu] 2
+\f[C]amountN-in\f[R] and \f[C]amountN-out\f[R] can be used instead, if
+the CSV has separate fields for debits and credits
+.IP \[bu] 2
+\f[C]currencyN\f[R] sets a currency symbol to be left-prefixed to the
+amount, useful if the CSV provides that as a separate field
+.IP \[bu] 2
+\f[C]balanceN\f[R] sets a (separate) balance assertion amount (or when
+no posting amount is set, a balance assignment)
+.PP
+If you write these with no number (\f[C]amount\f[R],
+\f[C]amount-in\f[R], \f[C]amount-out\f[R], \f[C]currency\f[R],
+\f[C]balance\f[R]), it means posting 1.
+Also, if you set an amount for posting 1 only, a second posting that
+balances the transaction will be generated automatically.
+This helps support CSV rules created before hledger 1.16.
+.PP
+Finally, \f[C]commentN\f[R] sets a comment on the Nth posting.
+Comments can of course contain tags.
+.SS \f[C](field assignment)\f[R]
+.IP
+.nf
+\f[C]
+HLEDGERFIELDNAME FIELDVALUE
+\f[R]
+.fi
+.PP
+Instead of or in addition to a fields list, you can assign a value to a
+hledger field by writing its name (any of the standard names above)
+followed by a text value.
+The value may contain interpolated CSV fields, referenced by their
+1-based position in the CSV record (\f[C]%N\f[R]), or by the name they
+were given in the fields list (\f[C]%CSVFIELDNAME\f[R]).
 Eg:
 .IP
 .nf
 \f[C]
-# ignore the first CSV line
-skip 1
+# set the amount to the 4th CSV field, with \[dq] USD\[dq] appended
+amount %4 USD
+\f[R]
+.fi
+.IP
+.nf
+\f[C]
+# combine three fields to make a comment, containing note: and date: tags
+comment note: %somefield - %anotherfield, date: %1
 \f[R]
 .fi
-.SS date-format
 .PP
-\f[C]date-format\f[R]\f[I]\f[CI]DATEFMT\f[I]\f[R]
+Interpolation strips any outer whitespace, so a CSV value like
+\f[C]\[dq] 1 \[dq]\f[R] becomes \f[C]1\f[R] when interpolated (#1051).
+Note you can only interpolate CSV fields, not the hledger fields being
+assigned to; for more on this, see TIPS.
+.SS \f[C]date-format\f[R]
+.IP
+.nf
+\f[C]
+date-format DATEFMT
+\f[R]
+.fi
 .PP
-When your CSV date fields are not formatted like \f[C]YYYY/MM/DD\f[R]
-(or \f[C]YYYY-MM-DD\f[R] or \f[C]YYYY.MM.DD\f[R]), you\[aq]ll need to
-specify the format.
-DATEFMT is a strptime-like date parsing pattern, which must parse the
-date field values completely.
+This is a helper for the \f[C]date\f[R] (and \f[C]date2\f[R]) fields.
+If your CSV dates are not formatted like \f[C]YYYY-MM-DD\f[R],
+\f[C]YYYY/MM/DD\f[R] or \f[C]YYYY.MM.DD\f[R], you\[aq]ll need to specify
+the format by writing \[dq]date-format\[dq] followed by a strptime-like
+date parsing pattern, which must parse the date field values completely.
 Examples:
 .IP
 .nf
@@ -119,7 +191,7 @@ date-format %m/%d/%Y
 .IP
 .nf
 \f[C]
-# for dates like \[dq]6/11/2013\[dq] (note the - to make leading zeros optional):
+# for dates like \[dq]6/11/2013\[dq]. The - allows leading zeros to be optional.
 date-format %-d/%-m/%Y
 \f[R]
 .fi
@@ -137,90 +209,47 @@ date-format %Y-%h-%d
 date-format %-m/%-d/%Y %l:%M %p
 \f[R]
 .fi
-.SS field list
-.PP
-\f[C]fields\f[R]\f[I]\f[CI]FIELDNAME1\f[I]\f[R],
-\f[I]\f[CI]FIELDNAME2\f[I]\f[R]...
-.PP
-This (a) names the CSV fields, in order (names may not contain
-whitespace; uninteresting names may be left blank), and (b) assigns them
-to journal entry fields if you use any of these standard field names:
-\f[C]date\f[R], \f[C]date2\f[R], \f[C]status\f[R], \f[C]code\f[R],
-\f[C]description\f[R], \f[C]comment\f[R], \f[C]account1\f[R],
-\f[C]account2\f[R], \f[C]amount\f[R], \f[C]amount-in\f[R],
-\f[C]amount-out\f[R], \f[C]currency\f[R], \f[C]balance\f[R],
-\f[C]balance1\f[R], \f[C]balance2\f[R].
-Eg:
+.SS \f[C]if\f[R]
 .IP
 .nf
 \f[C]
-# use the 1st, 2nd and 4th CSV fields as the entry\[aq]s date, description and amount,
-# and give the 7th and 8th fields meaningful names for later reference:
-#
-# CSV field:
-#      1     2            3 4       5 6 7          8
-# entry field:
-fields date, description, , amount, , , somefield, anotherfield
-\f[R]
-.fi
-.SS field assignment
-.PP
-\f[I]\f[CI]ENTRYFIELDNAME\f[I]\f[R] \f[I]\f[CI]FIELDVALUE\f[I]\f[R]
-.PP
-This sets a journal entry field (one of the standard names above) to the
-given text value, which can include CSV field values interpolated by
-name (\f[C]%CSVFIELDNAME\f[R]) or 1-based position (\f[C]%N\f[R]).
-Eg:
-.IP
-.nf
-\f[C]
-# set the amount to the 4th CSV field with \[dq]USD \[dq] prepended
-amount USD %4
-\f[R]
-.fi
-.IP
-.nf
-\f[C]
-# combine three fields to make a comment (containing two tags)
-comment note: %somefield - %anotherfield, date: %1
+if PATTERN
+ RULE
+
+if
+PATTERN
+PATTERN
+PATTERN
+ RULE
+ RULE
 \f[R]
 .fi
 .PP
-Field assignments can be used instead of or in addition to a field list.
+Conditional blocks apply one or more rules to CSV records which are
+matched by any of the PATTERNs.
+This allows transactions to be customised or categorised based on
+patterns in the data.
 .PP
-Note, interpolation strips any outer whitespace, so a CSV value like
-\f[C]\[dq] 1 \[dq]\f[R] becomes \f[C]1\f[R] when interpolated (#1051).
-.SS conditional block
+A single pattern can be written on the same line as the \[dq]if\[dq]; or
+multiple patterns can be written on the following lines, non-indented.
 .PP
-\f[C]if\f[R] \f[I]\f[CI]PATTERN\f[I]\f[R]
-.PD 0
-.P
-.PD
-\ \ \ \ \f[I]\f[CI]FIELDASSIGNMENTS\f[I]\f[R]...
+Patterns are case-insensitive regular expressions which try to match any
+part of the whole CSV record.
+It\[aq]s not yet possible to match within a specific field.
+Note the CSV record they see is close but not identical to the one in
+the CSV file; eg double quotes are removed, and the separator character
+becomes comma.
 .PP
-\f[C]if\f[R]
-.PD 0
-.P
-.PD
-\f[I]\f[CI]PATTERN\f[I]\f[R]
-.PD 0
-.P
-.PD
-\f[I]\f[CI]PATTERN\f[I]\f[R]...
-.PD 0
-.P
-.PD
-\ \ \ \ \f[I]\f[CI]FIELDASSIGNMENTS\f[I]\f[R]...
+After the patterns, there should be one or more rules to apply, all
+indented by at least one space.
+Three kinds of rule are allowed in conditional blocks:
+.IP \[bu] 2
+field assignments (to set a field\[aq]s value)
+.IP \[bu] 2
+skip (to skip the matched CSV record)
+.IP \[bu] 2
+end (to skip all remaining CSV records).
 .PP
-This applies one or more field assignments, only to those CSV records
-matched by one of the PATTERNs.
-The patterns are case-insensitive regular expressions which match
-anywhere within the whole CSV record (it\[aq]s not yet possible to match
-within a specific field).
-When there are multiple patterns they can be written on separate lines,
-unindented.
-The field assignments are on separate lines indented by at least one
-space.
 Examples:
 .IP
 .nf
@@ -242,112 +271,319 @@ banking thru software
 comment  XXX deductible ? check it
 \f[R]
 .fi
-.SS include
+.SS \f[C]end\f[R]
 .PP
-\f[C]include\f[R]\f[I]\f[CI]RULESFILE\f[I]\f[R]
-.PP
-Include another rules file at this point.
-\f[C]RULESFILE\f[R] is either an absolute file path or a path relative
-to the current file\[aq]s directory.
+As mentioned above, this rule can be used inside conditional blocks
+(only) to cause hledger to stop reading CSV records and proceed with
+command execution.
 Eg:
 .IP
 .nf
 \f[C]
-# rules reused with several CSV files
-include common.rules
+# ignore everything following the first empty record
+if ,,,,
+ end
+\f[R]
+.fi
+.SS \f[C]include\f[R]
+.IP
+.nf
+\f[C]
+include RULESFILE
 \f[R]
 .fi
-.SS newest-first
 .PP
-\f[C]newest-first\f[R]
+Include another CSV rules file at this point, as if it were written
+inline.
+\f[C]RULESFILE\f[R] is an absolute file path or a path relative to the
+current file\[aq]s directory.
 .PP
-Consider adding this rule if all of the following are true: you might be
-processing just one day of data, your CSV records are in reverse
-chronological order (newest first), and you care about preserving the
-order of same-day transactions.
-It usually isn\[aq]t needed, because hledger autodetects the CSV order,
-but when all CSV records have the same date it will assume they are
-oldest first.
-.SH CSV TIPS
-.SS CSV ordering
+This can be useful eg for reusing common rules in several rules files:
+.IP
+.nf
+\f[C]
+# someaccount.csv.rules
+
+## someaccount-specific rules
+fields date,description,amount
+account1 some:account
+account2 some:misc
+
+## common rules
+include categorisation.rules
+\f[R]
+.fi
+.SS \f[C]newest-first\f[R]
 .PP
-The generated journal entries will be sorted by date.
-The order of same-day entries will be preserved (except in the special
-case where you might need \f[C]newest-first\f[R], see above).
-.SS CSV accounts
-.PP
-Each journal entry will have two postings, to \f[C]account1\f[R] and
-\f[C]account2\f[R] respectively.
-It\[aq]s not yet possible to generate entries with more than two
-postings.
-It\[aq]s conventional and recommended to use \f[C]account1\f[R] for the
-account whose CSV we are reading.
-.SS CSV amounts
-.PP
-A transaction amount must be set, in one of these ways:
+hledger always sorts the generated transactions by date.
+Transactions on the same date should appear in the same order as their
+CSV records, as hledger can usually auto-detect whether the CSV\[aq]s
+normal order is oldest first or newest first.
+But if all of the following are true:
 .IP \[bu] 2
-with an \f[C]amount\f[R] field assignment, which sets the first
-posting\[aq]s amount
+the CSV might sometimes contain just one day of data (all records having
+the same date)
 .IP \[bu] 2
-(When the CSV has debit and credit amounts in separate fields:)
-.PD 0
-.P
-.PD
-with field assignments for the \f[C]amount-in\f[R] and
-\f[C]amount-out\f[R] pseudo fields (both of them).
-Whichever one has a value will be used, with appropriate sign.
-If both contain a value, it might not work so well.
+the CSV records are normally in reverse chronological order (newest
+first)
 .IP \[bu] 2
-or implicitly by means of a balance assignment (see below).
+and you care about preserving the order of same-day transactions
+.PP
+you should add the \f[C]newest-first\f[R] rule as a hint.
+Eg:
+.IP
+.nf
+\f[C]
+# tell hledger explicitly that the CSV is normally newest-first
+newest-first
+\f[R]
+.fi
+.SH EXAMPLES
+.PP
+A more complete example, generating three-posting transactions:
+.IP
+.nf
+\f[C]
+# hledger CSV rules for amazon.com order history
+
+# sample:
+# \[dq]Date\[dq],\[dq]Type\[dq],\[dq]To/From\[dq],\[dq]Name\[dq],\[dq]Status\[dq],\[dq]Amount\[dq],\[dq]Fees\[dq],\[dq]Transaction ID\[dq]
+# \[dq]Jul 29, 2012\[dq],\[dq]Payment\[dq],\[dq]To\[dq],\[dq]Adapteva, Inc.\[dq],\[dq]Completed\[dq],\[dq]$25.00\[dq],\[dq]$0.00\[dq],\[dq]17LA58JSK6PRD4HDGLNJQPI1PB9N8DKPVHL\[dq]
+
+# skip one header line
+skip 1
+
+# name the csv fields (and assign the transaction\[aq]s date, amount and code)
+fields date, _, toorfrom, name, amzstatus, amount1, fees, code
+
+# how to parse the date
+date-format %b %-d, %Y
+
+# combine two fields to make the description
+description %toorfrom %name
+
+# save these fields as tags
+comment     status:%amzstatus
+
+# set the base account for all transactions
+account1    assets:amazon
+
+# flip the sign on the amount
+amount1     -%amount1
+
+# Put fees in a separate posting
+amount3     %fees
+comment3    fees
+\f[R]
+.fi
+.PP
+For more examples, see Convert CSV files.
+.SH TIPS
+.SS Reading multiple CSV files
+.PP
+You can read multiple CSV files at once using multiple \f[C]-f\f[R]
+arguments on the command line.
+hledger will look for a correspondingly-named rules file for each CSV
+file.
+If you use the \f[C]--rules-file\f[R] option, that rules file will be
+used for all the CSV files.
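+.PP
+As a rough illustration (the file names \f[C]checking.csv\f[R],
+\f[C]savings.csv\f[R] and \f[C]common.rules\f[R] are hypothetical), the
+first command below would look for a separate rules file for each CSV
+file, while the second would apply one rules file to both:
+.IP
+.nf
+\f[C]
+$ hledger -f checking.csv -f savings.csv print
+$ hledger -f checking.csv -f savings.csv --rules-file common.rules print
+\f[R]
+.fi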
+.SS Deduplicating, importing
+.PP
+When you download a CSV file repeatedly, eg to get your latest bank
+transactions, the new file may contain some of the same records as the
+old one.
+The print --new command is one simple way to detect just the new
+transactions.
+Or better still, the import command appends those new transactions to
+your main journal.
+This is the easiest way to import CSV data.
+Eg, after downloading your latest CSV files:
+.IP
+.nf
+\f[C]
+$ hledger import *.csv [--dry]
+\f[R]
+.fi
+.SS Other import methods
+.PP
+A number of other tools and workflows, hledger-specific and otherwise,
+exist for converting, deduplicating, classifying and managing CSV data.
+See:
+.IP \[bu] 2
+https://hledger.org -> sidebar -> real world setups
+.IP \[bu] 2
+https://plaintextaccounting.org -> data import/conversion
+.SS Valid CSV
+.PP
+hledger accepts CSV conforming to RFC 4180.
+Some things to note when values are enclosed in quotes:
+.IP \[bu] 2
+you must use double quotes (not single quotes)
+.IP \[bu] 2
+spaces outside the quotes are not allowed
+.SS Other separator characters
+.PP
+With the \f[C]--separator \[aq]CHAR\[aq]\f[R] option, hledger will
+expect the separator to be CHAR instead of a comma.
+Ie it will read other \[dq]Character Separated Values\[dq] formats, such
+as TSV (Tab Separated Values).
+Note: on the command line, use a real tab character in quotes, not an escape sequence such as \f[C]\[rs]t\f[R]. Eg:
+.IP
+.nf
+\f[C]
+$ hledger -f foo.tsv --separator \[aq]  \[aq] print
+\f[R]
+.fi
+.PP
+(Experimental.)
+.SS Setting amounts
+.PP
+A posting amount can be set in one of these ways:
+.IP \[bu] 2
+by assigning (with a fields list or field assignment) to
+\f[C]amountN\f[R] (posting N\[aq]s amount) or \f[C]amount\f[R] (posting
+1\[aq]s amount)
+.IP \[bu] 2
+by assigning to \f[C]amountN-in\f[R] and \f[C]amountN-out\f[R] (or
+\f[C]amount-in\f[R] and \f[C]amount-out\f[R]).
+For each CSV record, whichever of these has a non-zero value will be
+used, with appropriate sign.
+If both contain a non-zero value, this may not work.
+.IP \[bu] 2
+by assigning to \f[C]balanceN\f[R] (or \f[C]balance\f[R]) instead of the
+above, setting the amount indirectly via a balance assignment.
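+.PP
+For instance, a minimal sketch for a CSV with separate debit and credit
+columns (the column layout and account name here are assumptions, not
+taken from a real bank export):
+.IP
+.nf
+\f[C]
+# hypothetical CSV columns: date, description, money in, money out
+fields date, description, amount-in, amount-out
+account1 assets:bank:checking
+\f[R]
+.fi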
 .PP
 There is some special handling for sign in amounts:
 .IP \[bu] 2
 If an amount value is parenthesised, it will be de-parenthesised and
 sign-flipped.
 .IP \[bu] 2
-If an amount value begins with a double minus sign, those will cancel
-out and be removed.
+If an amount value begins with a double minus sign, those cancel out and
+are removed.
 .PP
 If the currency/commodity symbol is provided as a separate CSV field,
-assign it to the \f[C]currency\f[R] pseudo field; the symbol will be
-prepended to the amount (TODO: when there is an amount).
-Or, you can use an \f[C]amount\f[R] field assignment for more control,
-eg:
+you can assign it to \f[C]currency\f[R] (affects all posting amounts) or
+\f[C]currencyN\f[R] (affects just posting N\[aq]s amount).
+The symbol will be prepended to the amount.
+Or for more control, you can set both currency symbol and amount with a
+field assignment, eg:
 .IP
 .nf
 \f[C]
 fields date,description,currency,amount
+# add currency symbol on the right:
 amount %amount %currency
 \f[R]
 .fi
-.SS CSV balance assertions/assignments
+.SS Referencing other fields
 .PP
-If the CSV includes a running balance, you can assign that to one of the
-pseudo fields \f[C]balance\f[R] (or \f[C]balance1\f[R]) or
-\f[C]balance2\f[R].
-This will generate a balance assertion (or if the amount is left empty,
-a balance assignment), on the first or second posting, whenever the
-running balance field is non-empty.
-(TODO: #1000)
-.SS Reading multiple CSV files
+In field assignments, you can interpolate only CSV fields, not hledger
+fields.
+In the example below, there\[aq]s both a CSV field and a hledger field
+named amount1, but %amount1 always means the CSV field, not the hledger
+field:
+.IP
+.nf
+\f[C]
+# Name the third CSV field \[dq]amount1\[dq]
+fields date,description,amount1
+
+# Set hledger\[aq]s amount1 to the CSV amount1 field followed by USD
+amount1 %amount1 USD
+
+# Set comment to the CSV amount1 (not the amount1 assigned above)
+comment %amount1
+\f[R]
+.fi
 .PP
-You can read multiple CSV files at once using multiple \f[C]-f\f[R]
-arguments on the command line, and hledger will look for a
-correspondingly-named rules file for each.
-Note if you use the \f[C]--rules-file\f[R] option, this one rules file
-will be used for all the CSV files being read.
-.SS Valid CSV
+Here, since there\[aq]s no CSV amount1 field, %amount1 will produce a
+literal \[dq]amount1\[dq]:
+.IP
+.nf
+\f[C]
+fields date,description,csvamount
+amount1 %csvamount USD
+# Can\[aq]t interpolate amount1 here
+comment %amount1
+\f[R]
+.fi
 .PP
-hledger follows RFC 4180, with the addition of a customisable separator
-character.
+When there are multiple field assignments to the same hledger field,
+only the last one takes effect.
+Here, comment\[aq]s value will be B, or C if \[dq]something\[dq] is
+matched, but never A:
+.IP
+.nf
+\f[C]
+comment A
+comment B
+if something
+ comment C
+\f[R]
+.fi
+.SS How CSV rules are evaluated
 .PP
-Some things to note:
-.PP
-When quoting fields,
+Here\[aq]s how to think of CSV rules being evaluated (if you really need
+to).
+First,
 .IP \[bu] 2
-you must use double quotes, not single quotes
+include - all includes are inlined, from top to bottom, depth first.
+(At each include point the file is inlined and scanned for further
+includes, before proceeding.)
+.PP
+Then \[dq]global\[dq] rules are evaluated, top to bottom.
+If a rule is repeated, the last one wins:
 .IP \[bu] 2
-spaces outside the quotes are not allowed.
+skip (at top level)
+.IP \[bu] 2
+date-format
+.IP \[bu] 2
+newest-first
+.IP \[bu] 2
+fields - names the CSV fields, optionally sets up initial assignments to
+hledger fields
+.PP
+Then for each CSV record in turn:
+.IP \[bu] 2
+test all \f[C]if\f[R] blocks.
+If any of them contain an \f[C]end\f[R] rule, skip all remaining CSV
+records.
+Otherwise if any of them contain a \f[C]skip\f[R] rule, skip that many
+CSV records.
+If there are multiple matched skip rules, the first one wins.
+.IP \[bu] 2
+collect all field assignments at top level and in matched if blocks.
+When there are multiple assignments for a field, keep only the last one.
+.IP \[bu] 2
+compute a value for each hledger field - either the one that was
+assigned to it (and interpolate the %CSVFIELDNAME references), or a
+default
+.IP \[bu] 2
+generate a synthetic hledger transaction from these values, which
+becomes part of the input to the hledger command that has been selected
+.SS Valid transactions
+.PP
+hledger currently does not post-process and validate transactions
+generated from CSV as thoroughly as transactions read from a journal
+file.
+This means that if your rules are wrong, you can generate invalid
+transactions.
+Or, amounts may not be displayed with a canonical display style.
+.PP
+So when setting up or adjusting CSV rules, you should check your results
+visually with the print command.
+You can pipe print\[aq]s output through hledger once more to validate
+and canonicalise fully.
+Eg:
+.IP
+.nf
+\f[C]
+$ hledger -f some.csv print | hledger -f- print -I
+\f[R]
+.fi
+.PP
+(The -I/--ignore-assertions flag disables balance assertion checks,
+usually needed when re-parsing print output.)


 .SH "REPORTING BUGS"
--- a/hledger-lib/hledger_csv.info
+++ b/hledger-lib/hledger_csv.info
@@ -14,8 +14,8 @@ transaction.  (To learn about _writing_ CSV, see CSV output.)
 rules.  These do several things:

   * they describe the layout and format of the CSV data
-   * they can customize the generated journal entries using a simple
-     templating language
+   * they can customize the generated journal entries (transactions)
+     using a simple templating language
   * they can add refinements based on patterns in the CSV data, eg
     categorizing transactions with more detailed account names.

@@ -33,93 +33,164 @@ fields date, _, _, amount
 date-format  %d/%m/%Y
 skip 1

-   A more complete example:
-
-# hledger CSV rules for amazon.com order history
-
-# sample:
-# "Date","Type","To/From","Name","Status","Amount","Fees","Transaction ID"
-# "Jul 29, 2012","Payment","To","Adapteva, Inc.","Completed","$25.00","$0.00","17LA58JSK6PRD4HDGLNJQPI1PB9N8DKPVHL"
-
-# skip one header line
-skip 1
-
-# name the csv fields (and assign the transaction's date, amount and code)
-fields date, _, toorfrom, name, amzstatus, amount, fees, code
-
-# how to parse the date
-date-format %b %-d, %Y
-
-# combine two fields to make the description
-description %toorfrom %name
-
-# save these fields as tags
-comment     status:%amzstatus, fees:%fees
-
-# set the base account for all transactions
-account1    assets:amazon
-
-# flip the sign on the amount
-amount      -%amount
-
-   For more examples, see Convert CSV files.
+   More examples are in the EXAMPLES section below.

 * Menu:

 * CSV RULES::
-* CSV TIPS::
+* EXAMPLES::
+* TIPS::


-File: hledger_csv.info,  Node: CSV RULES,  Next: CSV TIPS,  Prev: Top,  Up: Top
+File: hledger_csv.info,  Node: CSV RULES,  Next: EXAMPLES,  Prev: Top,  Up: Top

 1 CSV RULES
 ***********

-The following seven kinds of rule can appear in the rules file, in any
-order.  Blank lines and lines beginning with '#' or ';' are ignored.
+The following kinds of rule can appear in the rules file, in any order
+(except for 'end' which can appear only inside a conditional block).
+Blank lines and lines beginning with '#' or ';' are ignored.

 * Menu:

 * skip::
-* date-format::
-* field list::
+* fields::
 * field assignment::
-* conditional block::
+* date-format::
+* if::
+* end::
 * include::
 * newest-first::


-File: hledger_csv.info,  Node: skip,  Next: date-format,  Up: CSV RULES
+File: hledger_csv.info,  Node: skip,  Next: fields,  Up: CSV RULES

-1.1 skip
-========
+1.1 'skip'
+==========

-'skip'_'N'_
+skip N

-   Skip this many non-empty lines preceding the CSV data.  (Empty/blank
-lines are skipped automatically.)  You'll need this whenever your CSV
-data contains header lines.  Eg:
+   The word "skip" followed by a number (or no number, meaning 1) tells
+hledger to ignore this many non-empty lines preceding the CSV data.
+(Empty/blank lines are skipped automatically.)  You'll need this
+whenever your CSV data contains header lines.

-# ignore the first CSV line
-skip 1
+   It also has a second purpose: it can be used to ignore certain CSV
+records, see conditional blocks below.


-File: hledger_csv.info,  Node: date-format,  Next: field list,  Prev: skip,  Up: CSV RULES
+File: hledger_csv.info,  Node: fields,  Next: field assignment,  Prev: skip,  Up: CSV RULES

-1.2 date-format
-===============
+1.2 'fields'
+============

-'date-format'_'DATEFMT'_
+fields FIELDNAME1, FIELDNAME2, ...

-   When your CSV date fields are not formatted like 'YYYY/MM/DD' (or
-'YYYY-MM-DD' or 'YYYY.MM.DD'), you'll need to specify the format.
-DATEFMT is a strptime-like date parsing pattern, which must parse the
-date field values completely.  Examples:
+   A fields list ("fields" followed by one or more comma-separated field
+names) is the quick way to assign CSV field values to hledger fields.
+It (a) names the CSV fields, in order (names may not contain whitespace;
+fields you don't care about can be left unnamed), and (b) assigns them
+to hledger fields if you use standard hledger field names.  Here's an
+example:
+
+# use the 1st, 2nd and 4th CSV fields as the transaction's date, description and amount,
+# ignore the 3rd, 5th and 6th fields,
+# and name the 7th and 8th fields for later reference:
+#      1     2           3  4       5 6  7          8
+
+fields date, description, , amount1, , , somefield, anotherfield
+
+   Here are the standard hledger field names:
+
+* Menu:
+
+* Transaction fields::
+* Posting fields::
+
+
+File: hledger_csv.info,  Node: Transaction fields,  Next: Posting fields,  Up: fields
+
+1.2.1 Transaction fields
+------------------------
+
+'date', 'date2', 'status', 'code', 'description', 'comment' can be used
+to form the transaction's first line.  Only 'date' is required.  (See
+also date-format below.)
+
+
+File: hledger_csv.info,  Node: Posting fields,  Prev: Transaction fields,  Up: fields
+
+1.2.2 Posting fields
+--------------------
+
+'accountN', where N is 1 to 9, sets the Nth posting's account name.
+Most often there are two postings, so you'll want to set 'account1' and
+'account2'.
+
+   A number of field/pseudo-field names are available for setting
+posting amounts:
+
+   * 'amountN' sets posting N's amount
+   * 'amountN-in' and 'amountN-out' can be used instead, if the CSV has
+     separate fields for debits and credits
+   * 'currencyN' sets a currency symbol to be left-prefixed to the
+     amount, useful if the CSV provides that as a separate field
+   * 'balanceN' sets a (separate) balance assertion amount (or when no
+     posting amount is set, a balance assignment)
+
+   If you write these with no number ('amount', 'amount-in',
+'amount-out', 'currency', 'balance'), it means posting 1.  Also, if you
+set an amount for posting 1 only, a second posting that balances the
+transaction will be generated automatically.  This helps support CSV
+rules created before hledger 1.16.
+
+   Finally, 'commentN' sets a comment on the Nth posting.  Comments can
+of course contain tags.
+
+
+File: hledger_csv.info,  Node: field assignment,  Next: date-format,  Prev: fields,  Up: CSV RULES
+
+1.3 '(field assignment)'
+========================
+
+HLEDGERFIELDNAME FIELDVALUE
+
+   Instead of or in addition to a fields list, you can assign a value to
+a hledger field by writing its name (any of the standard names above)
+followed by a text value.  The value may contain interpolated CSV
+fields, referenced by their 1-based position in the CSV record ('%N'),
+or by the name they were given in the fields list ('%CSVFIELDNAME').
+Eg:
+
+# set the amount to the 4th CSV field, with " USD" appended
+amount %4 USD
+
+# combine three fields to make a comment, containing note: and date: tags
+comment note: %somefield - %anotherfield, date: %1
+
+   Interpolation strips any outer whitespace, so a CSV value like '" 1
+"' becomes '1' when interpolated (#1051).  Note you can only interpolate
+CSV fields, not the hledger fields being assigned to; for more on this,
+see TIPS.
+
+
+File: hledger_csv.info,  Node: date-format,  Next: if,  Prev: field assignment,  Up: CSV RULES
+
+1.4 'date-format'
+=================
+
+date-format DATEFMT
+
+   This is a helper for the 'date' (and 'date2') fields.  If your CSV
+dates are not formatted like 'YYYY-MM-DD', 'YYYY/MM/DD' or 'YYYY.MM.DD',
+you'll need to specify the format by writing "date-format" followed by a
+strptime-like date parsing pattern, which must parse the date field
+values completely.  Examples:

 # for dates like "11/06/2013":
 date-format %m/%d/%Y

-# for dates like "6/11/2013" (note the - to make leading zeros optional):
+# for dates like "6/11/2013". The - allows leading zeros to be optional.
 date-format %-d/%-m/%Y

 # for dates like "2013-Nov-06":
@@ -129,73 +200,43 @@ date-format %Y-%h-%d
 date-format %-m/%-d/%Y %l:%M %p


-File: hledger_csv.info,  Node: field list,  Next: field assignment,  Prev: date-format,  Up: CSV RULES
+File: hledger_csv.info,  Node: if,  Next: end,  Prev: date-format,  Up: CSV RULES

-1.3 field list
-==============
+1.5 'if'
+========

-'fields'_'FIELDNAME1'_, _'FIELDNAME2'_...
+if PATTERN
+ RULE

-   This (a) names the CSV fields, in order (names may not contain
-whitespace; uninteresting names may be left blank), and (b) assigns them
-to journal entry fields if you use any of these standard field names:
-'date', 'date2', 'status', 'code', 'description', 'comment', 'account1',
-'account2', 'amount', 'amount-in', 'amount-out', 'currency', 'balance',
-'balance1', 'balance2'.  Eg:
+if
+PATTERN
+PATTERN
+PATTERN
+ RULE
+ RULE

-# use the 1st, 2nd and 4th CSV fields as the entry's date, description and amount,
-# and give the 7th and 8th fields meaningful names for later reference:
-#
-# CSV field:
-#      1     2            3 4       5 6 7          8
-# entry field:
-fields date, description, , amount, , , somefield, anotherfield
+   Conditional blocks apply one or more rules to CSV records which are
+matched by any of the PATTERNs.  This allows transactions to be
+customised or categorised based on patterns in the data.

-
-File: hledger_csv.info,  Node: field assignment,  Next: conditional block,  Prev: field list,  Up: CSV RULES
+   A single pattern can be written on the same line as the "if"; or
+multiple patterns can be written on the following lines, non-indented.

-1.4 field assignment
-====================
+   Patterns are case-insensitive regular expressions which try to match
+any part of the whole CSV record.  It's not yet possible to match within
+a specific field.  Note the CSV record they see is close but not
+identical to the one in the CSV file; eg double quotes are removed, and
+the separator character becomes comma.

-_'ENTRYFIELDNAME'_ _'FIELDVALUE'_
+   After the patterns, there should be one or more rules to apply, all
+indented by at least one space.  Three kinds of rule are allowed in
+conditional blocks:

-   This sets a journal entry field (one of the standard names above) to
-the given text value, which can include CSV field values interpolated by
-name ('%CSVFIELDNAME') or 1-based position ('%N').  Eg:
+   * field assignments (to set a field's value)
+   * skip (to skip the matched CSV record)
+   * end (to skip all remaining CSV records).

-# set the amount to the 4th CSV field with "USD " prepended
-amount USD %4
-
-# combine three fields to make a comment (containing two tags)
-comment note: %somefield - %anotherfield, date: %1
-
-   Field assignments can be used instead of or in addition to a field
-list.
-
-   Note, interpolation strips any outer whitespace, so a CSV value like
-'" 1 "' becomes '1' when interpolated (#1051).
-
-
-File: hledger_csv.info,  Node: conditional block,  Next: include,  Prev: field assignment,  Up: CSV RULES
-
-1.5 conditional block
-=====================
-
-'if' _'PATTERN'_
-    _'FIELDASSIGNMENTS'_...
-
-   'if'
-_'PATTERN'_
-_'PATTERN'_...
-    _'FIELDASSIGNMENTS'_...
-
-   This applies one or more field assignments, only to those CSV records
-matched by one of the PATTERNs.  The patterns are case-insensitive
-regular expressions which match anywhere within the whole CSV record
-(it's not yet possible to match within a specific field).  When there
-are multiple patterns they can be written on separate lines, unindented.
-The field assignments are on separate lines indented by at least one
-space.  Examples:
+   Examples:

 # if the CSV record contains "groceries", set account2 to "expenses:groceries"
 if groceries
@ -210,176 +251,369 @@ banking thru software
 comment  XXX deductible ? check it


-File: hledger_csv.info,  Node: include,  Next: newest-first,  Prev: conditional block,  Up: CSV RULES
+File: hledger_csv.info,  Node: end,  Next: include,  Prev: if,  Up: CSV RULES

-1.6 include
-===========
+1.6 'end'
+=========

-'include'_'RULESFILE'_
+As mentioned above, this rule can be used inside conditional blocks
+(only) to cause hledger to stop reading CSV records and proceed with
+command execution.  Eg:

-   Include another rules file at this point.  'RULESFILE' is either an
-absolute file path or a path relative to the current file's directory.
-Eg:
+# ignore everything following the first empty record
+if ,,,,
+ end

-# rules reused with several CSV files
-include common.rules
+
+File: hledger_csv.info,  Node: include,  Next: newest-first,  Prev: end,  Up: CSV RULES
+
+1.7 'include'
+=============
+
+include RULESFILE
+
+   Include another CSV rules file at this point, as if it were written
+inline.  'RULESFILE' is an absolute file path or a path relative to the
+current file's directory.
+
+   This can be useful eg for reusing common rules in several rules
+files:
+
+# someaccount.csv.rules
+
+## someaccount-specific rules
+fields date,description,amount
+account1 some:account
+account2 some:misc
+
+## common rules
+include categorisation.rules


 File: hledger_csv.info,  Node: newest-first,  Prev: include,  Up: CSV RULES

-1.7 newest-first
-================
+1.8 'newest-first'
+==================

-'newest-first'
+hledger always sorts the generated transactions by date.  Transactions
+on the same date should appear in the same order as their CSV records,
+as hledger can usually auto-detect whether the CSV's normal order is
+oldest first or newest first.  But if all of the following are true:

-   Consider adding this rule if all of the following are true: you might
-be processing just one day of data, your CSV records are in reverse
-chronological order (newest first), and you care about preserving the
-order of same-day transactions.  It usually isn't needed, because
-hledger autodetects the CSV order, but when all CSV records have the
-same date it will assume they are oldest first.
+   * the CSV might sometimes contain just one day of data (all records
+     having the same date)
+   * the CSV records are normally in reverse chronological order (newest
+     first)
+   * and you care about preserving the order of same-day transactions
+
+   you should add the 'newest-first' rule as a hint.  Eg:
+
+# tell hledger explicitly that the CSV is normally newest-first
+newest-first


-File: hledger_csv.info,  Node: CSV TIPS,  Prev: CSV RULES,  Up: Top
+File: hledger_csv.info,  Node: EXAMPLES,  Next: TIPS,  Prev: CSV RULES,  Up: Top

-2 CSV TIPS
+2 EXAMPLES
 **********

+A more complete example, generating three-posting transactions:
+
+# hledger CSV rules for amazon.com order history
+
+# sample:
+# "Date","Type","To/From","Name","Status","Amount","Fees","Transaction ID"
+# "Jul 29, 2012","Payment","To","Adapteva, Inc.","Completed","$25.00","$0.00","17LA58JSK6PRD4HDGLNJQPI1PB9N8DKPVHL"
+
+# skip one header line
+skip 1
+
+# name the csv fields (and assign the transaction's date, amount and code)
+fields date, _, toorfrom, name, amzstatus, amount1, fees, code
+
+# how to parse the date
+date-format %b %-d, %Y
+
+# combine two fields to make the description
+description %toorfrom %name
+
+# save this field as a tag
+comment     status:%amzstatus
+
+# set the base account for all transactions
+account1    assets:amazon
+
+# flip the sign on the amount
+amount      -%amount
+
+# Put fees in a separate posting
+amount3     %fees
+comment3    fees
+
+   For more examples, see Convert CSV files.
+
+
+File: hledger_csv.info,  Node: TIPS,  Prev: EXAMPLES,  Up: Top
+
+3 TIPS
+******
+
 * Menu:

-* CSV ordering::
-* CSV accounts::
-* CSV amounts::
-* CSV balance assertions/assignments::
 * Reading multiple CSV files::
+* Deduplicating importing::
+* Other import methods::
 * Valid CSV::
+* Other separator characters::
+* Setting amounts::
+* Referencing other fields::
+* How CSV rules are evaluated::
+* Valid transactions::


-File: hledger_csv.info,  Node: CSV ordering,  Next: CSV accounts,  Up: CSV TIPS
+File: hledger_csv.info,  Node: Reading multiple CSV files,  Next: Deduplicating importing,  Up: TIPS

-2.1 CSV ordering
-================
+3.1 Reading multiple CSV files
+==============================

-The generated journal entries will be sorted by date.  The order of
-same-day entries will be preserved (except in the special case where you
-might need 'newest-first', see above).
+You can read multiple CSV files at once using multiple '-f' arguments on
+the command line.  hledger will look for a correspondingly-named rules
+file for each CSV file.  If you use the '--rules-file' option, that
+rules file will be used for all the CSV files.


-File: hledger_csv.info,  Node: CSV accounts,  Next: CSV amounts,  Prev: CSV ordering,  Up: CSV TIPS
+File: hledger_csv.info,  Node: Deduplicating importing,  Next: Other import methods,  Prev: Reading multiple CSV files,  Up: TIPS

-2.2 CSV accounts
-================
+3.2 Deduplicating, importing
+============================

-Each journal entry will have two postings, to 'account1' and 'account2'
-respectively.  It's not yet possible to generate entries with more than
-two postings.  It's conventional and recommended to use 'account1' for
-the account whose CSV we are reading.
+When you download a CSV file repeatedly, eg to get your latest bank
+transactions, the new file may contain some of the same records as the
+old one.  The print --new command is one simple way to detect just the
+new transactions.  Or better still, the import command appends those new
+transactions to your main journal.  This is the easiest way to import
+CSV data.  Eg, after downloading your latest CSV files:
+
+$ hledger import *.csv [--dry]


-File: hledger_csv.info,  Node: CSV amounts,  Next: CSV balance assertions/assignments,  Prev: CSV accounts,  Up: CSV TIPS
+File: hledger_csv.info,  Node: Other import methods,  Next: Valid CSV,  Prev: Deduplicating importing,  Up: TIPS

-2.3 CSV amounts
-===============
+3.3 Other import methods
+========================

-A transaction amount must be set, in one of these ways:
+A number of other tools and workflows, hledger-specific and otherwise,
+exist for converting, deduplicating, classifying and managing CSV data.
+See:

-   * with an 'amount' field assignment, which sets the first posting's
-     amount
+   * https://hledger.org -> sidebar -> real world setups
+   * https://plaintextaccounting.org -> data import/conversion

-   * (When the CSV has debit and credit amounts in separate fields:)
-     with field assignments for the 'amount-in' and 'amount-out' pseudo
-     fields (both of them).  Whichever one has a value will be used,
-     with appropriate sign.  If both contain a value, it might not work
-     so well.
+
+File: hledger_csv.info,  Node: Valid CSV,  Next: Other separator characters,  Prev: Other import methods,  Up: TIPS

-   * or implicitly by means of a balance assignment (see below).
+3.4 Valid CSV
+=============
+
+hledger accepts CSV conforming to RFC 4180.  Some things to note when
+values are enclosed in quotes:
+
+   * you must use double quotes (not single quotes)
+   * spaces outside the quotes are not allowed
+
+
+File: hledger_csv.info,  Node: Other separator characters,  Next: Setting amounts,  Prev: Valid CSV,  Up: TIPS
+
+3.5 Other separator characters
+==============================
+
+With the '--separator 'CHAR'' option, hledger will expect the separator
+to be CHAR instead of a comma.  Ie it will read other "Character
+Separated Values" formats, such as TSV (Tab Separated Values).  Note: on
+the command line, use a real tab character in quotes, not '\t'.  Eg:
+
+$ hledger -f foo.tsv --separator '  ' print
+
+   (Experimental.)
+
+
+File: hledger_csv.info,  Node: Setting amounts,  Next: Referencing other fields,  Prev: Other separator characters,  Up: TIPS
+
+3.6 Setting amounts
+===================
+
+A posting amount can be set in one of these ways:
+
+   * by assigning (with a fields list or field assignment) to 'amountN'
+     (posting N's amount) or 'amount' (posting 1's amount)
+
+   * by assigning to 'amountN-in' and 'amountN-out' (or 'amount-in' and
+     'amount-out').  For each CSV record, whichever of these has a
+     non-zero value will be used, with appropriate sign.  If both
+     contain a non-zero value, this may not work.
+
+   * by assigning to 'balanceN' (or 'balance') instead of the above,
+     setting the amount indirectly via a balance assignment.

   There is some special handling for sign in amounts:

   * If an amount value is parenthesised, it will be de-parenthesised
     and sign-flipped.
-   * If an amount value begins with a double minus sign, those will
-     cancel out and be removed.
+   * If an amount value begins with a double minus sign, those cancel
+     out and are removed.

   If the currency/commodity symbol is provided as a separate CSV field,
-assign it to the 'currency' pseudo field; the symbol will be prepended
-to the amount (TODO: when there is an amount).  Or, you can use an
-'amount' field assignment for more control, eg:
+you can assign it to 'currency' (affects all posting amounts) or
+'currencyN' (affects just posting N's amount).  The symbol will be
+prepended to the amount.  Or for more control, you can set both currency
+symbol and amount with a field assignment, eg:

 fields date,description,currency,amount
+# add currency symbol on the right:
 amount %amount %currency


-File: hledger_csv.info,  Node: CSV balance assertions/assignments,  Next: Reading multiple CSV files,  Prev: CSV amounts,  Up: CSV TIPS
+File: hledger_csv.info,  Node: Referencing other fields,  Next: How CSV rules are evaluated,  Prev: Setting amounts,  Up: TIPS

-2.4 CSV balance assertions/assignments
-======================================
+3.7 Referencing other fields
+============================

-If the CSV includes a running balance, you can assign that to one of the
-pseudo fields 'balance' (or 'balance1') or 'balance2'.  This will
-generate a balance assertion (or if the amount is left empty, a balance
-assignment), on the first or second posting, whenever the running
-balance field is non-empty.  (TODO: #1000)
+In field assignments, you can interpolate only CSV fields, not hledger
+fields.  In the example below, there's both a CSV field and a hledger
+field named amount1, but %amount1 always means the CSV field, not the
+hledger field:
+
+# Name the third CSV field "amount1"
+fields date,description,amount1
+
+# Set hledger's amount1 to the CSV amount1 field followed by USD
+amount1 %amount1 USD
+
+# Set comment to the CSV amount1 (not the amount1 assigned above)
+comment %amount1
+
+   Here, since there's no CSV amount1 field, %amount1 will produce a
+literal "amount1":
+
+fields date,description,csvamount
+amount1 %csvamount USD
+# Can't interpolate amount1 here
+comment %amount1
+
+   When there are multiple field assignments to the same hledger field,
+only the last one takes effect.  Here, comment's value will be B, or
+C if "something" is matched, but never A:
+
+comment A
+comment B
+if something
+ comment C


-File: hledger_csv.info,  Node: Reading multiple CSV files,  Next: Valid CSV,  Prev: CSV balance assertions/assignments,  Up: CSV TIPS
+File: hledger_csv.info,  Node: How CSV rules are evaluated,  Next: Valid transactions,  Prev: Referencing other fields,  Up: TIPS

-2.5 Reading multiple CSV files
-==============================
+3.8 How CSV rules are evaluated
+===============================

-You can read multiple CSV files at once using multiple '-f' arguments on
-the command line, and hledger will look for a correspondingly-named
-rules file for each.  Note if you use the '--rules-file' option, this
-one rules file will be used for all the CSV files being read.
+Here's how to think of CSV rules being evaluated (if you really need
+to).  First,
+
+   * include - all includes are inlined, from top to bottom, depth
+     first.  (At each include point the file is inlined and scanned for
+     further includes, before proceeding.)
+
+   Then "global" rules are evaluated, top to bottom.  If a rule is
+repeated, the last one wins:
+
+   * skip (at top level)
+   * date-format
+   * newest-first
+   * fields - names the CSV fields, optionally sets up initial
+     assignments to hledger fields
+
+   Then for each CSV record in turn:
+
+   * test all 'if' blocks.  If any matched block contains an 'end' rule,
+     skip all remaining CSV records.  Otherwise if any matched block
+     contains a 'skip' rule, skip that many CSV records.  If there are
+     multiple matched skip rules, the first one wins.
+   * collect all field assignments at top level and in matched if
+     blocks.  When there are multiple assignments for a field, keep only
+     the last one.
+   * compute a value for each hledger field - either the one that was
+     assigned to it (and interpolate the %CSVFIELDNAME references), or a
+     default
+   * generate a synthetic hledger transaction from these values, which
+     becomes part of the input to the hledger command that has been
+     selected


-File: hledger_csv.info,  Node: Valid CSV,  Prev: Reading multiple CSV files,  Up: CSV TIPS
+File: hledger_csv.info,  Node: Valid transactions,  Prev: How CSV rules are evaluated,  Up: TIPS

-2.6 Valid CSV
-=============
+3.9 Valid transactions
+======================

-hledger follows RFC 4180, with the addition of a customisable separator
-character.
+hledger currently does not post-process and validate transactions
+generated from CSV as thoroughly as transactions read from a journal
+file.  This means that if your rules are wrong, you can generate invalid
+transactions.  Or, amounts may not be displayed with a canonical display
+style.

-   Some things to note:
+   So when setting up or adjusting CSV rules, you should check your
+results visually with the print command.  You can pipe print's output
+through hledger once more to validate and canonicalise fully.  Eg:

-   When quoting fields,
+$ hledger -f some.csv print | hledger -f- print -I

-   * you must use double quotes, not single quotes
-   * spaces outside the quotes are not allowed.
+   (The -I/--ignore-assertions flag disables balance assertion checks,
+usually needed when re-parsing print output.)


 Tag Table:
 Node: Top72
-Node: CSV RULES2167
-Ref: #csv-rules2275
-Node: skip2538
-Ref: #skip2632
-Node: date-format2857
-Ref: #date-format2984
-Node: field list3534
-Ref: #field-list3671
-Node: field assignment4401
-Ref: #field-assignment4556
-Node: conditional block5180
-Ref: #conditional-block5334
-Node: include6230
-Ref: #include6360
-Node: newest-first6591
-Ref: #newest-first6705
-Node: CSV TIPS7116
-Ref: #csv-tips7210
-Node: CSV ordering7354
-Ref: #csv-ordering7472
-Node: CSV accounts7653
-Ref: #csv-accounts7791
-Node: CSV amounts8045
-Ref: #csv-amounts8203
-Node: CSV balance assertions/assignments9283
-Ref: #csv-balance-assertionsassignments9501
-Node: Reading multiple CSV files9822
-Ref: #reading-multiple-csv-files10022
-Node: Valid CSV10296
-Ref: #valid-csv10419
+Node: CSV RULES1428
+Ref: #csv-rules1536
+Node: skip1849
+Ref: #skip1942
+Node: fields2312
+Ref: #fields2434
+Node: Transaction fields3239
+Ref: #transaction-fields3379
+Node: Posting fields3547
+Ref: #posting-fields3679
+Node: field assignment4729
+Ref: #field-assignment4882
+Node: date-format5693
+Ref: #date-format5828
+Node: if6440
+Ref: #if6544
+Node: end7915
+Ref: #end8017
+Node: include8246
+Ref: #include8366
+Node: newest-first8804
+Ref: #newest-first8922
+Node: EXAMPLES9594
+Ref: #examples9701
+Node: TIPS10607
+Ref: #tips10688
+Node: Reading multiple CSV files10931
+Ref: #reading-multiple-csv-files11098
+Node: Deduplicating importing11358
+Ref: #deduplicating-importing11550
+Node: Other import methods11991
+Ref: #other-import-methods12158
+Node: Valid CSV12428
+Ref: #valid-csv12576
+Node: Other separator characters12778
+Ref: #other-separator-characters12955
+Node: Setting amounts13289
+Ref: #setting-amounts13459
+Node: Referencing other fields14702
+Ref: #referencing-other-fields14891
+Node: How CSV rules are evaluated15788
+Ref: #how-csv-rules-are-evaluated15986
+Node: Valid transactions17266
+Ref: #valid-transactions17413

 End Tag Table
--- a/hledger-lib/hledger_csv.txt
+++ b/hledger-lib/hledger_csv.txt
@ -16,8 +16,8 @@ DESCRIPTION

       o they describe the layout and format of the CSV data

-       o they can customize the generated journal entries using a simple  tem-
-         plating language
+       o they can customize the generated journal entries (transactions) using
+         a simple templating language

       o they  can add refinements based on patterns in the CSV data, eg cate-
         gorizing transactions with more detailed account names.
@ -36,63 +36,109 @@ DESCRIPTION
              date-format  %d/%m/%Y
              skip 1

-       A more complete example:
-
-              # hledger CSV rules for amazon.com order history
-
-              # sample:
-              # "Date","Type","To/From","Name","Status","Amount","Fees","Transaction ID"
-              # "Jul 29, 2012","Payment","To","Adapteva, Inc.","Completed","$25.00","$0.00","17LA58JSK6PRD4HDGLNJQPI1PB9N8DKPVHL"
-
-              # skip one header line
-              skip 1
-
-              # name the csv fields (and assign the transaction's date, amount and code)
-              fields date, _, toorfrom, name, amzstatus, amount, fees, code
-
-              # how to parse the date
-              date-format %b %-d, %Y
-
-              # combine two fields to make the description
-              description %toorfrom %name
-
-              # save these fields as tags
-              comment     status:%amzstatus, fees:%fees
-
-              # set the base account for all transactions
-              account1    assets:amazon
-
-              # flip the sign on the amount
-              amount      -%amount
-
-       For more examples, see Convert CSV files.
+       More examples in the EXAMPLES section below.

 CSV RULES
-       The following seven kinds of rule can appear in the rules file, in  any
-       order.  Blank lines and lines beginning with # or ; are ignored.
+       The following kinds of rule can appear in the rules file, in any  order
+       (except  for  end  which  can  appear only inside a conditional block).
+       Blank lines and lines beginning with # or ; are ignored.

   skip
              skip N

-       Skip  this  many  non-empty lines preceding the CSV data.  (Empty/blank
-       lines are skipped automatically.) You'll need this  whenever  your  CSV
-       data contains header lines.  Eg:
+       The word "skip" followed by a number (or no number,  meaning  1)  tells
+       hledger  to  ignore  this  many non-empty lines preceding the CSV data.
+       (Empty/blank lines are skipped automatically.) You'll need  this  when-
+       ever your CSV data contains header lines.

-              # ignore the first CSV line
-              skip 1
+       It  also  has  a  second  purpose: it can be used to ignore certain CSV
+       records, see conditional blocks below.
+
+   fields
+              fields FIELDNAME1, FIELDNAME2, ...
+
+       A fields list ("fields" followed by one or more  comma-separated  field
+       names)  is  the quick way to assign CSV field values to hledger fields.
+       It (a) names the CSV fields, in order (names  may  not  contain  white-
+       space;  fields  you  don't care about can be left unnamed), and (b) as-
+       signs them to hledger fields if you use standard hledger  field  names.
+       Here's an example:
+
+              # use the 1st, 2nd and 4th CSV fields as the transaction's date, description and amount,
+              # ignore the 3rd, 5th and 6th fields,
+              # and name the 7th and 8th fields for later reference:
+              #      1     2           3  4       5 6  7          8
+
+              fields date, description, , amount1, , , somefield, anotherfield
+
+       Here are the standard hledger field names:
+
+   Transaction fields
+       date, date2, status, code, description, comment can be used to form the
+       transaction's first line.  Only date is required.  (See also  date-for-
+       mat below.)
+
+   Posting fields
+       accountN, where N is 1 to 9, sets the Nth posting's account name.  Most
+       often there are two postings, so you'll want to set  account1  and  ac-
+       count2.
+
+       A  number of field/pseudo-field names are available for setting posting
+       amounts:
+
+       o amountN sets posting N's amount
+
+       o amountN-in and amountN-out can be used instead, if the CSV has  sepa-
+         rate fields for debits and credits
+
+       o currencyN  sets  a currency symbol to be left-prefixed to the amount,
+         useful if the CSV provides that as a separate field
+
+       o balanceN sets a (separate) balance assertion amount (or when no post-
+         ing amount is set, a balance assignment)
+
+       If  you write these with no number (amount, amount-in, amount-out, cur-
+       rency, balance), it means posting 1.  Also, if you set  an  amount  for
+       posting  1 only, a second posting that balances the transaction will be
+       generated automatically.  This helps support CSV rules  created  before
+       hledger 1.16.
+
+       Finally,  commentN  sets a comment on the Nth posting.  Comments can of
+       course contain tags.
+
+   (field assignment)
+              HLEDGERFIELDNAME FIELDVALUE
+
+       Instead of or in addition to a fields list, you can assign a value to a
+       hledger  field  by  writing  its name (any of the standard names above)
+       followed by a text value.   The  value  may  contain  interpolated  CSV
+       fields, referenced by their 1-based position in the CSV record (%N), or
+       by the name they were given in the fields list (%CSVFIELDNAME).  Eg:
+
+              # set the amount to the 4th CSV field, with " USD" appended
+              amount %4 USD
+
+              # combine three fields to make a comment, containing note: and date: tags
+              comment note: %somefield - %anotherfield, date: %1
+
+       Interpolation strips any outer whitespace, so a CSV value like  "  1  "
+       becomes 1 when interpolated (#1051).  Note you can only interpolate CSV
+       fields, not the hledger fields being assigned to; for more on this, see
+       TIPS.

   date-format
              date-format DATEFMT

-       When  your  CSV date fields are not formatted like YYYY/MM/DD (or YYYY-
-       MM-DD or YYYY.MM.DD), you'll need to specify the format.  DATEFMT is  a
-       strptime-like  date  parsing  pattern,  which must parse the date field
-       values completely.  Examples:
+       This  is  a  helper for the date (and date2) fields.  If your CSV dates
+       are not formatted like YYYY-MM-DD,  YYYY/MM/DD  or  YYYY.MM.DD,  you'll
+       need to specify the format by writing "date-format" followed by a strp-
+       time-like date parsing pattern, which must parse the date field  values
+       completely.  Examples:

              # for dates like "11/06/2013":
              date-format %m/%d/%Y

-              # for dates like "6/11/2013" (note the - to make leading zeros optional):
+              # for dates like "6/11/2013". The - makes leading zeros optional.
              date-format %-d/%-m/%Y

              # for dates like "2013-Nov-06":
@ -101,59 +147,41 @@ CSV RULES
              # for dates like "11/6/2013 11:32 PM":
              date-format %-m/%-d/%Y %l:%M %p

-   field list
-       fieldsFIELDNAME1, FIELDNAME2...
-
-       This (a) names the CSV fields, in order (names may not  contain  white-
-       space;  uninteresting names may be left blank), and (b) assigns them to
-       journal entry fields if you use any  of  these  standard  field  names:
-       date,  date2,  status,  code, description, comment, account1, account2,
-       amount, amount-in, amount-out, currency, balance,  balance1,  balance2.
-       Eg:
-
-              # use the 1st, 2nd and 4th CSV fields as the entry's date, description and amount,
-              # and give the 7th and 8th fields meaningful names for later reference:
-              #
-              # CSV field:
-              #      1     2            3 4       5 6 7          8
-              # entry field:
-              fields date, description, , amount, , , somefield, anotherfield
-
-   field assignment
-       ENTRYFIELDNAME FIELDVALUE
-
-       This  sets  a  journal entry field (one of the standard names above) to
-       the given text value, which can include CSV field  values  interpolated
-       by name (%CSVFIELDNAME) or 1-based position (%N).  Eg:
-
-              # set the amount to the 4th CSV field with "USD " prepended
-              amount USD %4
-
-              # combine three fields to make a comment (containing two tags)
-              comment note: %somefield - %anotherfield, date: %1
-
-       Field  assignments  can  be  used  instead of or in addition to a field
-       list.
-
-       Note, interpolation strips any outer whitespace, so a CSV value like  "
-       1 " becomes 1 when interpolated (#1051).
-
-   conditional block
+   if
              if PATTERN
-           FIELDASSIGNMENTS...
+               RULE

              if
              PATTERN
-       PATTERN...
-           FIELDASSIGNMENTS...
+              PATTERN
+              PATTERN
+               RULE
+               RULE

-       This  applies  one or more field assignments, only to those CSV records
-       matched by one of the PATTERNs.  The patterns are case-insensitive reg-
-       ular expressions which match anywhere within the whole CSV record (it's
-       not yet possible to match within a specific  field).   When  there  are
-       multiple  patterns  they  can be written on separate lines, unindented.
-       The field assignments are on separate lines indented by  at  least  one
-       space.  Examples:
+       Conditional  blocks  apply  one  or more rules to CSV records which are
+       matched by any of the PATTERNs.  This allows transactions  to  be  cus-
+       tomised or categorised based on patterns in the data.
+
+       A single pattern can be written on the same line as the "if"; or multi-
+       ple patterns can be written on the following lines, non-indented.
+
+       Patterns are case-insensitive regular expressions which  try  to  match
+       any  part  of  the  whole  CSV  record.  It's not yet possible to match
+       within a specific field.  Note the CSV record they see is close but not
+       identical to the one in the CSV file; eg double quotes are removed, and
+       the separator character becomes comma.
+
+       After the patterns, there should be one or more rules to apply, all in-
+       dented  by at least one space.  Three kinds of rule are allowed in con-
+       ditional blocks:
+
+       o field assignments (to set a field's value)
+
+       o skip (to skip the matched CSV record)
+
+       o end (to skip all remaining CSV records).
+
+       Examples:

              # if the CSV record contains "groceries", set account2 to "expenses:groceries"
              if groceries
@ -167,90 +195,250 @@ CSV RULES
               account2 expenses:business:banking
               comment  XXX deductible ? check it

+   end
+       As mentioned above, this rule can be  used  inside  conditional  blocks
+       (only)  to  cause  hledger to stop reading CSV records and proceed with
+       command execution.  Eg:
+
+              # ignore everything following the first empty record
+              if ,,,,
+               end
+
   include
              include RULESFILE

-       Include another rules file at this point.  RULESFILE is either an abso-
-       lute file path or a path relative to the current file's directory.  Eg:
+       Include another CSV rules file at this point, as if it were written in-
+       line.   RULESFILE  is  an  absolute file path or a path relative to the
+       current file's directory.

-              # rules reused with several CSV files
-              include common.rules
+       This can be useful eg for reusing common rules in several rules files:
+
+              # someaccount.csv.rules
+
+              ## someaccount-specific rules
+              fields date,description,amount
+              account1 some:account
+              account2 some:misc
+
+              ## common rules
+              include categorisation.rules

   newest-first
+       hledger always sorts the generated transactions by date.   Transactions
+       on  the same date should appear in the same order as their CSV records,
+       as hledger can usually auto-detect whether the CSV's  normal  order  is
+       oldest first or newest first.  But if all of the following are true:
+
+       o the  CSV  might  sometimes  contain just one day of data (all records
+         having the same date)
+
+       o the CSV records are normally in reverse chronological  order  (newest
+         first)
+
+       o and you care about preserving the order of same-day transactions
+
+       you should add the newest-first rule as a hint.  Eg:
+
+              # tell hledger explicitly that the CSV is normally newest-first
              newest-first
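
The ordering behaviour described above can be modelled roughly as follows. This is an illustrative Python sketch, not hledger's own code: the function name `order_records`, the tuple layout and the detection heuristic (comparing the first and last dates) are all assumptions made for the example.

```python
from datetime import date

def order_records(records, newest_first_hint=False):
    """Sort records by date, keeping same-day records in chronological order.

    records: list of (date, description) tuples in CSV file order.
    newest_first_hint: models the presence of the `newest-first` rule.
    """
    if not records:
        return []
    first, last = records[0][0], records[-1][0]
    if first != last:
        # Assumed heuristic: infer the file's order from its end points.
        newest_first = first > last
    else:
        # All-one-day data gives no clue, so fall back to the hint.
        newest_first = newest_first_hint
    chronological = list(reversed(records)) if newest_first else list(records)
    # sorted() is stable, so same-day records keep their relative order.
    return sorted(chronological, key=lambda r: r[0])

one_day = [(date(2019, 11, 6), "second"), (date(2019, 11, 6), "first")]
print([d for _, d in order_records(one_day)])
print([d for _, d in order_records(one_day, newest_first_hint=True)])
```

With single-day data, only the hinted call restores the chronological order of the two records.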

-       Consider adding this rule if all of the following are true:  you  might
-       be  processing  just  one  day of data, your CSV records are in reverse
-       chronological order (newest first), and you care about  preserving  the
-       order  of  same-day  transactions.   It  usually  isn't needed, because
-       hledger autodetects the CSV order, but when all CSV  records  have  the
-       same date it will assume they are oldest first.
+EXAMPLES
+       A more complete example, generating three-posting transactions:

-CSV TIPS
-   CSV ordering
-       The  generated  journal  entries  will be sorted by date.  The order of
-       same-day entries will be preserved (except in the  special  case  where
-       you might need newest-first, see above).
+              # hledger CSV rules for amazon.com order history

-   CSV accounts
-       Each journal entry will have two postings, to account1 and account2 re-
-       spectively.  It's not yet possible to generate entries with  more  than
-       two  postings.   It's  conventional and recommended to use account1 for
-       the account whose CSV we are reading.
+              # sample:
+              # "Date","Type","To/From","Name","Status","Amount","Fees","Transaction ID"
+              # "Jul 29, 2012","Payment","To","Adapteva, Inc.","Completed","$25.00","$0.00","17LA58JSK6PRD4HDGLNJQPI1PB9N8DKPVHL"

-   CSV amounts
-       A transaction amount must be set, in one of these ways:
+              # skip one header line
+              skip 1

-       o with an amount field  assignment,  which  sets  the  first  posting's
-         amount
+              # name the csv fields (and assign the transaction's date, amount and code)
+              fields date, _, toorfrom, name, amzstatus, amount1, fees, code

-       o (When the CSV has debit and credit amounts in separate fields:)
-       with  field  assignments for the amount-in and amount-out pseudo fields
-       (both of them).  Whichever one has a value will be used, with appropri-
-       ate sign.  If both contain a value, it might not work so well.
+              # how to parse the date
+              date-format %b %-d, %Y

-       o or implicitly by means of a balance assignment (see below).
+              # combine two fields to make the description
+              description %toorfrom %name
+
+              # save this field as a tag
+              comment     status:%amzstatus
+
+              # set the base account for all transactions
+              account1    assets:amazon
+
+              # flip the sign on the amount
+              amount      -%amount1
+
+              # Put fees in a separate posting
+              amount3     %fees
+              comment3    fees
+
+       For more examples, see Convert CSV files.
+
+TIPS
+   Reading multiple CSV files
+       You  can read multiple CSV files at once using multiple -f arguments on
+       the command line.  hledger will look for a correspondingly-named  rules
+       file for each CSV file.  If you use the --rules-file option, that rules
+       file will be used for all the CSV files.
+
+   Deduplicating, importing
+       When you download a CSV file repeatedly, eg to  get  your  latest  bank
+       transactions,  the new file may contain some of the same records as the
+       old one.  The print --new command is one simple way to detect just  the
+       new  transactions.   Or  better still, the import command appends those
+       new transactions to your main journal.  This is the easiest way to  im-
+       port CSV data.  Eg, after downloading your latest CSV files:
+
+              $ hledger import *.csv [--dry]
+
+   Other import methods
+       A  number of other tools and workflows, hledger-specific and otherwise,
+       exist for converting, deduplicating, classifying and managing CSV data.
+       See:
+
+       o https://hledger.org -> sidebar -> real world setups
+
+       o https://plaintextaccounting.org -> data import/conversion
+
+   Valid CSV
+       hledger  accepts  CSV conforming to RFC 4180.  Some things to note when
+       values are enclosed in quotes:
+
+       o you must use double quotes (not single quotes)
+
+       o spaces outside the quotes are not allowed
+
+   Other separator characters
+       With the --separator 'CHAR' option, hledger will expect  the  separator
+       to  be CHAR instead of a comma.  Ie it will read other "Character Sepa-
+       rated Values" formats, such as TSV (Tab Separated  Values).   Note:  on
+       the command line, use a real tab character in quotes, not \t.  Eg:
+
+              $ hledger -f foo.tsv --separator '  ' print
+
+       (Experimental.)
+
+   Setting amounts
+       A posting amount can be set in one of these ways:
+
+       o by  assigning  (with  a  fields  list  or field assignment) to amountN
+         (posting N's amount) or amount (posting 1's amount)
+
+       o by assigning to amountN-in and amountN-out (or amount-in and  amount-
+         out).   For  each CSV record, whichever of these has a non-zero value
+         will be used, with appropriate sign.   If  both  contain  a  non-zero
+         value, this may not work.
+
+       o by  assigning  to balanceN (or balance) instead of the above, setting
+         the amount indirectly via a balance assignment.
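
As a rough illustration of the amount-in/amount-out case in the list above, here is a Python sketch. It assumes that "appropriate sign" means money in is positive and money out is negative, and it chooses to raise an error when both values are non-zero; the function names are hypothetical and this is not hledger's implementation.

```python
from decimal import Decimal

def to_decimal(text):
    """Parse a plain numeric CSV value; an empty value counts as zero."""
    text = text.strip()
    return Decimal(text) if text else Decimal(0)

def amount_from_in_out(amount_in, amount_out):
    """Pick whichever of the two values is non-zero and give it a sign."""
    money_in, money_out = to_decimal(amount_in), to_decimal(amount_out)
    if money_in != 0 and money_out != 0:
        # The text above only says this "may not work"; raising is this
        # sketch's choice, not documented behaviour.
        raise ValueError("both amount-in and amount-out are non-zero")
    if money_in != 0:
        return money_in          # assumed: money in is positive
    if money_out != 0:
        return -money_out        # assumed: money out is negative
    return Decimal(0)

print(amount_from_in_out("25.00", ""))
print(amount_from_in_out("", "3.50"))
```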

       There is some special handling for sign in amounts:

       o If an amount value is parenthesised, it will be de-parenthesised  and
         sign-flipped.

-       o If an amount value begins with a double minus sign, those will cancel
-         out and be removed.
+       o If  an amount value begins with a double minus sign, those cancel out
+         and are removed.
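
The two sign rules above can be sketched as a small string transformation. This Python example only illustrates the described behaviour: `normalise_sign` is a hypothetical name, and the order in which the two rules are applied is an assumption.

```python
def normalise_sign(value):
    """Apply the two sign conventions to a raw amount string."""
    text = value.strip()
    # A parenthesised value is de-parenthesised and sign-flipped.
    if text.startswith("(") and text.endswith(")"):
        text = "-" + text[1:-1].strip()
    # A leading double minus cancels out and is removed.
    if text.startswith("--"):
        text = text[2:]
    return text

print(normalise_sign("(10.00)"))
print(normalise_sign("--10.00"))
```

The double-minus case is what you would get when a rule prepends a minus sign to a CSV value that is already negative.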

       If the currency/commodity symbol is provided as a separate  CSV  field,
-       assign it to the currency pseudo field; the symbol will be prepended to
-       the  amount (TODO: when there is an amount).  Or, you can use an amount
-       field assignment for more control, eg:
+       you  can assign it to currency (affects all posting amounts) or curren-
+       cyN (affects just posting N's amount).  The symbol will be prepended to
+       the  amount.  Or for more control, you can set both currency symbol and
+       amount with a field assignment, eg:

              fields date,description,currency,amount
+              # add currency symbol on the right:
              amount %amount %currency

-   CSV balance assertions/assignments
-       If the CSV includes a running balance, you can assign that  to  one  of
-       the  pseudo fields balance (or balance1) or balance2.  This will gener-
-       ate a balance assertion (or if the amount is left empty, a balance  as-
-       signment), on the first or second posting, whenever the running balance
-       field is non-empty.  (TODO: #1000)
+   Referencing other fields
+       In field assignments, you can interpolate only CSV fields, not  hledger
+       fields.   In  the example below, there's both a CSV field and a hledger
+       field named amount1, but %amount1 always means the CSV field,  not  the
+       hledger field:

-   Reading multiple CSV files
-       You can read multiple CSV files at once using multiple -f arguments  on
-       the  command  line,  and  hledger will look for a correspondingly-named
-       rules file for each.  Note if you use the --rules-file option, this one
-       rules file will be used for all the CSV files being read.
+              # Name the third CSV field "amount1"
+              fields date,description,amount1

-   Valid CSV
-       hledger follows RFC 4180, with the addition of a customisable separator
-       character.
+              # Set hledger's amount1 to the CSV amount1 field followed by USD
+              amount1 %amount1 USD

-       Some things to note:
+              # Set comment to the CSV amount1 (not the amount1 assigned above)
+              comment %amount1

-       When quoting fields,
+       Here,  since there's no CSV amount1 field, %amount1 will produce a lit-
+       eral "amount1":

-       o you must use double quotes, not single quotes
+              fields date,description,csvamount
+              amount1 %csvamount USD
+              # Can't interpolate amount1 here
+              comment %amount1
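
The interpolation behaviour in the two examples above can be modelled like this. It is a Python sketch under assumptions: the function name `interpolate`, the regular expression used for field names, and the handling of unknown names (left as the bare name, following the text above) are illustrative and not taken from hledger's source.

```python
import re

def interpolate(template, csv_fields):
    """Replace each %NAME in template with the value of CSV field NAME.

    Only CSV fields are consulted, never hledger fields. A name with no
    matching CSV field is left as the literal name, as described above.
    """
    def substitute(match):
        name = match.group(1)
        return csv_fields.get(name, name)
    return re.sub(r"%([A-Za-z0-9_-]+)", substitute, template)

record = {"date": "2019-11-06", "description": "coffee", "amount1": "3.50"}
print(interpolate("%amount1 USD", record))
print(interpolate("%amount1", {"csvamount": "3.50"}))
```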

-       o spaces outside the quotes are not allowed.
+       When there are multiple field assignments to the  same  hledger  field,
+       only the last one takes effect.  Here, comment's value will be B, or
+       C if "something" is matched, but never A:
+
+              comment A
+              comment B
+              if something
+               comment C
+
+   How CSV rules are evaluated
+       Here's how to think of CSV rules being evaluated (if  you  really  need
+       to).  First,
+
+       o include  - all includes are inlined, from top to bottom, depth first.
+         (At each include point the file is inlined and  scanned  for  further
+         includes, before proceeding.)
+
+       Then  "global"  rules  are  evaluated, top to bottom.  If a rule is re-
+       peated, the last one wins:
+
+       o skip (at top level)
+
+       o date-format
+
+       o newest-first
+
+       o fields - names the CSV fields, optionally sets up initial assignments
+         to hledger fields
+
+       Then for each CSV record in turn:
+
+       o test all if blocks.  If any of them contain an end rule, skip all
+         remaining CSV records.  Otherwise if any of them contain a skip rule,
+         skip that many CSV records.  If there are multiple matched skip
+         rules, the first one wins.
+
+       o collect all field assignments at top level and in matched if  blocks.
+         When  there  are multiple assignments for a field, keep only the last
+         one.
+
+       o compute a value for each hledger field - either the one that was  as-
+         signed to it (and interpolate the %CSVFIELDNAME references), or a de-
+         fault
+
+       o generate a synthetic hledger transaction from these values, which be-
+         comes part of the input to the hledger command that has been selected
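
The per-record steps above can be sketched as a small interpreter. This Python model is a simplification under stated assumptions, not hledger's implementation: rules are given as tuples, if patterns are case-insensitive regular expressions tested against the comma-joined record, `skip N` counts the current record as the first of the N, and global rules, `fields` and default values are left out. All names are hypothetical.

```python
import re

def interpolate(template, record):
    """Replace %NAME with the CSV field's value (unknown names stay bare)."""
    return re.sub(r"%([A-Za-z0-9_-]+)",
                  lambda m: record.get(m.group(1), m.group(1)), template)

def evaluate(rules, records):
    """rules: ordered list of ("assign", field, template) or
    ("if", pattern, body), where body holds ("assign", field, template),
    ("skip", n) or ("end",) items. records: list of dicts of CSV fields.
    Returns one dict of hledger field values per generated transaction."""
    transactions = []
    to_skip = 0
    for record in records:
        if to_skip > 0:
            to_skip -= 1
            continue
        line = ",".join(record.values())
        # Keep top-level assignments plus the bodies of matched if blocks,
        # in file order.
        active = []
        for rule in rules:
            if rule[0] == "assign":
                active.append(rule)
            elif re.search(rule[1], line, re.IGNORECASE):
                active.extend(rule[2])
        if any(item[0] == "end" for item in active):
            break                      # stop reading CSV records
        skips = [item[1] for item in active if item[0] == "skip"]
        if skips:
            to_skip = skips[0] - 1     # the first matched skip wins
            continue
        assignments = {}
        for item in active:
            assignments[item[1]] = item[2]   # a later assignment overwrites
        transactions.append({field: interpolate(template, record)
                             for field, template in assignments.items()})
    return transactions

rules = [("assign", "comment", "A"),
         ("assign", "comment", "B"),
         ("if", "something", [("assign", "comment", "C")])]
records = [{"date": "2019-11-06", "description": "something else"},
           {"date": "2019-11-07", "description": "other"}]
print(evaluate(rules, records))
```

Run on the `comment A` / `comment B` / `if something` rules shown earlier, the first record matches and gets C while the second gets B, and A is never used.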
+
+   Valid transactions
+       hledger  currently does not post-process and validate transactions gen-
+       erated from CSV as thoroughly as transactions read from a journal file.
+       This  means  that  if  your  rules  are wrong, you can generate invalid
+       transactions.  Or, amounts may not be displayed with a  canonical  dis-
+       play style.
+
+       So  when  setting  up or adjusting CSV rules, you should check your re-
+       sults visually with the print command.  You  can  pipe  print's  output
+       through hledger once more to validate and canonicalise fully.  Eg:
+
+              $ hledger -f some.csv print | hledger -f- print -I
+
+       (The  -I/--ignore-assertions  flag  disables  balance assertion checks,
+       usually needed when re-parsing print output.)