;doc: regen manuals

[ci skip]
2019-11-06 13:10:17 -08:00 · 2019-11-06 13:10:17 -08:00 · 7ecc42f142
commit 7ecc42f142
parent d92351e21a
3 changed files with 1224 additions and 566 deletions
--- a/hledger-lib/hledger_csv.5
+++ b/hledger-lib/hledger_csv.5
@@ -18,8 +18,8 @@ These do several things:
 .IP \[bu] 2
 they describe the layout and format of the CSV data
 .IP \[bu] 2
-they can customize the generated journal entries using a simple
+they can customize the generated journal entries (transactions) using a
-templating language
+simple templating language
 .IP \[bu] 2
 they can add refinements based on patterns in the CSV data, eg
 categorizing transactions with more detailed account names.
@@ -44,70 +44,142 @@ skip 1
 \f[R]
 .fi
 .PP
-A more complete example:
+More examples in the EXAMPLES section below.
 .SH CSV RULES
 .PP
 The following kinds of rule can appear in the rules file, in any order
 (except for \f[C]end\f[R] which can appear only inside a conditional
 block).
 Blank lines and lines beginning with \f[C]#\f[R] or \f[C];\f[R] are
 ignored.
 .SS \f[C]skip\f[R]
 .IP
 .nf
 \f[C]
-# hledger CSV rules for amazon.com order history
+skip N
 # sample:
 # \[dq]Date\[dq],\[dq]Type\[dq],\[dq]To/From\[dq],\[dq]Name\[dq],\[dq]Status\[dq],\[dq]Amount\[dq],\[dq]Fees\[dq],\[dq]Transaction ID\[dq]
 # \[dq]Jul 29, 2012\[dq],\[dq]Payment\[dq],\[dq]To\[dq],\[dq]Adapteva, Inc.\[dq],\[dq]Completed\[dq],\[dq]$25.00\[dq],\[dq]$0.00\[dq],\[dq]17LA58JSK6PRD4HDGLNJQPI1PB9N8DKPVHL\[dq]
 # skip one header line
 skip 1
 # name the csv fields (and assign the transaction\[aq]s date, amount and code)
 fields date, _, toorfrom, name, amzstatus, amount, fees, code
 # how to parse the date
 date-format %b %-d, %Y
 # combine two fields to make the description
 description %toorfrom %name
 # save these fields as tags
 comment     status:%amzstatus, fees:%fees
 # set the base account for all transactions
 account1    assets:amazon
 # flip the sign on the amount
 amount      -%amount
 \f[R]
 .fi
 .PP
-For more examples, see Convert CSV files.
+The word \[dq]skip\[dq] followed by a number (or no number, meaning 1)
-.SH CSV RULES
+tells hledger to ignore this many non-empty lines preceding the CSV
-.PP
+data.
 The following seven kinds of rule can appear in the rules file, in any
 order.
 Blank lines and lines beginning with \f[C]#\f[R] or \f[C];\f[R] are
 ignored.
 .SS skip
 .PP
 \f[C]skip\f[R]\f[I]\f[CI]N\f[I]\f[R]
 .PP
 Skip this many non-empty lines preceding the CSV data.
 (Empty/blank lines are skipped automatically.) You\[aq]ll need this
 whenever your CSV data contains header lines.
 .PP
 It also has a second purpose: it can be used to ignore certain CSV
 records, see conditional blocks below.
 .SS \f[C]fields\f[R]
 .IP
 .nf
 \f[C]
 fields FIELDNAME1, FIELDNAME2, ...
 \f[R]
 .fi
 .PP
 A fields list (\[dq]fields\[dq] followed by one or more comma-separated
 field names) is the quick way to assign CSV field values to hledger
 fields.
 It (a) names the CSV fields, in order (names may not contain whitespace;
 fields you don\[aq]t care about can be left unnamed), and (b) assigns
 them to hledger fields if you use standard hledger field names.
 Here\[aq]s an example:
 .IP
 .nf
 \f[C]
 # use the 1st, 2nd and 4th CSV fields as the transaction\[aq]s date, description and amount,
 # ignore the 3rd, 5th and 6th fields,
 # and name the 7th and 8th fields for later reference:
 #      1     2           3  4       5 6  7          8
 fields date, description, , amount1, , , somefield, anotherfield
 \f[R]
 .fi
 .PP
 Here are the standard hledger field names:
 .SS Transaction fields
 .PP
 \f[C]date\f[R], \f[C]date2\f[R], \f[C]status\f[R], \f[C]code\f[R],
 \f[C]description\f[R], \f[C]comment\f[R] can be used to form the
 transaction\[aq]s first line.
 Only \f[C]date\f[R] is required.
 (See also date-format below.)
 .SS Posting fields
 .PP
 \f[C]accountN\f[R], where N is 1 to 9, sets the Nth posting\[aq]s
 account name.
 Most often there are two postings, so you\[aq]ll want to set
 \f[C]account1\f[R] and \f[C]account2\f[R].
 .PP
 A number of field/pseudo-field names are available for setting posting
 amounts:
 .IP \[bu] 2
 \f[C]amountN\f[R] sets posting N\[aq]s amount
 .IP \[bu] 2
 \f[C]amountN-in\f[R] and \f[C]amountN-out\f[R] can be used instead, if
 the CSV has separate fields for debits and credits
 .IP \[bu] 2
 \f[C]currencyN\f[R] sets a currency symbol to be left-prefixed to the
 amount, useful if the CSV provides that as a separate field
 .IP \[bu] 2
 \f[C]balanceN\f[R] sets a (separate) balance assertion amount (or when
 no posting amount is set, a balance assignment)
 .PP
 If you write these with no number (\f[C]amount\f[R],
 \f[C]amount-in\f[R], \f[C]amount-out\f[R], \f[C]currency\f[R],
 \f[C]balance\f[R]), it means posting 1.
 Also, if you set an amount for posting 1 only, a second posting that
 balances the transaction will be generated automatically.
 This helps support CSV rules created before hledger 1.16.
 .PP
 Finally, \f[C]commentN\f[R] sets a comment on the Nth posting.
 Comments can of course contain tags.
 .SS \f[C](field assignment)\f[R]
 .IP
 .nf
 \f[C]
 HLEDGERFIELDNAME FIELDVALUE
 \f[R]
 .fi
 .PP
 Instead of or in addition to a fields list, you can assign a value to a
 hledger field by writing its name (any of the standard names above)
 followed by a text value.
 The value may contain interpolated CSV fields, referenced by their
 1-based position in the CSV record (\f[C]%N\f[R]), or by the name they
 were given in the fields list (\f[C]%CSVFIELDNAME\f[R]).
 Eg:
 .IP
 .nf
 \f[C]
-# ignore the first CSV line
+# set the amount to the 4th CSV field, with \[dq] USD\[dq] appended
-skip 1
+amount %4 USD
 \f[R]
 .fi
 .IP
 .nf
 \f[C]
 # combine three fields to make a comment, containing note: and date: tags
 comment note: %somefield - %anotherfield, date: %1
 \f[R]
 .fi
 .SS date-format
 .PP
-\f[C]date-format\f[R]\f[I]\f[CI]DATEFMT\f[I]\f[R]
+Interpolation strips any outer whitespace, so a CSV value like
 \f[C]\[dq] 1 \[dq]\f[R] becomes \f[C]1\f[R] when interpolated (#1051).
 Note you can only interpolate CSV fields, not the hledger fields being
 assigned to; for more on this, see TIPS.
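The interpolation rules above can be illustrated with a minimal Python sketch. This is not hledger's implementation; the function name `interpolate` and its arguments are hypothetical, and leaving unknown references untouched is a simplification made here, not verified behaviour.

```python
import re

def interpolate(template, record, names):
    """Replace %N (1-based position) and %NAME references with CSV values.

    record: one CSV record as a list of strings.
    names:  the names given in the fields list ("" for unnamed fields).
    """
    def lookup(match):
        ref = match.group(1)
        if ref.isdigit():
            index = int(ref) - 1          # %N is 1-based
        elif ref in names:
            index = names.index(ref)      # %CSVFIELDNAME
        else:
            return match.group(0)         # assumption: leave unknown references as-is
        if not 0 <= index < len(record):
            return match.group(0)
        return record[index].strip()      # outer whitespace is stripped

    return re.sub(r"%([A-Za-z0-9_-]+)", lookup, template)


names = ["date", "description", "", "amount1", "", "", "somefield", "anotherfield"]
record = ["2019-11-06", "Coffee", "x", " 1 ", "", "", "note a", "note b"]
print(interpolate("%4 USD", record, names))
print(interpolate("note: %somefield - %anotherfield, date: %1", record, names))
```

Only CSV fields are looked up here, which mirrors the note above that hledger fields being assigned to cannot be interpolated.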
 .SS \f[C]date-format\f[R]
 .IP
 .nf
 \f[C]
 date-format DATEFMT
 \f[R]
 .fi
 .PP
-When your CSV date fields are not formatted like \f[C]YYYY/MM/DD\f[R]
+This is a helper for the \f[C]date\f[R] (and \f[C]date2\f[R]) fields.
-(or \f[C]YYYY-MM-DD\f[R] or \f[C]YYYY.MM.DD\f[R]), you\[aq]ll need to
+If your CSV dates are not formatted like \f[C]YYYY-MM-DD\f[R],
-specify the format.
+\f[C]YYYY/MM/DD\f[R] or \f[C]YYYY.MM.DD\f[R], you\[aq]ll need to specify
-DATEFMT is a strptime-like date parsing pattern, which must parse the
+the format by writing \[dq]date-format\[dq] followed by a strptime-like
-date field values completely.
+date parsing pattern, which must parse the date field values completely.
 Examples:
 .IP
 .nf
@@ -119,7 +191,7 @@ date-format %m/%d/%Y
 .IP
 .nf
 \f[C]
-# for dates like \[dq]6/11/2013\[dq] (note the - to make leading zeros optional):
+# for dates like \[dq]6/11/2013\[dq]. The - allows leading zeros to be optional.
 date-format %-d/%-m/%Y
 \f[R]
 .fi
@@ -137,90 +209,47 @@ date-format %Y-%h-%d
 date-format %-m/%-d/%Y %l:%M %p
 \f[R]
 .fi
-.SS field list
+.SS \f[C]if\f[R]
 .PP
 \f[C]fields\f[R]\f[I]\f[CI]FIELDNAME1\f[I]\f[R],
 \f[I]\f[CI]FIELDNAME2\f[I]\f[R]...
 .PP
 This (a) names the CSV fields, in order (names may not contain
 whitespace; uninteresting names may be left blank), and (b) assigns them
 to journal entry fields if you use any of these standard field names:
 \f[C]date\f[R], \f[C]date2\f[R], \f[C]status\f[R], \f[C]code\f[R],
 \f[C]description\f[R], \f[C]comment\f[R], \f[C]account1\f[R],
 \f[C]account2\f[R], \f[C]amount\f[R], \f[C]amount-in\f[R],
 \f[C]amount-out\f[R], \f[C]currency\f[R], \f[C]balance\f[R],
 \f[C]balance1\f[R], \f[C]balance2\f[R].
 Eg:
 .IP
 .nf
 \f[C]
-# use the 1st, 2nd and 4th CSV fields as the entry\[aq]s date, description and amount,
+if PATTERN
-# and give the 7th and 8th fields meaningful names for later reference:
+ RULE
-#
+
-# CSV field:
+if
-#      1     2            3 4       5 6 7          8
+PATTERN
-# entry field:
+PATTERN
-fields date, description, , amount, , , somefield, anotherfield
+PATTERN
-\f[R]
+ RULE
-.fi
+ RULE
 .SS field assignment
 .PP
 \f[I]\f[CI]ENTRYFIELDNAME\f[I]\f[R] \f[I]\f[CI]FIELDVALUE\f[I]\f[R]
 .PP
 This sets a journal entry field (one of the standard names above) to the
 given text value, which can include CSV field values interpolated by
 name (\f[C]%CSVFIELDNAME\f[R]) or 1-based position (\f[C]%N\f[R]).
 Eg:
 .IP
 .nf
 \f[C]
 # set the amount to the 4th CSV field with \[dq]USD \[dq] prepended
 amount USD %4
 \f[R]
 .fi
 .IP
 .nf
 \f[C]
 # combine three fields to make a comment (containing two tags)
 comment note: %somefield - %anotherfield, date: %1
 \f[R]
 .fi
 .PP
-Field assignments can be used instead of or in addition to a field list.
+Conditional blocks apply one or more rules to CSV records which are
 matched by any of the PATTERNs.
 This allows transactions to be customised or categorised based on
 patterns in the data.
 .PP
-Note, interpolation strips any outer whitespace, so a CSV value like
+A single pattern can be written on the same line as the \[dq]if\[dq]; or
-\f[C]\[dq] 1 \[dq]\f[R] becomes \f[C]1\f[R] when interpolated (#1051).
+multiple patterns can be written on the following lines, non-indented.
 .SS conditional block
 .PP
-\f[C]if\f[R] \f[I]\f[CI]PATTERN\f[I]\f[R]
+Patterns are case-insensitive regular expressions which try to match any
-.PD 0
+part of the whole CSV record.
-.P
+It\[aq]s not yet possible to match within a specific field.
-.PD
+Note the CSV record they see is close but not identical to the one in
-\ \ \ \ \f[I]\f[CI]FIELDASSIGNMENTS\f[I]\f[R]...
+the CSV file; eg double quotes are removed, and the separator character
 becomes comma.
 .PP
-\f[C]if\f[R]
+After the patterns, there should be one or more rules to apply, all
-.PD 0
+indented by at least one space.
-.P
+Three kinds of rule are allowed in conditional blocks:
-.PD
+.IP \[bu] 2
-\f[I]\f[CI]PATTERN\f[I]\f[R]
+field assignments (to set a field\[aq]s value)
-.PD 0
+.IP \[bu] 2
-.P
+skip (to skip the matched CSV record)
-.PD
+.IP \[bu] 2
-\f[I]\f[CI]PATTERN\f[I]\f[R]...
+end (to skip all remaining CSV records).
 .PD 0
 .P
 .PD
 \ \ \ \ \f[I]\f[CI]FIELDASSIGNMENTS\f[I]\f[R]...
 .PP
 This applies one or more field assignments, only to those CSV records
 matched by one of the PATTERNs.
 The patterns are case-insensitive regular expressions which match
 anywhere within the whole CSV record (it\[aq]s not yet possible to match
 within a specific field).
 When there are multiple patterns they can be written on separate lines,
 unindented.
 The field assignments are on separate lines indented by at least one
 space.
 Examples:
 .IP
 .nf
@@ -242,112 +271,319 @@ banking thru software
 comment  XXX deductible ? check it
 \f[R]
 .fi
-.SS include
+.SS \f[C]end\f[R]
 .PP
-\f[C]include\f[R]\f[I]\f[CI]RULESFILE\f[I]\f[R]
+As mentioned above, this rule can be used inside conditional blocks
-.PP
+(only) to cause hledger to stop reading CSV records and proceed with
-Include another rules file at this point.
+command execution.
 \f[C]RULESFILE\f[R] is either an absolute file path or a path relative
 to the current file\[aq]s directory.
 Eg:
 .IP
 .nf
 \f[C]
-# rules reused with several CSV files
+# ignore everything following the first empty record
-include common.rules
+if ,,,,
 end
 \f[R]
 .fi
 .SS \f[C]include\f[R]
 .IP
 .nf
 \f[C]
 include RULESFILE
 \f[R]
 .fi
 .SS newest-first
 .PP
-\f[C]newest-first\f[R]
+Include another CSV rules file at this point, as if it were written
 inline.
 \f[C]RULESFILE\f[R] is an absolute file path or a path relative to the
 current file\[aq]s directory.
 .PP
-Consider adding this rule if all of the following are true: you might be
+This can be useful eg for reusing common rules in several rules files:
-processing just one day of data, your CSV records are in reverse
+.IP
-chronological order (newest first), and you care about preserving the
+.nf
-order of same-day transactions.
+\f[C]
-It usually isn\[aq]t needed, because hledger autodetects the CSV order,
+# someaccount.csv.rules
-but when all CSV records have the same date it will assume they are
+
-oldest first.
+## someaccount-specific rules
-.SH CSV TIPS
+fields date,description,amount
-.SS CSV ordering
+account1 some:account
 account2 some:misc
 ## common rules
 include categorisation.rules
 \f[R]
 .fi
 .SS \f[C]newest-first\f[R]
 .PP
-The generated journal entries will be sorted by date.
+hledger always sorts the generated transactions by date.
-The order of same-day entries will be preserved (except in the special
+Transactions on the same date should appear in the same order as their
-case where you might need \f[C]newest-first\f[R], see above).
+CSV records, as hledger can usually auto-detect whether the CSV\[aq]s
-.SS CSV accounts
+normal order is oldest first or newest first.
-.PP
+But if all of the following are true:
 Each journal entry will have two postings, to \f[C]account1\f[R] and
 \f[C]account2\f[R] respectively.
 It\[aq]s not yet possible to generate entries with more than two
 postings.
 It\[aq]s conventional and recommended to use \f[C]account1\f[R] for the
 account whose CSV we are reading.
 .SS CSV amounts
 .PP
 A transaction amount must be set, in one of these ways:
 .IP \[bu] 2
-with an \f[C]amount\f[R] field assignment, which sets the first
+the CSV might sometimes contain just one day of data (all records having
-posting\[aq]s amount
+the same date)
 .IP \[bu] 2
-(When the CSV has debit and credit amounts in separate fields:)
+the CSV records are normally in reverse chronological order (newest
-.PD 0
+first)
 .P
 .PD
 with field assignments for the \f[C]amount-in\f[R] and
 \f[C]amount-out\f[R] pseudo fields (both of them).
 Whichever one has a value will be used, with appropriate sign.
 If both contain a value, it might not work so well.
 .IP \[bu] 2
-or implicitly by means of a balance assignment (see below).
+and you care about preserving the order of same-day transactions
 .PP
 you should add the \f[C]newest-first\f[R] rule as a hint.
 Eg:
 .IP
 .nf
 \f[C]
 # tell hledger explicitly that the CSV is normally newest-first
 newest-first
 \f[R]
 .fi
 .SH EXAMPLES
 .PP
 A more complete example, generating three-posting transactions:
 .IP
 .nf
 \f[C]
 # hledger CSV rules for amazon.com order history
 # sample:
 # \[dq]Date\[dq],\[dq]Type\[dq],\[dq]To/From\[dq],\[dq]Name\[dq],\[dq]Status\[dq],\[dq]Amount\[dq],\[dq]Fees\[dq],\[dq]Transaction ID\[dq]
 # \[dq]Jul 29, 2012\[dq],\[dq]Payment\[dq],\[dq]To\[dq],\[dq]Adapteva, Inc.\[dq],\[dq]Completed\[dq],\[dq]$25.00\[dq],\[dq]$0.00\[dq],\[dq]17LA58JSK6PRD4HDGLNJQPI1PB9N8DKPVHL\[dq]
 # skip one header line
 skip 1
 # name the csv fields (and assign the transaction\[aq]s date, amount and code)
 fields date, _, toorfrom, name, amzstatus, amount1, fees, code
 # how to parse the date
 date-format %b %-d, %Y
 # combine two fields to make the description
 description %toorfrom %name
 # save these fields as tags
 comment     status:%amzstatus
 # set the base account for all transactions
 account1    assets:amazon
 # flip the sign on the amount
 amount      -%amount
 # Put fees in a separate posting
 amount3     %fees
 comment3    fees
 \f[R]
 .fi
 .PP
 For more examples, see Convert CSV files.
 .SH TIPS
 .SS Reading multiple CSV files
 .PP
 You can read multiple CSV files at once using multiple \f[C]-f\f[R]
 arguments on the command line.
 hledger will look for a correspondingly-named rules file for each CSV
 file.
 If you use the \f[C]--rules-file\f[R] option, that rules file will be
 used for all the CSV files.
 .SS Deduplicating, importing
 .PP
 When you download a CSV file repeatedly, eg to get your latest bank
 transactions, the new file may contain some of the same records as the
 old one.
 The print --new command is one simple way to detect just the new
 transactions.
 Or better still, the import command appends those new transactions to
 your main journal.
 This is the easiest way to import CSV data.
 Eg, after downloading your latest CSV files:
 .IP
 .nf
 \f[C]
 $ hledger import *.csv [--dry]
 \f[R]
 .fi
 .SS Other import methods
 .PP
 A number of other tools and workflows, hledger-specific and otherwise,
 exist for converting, deduplicating, classifying and managing CSV data.
 See:
 .IP \[bu] 2
 https://hledger.org -> sidebar -> real world setups
 .IP \[bu] 2
 https://plaintextaccounting.org -> data import/conversion
 .SS Valid CSV
 .PP
 hledger accepts CSV conforming to RFC 4180.
 Some things to note when values are enclosed in quotes:
 .IP \[bu] 2
 you must use double quotes (not single quotes)
 .IP \[bu] 2
 spaces outside the quotes are not allowed
 .SS Other separator characters
 .PP
 With the \f[C]--separator \[aq]CHAR\[aq]\f[R] option, hledger will
 expect the separator to be CHAR instead of a comma.
 Ie it will read other \[dq]Character Separated Values\[dq] formats, such
 as TSV (Tab Separated Values).
 Note: on the command line, use a real tab character in quotes, not \[rs]t.
 Eg:
 .IP
 .nf
 \f[C]
 $ hledger -f foo.tsv --separator \[aq]  \[aq] print
 \f[R]
 .fi
 .PP
 (Experimental.)
 .SS Setting amounts
 .PP
 A posting amount can be set in one of these ways:
 .IP \[bu] 2
 by assigning (with a fields list or field assignment) to
 \f[C]amountN\f[R] (posting N\[aq]s amount) or \f[C]amount\f[R] (posting
 1\[aq]s amount)
 .IP \[bu] 2
 by assigning to \f[C]amountN-in\f[R] and \f[C]amountN-out\f[R] (or
 \f[C]amount-in\f[R] and \f[C]amount-out\f[R]).
 For each CSV record, whichever of these has a non-zero value will be
 used, with appropriate sign.
 If both contain a non-zero value, this may not work.
 .IP \[bu] 2
 by assigning to \f[C]balanceN\f[R] (or \f[C]balance\f[R]) instead of the
 above, setting the amount indirectly via a balance assignment.
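As a rough illustration of the amount-in/amount-out rule above, here is a small Python sketch. It is not hledger's code: the function name is hypothetical, values are assumed to be plain decimal numbers without currency symbols, "appropriate sign" is assumed to mean that the out value is negated, and raising an error when both are non-zero is a choice made for this sketch only.

```python
from decimal import Decimal

def amount_from_in_out(amount_in, amount_out):
    """Pick whichever of the two CSV values is non-zero."""
    def parse(text):
        text = text.strip()
        return Decimal(text) if text else Decimal(0)

    money_in, money_out = parse(amount_in), parse(amount_out)
    if money_in != 0 and money_out != 0:
        # The manual only says this "may not work"; this sketch refuses outright.
        raise ValueError("both amount-in and amount-out are non-zero")
    if money_in != 0:
        return money_in
    return -money_out  # assumption: money going out is recorded as negative


print(amount_from_in_out("25.00", ""))
print(amount_from_in_out("", "4.50"))
```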
 .PP
 There is some special handling for sign in amounts:
 .IP \[bu] 2
 If an amount value is parenthesised, it will be de-parenthesised and
 sign-flipped.
 .IP \[bu] 2
-If an amount value begins with a double minus sign, those will cancel
+If an amount value begins with a double minus sign, those cancel out and
-out and be removed.
+are removed.
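The two sign rules above can be sketched in Python as follows. The function name `normalise_sign` is hypothetical, and applying the parenthesis rule before the double-minus rule is an assumption of this sketch rather than documented behaviour.

```python
def normalise_sign(amount):
    """Apply the parenthesis and double-minus rules to a raw amount string."""
    text = amount.strip()
    # A parenthesised value is de-parenthesised and sign-flipped.
    if text.startswith("(") and text.endswith(")"):
        text = "-" + text[1:-1].strip()
    # A leading double minus cancels out and is removed.
    if text.startswith("--"):
        text = text[2:]
    return text


for raw in ["(10.00)", "--5", "12"]:
    print(raw, "->", normalise_sign(raw))
```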
 .PP
 If the currency/commodity symbol is provided as a separate CSV field,
-assign it to the \f[C]currency\f[R] pseudo field; the symbol will be
+you can assign it to \f[C]currency\f[R] (affects all posting amounts) or
-prepended to the amount (TODO: when there is an amount).
+\f[C]currencyN\f[R] (affects just posting N\[aq]s amount).
-Or, you can use an \f[C]amount\f[R] field assignment for more control,
+The symbol will be prepended to the amount.
-eg:
+Or for more control, you can set both currency symbol and amount with a
 field assignment, eg:
 .IP
 .nf
 \f[C]
 fields date,description,currency,amount
 # add currency symbol on the right:
 amount %amount %currency
 \f[R]
 .fi
-.SS CSV balance assertions/assignments
+.SS Referencing other fields
 .PP
-If the CSV includes a running balance, you can assign that to one of the
+In field assignments, you can interpolate only CSV fields, not hledger
-pseudo fields \f[C]balance\f[R] (or \f[C]balance1\f[R]) or
+fields.
-\f[C]balance2\f[R].
+In the example below, there\[aq]s both a CSV field and a hledger field
-This will generate a balance assertion (or if the amount is left empty,
+named amount1, but %amount1 always means the CSV field, not the hledger
-a balance assignment), on the first or second posting, whenever the
+field:
-running balance field is non-empty.
+.IP
-(TODO: #1000)
+.nf
-.SS Reading multiple CSV files
+\f[C]
 # Name the third CSV field \[dq]amount1\[dq]
 fields date,description,amount1
 # Set hledger\[aq]s amount1 to the CSV amount1 field followed by USD
 amount1 %amount1 USD
 # Set comment to the CSV amount1 (not the amount1 assigned above)
 comment %amount1
 \f[R]
 .fi
 .PP
-You can read multiple CSV files at once using multiple \f[C]-f\f[R]
+Here, since there\[aq]s no CSV amount1 field, %amount1 will produce a
-arguments on the command line, and hledger will look for a
+literal \[dq]amount1\[dq]:
-correspondingly-named rules file for each.
+.IP
-Note if you use the \f[C]--rules-file\f[R] option, this one rules file
+.nf
-will be used for all the CSV files being read.
+\f[C]
-.SS Valid CSV
+fields date,description,csvamount
 amount1 %csvamount USD
 # Can\[aq]t interpolate amount1 here
 comment %amount1
 \f[R]
 .fi
 .PP
-hledger follows RFC 4180, with the addition of a customisable separator
+When there are multiple field assignments to the same hledger field,
-character.
+only the last one takes effect.
 Here, comment\[aq]s value will be B, or C if \[dq]something\[dq] is
 matched, but never A:
 .IP
 .nf
 \f[C]
 comment A
 comment B
 if something
 comment C
 \f[R]
 .fi
 .SS How CSV rules are evaluated
 .PP
-Some things to note:
+Here\[aq]s how to think of CSV rules being evaluated (if you really need
-.PP
+to).
-When quoting fields,
+First,
 .IP \[bu] 2
-you must use double quotes, not single quotes
+include - all includes are inlined, from top to bottom, depth first.
 (At each include point the file is inlined and scanned for further
 includes, before proceeding.)
 .PP
 Then \[dq]global\[dq] rules are evaluated, top to bottom.
 If a rule is repeated, the last one wins:
 .IP \[bu] 2
-spaces outside the quotes are not allowed.
+skip (at top level)
 .IP \[bu] 2
 date-format
 .IP \[bu] 2
 newest-first
 .IP \[bu] 2
 fields - names the CSV fields, optionally sets up initial assignments to
 hledger fields
 .PP
 Then for each CSV record in turn:
 .IP \[bu] 2
 test all \f[C]if\f[R] blocks.
 If any of them contain an \f[C]end\f[R] rule, skip all remaining CSV
 records.
 Otherwise if any of them contain a \f[C]skip\f[R] rule, skip that many
 CSV records.
 If there are multiple matched skip rules, the first one wins.
 .IP \[bu] 2
 collect all field assignments at top level and in matched if blocks.
 When there are multiple assignments for a field, keep only the last one.
 .IP \[bu] 2
 compute a value for each hledger field - either the one that was
 assigned to it (and interpolate the %CSVFIELDNAME references), or a
 default
 .IP \[bu] 2
 generate a synthetic hledger transaction from these values, which
 becomes part of the input to the hledger command that has been selected
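The "collect all field assignments, keep only the last one" step above can be sketched in Python. This is a simplified model, not hledger's implementation: the function name and data shapes are hypothetical, and it assumes assignments in matched if blocks are considered after all top-level assignments.

```python
import re

def effective_assignments(top_level, if_blocks, record_text):
    """Collect field assignments for one CSV record; the last one per field wins.

    top_level: list of (field, value) pairs in file order.
    if_blocks: list of (patterns, assignments); patterns is a list of regular
               expressions, assignments a list of (field, value) pairs.
    """
    chosen = {}
    for field, value in top_level:
        chosen[field] = value
    for patterns, assignments in if_blocks:
        # A block applies if any of its patterns matches anywhere in the
        # record, case-insensitively.
        if any(re.search(p, record_text, re.IGNORECASE) for p in patterns):
            for field, value in assignments:
                chosen[field] = value
    return chosen


top_level = [("comment", "A"), ("comment", "B")]
if_blocks = [(["something"], [("comment", "C")])]
print(effective_assignments(top_level, if_blocks, "2019-11-06,Something else,5"))
print(effective_assignments(top_level, if_blocks, "2019-11-06,coffee,5"))
```

With these inputs the comment is C for the first record and B for the second, never A, matching the example in "Referencing other fields".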
 .SS Valid transactions
 .PP
 hledger currently does not post-process and validate transactions
 generated from CSV as thoroughly as transactions read from a journal
 file.
 This means that if your rules are wrong, you can generate invalid
 transactions.
 Or, amounts may not be displayed with a canonical display style.
 .PP
 So when setting up or adjusting CSV rules, you should check your results
 visually with the print command.
 You can pipe print\[aq]s output through hledger once more to validate
 and canonicalise fully.
 Eg:
 .IP
 .nf
 \f[C]
 $ hledger -f some.csv print | hledger -f- print -I
 \f[R]
 .fi
 .PP
 (The -I/--ignore-assertions flag disables balance assertion checks,
 usually needed when re-parsing print output.)
 .SH "REPORTING BUGS"
--- a/hledger-lib/hledger_csv.info
+++ b/hledger-lib/hledger_csv.info
@@ -14,8 +14,8 @@ transaction.  (To learn about _writing_ CSV, see CSV output.)
 rules.  These do several things:
   * they describe the layout and format of the CSV data
-   * they can customize the generated journal entries using a simple
+   * they can customize the generated journal entries (transactions)
-     templating language
+     using a simple templating language
   * they can add refinements based on patterns in the CSV data, eg
     categorizing transactions with more detailed account names.
@@ -33,93 +33,164 @@ fields date, _, _, amount
 date-format  %d/%m/%Y
 skip 1
-   A more complete example:
+   More examples in the EXAMPLES section below.
 # hledger CSV rules for amazon.com order history
 # sample:
 # "Date","Type","To/From","Name","Status","Amount","Fees","Transaction ID"
 # "Jul 29, 2012","Payment","To","Adapteva, Inc.","Completed","$25.00","$0.00","17LA58JSK6PRD4HDGLNJQPI1PB9N8DKPVHL"
 # skip one header line
 skip 1
 # name the csv fields (and assign the transaction's date, amount and code)
 fields date, _, toorfrom, name, amzstatus, amount, fees, code
 # how to parse the date
 date-format %b %-d, %Y
 # combine two fields to make the description
 description %toorfrom %name
 # save these fields as tags
 comment     status:%amzstatus, fees:%fees
 # set the base account for all transactions
 account1    assets:amazon
 # flip the sign on the amount
 amount      -%amount
   For more examples, see Convert CSV files.
 * Menu:
 * CSV RULES::
-* CSV TIPS::
+* EXAMPLES::
 * TIPS::
-File: hledger_csv.info,  Node: CSV RULES,  Next: CSV TIPS,  Prev: Top,  Up: Top
+File: hledger_csv.info,  Node: CSV RULES,  Next: EXAMPLES,  Prev: Top,  Up: Top
 1 CSV RULES
 ***********
-The following seven kinds of rule can appear in the rules file, in any
+The following kinds of rule can appear in the rules file, in any order
-order.  Blank lines and lines beginning with '#' or ';' are ignored.
+(except for 'end' which can appear only inside a conditional block).
 Blank lines and lines beginning with '#' or ';' are ignored.
 * Menu:
 * skip::
-* date-format::
+* fields::
 * field list::
 * field assignment::
-* conditional block::
+* date-format::
 * if::
 * end::
 * include::
 * newest-first::
-File: hledger_csv.info,  Node: skip,  Next: date-format,  Up: CSV RULES
+File: hledger_csv.info,  Node: skip,  Next: fields,  Up: CSV RULES
-1.1 skip
+1.1 'skip'
-========
+==========
-'skip'_'N'_
+skip N
-   Skip this many non-empty lines preceding the CSV data.  (Empty/blank
+   The word "skip" followed by a number (or no number, meaning 1) tells
-lines are skipped automatically.)  You'll need this whenever your CSV
+hledger to ignore this many non-empty lines preceding the CSV data.
-data contains header lines.  Eg:
+(Empty/blank lines are skipped automatically.)  You'll need this
 whenever your CSV data contains header lines.
-# ignore the first CSV line
+   It also has a second purpose: it can be used to ignore certain CSV
-skip 1
+records, see conditional blocks below.
-File: hledger_csv.info,  Node: date-format,  Next: field list,  Prev: skip,  Up: CSV RULES
+File: hledger_csv.info,  Node: fields,  Next: field assignment,  Prev: skip,  Up: CSV RULES
-1.2 date-format
+1.2 'fields'
-===============
+============
-'date-format'_'DATEFMT'_
+fields FIELDNAME1, FIELDNAME2, ...
-   When your CSV date fields are not formatted like 'YYYY/MM/DD' (or
+   A fields list ("fields" followed by one or more comma-separated field
-'YYYY-MM-DD' or 'YYYY.MM.DD'), you'll need to specify the format.
+names) is the quick way to assign CSV field values to hledger fields.
-DATEFMT is a strptime-like date parsing pattern, which must parse the
+It (a) names the CSV fields, in order (names may not contain whitespace;
-date field values completely.  Examples:
+fields you don't care about can be left unnamed), and (b) assigns them
 to hledger fields if you use standard hledger field names.  Here's an
 example:
 # use the 1st, 2nd and 4th CSV fields as the transaction's date, description and amount,
 # ignore the 3rd, 5th and 6th fields,
 # and name the 7th and 8th fields for later reference:
 #      1     2           3  4       5 6  7          8
 fields date, description, , amount1, , , somefield, anotherfield
   Here are the standard hledger field names:
 * Menu:
 * Transaction fields::
 * Posting fields::
 File: hledger_csv.info,  Node: Transaction fields,  Next: Posting fields,  Up: fields
 1.2.1 Transaction fields
 ------------------------
 'date', 'date2', 'status', 'code', 'description', 'comment' can be used
 to form the transaction's first line.  Only 'date' is required.  (See
 also date-format below.)
 File: hledger_csv.info,  Node: Posting fields,  Prev: Transaction fields,  Up: fields
 1.2.2 Posting fields
 --------------------
 'accountN', where N is 1 to 9, sets the Nth posting's account name.
 Most often there are two postings, so you'll want to set 'account1' and
 'account2'.
   A number of field/pseudo-field names are available for setting
 posting amounts:
   * 'amountN' sets posting N's amount
   * 'amountN-in' and 'amountN-out' can be used instead, if the CSV has
     separate fields for debits and credits
   * 'currencyN' sets a currency symbol to be left-prefixed to the
     amount, useful if the CSV provides that as a separate field
   * 'balanceN' sets a (separate) balance assertion amount (or when no
     posting amount is set, a balance assignment)
   If you write these with no number ('amount', 'amount-in',
 'amount-out', 'currency', 'balance'), it means posting 1.  Also, if you
 set an amount for posting 1 only, a second posting that balances the
 transaction will be generated automatically.  This helps support CSV
 rules created before hledger 1.16.
   Finally, 'commentN' sets a comment on the Nth posting.  Comments can
 of course contain tags.
 File: hledger_csv.info,  Node: field assignment,  Next: date-format,  Prev: fields,  Up: CSV RULES
 1.3 '(field assignment)'
 ========================
 HLEDGERFIELDNAME FIELDVALUE
   Instead of or in addition to a fields list, you can assign a value to
 a hledger field by writing its name (any of the standard names above)
 followed by a text value.  The value may contain interpolated CSV
 fields, referenced by their 1-based position in the CSV record ('%N'),
 or by the name they were given in the fields list ('%CSVFIELDNAME').
 Eg:
 # set the amount to the 4th CSV field, with " USD" appended
 amount %4 USD
 # combine three fields to make a comment, containing note: and date: tags
 comment note: %somefield - %anotherfield, date: %1
   Interpolation strips any outer whitespace, so a CSV value like '" 1
 "' becomes '1' when interpolated (#1051).  Note you can only interpolate
 CSV fields, not the hledger fields being assigned to; for more on this,
 see TIPS.
 File: hledger_csv.info,  Node: date-format,  Next: if,  Prev: field assignment,  Up: CSV RULES
 1.4 'date-format'
 =================
 date-format DATEFMT
   This is a helper for the 'date' (and 'date2') fields.  If your CSV
 dates are not formatted like 'YYYY-MM-DD', 'YYYY/MM/DD' or 'YYYY.MM.DD',
 you'll need to specify the format by writing "date-format" followed by a
 strptime-like date parsing pattern, which must parse the date field
 values completely.  Examples:
 # for dates like "11/06/2013":
 date-format %m/%d/%Y
-# for dates like "6/11/2013" (note the - to make leading zeros optional):
+# for dates like "6/11/2013". The - allows leading zeros to be optional.
 date-format %-d/%-m/%Y
 # for dates like "2013-Nov-06":
@@ -129,73 +200,43 @@ date-format %Y-%h-%d
 date-format %-m/%-d/%Y %l:%M %p
-File: hledger_csv.info,  Node: field list,  Next: field assignment,  Prev: date-format,  Up: CSV RULES
+File: hledger_csv.info,  Node: if,  Next: end,  Prev: date-format,  Up: CSV RULES
-1.3 field list
+1.5 'if'
-==============
+========
-'fields'_'FIELDNAME1'_, _'FIELDNAME2'_...
+if PATTERN
 RULE
-   This (a) names the CSV fields, in order (names may not contain
+if
-whitespace; uninteresting names may be left blank), and (b) assigns them
+PATTERN
-to journal entry fields if you use any of these standard field names:
+PATTERN
-'date', 'date2', 'status', 'code', 'description', 'comment', 'account1',
+PATTERN
-'account2', 'amount', 'amount-in', 'amount-out', 'currency', 'balance',
+ RULE
-'balance1', 'balance2'.  Eg:
+ RULE
-# use the 1st, 2nd and 4th CSV fields as the entry's date, description and amount,
+   Conditional blocks apply one or more rules to CSV records which are
-# and give the 7th and 8th fields meaningful names for later reference:
+matched by any of the PATTERNs.  This allows transactions to be
-#
+customised or categorised based on patterns in the data.
 # CSV field:
 #      1     2            3 4       5 6 7          8
 # entry field:
 fields date, description, , amount, , , somefield, anotherfield
-
+   A single pattern can be written on the same line as the "if"; or
-File: hledger_csv.info,  Node: field assignment,  Next: conditional block,  Prev: field list,  Up: CSV RULES
+multiple patterns can be written on the following lines, non-indented.
-1.4 field assignment
+   Patterns are case-insensitive regular expressions which try to match
-====================
+any part of the whole CSV record.  It's not yet possible to match within
 a specific field.  Note the CSV record they see is close but not
 identical to the one in the CSV file; eg double quotes are removed, and
 the separator character becomes comma.
-_'ENTRYFIELDNAME'_ _'FIELDVALUE'_
+   After the patterns, there should be one or more rules to apply, all
 indented by at least one space.  Three kinds of rule are allowed in
 conditional blocks:
-   This sets a journal entry field (one of the standard names above) to
+   * field assignments (to set a field's value)
-the given text value, which can include CSV field values interpolated by
+   * skip (to skip the matched CSV record)
-name ('%CSVFIELDNAME') or 1-based position ('%N').  Eg:
+   * end (to skip all remaining CSV records).
-# set the amount to the 4th CSV field with "USD " prepended
+   Examples:
 amount USD %4
 # combine three fields to make a comment (containing two tags)
 comment note: %somefield - %anotherfield, date: %1
   Field assignments can be used instead of or in addition to a field
 list.
   Note, interpolation strips any outer whitespace, so a CSV value like
 '" 1 "' becomes '1' when interpolated (#1051).
 File: hledger_csv.info,  Node: conditional block,  Next: include,  Prev: field assignment,  Up: CSV RULES
 1.5 conditional block
 =====================
 'if' _'PATTERN'_
    _'FIELDASSIGNMENTS'_...
   'if'
 _'PATTERN'_
 _'PATTERN'_...
    _'FIELDASSIGNMENTS'_...
   This applies one or more field assignments, only to those CSV records
 matched by one of the PATTERNs.  The patterns are case-insensitive
 regular expressions which match anywhere within the whole CSV record
 (it's not yet possible to match within a specific field).  When there
 are multiple patterns they can be written on separate lines, unindented.
 The field assignments are on separate lines indented by at least one
 space.  Examples:
 # if the CSV record contains "groceries", set account2 to "expenses:groceries"
 if groceries
@ -210,176 +251,369 @@ banking thru software
 comment  XXX deductible ? check it
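The matching behaviour described in this section (case-insensitive, anywhere in the whole record, with double quotes removed and the separator turned into a comma) can be approximated in Python as below. This is a sketch under assumptions: `record_text` and `matches_any` are hypothetical helper names, and Python's `re` dialect may differ in details from the regular expressions hledger accepts.

```python
import csv
import io
import re


def record_text(csv_line, separator=","):
    """Approximate the text a pattern sees: unquoted fields joined by commas."""
    fields = next(csv.reader(io.StringIO(csv_line), delimiter=separator))
    return ",".join(fields)


def matches_any(patterns, text):
    """True if any pattern matches anywhere in text, ignoring case."""
    return any(re.search(p, text, re.IGNORECASE) for p in patterns)


# A semicolon-separated record with quoted fields.
line = '"2019-11-06";"Whole Foods GROCERIES";"-12.50"'
text = record_text(line, separator=";")

grocery_hit = matches_any(["groceries"], text)
other_hit = matches_any(["banking thru software", "^rent"], text)
```

Because the pattern is tested against the whole record, a pattern such as `groceries` would also match if that word appeared in any other field.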
-File: hledger_csv.info,  Node: include,  Next: newest-first,  Prev: conditional block,  Up: CSV RULES
+File: hledger_csv.info,  Node: end,  Next: include,  Prev: if,  Up: CSV RULES
-1.6 include
+1.6 'end'
-===========
+=========
-'include'_'RULESFILE'_
+As mentioned above, this rule can be used inside conditional blocks
 (only) to cause hledger to stop reading CSV records and proceed with
 command execution.  Eg:
-   Include another rules file at this point.  'RULESFILE' is either an
+# ignore everything following the first empty record
-absolute file path or a path relative to the current file's directory.
+if ,,,,
-Eg:
+ end
-# rules reused with several CSV files
+
-include common.rules
+File: hledger_csv.info,  Node: include,  Next: newest-first,  Prev: end,  Up: CSV RULES
 1.7 'include'
 =============
 include RULESFILE
   Include another CSV rules file at this point, as if it were written
 inline.  'RULESFILE' is an absolute file path or a path relative to the
 current file's directory.
   This can be useful eg for reusing common rules in several rules
 files:
 # someaccount.csv.rules
 ## someaccount-specific rules
 fields date,description,amount
 account1 some:account
 account2 some:misc
 ## common rules
 include categorisation.rules
 File: hledger_csv.info,  Node: newest-first,  Prev: include,  Up: CSV RULES
-1.7 newest-first
+1.8 'newest-first'
-================
+==================
-'newest-first'
+hledger always sorts the generated transactions by date.  Transactions
 on the same date should appear in the same order as their CSV records,
 as hledger can usually auto-detect whether the CSV's normal order is
 oldest first or newest first.  But if all of the following are true:
-   Consider adding this rule if all of the following are true: you might
+   * the CSV might sometimes contain just one day of data (all records
-be processing just one day of data, your CSV records are in reverse
+     having the same date)
-chronological order (newest first), and you care about preserving the
+   * the CSV records are normally in reverse chronological order (newest
-order of same-day transactions.  It usually isn't needed, because
+     first)
-hledger autodetects the CSV order, but when all CSV records have the
+   * and you care about preserving the order of same-day transactions
-same date it will assume they are oldest first.
+
   you should add the 'newest-first' rule as a hint.  Eg:
 # tell hledger explicitly that the CSV is normally newest-first
 newest-first
-File: hledger_csv.info,  Node: CSV TIPS,  Prev: CSV RULES,  Up: Top
+File: hledger_csv.info,  Node: EXAMPLES,  Next: TIPS,  Prev: CSV RULES,  Up: Top
-2 CSV TIPS
+2 EXAMPLES
 **********
 A more complete example, generating three-posting transactions:
 # hledger CSV rules for amazon.com order history
 # sample:
 # "Date","Type","To/From","Name","Status","Amount","Fees","Transaction ID"
 # "Jul 29, 2012","Payment","To","Adapteva, Inc.","Completed","$25.00","$0.00","17LA58JSK6PRD4HDGLNJQPI1PB9N8DKPVHL"
 # skip one header line
 skip 1
 # name the csv fields (and assign the transaction's date, amount and code)
 fields date, _, toorfrom, name, amzstatus, amount1, fees, code
 # how to parse the date
 date-format %b %-d, %Y
 # combine two fields to make the description
 description %toorfrom %name
 # save these fields as tags
 comment     status:%amzstatus
 # set the base account for all transactions
 account1    assets:amazon
 # flip the sign on the amount
 amount1     -%amount1
 # Put fees in a separate posting
 amount3     %fees
 comment3    fees
   For more examples, see Convert CSV files.
 File: hledger_csv.info,  Node: TIPS,  Prev: EXAMPLES,  Up: Top
 3 TIPS
 ******
 * Menu:
 * CSV ordering::
 * CSV accounts::
 * CSV amounts::
 * CSV balance assertions/assignments::
 * Reading multiple CSV files::
 * Deduplicating importing::
 * Other import methods::
 * Valid CSV::
 * Other separator characters::
 * Setting amounts::
 * Referencing other fields::
 * How CSV rules are evaluated::
 * Valid transactions::
-File: hledger_csv.info,  Node: CSV ordering,  Next: CSV accounts,  Up: CSV TIPS
+File: hledger_csv.info,  Node: Reading multiple CSV files,  Next: Deduplicating importing,  Up: TIPS
-2.1 CSV ordering
+3.1 Reading multiple CSV files
-================
+==============================
-The generated journal entries will be sorted by date.  The order of
+You can read multiple CSV files at once using multiple '-f' arguments on
-same-day entries will be preserved (except in the special case where you
+the command line.  hledger will look for a correspondingly-named rules
-might need 'newest-first', see above).
+file for each CSV file.  If you use the '--rules-file' option, that
 rules file will be used for all the CSV files.
-File: hledger_csv.info,  Node: CSV accounts,  Next: CSV amounts,  Prev: CSV ordering,  Up: CSV TIPS
+File: hledger_csv.info,  Node: Deduplicating importing,  Next: Other import methods,  Prev: Reading multiple CSV files,  Up: TIPS
-2.2 CSV accounts
+3.2 Deduplicating, importing
-================
+============================
-Each journal entry will have two postings, to 'account1' and 'account2'
+When you download a CSV file repeatedly, eg to get your latest bank
-respectively.  It's not yet possible to generate entries with more than
+transactions, the new file may contain some of the same records as the
-two postings.  It's conventional and recommended to use 'account1' for
+old one.  The print --new command is one simple way to detect just the
-the account whose CSV we are reading.
+new transactions.  Or better still, the import command appends those new
 transactions to your main journal.  This is the easiest way to import
 CSV data.  Eg, after downloading your latest CSV files:
 $ hledger import *.csv [--dry]
-File: hledger_csv.info,  Node: CSV amounts,  Next: CSV balance assertions/assignments,  Prev: CSV accounts,  Up: CSV TIPS
+File: hledger_csv.info,  Node: Other import methods,  Next: Valid CSV,  Prev: Deduplicating importing,  Up: TIPS
-2.3 CSV amounts
+3.3 Other import methods
-===============
+========================
-A transaction amount must be set, in one of these ways:
+A number of other tools and workflows, hledger-specific and otherwise,
 exist for converting, deduplicating, classifying and managing CSV data.
 See:
-   * with an 'amount' field assignment, which sets the first posting's
+   * https://hledger.org -> sidebar -> real world setups
-     amount
+   * https://plaintextaccounting.org -> data import/conversion
-   * (When the CSV has debit and credit amounts in separate fields:)
+
-     with field assignments for the 'amount-in' and 'amount-out' pseudo
+File: hledger_csv.info,  Node: Valid CSV,  Next: Other separator characters,  Prev: Other import methods,  Up: TIPS
     fields (both of them).  Whichever one has a value will be used,
     with appropriate sign.  If both contain a value, it might not work
     so well.
-   * or implicitly by means of a balance assignment (see below).
+3.4 Valid CSV
 =============
 hledger accepts CSV conforming to RFC 4180.  Some things to note when
 values are enclosed in quotes:
   * you must use double quotes (not single quotes)
   * spaces outside the quotes are not allowed
 File: hledger_csv.info,  Node: Other separator characters,  Next: Setting amounts,  Prev: Valid CSV,  Up: TIPS
 3.5 Other separator characters
 ==============================
 With the '--separator 'CHAR'' option, hledger will expect the separator
 to be CHAR instead of a comma.  Ie it will read other "Character
 Separated Values" formats, such as TSV (Tab Separated Values).  Note: on
 the command line, use a real tab character in quotes, not \t.  Eg:
 $ hledger -f foo.tsv --separator '  ' print
   (Experimental.)
 File: hledger_csv.info,  Node: Setting amounts,  Next: Referencing other fields,  Prev: Other separator characters,  Up: TIPS
 3.6 Setting amounts
 ===================
 A posting amount can be set in one of these ways:
   * by assigning (with a fields list or field assignment) to 'amountN'
     (posting N's amount) or 'amount' (posting 1's amount)
   * by assigning to 'amountN-in' and 'amountN-out' (or 'amount-in' and
     'amount-out').  For each CSV record, whichever of these has a
     non-zero value will be used, with appropriate sign.  If both
     contain a non-zero value, this may not work.
   * by assigning to 'balanceN' (or 'balance') instead of the above,
     setting the amount indirectly via a balance assignment.
   There is some special handling for sign in amounts:
   * If an amount value is parenthesised, it will be de-parenthesised
     and sign-flipped.
-   * If an amount value begins with a double minus sign, those will
+   * If an amount value begins with a double minus sign, those cancel
-     cancel out and be removed.
+     out and are removed.
   If the currency/commodity symbol is provided as a separate CSV field,
-assign it to the 'currency' pseudo field; the symbol will be prepended
+you can assign it to 'currency' (affects all posting amounts) or
-to the amount (TODO: when there is an amount).  Or, you can use an
+'currencyN' (affects just posting N's amount).  The symbol will be
-'amount' field assignment for more control, eg:
+prepended to the amount.  Or for more control, you can set both currency
 symbol and amount with a field assignment, eg:
 fields date,description,currency,amount
 # add currency symbol on the right:
 amount %amount %currency
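The two sign rules above (parenthesised values and double minus signs) can be illustrated with a small Python sketch. The function name `normalise_sign` is hypothetical, and the way the two rules combine for a value like `(-10.00)` is an assumption of this sketch rather than something stated in the text.

```python
def normalise_sign(amount):
    """Apply the parenthesis and double-minus rules to an amount string."""
    s = amount.strip()
    # "(10.00)" is de-parenthesised and sign-flipped.
    if s.startswith("(") and s.endswith(")"):
        s = "-" + s[1:-1].strip()
    # A leading double minus cancels out and is removed.
    if s.startswith("--"):
        s = s[2:]
    return s


parenthesised = normalise_sign("(10.00)")
double_minus = normalise_sign("--10.00")
negative_in_parens = normalise_sign("(-10.00)")  # assumed to combine both rules
plain = normalise_sign("12.34")
```

The double-minus rule is what makes an assignment like `amount -%amount` safe when the CSV value is already negative.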
-File: hledger_csv.info,  Node: CSV balance assertions/assignments,  Next: Reading multiple CSV files,  Prev: CSV amounts,  Up: CSV TIPS
+File: hledger_csv.info,  Node: Referencing other fields,  Next: How CSV rules are evaluated,  Prev: Setting amounts,  Up: TIPS
-2.4 CSV balance assertions/assignments
+3.7 Referencing other fields
-======================================
+============================
-If the CSV includes a running balance, you can assign that to one of the
+In field assignments, you can interpolate only CSV fields, not hledger
-pseudo fields 'balance' (or 'balance1') or 'balance2'.  This will
+fields.  In the example below, there's both a CSV field and a hledger
-generate a balance assertion (or if the amount is left empty, a balance
+field named amount1, but %amount1 always means the CSV field, not the
-assignment), on the first or second posting, whenever the running
+hledger field:
-balance field is non-empty.  (TODO: #1000)
+
 # Name the third CSV field "amount1"
 fields date,description,amount1
 # Set hledger's amount1 to the CSV amount1 field followed by USD
 amount1 %amount1 USD
 # Set comment to the CSV amount1 (not the amount1 assigned above)
 comment %amount1
   Here, since there's no CSV amount1 field, %amount1 will produce a
 literal "amount1":
 fields date,description,csvamount
 amount1 %csvamount USD
 # Can't interpolate amount1 here
 comment %amount1
   When there are multiple field assignments to the same hledger field,
 only the last one takes effect.  Here, comment's value will be B, or
 C if "something" is matched, but never A:
 comment A
 comment B
 if something
 comment C
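The interpolation behaviour in the first two examples can be sketched in Python as follows. This is an illustration, not hledger's code: `interpolate` is a hypothetical name, the set of characters allowed in a reference is an assumption, and the handling of unknown names simply follows the statement above that `%amount1` yields a literal "amount1" when no CSV field has that name.

```python
import re


def interpolate(template, csv_names, record):
    """Replace %N and %CSVFIELDNAME with values taken from the CSV record only."""

    def replace(match):
        ref = match.group(1)
        if ref.isdigit():
            index = int(ref) - 1  # %N is a 1-based position
        elif ref in csv_names:
            index = csv_names.index(ref)
        else:
            return ref  # no such CSV field: the name is left as literal text
        # Outer whitespace is stripped from interpolated values.
        return record[index].strip() if 0 <= index < len(record) else ""

    return re.sub(r"%([A-Za-z0-9_-]+)", replace, template)


record = ["2019-11-06", "coffee", " 3.50 "]

# fields date,description,amount1
names = ["date", "description", "amount1"]
amount1 = interpolate("%amount1 USD", names, record)
comment = interpolate("%amount1", names, record)  # the CSV value, not "3.50 USD"

# fields date,description,csvamount
other_names = ["date", "description", "csvamount"]
literal = interpolate("%amount1", other_names, record)
```

Note that `interpolate` never looks at previously assigned hledger fields, which is why `comment` gets the raw CSV value.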
-File: hledger_csv.info,  Node: Reading multiple CSV files,  Next: Valid CSV,  Prev: CSV balance assertions/assignments,  Up: CSV TIPS
+File: hledger_csv.info,  Node: How CSV rules are evaluated,  Next: Valid transactions,  Prev: Referencing other fields,  Up: TIPS
-2.5 Reading multiple CSV files
+3.8 How CSV rules are evaluated
-==============================
+===============================
-You can read multiple CSV files at once using multiple '-f' arguments on
+Here's how to think of CSV rules being evaluated (if you really need
-the command line, and hledger will look for a correspondingly-named
+to).  First,
-rules file for each.  Note if you use the '--rules-file' option, this
+
-one rules file will be used for all the CSV files being read.
+   * include - all includes are inlined, from top to bottom, depth
     first.  (At each include point the file is inlined and scanned for
     further includes, before proceeding.)
   Then "global" rules are evaluated, top to bottom.  If a rule is
 repeated, the last one wins:
   * skip (at top level)
   * date-format
   * newest-first
   * fields - names the CSV fields, optionally sets up initial
     assignments to hledger fields
   Then for each CSV record in turn:
   * test all 'if' blocks.  If any matched block contains an 'end' rule,
     skip all remaining CSV records.  Otherwise, if any matched block
     contains a 'skip' rule, skip that many CSV records.  If there are
     multiple matched skip rules, the first one wins.
   * collect all field assignments at top level and in matched if
     blocks.  When there are multiple assignments for a field, keep only
     the last one.
   * compute a value for each hledger field - either the one that was
     assigned to it (and interpolate the %CSVFIELDNAME references), or a
     default
   * generate a synthetic hledger transaction from these values, which
     becomes part of the input to the hledger command that has been
     selected
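The per-record steps above can be sketched as a small Python loop. Everything here is a simplified model with hypothetical names (`evaluate`, the rule tuples, the dict layout); it is not hledger's implementation. Two assumptions are worth calling out: a matched `skip N` is taken to cover the matched record plus the following N-1 records, and assignments inside matched if blocks are applied after all top-level assignments, which ignores their relative order in the rules file. Interpolation and defaults are left out.

```python
import re


def evaluate(records, top_level, if_blocks):
    """Return one dict of field assignments per CSV record that is kept.

    records:   list of records, each a list of strings
    top_level: list of (field, template) pairs, in file order
    if_blocks: list of {"patterns": [...], "rules": [...]} where each rule is
               ("end",), ("skip", n) or ("assign", field, template)
    """
    results = []
    to_skip = 0
    for record in records:
        if to_skip > 0:
            to_skip -= 1
            continue
        text = ",".join(record)
        matched = [
            block for block in if_blocks
            if any(re.search(p, text, re.IGNORECASE) for p in block["patterns"])
        ]
        rules = [rule for block in matched for rule in block["rules"]]
        if any(rule[0] == "end" for rule in rules):
            break  # skip all remaining records
        skips = [rule[1] for rule in rules if rule[0] == "skip"]
        if skips:
            to_skip = skips[0] - 1  # first matched skip wins (assumed to include this record)
            continue
        assignments = dict(top_level)  # later duplicates replace earlier ones
        for rule in rules:
            if rule[0] == "assign":
                assignments[rule[1]] = rule[2]
        results.append(assignments)
    return results


records = [
    ["2019-11-01", "groceries", "10"],
    ["2019-11-02", "rent", "500"],
    ["", "", ""],
    ["2019-11-03", "never reached", "1"],
]
top_level = [("comment", "A"), ("comment", "B"), ("account2", "expenses:unknown")]
if_blocks = [
    {"patterns": ["groceries"],
     "rules": [("assign", "account2", "expenses:groceries"), ("assign", "comment", "C")]},
    {"patterns": ["^,,$"], "rules": [("end",)]},
]
results = evaluate(records, top_level, if_blocks)
```

With this input, two records produce assignments: the first gets comment C and the groceries account, the second keeps comment B, and the empty third record triggers `end` so the fourth is never read.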
-File: hledger_csv.info,  Node: Valid CSV,  Prev: Reading multiple CSV files,  Up: CSV TIPS
+File: hledger_csv.info,  Node: Valid transactions,  Prev: How CSV rules are evaluated,  Up: TIPS
-2.6 Valid CSV
+3.9 Valid transactions
-=============
+======================
-hledger follows RFC 4180, with the addition of a customisable separator
+hledger currently does not post-process and validate transactions
-character.
+generated from CSV as thoroughly as transactions read from a journal
 file.  This means that if your rules are wrong, you can generate invalid
 transactions.  Or, amounts may not be displayed with a canonical display
 style.
-   Some things to note:
+   So when setting up or adjusting CSV rules, you should check your
 results visually with the print command.  You can pipe print's output
 through hledger once more to validate and canonicalise fully.  Eg:
-   When quoting fields,
+$ hledger -f some.csv print | hledger -f- print -I
-   * you must use double quotes, not single quotes
+   (The -I/--ignore-assertions flag disables balance assertion checks,
-   * spaces outside the quotes are not allowed.
+usually needed when re-parsing print output.)
 Tag Table:
 Node: Top72
-Node: CSV RULES2167
+Node: CSV RULES1428
-Ref: #csv-rules2275
+Ref: #csv-rules1536
-Node: skip2538
+Node: skip1849
-Ref: #skip2632
+Ref: #skip1942
-Node: date-format2857
+Node: fields2312
-Ref: #date-format2984
+Ref: #fields2434
-Node: field list3534
+Node: Transaction fields3239
-Ref: #field-list3671
+Ref: #transaction-fields3379
-Node: field assignment4401
+Node: Posting fields3547
-Ref: #field-assignment4556
+Ref: #posting-fields3679
-Node: conditional block5180
+Node: field assignment4729
-Ref: #conditional-block5334
+Ref: #field-assignment4882
-Node: include6230
+Node: date-format5693
-Ref: #include6360
+Ref: #date-format5828
-Node: newest-first6591
+Node: if6440
-Ref: #newest-first6705
+Ref: #if6544
-Node: CSV TIPS7116
+Node: end7915
-Ref: #csv-tips7210
+Ref: #end8017
-Node: CSV ordering7354
+Node: include8246
-Ref: #csv-ordering7472
+Ref: #include8366
-Node: CSV accounts7653
+Node: newest-first8804
-Ref: #csv-accounts7791
+Ref: #newest-first8922
-Node: CSV amounts8045
+Node: EXAMPLES9594
-Ref: #csv-amounts8203
+Ref: #examples9701
-Node: CSV balance assertions/assignments9283
+Node: TIPS10607
-Ref: #csv-balance-assertionsassignments9501
+Ref: #tips10688
-Node: Reading multiple CSV files9822
+Node: Reading multiple CSV files10931
-Ref: #reading-multiple-csv-files10022
+Ref: #reading-multiple-csv-files11098
-Node: Valid CSV10296
+Node: Deduplicating importing11358
-Ref: #valid-csv10419
+Ref: #deduplicating-importing11550
 Node: Other import methods11991
 Ref: #other-import-methods12158
 Node: Valid CSV12428
 Ref: #valid-csv12576
 Node: Other separator characters12778
 Ref: #other-separator-characters12955
 Node: Setting amounts13289
 Ref: #setting-amounts13459
 Node: Referencing other fields14702
 Ref: #referencing-other-fields14891
 Node: How CSV rules are evaluated15788
 Ref: #how-csv-rules-are-evaluated15986
 Node: Valid transactions17266
 Ref: #valid-transactions17413
 End Tag Table
--- a/hledger-lib/hledger_csv.txt
+++ b/hledger-lib/hledger_csv.txt
@ -16,8 +16,8 @@ DESCRIPTION
       o they describe the layout and format of the CSV data
-       o they can customize the generated journal entries using a simple  tem-
+       o they can customize the generated journal entries (transactions) using
-         plating language
+         a simple templating language
       o they  can add refinements based on patterns in the CSV data, eg cate-
         gorizing transactions with more detailed account names.
@ -36,63 +36,109 @@ DESCRIPTION
              date-format  %d/%m/%Y
              skip 1
-       A more complete example:
+       More examples in the EXAMPLES section below.
              # hledger CSV rules for amazon.com order history
              # sample:
              # "Date","Type","To/From","Name","Status","Amount","Fees","Transaction ID"
              # "Jul 29, 2012","Payment","To","Adapteva, Inc.","Completed","$25.00","$0.00","17LA58JSK6PRD4HDGLNJQPI1PB9N8DKPVHL"
              # skip one header line
              skip 1
              # name the csv fields (and assign the transaction's date, amount and code)
              fields date, _, toorfrom, name, amzstatus, amount, fees, code
              # how to parse the date
              date-format %b %-d, %Y
              # combine two fields to make the description
              description %toorfrom %name
              # save these fields as tags
              comment     status:%amzstatus, fees:%fees
              # set the base account for all transactions
              account1    assets:amazon
              # flip the sign on the amount
              amount      -%amount
       For more examples, see Convert CSV files.
 CSV RULES
-       The following seven kinds of rule can appear in the rules file, in  any
+       The following kinds of rule can appear in the rules file, in any  order
-       order.  Blank lines and lines beginning with # or ; are ignored.
+       (except  for  end  which  can  appear only inside a conditional block).
       Blank lines and lines beginning with # or ; are ignored.
   skip
-       skipN
+              skip N
-       Skip  this  many  non-empty lines preceding the CSV data.  (Empty/blank
+       The word "skip" followed by a number (or no number,  meaning  1)  tells
-       lines are skipped automatically.) You'll need this  whenever  your  CSV
+       hledger  to  ignore  this  many non-empty lines preceding the CSV data.
-       data contains header lines.  Eg:
+       (Empty/blank lines are skipped automatically.) You'll need  this  when-
       ever your CSV data contains header lines.
-              # ignore the first CSV line
+       It  also  has  a  second  purpose: it can be used to ignore certain CSV
-              skip 1
+       records, see conditional blocks below.
   fields
              fields FIELDNAME1, FIELDNAME2, ...
       A fields list ("fields" followed by one or more  comma-separated  field
       names)  is  the quick way to assign CSV field values to hledger fields.
       It (a) names the CSV fields, in order (names  may  not  contain  white-
       space;  fields  you  don't care about can be left unnamed), and (b) as-
       signs them to hledger fields if you use standard hledger  field  names.
       Here's an example:
              # use the 1st, 2nd and 4th CSV fields as the transaction's date, description and amount,
              # ignore the 3rd, 5th and 6th fields,
              # and name the 7th and 8th fields for later reference:
              #      1     2           3  4       5 6  7          8
              fields date, description, , amount1, , , somefield, anotherfield
       Here are the standard hledger field names:
   Transaction fields
       date, date2, status, code, description, comment can be used to form the
       transaction's first line.  Only date is required.  (See also  date-for-
       mat below.)
   Posting fields
       accountN, where N is 1 to 9, sets the Nth posting's account name.  Most
       often there are two postings, so you'll want to set  account1  and  ac-
       count2.
       A  number of field/pseudo-field names are available for setting posting
       amounts:
       o amountN sets posting N's amount
       o amountN-in and amountN-out can be used instead, if the CSV has  sepa-
         rate fields for debits and credits
       o currencyN  sets  a currency symbol to be left-prefixed to the amount,
         useful if the CSV provides that as a separate field
       o balanceN sets a (separate) balance assertion amount (or when no post-
         ing amount is set, a balance assignment)
       If  you write these with no number (amount, amount-in, amount-out, cur-
       rency, balance), it means posting 1.  Also, if you set  an  amount  for
       posting  1 only, a second posting that balances the transaction will be
       generated automatically.  This helps support CSV rules  created  before
       hledger 1.16.
       Finally,  commentN  sets a comment on the Nth posting.  Comments can of
       course contain tags.
   (field assignment)
              HLEDGERFIELDNAME FIELDVALUE
       Instead of or in addition to a fields list, you can assign a value to a
       hledger  field  by  writing  its name (any of the standard names above)
       followed by a text value.   The  value  may  contain  interpolated  CSV
       fields, referenced by their 1-based position in the CSV record (%N), or
       by the name they were given in the fields list (%CSVFIELDNAME).  Eg:
              # set the amount to the 4th CSV field, with " USD" appended
              amount %4 USD
              # combine three fields to make a comment, containing note: and date: tags
              comment note: %somefield - %anotherfield, date: %1
       Interpolation strips any outer whitespace, so a CSV value like  "  1  "
       becomes 1 when interpolated (#1051).  Note you can only interpolate CSV
       fields, not the hledger fields being assigned to; for more on this, see
       TIPS.
   date-format
-       date-formatDATEFMT
+              date-format DATEFMT
-       When  your  CSV date fields are not formatted like YYYY/MM/DD (or YYYY-
+       This  is  a  helper for the date (and date2) fields.  If your CSV dates
-       MM-DD or YYYY.MM.DD), you'll need to specify the format.  DATEFMT is  a
+       are not formatted like YYYY-MM-DD,  YYYY/MM/DD  or  YYYY.MM.DD,  you'll
-       strptime-like  date  parsing  pattern,  which must parse the date field
+       need to specify the format by writing "date-format" followed by a strp-
-       values completely.  Examples:
+       time-like date parsing pattern, which must parse the date field  values
       completely.  Examples:
              # for dates like "11/06/2013":
              date-format %m/%d/%Y
-              # for dates like "6/11/2013" (note the - to make leading zeros optional):
+              # for dates like "6/11/2013". The - allows leading zeros to be optional.
              date-format %-d/%-m/%Y
              # for dates like "2013-Nov-06":
@ -101,59 +147,41 @@ CSV RULES
              # for dates like "11/6/2013 11:32 PM":
              date-format %-m/%-d/%Y %l:%M %p
-   field list
+   if
-       fieldsFIELDNAME1, FIELDNAME2...
+              if PATTERN
               RULE
-       This (a) names the CSV fields, in order (names may not  contain  white-
+              if
-       space;  uninteresting names may be left blank), and (b) assigns them to
+              PATTERN
-       journal entry fields if you use any  of  these  standard  field  names:
+              PATTERN
-       date,  date2,  status,  code, description, comment, account1, account2,
+              PATTERN
-       amount, amount-in, amount-out, currency, balance,  balance1,  balance2.
+               RULE
-       Eg:
+               RULE
-              # use the 1st, 2nd and 4th CSV fields as the entry's date, description and amount,
+       Conditional  blocks  apply  one  or more rules to CSV records which are
-              # and give the 7th and 8th fields meaningful names for later reference:
+       matched by any of the PATTERNs.  This allows transactions  to  be  cus-
-              #
+       tomised or categorised based on patterns in the data.
              # CSV field:
              #      1     2            3 4       5 6 7          8
              # entry field:
              fields date, description, , amount, , , somefield, anotherfield
-   field assignment
+       A single pattern can be written on the same line as the "if"; or multi-
-       ENTRYFIELDNAME FIELDVALUE
+       ple patterns can be written on the following lines, non-indented.
-       This  sets  a  journal entry field (one of the standard names above) to
+       Patterns are case-insensitive regular expressions which  try  to  match
-       the given text value, which can include CSV field  values  interpolated
+       any  part  of  the  whole  CSV  record.  It's not yet possible to match
-       by name (%CSVFIELDNAME) or 1-based position (%N).  Eg:
+       within a specific field.  Note the CSV record they see is close but not
       identical to the one in the CSV file; eg double quotes are removed, and
       the separator character becomes comma.
-              # set the amount to the 4th CSV field with "USD " prepended
+       After the patterns, there should be one or more rules to apply, all in-
-              amount USD %4
+       dented  by at least one space.  Three kinds of rule are allowed in con-
       ditional blocks:
-              # combine three fields to make a comment (containing two tags)
+       o field assignments (to set a field's value)
              comment note: %somefield - %anotherfield, date: %1
-       Field  assignments  can  be  used  instead of or in addition to a field
+       o skip (to skip the matched CSV record)
       list.
-       Note, interpolation strips any outer whitespace, so a CSV value like  "
+       o end (to skip all remaining CSV records).
       1 " becomes 1 when interpolated (#1051).
-   conditional block
+       Examples:
       if PATTERN
           FIELDASSIGNMENTS...
       if
       PATTERN
       PATTERN...
           FIELDASSIGNMENTS...
       This  applies  one or more field assignments, only to those CSV records
       matched by one of the PATTERNs.  The patterns are case-insensitive reg-
       ular expressions which match anywhere within the whole CSV record (it's
       not yet possible to match within a specific  field).   When  there  are
       multiple  patterns  they  can be written on separate lines, unindented.
       The field assignments are on separate lines indented by  at  least  one
       space.  Examples:
              # if the CSV record contains "groceries", set account2 to "expenses:groceries"
              if groceries
@ -167,90 +195,250 @@ CSV RULES
               account2 expenses:business:banking
               comment  XXX deductible ? check it
   end
       As mentioned above, this rule can be  used  inside  conditional  blocks
       (only)  to  cause  hledger to stop reading CSV records and proceed with
       command execution.  Eg:
              # ignore everything following the first empty record
              if ,,,,
               end
   include
-       includeRULESFILE
+              include RULESFILE
-       Include another rules file at this point.  RULESFILE is either an abso-
+       Include another CSV rules file at this point, as if it were written in-
-       lute file path or a path relative to the current file's directory.  Eg:
+       line.   RULESFILE  is  an  absolute file path or a path relative to the
       current file's directory.
-              # rules reused with several CSV files
+       This can be useful eg for reusing common rules in several rules files:
-              include common.rules
+
              # someaccount.csv.rules
              ## someaccount-specific rules
              fields date,description,amount
              account1 some:account
              account2 some:misc
              ## common rules
              include categorisation.rules
   newest-first
-       newest-first
+       hledger always sorts the generated transactions by date.   Transactions
       on  the same date should appear in the same order as their CSV records,
       as hledger can usually auto-detect whether the CSV's  normal  order  is
       oldest first or newest first.  But if all of the following are true:
-       Consider adding this rule if all of the following are true:  you  might
+       o the  CSV  might  sometimes  contain just one day of data (all records
-       be  processing  just  one  day of data, your CSV records are in reverse
+         having the same date)
       chronological order (newest first), and you care about  preserving  the
       order  of  same-day  transactions.   It  usually  isn't needed, because
       hledger autodetects the CSV order, but when all CSV  records  have  the
       same date it will assume they are oldest first.
-CSV TIPS
+       o the CSV records are normally in reverse chronological  order  (newest
-   CSV ordering
+         first)
       The  generated  journal  entries  will be sorted by date.  The order of
       same-day entries will be preserved (except in the  special  case  where
       you might need newest-first, see above).
-   CSV accounts
+       o and you care about preserving the order of same-day transactions
       Each journal entry will have two postings, to account1 and account2 re-
       spectively.  It's not yet possible to generate entries with  more  than
       two  postings.   It's  conventional and recommended to use account1 for
       the account whose CSV we are reading.
-   CSV amounts
+       you should add the newest-first rule as a hint.  Eg:
       A transaction amount must be set, in one of these ways:
-       o with an amount field  assignment,  which  sets  the  first  posting's
+              # tell hledger explicitly that the CSV is normally newest-first
-         amount
+              newest-first
-       o (When the CSV has debit and credit amounts in separate fields:)
+EXAMPLES
-       with  field  assignments for the amount-in and amount-out pseudo fields
+       A more complete example, generating three-posting transactions:
       (both of them).  Whichever one has a value will be used, with appropri-
       ate sign.  If both contain a value, it might not work so well.
-       o or implicitly by means of a balance assignment (see below).
+              # hledger CSV rules for amazon.com order history
              # sample:
              # "Date","Type","To/From","Name","Status","Amount","Fees","Transaction ID"
              # "Jul 29, 2012","Payment","To","Adapteva, Inc.","Completed","$25.00","$0.00","17LA58JSK6PRD4HDGLNJQPI1PB9N8DKPVHL"
              # skip one header line
              skip 1
              # name the csv fields (and assign the transaction's date, amount and code)
              fields date, _, toorfrom, name, amzstatus, amount1, fees, code
              # how to parse the date
              date-format %b %-d, %Y
              # combine two fields to make the description
              description %toorfrom %name
              # save this field as a tag
              comment     status:%amzstatus
              # set the base account for all transactions
              account1    assets:amazon
              # flip the sign on the amount
              amount1     -%amount1
              # Put fees in a separate posting
              amount3     %fees
              comment3    fees
       For more examples, see Convert CSV files.
 TIPS
   Reading multiple CSV files
       You  can read multiple CSV files at once using multiple -f arguments on
       the command line.  hledger will look for a correspondingly-named  rules
       file for each CSV file.  If you use the --rules-file option, that rules
       file will be used for all the CSV files.
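       As a sketch, assuming two hypothetical files checking.csv and  sav-
       ings.csv, each with its own rules file (checking.csv.rules and sav-
       ings.csv.rules), and a hypothetical shared file common.rules:
              $ hledger -f checking.csv -f savings.csv print
              $ hledger -f checking.csv -f savings.csv --rules-file common.rules print
       The first command uses each file's own rules; the second applies com-
       mon.rules to both.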
   Deduplicating, importing
       When you download a CSV file repeatedly, eg to  get  your  latest  bank
       transactions,  the new file may contain some of the same records as the
       old one.  The print --new command is one simple way to detect just  the
       new  transactions.   Or  better still, the import command appends those
       new transactions to your main journal.  This is the easiest way to  im-
       port CSV data.  Eg, after downloading your latest CSV files:
              $ hledger import *.csv [--dry]
   Other import methods
       A  number of other tools and workflows, hledger-specific and otherwise,
       exist for converting, deduplicating, classifying and managing CSV data.
       See:
       o https://hledger.org -> sidebar -> real world setups
       o https://plaintextaccounting.org -> data import/conversion
   Valid CSV
       hledger  accepts  CSV conforming to RFC 4180.  Some things to note when
       values are enclosed in quotes:
       o you must use double quotes (not single quotes)
       o spaces outside the quotes are not allowed
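       To illustrate with made-up data: of the three records below, only  the
       first  follows  these  notes.   The  second uses single quotes and the
       third has spaces outside the quotes, so they may be rejected  or  mis-
       read:
              "2019-11-01","Coffee shop","-3.50"
              '2019-11-01','Coffee shop','-3.50'
              "2019-11-01", "Coffee shop", "-3.50"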
   Other separator characters
       With the --separator 'CHAR' option, hledger will expect  the  separator
       to  be CHAR instead of a comma.  Ie it will read other "Character Sepa-
       rated Values" formats, such as TSV (Tab Separated  Values).   Note:  on
       the command line, use a real tab character in quotes, not \t.  Eg:
              $ hledger -f foo.tsv --separator '  ' print
       (Experimental.)
   Setting amounts
       A posting amount can be set in one of these ways:
       o by  assigning  (with  a  fields  list  or field assignment) to amountN
         (posting N's amount) or amount (posting 1's amount)
       o by assigning to amountN-in and amountN-out (or amount-in and  amount-
         out).   For  each CSV record, whichever of these has a non-zero value
         will be used, with appropriate sign.   If  both  contain  a  non-zero
         value, this may not work.
       o by  assigning  to balanceN (or balance) instead of the above, setting
         the amount indirectly via a balance assignment.
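       As  a  minimal  sketch of the second way, assuming a hypothetical bank
       CSV whose columns are date, description, withdrawal and deposit:
              # name the withdrawal column amount-out, the deposit column amount-in
              fields date, description, amount-out, amount-in
       For each record, whichever of the two columns has  a  non-zero  value
       would then supply posting 1's amount.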
       There is some special handling for sign in amounts:
-       o If  an amount value is parenthesised, it will be de-parenthesised and
+       o If an amount value is parenthesised, it will be de-parenthesised  and
         sign-flipped.
-       o If an amount value begins with a double minus sign, those will cancel
+       o If  an amount value begins with a double minus sign, those cancel out
-         out and be removed.
+         and are removed.
-       If  the  currency/commodity symbol is provided as a separate CSV field,
+       If the currency/commodity symbol is provided as a separate  CSV  field,
-       assign it to the currency pseudo field; the symbol will be prepended to
+       you  can assign it to currency (affects all posting amounts) or curren-
-       the  amount (TODO: when there is an amount).  Or, you can use an amount
+       cyN (affects just posting N's amount).  The symbol will be prepended to
-       field assignment for more control, eg:
+       the  amount.  Or for more control, you can set both currency symbol and
       amount with a field assignment, eg:
              fields date,description,currency,amount
              # add currency symbol on the right:
              amount %amount %currency
-   CSV balance assertions/assignments
+   Referencing other fields
-       If the CSV includes a running balance, you can assign that  to  one  of
+       In field assignments, you can interpolate only CSV fields, not  hledger
-       the  pseudo fields balance (or balance1) or balance2.  This will gener-
+       fields.   In  the example below, there's both a CSV field and a hledger
-       ate a balance assertion (or if the amount is left empty, a balance  as-
+       field named amount1, but %amount1 always means the CSV field,  not  the
-       signment), on the first or second posting, whenever the running balance
+       hledger field:
       field is non-empty.  (TODO: #1000)
-   Reading multiple CSV files
+              # Name the third CSV field "amount1"
-       You can read multiple CSV files at once using multiple -f arguments  on
+              fields date,description,amount1
       the  command  line,  and  hledger will look for a correspondingly-named
       rules file for each.  Note if you use the --rules-file option, this one
       rules file will be used for all the CSV files being read.
-   Valid CSV
+              # Set hledger's amount1 to the CSV amount1 field followed by USD
-       hledger follows RFC 4180, with the addition of a customisable separator
+              amount1 %amount1 USD
       character.
-       Some things to note:
+              # Set comment to the CSV amount1 (not the amount1 assigned above)
              comment %amount1
-       When quoting fields,
+       Here,  since there's no CSV amount1 field, %amount1 will produce a lit-
       eral "amount1":
-       o you must use double quotes, not single quotes
+              fields date,description,csvamount
              amount1 %csvamount USD
              # Can't interpolate amount1 here
              comment %amount1
-       o spaces outside the quotes are not allowed.
+       When there are multiple field assignments to the  same  hledger  field,
       only the last one takes effect.  Here, comment's value will be B, or
       C if "something" is matched, but never A:
              comment A
              comment B
              if something
               comment C
   How CSV rules are evaluated
       Here's how to think of CSV rules being evaluated (if  you  really  need
       to).  First,
       o include  - all includes are inlined, from top to bottom, depth first.
         (At each include point the file is inlined and  scanned  for  further
         includes, before proceeding.)
       Then  "global"  rules  are  evaluated, top to bottom.  If a rule is re-
       peated, the last one wins:
       o skip (at top level)
       o date-format
       o newest-first
       o fields - names the CSV fields, optionally sets up initial assignments
         to hledger fields
       Then for each CSV record in turn:
       o test  all if blocks.  If any of them contain an end rule, skip all re-
         maining CSV records.  Otherwise if any of them contain a  skip  rule,
         skip  that  many  CSV  records.   If  there are multiple matched skip
         rules, the first one wins.
       o collect all field assignments at top level and in matched if  blocks.
         When  there  are multiple assignments for a field, keep only the last
         one.
       o compute a value for each hledger field - either the one that was  as-
         signed to it (and interpolate the %CSVFIELDNAME references), or a de-
         fault
       o generate a synthetic hledger transaction from these values, which be-
         comes part of the input to the hledger command that has been selected
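       The steps above can be traced on a small rules file.  This is only  an
       illustrative  sketch;  the  field  layout,  patterns and account names
       are made up:
              # global rules, evaluated once, top to bottom
              skip 1
              date-format %Y-%m-%d
              fields date, description, amount
              # top-level assignment, collected for every record
              account2 expenses:misc
              # collected only for matching records; being the last
              # assignment to account2, it then overrides the one above
              if coffee
               account2 expenses:coffee
              # at the first matching record, skip all remaining records
              if pending
               end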
   Valid transactions
       hledger  currently does not post-process and validate transactions gen-
       erated from CSV as thoroughly as transactions read from a journal file.
       This  means  that  if  your  rules  are wrong, you can generate invalid
       transactions.  Or, amounts may not be displayed with a  canonical  dis-
       play style.
       So  when  setting  up or adjusting CSV rules, you should check your re-
       sults visually with the print command.  You  can  pipe  print's  output
       through hledger once more to validate and canonicalise fully.  Eg:
              $ hledger -f some.csv print | hledger -f- print -I
       (The  -I/--ignore-assertions  flag  disables  balance assertion checks,
       usually needed when re-parsing print output.)