docs: CSV rules version 2 syntax
parent 57eebd9ae5
commit 9c6ee3ae70

226 MANUAL.md

@@ -492,161 +492,123 @@ to each account name.
 
 ### CSV files
 
-Since version 0.18, hledger can also read
-[CSV](http://en.wikipedia.org/wiki/Comma-separated_values) files natively
-(previous versions provided a special `convert` command.)
+hledger can also read
+[CSV](http://en.wikipedia.org/wiki/Comma-separated_values) files,
+translating the CSV records into journal entries on the fly. In this
+case, we must provide an additional "rules file", which is a file
+named like the CSV file with an extra `.rules` suffix, containing
+rules specifying things like:
 
-An arbitrary CSV file does not provide enough information to be parsed as
-a journal. So when reading CSV, hledger looks for an additional
-[rules file](#the-rules-file), which identifies the CSV fields and assigns
-accounts. For reading `FILE.csv`, hledger uses `FILE.csv.rules` in the same
-directory, auto-creating it if needed. You should configure the rules file
-to get the best data from your CSV file. You can specify a different rules
-file with `--rules-file` (useful when reading from standard input).
+- which CSV fields correspond to which journal entry fields
+- which date format is being used
+- which account name(s) to use
 
-An example - sample.csv:
+Typically you'll keep one rules file for each account which you
+download as CSV. A default rules file will be created if it doesn't
+exist, in which case you'll need to refine it to get the best results.
+You can override the default rules file name with `--rules-file`.
+
+Here's a quick example.  Say we have downloaded `checking.csv` from a
+bank for the first time:
 
-    sample.csv:
     "Date","Note","Amount"
-    "2012/3/22","TRANSFER TO SAVINGS","-10.00"
-    "2012/3/23","SOMETHING ELSE","5.50"
+    "2012/3/22","DEPOSIT","50.00"
+    "2012/3/23","TRANSFER TO SAVINGS","-10.00"
 
-sample.rules:
+We could create `checking.csv.rules` containing:
 
-    skip-lines 1
-    date-field 0
-    description-field 1
-    amount-field 2
+    account1 assets:bank:checking
+    skip     1
+    fields   date, description, amount
     currency $
-    base-account assets:bank:checking
 
-    SAVINGS
-    assets:bank:savings
+    if ~ SAVINGS
+     account2 assets:bank:savings
 
-the resulting journal:
+This says:
+"always use assets:bank:checking as the first account;
+ignore the first line;
+use the first, second and third CSV fields as the entry date, description and amount respectively;
+always prepend $ to the amount value;
+if the CSV record contains 'SAVINGS', use assets:bank:savings as the second account".
+Now hledger can read this CSV file:
 
-    $ hledger -f sample.csv print
-    using conversion rules file sample.rules
-    2012/03/22 TRANSFER TO SAVINGS
+    $ hledger -f checking.csv print
+    using conversion rules file checking.csv.rules
+    2012/03/22 DEPOSIT
+        income:unknown             $-50.00
+        assets:bank:checking        $50.00
+
+    2012/03/23 TRANSFER TO SAVINGS
         assets:bank:savings         $10.00
         assets:bank:checking       $-10.00
 
-    2012/03/23 SOMETHING ELSE
-        income:unknown              $-5.50
-        assets:bank:checking         $5.50
+We might save this output as `checking.journal`, and/or merge it (manually) into the main journal file.
 
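To make the `checking.csv` walkthrough concrete, here is a minimal Python sketch of what those five rules do to the two sample records. It is not hledger's implementation. The function names (`convert`, `render`, `normalize_date`) are hypothetical, and the `income:unknown` / `expenses:unknown` fallback for the second account is an assumption inferred from the sample output, since the rules text does not state it.

```python
import csv
import io
import re
from decimal import Decimal

# The sample checking.csv from the manual text.
CHECKING_CSV = '''"Date","Note","Amount"
"2012/3/22","DEPOSIT","50.00"
"2012/3/23","TRANSFER TO SAVINGS","-10.00"
'''


def normalize_date(text):
    """Turn a date like 2012/3/22 into 2012/03/22."""
    year, month, day = (int(part) for part in text.split("/"))
    return f"{year:04d}/{month:02d}/{day:02d}"


def convert(csv_text):
    """Apply the example rules to CSV text and return journal entries as dicts."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    entries = []
    for record in rows[1:]:                        # skip 1
        date, description, amount_text = record    # fields date, description, amount
        amount = Decimal(amount_text)
        account1 = "assets:bank:checking"          # account1 assets:bank:checking
        # Assumed default for the second account (inferred from the sample output).
        account2 = "income:unknown" if amount > 0 else "expenses:unknown"
        # if ~ SAVINGS: match anywhere in the whole record, ignoring case.
        if re.search("SAVINGS", ",".join(record), re.IGNORECASE):
            account2 = "assets:bank:savings"
        entries.append({
            "date": normalize_date(date),
            "description": description,
            "postings": [(account2, -amount), (account1, amount)],
        })
    return entries


def render(entry):
    """Format one entry roughly like the journal output shown in the manual."""
    lines = [f"{entry['date']} {entry['description']}"]
    for account, amount in entry["postings"]:
        lines.append(f"    {account:<24} ${amount}")  # currency $
    return "\n".join(lines)


if __name__ == "__main__":
    print("\n\n".join(render(e) for e in convert(CHECKING_CSV)))
```

The sketch keeps amounts as `Decimal` so that negating `50.00` preserves the two decimal places; column alignment in `render` is approximate and not meant to match hledger's exact layout.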
-### The rules file
+#### Rules syntax
 
-A rules file consists of the following optional directives, followed by
-account-assigning rules.  (Tip: rules file parse errors are not the
-greatest, so check your rules file format if you're getting unexpected
-results.)
+The rules file is simple. Lines beginning with `#` or `;` and blank lines are ignored.
+The only requirement is that we specify how to fill journal entries' date and amount fields (at least),
+using a *field list*, or individual *field assignments*, or both:
 
-`account-field`
-
-> If the CSV file contains data corresponding to several accounts (for
-> example - bulk export from other accounting software), the specified
-> field's value, if non-empty, will override the value of `base-account`.
-
-`account2-field`
-
-> If the CSV file contains fields for both accounts in the transaction,
-> you can use this in addition to `account-field`.  If `account2-field` is
-> unspecified, the [account-assigning rules](#account-assigning-rules) are
-> used.
-
-`amount-field`
-
-> This directive specifies the CSV field containing the transaction
-> amount.  The field may contain a simple number or an hledger-style
-> [amount](#amounts), perhaps with a [price](#prices). See also
-> `amount-in-field`, `amount-out-field`, `currency-field` and
-> `base-currency`.
-
-`amount-in-field`
-
-`amount-out-field`
-
-> If the CSV file uses two different columns for in and out movements, use
-> these directives instead of `amount-field`.  Note these expect each
-> record to have a positive number in one of these fields and nothing in
-> the other.
-
-`base-account`
-
-> A default account to use in all transactions. May be overridden by
-> `account1-field` and `account2-field`.
-
-`base-currency`
-
-> A default currency symbol which will be prepended to all amounts.
-> See also `currency-field`.
-
-`code-field`
-
-> Which field contains the transaction code or check number (`(NNN)`).
-
-`currency-field`
-
-> The currency symbol in this field will be prepended to all amounts. This
-> overrides `base-currency`.
-
-`date-field`
-
-> Which field contains the transaction date. A number of common
-> four-digit-year date formats are understood by default; other formats
-> will require a `date-format` directive.
-
-`date-format`
-
-> This directive specifies one additional format to try when parsing the
-> date field, using the syntax of Haskell's
-> [formatTime](http://hackage.haskell.org/packages/archive/time/latest/doc/html/Data-Time-Format.html#v:formatTime).
-> Eg, if the CSV dates are non-padded D/M/YY, use:
+> **fields** *CSVFIELDNAME1*, *CSVFIELDNAME1*, ...
+> :   This (a field list) names the CSV fields (names may not contain whitespace or `;` or `#`),
+>     and also assigns them to journal entry fields when you use any of these names:
 >
->     date-format %-d/%-m/%y
+> :   `date`
+> :   `date2`
+> :   `status`
+> :   `code`
+> :   `description`
+> :   `comment`
+> :   `account1`
+> :   `account2`
+> :   `currency`
+> :   `amount`
+> :   `amount-in`
+> :   `amount-out`
+> :   
 >
-> Note custom date formats work best when hledger is built with version
-> 1.2.0.5 or greater of the [time](http://hackage.haskell.org/package/time) library.
-
-`description-field`
-
-> Which field contains the transaction's description. This can be a simple
-> field number, or a custom format combining multiple fields, eg:
+> <!--  -->
 >
->     description-field %(1) - %(3)
+> *JOURNALFIELDNAME* *FIELDVALUE*
+> :   This (a field assignment) assigns the given text value,
+>     which can have CSV field values interpolated via `%name` or `%1`,
+>     to a journal entry field (one of the field names above).
+>     Field assignments may be used in addition to or instead of a field list.
+>
+> :    
 
-`date2-field`
+We can also have conditional field assignments which apply only to certain CSV records:
 
-> Which field contains the transaction's [secondary date](#primary-secondary-dates).
+> **if** *PATTERNS*<br>  *FIELDASSIGNMENTS*
+> :   PATTERNS is one or more regular expressions on the same or following lines.
+>     <!-- then an optional `~` (indicating case-insensitive infix regular expression matching),\ -->
+>     These are followed by one or more indented field assignment lines.\
+>     In this example, any CSV record containing "groc" (case insensitive, anywhere within the whole record)
+>     will have its account2 and comment set as shown:
+> 
+>         if groc
+>          account2 expenses:groceries
+>          comment  household stuff
 
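The conditional `if` block described above can be pictured as a small matching step. The following Python sketch is illustrative only. The function name `apply_if_block` and the dict representation of an entry are hypothetical, and treating multiple patterns as "any one may match" is an assumption carried over from the older account-assigning rules wording, since the new text does not say how several patterns combine.

```python
import re


def apply_if_block(record, patterns, assignments, entry):
    """Return a copy of entry with assignments applied if the record matches.

    record      -- list of CSV field values for one line
    patterns    -- regular expressions from the `if` line(s)
    assignments -- journal field name -> value, from the indented lines
    entry       -- journal entry fields decided so far
    """
    # Match against the whole record, case-insensitively, anywhere in the text.
    text = ",".join(record)
    # Assumption: one matching pattern is enough to trigger the block.
    if any(re.search(pattern, text, re.IGNORECASE) for pattern in patterns):
        return {**entry, **assignments}
    return dict(entry)


# The `if groc` example: a made-up record whose description contains "GROCERY".
record = ["2012/3/24", "WHOLE FOODS GROCERY", "-23.10"]
entry = {"account1": "assets:bank:checking", "account2": "expenses:unknown"}
updated = apply_if_block(
    record,
    patterns=["groc"],
    assignments={"account2": "expenses:groceries", "comment": "household stuff"},
    entry=entry,
)
```

Returning a new dict rather than mutating `entry` keeps the sketch easy to test; a record that matches no pattern comes back unchanged.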
-`status-field`
+And we may sometimes need these as well:
 
-> Which field contains the transaction cleared status (`*`).
-
-`skip-lines`
-
-> How many lines to skip in the beginning of the file, e.g. to skip a
-> line of column headings.
-
-Account-assigning rules select an account to transfer to based on the
-description field (unless `account2-field` is used.) Each
-account-assigning rule is a paragraph consisting of one or more
-case-insensitive regular expressions), one per line, followed by the
-account name to use when the transaction's description matches any of
-these patterns. Eg:
-
-    WHOLE FOODS
-    SUPERMARKET
-    expenses:food:groceries
-
-If you want to clean up messy bank data, you can add `=` and a replacement
-pattern, which rewrites the matched part of the description. (To rewrite
-the entire description, use `.*PAT.*=REPL`). You can also refer to matched
-groups in the usual way with `\0` etc. Eg:
-
-    BLKBSTR=BLOCKBUSTER
-    expenses:entertainment
+> **skip** [*N*]
+> :   Skip this number of CSV lines (1 by default).
+>     Use this to skip the initial CSV header line(s).
+>     <!-- hledger tries to skip initial CSV header lines automatically. -->
+>     <!-- If it guesses wrong, use this directive to skip exactly N lines. -->
+>     <!-- This can also be used in a conditional block to ignore certain CSV records. -->
+>
+> **date-format** *DATEFMT*
+> :   This is required if the values for `date` or `date2` fields are not in YYYY/MM/DD format (or close to it).
+>     DATEFMT specifies a strptime-style date parsing pattern containing [year/month/date format codes](http://hackage.haskell.org/packages/archive/time/latest/doc/html/Data-Time-Format.html#v:formatTime).
+>     Some common values:
+>
+>         %-d/%-m/%Y
+>         %-m/%-d/%Y
+>         %Y-%h-%d
 
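Why the `date-format` directive matters is easiest to see with an ambiguous date. The Python sketch below parses the same CSV value day-first and month-first and gets two different journal dates. It is only an analogy: hledger's DATEFMT uses the Haskell time library's codes, whereas Python's `strptime` has no `-` flag and already accepts unpadded numbers with plain `%d` and `%m`. The helper name `parse_csv_date` is hypothetical.

```python
from datetime import datetime


def parse_csv_date(text, fmt):
    """Parse a CSV date with the given strptime format; return it as YYYY/MM/DD."""
    return datetime.strptime(text, fmt).strftime("%Y/%m/%d")


# The same text read two ways, roughly analogous to
# `date-format %-d/%-m/%Y` versus `date-format %-m/%-d/%Y`.
day_first = parse_csv_date("3/4/2012", "%d/%m/%Y")    # 3 April 2012
month_first = parse_csv_date("3/4/2012", "%m/%d/%Y")  # 4 March 2012
```

Neither reading is an error from the parser's point of view, which is why the format has to be stated in the rules file rather than guessed.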
 ### Timelog files
 