docs: CSV rules version 2 syntax
parent
57eebd9ae5
commit
9c6ee3ae70
226
MANUAL.md
@@ -492,161 +492,123 @@ to each account name.
|
|||||||
|
|
||||||
### CSV files
|
### CSV files
|
||||||
|
|
||||||
Since version 0.18, hledger can also read
|
hledger can also read
|
||||||
[CSV](http://en.wikipedia.org/wiki/Comma-separated_values) files natively
|
[CSV](http://en.wikipedia.org/wiki/Comma-separated_values) files,
|
||||||
(previous versions provided a special `convert` command.)
|
translating the CSV records into journal entries on the fly. In this
|
||||||
|
case, we must provide an additional "rules file", which is a file
|
||||||
|
named like the CSV file with an extra `.rules` suffix, containing
|
||||||
|
rules specifying things like:
|
||||||
|
|
||||||
An arbitrary CSV file does not provide enough information to be parsed as
|
- which CSV fields correspond to which journal entry fields
|
||||||
a journal. So when reading CSV, hledger looks for an additional
|
- which date format is being used
|
||||||
[rules file](#the-rules-file), which identifies the CSV fields and assigns
|
- which account name(s) to use
|
||||||
accounts. For reading `FILE.csv`, hledger uses `FILE.csv.rules` in the same
|
|
||||||
directory, auto-creating it if needed. You should configure the rules file
|
|
||||||
to get the best data from your CSV file. You can specify a different rules
|
|
||||||
file with `--rules-file` (useful when reading from standard input).
|
|
||||||
|
|
||||||
An example - sample.csv:
|
Typically you'll keep one rules file for each account which you
|
||||||
|
download as CSV. A default rules file will be created if it doesn't
|
||||||
|
exist, in which case you'll need to refine it to get the best results.
|
||||||
|
You can override the default rules file name with `--rules-file`.
|
||||||
|
|
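The naming convention described above (the CSV file's name plus a `.rules` suffix, unless another file is given with `--rules-file`) can be sketched as follows. This is only an illustration in Python, not hledger's own code; the function names `default_rules_path` and `rules_path` are hypothetical.

```python
from pathlib import Path


def default_rules_path(csv_path):
    """Return the CSV path with an extra `.rules` suffix appended.

    Illustrative helper (hypothetical name), e.g. checking.csv -> checking.csv.rules.
    """
    p = Path(csv_path)
    return p.with_name(p.name + ".rules")


def rules_path(csv_path, override=None):
    """Use the override (as given to --rules-file) if present, else the default."""
    return Path(override) if override else default_rules_path(csv_path)


if __name__ == "__main__":
    print(rules_path("checking.csv"))
    print(rules_path("checking.csv", override="bank.rules"))
```

Note that the suffix is appended to the full file name, so the `.csv` part is kept.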
||||||
|
Here's a quick example. Say we have downloaded `checking.csv` from a
|
||||||
|
bank for the first time:
|
||||||
|
|
||||||
sample.csv:
|
|
||||||
"Date","Note","Amount"
|
"Date","Note","Amount"
|
||||||
"2012/3/22","TRANSFER TO SAVINGS","-10.00"
|
"2012/3/22","DEPOSIT","50.00"
|
||||||
"2012/3/23","SOMETHING ELSE","5.50"
|
"2012/3/23","TRANSFER TO SAVINGS","-10.00"
|
||||||
|
|
||||||
sample.rules:
|
We could create `checking.csv.rules` containing:
|
||||||
|
|
||||||
skip-lines 1
|
account1 assets:bank:checking
|
||||||
date-field 0
|
skip 1
|
||||||
description-field 1
|
fields date, description, amount
|
||||||
amount-field 2
|
|
||||||
currency $
|
currency $
|
||||||
base-account assets:bank:checking
|
|
||||||
|
|
||||||
SAVINGS
|
if ~ SAVINGS
|
||||||
assets:bank:savings
|
account2 assets:bank:savings
|
||||||
|
|
||||||
the resulting journal:
|
This says:
|
||||||
|
"always use assets:bank:checking as the first account;
|
||||||
|
ignore the first line;
|
||||||
|
use the first, second and third CSV fields as the entry date, description and amount respectively;
|
||||||
|
always prepend $ to the amount value;
|
||||||
|
if the CSV record contains 'SAVINGS', use assets:bank:savings as the second account".
|
||||||
|
Now hledger can read this CSV file:
|
||||||
|
|
||||||
$ hledger -f sample.csv print
|
$ hledger -f checking.csv print
|
||||||
using conversion rules file sample.rules
|
using conversion rules file checking.csv.rules
|
||||||
2012/03/22 TRANSFER TO SAVINGS
|
2012/03/22 DEPOSIT
|
||||||
|
income:unknown $-50.00
|
||||||
|
assets:bank:checking $50.00
|
||||||
|
|
||||||
|
2012/03/23 TRANSFER TO SAVINGS
|
||||||
assets:bank:savings $10.00
|
assets:bank:savings $10.00
|
||||||
assets:bank:checking $-10.00
|
assets:bank:checking $-10.00
|
||||||
|
|
||||||
2012/03/23 SOMETHING ELSE
|
We might save this output as `checking.journal`, and/or merge it (manually) into the main journal file.
|
||||||
income:unknown $-5.50
|
|
||||||
assets:bank:checking $5.50
|
|
||||||
|
|
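To make the effect of the `checking.csv.rules` example concrete, here is a rough Python sketch that applies the same five rules by hand to the sample CSV. It is not how hledger is implemented. The function name `convert` is hypothetical, and the fallback account `income:unknown` is simply copied from the printed output for the deposit; what hledger chooses in other cases is not covered here.

```python
import csv
import io
import re
from datetime import datetime
from decimal import Decimal

CSV_TEXT = '''"Date","Note","Amount"
"2012/3/22","DEPOSIT","50.00"
"2012/3/23","TRANSFER TO SAVINGS","-10.00"
'''


def convert(csv_text):
    """Hand-apply the example rules; returns (date, description, postings) tuples."""
    rows = list(csv.reader(io.StringIO(csv_text)))[1:]  # skip 1
    entries = []
    for row in rows:
        date, description, amount = row  # fields date, description, amount
        account1 = "assets:bank:checking"  # account1 assets:bank:checking
        account2 = "income:unknown"  # assumption: taken from the sample output
        # if ~ SAVINGS: case-insensitive match anywhere in the record
        if re.search("SAVINGS", ",".join(row), re.IGNORECASE):
            account2 = "assets:bank:savings"
        day = datetime.strptime(date, "%Y/%m/%d").strftime("%Y/%m/%d")
        value = Decimal(amount)
        # currency $: prepend the symbol; the second posting balances the first
        postings = [(account2, f"${-value}"), (account1, f"${value}")]
        entries.append((day, description, postings))
    return entries


if __name__ == "__main__":
    for day, description, postings in convert(CSV_TEXT):
        print(day, description)
        for account, amount in postings:
            print(f"    {account}  {amount}")
        print()
```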
||||||
### The rules file
|
#### Rules syntax
|
||||||
|
|
||||||
A rules file consists of the following optional directives, followed by
|
The rules file is simple. Lines beginning with `#` or `;` and blank lines are ignored.
|
||||||
account-assigning rules. (Tip: rules file parse errors are not the
|
The only requirement is that we specify how to fill journal entries' date and amount fields (at least),
|
||||||
greatest, so check your rules file format if you're getting unexpected
|
using a *field list*, or individual *field assignments*, or both:
|
||||||
results.)
|
|
||||||
|
|
||||||
`account-field`
|
> **fields** *CSVFIELDNAME1*, *CSVFIELDNAME2*, ...
|
||||||
|
> : This (a field list) names the CSV fields (names may not contain whitespace or `;` or `#`),
|
||||||
> If the CSV file contains data corresponding to several accounts (for
|
> and also assigns them to journal entry fields when you use any of these names:
|
||||||
> example - bulk export from other accounting software), the specified
|
|
||||||
> field's value, if non-empty, will override the value of `base-account`.
|
|
||||||
|
|
||||||
`account2-field`
|
|
||||||
|
|
||||||
> If the CSV file contains fields for both accounts in the transaction,
|
|
||||||
> you can use this in addition to `account-field`. If `account2-field` is
|
|
||||||
> unspecified, the [account-assigning rules](#account-assigning-rules) are
|
|
||||||
> used.
|
|
||||||
|
|
||||||
`amount-field`
|
|
||||||
|
|
||||||
> This directive specifies the CSV field containing the transaction
|
|
||||||
> amount. The field may contain a simple number or an hledger-style
|
|
||||||
> [amount](#amounts), perhaps with a [price](#prices). See also
|
|
||||||
> `amount-in-field`, `amount-out-field`, `currency-field` and
|
|
||||||
> `base-currency`.
|
|
||||||
|
|
||||||
`amount-in-field`
|
|
||||||
|
|
||||||
`amount-out-field`
|
|
||||||
|
|
||||||
> If the CSV file uses two different columns for in and out movements, use
|
|
||||||
> these directives instead of `amount-field`. Note these expect each
|
|
||||||
> record to have a positive number in one of these fields and nothing in
|
|
||||||
> the other.
|
|
||||||
|
|
||||||
`base-account`
|
|
||||||
|
|
||||||
> A default account to use in all transactions. May be overridden by
|
|
||||||
> `account-field` and `account2-field`.
|
|
||||||
|
|
||||||
`base-currency`
|
|
||||||
|
|
||||||
> A default currency symbol which will be prepended to all amounts.
|
|
||||||
> See also `currency-field`.
|
|
||||||
|
|
||||||
`code-field`
|
|
||||||
|
|
||||||
> Which field contains the transaction code or check number (`(NNN)`).
|
|
||||||
|
|
||||||
`currency-field`
|
|
||||||
|
|
||||||
> The currency symbol in this field will be prepended to all amounts. This
|
|
||||||
> overrides `base-currency`.
|
|
||||||
|
|
||||||
`date-field`
|
|
||||||
|
|
||||||
> Which field contains the transaction date. A number of common
|
|
||||||
> four-digit-year date formats are understood by default; other formats
|
|
||||||
> will require a `date-format` directive.
|
|
||||||
|
|
||||||
`date-format`
|
|
||||||
|
|
||||||
> This directive specifies one additional format to try when parsing the
|
|
||||||
> date field, using the syntax of Haskell's
|
|
||||||
> [formatTime](http://hackage.haskell.org/packages/archive/time/latest/doc/html/Data-Time-Format.html#v:formatTime).
|
|
||||||
> Eg, if the CSV dates are non-padded D/M/YY, use:
|
|
||||||
>
|
>
|
||||||
> date-format %-d/%-m/%y
|
> : `date`
|
||||||
|
> : `date2`
|
||||||
|
> : `status`
|
||||||
|
> : `code`
|
||||||
|
> : `description`
|
||||||
|
> : `comment`
|
||||||
|
> : `account1`
|
||||||
|
> : `account2`
|
||||||
|
> : `currency`
|
||||||
|
> : `amount`
|
||||||
|
> : `amount-in`
|
||||||
|
> : `amount-out`
|
||||||
|
> :
|
||||||
>
|
>
|
||||||
> Note custom date formats work best when hledger is built with version
|
> <!-- -->
|
||||||
> 1.2.0.5 or greater of the [time](http://hackage.haskell.org/package/time) library.
|
|
||||||
|
|
||||||
`description-field`
|
|
||||||
|
|
||||||
> Which field contains the transaction's description. This can be a simple
|
|
||||||
> field number, or a custom format combining multiple fields, eg:
|
|
||||||
>
|
>
|
||||||
> description-field %(1) - %(3)
|
> *JOURNALFIELDNAME* *FIELDVALUE*
|
||||||
|
> : This (a field assignment) assigns the given text value,
|
||||||
|
> which can have CSV field values interpolated via `%name` or `%1`,
|
||||||
|
> to a journal entry field (one of the field names above).
|
||||||
|
> Field assignments may be used in addition to or instead of a field list.
|
||||||
|
>
|
||||||
|
> :
|
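The interpolation of CSV field values via `%name` or `%1` in a field assignment could work roughly like the Python sketch below. It is illustrative only: the function name `interpolate` is hypothetical, and it assumes that `%1` refers to the first CSV field (1-based numbering) and that unknown references are left untouched.

```python
import re


def interpolate(template, names, record):
    """Replace %name and %N references with values from one CSV record.

    names  -- field names from the field list, in CSV column order
    record -- the CSV record's values, in the same order
    Assumptions: %N is 1-based; unknown references are left as written.
    """
    def lookup(match):
        ref = match.group(1)
        if ref.isdigit():
            index = int(ref) - 1
        elif ref in names:
            index = names.index(ref)
        else:
            return match.group(0)
        return record[index] if 0 <= index < len(record) else ""

    return re.sub(r"%(\d+|[A-Za-z][A-Za-z0-9_-]*)", lookup, template)


if __name__ == "__main__":
    names = ["date", "description", "amount"]
    record = ["2012/3/22", "DEPOSIT", "50.00"]
    print(interpolate("%description (%3)", names, record))
```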
||||||
|
|
||||||
`date2-field`
|
We can also have conditional field assignments which apply only to certain CSV records:
|
||||||
|
|
||||||
> Which field contains the transaction's [secondary date](#primary-secondary-dates).
|
> **if** *PATTERNS*<br> *FIELDASSIGNMENTS*
|
||||||
|
> : PATTERNS is one or more regular expressions on the same or following lines.
|
||||||
|
> <!-- then an optional `~` (indicating case-insensitive infix regular expression matching),\ -->
|
||||||
|
> These are followed by one or more indented field assignment lines.\
|
||||||
|
> In this example, any CSV record containing "groc" (case insensitive, anywhere within the whole record)
|
||||||
|
> will have its account2 and comment set as shown:
|
||||||
|
>
|
||||||
|
> if groc
|
||||||
|
> account2 expenses:groceries
|
||||||
|
> comment household stuff
|
||||||
|
|
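A conditional block like the `if groc` example can be pictured as "if any pattern matches the record, apply these assignments". The Python sketch below shows that reading. The function name `apply_if_block` is hypothetical, and two details are assumptions rather than statements about hledger: that a block with several patterns applies when any one of them matches, and that the record is matched as its fields joined with commas.

```python
import re


def apply_if_block(record, patterns, assignments, entry):
    """Return a copy of entry, updated with assignments if any pattern matches.

    Matching is case-insensitive and may occur anywhere in the record.
    Assumptions: any-pattern semantics; record text is the comma-joined fields.
    """
    text = ",".join(record)
    if any(re.search(pattern, text, re.IGNORECASE) for pattern in patterns):
        return {**entry, **assignments}
    return dict(entry)


if __name__ == "__main__":
    block = {"account2": "expenses:groceries", "comment": "household stuff"}
    record = ["2012/3/24", "WHOLE FOODS GROCERY", "-20.00"]
    print(apply_if_block(record, ["groc"], block, {"account2": "expenses:unknown"}))
```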
||||||
`status-field`
|
And we may sometimes need these as well:
|
||||||
|
|
||||||
> Which field contains the transaction cleared status (`*`).
|
> **skip** [*N*]
|
||||||
|
> : Skip this number of CSV lines (1 by default).
|
||||||
`skip-lines`
|
> Use this to skip the initial CSV header line(s).
|
||||||
|
> <!-- hledger tries to skip initial CSV header lines automatically. -->
|
||||||
> How many lines to skip in the beginning of the file, e.g. to skip a
|
> <!-- If it guesses wrong, use this directive to skip exactly N lines. -->
|
||||||
> line of column headings.
|
> <!-- This can also be used in a conditional block to ignore certain CSV records. -->
|
||||||
|
>
|
||||||
Account-assigning rules select an account to transfer to based on the
|
> **date-format** *DATEFMT*
|
||||||
description field (unless `account2-field` is used). Each
|
> : This is required if the values for `date` or `date2` fields are not in YYYY/MM/DD format (or close to it).
|
||||||
account-assigning rule is a paragraph consisting of one or more
|
> DATEFMT specifies a strptime-style date parsing pattern containing [year/month/date format codes](http://hackage.haskell.org/packages/archive/time/latest/doc/html/Data-Time-Format.html#v:formatTime).
|
||||||
case-insensitive regular expressions, one per line, followed by the
|
> Some common values:
|
||||||
account name to use when the transaction's description matches any of
|
>
|
||||||
these patterns. Eg:
|
> %-d/%-m/%Y
|
||||||
|
> %-m/%-d/%Y
|
||||||
WHOLE FOODS
|
> %Y-%h-%d
|
||||||
SUPERMARKET
|
|
||||||
expenses:food:groceries
|
|
||||||
|
|
||||||
If you want to clean up messy bank data, you can add `=` and a replacement
|
|
||||||
pattern, which rewrites the matched part of the description. (To rewrite
|
|
||||||
the entire description, use `.*PAT.*=REPL`). You can also refer to matched
|
|
||||||
groups in the usual way with `\0` etc. Eg:
|
|
||||||
|
|
||||||
BLKBSTR=BLOCKBUSTER
|
|
||||||
expenses:entertainment
|
|
||||||
|
|
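The reason a `date-format` directive matters for the new rules syntax is that the same text can mean different dates under different patterns. The short Python sketch below illustrates this with two of the common values listed for `date-format`. It uses Python's own `strptime`, which does not accept the `%-d` / `%-m` spelling, so plain `%d` and `%m` are used instead (in Python these also accept unpadded digits). The function name `normalize_date` is hypothetical.

```python
from datetime import datetime


def normalize_date(text, datefmt):
    """Parse text with a strptime-style pattern and return it as YYYY/MM/DD."""
    return datetime.strptime(text, datefmt).strftime("%Y/%m/%d")


if __name__ == "__main__":
    # The same CSV value read as day/month/year and as month/day/year
    print(normalize_date("3/4/2012", "%d/%m/%Y"))
    print(normalize_date("3/4/2012", "%m/%d/%Y"))
```

Without the right pattern, a value such as `3/4/2012` would be silently read as the wrong day, which is why the format has to be stated rather than guessed.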
||||||
### Timelog files
|
### Timelog files
|
||||||
|
|
||||||
|
|||||||