.TH "hledger_csv" "5" "September 2019" "hledger 1.15.99" "hledger User Manuals"



.SH NAME
.PP
CSV - how hledger reads CSV data, and the CSV rules file format
.SH DESCRIPTION
.PP
hledger can read CSV (comma-separated value) files as if they were
journal files, automatically converting each CSV record into a
transaction.
(To learn about \f[I]writing\f[R] CSV, see CSV output.)
.PP
Converting CSV to transactions requires some special conversion rules.
These do several things:
.IP \[bu] 2
they describe the layout and format of the CSV data
.IP \[bu] 2
they can customize the generated journal entries using a simple
templating language
.IP \[bu] 2
they can add refinements based on patterns in the CSV data, eg
categorizing transactions with more detailed account names.
.PP
When reading a CSV file named \f[C]FILE.csv\f[R], hledger looks for a
conversion rules file named \f[C]FILE.csv.rules\f[R] in the same
directory.
You can override this with the \f[C]--rules-file\f[R] option.
If the rules file does not exist, hledger will auto-create one with some
example rules, which you\[aq]ll need to adjust.
.PP
At minimum, the rules file must identify the date and amount fields.
It\[aq]s often necessary to also specify the date format and the number
of header lines to skip.
Eg:
.IP
.nf
\f[C]
fields date, _, _, amount
date-format  %d/%m/%Y
skip 1
\f[R]
.fi
.PP
A more complete example:
.IP
.nf
\f[C]
# hledger CSV rules for amazon.com order history

# sample:
# \[dq]Date\[dq],\[dq]Type\[dq],\[dq]To/From\[dq],\[dq]Name\[dq],\[dq]Status\[dq],\[dq]Amount\[dq],\[dq]Fees\[dq],\[dq]Transaction ID\[dq]
# \[dq]Jul 29, 2012\[dq],\[dq]Payment\[dq],\[dq]To\[dq],\[dq]Adapteva, Inc.\[dq],\[dq]Completed\[dq],\[dq]$25.00\[dq],\[dq]$0.00\[dq],\[dq]17LA58JSK6PRD4HDGLNJQPI1PB9N8DKPVHL\[dq]

# skip one header line
skip 1

# name the csv fields (and assign the transaction\[aq]s date, amount and code)
fields date, _, toorfrom, name, amzstatus, amount, fees, code

# how to parse the date
date-format %b %-d, %Y

# combine two fields to make the description
description %toorfrom %name

# save these fields as tags
comment     status:%amzstatus, fees:%fees

# set the base account for all transactions
account1    assets:amazon

# flip the sign on the amount
amount      -%amount
\f[R]
.fi
.PP
For more examples, see Convert CSV files.
.SH CSV RULES
.PP
The following seven kinds of rule can appear in the rules file, in any
order.
Blank lines and lines beginning with \f[C]#\f[R] or \f[C];\f[R] are
ignored.
.SS skip
.PP
\f[C]skip\f[R] \f[I]\f[CI]N\f[I]\f[R]
.PP
Skip this number of CSV records at the beginning.
You\[aq]ll need this whenever your CSV data contains header lines.
Eg:
.IP
.nf
\f[C]
# ignore the first CSV line
skip 1
\f[R]
.fi
.SS date-format
.PP
\f[C]date-format\f[R] \f[I]\f[CI]DATEFMT\f[I]\f[R]
.PP
When your CSV date fields are not formatted like \f[C]YYYY/MM/DD\f[R]
(or \f[C]YYYY-MM-DD\f[R] or \f[C]YYYY.MM.DD\f[R]), you\[aq]ll need to
specify the format.
DATEFMT is a strptime-like date parsing pattern, which must parse the
date field values completely.
Examples:
.IP
.nf
\f[C]
# for dates like \[dq]11/06/2013\[dq]:
date-format %m/%d/%Y
\f[R]
.fi
.IP
.nf
\f[C]
# for dates like \[dq]6/11/2013\[dq] (note the - to make leading zeros optional):
date-format %-d/%-m/%Y
\f[R]
.fi
.IP
.nf
\f[C]
# for dates like \[dq]2013-Nov-06\[dq]:
date-format %Y-%h-%d
\f[R]
.fi
.IP
.nf
\f[C]
# for dates like \[dq]11/6/2013 11:32 PM\[dq]:
date-format %-m/%-d/%Y %l:%M %p
\f[R]
.fi
.SS field list
.PP
\f[C]fields\f[R] \f[I]\f[CI]FIELDNAME1\f[I]\f[R],
\f[I]\f[CI]FIELDNAME2\f[I]\f[R]...
.PP
This (a) names the CSV fields, in order (names may not contain
whitespace; uninteresting names may be left blank), and (b) assigns them
to journal entry fields if you use any of these standard field names:
\f[C]date\f[R], \f[C]date2\f[R], \f[C]status\f[R], \f[C]code\f[R],
\f[C]description\f[R], \f[C]comment\f[R], \f[C]account1\f[R],
\f[C]account2\f[R], \f[C]amount\f[R], \f[C]amount-in\f[R],
\f[C]amount-out\f[R], \f[C]currency\f[R], \f[C]balance\f[R],
\f[C]balance1\f[R], \f[C]balance2\f[R].
Eg:
.IP
.nf
\f[C]
# use the 1st, 2nd and 4th CSV fields as the entry\[aq]s date, description and amount,
# and give the 7th and 8th fields meaningful names for later reference:
#
# CSV field:
#      1     2            3 4       5 6 7          8
# entry field:
fields date, description, , amount, , , somefield, anotherfield
\f[R]
.fi
.SS field assignment
.PP
\f[I]\f[CI]ENTRYFIELDNAME\f[I]\f[R] \f[I]\f[CI]FIELDVALUE\f[I]\f[R]
.PP
This sets a journal entry field (one of the standard names above) to the
given text value, which can include CSV field values interpolated by
name (\f[C]%CSVFIELDNAME\f[R]) or 1-based position (\f[C]%N\f[R]).
Eg:
.IP
.nf
\f[C]
# set the amount to the 4th CSV field with \[dq]USD \[dq] prepended
amount USD %4
\f[R]
.fi
.IP
.nf
\f[C]
# combine three fields to make a comment (containing two tags)
comment note: %somefield - %anotherfield, date: %1
\f[R]
.fi
.PP
Field assignments can be used instead of or in addition to a field list.
.PP
Note that interpolation strips leading and trailing whitespace, so a CSV
value like \f[C]\[dq] 1 \[dq]\f[R] becomes \f[C]1\f[R] when interpolated
(#1051).
.SS conditional block
.PP
\f[C]if\f[R] \f[I]\f[CI]PATTERN\f[I]\f[R]
.PD 0
.P
.PD
\ \ \ \ \f[I]\f[CI]FIELDASSIGNMENTS\f[I]\f[R]...
.PP
\f[C]if\f[R]
.PD 0
.P
.PD
\f[I]\f[CI]PATTERN\f[I]\f[R]
.PD 0
.P
.PD
\f[I]\f[CI]PATTERN\f[I]\f[R]...
.PD 0
.P
.PD
\ \ \ \ \f[I]\f[CI]FIELDASSIGNMENTS\f[I]\f[R]...
.PP
This applies one or more field assignments, only to those CSV records
matched by one of the PATTERNs.
The patterns are case-insensitive regular expressions which match
anywhere within the whole CSV record (it\[aq]s not yet possible to match
within a specific field).
When there are multiple patterns they can be written on separate lines,
unindented.
The field assignments are on separate lines indented by at least one
space.
Examples:
.IP
.nf
\f[C]
# if the CSV record contains \[dq]groceries\[dq], set account2 to \[dq]expenses:groceries\[dq]
if groceries
 account2 expenses:groceries
\f[R]
.fi
.IP
.nf
\f[C]
# if the CSV record contains any of these patterns, set account2 and comment as shown
if
monthly service fee
atm transaction fee
banking thru software
 account2 expenses:business:banking
 comment  XXX deductible ? check it
\f[R]
.fi
.SS include
.PP
\f[C]include\f[R] \f[I]\f[CI]RULESFILE\f[I]\f[R]
.PP
Include another rules file at this point.
\f[C]RULESFILE\f[R] is either an absolute file path or a path relative
to the current file\[aq]s directory.
Eg:
.IP
.nf
\f[C]
# rules reused with several CSV files
include common.rules
\f[R]
.fi
.SS newest-first
.PP
\f[C]newest-first\f[R]
.PP
Consider adding this rule if all of the following are true: you might be
processing just one day of data, your CSV records are in reverse
chronological order (newest first), and you care about preserving the
order of same-day transactions.
It usually isn\[aq]t needed, because hledger autodetects the CSV order,
but when all CSV records have the same date it will assume they are
oldest first.
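.PP
A minimal sketch of where this rule might sit in a rules file; the field
layout shown is only an assumption for illustration:
.IP
.nf
\f[C]
# hypothetical rules for a bank export that lists the latest record first
skip 1
fields date, description, amount
newest-first
\f[R]
.fi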
.SH CSV TIPS
.SS CSV ordering
.PP
The generated journal entries will be sorted by date.
The order of same-day entries will be preserved (except in the special
case where you might need \f[C]newest-first\f[R], see above).
.SS CSV accounts
.PP
Each journal entry will have two postings, to \f[C]account1\f[R] and
\f[C]account2\f[R] respectively.
It\[aq]s not yet possible to generate entries with more than two
postings.
It\[aq]s conventional and recommended to use \f[C]account1\f[R] for the
account whose CSV we are reading.
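.PP
As an illustration, assuming a hypothetical checking account export whose
records hold a date, a description and an amount, rules like these would
set the accounts of both postings:
.IP
.nf
\f[C]
# hypothetical field layout and account names, for illustration only
fields date, description, amount
# the account whose CSV we are reading
account1 assets:bank:checking
# the other side of each transaction
account2 expenses:unknown
\f[R]
.fi
.PP
A conditional block can then override \f[C]account2\f[R] for records
matching a pattern.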
.SS CSV amounts
.PP
A transaction amount must be set, in one of these ways:
.IP \[bu] 2
with an \f[C]amount\f[R] field assignment, which sets the first
posting\[aq]s amount
.IP \[bu] 2
(When the CSV has debit and credit amounts in separate fields:)
.PD 0
.P
.PD
with field assignments for the \f[C]amount-in\f[R] and
\f[C]amount-out\f[R] pseudo fields (both of them).
Whichever one has a value will be used, with appropriate sign.
If both contain a value, the result may not be what you expect, so make
sure only one of them is populated in each record.
.IP \[bu] 2
or implicitly by means of a balance assignment (see below).
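.PP
For the separate debit and credit case, a sketch might look like this,
assuming a hypothetical CSV whose third field holds money received and
whose fourth holds money paid out (the names \f[C]received\f[R] and
\f[C]paid\f[R] are made up for this example):
.IP
.nf
\f[C]
# hypothetical layout: date, description, money in, money out
fields date, description, received, paid
# assign both pseudo fields; whichever has a value in a record is used
amount-in  %received
amount-out %paid
\f[R]
.fi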
.PP
There is some special handling for sign in amounts:
.IP \[bu] 2
If an amount value is parenthesised, it will be de-parenthesised and
sign-flipped.
.IP \[bu] 2
If an amount value begins with a double minus sign, those will cancel
out and be removed.
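.PP
To illustrate the two cases above with made-up values (written as
comments, since they describe CSV data rather than rules):
.IP
.nf
\f[C]
# amount value      amount used
# (25.00)           -25.00
# --25.00           25.00
\f[R]
.fi
.PP
The double minus case can arise when a rule such as
\f[C]amount -%amount\f[R] flips the sign of a value that is already
negative in the CSV.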
.PP
If the currency/commodity symbol is provided as a separate CSV field,
assign it to the \f[C]currency\f[R] pseudo field; the symbol will be
prepended to the amount (TODO: when there is an amount).
Or, you can use an \f[C]amount\f[R] field assignment for more control,
eg:
.IP
.nf
\f[C]
fields date,description,currency,amount
amount %amount %currency
\f[R]
.fi
.SS CSV balance assertions/assignments
.PP
If the CSV includes a running balance, you can assign that to one of the
pseudo fields \f[C]balance\f[R] (or \f[C]balance1\f[R]) or
\f[C]balance2\f[R].
This will generate a balance assertion (or if the amount is left empty,
a balance assignment), on the first or second posting, whenever the
running balance field is non-empty.
(TODO: #1000)
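.PP
A sketch, assuming a hypothetical CSV whose fourth field is the running
balance of the account being read:
.IP
.nf
\f[C]
# hypothetical layout: date, description, amount, running balance
fields date, description, amount, balance
account1 assets:bank:checking
\f[R]
.fi
.PP
With rules like these, each record with a non-empty balance field would
get a balance assertion on its first (\f[C]account1\f[R]) posting.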
.SS Reading multiple CSV files
.PP
You can read multiple CSV files at once using multiple \f[C]-f\f[R]
arguments on the command line, and hledger will look for a
correspondingly-named rules file for each.
Note that if you use the \f[C]--rules-file\f[R] option, that one rules
file will be used for all the CSV files being read.
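.PP
As a sketch of the shared case, the file names below are hypothetical and
the rules assume both CSV files have the same layout:
.IP
.nf
\f[C]
# shared.rules (hypothetical name), applied to every CSV file when hledger is run as:
#   hledger -f checking.csv -f savings.csv --rules-file shared.rules print
skip 1
fields date, description, amount
\f[R]
.fi
.PP
Without \f[C]--rules-file\f[R], the same command would instead look for
\f[C]checking.csv.rules\f[R] and \f[C]savings.csv.rules\f[R]; each of
those could pull in shared rules with \f[C]include\f[R].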
.SS Valid CSV
.PP
hledger follows RFC 4180, with the addition of a customisable separator
character.
.PP
Some things to note:
.PP
When quoting fields,
.IP \[bu] 2
you must use double quotes, not single quotes
.IP \[bu] 2
spaces outside the quotes are not allowed.
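.PP
To illustrate these quoting rules with made-up records (written as rules
file comments, like the sample data in the example rules file earlier):
.IP
.nf
\f[C]
# follows the rules: double quotes, no spaces outside them
# \[dq]2019/09/01\[dq],\[dq]Coffee, large\[dq],\[dq]3.50\[dq]
#
# does not follow the rules: single quotes
# \[aq]2019/09/01\[aq],\[aq]Coffee, large\[aq],\[aq]3.50\[aq]
#
# does not follow the rules: spaces outside the quotes
# \[dq]2019/09/01\[dq], \[dq]Coffee, large\[dq], \[dq]3.50\[dq]
\f[R]
.fi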


.SH "REPORTING BUGS"
Report bugs at http://bugs.hledger.org
(or on the #hledger IRC channel or hledger mail list)

.SH AUTHORS
Simon Michael <simon@joyful.com> and contributors

.SH COPYRIGHT

Copyright (C) 2007-2019 Simon Michael.
.br
Released under GNU GPL v3 or later.

.SH SEE ALSO
hledger(1), hledger\-ui(1), hledger\-web(1), hledger\-api(1),
hledger_csv(5), hledger_journal(5), hledger_timeclock(5), hledger_timedot(5),
ledger(1)

http://hledger.org