;doc: update command help
parent eb6b94ad5a
commit 2889bb6efb
@@ -21,22 +21,27 @@ hledger import bank.csv or perhaps hledger import *.csv.
 Note you can import from any file format, though CSV files are the most
 common import source, and these docs focus on that case.
 
-"Deduplication"
+Skipping
 
 import tries to import only the transactions which are new since the
-last import. So if your bank's CSV includes the last three months of
-data, you can download and import it every month (or week, or day) and
-only the new transactions will be imported each time.
+last import, "skipping over" any that it saw last time. So if your
+bank's CSV includes the last three months of data, you can download and
+import it every month (or week, or day) and only the new transactions
+will be imported each time.
 
-It works as follows. For each imported FILE (usually a CSV file): - It
-tries to find the latest date seen previously, by reading it from a
-hidden .latest.FILE in the same directory. - Then it processes FILE,
-ignoring any transactions on or before the "latest seen" date.
+It works as follows. For each imported FILE:
+
+- It tries to find the latest date seen previously, by reading it from
+a hidden .latest.FILE in the same directory.
+- Then it processes FILE, ignoring any transactions on or before the
+"latest seen" date.
 
 And after a successful import, it updates the .latest.FILE(s) for next
 time (unless --dry-run was used).
 
-This is simple but fairly effective. It assumes:
+This is simple system that works fairly well for transaction data
+(usually CSV, but it could be any of hledger's input formats). It
+assumes:
 
 1. new items always have the newest dates
 2. item dates are stable across successive CSV downloads
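Reviewer's sketch of the skipping steps described in the hunk above. This is a simplified Python model, not hledger's actual implementation (hledger is written in Haskell). The function names, the representation of a transaction as a `(date, description)` tuple, and the assumption that the state file holds an ISO date on its first line are all hypothetical choices made for illustration.

```python
from datetime import date
from pathlib import Path


def latest_path(input_path: Path) -> Path:
    # Hidden state file next to the input: bank.csv -> .latest.bank.csv
    return input_path.with_name(".latest." + input_path.name)


def read_latest(input_path: Path):
    # Assumption: the first whitespace-separated token is an ISO date.
    state = latest_path(input_path)
    if not state.exists():
        return None
    tokens = state.read_text().split()
    return date.fromisoformat(tokens[0]) if tokens else None


def import_file(input_path: Path, txns, dry_run=False):
    """Return only the transactions dated after the "latest seen" date.

    txns is a list of (date, description) tuples already parsed from
    input_path; parsing is out of scope for this sketch.
    """
    latest = read_latest(input_path)
    fresh = [t for t in txns if latest is None or t[0] > latest]
    # Update the state for next time, unless this is a dry run.
    if fresh and not dry_run:
        newest = max(t[0] for t in fresh)
        latest_path(input_path).write_text(newest.isoformat() + "\n")
    return fresh
```

Calling `import_file` twice on the same data returns everything the first time and nothing the second, which is the "only the new transactions will be imported each time" behaviour. The sketch relies on the same two assumptions the help text lists: new items carry the newest dates, and dates are stable between downloads.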
@@ -49,11 +54,15 @@ by importing more often (and in old transactions it doesn't matter).
 
 Note, import avoids reprocessing the same dates across successive runs,
 but it does not detect transactions that are duplicated within a single
-run. So eg if you downloaded but did not import bank.1.csv, and later
-downloaded bank.2.csv with overlapping data, you should not import both
-of them in a single run (hledger import bank.1.csv bank.2.csv); instead,
-import them one at a time (hledger import bank.1.csv, then
-hledger import bank.2.csv).
+run. I'll call these "skipping" and "deduplication".
+
+So for example, say you downloaded but did not import bank.1.csv, and
+later downloaded bank.2.csv with overlapping data. Then you should not
+import both of them at once (hledger import bank.1.csv bank.2.csv), as
+the overlapping data would appear twice and not be deduplicated.
+Instead, import them one at a time
+(hledger import bank.1.csv; hledger import bank.2.csv), and the second
+import will skip the overlapping data.
 
 Normally you can ignore the .latest.* files, but if needed, you can
 delete them (to make all transactions unseen), or construct/modify them
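Reviewer's sketch of why importing two overlapping files in a single run yields duplicates, as the hunk above warns. It models only the single-run case: each file is filtered against its own "latest seen" date and nothing compares transactions across files. The `run_import` function and its in-memory `latest` dictionary are hypothetical stand-ins for the real command and its `.latest.*` files.

```python
from datetime import date


def run_import(files, latest):
    """Simulate one import run over several files.

    files:  dict mapping file name -> list of (date, description)
    latest: dict mapping file name -> latest date seen (updated in place)
    """
    imported = []
    for name, txns in files.items():
        seen = latest.get(name)
        # Skipping: per-file filter on the "latest seen" date.
        fresh = [t for t in txns if seen is None or t[0] > seen]
        # No deduplication: results are simply concatenated.
        imported.extend(fresh)
        if fresh:
            latest[name] = max(t[0] for t in fresh)
    return imported


overlap = (date(2024, 1, 5), "food")
files = {
    "bank.1.csv": [(date(2024, 1, 1), "rent"), overlap],
    "bank.2.csv": [overlap, (date(2024, 1, 9), "gas")],
}
both_at_once = run_import(files, {})
```

Here `both_at_once` contains the overlapping "food" transaction twice, once from each file, because neither file had any previously recorded state to skip against.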
@@ -63,7 +72,7 @@ seen transactions up to this date, and this many of them occurring on
 that date".
 
 (hledger print --new also uses and updates these .latest.* files, but it
-is not often used.)
+is less often used.)
 
 Related: CSV > Working with CSV > Deduplicating, importing.
 
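Reviewer's sketch of the state described by the quoted phrase in the last hunk: "seen transactions up to this date, and this many of them occurring on that date". It assumes, hypothetically, that the state can be modelled as a `(latest_date, seen_on_latest)` pair and that transactions arrive in a stable date order. It makes no claim about the on-disk layout of the `.latest.*` files, which this diff does not show.

```python
from datetime import date


def skip_seen(txns, latest_date, seen_on_latest):
    """Drop transactions already covered by the (date, count) state.

    txns must be sorted by date ascending; each is (date, description).
    """
    fresh = []
    remaining = seen_on_latest
    for txn in txns:
        if txn[0] < latest_date:
            continue  # strictly older than the latest seen date
        if txn[0] == latest_date and remaining > 0:
            remaining -= 1  # one of the already-seen items on that date
            continue
        fresh.append(txn)
    return fresh
```

With a state of "latest date 2024-01-05, one transaction seen on it", a second transaction that later shows up on 2024-01-05 is still treated as new, which a date-only comparison would miss.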