;doc: update manuals
parent
2889bb6efb
commit
8642db786a
@ -9866,24 +9866,28 @@ files to your main journal, you will run
|
||||
.PP
|
||||
Note you can import from any file format, though CSV files are the most
|
||||
common import source, and these docs focus on that case.
|
||||
.SS \[dq]Deduplication\[dq]
|
||||
.SS Skipping
|
||||
\f[CR]import\f[R] tries to import only the transactions which are new
|
||||
since the last import.
|
||||
since the last import, \[dq]skipping over\[dq] any that it saw last
|
||||
time.
|
||||
So if your bank\[aq]s CSV includes the last three months of data, you
|
||||
can download and \f[CR]import\f[R] it every month (or week, or day) and
|
||||
only the new transactions will be imported each time.
|
||||
.PP
|
||||
It works as follows.
|
||||
For each imported \f[CR]FILE\f[R] (usually a CSV file): \- It tries to
|
||||
find the latest date seen previously, by reading it from a hidden
|
||||
\f[CR].latest.FILE\f[R] in the same directory.
|
||||
\- Then it processes \f[CR]FILE\f[R], ignoring any transactions on or
|
||||
For each imported \f[CR]FILE\f[R]:
|
||||
.IP \[bu] 2
|
||||
It tries to find the latest date seen previously, by reading it from a
|
||||
hidden \f[CR].latest.FILE\f[R] in the same directory.
|
||||
.IP \[bu] 2
|
||||
Then it processes \f[CR]FILE\f[R], ignoring any transactions on or
|
||||
before the \[dq]latest seen\[dq] date.
|
||||
.PP
|
||||
And after a successful import, it updates the \f[CR].latest.FILE\f[R](s)
|
||||
for next time (unless \f[CR]\-\-dry\-run\f[R] was used).
|
||||
.PP
|
||||
This is simple but fairly effective.
|
||||
This is a simple system that works fairly well for transaction data
|
||||
(usually CSV, but it could be any of hledger\[aq]s input formats).
|
||||
It assumes:
|
||||
.IP "1." 3
|
||||
new items always have the newest dates
|
||||
@ -9901,12 +9905,17 @@ more often (and in old transactions it doesn\[aq]t matter).
|
||||
Note, \f[CR]import\f[R] avoids reprocessing the same dates across
|
||||
successive runs, but it does not detect transactions that are duplicated
|
||||
within a single run.
|
||||
So eg if you downloaded but did not import \f[CR]bank.1.csv\f[R], and
|
||||
later downloaded \f[CR]bank.2.csv\f[R] with overlapping data, you should
|
||||
not import both of them in a single run
|
||||
(\f[CR]hledger import bank.1.csv bank.2.csv\f[R]); instead, import them
|
||||
one at a time (\f[CR]hledger import bank.1.csv\f[R], then
|
||||
\f[CR]hledger import bank.2.csv\f[R]).
|
||||
I\[aq]ll call these \[dq]skipping\[dq] and \[dq]deduplication\[dq].
|
||||
.PP
|
||||
So for example, say you downloaded but did not import
|
||||
\f[CR]bank.1.csv\f[R], and later downloaded \f[CR]bank.2.csv\f[R] with
|
||||
overlapping data.
|
||||
Then you should not import both of them at once
|
||||
(\f[CR]hledger import bank.1.csv bank.2.csv\f[R]), as the overlapping
|
||||
data would appear twice and not be deduplicated.
|
||||
Instead, import them one at a time
|
||||
(\f[CR]hledger import bank.1.csv; hledger import bank.2.csv\f[R]), and
|
||||
the second import will skip the overlapping data.
|
||||
.PP
|
||||
Normally you can ignore the \f[CR].latest.*\f[R] files, but if needed,
|
||||
you can delete them (to make all transactions unseen), or
|
||||
@ -9917,7 +9926,7 @@ It means \[dq]I have seen transactions up to this date, and this many of
|
||||
them occurring on that date\[dq].
|
||||
.PP
|
||||
(\f[CR]hledger print \-\-new\f[R] also uses and updates these
|
||||
\f[CR].latest.*\f[R] files, but it is not often used.)
|
||||
\f[CR].latest.*\f[R] files, but it is less often used.)
|
||||
.PP
|
||||
Related: CSV > Working with CSV > Deduplicating, importing.
|
||||
.SS Import testing
|
||||
|
||||
@ -9546,31 +9546,36 @@ most common import source, and these docs focus on that case.
|
||||
|
||||
* Menu:
|
||||
|
||||
* "Deduplication"::
|
||||
* Skipping::
|
||||
* Import testing::
|
||||
* Importing balance assignments::
|
||||
* Commodity display styles::
|
||||
|
||||
|
||||
File: hledger.info, Node: "Deduplication", Next: Import testing, Up: import
|
||||
File: hledger.info, Node: Skipping, Next: Import testing, Up: import
|
||||
|
||||
24.19.1 "Deduplication"
|
||||
-----------------------
|
||||
24.19.1 Skipping
|
||||
----------------
|
||||
|
||||
'import' tries to import only the transactions which are new since the
|
||||
last import. So if your bank's CSV includes the last three months of
|
||||
data, you can download and 'import' it every month (or week, or day) and
|
||||
only the new transactions will be imported each time.
|
||||
last import, "skipping over" any that it saw last time. So if your
|
||||
bank's CSV includes the last three months of data, you can download and
|
||||
'import' it every month (or week, or day) and only the new transactions
|
||||
will be imported each time.
|
||||
|
||||
It works as follows. For each imported 'FILE' (usually a CSV file):
|
||||
- It tries to find the latest date seen previously, by reading it from a
|
||||
hidden '.latest.FILE' in the same directory. - Then it processes
|
||||
'FILE', ignoring any transactions on or before the "latest seen" date.
|
||||
It works as follows. For each imported 'FILE':
|
||||
|
||||
* It tries to find the latest date seen previously, by reading it
|
||||
from a hidden '.latest.FILE' in the same directory.
|
||||
* Then it processes 'FILE', ignoring any transactions on or before
|
||||
the "latest seen" date.
|
||||
|
||||
And after a successful import, it updates the '.latest.FILE'(s) for
|
||||
next time (unless '--dry-run' was used).
|
||||
|
||||
This is simple but fairly effective. It assumes:
|
||||
This is a simple system that works fairly well for transaction data
|
||||
(usually CSV, but it could be any of hledger's input formats). It
|
||||
assumes:
|
||||
|
||||
1. new items always have the newest dates
|
||||
2. item dates are stable across successive CSV downloads
|
||||
@ -9583,11 +9588,15 @@ by importing more often (and in old transactions it doesn't matter).
|
||||
|
||||
Note, 'import' avoids reprocessing the same dates across successive
|
||||
runs, but it does not detect transactions that are duplicated within a
|
||||
single run. So eg if you downloaded but did not import 'bank.1.csv',
|
||||
and later downloaded 'bank.2.csv' with overlapping data, you should not
|
||||
import both of them in a single run ('hledger import bank.1.csv
|
||||
bank.2.csv'); instead, import them one at a time ('hledger import
|
||||
bank.1.csv', then 'hledger import bank.2.csv').
|
||||
single run. I'll call these "skipping" and "deduplication".
|
||||
|
||||
So for example, say you downloaded but did not import 'bank.1.csv',
|
||||
and later downloaded 'bank.2.csv' with overlapping data. Then you
|
||||
should not import both of them at once ('hledger import bank.1.csv
|
||||
bank.2.csv'), as the overlapping data would appear twice and not be
|
||||
deduplicated. Instead, import them one at a time ('hledger import
|
||||
bank.1.csv; hledger import bank.2.csv'), and the second import will skip
|
||||
the overlapping data.
|
||||
|
||||
Normally you can ignore the '.latest.*' files, but if needed, you can
|
||||
delete them (to make all transactions unseen), or construct/modify them
|
||||
@ -9597,12 +9606,12 @@ have seen transactions up to this date, and this many of them occurring
|
||||
on that date".
|
||||
|
||||
('hledger print --new' also uses and updates these '.latest.*' files,
|
||||
but it is not often used.)
|
||||
but it is less often used.)
|
||||
|
||||
Related: CSV > Working with CSV > Deduplicating, importing.
|
||||
|
||||
|
||||
File: hledger.info, Node: Import testing, Next: Importing balance assignments, Prev: "Deduplication", Up: import
|
||||
File: hledger.info, Node: Import testing, Next: Importing balance assignments, Prev: Skipping, Up: import
|
||||
|
||||
24.19.2 Import testing
|
||||
----------------------
|
||||
@ -11717,84 +11726,84 @@ Node: help343889
|
||||
Ref: #help-1343998
|
||||
Node: import345371
|
||||
Ref: #import345494
|
||||
Node: "Deduplication"346604
|
||||
Ref: #deduplication346735
|
||||
Node: Import testing348911
|
||||
Ref: #import-testing349078
|
||||
Node: Importing balance assignments349921
|
||||
Ref: #importing-balance-assignments350127
|
||||
Node: Commodity display styles350776
|
||||
Ref: #commodity-display-styles350949
|
||||
Node: incomestatement351078
|
||||
Ref: #incomestatement351220
|
||||
Node: notes352551
|
||||
Ref: #notes352673
|
||||
Node: payees353035
|
||||
Ref: #payees353150
|
||||
Node: prices353669
|
||||
Ref: #prices353784
|
||||
Node: print354437
|
||||
Ref: #print354552
|
||||
Node: print explicitness355528
|
||||
Ref: #print-explicitness355671
|
||||
Node: print amount style356450
|
||||
Ref: #print-amount-style356620
|
||||
Node: print parseability357690
|
||||
Ref: #print-parseability357862
|
||||
Node: print other features358611
|
||||
Ref: #print-other-features358790
|
||||
Node: print output format359311
|
||||
Ref: #print-output-format359459
|
||||
Node: register362598
|
||||
Ref: #register362720
|
||||
Node: Custom register output367751
|
||||
Ref: #custom-register-output367882
|
||||
Node: rewrite369229
|
||||
Ref: #rewrite369347
|
||||
Node: Re-write rules in a file371245
|
||||
Ref: #re-write-rules-in-a-file371408
|
||||
Node: Diff output format372557
|
||||
Ref: #diff-output-format372740
|
||||
Node: rewrite vs print --auto373832
|
||||
Ref: #rewrite-vs.-print---auto373992
|
||||
Node: roi374548
|
||||
Ref: #roi374655
|
||||
Node: Spaces and special characters in --inv and --pnl376467
|
||||
Ref: #spaces-and-special-characters-in---inv-and---pnl376707
|
||||
Node: Semantics of --inv and --pnl377195
|
||||
Ref: #semantics-of---inv-and---pnl377434
|
||||
Node: IRR and TWR explained379284
|
||||
Ref: #irr-and-twr-explained379444
|
||||
Node: stats382697
|
||||
Ref: #stats382805
|
||||
Node: tags384319
|
||||
Ref: #tags-1384426
|
||||
Node: test385435
|
||||
Ref: #test385528
|
||||
Node: PART 5 COMMON TASKS386270
|
||||
Ref: #part-5-common-tasks386416
|
||||
Node: Getting help386714
|
||||
Ref: #getting-help386855
|
||||
Node: Constructing command lines387615
|
||||
Ref: #constructing-command-lines387816
|
||||
Node: Starting a journal file388473
|
||||
Ref: #starting-a-journal-file388675
|
||||
Node: Setting LEDGER_FILE389877
|
||||
Ref: #setting-ledger_file390069
|
||||
Node: Setting opening balances391026
|
||||
Ref: #setting-opening-balances391227
|
||||
Node: Recording transactions394368
|
||||
Ref: #recording-transactions394557
|
||||
Node: Reconciling395113
|
||||
Ref: #reconciling395265
|
||||
Node: Reporting397522
|
||||
Ref: #reporting397671
|
||||
Node: Migrating to a new file401656
|
||||
Ref: #migrating-to-a-new-file401813
|
||||
Node: BUGS402112
|
||||
Ref: #bugs402202
|
||||
Node: Troubleshooting403081
|
||||
Ref: #troubleshooting403181
|
||||
Node: Skipping346597
|
||||
Ref: #skipping346707
|
||||
Node: Import testing349191
|
||||
Ref: #import-testing349351
|
||||
Node: Importing balance assignments350194
|
||||
Ref: #importing-balance-assignments350400
|
||||
Node: Commodity display styles351049
|
||||
Ref: #commodity-display-styles351222
|
||||
Node: incomestatement351351
|
||||
Ref: #incomestatement351493
|
||||
Node: notes352824
|
||||
Ref: #notes352946
|
||||
Node: payees353308
|
||||
Ref: #payees353423
|
||||
Node: prices353942
|
||||
Ref: #prices354057
|
||||
Node: print354710
|
||||
Ref: #print354825
|
||||
Node: print explicitness355801
|
||||
Ref: #print-explicitness355944
|
||||
Node: print amount style356723
|
||||
Ref: #print-amount-style356893
|
||||
Node: print parseability357963
|
||||
Ref: #print-parseability358135
|
||||
Node: print other features358884
|
||||
Ref: #print-other-features359063
|
||||
Node: print output format359584
|
||||
Ref: #print-output-format359732
|
||||
Node: register362871
|
||||
Ref: #register362993
|
||||
Node: Custom register output368024
|
||||
Ref: #custom-register-output368155
|
||||
Node: rewrite369502
|
||||
Ref: #rewrite369620
|
||||
Node: Re-write rules in a file371518
|
||||
Ref: #re-write-rules-in-a-file371681
|
||||
Node: Diff output format372830
|
||||
Ref: #diff-output-format373013
|
||||
Node: rewrite vs print --auto374105
|
||||
Ref: #rewrite-vs.-print---auto374265
|
||||
Node: roi374821
|
||||
Ref: #roi374928
|
||||
Node: Spaces and special characters in --inv and --pnl376740
|
||||
Ref: #spaces-and-special-characters-in---inv-and---pnl376980
|
||||
Node: Semantics of --inv and --pnl377468
|
||||
Ref: #semantics-of---inv-and---pnl377707
|
||||
Node: IRR and TWR explained379557
|
||||
Ref: #irr-and-twr-explained379717
|
||||
Node: stats382970
|
||||
Ref: #stats383078
|
||||
Node: tags384592
|
||||
Ref: #tags-1384699
|
||||
Node: test385708
|
||||
Ref: #test385801
|
||||
Node: PART 5 COMMON TASKS386543
|
||||
Ref: #part-5-common-tasks386689
|
||||
Node: Getting help386987
|
||||
Ref: #getting-help387128
|
||||
Node: Constructing command lines387888
|
||||
Ref: #constructing-command-lines388089
|
||||
Node: Starting a journal file388746
|
||||
Ref: #starting-a-journal-file388948
|
||||
Node: Setting LEDGER_FILE390150
|
||||
Ref: #setting-ledger_file390342
|
||||
Node: Setting opening balances391299
|
||||
Ref: #setting-opening-balances391500
|
||||
Node: Recording transactions394641
|
||||
Ref: #recording-transactions394830
|
||||
Node: Reconciling395386
|
||||
Ref: #reconciling395538
|
||||
Node: Reporting397795
|
||||
Ref: #reporting397944
|
||||
Node: Migrating to a new file401929
|
||||
Ref: #migrating-to-a-new-file402086
|
||||
Node: BUGS402385
|
||||
Ref: #bugs402475
|
||||
Node: Troubleshooting403354
|
||||
Ref: #troubleshooting403454
|
||||
|
||||
End Tag Table
|
||||
|
||||
|
||||
@ -7719,21 +7719,26 @@ PART 4: COMMANDS
|
||||
Note you can import from any file format, though CSV files are the most
|
||||
common import source, and these docs focus on that case.
|
||||
|
||||
"Deduplication"
|
||||
Skipping
|
||||
import tries to import only the transactions which are new since the
|
||||
last import. So if your bank's CSV includes the last three months of
|
||||
data, you can download and import it every month (or week, or day) and
|
||||
only the new transactions will be imported each time.
|
||||
last import, "skipping over" any that it saw last time. So if your
|
||||
bank's CSV includes the last three months of data, you can download and
|
||||
import it every month (or week, or day) and only the new transactions
|
||||
will be imported each time.
|
||||
|
||||
It works as follows. For each imported FILE (usually a CSV file): - It
|
||||
tries to find the latest date seen previously, by reading it from a
|
||||
hidden .latest.FILE in the same directory. - Then it processes FILE,
|
||||
ignoring any transactions on or before the "latest seen" date.
|
||||
It works as follows. For each imported FILE:
|
||||
|
||||
o It tries to find the latest date seen previously, by reading it from
|
||||
a hidden .latest.FILE in the same directory.
|
||||
|
||||
o Then it processes FILE, ignoring any transactions on or before the
|
||||
"latest seen" date.
|
||||
|
||||
And after a successful import, it updates the .latest.FILE(s) for next
|
||||
time (unless --dry-run was used).
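As a rough illustration of the skipping rule described above, here is a minimal Python sketch. It is not hledger's implementation: the function name `select_new`, the `(date, description)` tuple layout, and keeping the "latest seen" date in memory instead of in a `.latest.FILE` are all assumptions made for this example. It models only the date comparison stated in the text.

```python
from datetime import date
from typing import List, Optional, Tuple

# Hypothetical record type for this sketch: (transaction date, description).
Txn = Tuple[date, str]


def select_new(
    transactions: List[Txn], latest_seen: Optional[date]
) -> Tuple[List[Txn], Optional[date]]:
    """Return the transactions to import and the updated "latest seen" date.

    Assumption: a transaction counts as new only if its date is strictly
    after latest_seen, i.e. anything on or before that date is ignored.
    """
    if latest_seen is None:
        # No previous import recorded, so every transaction is unseen.
        new = list(transactions)
    else:
        new = [t for t in transactions if t[0] > latest_seen]

    # After a successful import, remember the newest date encountered.
    candidates = [t[0] for t in transactions]
    if latest_seen is not None:
        candidates.append(latest_seen)
    updated = max(candidates, default=None)
    return new, updated


if __name__ == "__main__":
    # First download: nothing seen yet, so both transactions are imported.
    january = [(date(2024, 1, 5), "rent"), (date(2024, 1, 20), "groceries")]
    new, latest = select_new(january, None)
    print(len(new), latest)

    # Second download overlaps the first; only the later item is imported.
    february = january + [(date(2024, 2, 3), "coffee")]
    new, latest = select_new(february, latest)
    print(len(new), latest)
```

In this sketch a dry run would correspond to calling `select_new` and discarding the returned date, so the next run starts from the same "latest seen" value.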
|
||||
|
||||
This is simple but fairly effective. It assumes:
|
||||
This is a simple system that works fairly well for transaction data
(usually CSV, but it could be any of hledger's input formats). It assumes:
|
||||
|
||||
1. new items always have the newest dates
|
||||
|
||||
@ -7749,11 +7754,15 @@ PART 4: COMMANDS
|
||||
|
||||
Note, import avoids reprocessing the same dates across successive runs,
|
||||
but it does not detect transactions that are duplicated within a single
|
||||
run. So eg if you downloaded but did not import bank.1.csv, and later
|
||||
downloaded bank.2.csv with overlapping data, you should not import both
|
||||
of them in a single run (hledger import bank.1.csv bank.2.csv); in-
|
||||
stead, import them one at a time (hledger import bank.1.csv, then
|
||||
hledger import bank.2.csv).
|
||||
run. I'll call these "skipping" and "deduplication".
|
||||
|
||||
So for example, say you downloaded but did not import bank.1.csv, and
|
||||
later downloaded bank.2.csv with overlapping data. Then you should not
|
||||
import both of them at once (hledger import bank.1.csv bank.2.csv), as
|
||||
the overlapping data would appear twice and not be deduplicated. In-
|
||||
stead, import them one at a time (hledger import bank.1.csv; hledger
|
||||
import bank.2.csv), and the second import will skip the overlapping
|
||||
data.
|
||||
|
||||
Normally you can ignore the .latest.* files, but if needed, you can
|
||||
delete them (to make all transactions unseen), or construct/modify them
|
||||
@ -7763,7 +7772,7 @@ PART 4: COMMANDS
|
||||
ring on that date".
|
||||
|
||||
(hledger print --new also uses and updates these .latest.* files, but
|
||||
it is not often used.)
|
||||
it is less often used.)
|
||||
|
||||
Related: CSV > Working with CSV > Deduplicating, importing.
|
||||
|
||||
|
||||