From 077e3c6a0217e7b60aa429c98094b87125f6a199 Mon Sep 17 00:00:00 2001
From: Simon Michael
Date: Thu, 14 May 2015 12:50:32 -0700
Subject: [PATCH] journal: re-add non-regex aliases, as default (#252)

The regex account aliases added in 0.24 trip up people switching between
hledger and Ledger. (Also they are currently slow).

This change makes the old non-regex aliases the default; they are
unsurprising, useful, and pretty close in functionality to Ledger's.

The new regex aliases are also available; they must be enclosed
in forward slashes. Ledger effectively ignores these, which is ok.

Also clarify docs, refactor, and use the same parser for alias
directives and alias options
---
 doc/manual.md                             | 98 ++++++++++++++---------
 hledger-lib/Hledger/Data/Journal.hs       |  6 +-
 hledger-lib/Hledger/Data/Posting.hs       | 27 ++++---
 hledger-lib/Hledger/Data/Types.hs         | 11 ++-
 hledger-lib/Hledger/Read.hs               |  1 +
 hledger-lib/Hledger/Read/JournalReader.hs | 34 ++++++--
 hledger/Hledger/Cli/Options.hs            | 14 +---
 tests/misc/aliases.test                   | 62 ++++++++++----
 8 files changed, 167 insertions(+), 86 deletions(-)

diff --git a/doc/manual.md b/doc/manual.md
index 84d370fcb..a8608c029 100644
--- a/doc/manual.md
+++ b/doc/manual.md
@@ -460,55 +460,77 @@ In [tag queries](manual#queries), remember the tag name must match exactly, whil
 
 ##### Account aliases
 
-You can define account aliases to rewrite account names. For a quick example,
-see [How to use account aliases](how-to-use-account-aliases.html).
+You can define aliases which rewrite your account names (after reading the journal,
+before generating reports). hledger's account aliases can be useful for:
 
-In hledger, this feature is quite powerful and requires a little care.
-It can be used for
+- expanding shorthand account names to their full form, allowing easier data entry and a less verbose journal
+- adapting old journals to your current chart of accounts
+- experimenting with new account organisations, like a new hierarchy or combining two accounts into one
+- customising reports
 
-- expanding shorthand account names to their full form, so your entries require less typing
-- adjusting old data to match your current chart of accounts, which tends to change over time
-- experimenting with new account organisations
-- massaging reports, both cosmetic changes and deeper ones ("combine these separate accounts into one")
+See also [How to use account aliases](how-to-use-account-aliases.html).
 
-An account alias can be defined on the command line:
-```shell
-$ hledger --alias 'REGEX=REPLACEMENT' balance
-```
-or with a directive in the journal file:
-```
-alias REGEX = REPLACEMENT
-```
+###### Basic aliases
+
+To set an account alias, use the `alias` directive in your journal file.
+This affects all subsequent journal entries in the current file or its
+[included files](#including-other-files).
+The spaces around the = are optional:
+
+    alias OLD = NEW
+
+Or, you can use the `--alias` option on the command line.
+This affects all entries. It's useful for trying out aliases interactively:
+
+    --alias 'OLD=NEW'
+
+OLD and NEW are full account names.
+hledger will replace any occurrence of the old account name with the
+new one. Subaccounts are also affected. Eg:
+
+    alias checking = assets:bank:wells fargo:checking
+    # rewrites "checking" to "assets:bank:wells fargo:checking", or "checking:a" to "assets:bank:wells fargo:checking:a"
+
+###### Regex aliases
+
+There is also a more powerful variant that uses a regular expression,
+indicated by the forward slashes. (This was the default behaviour in hledger 0.24-0.25):
+
+    alias /REGEX/ = REPLACEMENT
+
+or:
+
+    --alias '/REGEX/=REPLACEMENT'
+
+
+REGEX is a case-insensitive regular expression. Anywhere it matches
+inside an account name, the matched part will be replaced by
+REPLACEMENT.
+If REGEX contains parenthesised match groups, these can be referenced
+by the usual numeric backreferences in REPLACEMENT.
+Note, currently regular expression aliases may cause noticeable slow-downs.
+(And if you use Ledger on your hledger file, they will be ignored.)
 Eg:
 
-    alias ^expenses = equity:draw:personal
+    alias /^(.+):bank:([^:]+)(.*)/ = \1:\2 \3
+    # rewrites "assets:bank:wells fargo:checking" to "assets:wells fargo checking"
 
-Spaces around the = are optional and ignored.
-You can define as many aliases as you like.
+###### Multiple aliases
 
-Each alias is tested against each account name as those are read from the journal.
-When REGEX (a case-insensitive regular expression) matches
-anywhere within the account name, the matched part is replaced by
-REPLACEMENT.
-An alias can replace multiple matches in one account name.
-REGEX can contain parenthesised match groups, and REPLACEMENT can
-include these with a numeric backreference (like `\1`).
+You can define as many aliases as you like using directives or command-line options.
+Aliases are recursive - each alias sees the result of applying previous ones.
+(This is different from Ledger, where aliases are non-recursive by default).
+Aliases are applied in the following order:
 
-An alias becomes active when it is read, and affects all entries
-read after it. It will also affect the entries of any files [included](#including-other-files)
-after it. It will not affect a parent file (aliases do not "leak"
-upward). To forget all aliases defined to this point, use this
-directive:
+1. alias directives, most recently seen first (recent directives take precedence over earlier ones; directives not yet seen are ignored)
+2. alias options, in the order they appear on the command line
+
+###### end aliases
+
+You can clear (forget) all currently defined aliases with the `end aliases` directive:
 
     end aliases
 
-Active aliases are applied in the order they were defined, and are
-cumulative (each alias sees the result of applying the previous ones).
-
-Account aliases changed significantly in hledger 0.24 and are
-currently somewhat incompatible with Ledger's aliases, which do not
-use regular expressions. They can also hurt performance.
-
 ##### Default commodity
 
 You can set a default commodity, to be used for amounts without one.
diff --git a/hledger-lib/Hledger/Data/Journal.hs b/hledger-lib/Hledger/Data/Journal.hs
index 0fdda868f..9f8d6b45b 100644
--- a/hledger-lib/Hledger/Data/Journal.hs
+++ b/hledger-lib/Hledger/Data/Journal.hs
@@ -392,10 +392,10 @@ journalApplyAliases aliases j@Journal{jtxns=ts} =
   -- else (dbgtrace $
   --       "applying additional command-line aliases:\n"
   --       ++ chomp (unlines $ map (" "++) $ lines $ ppShow aliases))) $
-  j{jtxns=map fixtransaction ts}
+  j{jtxns=map dotransaction ts}
     where
-      fixtransaction t@Transaction{tpostings=ps} = t{tpostings=map fixposting ps}
-      fixposting p@Posting{paccount=a} = p{paccount=accountNameApplyAliases aliases a}
+      dotransaction t@Transaction{tpostings=ps} = t{tpostings=map doposting ps}
+      doposting p@Posting{paccount=a} = p{paccount= accountNameApplyAliases aliases a}
 
 -- | Do post-parse processing on a journal to make it ready for use: check
 -- all transactions balance, canonicalise amount formats, close any open
diff --git a/hledger-lib/Hledger/Data/Posting.hs b/hledger-lib/Hledger/Data/Posting.hs
index ca1c20883..41faefa12 100644
--- a/hledger-lib/Hledger/Data/Posting.hs
+++ b/hledger-lib/Hledger/Data/Posting.hs
@@ -37,7 +37,6 @@ module Hledger.Data.Posting (
   joinAccountNames,
   concatAccountNames,
   accountNameApplyAliases,
-  accountNameApplyOneAlias,
   -- * arithmetic
   sumPostings,
   -- * rendering
@@ -219,22 +218,26 @@ concatAccountNames :: [AccountName] -> AccountName
 concatAccountNames as = accountNameWithPostingType t $ intercalate ":" $ map accountNameWithoutPostingType as
     where t = headDef RegularPosting $ filter (/= RegularPosting) $ map accountNamePostingType as
 
--- | Rewrite an account name using all applicable aliases from the given list, in sequence.
+-- | Rewrite an account name using all matching aliases from the given list, in sequence.
+-- Each alias sees the result of applying the previous aliases.
 accountNameApplyAliases :: [AccountAlias] -> AccountName -> AccountName
 accountNameApplyAliases aliases a = accountNameWithPostingType atype aname'
   where
     (aname,atype) = (accountNameWithoutPostingType a, accountNamePostingType a)
-    matchingaliases = filter (\(re,_) -> regexMatchesCI re aname) aliases
-    aname' = foldl (flip (uncurry regexReplaceCI)) aname matchingaliases
+    aname' = foldl
+             (\acct alias -> dbg6 "got" $ aliasReplace (dbg6 "alias" alias) acct)
+             aname
+             aliases
+
+-- aliasMatches :: AccountAlias -> AccountName -> Bool
+-- aliasMatches (BasicAlias old _) a = old `isAccountNamePrefixOf` a
+-- aliasMatches (RegexAlias re _) a = regexMatchesCI re a
+
+aliasReplace :: AccountAlias -> AccountName -> AccountName
+aliasReplace (BasicAlias old new) a | old `isAccountNamePrefixOf` a = new ++ drop (length old) a
+                                    | otherwise = a
+aliasReplace (RegexAlias re repl) a = regexReplaceCI re repl a
 
--- | Rewrite an account name using the first applicable alias from the given list, if any.
-accountNameApplyOneAlias :: [AccountAlias] -> AccountName -> AccountName
-accountNameApplyOneAlias aliases a = accountNameWithPostingType atype aname'
-  where
-    (aname,atype) = (accountNameWithoutPostingType a, accountNamePostingType a)
-    firstmatchingalias = headDef Nothing $ map Just $ filter (\(re,_) -> regexMatchesCI re aname) aliases
-    applyAlias = uncurry regexReplaceCI
-    aname' = maybe id applyAlias firstmatchingalias $ aname
 
 tests_Hledger_Data_Posting = TestList [
 
diff --git a/hledger-lib/Hledger/Data/Types.hs b/hledger-lib/Hledger/Data/Types.hs
index a35575e70..57d906705 100644
--- a/hledger-lib/Hledger/Data/Types.hs
+++ b/hledger-lib/Hledger/Data/Types.hs
@@ -48,7 +48,16 @@ data Interval = NoInterval
 
 type AccountName = String
 
-type AccountAlias = (Regexp,Replacement)
+data AccountAlias = BasicAlias AccountName AccountName
+                  | RegexAlias Regexp Replacement
+  deriving (
+  Eq
+  ,Read
+  ,Show
+  ,Ord
+  ,Data
+  ,Typeable
+  )
 
 data Side = L | R deriving (Eq,Show,Read,Ord,Typeable,Data)
 
diff --git a/hledger-lib/Hledger/Read.hs b/hledger-lib/Hledger/Read.hs
index a027b1a7e..555d34ce9 100644
--- a/hledger-lib/Hledger/Read.hs
+++ b/hledger-lib/Hledger/Read.hs
@@ -25,6 +25,7 @@ module Hledger.Read (
        mamountp',
        numberp,
        codep,
+       accountaliasp,
        -- * Tests
        samplejournal,
        tests_Hledger_Read,
diff --git a/hledger-lib/Hledger/Read/JournalReader.hs b/hledger-lib/Hledger/Read/JournalReader.hs
index 8ba49fc3a..039d58289 100644
--- a/hledger-lib/Hledger/Read/JournalReader.hs
+++ b/hledger-lib/Hledger/Read/JournalReader.hs
@@ -37,7 +37,8 @@ module Hledger.Read.JournalReader (
   mamountp',
   numberp,
   emptyorcommentlinep,
-  followingcommentp
+  followingcommentp,
+  accountaliasp
 #ifdef TESTS
   -- * Tests
   -- disabled by default, HTF not available on windows
@@ -243,13 +244,34 @@ aliasdirective :: ParsecT [Char] JournalContext (ExceptT String IO) JournalUpdat
 aliasdirective = do
   string "alias"
   many1 spacenonewline
-  orig <- many1 $ noneOf "="
-  char '='
-  alias <- restofline
-  addAccountAlias (accountNameWithoutPostingType $ strip orig
-                  ,accountNameWithoutPostingType $ strip alias)
+  alias <- accountaliasp
+  addAccountAlias alias
   return $ return id
 
+accountaliasp :: Stream [Char] m Char => ParsecT [Char] st m AccountAlias
+accountaliasp = regexaliasp <|> basicaliasp
+
+basicaliasp :: Stream [Char] m Char => ParsecT [Char] st m AccountAlias
+basicaliasp = do
+  -- pdbg 0 "basicaliasp"
+  old <- rstrip <$> (many1 $ noneOf "=")
+  char '='
+  many spacenonewline
+  new <- rstrip <$> anyChar `manyTill` eolof  -- don't require a final newline, good for cli options
+  return $ BasicAlias old new
+
+regexaliasp :: Stream [Char] m Char => ParsecT [Char] st m AccountAlias
+regexaliasp = do
+  -- pdbg 0 "regexaliasp"
+  char '/'
+  re <- many1 $ noneOf "/\n\r" -- paranoid: don't try to read past line end
+  char '/'
+  many spacenonewline
+  char '='
+  many spacenonewline
+  repl <- rstrip <$> anyChar `manyTill` eolof
+  return $ RegexAlias re repl
+
 endaliasesdirective :: ParsecT [Char] JournalContext (ExceptT String IO) JournalUpdate
 endaliasesdirective = do
   string "end aliases"
diff --git a/hledger/Hledger/Cli/Options.hs b/hledger/Hledger/Cli/Options.hs
index 97b4a6179..01018f9b3 100644
--- a/hledger/Hledger/Cli/Options.hs
+++ b/hledger/Hledger/Cli/Options.hs
@@ -360,17 +360,9 @@ getCliOpts mode' = do
 -- CliOpts accessors
 
 -- | Get the account name aliases from options, if any.
-aliasesFromOpts :: CliOpts -> [(AccountName,AccountName)]
-aliasesFromOpts = map parseAlias . alias_
-    where
-      -- similar to ledgerAlias
-      parseAlias :: String -> (AccountName,AccountName)
-      parseAlias s = (accountNameWithoutPostingType $ strip orig
-                     ,accountNameWithoutPostingType $ strip alias')
-          where
-            (orig, alias) = break (=='=') s
-            alias' = case alias of ('=':rest) -> rest
-                                   _ -> orig
+aliasesFromOpts :: CliOpts -> [AccountAlias]
+aliasesFromOpts = map (\a -> fromparse $ runParser accountaliasp () ("--alias "++quoteIfNeeded a) a)
+                  . alias_
 
 -- | Get the (tilde-expanded, absolute) journal file path from
 -- 1. options, 2. an environment variable, or 3. the default.
diff --git a/tests/misc/aliases.test b/tests/misc/aliases.test
index eb5e01e72..79d90a36f 100644
--- a/tests/misc/aliases.test
+++ b/tests/misc/aliases.test
@@ -1,22 +1,54 @@
 # alias-related tests
 
-# 1. alias directive. The pattern is a case-insensitive regular
-# expression matching anywhere in the account name. All matching
-# aliases will be applied to an account name in turn, most recently
-# declared first. The replacement can replace multiple matches within
-# the account name. The replacement pattern supports numeric
-# backreferences.
+# . simple alias directive
+hledgerdev -f- accounts
+<<<
+alias checking = assets:bank:checking
+1/1
+    (checking:a)  1
+>>>
+assets:bank:checking:a
+>>>=0
+
+# . simple alias matches whole account name components only
+hledgerdev -f- accounts
+<<<
+alias a:b = A:B
+1/1
+    (a:b:c)  1  ; should match this
+1/1
+    (a:bb:d)  1  ; should not match this
+>>>
+A:B:c
+a:bb:d
+>>>=0
+
+# . regex alias directive
+hledgerdev -f- accounts
+<<<
+alias /^(.+):bank:([^:]+):?(.*)/ = \1:\2 \3
+1/1
+    (assets:bank:B:checking:a)  1
+>>>
+assets:B checking:a
+>>>=0
+
+# . regex alias pattern is a case-insensitive regular expression
+# matching anywhere in the account name. All matching aliases are
+# applied to an account name in turn, most recently seen first. The
+# replacement can replace multiple matches within the account name.
+# The replacement pattern supports numeric backreferences.
 #
 hledgerdev -f- print
 <<<
-alias a=b
+alias /a/ = b
 
 2011/01/01
     A a  1
     a a  2
     c
 
-alias A (.)=\1
+alias /A (.)/=\1
 
 2011/01/01
     A a  1
@@ -36,10 +68,10 @@ alias A (.)=\1
 
 >>>=0
 
-# 2. command-line --alias option. These are applied in the order
-# written. Spaces are allowed if quoted.
+# . --alias command-line options are applied in the order written.
+# Spaces are allowed if quoted.
 #
-hledgerdev -f- print --alias 'A (.)=a' --alias a=b
+hledgerdev -f- print --alias '/A (.)/=a' --alias /a/=b
 <<<
 2011/01/01
     a a  1
@@ -54,12 +86,12 @@ hledgerdev -f- print --alias 'A (.)=a' --alias a=b
 
 >>>=0
 
-# 3. Alias options run after alias directives.
+# . alias options are applied after alias directives.
 #
-hledgerdev -f- print --alias a=A --alias B=C --alias B=D --alias C=D
+hledgerdev -f- print --alias /a/=A --alias /B/=C --alias /B/=D --alias /C/=D
 <<<
-alias ^a=B
-alias ^a=E
+alias /^a/=B
+alias /^a/=E
 alias E=F
 
 2011/01/01