journal: re-add non-regex aliases, as default (#252)

The regex account aliases added in 0.24 trip up people switching between
hledger and Ledger. They are also currently slow.

This change makes the old non-regex aliases the default; they are
unsurprising, useful, and pretty close in functionality to Ledger's.

The new regex aliases are also available; they must be enclosed in
forward slashes. Ledger effectively ignores these, which is ok.

Also clarify docs, refactor, and use the same parser for alias
directives and alias options.
Simon Michael 2015-05-14 12:50:32 -07:00
parent 70d87613f2
commit 077e3c6a02
8 changed files with 167 additions and 86 deletions

@@ -460,55 +460,77 @@ In [tag queries](manual#queries), remember the tag name must match exactly, whil
##### Account aliases ##### Account aliases
You can define account aliases to rewrite account names. For a quick example, You can define aliases which rewrite your account names (after reading the journal,
see [How to use account aliases](how-to-use-account-aliases.html). before generating reports). hledger's account aliases can be useful for:
In hledger, this feature is quite powerful and requires a little care. - expanding shorthand account names to their full form, allowing easier data entry and a less verbose journal
It can be used for - adapting old journals to your current chart of accounts
- experimenting with new account organisations, like a new hierarchy or combining two accounts into one
- customising reports
- expanding shorthand account names to their full form, so your entries require less typing See also [How to use account aliases](how-to-use-account-aliases.html).
- adjusting old data to match your current chart of accounts, which tends to change over time
- experimenting with new account organisations
- massaging reports, both cosmetic changes and deeper ones ("combine these separate accounts into one")
An account alias can be defined on the command line: ###### Basic aliases
```shell
$ hledger --alias 'REGEX=REPLACEMENT' balance To set an account alias, use the `alias` directive in your journal file.
``` This affects all subsequent journal entries in the current file or its
or with a directive in the journal file: [included files](#including-other-files).
``` The spaces around the = are optional:
alias REGEX = REPLACEMENT
``` alias OLD = NEW
Or, you can use the `--alias` option on the command line.
This affects all entries. It's useful for trying out aliases interactively:
--alias 'OLD=NEW'
OLD and NEW are full account names.
hledger will replace any occurrence of the old account name with the
new one. Subaccounts are also affected. Eg:
alias checking = assets:bank:wells fargo:checking
# rewrites "checking" to "assets:bank:wells fargo:checking", or "checking:a" to "assets:bank:wells fargo:checking:a"
###### Regex aliases
There is also a more powerful variant that uses a regular expression,
indicated by the forward slashes. (This was the default behaviour in hledger 0.24-0.25):
alias /REGEX/ = REPLACEMENT
or:
--alias '/REGEX/=REPLACEMENT'
<!-- (Can also be written `'/REGEX/REPLACEMENT/'`). -->
REGEX is a case-insensitive regular expression. Anywhere it matches
inside an account name, the matched part will be replaced by
REPLACEMENT.
If REGEX contains parenthesised match groups, these can be referenced
by the usual numeric backreferences in REPLACEMENT.
Note, currently regular expression aliases may cause noticeable slow-downs.
(And if you use Ledger on your hledger file, they will be ignored.)
Eg: Eg:
alias ^expenses = equity:draw:personal alias /^(.+):bank:([^:]+)(.*)/ = \1:\2 \3
# rewrites "assets:bank:wells fargo:checking" to "assets:wells fargo checking"
Spaces around the = are optional and ignored. ###### Multiple aliases
You can define as many aliases as you like.
Each alias is tested against each account name as those are read from the journal. You can define as many aliases as you like using directives or command-line options.
When REGEX (a case-insensitive regular expression) matches Aliases are recursive - each alias sees the result of applying previous ones.
anywhere within the account name, the matched part is replaced by (This is different from Ledger, where aliases are non-recursive by default).
REPLACEMENT. Aliases are applied in the following order:
An alias can replace multiple matches in one account name.
REGEX can contain parenthesised match groups, and REPLACEMENT can
include these with a numeric backreference (like `\1`).
An alias becomes active when it is read, and affects all entries 1. alias directives, most recently seen first (recent directives take precedence over earlier ones; directives not yet seen are ignored)
read after it. It will also affect the entries of any files [included](#including-other-files) 2. alias options, in the order they appear on the command line
after it. It will not affect a parent file (aliases do not "leak"
upward). To forget all aliases defined to this point, use this ###### end aliases
directive:
You can clear (forget) all currently defined aliases with the `end aliases` directive:
end aliases end aliases
Active aliases are applied in the order they were defined, and are
cumulative (each alias sees the result of applying the previous ones).
Account aliases changed significantly in hledger 0.24 and are
currently somewhat incompatible with Ledger's aliases, which do not
use regular expressions. They can also hurt performance.
##### Default commodity ##### Default commodity
You can set a default commodity, to be used for amounts without one. You can set a default commodity, to be used for amounts without one.
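
The basic-alias rule documented in this hunk (OLD matches a whole account name, or a leading run of whole name components, so subaccounts are rewritten too) can be sketched in base-only Haskell. This is a minimal illustration based on the manual text and the tests in this commit, not hledger's actual code; `isWholeComponentPrefixOf` and `applyBasicAlias` are hypothetical names, and `:` is assumed to be the account separator.

```haskell
import Data.List (isPrefixOf)

type AccountName = String

-- | True when @old@ is the account itself or one of its ancestors,
-- i.e. it matches on whole ":"-separated components only.
-- (Hypothetical helper, not hledger's isAccountNamePrefixOf.)
isWholeComponentPrefixOf :: AccountName -> AccountName -> Bool
old `isWholeComponentPrefixOf` a = a == old || (old ++ ":") `isPrefixOf` a

-- | Rewrite the leading @old@ part of an account name to @new@,
-- leaving any subaccount suffix in place. Non-matching names are unchanged.
applyBasicAlias :: (AccountName, AccountName) -> AccountName -> AccountName
applyBasicAlias (old, new) a
  | old `isWholeComponentPrefixOf` a = new ++ drop (length old) a
  | otherwise                        = a
```

With the alias `a:b = A:B`, this sketch rewrites `a:b:c` but leaves `a:bb:d` alone, which is the behaviour the "whole account name components only" test in this commit expects.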

@@ -392,10 +392,10 @@ journalApplyAliases aliases j@Journal{jtxns=ts} =
-- else (dbgtrace $ -- else (dbgtrace $
-- "applying additional command-line aliases:\n" -- "applying additional command-line aliases:\n"
-- ++ chomp (unlines $ map (" "++) $ lines $ ppShow aliases))) $ -- ++ chomp (unlines $ map (" "++) $ lines $ ppShow aliases))) $
j{jtxns=map fixtransaction ts} j{jtxns=map dotransaction ts}
where where
fixtransaction t@Transaction{tpostings=ps} = t{tpostings=map fixposting ps} dotransaction t@Transaction{tpostings=ps} = t{tpostings=map doposting ps}
fixposting p@Posting{paccount=a} = p{paccount=accountNameApplyAliases aliases a} doposting p@Posting{paccount=a} = p{paccount= accountNameApplyAliases aliases a}
-- | Do post-parse processing on a journal to make it ready for use: check -- | Do post-parse processing on a journal to make it ready for use: check
-- all transactions balance, canonicalise amount formats, close any open -- all transactions balance, canonicalise amount formats, close any open

@@ -37,7 +37,6 @@ module Hledger.Data.Posting (
joinAccountNames, joinAccountNames,
concatAccountNames, concatAccountNames,
accountNameApplyAliases, accountNameApplyAliases,
accountNameApplyOneAlias,
-- * arithmetic -- * arithmetic
sumPostings, sumPostings,
-- * rendering -- * rendering
@@ -219,22 +218,26 @@ concatAccountNames :: [AccountName] -> AccountName
concatAccountNames as = accountNameWithPostingType t $ intercalate ":" $ map accountNameWithoutPostingType as concatAccountNames as = accountNameWithPostingType t $ intercalate ":" $ map accountNameWithoutPostingType as
where t = headDef RegularPosting $ filter (/= RegularPosting) $ map accountNamePostingType as where t = headDef RegularPosting $ filter (/= RegularPosting) $ map accountNamePostingType as
-- | Rewrite an account name using all applicable aliases from the given list, in sequence. -- | Rewrite an account name using all matching aliases from the given list, in sequence.
-- Each alias sees the result of applying the previous aliases.
accountNameApplyAliases :: [AccountAlias] -> AccountName -> AccountName accountNameApplyAliases :: [AccountAlias] -> AccountName -> AccountName
accountNameApplyAliases aliases a = accountNameWithPostingType atype aname' accountNameApplyAliases aliases a = accountNameWithPostingType atype aname'
where where
(aname,atype) = (accountNameWithoutPostingType a, accountNamePostingType a) (aname,atype) = (accountNameWithoutPostingType a, accountNamePostingType a)
matchingaliases = filter (\(re,_) -> regexMatchesCI re aname) aliases aname' = foldl
aname' = foldl (flip (uncurry regexReplaceCI)) aname matchingaliases (\acct alias -> dbg6 "got" $ aliasReplace (dbg6 "alias" alias) acct)
aname
aliases
-- aliasMatches :: AccountAlias -> AccountName -> Bool
-- aliasMatches (BasicAlias old _) a = old `isAccountNamePrefixOf` a
-- aliasMatches (RegexAlias re _) a = regexMatchesCI re a
aliasReplace :: AccountAlias -> AccountName -> AccountName
aliasReplace (BasicAlias old new) a | old `isAccountNamePrefixOf` a = new ++ drop (length old) a
| otherwise = a
aliasReplace (RegexAlias re repl) a = regexReplaceCI re repl a
-- | Rewrite an account name using the first applicable alias from the given list, if any.
accountNameApplyOneAlias :: [AccountAlias] -> AccountName -> AccountName
accountNameApplyOneAlias aliases a = accountNameWithPostingType atype aname'
where
(aname,atype) = (accountNameWithoutPostingType a, accountNamePostingType a)
firstmatchingalias = headDef Nothing $ map Just $ filter (\(re,_) -> regexMatchesCI re aname) aliases
applyAlias = uncurry regexReplaceCI
aname' = maybe id applyAlias firstmatchingalias $ aname
tests_Hledger_Data_Posting = TestList [ tests_Hledger_Data_Posting = TestList [
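
The "each alias sees the result of applying the previous aliases" behaviour of the new `accountNameApplyAliases` amounts to a left fold over the alias list. The following is a simplified, base-only sketch of that idea: it models only basic aliases (the regex case is left out because it needs a regex library), it ignores posting types and debug tracing, and `applyAliases` is a hypothetical name rather than the function in this commit.

```haskell
import Data.List (foldl', isPrefixOf)

type AccountName = String

-- Simplified alias type: only the basic (non-regex) form is modelled here.
data AccountAlias = BasicAlias AccountName AccountName
  deriving (Eq, Show)

-- | Apply one alias; names that do not match are returned unchanged.
-- Matching is assumed to be on whole ":"-separated components.
aliasReplace :: AccountAlias -> AccountName -> AccountName
aliasReplace (BasicAlias old new) a
  | a == old || (old ++ ":") `isPrefixOf` a = new ++ drop (length old) a
  | otherwise                               = a

-- | Apply every alias in list order, feeding each result to the next alias.
applyAliases :: [AccountAlias] -> AccountName -> AccountName
applyAliases aliases a = foldl' (flip aliasReplace) a aliases
```

Because the fold threads the rewritten name through, list order matters: `[BasicAlias "a" "b", BasicAlias "b" "c"]` turns `a:x` into `c:x`, while the reversed list stops at `b:x`.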

@@ -48,7 +48,16 @@ data Interval = NoInterval
type AccountName = String type AccountName = String
type AccountAlias = (Regexp,Replacement) data AccountAlias = BasicAlias AccountName AccountName
| RegexAlias Regexp Replacement
deriving (
Eq
,Read
,Show
,Ord
,Data
,Typeable
)
data Side = L | R deriving (Eq,Show,Read,Ord,Typeable,Data) data Side = L | R deriving (Eq,Show,Read,Ord,Typeable,Data)

@@ -25,6 +25,7 @@ module Hledger.Read (
mamountp', mamountp',
numberp, numberp,
codep, codep,
accountaliasp,
-- * Tests -- * Tests
samplejournal, samplejournal,
tests_Hledger_Read, tests_Hledger_Read,

@@ -37,7 +37,8 @@ module Hledger.Read.JournalReader (
mamountp', mamountp',
numberp, numberp,
emptyorcommentlinep, emptyorcommentlinep,
followingcommentp followingcommentp,
accountaliasp
#ifdef TESTS #ifdef TESTS
-- * Tests -- * Tests
-- disabled by default, HTF not available on windows -- disabled by default, HTF not available on windows
@@ -243,13 +244,34 @@ aliasdirective :: ParsecT [Char] JournalContext (ExceptT String IO) JournalUpdat
aliasdirective = do aliasdirective = do
string "alias" string "alias"
many1 spacenonewline many1 spacenonewline
orig <- many1 $ noneOf "=" alias <- accountaliasp
char '=' addAccountAlias alias
alias <- restofline
addAccountAlias (accountNameWithoutPostingType $ strip orig
,accountNameWithoutPostingType $ strip alias)
return $ return id return $ return id
accountaliasp :: Stream [Char] m Char => ParsecT [Char] st m AccountAlias
accountaliasp = regexaliasp <|> basicaliasp
basicaliasp :: Stream [Char] m Char => ParsecT [Char] st m AccountAlias
basicaliasp = do
-- pdbg 0 "basicaliasp"
old <- rstrip <$> (many1 $ noneOf "=")
char '='
many spacenonewline
new <- rstrip <$> anyChar `manyTill` eolof -- don't require a final newline, good for cli options
return $ BasicAlias old new
regexaliasp :: Stream [Char] m Char => ParsecT [Char] st m AccountAlias
regexaliasp = do
-- pdbg 0 "regexaliasp"
char '/'
re <- many1 $ noneOf "/\n\r" -- paranoid: don't try to read past line end
char '/'
many spacenonewline
char '='
many spacenonewline
repl <- rstrip <$> anyChar `manyTill` eolof
return $ RegexAlias re repl
endaliasesdirective :: ParsecT [Char] JournalContext (ExceptT String IO) JournalUpdate endaliasesdirective :: ParsecT [Char] JournalContext (ExceptT String IO) JournalUpdate
endaliasesdirective = do endaliasesdirective = do
string "end aliases" string "end aliases"
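
The two alias syntaxes that `accountaliasp` distinguishes (`/REGEX/ = REPLACEMENT` versus `OLD = NEW`) can also be shown without Parsec. The sketch below is a hand-rolled, base-only approximation for a single line of input: `parseAliasSketch` is a hypothetical name, the regex is kept as an uncompiled string, and edge cases are handled only roughly compared with the real parsers.

```haskell
import Data.Char (isSpace)
import Data.List (dropWhileEnd)

-- Simplified alias type; the regex is kept as a plain string here.
data AccountAlias
  = BasicAlias String String
  | RegexAlias String String
  deriving (Eq, Show)

rstrip, lstrip :: String -> String
rstrip = dropWhileEnd isSpace
lstrip = dropWhile isSpace

-- | Parse the text after "alias " (or the --alias option value).
-- A leading '/' selects the regex form; anything else is a basic alias.
parseAliasSketch :: String -> Maybe AccountAlias
parseAliasSketch ('/' : rest) =
  case break (== '/') rest of
    (re@(_ : _), '/' : afterRe) ->
      case lstrip afterRe of
        '=' : repl -> Just (RegexAlias re (rstrip (lstrip repl)))
        _          -> Nothing
    _ -> Nothing
parseAliasSketch s =
  case break (== '=') s of
    (old@(_ : _), '=' : new) -> Just (BasicAlias (rstrip old) (rstrip (lstrip new)))
    _                        -> Nothing
```

Using one function for both the directive and the command-line option mirrors the commit's stated goal of sharing a single parser between the two.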

@@ -360,17 +360,9 @@ getCliOpts mode' = do
-- CliOpts accessors -- CliOpts accessors
-- | Get the account name aliases from options, if any. -- | Get the account name aliases from options, if any.
aliasesFromOpts :: CliOpts -> [(AccountName,AccountName)] aliasesFromOpts :: CliOpts -> [AccountAlias]
aliasesFromOpts = map parseAlias . alias_ aliasesFromOpts = map (\a -> fromparse $ runParser accountaliasp () ("--alias "++quoteIfNeeded a) a)
where . alias_
-- similar to ledgerAlias
parseAlias :: String -> (AccountName,AccountName)
parseAlias s = (accountNameWithoutPostingType $ strip orig
,accountNameWithoutPostingType $ strip alias')
where
(orig, alias) = break (=='=') s
alias' = case alias of ('=':rest) -> rest
_ -> orig
-- | Get the (tilde-expanded, absolute) journal file path from -- | Get the (tilde-expanded, absolute) journal file path from
-- 1. options, 2. an environment variable, or 3. the default. -- 1. options, 2. an environment variable, or 3. the default.

@@ -1,22 +1,54 @@
# alias-related tests # alias-related tests
# 1. alias directive. The pattern is a case-insensitive regular # . simple alias directive
# expression matching anywhere in the account name. All matching hledgerdev -f- accounts
# aliases will be applied to an account name in turn, most recently <<<
# declared first. The replacement can replace multiple matches within alias checking = assets:bank:checking
# the account name. The replacement pattern supports numeric 1/1
# backreferences. (checking:a) 1
>>>
assets:bank:checking:a
>>>=0
# . simple alias matches whole account name components only
hledgerdev -f- accounts
<<<
alias a:b = A:B
1/1
(a:b:c) 1 ; should match this
1/1
(a:bb:d) 1 ; should not match this
>>>
A:B:c
a:bb:d
>>>=0
# . regex alias directive
hledgerdev -f- accounts
<<<
alias /^(.+):bank:([^:]+):?(.*)/ = \1:\2 \3
1/1
(assets:bank:B:checking:a) 1
>>>
assets:B checking:a
>>>=0
# . regex alias pattern is a case-insensitive regular expression
# matching anywhere in the account name. All matching aliases are
# applied to an account name in turn, most recently seen first. The
# replacement can replace multiple matches within the account name.
# The replacement pattern supports numeric backreferences.
# #
hledgerdev -f- print hledgerdev -f- print
<<< <<<
alias a=b alias /a/ = b
2011/01/01 2011/01/01
A a 1 A a 1
a a 2 a a 2
c c
alias A (.)=\1 alias /A (.)/=\1
2011/01/01 2011/01/01
A a 1 A a 1
@@ -36,10 +68,10 @@ alias A (.)=\1
>>>=0 >>>=0
# 2. command-line --alias option. These are applied in the order # . --alias command-line options are applied in the order written.
# written. Spaces are allowed if quoted. # Spaces are allowed if quoted.
# #
hledgerdev -f- print --alias 'A (.)=a' --alias a=b hledgerdev -f- print --alias '/A (.)/=a' --alias /a/=b
<<< <<<
2011/01/01 2011/01/01
a a 1 a a 1
@@ -54,12 +86,12 @@ hledgerdev -f- print --alias 'A (.)=a' --alias a=b
>>>=0 >>>=0
# 3. Alias options run after alias directives. # . alias options are applied after alias directives.
# #
hledgerdev -f- print --alias a=A --alias B=C --alias B=D --alias C=D hledgerdev -f- print --alias /a/=A --alias /B/=C --alias /B/=D --alias /C/=D
<<< <<<
alias ^a=B alias /^a/=B
alias ^a=E alias /^a/=E
alias E=F alias E=F
2011/01/01 2011/01/01