Edit Line-Pairing Rule

Use the controls on this sheet to edit the definition of a line-pairing rule. Please see the Pairing topic for further information about the purpose of line-pairing rules.

Description

Use this field to provide a succinct description of the line-pairing rule.

Expression

Provide the regular expression definition in this field. A summary of regular expression syntax may be found below.

The status text underneath the entry field indicates whether the provided regular expression is syntactically correct.

By default, when a line matches the regular expression, all of the matching characters are used for line-pairing. However, you can use sub-expressions so that only parts of the text matched by the regular expression are used to pair lines. See "Use sequences of characters matching these selected sub-expressions to pair lines" below for further information.

Example line

You can enter a sample line of text into this field to see which parts will be matched by the regular expression you have entered and thus be used for line-pairing.

Use sequences of characters matching these selected sub-expressions to pair lines

By default, Merge will use the entire sequence of characters that matches a regular expression to pair lines. You may, however, wish to use only part of the text that was matched.

For example, if you wished to pair C/C++ #define directives based on the identifier being defined, you could use an expression like this:

^\s*#define\s+([[:alpha:]_][[:digit:][:alpha:]_]*)

This expression contains a single sub-expression, marked by the enclosing parentheses, which matches the identifier being defined by the #define directive. Now consider the following example line when used with this regular expression:

#define SOME_IDENTIFIER123 200

The sub-expression list for this example line will show two entries. The first (All) is the sequence of characters (#define SOME_IDENTIFIER123) matched by the entire regular expression (i.e. the #define and the following identifier). The second (1) is the sequence of characters (SOME_IDENTIFIER123) matched by the sub-expression. You would likely therefore check the second item in the list (1), so that lines containing #define directives are paired (or not paired) based only upon their identifiers.
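The relationship between the All entry and the numbered entry can also be reproduced outside Merge. The following is a minimal sketch in Python, not Merge's own implementation. It assumes Python's re module, which does not understand POSIX classes such as [[:alpha:]], so ASCII ranges are substituted for them. The name pairing_keys is hypothetical and used only for illustration.

```python
import re

# Approximation of the expression above: Python's re module has no POSIX
# classes such as [[:alpha:]], so plain ASCII ranges are used instead.
DEFINE_RULE = re.compile(r"^\s*#define\s+([A-Za-z_][0-9A-Za-z_]*)")

def pairing_keys(line):
    """Return (all_text, sub_expression_1) for a line, or None if no match."""
    match = DEFINE_RULE.search(line)
    if match is None:
        return None
    # group(0) corresponds to the "All" entry, group(1) to entry "1".
    return match.group(0), match.group(1)

print(pairing_keys("#define SOME_IDENTIFIER123 200"))
# ('#define SOME_IDENTIFIER123', 'SOME_IDENTIFIER123')
```

Selecting only entry 1 corresponds to keeping the second element of the returned pair, so two #define lines with the same identifier but different values would yield the same text.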

In this example, these characters would be used to pair lines between files

This field displays the effect of applying the regular expression (taking into account any selected sub-expressions) to the example line you entered in the Example line field above. Only the displayed text would be considered when trying to pair a line with the same content as the example line.

Regular expression syntax

The regular expression syntax used by Araxis Merge is the same as that used by many applications in the UNIX operating system. Regular expressions can be used to search for sequences of characters within a piece of text. They consist of simple text that will be matched literally, and special characters that have a particular meaning.

The rest of this topic contains example regular expressions. For more comprehensive information, please see the Regular Expression Reference.

Simple matches

To match lines containing the text apple anywhere on the line (including inside longer words such as pineapple):

apple

To match lines containing only the word apple:

^apple$
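The difference between the two expressions above can be checked with a short script. This is a sketch using Python's re module as a stand-in for Merge's regular expression engine. The helper name matches is hypothetical.

```python
import re

CONTAINS_APPLE = re.compile(r"apple")    # text may appear anywhere on the line
ONLY_APPLE = re.compile(r"^apple$")      # the line must be exactly "apple"

def matches(rule, line):
    """Return True if the rule matches somewhere in the line."""
    return rule.search(line) is not None

for line in ["apple", "an apple a day", "pineapple", "pear"]:
    print(repr(line), matches(CONTAINS_APPLE, line), matches(ONLY_APPLE, line))
```

Only the first sample line satisfies the anchored expression. The first three satisfy the unanchored one.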

Matching whitespace

To match lines that are either completely empty, or that only contain whitespace (spaces and tab characters):

^[ \t]*$

Breakdown:

^ Match the start of the line.
[ \t]* Match zero or more space or tab (\t) characters.
$ Match the end of the line.
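As an illustration of the breakdown above, the following sketch applies the same expression with Python's re module. The sample lines and the name is_blank are assumptions made for the example. Merge's own engine may differ in details such as newline handling.

```python
import re

BLANK_LINE = re.compile(r"^[ \t]*$")

def is_blank(line):
    """Return True for lines that are empty or contain only spaces and tabs."""
    return BLANK_LINE.search(line) is not None

samples = ["", "   ", "\t \t", "  x  "]
print([is_blank(s) for s in samples])
# [True, True, True, False]
```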

Matching C++ comments

To match lines that contain only a C++ style comment (//, followed by any characters up to the end of the line), the following expression can be used:

^[ \t]*//.*$

Breakdown:

^ Match the start of the line.
[ \t]* Match zero or more space or tab (\t) characters.
// Match two consecutive / characters.
.* Match zero or more occurrences of any character.
$ Match the end of the line.
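The effect of the leading ^[ \t]* can be seen by trying the expression on a few lines. This sketch uses Python's re module rather than Merge itself. The name is_comment_only is hypothetical.

```python
import re

COMMENT_ONLY = re.compile(r"^[ \t]*//.*$")

def is_comment_only(line):
    """Return True if the line holds nothing but an optional indent and a // comment."""
    return COMMENT_ONLY.search(line) is not None

print(is_comment_only("    // initialise the counter"))  # True
print(is_comment_only("//"))                             # True
print(is_comment_only("int count = 0; // counter"))      # False: code precedes the comment
```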

Matching source code control keywords

Some version control products allow special keywords to be inserted into text files. Subversion, for example, will expand the text $Date$ so that it contains the date and time of the last check-in. When comparing different revisions of a file, lines containing these keywords will almost always be different and can be ignored. An expression to ignore the Date keyword when it appears in C++ comment lines follows:

^[ \t]*//.*\$Date:.*\$.*$

Breakdown:

^ Match the start of the line.
[ \t]* Match zero or more space or tab (\t) characters.
// Match two consecutive / characters.
.* Match zero or more occurrences of any character.
\$ Match the character $, not the end of line. Putting \ before a character means that the character is treated as a literal. Any special meaning it might have had in a regular expression is removed.
Date: Match Date:
.* Match zero or more occurrences of any character.
\$ Match the literal character $.
.* Match zero or more occurrences of any character.
$ Match the end of the line.
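To see which lines the Date expression accepts, it can be tried against a few samples. The sketch below uses Python's re module as an approximation of Merge's engine. The sample lines and the name has_date_keyword are invented for illustration.

```python
import re

DATE_KEYWORD = re.compile(r"^[ \t]*//.*\$Date:.*\$.*$")

def has_date_keyword(line):
    """Return True for comment-only lines containing an expanded $Date: ... $ keyword."""
    return DATE_KEYWORD.search(line) is not None

print(has_date_keyword("// $Date: 2024-01-01 12:00:00 $"))   # True
print(has_date_keyword("// $Date$"))                         # False: no colon, keyword not expanded
print(has_date_keyword("int x; // $Date: 2024-01-01 $"))     # False: code precedes the comment
```

Note that the expression requires the colon that follows the keyword name, so an unexpanded $Date$ is not matched.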

Related expressions:

^[ \t]*//.*\$Archive:.*\$.*$
^[ \t]*//.*\$Author:.*\$.*$
^[ \t]*//.*\$Header:.*\$.*$
^[ \t]*//.*\$JustDate:.*\$.*$
^[ \t]*//.*\$Modtime:.*\$.*$
^[ \t]*//.*\$Revision:.*\$.*$
^[ \t]*//.*\$Workfile:.*\$.*$

Combining expressions

Several expressions can be combined into one by using parentheses () and the | character:

(apple|^pear$)

Breakdown:

( Begins a group of expressions.
apple Match lines containing the word apple.
| Match lines that contain matches for the previous expression (apple) or the next one (^pear$).
^pear$ Match lines consisting of only the word pear.
) Ends the group.
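Because the ^ and $ anchors sit inside the second alternative, they apply only to pear and not to apple. The following sketch, which uses Python's re module and a hypothetical helper named matches_fruit, shows the consequence.

```python
import re

FRUIT = re.compile(r"(apple|^pear$)")

def matches_fruit(line):
    """Return True if the line contains apple, or consists solely of pear."""
    return FRUIT.search(line) is not None

for line in ["green apple", "pear", "pear tree", "a pear"]:
    print(repr(line), matches_fruit(line))
```

The first two lines match. The last two do not, because pear is accepted only when it is the whole line.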

This syntax enables larger expressions like the following to be constructed:

^[ \t]*//.*\$(Date|Archive|Author|Header|JustDate|Modtime|Revision|Workfile):.*\$.*$

Comparison performance is almost always better when expressions are kept as short as possible. The example above performs significantly better than the following equivalent expression, shown here split across several lines:

(^[ \t]*//.*\$Date:.*\$.*$)|
(^[ \t]*//.*\$Archive:.*\$.*$)|
(^[ \t]*//.*\$Author:.*\$.*$)|
(^[ \t]*//.*\$Header:.*\$.*$)|
(^[ \t]*//.*\$JustDate:.*\$.*$)|
(^[ \t]*//.*\$Modtime:.*\$.*$)|
(^[ \t]*//.*\$Revision:.*\$.*$)|
(^[ \t]*//.*\$Workfile:.*\$.*$)
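The compact and the long form accept the same lines, which can be checked with a short script. This is a sketch using Python's re module. It demonstrates equivalence only and does not measure Merge's performance. The keyword list is taken from the expressions above, while the sample lines and the name agree are assumptions.

```python
import re

KEYWORDS = ["Date", "Archive", "Author", "Header",
            "JustDate", "Modtime", "Revision", "Workfile"]

# Compact form: one shared prefix and suffix around a group of keyword names.
compact = re.compile(r"^[ \t]*//.*\$(" + "|".join(KEYWORDS) + r"):.*\$.*$")

# Long form: one complete expression per keyword, joined with |.
expanded = re.compile(
    "|".join(r"(^[ \t]*//.*\$" + name + r":.*\$.*$)" for name in KEYWORDS)
)

def agree(line):
    """Return the (compact, expanded) match results for a line."""
    return (compact.search(line) is not None,
            expanded.search(line) is not None)

for line in ["// $Revision: 42 $", "  // $Author: jane $ trailing",
             "// $Id: file.cpp 42 $", "int x; // $Date: today $"]:
    print(repr(line), agree(line))
```

Both forms give the same answer for every sample line. The compact form simply avoids repeating the common prefix and suffix eight times.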