Mark wrote:
> From a line of arbitrary text, possibly followed by some amount of
> text from the beginning of the string ' Reference #\d+', where \d+
> represents one or more digit characters, I want to output the line
> without the ending ' Reference...' string. For example, the input line
> 'some arbitrary text Refer' would become 'some arbitrary text'.
>
> Here are two programs that seem to do what I want, but they seem
> overly complicated for this task. I'm looking for a simpler solution,
> possibly by using a better regular expression than I have chosen in my
> first sample code.
After taking a wrong turn at first,
I think this can't be solved much
differently from your solution.
The regex can be an incremental one
(as others have already shown) or a
sequence of alternations (as you tried).
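For illustration, here is a minimal sketch of what an "incremental" regex might look like, assuming the term means nested optional groups, where each further character of the marker is only tried once the previous one has matched. The helper names nested_regex and strip_tail are made up for this sketch, and the atom list is my reading of the marker ' Reference #\d+' from the question.

```perl
use strict;
use warnings;

# Hypothetical helper: turn a list of regex atoms a, b, c, ... into
#   a(?:b(?:c ... )?)?$
# so that any non-empty prefix of the atom sequence matches at end of line.
sub nested_regex {
    my ($first, @rest) = @_;
    my $tail = '';
    # Wrap from the innermost (last) atom outwards.
    $tail = "(?:$_$tail)?" for reverse @rest;
    return qr/$first$tail$/;
}

# Marker from the question, one atom per element (leading space included).
my $re = nested_regex(' ', split(//, 'Reference'), ' ', '#', '\d+');

# Hypothetical helper: return the line without a (partial) trailing marker.
sub strip_tail {
    my ($line) = @_;
    $line =~ s/$re//;
    return $line;
}

print strip_tail('some arbitrary text Refer'), "\n";   # some arbitrary text
```

Because the first atom is the leading space, a line that merely ends in a space loses that space as well, which seems to be what the question asks for.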
One could also write it somewhat differently,
as a "split", like this:
use strict;
use warnings;
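no warnings 'qw';   # the '#' inside qw{} would otherwise trigger a warning
# The marker, one regex atom per element.
my @end = qw{R e f e r e n c e \\s # \\d+};
# Alternation of every prefix of the marker, anchored at the end of the line:
# (R|Re|Ref|...|Reference\s#\d+)$
my $reg = '('.(join '|',map join('',@$_),map[@end[0..$_]],0..$#end).')$';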
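while (<DATA>) {
    chomp;
    # split returns (text before the match, matched tail);
    # missing or empty parts are printed as 'undef'
    print "[$_->[0]]\n\t[$_->[1]]\n" for
        map [$_->[0]||'undef', $_->[1]||'undef'],
        [split /$reg/]
}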
__DATA__
...
Aside from the regex construction (which can be commented
properly ;-), this should be quite readable.
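
As a rough idea of what "commented properly" could mean, here is a sketch that pulls the regex construction into a small helper and spells out each step. The names prefix_regex and split_tail are made up for this sketch; the atoms are the same as in @end above, written as a plain list so that the '#' needs no warning override.

```perl
use strict;
use warnings;

# Hypothetical helper: build (p1|p2|...|pN)$ where p_k is the first k atoms
# joined together, i.e. one alternative per non-empty prefix of the marker.
sub prefix_regex {
    my @atoms = @_;
    # Prefixes of length 1 .. N, e.g. 'R', 'Re', 'Ref', ...
    my @prefixes = map { join '', @atoms[0 .. $_] } 0 .. $#atoms;
    my $alternation = join '|', @prefixes;
    # Capture the match so that split hands it back as a field.
    return qr/($alternation)$/;
}

my $re = prefix_regex(split(//, 'Reference'), '\s', '#', '\d+');

# Hypothetical helper: returns (text before the match, matched tail),
# or just the whole line if no (partial) marker is found at the end.
sub split_tail {
    my ($line) = @_;
    return split $re, $line;
}

my ($text, $tail) = split_tail('some arbitrary text Refer');
print "[$text]\n\t[$tail]\n";
```

Note that, as in the program above, the space in front of 'Refer' stays in the first field, since the atom list starts at 'R'.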
Regards
M.