Thanks to Mark Wilkins for inpsiration, but here's a shorter bit of code for doing it:
irb(main):015:0> s = "split on the word on okay?"
=> "split on the word on okay?"
irb(main):016:0> b=[]; s.split(/(on)/).each_slice(2) { |s| b << s.join }; b
=> ["split on", " the word on", " okay?"]
or:
s.split(/(on)/).each_slice(2).map(&:join)
See below the fold for an explanation.
Here's how this works. First, we split on "on", but wrap it in parentheses to make it into a match group. When there's a match group in the regular expression passed to split
, Ruby will include that group in the output:
s.split(/(on)/)
# => ["split", "on", "the word", "on", "okay?"
Now we want to join each instance of "on" with the preceding string. each_slice(2)
helps by passing two elements at a time to its block. Let's just invoke each_slice(2)
to see what results. Since each_slice
, when invoked without a block, will return an enumerator, we'll apply to_a
to the Enumerator so we can see what the Enumerator will enumerator over:
s.split(/(on)/).each_slice(2).to_a
# => [["split", "on"], ["the word", "on"], ["okay?"]]
We're getting close. Now all we have to do is join the words together. And that gets us to the full solution above. I'll unwrap it into individual lines to make it easier to follow:
b = []
s.split(/(on)/).each_slice(2) do |s|
b << s.join
end
b
# => ["split on", "the word on" "okay?"]
But there's a nifty way to eliminate the temporary b
and shorten the code considerably:
s.split(/(on)/).each_slice(2).map do |a|
a.join
end
map
passes each element of its input array to the block; the result of the block becomes the new element at that position in the output array. In MRI >= 1.8.7, you can shorten it even more, to the equivalent:
s.split(/(on)/).each_slice(2).map(&:join)