Scala: Filter Strings and Lists with Regexes

Scala has a neat built-in method, .r, that turns a string into a regular expression. If you call it on an ordinary string literal whose pattern contains backslashes (for example "\w+".r), you’ll get an error like this:

<console>:1: error: invalid escape character

Thus for many regexes it is preferable to use the triple-quoted (multi-line) string syntax, which treats backslashes literally, so no extra escaping is needed.

scala> val x = """\w+""".r
x: scala.util.matching.Regex = \w+
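
To make the escaping point concrete, here is a minimal sketch comparing the two literal styles. The object and value names (EscapingSketch, escaped, raw) are just illustrative and not from the original session.

```scala
import scala.util.matching.Regex

object EscapingSketch {
  // Ordinary string literal: every backslash has to be doubled.
  val escaped: Regex = "\\w+".r

  // Triple-quoted literal: backslashes are taken literally.
  val raw: Regex = """\w+""".r

  def main(args: Array[String]): Unit = {
    // Both literals produce the same pattern text.
    println(escaped.regex == raw.regex) // true
  }
}
```

The doubled-backslash form works too, but it gets hard to read quickly once a pattern has more than a couple of escapes.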

From this, we can do some neat things, like find the first word in a sentence:

scala> x.findFirstMatchIn("abc def")
res44: Option[scala.util.matching.Regex.Match] = Some(abc)
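
Since the result is an Option wrapping a Match, you usually want to unwrap it somehow. The following is a small sketch of one way to do that, assuming the same \w+ pattern; the names FirstMatchSketch, firstText, firstStart and noWord are made up for the example.

```scala
import scala.util.matching.Regex

object FirstMatchSketch {
  val word: Regex = """\w+""".r

  // findFirstMatchIn returns Option[Regex.Match]; map over it to pull out details.
  val firstText: Option[String] = word.findFirstMatchIn("abc def").map(_.matched)
  val firstStart: Option[Int] = word.findFirstMatchIn("abc def").map(_.start)

  // findFirstIn skips the Match object and returns the matched text directly.
  val noWord: Option[String] = word.findFirstIn("   ")

  def main(args: Array[String]): Unit = {
    println(firstText)  // Some(abc)
    println(firstStart) // Some(0)
    println(noWord)     // None
  }
}
```

If you only need the matched text, findFirstIn is the shorter route; findFirstMatchIn is useful when you also want positions or groups.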

We can also replace all matches in the string, so if we want to swap one word for another, we can:

scala> x.replaceAllIn("abc def", "gary")
res46: String = gary gary
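
As a rough sketch of two related variations, assuming the same \w+ pattern: replaceFirstIn swaps only the first match, and replaceAllIn also accepts a function from each Match to its replacement. The names ReplaceSketch, firstOnly and shouted are illustrative.

```scala
import scala.util.matching.Regex

object ReplaceSketch {
  val word: Regex = """\w+""".r

  // Replace only the first word.
  val firstOnly: String = word.replaceFirstIn("abc def", "gary")

  // Compute each replacement from the match itself.
  val shouted: String = word.replaceAllIn("abc def", m => m.matched.toUpperCase)

  def main(args: Array[String]): Unit = {
    println(firstOnly) // gary def
    println(shouted)   // ABC DEF
  }
}
```

One thing to keep in mind: in the replacement text, $ and \ have special meaning, so literal replacements containing them may need Regex.quoteReplacement.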

You can also apply a regex to every value in a list to keep only the items that match. For example, here we filter the list down to values made up of a single word, using an anchored pattern such as ^\w+$:

scala> List("multi word", "12345", "word").filter("""^\w+$""".r.findFirstIn(_).isDefined)
res2: List[String] = List(12345, word)

The underscore in the filter call lets us simplify the code, since we only do one thing with each value (it is equivalent to writing a lambda that looks like x => …findFirstIn(x) ).
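
Here is a minimal sketch of that equivalence. It assumes an anchored pattern, ^\w+$, so that only whole single-word strings match; the pattern and the names FilterSketch, singleWord, withUnderscore and withLambda are assumptions for illustration rather than the exact code from the session above.

```scala
import scala.util.matching.Regex

object FilterSketch {
  // Assumed pattern: the whole string must be one run of word characters.
  val singleWord: Regex = """^\w+$""".r

  val items: List[String] = List("multi word", "12345", "word")

  // Placeholder syntax: the underscore stands in for each list element.
  val withUnderscore: List[String] = items.filter(singleWord.findFirstIn(_).isDefined)

  // The same filter written as an explicit lambda.
  val withLambda: List[String] = items.filter(x => singleWord.findFirstIn(x).isDefined)

  def main(args: Array[String]): Unit = {
    println(withUnderscore)               // List(12345, word)
    println(withUnderscore == withLambda) // true
  }
}
```

Note that \w also matches digits, which is why "12345" survives the filter.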
