Regular expressions are a pattern-matching facility found in most programming languages. A regular expression defines a pattern that is matched against a sequence of input characters. Regular expressions are widely used in text parsing and search.

The Regex class in Scala is available in the scala.util.matching package.

Consider the following example, which finds a word in a sentence.

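import scala.util.matching.Regex
object findWord {
   def main(args: Array[String]): Unit = {
      // .r turns the string into a Regex
      val p = "Functional".r
      val st = "Scala is a Functional Programming Language"
      // findFirstIn returns an Option holding the first match, if any
      println(p.findFirstIn(st))
   }
}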

Running the main method of this object prints:

Some(Functional)

In the above example we search for the word “Functional”; the match is case-sensitive. Calling the r method on a string, which is available through the implicit conversion to StringOps, creates a Regex instance from it. The findFirstIn method finds the first occurrence of the pattern. To find all occurrences, use the findAllIn method.

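If there is a match, findFirstIn returns it wrapped in an Option (Some), and None otherwise. The findAllIn method returns an iterator over the matches. To get the matched text as a single string, we use mkString.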
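As a minimal sketch of working with these return values, the example below unwraps the Option with getOrElse and collects the iterator from findAllIn into a list. The object FirstMatchDemo and the helper firstMatchOrDefault are hypothetical names introduced only for this illustration.

```scala
import scala.util.matching.Regex

object FirstMatchDemo {
  // Hypothetical helper: returns the first match, or a fallback message.
  def firstMatchOrDefault(pattern: Regex, text: String): String =
    pattern.findFirstIn(text).getOrElse("no match")

  def main(args: Array[String]): Unit = {
    val p = "Functional".r
    val st = "Scala is a Functional Programming Language"

    // findFirstIn returns Option[String]: Some(match) or None
    println(p.findFirstIn(st))                    // Some(Functional)
    println(firstMatchOrDefault(p, st))           // Functional
    println(firstMatchOrDefault("Haskell".r, st)) // no match

    // findAllIn returns an iterator; toList collects every match
    println("a".r.findAllIn("Scala").toList)      // List(a, a)
  }
}
```

Using getOrElse avoids calling get on a None, which would throw an exception when the pattern does not occur in the text.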

The mkString method concatenates the matches into one string, with an optional separator between them. The pipe (|) symbol specifies an OR condition. For example, (S|s)cala matches the word ‘Scala’ whether it starts with an uppercase or a lowercase ‘S’. Instead of the r method, the Regex constructor can be used.

Consider an example using the Regex constructor:

import scala.util.matching.Regex
object multipleoccurence {
   def main(args: Array[String]): Unit = {
      // (S|s) matches an uppercase or a lowercase 's'
      val p = new Regex("(S|s)tudent")
      val st = "Student Id is unique. Students are interested in learning new things"
      // join all matches with commas
      println(p.findAllIn(st).mkString(","))
   }
}

The main method above prints:

Student,Student

The replaceFirstIn() method replaces the first occurrence of the pattern, and replaceAllIn() replaces all occurrences.

Consider an example below.

object Replace {
   def main(args: Array[String]): Unit = {
      val p = "Car".r
      val st = "Car has power windows"
      // replace only the first match of the pattern
      println(p.replaceFirstIn(st, "Alto"))
   }
}


Alto has power windows

Here the word Car is replaced by Alto using the replaceFirstIn method.
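To contrast the two replacement methods, here is a small sketch that runs both on a sentence containing the word twice. The object ReplaceAllDemo, its helper methods and the longer sample sentence are assumptions made for this illustration.

```scala
import scala.util.matching.Regex

object ReplaceAllDemo {
  val p: Regex = "Car".r

  // Hypothetical helpers wrapping the two replacement methods.
  def replaceFirst(text: String): String = p.replaceFirstIn(text, "Alto")
  def replaceAll(text: String): String = p.replaceAllIn(text, "Alto")

  def main(args: Array[String]): Unit = {
    val st = "Car has power windows. Car has airbags."

    println(replaceFirst(st)) // Alto has power windows. Car has airbags.
    println(replaceAll(st))   // Alto has power windows. Alto has airbags.
  }
}
```

Note that the replacement string gives special meaning to $ and \ characters; when the replacement text may contain them, Regex.quoteReplacement can be used to escape it first.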

Forming Regular Expressions

The following regular expression operators are supported in Scala.

. – Matches any single character except newline

$ – Matches end of line

^ – Matches beginning of line

[…] – Matches any single character in brackets

[^…] – Matches any single character excluding the characters in brackets

re* – Matches 0 or more occurrences of preceding expression

re+ – Matches 1 or more occurrences of preceding expression

re? – Matches 0 or 1 occurrence of preceding expression

re{n} – Matches exactly n number of occurrences of preceding expression

re{n,} – Matches n or more occurrences of preceding expression

re{n,m} – Matches at least n and at most m occurrences of preceding expression

x|y – Matches either x or y

(re) – Groups regular expressions and remembers matched text

(?: re) – Groups regular expressions without remembering matched text

(?> re) – Matches independent pattern without backtracking

\w – Matches word characters

\W – Matches nonword characters

\s – Matches whitespace. Equivalent to [ \t\n\x0B\f\r]

\S – Matches nonwhitespace

\d – Matches digits. Equivalent to [0-9]

\D – Matches nondigits.

\A – Matches beginning of string

\Z – Matches end of string. If a newline exists, it matches just before newline

\z – Matches end of string

\G – Matches point where last match finished

\1, \2, etc. – Back-references to capture groups 1, 2, etc.

\b – Matches word boundaries when outside brackets

\B – Matches nonword boundaries

\n, \t, etc. – Matches newlines, carriage returns, tabs, etc.

\Q – Escape (quote) all characters up to \E

\E – Ends quoting begun with \Q
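A few of the operators above can be seen together in the following sketch. It is only an illustration: the object OperatorDemo, the datePattern value, the describe helper and the year-month-day format are all hypothetical and not part of the original examples.

```scala
import scala.util.matching.Regex

object OperatorDemo {
  // Triple-quoted strings avoid doubling the backslashes.
  // Three capture groups: year, month, day (a hypothetical date format).
  val datePattern: Regex = """(\d{4})-(\d{2})-(\d{2})""".r

  // Used as an extractor, a Regex must match the whole input;
  // each capture group is bound to a variable.
  def describe(input: String): String = input match {
    case datePattern(year, month, day) => s"year=$year month=$month day=$day"
    case _                             => "not a date"
  }

  def main(args: Array[String]): Unit = {
    println(describe("2024-01-15"))  // year=2024 month=01 day=15
    println(describe("15 Jan 2024")) // not a date

    // ^ and \w+ : the run of word characters at the start of the string
    println("""^\w+""".r.findFirstIn("Scala is fun"))    // Some(Scala)
    // [aeiou] : a character class, used here to count vowels
    println("[aeiou]".r.findAllIn("regular").size)       // 3
    // \s+ : collapse each run of whitespace into one space
    println("""\s+""".r.replaceAllIn("a   b \t c", " ")) // a b c
  }
}
```

Writing patterns as triple-quoted strings is a common choice because a regular string literal would require every backslash to be escaped, for example "\\d{4}".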

Consider an example that finds all occurrences of a pattern in a sentence. The pattern al+ matches the letter ‘a’ followed by one or more occurrences of the letter ‘l’.

import scala.util.matching.Regex
object findAll {
   def main(args: Array[String]): Unit = {
      // al+ matches 'a' followed by one or more 'l'
      val p = new Regex("al+")
      val st = "Scala is a Functional programming language"
      println(p.findAllIn(st).mkString(",")) // prints al,al
   }
}
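When the position of each match matters as well as its text, findAllMatchIn can be used instead; it yields Match objects that expose the matched text and its start index. The sketch below assumes a hypothetical object MatchPositionsDemo with a helper named positions.

```scala
import scala.util.matching.Regex

object MatchPositionsDemo {
  // Hypothetical helper: (matched text, start index) for every match.
  def positions(pattern: Regex, text: String): List[(String, Int)] =
    pattern.findAllMatchIn(text).map(m => (m.matched, m.start)).toList

  def main(args: Array[String]): Unit = {
    val st = "Scala is a Functional programming language"
    println(positions("al+".r, st))     // List((al,2), (al,19))
    // + is greedy, so both 'l' characters are consumed
    println(positions("al+".r, "tall")) // List((all,1))
  }
}
```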



That’s all for a quick roundup of Scala regular expressions. We will look into Scala extractors in the next article.

By admin
