Description

LeetCode 10

Given an input string (s) and a pattern (p), implement regular expression matching with support for '.' and '*'.

'.' Matches any single character.
'*' Matches zero or more of the preceding element.

The matching should cover the entire input string (not partial).

Note:

  • s could be empty and contains only lowercase letters a-z.
  • p could be empty and contains only lowercase letters a-z, and characters like . or *.

Example 1:

Input:
s = "aa"
p = "a"
Output: false
Explanation: "a" does not match the entire string "aa".

Example 2:

Input:
s = "aa"
p = "a*"
Output: true
Explanation: '*' means zero or more of the precedeng element, 'a'. Therefore, by repeating 'a' once, it becomes "aa".

Example 3:

Input:
s = "ab"
p = ".*"
Output: true
Explanation: ".*" means "zero or more (*) of any character (.)".

Example 4:

Input:
s = "aab"
p = "c*a*b"
Output: true
Explanation: c can be repeated 0 times, a can be repeated 1 time. Therefore it matches "aab".

Example 5:

Input:
s = "mississippi"
p = "mis*is*p*."
Output: false

Analysis

Although this problem is called Regular Expression Matching, it is not a extension of the general string matching problem. Instead, it is transformed from the problem of comparing two strings. If we catch this thing, we can easily come up with dynamic programming. Because comparing two strings needs the whole strings to be same, in which the substring now has influences to the next substring, that is to say, it has an optimal substructure.

Considering about the corner conditions, we can define a boolean array \(dp[s.length+1][p.length+1]\) with default value \(false\). And \(dp[0][0]\) should be \(true\).

First, let’s discuss about the corner case. We could implement dynamic programming generally except one condition. It is that the first character has a next character ‘*’. If it happens, even if the character does not match, it could be erased without any influences on the following parts of string. And when this process is finished, if the same thing happens on the new first character, we should do this again. This corner case can be solved by setting the specific element on the first row of \(dp\) array as \(true\).

Then in the general condition, we set \(i\) and \(j\) be the index of \(s\) and \(p\). So \(i+1\) and \(j+1\) are the index of relevent character of \(dp\). The general condition can be divided into two conditions.

  1. \(p[j]=s[i]\) or \(p[j]=”.”\)

Obviously, \(dp[i+1][j+1]=dp[i][j]\).

2. \(p[j]=”*”\)

If \(p[j-1]\neq s[i]\) and \(p[j-1]\neq “.”\), then \(dp[i+1][j+1]=dp[i+1][j-1]\). Because the character that has a next character ‘*’ could be erased if it is not matched.

Otherwise, now we know \(p[j-1]=s[i]\) or \(p[j-1]=”.”\), which means that \(p[j-1]\) could be matched. In this condition, if \(p[j-1]\) just repeats once, \(dp[i+1][j+1]=dp[i+1][j]\). But if \(p[j-1]\) has repeated for more than once, that means the multiple \(p[j-1]\) could be merged. We can implement this process by \(dp[i+1][j+1]=dp[i][j+1]\). Or it could be erased because the strings could be matched without repeated \(p[j-1]\), which is might because \(p[j-1]\) just appears in the past position. These condition could be directly merged. If one of the condition is \(true\), then the result is \(true\).

Obviously, the time and space are both \(O(m*n)\).

Solution

class Solution {
    public boolean isMatch(String s, String p) {
        while(s==null||p==null) return false;
        boolean[][] d=new boolean[s.length()+1][p.length()+1];
        d[0][0]=true;
        for(int i=0;i<p.length();i++)
        {
            if(p.charAt(i)=='*'&&d[0][i-1])
            {
                d[0][i+1]=true;
            }
        }
        for(int i=0;i<s.length();i++)
        {
            for(int j=0;j<p.length();j++)
            {
                if(s.charAt(i)==p.charAt(j)||p.charAt(j)=='.')
                {
                    d[i+1][j+1]=d[i][j];
                }
                else if(p.charAt(j)=='*')
                {
                    if(s.charAt(i)!=p.charAt(j-1)&&p.charAt(j-1)!='.')
                    {
                        d[i+1][j+1]=d[i+1][j-1];
                    }
                    else
                    {
                        d[i+1][j+1]=d[i+1][j]||d[i][j+1]||d[i+1][j-1];
                    }
                }
            }
        }
        return d[s.length()][p.length()];
    }
}

Reference

[1] https://github.com/EdwardShi92/Leetcode-Solution-Code/blob/master/RegularExpressionMatching.java

 

 

 

Leave a Reply

Your email address will not be published. Required fields are marked *