Description

LeetCode 30

You are given a string, s, and a list of words, words, that are all of the same length. Find all starting indices of substring(s) in s that is a concatenation of each word in words exactly once and without any intervening characters.

Example 1:

Input:
  s = "barfoothefoobarman",
  words = ["foo","bar"]
Output: [0,9] Explanation: Substrings starting at index 0 and 9 are "barfoor" and "foobar" respectively. The output order does not matter, returning [9,0] is fine too.

Example 2:

Input:
  s = "wordgoodstudentgoodword",
  words = ["word","student"]
Output: []

Analysis

When I begin to think about this problem initially, I tried to use KMP matching algorithm to create a table. The table’s size equals to the string s. And the elements reflect the string that starts from it match which pattern successfully. Then find the different values in a specific cycle in this table to get the result. I implemented it and it seems to work. However, I found there may be replicated patterns so using string matching is not easy. And the latter work for the table is very trivial.

Finally, I found most of others’ solutions just use a method that is nearly brute force. Use a hashmap to store the patterns and their counts. And match an artificial pattern to the string. This string should be the concatenation of all patterns. So when matching it, we can just compare the relevent substring in s with the hashmap. Because we don’t need to consider about the order for the artificial pattern, KMP does not work for it. We must use brute force to match it.

Besides, when I went back to consider about the initial idea, I compared the time complexity. Set \(k\) as the length of the string and \(n\) as the number of patterns, then for the initial idea, just the process of creating the table has consumed \(O(kn)\) time. But for the brute force method, the process of creating hashmap needs \(O(n)\) time and the process of matching artificial patterns needs \(O(kn)\) time. So The process of this method needs \(O(kn)\) time. The brute force method is not slower!

From this problem, I believe when facing a new problem, maybe we should consider about the upper bound time first. Then when we try to come up with an efficient method, we should estimate the time and compare it with the upper bound. The brute force method is often the easiest to understand. If the “efficient” method is not easier than the brute force method, just use brute force.

Solution

class Solution {
    public List<Integer> findSubstring(String s, String[] words) {
        List<Integer> res=new ArrayList<>();
        if(words.length==0||s.length()==0) return res;
        HashMap<String,Integer> map=new HashMap<>();
        for(int i=0;i<words.length;i++)
        {
            map.put(words[i],map.containsKey(words[i])?map.get(words[i])+1:1);
        }
        int n=words.length,m=words[0].length();
        HashMap<String,Integer> tmp;
        int flag;
        String stemp;
        for(int i=0;i<=s.length()-m*n;i++)
        {   
            tmp=new HashMap(map);
            flag=1;
            for(int j=i;j<i+m*n;j=j+m) { stemp=s.substring(j,j+m); if(tmp.containsKey(stemp)&&tmp.get(stemp)>1)
                {
                    tmp.put(stemp,tmp.get(stemp)-1);
                }
                else if(tmp.containsKey(stemp)&&tmp.get(stemp)==1)
                {
                    tmp.remove(stemp);
                }
                else
                {
                    flag=0;
                    break;
                }
            }
            if(flag==1)
            {
                res.add(i);
            }
        }
        return res;
    }
}

Reference

[1] https://www.geeksforgeeks.org/find-starting-indices-substrings-string-s-made-concatenating-words-listl/

Leave a Reply

Your email address will not be published. Required fields are marked *