跳至主要內容

30. 串联所有单词的子串


30. 串联所有单词的子串

🔴   🔖  哈希表 字符串 滑动窗口  🔗 力扣open in new window LeetCodeopen in new window

题目

You are given a string s and an array of strings words. All the strings of words are of the same length.

A concatenated substring in s is a substring that contains all the strings of any permutation of words concatenated.

  • For example, if words = ["ab","cd","ef"], then "abcdef", "abefcd", "cdabef", "cdefab", "efabcd", and "efcdab" are all concatenated strings. "acdbef" is not a concatenated substring because it is not the concatenation of any permutation of words.

Return the starting indices of all the concatenated substrings ins. You can return the answer in any order.

Example 1:

Input: s = "barfoothefoobarman", words = ["foo","bar"]

Output: [0,9]

Explanation: Since words.length == 2 and words[i].length == 3, the concatenated substring has to be of length 6.

The substring starting at 0 is "barfoo". It is the concatenation of ["bar","foo"] which is a permutation of words.

The substring starting at 9 is "foobar". It is the concatenation of ["foo","bar"] which is a permutation of words.

The output order does not matter. Returning [9,0] is fine too.

Example 2:

Input: s = "wordgoodgoodgoodbestword", words = ["word","good","best","word"]

Output: []

Explanation: Since words.length == 4 and words[i].length == 4, the concatenated substring has to be of length 16.

There is no substring of length 16 in s that is equal to the concatenation of any permutation of words.

We return an empty array.

Example 3:

Input: s = "barfoofoobarthefoobarman", words = ["bar","foo","the"]

Output: [6,9,12]

Explanation: Since words.length == 3 and words[i].length == 3, the concatenated substring has to be of length 9.

The substring starting at 6 is "foobarthe". It is the concatenation of ["foo","bar","the"] which is a permutation of words.

The substring starting at 9 is "barthefoo". It is the concatenation of ["bar","the","foo"] which is a permutation of words.

The substring starting at 12 is "thefoobar". It is the concatenation of ["the","foo","bar"] which is a permutation of words.

Constraints:

  • 1 <= s.length <= 10^4
  • 1 <= words.length <= 5000
  • 1 <= words[i].length <= 30
  • s and words[i] consist of lowercase English letters.

题目大意

给定一个字符串 s 和一个字符串数组 wordswords 中所有字符串 长度相同。

  • s 中的 串联子串 是指一个包含 words 中所有字符串以任意顺序排列连接起来的子串。

例如,如果 words = ["ab","cd","ef"], 那么 "abcdef""abefcd""cdabef""cdefab""efabcd", 和 "efcdab" 都是串联子串。 "acdbef" 不是串联子串,因为他不是任何 words 排列的连接。 返回所有串联子串在 s 中的开始索引。你可以以 任意顺序 返回答案。

解题思路

此题是 第 438 题 找到字符串中所有字母异位词 的进阶版。不同的是第 438 题的元素是字母,而此题的元素是单词,可以用类似的滑动窗口方法来解此题。

  • 窗口大小固定:每次窗口大小等于 words.length * words[0].length
  • 使用哈希表统计 words 中每个单词的出现次数
  • 滑动窗口遍历 s,每次取出窗口内的子串,检查是否匹配 words
  1. 状态定义

    • need 记录 words 中单词出现的次数。
    • window 记录当前窗口内的单词是否匹配 need
  2. 移动 right 指针 扩大窗口,每次加 words[0].length 个字符。

  3. 当窗口大小超过 words.length * words[0].length 时,left 指针右移收缩窗口。

  4. 判断窗口内的子串是否匹配 words

    • 使用 need 统计 words 词频。
    • 使用 window 统计窗口内单词的词频。
    • window === need,记录 left 位置。

复杂度分析

  • 时间复杂度: O(n * wordLen)ns 长度,每个 i 遍历 O(n / wordLen)
  • 空间复杂度: O(words.length),存储 words 词频和 window 词频。

代码

/**
 * @param {string} s
 * @param {string[]} words
 * @return {number[]}
 */
var findSubstring = function (s, words) {
	let res = [];
	let wordLen = words[0].length;
	let totalLen = words.length * wordLen;
	let need = {};

	// 统计 words 中每个单词出现的次数
	for (let word of words) {
		need[word] = (need[word] || 0) + 1;
	}

	// 枚举 wordLen 个不同的起点
	for (let i = 0; i < wordLen; i++) {
		let left = i,
			right = i;
		let window = {};
		let count = 0;

		while (right + wordLen <= s.length) {
			let word = s.slice(right, right + wordLen);
			right += wordLen;

			if (need[word]) {
				window[word] = (window[word] || 0) + 1;
				count++;

				while (window[word] > need[word]) {
					let removeWord = s.slice(left, left + wordLen);
					window[removeWord]--;
					count--;
					left += wordLen;
				}

				if (count === words.length) res.push(left);
			} else {
				window = {};
				count = 0;
				left = right;
			}
		}
	}
	return res;
};

相关题目

题号标题题解标签难度力扣
76最小覆盖子串[✓]哈希表 字符串 滑动窗口🔴🀄️open in new window 🔗open in new window