窗口思想

2023-09-26 02:38| 来源: 网络整理| 查看: 265

目录 1-031 两个数和 1-032 连续正数序列和 1-033 leetcode-49. Group Anagrams 1-034 leetcode-3. Longest Substring Without Reapting Characters 1-005 leetcode-76.Minimum Window Substring 1-035 leetcode-30. Substring with Concatenation of All Words 1-031 两个数和

输入一个递增排列的数组和一个数字s，在数组中查找这两个数，使得它们得和正好是s。如果有多对数字得和等于s，输出任意一对即可。

数据结构

数组

算法

双指针 1.启发：假设已经有了这个数组的任意两个元素之和的有序数组。那么利用二分查找法进行查找，但是排序需要平方时间 2.这个二分查找法启发我们，可以直接对两个数字的和进行一个有序的遍历 1）双指针如果取两个最小（和最小）双指针如果取两个最大（和最大）双指针如果取最小和最大（和居中）对于平均查找时间来讲，居中的具有优势 2）证明双指针法为什么是有效的？ a.如果存在答案，那么0 (i) + (j) > target，也即任意移动i，所得的和都比target要大，因此排除掉了(j)位置的所有可能性只能递减j b) 若(i) + (j) < target 同理，那么递减j为j1，(i) + (j1) < (i) + (j) < target，也即任意移动j，所得的和都比target要小，因此排除掉了(i)位置的所有可能性只能递增i （根据证明可知：这个其实本质上也符合贪心的性质，最后转化为了更小子数组的查找）时间复杂度空间复杂度关键思路

在序列首尾定义指针p1, p2，如果和超过s，p2往中间移动，如果和小于s，p1往中间移动。

bool FindNumbersWithSum(int data[], int length, int sum, int* num1, int* num2) { bool found = false; if(length < 1 || num1 == NULL || num2 == NULL) return found; int ahead = length - 1; int behind = 0; while(ahead > behind) { long long curSum = data[ahead] + data[behind]; if(curSum == sum) { *num1 = data[behind]; *num2 = data[ahead]; found = true; break; } else if(curSum > sum) ahead --; else behind ++; } return found; } 1-032 连续正数序列和

输入一个正数s，打印出所有和为s的连续正数序列（至少含有两个数）。例如15，1+2+3+4+5 = 4 + 5 + 6 = 7 + 8 = 15，所以结果打印出三个连续序列1-5, 4-6, 7-8.

数据结构

数组

算法

双指针贪心 1）双指针法为什么可以找到所有的情况

这里有一个循环不变式： small没加一个，表示small-1的所有可能性已经穷尽；并且也已经穷尽了big从small + 1到当前值的可能性以target = 15为例说明： step1. small = 1, big =2，将big一直向右移动，直至sum(small - big) = 15，记录下来；继续加到sum(small - big) = 21 > 15，此时small = 1, big = 6 这是如果big继续加大，那么sum只会更大，偏离target更多，因此以small = 1起始的所有情况穷尽了 step2. 内部while循环找到的是第一个sum > target的big值此时small = 1, big = 6，在前面一个的时候sum = middle big > small >= midlle sum(small - big) > target 时间复杂度空间复杂度关键思路 void PrintContinuousSequence(int small, int big); void FindContinuousSequence(int sum) { if(sum < 3) return; int small = 1; int big = 2; int middle = (1 + sum) / 2; int curSum = small + big; while(small < middle) { if(curSum == sum) PrintContinuousSequence(small, big); while(curSum > sum && small < middle) { curSum -= small; small ++; if(curSum == sum) PrintContinuousSequence(small, big); } big ++; curSum += big; } } void PrintContinuousSequence(int small, int big) { for(int i = small; i = 0){ res.push_back(strs[rec[key]]); rec[key] = -1; } res.push_back(strs[i]); } } return res; } private: int* dic; string generateKeyByDic(int* dic, int n, string str){ for(int i = 0; i < n; ++i){ dic[i] = 0; } for(int j = 0; j < str.length(); ++j){ if(str[j] = 'a') dic[str[j] - 'a']++; } string key(26, '0'); for(int k = 0; k < 26; ++k){ key[k] = dic[k] + '0'; } return key; } }; 1-034 leetcode-3. Longest Substring Without Reapting Characters Given a string, find the length of the longest substring without repeating characters. Examples: Given "abcabcbb", the answer is "abc", which the length is 3. Given "bbbbb", the answer is "b", with the length of 1. Given "pwwkew", the answer is "wke", with the length of 3. Note that the answer must be a substring, "pwke" is a subsequence and not a substring. 数据结构

关联容器map

算法

双指针窗口 1）为什么start和end这种移动方式就能找到所有无重复的字符串

循环不变式：一直求得是以start开头的最长无重复字符串说明：以abcdcbaefga为例 step1. start = 0 end一直累加，直到end指向4 这是以0开头的最长的无重复字符串 step2. 当end = 4时，因为此时start = 0，因为start - end之间含有两个c 因此，移动start = 1，此时还是有两个c，因此此时的end - start无疑较小，继续累加，这是以1开头的最长无重复 start =2 ，此时还有两个c，继续累加，这是以2开头的最长无重复字符串 start = 3，无重复，此时需要累加end，以求得以3开头的最长无重复字符串时间复杂度空间复杂度关键思路

使用窗口的思想，定义start，end作为窗口的两端，开始时start = end = 0；再定义一个Map，用来检测窗口中是否有重复字符。

class Solution { public: int lengthOfLongestSubstring(string s) { int len = s.length(); if(len == 0) return 0; int start = 0, end = 0, max = 0; int* map = new int[256]; //自定义Map for(int i = 0; i < 256; ++i) map[i] = 0; while(end < len){ if(map[s[end] - '\0'] == 0){ map[s[end] - '\0']++; //右移end扩大窗口 if((end - start + 1) > max) max = (end - start + 1); ++end; }else{ for(; map[s[end] - '\0'] > 0; map[s[start] - '\0']--, ++start); //右移start缩小窗口 } } return max; } };

另一种本质相同的解法：

int lengthOfLongestSubstring(string s) { vector map(128,0); int counter=0, begin=0, end=0, d=0; while(end0) counter++; while(counter>0) if(map[s[begin++]]-->1) counter--; d=max(d, end-begin); //while valid, update d } return d; } 说明： counter标记是否有重复 map标记是哪个字符重复，重复的字符>1 1-005 leetcode-76.Minimum Window Substring Given a string S and a string T, find the minimum window in S which will contain all the characters in T in complexity O(n). For example, S = "ADOBECODEBANC" T = "ABC" Minimum window is "BANC". Note: If there is no such window in S that covers all characters in T, return the empty string "". If there are multiple such windows, you are guaranteed that there will always be only one unique minimum window in S. 数据结构

对于单个字符map/hash

算法

我们可以发现窗口思想应用场景的一些特点：一般都是要求一个子串或者子数组整体满足一定条件，比如要和最大，要包含什么字符之类的；而且所求的结果肯定和这个子串或者子数组长度有关。 1）证明为什么如此做是正确的？

a. 证明为空的情况 for(i = 0; i < S.length(); ++i){ if(chT[S[i] - '\0'] > 0 && chS[S[i] - '\0'] < chT[S[i] - '\0']) { ++cntS; } ++chS[S[i] - '\0']; if(cntS == cntT) break; } 其中 chT[S[i] - '\0'] > 0 表示S中的该字符在T中存在，chS[S[i] - '\0'] < chT[S[i] - '\0']控制的是该字符的个数；该循环的目的是累计到目前为止i，S中出现的T中的字符的个数cntS。如果cntS等于cntT，表示找到第一个T中所有字符。若没有找到，i等于S.length() b. 证明第一个for循环不变式：保持的是st和end之间始终包含字符串T 因为这个循环终止的条件为chT[S[st-1] - '\0'] > 0 && chS[S[st-1] - '\0'] < chT[S[st-1] - '\0'] 这个表示刚刚去掉的一个字符st-1为T中的字符，且此时S中st - end到字符统计chS出现了空缺，导致S[st-1]字符比T中少。这里有两个统计量：chT和chS c. 证明第二个for循环不变式：保持的是st和end之间之中缺少字符toFind 到了第二个for循环，前提是st和end之间之中缺少字符toFind 但是这个for循环退出的条件是toFind等于S[end]，得证 d. 证明算法能够找到最小的窗口 step0.以start=0开始的最小的包含T的窗口是：end一直累加到首次完整包含T; 若后面end继续增加，则窗口大小都要增大 step1.start一直增大，直到首次不包含完整T的前一个字符st start = 1 - st，这些都是以1 - st等字符为首的包含T的最小窗口 step2. start = st + 1表示首次不包含完整T的起始字符，此时end一直累加到以st + 1开始的首次完整包含T 重复step0. step1.即可穷尽所有可能性时间复杂度空间复杂度关键思路

先不断向右移动end直到当前窗口已经包含T中所有字符，然后向右移动start 直到再移动start的话窗口就不再包含T所有字符了，这个时候记录下窗口大小(end - start) 并和 min 比较即可。最后返回min。

class Solution { public: string minWindow(string S, string T) { if(T.length() == 0 || S.length() == 0) return ""; int chT[256]; int chS[256]; int i, j, k, cntT = T.length(), cntS = 0; for(i = 0; i < 256; chT[i] = 0, chS[i] = 0, ++i); for(i = 0; i < T.length(); ++chT[T[i] - '\0'], ++i); for(i = 0; i < S.length(); ++i){ if(chT[S[i] - '\0'] > 0 && chS[S[i] - '\0'] < chT[S[i] - '\0']) ++cntS; ++chS[S[i] - '\0']; if(cntS == cntT) break; } if(i == S.length()) return ""; //至此，找到了第一个包含T中所有charactor的S 字串 int end = i, st = 0; char toFind; //st指针右移，直到窗口因为缺少T中某个charactor(把这个ch记为toFind)而不再满足要求，就开始右移end指针，直到又找到了toFind int minlen = end - st + 1, minst = st; while(end < S.length()){ for(++st; st 0 && chS[S[st-1] - '\0'] < chT[S[st-1] - '\0']){ toFind = S[st-1]; break;} else if((end - st + 1) < minlen){ minlen = (end - st + 1); minst = st; } } for(++end; end < S.length(); ++end){ ++chS[S[end] - '\0']; if(toFind == S[end]) break; } } return S.substr(minst, minlen); } };

本质上相同的解法：

string minWindow(string s, string t) { vector map(128,0); for(auto c: t) map[c]++; int counter=t.size(), begin=0, end=0, d=INT_MAX, head=0; while(end0) counter--; //in t while(counter==0){ //valid if(end-begin0)，这个>0不能少，控制的是该字符的个数 b.counter为0时，表示begin-end之间已经包含了t 此时map中对应的t字符全为0，s中不属于t的字符全为负数 c.此时增加begin，直到刚好不包含t为止总结，这个map的含义：要在s中查找的字符c的个数，如果查到了，就减一。 1-035 leetcode-30. Substring with Concatenation of All Words You are given a string, s, and a list of words, words, that are all of the same length. Find all starting indices of substring(s) in s that is a concatenation of each word in words exactly once and without any intervening characters. For example, given: s: "barfoothefoobarman" words: ["foo", "bar"] You should return the indices: [0,9]. (order does not matter). 数据结构算法

双指针窗口

说明： 1）因为是定长，所以所有可能性不外乎遍历（0...seg）为起始位置，检查长度为seg的情况 2）map表示在S中需要找到的，如果map[str] == 0，表示S中已经找到了所有的这些str 3）st -= dist map[S.substr(st.seg)]++; --count; st += dist; 表示其实位置向后移动一位，因为map只是去掉了一个str，下一个循环st又加了一个seg，正好检查下一个位置。时间复杂度空间复杂度关键思路

这时候我们引入窗口的思想。假设窗口的起始状态是"bar"，然后end右移unit位，如果新被窗口包含进来的部分属于L，而且正好窗口大小和L大小相同了，那么当前start的值被记录要返回的vector中。若新被窗口包含进来的部分根本不属于L，那么start就可以直接移到end了，end则赋值为start+unit。若新被窗口包含进来的部分属于L，但是不巧，窗口中已有的部分已经把L这种字符串占满了，那么start就要右移了，一直移到这种字符串有一个不再被包含在窗口中。

不要忘了，这种窗口的思想，start和end的步长都是unit。所以，为了保证所有的正解都能被包含到。我们需要定义Unit个窗口，每一个窗口的start开始位置分别从0到unit。

class Solution { public: vector findSubstring(string S, vector &L) { vector res; if(L.size() == 0) return res; if(S.length() == 0) return res; int seg = L[0].length(); if(seg > S.length() || seg == 0) return res; int st = 0, count = 0, i = 0, j = 0; map map; string str; for(i = 0; i < seg; ++i){ map.clear(); count = 0; for(vector::iterator it = L.begin(); it != L.end(); ++map[*it], ++it); for(st = i; st < S.length(); st += seg){ str = S.substr(st, seg); if(map.find(str) != map.end() && map[str] > 0){ map[str]--; ++count; if(count == L.size()){ //找到一个结果 st -= ((count-1) * seg); res.push_back(st); map[S.substr(st, seg)]++; //把符合条件的子序列中最前端的unit移除 st += ((count-1) * seg); --count; } }else if(count > 0){ //虽然当前不匹配，但只要之前还有成功匹配的部，都要考虑以之前的匹配部分为起点，挨个尝试。 st -= (count * seg); map[S.substr(st, seg)]++; --count; st += (count * seg); } } } return res; } };

【本文地址】

窗口思想

窗口思想

今日新闻

推荐新闻