Highest-Scoring Substring by Prefix and Suffix Matches
You are given three strings: text, pre_text, and post_text. For any substring of text, define its score as the sum of two parts:
pre_score: the length of the longest prefix of the substring that also appears as a suffix ofpre_text(i.e. the longest overlap where the end ofpre_textmatches the start of the substring).post_score: the length of the longest suffix of the substring that also appears as a prefix ofpost_text(i.e. the longest overlap where the end of the substring matches the start ofpost_text).
Find the maximum total score (pre_score + post_score) over all substrings of text. Note that both the prefix-match and the suffix-match must fit inside the SAME chosen substring, so they cannot overlap a longer span than the substring itself provides.
Examples:
Example 1: text = "nothing", pre_text = "bruno", post_text = "ingenious". The substring "nothing" starts with "no", which is the suffix of "bruno" -> pre_score = 2. It ends with "ing", which is the prefix of "ingenious" -> post_score = 3. Total $= 5$, the maximum.
Hints
- Notice the two score components decouple:
pre_scoreis about where the substring starts,post_scoreabout where it ends. - Reduce each to a string-overlap problem: text-prefix vs
pre_text-suffix, and text-suffix vspost_text-prefix. - For linear time, run KMP or the Z-algorithm on the concatenations
pre_text + '#' + textandtext + '#' + post_text.
Worked Solution
How to Think About It: For a fixed substring text[i:j+1], pre_score is the longest prefix of that substring matching a suffix of pre_text, and post_score is the longest suffix of that substring matching a prefix of post_text. The subtlety is that BOTH matches must live inside one substring of width $w = j - i + 1$: each of pre_score and post_score is capped at $w$. You cannot simply take the best prefix-overlap over all starts and the best suffix-overlap over all ends and add them, because those maxima may come from different (or too-short) substrings and not be simultaneously achievable.
Algorithm: 1. For each start index $i$ and length $k$, precompute whether text[i:i+k] equals the last $k$ characters of pre_text (call this pre_valid[i][k]). 2. For each end index $j$ and length $k$, precompute whether text[j-k+1:j+1] equals the first $k$ characters of post_text (call this post_valid[j][k]). 3. For every substring (start $i$, end $j$, width $w = j - i + 1$): take the largest valid prefix length $\le w$ as pre_score, the largest valid suffix length $\le w$ as post_score, and track the maximum of pre_score + post_score. Both lengths are bounded by $w$, so the two matches genuinely coexist in that one substring.
Code (clear version): ```python def best_score(text, pre_text, post_text): n = len(text) lp = len(pre_text) lq = len(post_text) if n == 0: return 0 pre_valid = [[False] * (n + 1) for _ in range(n)] for i in range(n): for k in range(1, min(n - i, lp) + 1): if text[i:i + k] == pre_text[lp - k:]: pre_valid[i][k] = True post_valid = [[False] * (n + 1) for _ in range(n)] for j in range(n): for k in range(1, min(j + 1, lq) + 1): if text[j - k + 1:j + 1] == post_text[:k]: post_valid[j][k] = True ans = 0 for i in range(n): for j in range(i, n): w = j - i + 1 ps = 0 for k in range(w, 0, -1): if pre_valid[i][k]: ps = k break qs = 0 for k in range(w, 0, -1): if post_valid[j][k]: qs = k break if ps + qs > ans: ans = ps + qs return ans
print(best_score("nothing", "bruno", "ingenious")) # 5 ```
Complexity: The version above is $O(n^2 \cdot (n + m))$ in the worst case; it can be sped up with KMP/Z-algorithm on pre_text + sep + text and text + sep + post_text to obtain the per-position overlaps in linear time, but the substring widths must still be respected so the prefix and suffix matches coexist.
Answer: Enumerate substrings (or use a width-aware combination of the per-start prefix-overlaps and per-end suffix-overlaps): for each substring of width $w$, both pre_score and post_score are capped at $w$, and the answer is the maximum of their sum. Do NOT add the global best prefix-overlap and global best suffix-overlap independently, as that can overcount beyond what any single substring achieves.
Intuition
The reusable trick is separability: a score that looks like it depends on a whole substring often splits into independent endpoint contributions, letting you optimize each end separately instead of scanning all $O(n^2)$ substrings. Once decoupled, each piece is a classic prefix/suffix-overlap query, exactly what KMP's failure function and the Z-algorithm compute in linear time.
The original interviewee flagged that 'the second example has tricky corner cases' -- those are the boundary conditions (empty overlaps, the window long enough to host both matches). Handling overlaps and off-by-one bounds cleanly is the real test; the algorithmic core is standard string matching.