3. Longest Substring Without Repeating Characters
3. Longest Substring Without Repeating Characters

3. Longest Substring Without Repeating Characters

My Solution file(if any)
ID
7
Tags
Strings
Arrays
Hashmaps
Microsoft
Sliding Window
Level
Medium
Leetcode List Name
Published
May 7, 2024
Sub-item
Date (of first attempt)
May 7, 2024 08:53 PM
My Expertise
Hard
Remark
Key part of this problem is to identify when to move the pointers and where to, then for optimization it is good to use a simple hashmap!
My Solution Link
Best Solution Link
Spaced Repetition
Reading List
Parent item

Introduction to the Problem

In the realm of software development, particularly in areas involving string manipulation, the ability to efficiently process and analyze data is crucial. One common challenge is the "Longest Substring Without Repeating Characters" problem. It's a popular question on coding platforms such as LeetCode, which not only tests a programmer's ability to work with strings but also their skill in utilizing optimal algorithms and data structures to handle large datasets efficiently.

Problem Statement

Given a string, the objective is to find the length of the longest substring that does not contain any repeating characters. This task is significant in various applications, such as token generation, lexical analysis, and other scenarios where identifying unique patterns in text data is necessary.

Thought Process and Practical Applications

When tackling this problem, a programmer must think about how to efficiently traverse the string while keeping track of the characters that have already been seen. This involves considering the trade-offs between time complexity (how fast the algorithm runs) and space complexity (how much memory it uses).
In real-world applications, this algorithm can be crucial. For instance, in network security, generating unique session tokens often requires determining substrings without repetition. Similarly, in text editors, this logic can help in syntax highlighting by identifying unique patterns within the code.

Approaches Explored

Approach 1: Brute Force

  • Pros: The brute force approach is straightforward to understand and implement. It simply checks all possible substrings, which makes it clear and easy for beginners.
  • Cons: Its major drawback is efficiency. The time complexity is O(n^2), which becomes impractical for long strings due to the significant increase in execution time.

Approach 2: Sliding Window with Hash Map

  • Pros: This method is highly efficient, reducing the time complexity to O(n) by using a sliding window technique combined with a hash map. It processes each character only once, making it suitable for large strings.
  • Cons: It requires more memory to maintain the hash map and is slightly more complex to implement. The logic behind adjusting the window size as repeats are found can also be tricky to grasp initially.

Finalized Approach and Implementation

The chosen approach for solving this problem utilizes the sliding window technique with a hash map. This strategy allows dynamic adjustment of the window size by using the hash map to remember the last position of each character. When a repeated character is found, the window's starting point is moved just past the last occurrence of that character.
notion image

Executable Code

Idea IconTheTechCruise.com Pyodide Terminal
class Solution:
    def lengthOfLongestSubstring(self, s: str) -> int:
        index_map = {}
        max_length, current_index, start_index = 0, 0, -1

        for current_index in range(len(s)):
            if s[current_index] in index_map and index_map[s[current_index]] > start_index:
                start_index = index_map[s[current_index]]
            max_length = max(max_length, current_index - start_index)
            index_map[s[current_index]] = current_index

        return max_length
if __name__ == "__main__":
    solution = Solution()
    test_strings = [
        ("pwwkew", 3),       # "wke"
        ("", 0),             # ""
        ("bbbbbb", 1),       # "b"
        ("anviaj", 5),       # "anvia"
        ("dvdf", 3),          # "vdf"
				("tmmzuxt", 5),     # The longest substring without repeating characters is "mzuxt"
        ("!@#$$%^&*", 7),   # The longest substring without repeating characters is "!@#$%^&"
        ("1234561234567", 7), # The longest substring without repeating characters is "1234567"
        ("abcdefaghiab", 8),  # The longest substring without repeating characters is "bcdefagh"
        ("a0b1c2d3e4f5g6h7i8j9k0l", 21)  # The longest substring without repeating characters is "a0b1c2d3e4f5g6h7i8j9k0l"
    ]

    for test_string, expected in test_strings:
        result = solution.lengthOfLongestSubstring(test_string)
        print(f"Input: '{test_string}' | Expected: {expected}, Got: {result}")

Example and Detailed Explanation

Let's walk through the string "abba" to see the sliding window in action:
  1. Initialization:
      • index_map is empty.
      • max_length is 0.
      • current_index is 0.
      • start_index is -1.
  1. First Character ('a'):
      • Add 'a' with index 0 to index_map.
      • Update max_length to 1.
  1. Second Character ('b'):
      • Add 'b' with index 1 to index_map.
      • Update max_length to 2.
  1. Third Character ('b') Again:
      • 'b' is found in index_map with the previous index 1.
      • Move start_index to index 1 + 1 = 2.
      • Update max_length to 2 (no change).
      • Update the last occurrence of 'b' in index_map to index 2.
  1. Fourth Character ('a'):
      • 'a' is found in index_map with index 0, but since start_index is now 2, it doesn't affect the window.
      • Update max_length to 2 (no change).
      • Update the last occurrence of 'a' in index_map to index 3.
Through this example, it's evident how the sliding window adjusts as duplicates are encountered, ensuring that at any point, the substring considered does not have repeating characters.

Complexity Discussion

The time complexity of this approach is O(n), primarily due to the single pass required over the string, and each character's last index is updated in constant time due to the hash map. The space complexity is O(min(m, n)), with m being the character set of the string, as it depends on the number of unique characters stored at any time.

Conclusion

Beats around 99.6%  of the submissions! Definitely great!
Beats around 99.6% of the submissions! Definitely great!
This problem serves as a perfect illustration of combining elementary data structures like hash maps with efficient algorithmic strategies like sliding windows to tackle complex problems. The method discussed is not just theoretically optimal but also widely applicable in real-world scenarios where performance and accuracy are critical.
 
Let us know your thoughts and share similar problems in comments!
Buy us a coffeeBuy us a coffee