HomeOur TeamContact
Tries
Multi String Search
Sameer
Sameer
September 13, 2021
1 min

Multi String Search

Write a function that takes in a big string and an array of small strings, all of which are smaller in length than the big string. The function should return an array of booleans, where each boolean represents whether the small string at that index in the array of small strings is contained in the big string.

Note that you can’t use language-built-in string-matching methods.

Sample Input #1

bigString = "this is a big string"
smallStrings = ["this", "yo", "is", "a", "bigger", "string", "kappa"]

Sample Output #1

[true, false, true, true, false, true, false]

Sample Input #2

bigString = "abcdefghijklmnopqrstuvwxyz"
smallStrings = ["abc", "mnopqr", "wyz", "no", "e", "tuuv"]

Sample Output #2

[true, true, false, true, true, false]

Hints

Hint 1 A simple way to solve this problem is to iterate through all of the small strings, checking if each of them is contained in the big string by iterating through the big string’s characters and comparing them to the given small string’s characters with a couple of loops. Is this approach efficient from a time-complexity point of view?

Hint 2 Try building a suffix-trie-like data structure containing all of the big string’s suffixes. Then, iterate through all of the small strings and check if each of them is contained in the data structure you’ve created. What are the time-complexity ramifications of this approach?

Hint 3 Try building a trie containing all of the small strings. Then, iterate through the big string’s characters and check if any part of the big string is a string contained in the trie you’ve created. Is this approach better than the one described in Hint #2 from a time-complexity point of view?

Optimal Space & Time Complexity O(ns + bs) time | O(ns) space - where n is the number of small strings, s is the length of longest small string, and b is the length of the big string

Solution 1

# O(bns) time | O(n) space
DEFINE FUNCTION multiStringSearch(bigString, smallStrings):
    RETURN [isInBigString(bigString, smallString) FOR smallString IN smallStrings]

DEFINE FUNCTION isInBigString(bigString, smallString):
    FOR i IN range(len(bigString)):
        IF i + len(smallString) > len(bigString):
            break
        IF isInBigStringHelper(bigString, smallString, i):
            RETURN True
    RETURN False

DEFINE FUNCTION isInBigStringHelper(bigString, smallString, startIdx):
    SET leftBigIdx TO startIdx
    SET rightBigIdx TO startIdx + len(smallString) - 1
    SET leftSmallIdx TO 0
    SET rightSmallIdx TO len(smallString) - 1
    WHILE leftBigIdx <= rightBigIdx:
        IF bigString[leftBigIdx] != smallString[leftSmallIdx] or bigString[rightBigIdx] != smallString[rightSmallIdx]:
            RETURN False
        leftBigIdx += 1
        rightBigIdx -= 1
        leftSmallIdx += 1
        rightSmallIdx -= 1
    RETURN True

Solution 2

# O(b^2 + ns) time | O(b^2 + n) space
DEFINE FUNCTION multiStringSearch(bigString, smallStrings):
    SET modifiedSuffixTrie TO ModifiedSuffixTrie(bigString)
    RETURN [modifiedSuffixTrie.contains(string) FOR string IN smallStrings]

DEFINE CLASS ModifiedSuffixTrie:
    DEFINE FUNCTION __init__(self, string):
        SET self.root TO {}
        self.populateModifiedSuffixTrieFrom(string)

    DEFINE FUNCTION populateModifiedSuffixTrieFrom(self, string):
        FOR i IN range(len(string)):
            self.insertSubstringStartingAt(i, string)

    DEFINE FUNCTION insertSubstringStartingAt(self, i, string):
        SET node TO self.root
        FOR j IN range(i, len(string)):
            SET letter TO string[j]
            IF letter not IN node:
                SET node[letter] TO {}
            SET node TO node[letter]

    DEFINE FUNCTION contains(self, string):
        SET node TO self.root
        FOR letter IN string:
            IF letter not IN node:
                RETURN False
            SET node TO node[letter]
        RETURN True

Solution 3

# O(ns + bs) time | O(ns) space
DEFINE FUNCTION multiStringSearch(bigString, smallStrings):
    SET trie TO Trie()
    FOR string IN smallStrings:
        trie.insert(string)
    SET containedStrings TO {}
    FOR i IN range(len(bigString)):
        findSmallStringsIn(bigString, i, trie, containedStrings)
    RETURN [string IN containedStrings FOR string IN smallStrings]

DEFINE FUNCTION findSmallStringsIn(string, startIdx, trie, containedStrings):
    SET currentNode TO trie.root
    FOR i IN range(startIdx, len(string)):
        SET currentChar TO string[i]
        IF currentChar not IN currentNode:
            break
        SET currentNode TO currentNode[currentChar]
        IF trie.endSymbol IN currentNode:
            SET containedStrings[currentNode[trie.endSymbol]] TO True

DEFINE CLASS Trie:
    DEFINE FUNCTION __init__(self):
        SET self.root TO {}
        SET self.endSymbol TO "*"

    DEFINE FUNCTION insert(self, string):
        SET current TO self.root
        FOR i IN range(len(string)):
            IF string[i] not IN current:
                SET current[string[i]] TO {}
            SET current TO current[string[i]]
        SET current[self.endSymbol] TO string

Tags

Previous Article
Suffix Trie Construction
Next Article
Bubble Sort
Sameer

Sameer

Related Posts

Heap Sort
January 31, 2022
1 min
© 2022, All Rights Reserved.

Quick Links

Advertise with usAbout UsContact Us

Social Media