Using a Set For Faster Searching

Recently, I was working on a code challenge at CodeFights which specified that the solution had to run in O(n) time. The brute force solution was a nested for loop, but this would run in O(n^2) time. Here’s the code challenge:

Note: Write a solution with O(n) time complexity and O(1) additional space complexity, since this is what you would be asked to do during a real interview.

Given an array a that contains only numbers in the range from 1 to a.length, find the first duplicate number for which the second occurrence has the minimal index. In other words, if there are more than 1 duplicated numbers, return the number for which the second occurrence has a smaller index than the second occurrence of the other number does. If there are no such elements, return -1.

Example

For a = [2, 3, 3, 1, 5, 2], the output should be
firstDuplicate(a) = 3.

There are 2 duplicates: numbers 2 and 3. The second occurrence of 3 has a smaller index than than second occurrence of 2 does, so the answer is 3.

For a = [2, 4, 3, 5, 1], the output should be
firstDuplicate(a) = -1.

Input/Output

[time limit] 4000ms (js)
[input] array.integer a

Guaranteed constraints:
1 ≤ a.length ≤ 105,
1 ≤ a[i] ≤ a.length.

[output] integer

The element in a that occurs in the array more than once and has the minimal index for its second occurrence. If there are no such elements, return -1.

My first attempt to solve it worked but unfortunately was not quite fast enough and did not pass the last two tests. I created a new array to hold numbers that were already checked. In a for loop, in each iteration, the new array would be checked to see if the current number was included. If it was, then function would return the number. Take a look…

function firstDuplicate(a) {
    let visitedNumbers = [],
        i;
    
    for (i = 0; i < a.length; i++) {
        if (visitedNumbers.includes(a[i])) {
            return a[i];
        }
        visitedNumbers.push(a[i]);
    }
    
    return -1;
}

If the visitedNumbers array includes the current indexed number in the given array, then that number will be returned. If not, then the number is added to the visitedNumbers array and the next number in the given array is checked. If all the numbers are checked and none of them are duplicated, the function will return -1. Like I said earlier, this works, however it was not passing the last two tests, which I assume had very large inputs. After digging around a bit, I stumbled across something that I had not used before.

According to our friends at Mozilla, a Set is an object that “lets you store unique values of any type, whether primitive values or object references”. Ok, so how does that help us? Well according to this site, the Set method ‘has’ is 16% faster than the Array method ‘includes’. For small inputs, this won’t matter. But imagine a huge input; 16 percent would be very noticeable. Let’s see a Set in action.

function firstDuplicate(a) {
    let visitedNumbers = new Set(),
        i;
    
    for (i = 0; i < a.length; i++) {
        if (visitedNumbers.has(a[i])) {
            return a[i];
        }
        visitedNumbers.add(a[i])
    }
    return -1;
}

Now all test pass! Yay! Apparently this question is asked at Google. That makes me feel good about myself. The moral of the story: Sets are cool and fast. I’ll be using them more in the future. Until next time. Cheers:)

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s