LC 1348. Tweet Counts Per Frequency

Sarthak Sehgal · May 25, 2020

medium binary search

Keeping a track of all the occurrences of a tweetName using unordered_map<string, vector<int>> and then iterating the whole vector to identify times between startTime and endTime can be inefficient. We want a more optimal approach than this one. Idea:

Keep track of times of a tweetName using a multiset instead of a vector: unordered_map<string, multiset<int>>. This way, all the times are stored in sorted order.
Now, when we have to find the tweets from “startTime” to “endTime”, we only have to look for the first time in multistep which is equal to or greater than “startTime” and then iterate normally till we reach the end of the set or we encounter a time greater than “endTime”.

The code below uses this approach by leveraging the “lower_bound” function available in C++. It is internally implemented using binary search and finds the first occurrence in a sorted sequence which is greater than or equal to the input number.

class TweetCounts {
public:
    unordered_map<string, multiset<int>> rec;

    TweetCounts() {
    }

    void recordTweet(string tweetName, int time) {
        rec[tweetName].insert(time);
    }

    vector<int> getTweetCountsPerFrequency(string freq, string tweetName, int startTime, int endTime) {
        int t = getFreq(freq);
        auto m = rec.find(tweetName);
        vector<int> res((endTime-startTime)/t + 1);

        if (m==rec.end()) return res;

        for (auto it=(m->second).lower_bound(startTime); it!=(m->second).end() && *it<=endTime; it++) {
            res[(*it-startTime)/t]++;
        }

        return res;
    }

    int getFreq (string freq) {
        if (freq[0]=='m') return 60;
        if (freq[0]=='h') return 3600;
        return 86400;
    }
};

Time complexity:

recordTweet: O(nlogn) in worst case when the all the tweetNames are same
getTweetCountsPerFrequency: O(logn + (endTime-startTime))

Share: Twitter, Facebook