
This solution is really bad, read it in as a dataframe and use a groupby...
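For the record, something like this is what that would look like (a sketch; merge_counts_pandas is a made-up name, and it assumes the list-of-dicts shape with 'thing'/'count' keys used further down the thread):

    import pandas as pd

    def merge_counts_pandas(l):
        # build a frame from the list of dicts, group by 'thing',
        # sum the counts, then convert back to a list of dicts
        return (pd.DataFrame(l)
                  .groupby('thing', as_index=False)['count']
                  .sum()
                  .to_dict('records'))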


That's even worse. You don't need to throw in a massive library like pandas, especially not when OP is building a basic API. Sure, if you're in a data science-y project where pandas is already a dependency, go nuts.


It really depends on the size of the list you want to process. If it's 10 items, pandas is overkill (and probably slower). If it's a million items, pandas is a great solution.

I have a nagging feeling there is an easier way to do this, but my quick and dirty solution was

    from collections import defaultdict

    def merge_list1(l):
        # accumulate counts per 'thing', then rebuild the list of dicts
        other_dict = defaultdict(int)
        for t, c in ((i['thing'], i['count']) for i in l):
            other_dict[t] += c
        return ({'thing': k, 'count': other_dict[k]} for k in other_dict)
which is still readable, but probably far from optimal.
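For what it's worth, collections.Counter gets close to that "easier way" (a sketch; merge_list2 is a made-up name):

    from collections import Counter

    def merge_list2(l):
        # Counter accumulates the per-'thing' totals in one pass
        totals = Counter()
        for i in l:
            totals[i['thing']] += i['count']
        return [{'thing': t, 'count': c} for t, c in totals.items()]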


> If it's a million items, pandas is a great solution.

Possibly not even then; it depends on how much you're doing, and I feel like the topic at hand might be around that tipping point. We have some rather slow code that, when profiled, turned out to spend something like 60-70% of its time just converting between Python types and native types when moving data in and out of the dataframe.


True. If there are millions of different “things”, conversion times will end up dominating. If there are just a handful, then the libraries will be able to do a lot more work with parallel operations, and converting the output will be very quick.
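One way to feel out where that tipping point sits is a quick micro-benchmark along these lines (a rough sketch; the function names, sizes, and data shape are assumptions, not from the thread):

    # compare few distinct "things" vs many, at a fixed total list size
    import timeit
    from collections import defaultdict
    import pandas as pd

    def bench_python(l):
        # plain-Python accumulation, as in the defaultdict version above
        d = defaultdict(int)
        for i in l:
            d[i['thing']] += i['count']
        return [{'thing': k, 'count': v} for k, v in d.items()]

    def bench_pandas(l):
        # dataframe round-trip: list of dicts -> groupby/sum -> list of dicts
        return (pd.DataFrame(l)
                  .groupby('thing', as_index=False)['count']
                  .sum()
                  .to_dict('records'))

    for distinct in (100, 1_000_000):
        data = [{'thing': i % distinct, 'count': 1} for i in range(1_000_000)]
        for fn in (bench_python, bench_pandas):
            print(distinct, fn.__name__, timeit.timeit(lambda: fn(data), number=1))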



