It has made me stop using Google and StackOverflow. I can look most things up quickly without having to rubber duck with other people, and thus I am more efficient. It is also good at spotting bugs in a function if the APIs are known and the API version is something it was trained on. If I need to understand what something is doing, it can help annotate the lines.
I use it to improve my code, but I still cannot get it to do anything that is moderately complex. The paper tracks with what I've experienced.
I do think it will continue to rapidly evolve, but it probably is more of a cognitive aid than a replacement. I try to only use it when I am tight on time or need a crutch to help me keep going.
I cannot tell you the number of times I thought I invented something new and novel only to later find out it already existed. So, while it is true that you can sometimes find paths untraveled, many things related to first principles seem already heavily explored in CS.
That sounds like it could be interpreted two ways. On one hand, you’re not the first to discover something; on the other, your invention is validated as being worthwhile.
In times like those (depending on how much work I put into it), I might retrace the steps I took when I was searching for a solution before writing my own. If I can find something like a Stack Overflow post, I add a link there to the ultimate solution. Or I blog about it using the search terms I originally tried.
A core part of science is reproducing others' work.
Also from HCI one thing I took away from research on brainstorming: slight variations and deviations can be novel and produce a better outcome. The research here is that people misunderstanding someone else’s idea isn’t a problem, but rather generates a brand new alternative. If you feel you’ve redone some work, look a little closer, perhaps something small about it is novel or new.
Looking at this thread, it is a wonder that any PRs make it through review. I started calling these kinds of debates Holographic Problems.
- Spaces vs Tabs
- Self documenting code vs documented code
- Error Codes vs Exceptions
- Monolithic vs Microservices Architectures
- etc.
Context matters and your context should probably drive your decisions, not your personal ideology. In other words, be the real kind of agile; stay flexible and change what needs to be changed as newly found information dictates.
Maybe you'd know, but why would one choose to not sort favoring larger counts and drop the bottom half when full? It may be obvious to others, but I'd be curious.
The guarantees would not hold, I'm pretty sure ;) Maybe one of the authors could chip in, but my hunch is that with that you could actually introduce arbitrarily large errors. The beauty of this algorithm really is its simplicity. Of course, simple is.. not always easy. This absolute masterpiece by Knuth should demonstrate this quite well:
It's an absolutely trivial algorithm. Its average-case analysis is ridiculously hard. Hence why I think this whole Ordo obsession needs to be refined -- worst-case complexity often has little to do with real-world behavior.
Worst case complexity matters when the input data can be manipulated by someone malicious, who can then intentionally engineer the degenerate worst case to happen - as we have seen historically in e.g. denial of service attacks exploiting common hash table implementations with bad worst case complexity.
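To make the degenerate case concrete, here is a rough sketch in Python. `CollidingKey` and `count_comparisons` are made-up names for illustration, and forcing every hash to 0 is a simplification: a real attacker would instead craft inputs that collide under the actual hash function.

```python
class CollidingKey:
    """Hypothetical key type whose hash is always the same value."""
    eq_calls = 0  # counts equality checks triggered by dict probing

    def __init__(self, value):
        self.value = value

    def __hash__(self):
        return 0  # every key lands in the same probe sequence

    def __eq__(self, other):
        CollidingKey.eq_calls += 1
        return isinstance(other, CollidingKey) and self.value == other.value


def count_comparisons(n):
    """Insert n colliding keys into a dict and return how many __eq__ calls it took."""
    CollidingKey.eq_calls = 0
    table = {}
    for i in range(n):
        table[CollidingKey(i)] = i
    return CollidingKey.eq_calls
```

With well-spread hashes the comparison count stays close to linear in `n`; with every key colliding, each insert has to walk past the earlier keys, so doubling `n` should roughly quadruple the work.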
No, you're throwing away a random selection of 50/50. You would have to flood the algorithm with uniques or commons to set the algorithm to a probability of a known state.
You want every distinct item to have the same chance at the end. So when items repeat you need to reduce (not increase) the odds of keeping any given occurrence.
Let’s prove it by contradiction:
Let's say you pick the larger ones and drop the smaller ones every single round. Then you have lost the probabilistic guarantee of 1/2^k that the authors show, because the most frequent words will be the most frequent in subsequent rounds as well. This is the intuition; the math might be more illuminating.
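For anyone following along, here is a minimal sketch of how I understand the coin-flip version, assuming the algorithm under discussion is a distinct-count estimator with a bounded buffer. The names `estimate_distinct`, `capacity` and `rng` are mine, and looping on the halving step when the buffer stays full is my own simplification, not necessarily what the authors do.

```python
import random


def estimate_distinct(stream, capacity, rng=None):
    """Estimate the number of distinct items in `stream` with a bounded buffer."""
    rng = rng or random.Random()
    p = 1.0       # probability that any already-seen item is currently in the buffer
    kept = set()
    for item in stream:
        # Forget any earlier occurrence so repeats earn no extra weight...
        kept.discard(item)
        # ...then re-admit this occurrence with the current probability.
        if rng.random() < p:
            kept.add(item)
        # Buffer full: flip a fair coin per item, ignoring how often it appeared.
        while len(kept) >= capacity:
            kept = {x for x in kept if rng.random() < 0.5}
            p /= 2  # after k halvings, each item survives with probability 1/2^k
    return len(kept) / p
```

Because the buffer holds no counts at all, there is nothing to sort by: every distinct item ends up in `kept` with the same probability `p`, which is what lets `len(kept) / p` work as an estimate. If the buffer never fills, `p` stays at 1 and the result is the exact distinct count.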
I was taught the universe was a computer in 3 different college courses 25 years ago. Not exactly a revelation.
"According to the best physics, the universe: isnt programmable; doesn't evolve deterministically; isnt described by functions over integers; isn't electronic; doesnt transmit power through programmable operation; isn't abstract; doesnt have causal powers through mere arrangements of parts; .. and so on."
You can't say any of these things because electronic computers exist within the universe, but I get what your point is. That said, there are physicists beginning to assume the universe IS a computer and work forward from that assumption.
The problem exists for corporations as well. A teenage girl, who was a former classmate and friend of my daughter, was murdered at work because she turned down a man's advances. She made formal complaints, but the bureaucratic corporate processes made it difficult to protect her or sufficiently separate her from the harasser/murderer. Even if the state is an at-will state, corporate policies and mismanagement often handcuff those trying to rectify situations before they get out of hand.
To look and feel important. I worked for a company that moved to a building simply because the CEO wanted the sign to be seen from a major highway. My 20-minute commute would have become an hour or more. I left upon hearing that rationale.
AMD’s hardware is essentially irrelevant for non-gaming applications because the software stack isn’t there. Unfortunately, the same goes for just about everyone BUT NVIDIA. It’s a situation that could be fixed, but it would require cooperative investment from, well, the rest of the semiconductor industry.
The fact that a large number of people care about a small number of ML models right now means that AMD has an opportunity to port just a few things and take market share.
What Amazon is missing is buyers. Actual professional buyers who evaluate products and determine whether they meet standards and deserve to be placed in front of customers.