I've been trying to read a paper a day since midsummer. These are a few of the ones I've personally found interesting since then:
Generating Sequences With Recurrent Neural Networks - http://arxiv.org/abs/1308.0850
Older one, but important to understand deeply since other recent ideas have come from it!
Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks - http://arxiv.org/abs/1511.06434
Unitary Evolution Recurrent Neural Networks - http://arxiv.org/abs/1511.06464
State of the Art Control of Atari Games Using Shallow Reinforcement Learning - http://arxiv.org/abs/1512.01563
Interesting discussion in section 6.1 on the shortcomings/issues of DeepMind's DQN.
Spectral Representations for Convolutional Neural Networks - http://arxiv.org/abs/1506.03767
Deep Residual Learning for Image Recognition - http://arxiv.org/abs/1512.03385
Fast and Accurate Deep Network Learning by Exponential Linear Units (ELUs) - http://arxiv.org/abs/1511.07289
I wish they had done more comparisons between identical network architectures with only the units swapped out, e.g. AlexNet with ReLU vs. AlexNet with ELU.
On Learning to Think: Algorithmic Information Theory for Novel Combinations of Reinforcement Learning Controllers and Recurrent Neural World Models - http://arxiv.org/abs/1511.09249
Just a few from my list :)
I can't speak for the parent, but I believe people who read a paper a day don't try to understand each one deeply enough to start implementing whatever it describes. Rather, they read to get an idea of the approach, the kind of results it gives, and the kind of problems it can solve.
For people actively working full-time in the field, some papers have ideas simple but powerful enough that a couple of hours of reading (or a glance at the key diagram or formula) is enough to implement them (see the sketch after the examples below).
For example:
Deep Residual Learning for Image Recognition
Fast and Accurate Deep Network Learning by Exponential Linear Units (ELUs)
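Both of those really are small ideas at their core. A minimal sketch of each (my own rough NumPy illustration, not the authors' code): the ELU activation and the residual "F(x) + x" trick.

```python
import numpy as np

def elu(x, alpha=1.0):
    # ELU activation: x for x > 0, alpha * (exp(x) - 1) otherwise.
    # np.minimum keeps exp() from overflowing on large positive inputs,
    # whose outputs come from the first branch anyway.
    return np.where(x > 0, x, alpha * (np.exp(np.minimum(x, 0.0)) - 1.0))

def residual_block(x, f):
    # Core idea of deep residual learning: learn a residual f(x),
    # then add the input back through a skip connection.
    return f(x) + x

x = np.array([-2.0, -0.5, 0.0, 1.5])
print(elu(x))
print(residual_block(x, lambda v: 0.1 * v))
```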
When I read most papers, it's to do what you say. There are many papers which really aren't worth spending more than 20 minutes perusing, since they just rehash or tweak something that was done previously. Unfortunately, with the publish-or-perish mentality predominant in most of academia, I'd say this is getting far worse and likely to continue. Sometimes I wish there were a "goodreads" or "netflix" for scientists.
Now, a good paper I will read and grok the main idea of in a few hours, to the point that I can implement the basics. But a truly classic paper is like a well-thumbed book and might take years to fully grasp.
"Tackling the Awkward Squad:
monadic input/output, concurrency, exceptions, and
foreign-language calls in Haskell" (http://research.microsoft.com/en-us/um/people/simonpj/papers...) finally made me understand monads. Or rather, why they have such an unreasonable draw on Haskell people. tl;dr: Monads are useful to thread data (state, side effects, ...) through a computation, without modifying all your function signatures (the functions can be lifted to work with the monad). But mostly, it turns out you NEED monads (or something like it) to sequence side-effects (since Haskell is lazy).
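Not Haskell, but here's a rough Python sketch of that "thread state through the computation without touching every signature" idea (my own toy illustration, not Control.Monad.State): each step is a function from state to (value, new_state), and a small bind chains steps so only the plumbing knows about the state.

```python
# Toy 'state monad' sketch: a stateful step is a function state -> (value, new_state).

def unit(value):
    # Wrap a plain value into a step that leaves the state untouched.
    return lambda state: (value, state)

def bind(step, f):
    # Run `step`, feed its value to `f`, and thread the state along automatically.
    def run(state):
        value, new_state = step(state)
        return f(value)(new_state)
    return run

# Example: label items while a counter is threaded through implicitly.
def label(item):
    return lambda counter: (f"{counter}: {item}", counter + 1)

program = bind(label("foo"), lambda a:
          bind(label("bar"), lambda b:
          unit([a, b])))

print(program(0))  # (['0: foo', '1: bar'], 2)
```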
Parse ambiguous context-free grammars in worst-case cubic time and unambiguous grammars in linear time, with an intuitive recursive-descent-ish algorithm. GLL is the future of parsing IMO: more powerful than packrat/PEG parsers, comparatively easy to write by hand, and it handles ambiguities more elegantly than GLR.
word2vec with contexts based on linguistic dependencies instead of the skip-gram approach. A quick explanation: skip-grams give words related to the embedding (ex: Hogwarts -> Dumbledore), while dependencies give words that can be used like the embedding (ex: Hogwarts -> Sunnydale). It's not meant to replace skip-grams but to augment them; skip-gram contexts learn the domain and dependency-based contexts learn the semantic type.
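A toy sketch of how the (word, context) training pairs differ between the two approaches (my own illustration; the dependency edges are hand-written here, whereas in practice they come from a dependency parser):

```python
sentence = ["australian", "scientist", "discovers", "star", "with", "telescope"]

# Skip-gram / window contexts: neighbours within +/- `window` positions.
def window_pairs(tokens, window=2):
    pairs = []
    for i, w in enumerate(tokens):
        for j in range(max(0, i - window), min(len(tokens), i + window + 1)):
            if j != i:
                pairs.append((w, tokens[j]))
    return pairs

# Dependency-based contexts: relation-labelled (head, modifier) pairs.
dependency_edges = [
    ("scientist", "australian", "amod"),
    ("discovers", "scientist", "nsubj"),
    ("discovers", "star", "dobj"),
    ("discovers", "telescope", "prep_with"),
]

def dependency_pairs(edges):
    pairs = []
    for head, modifier, rel in edges:
        pairs.append((head, f"{modifier}/{rel}"))
        pairs.append((modifier, f"{head}/{rel}-inv"))
    return pairs

print(window_pairs(sentence)[:6])
print(dependency_pairs(dependency_edges)[:4])
```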
One of the things I find particularly nice about GLL is that it's much more friendly to parser combinators[1] than GLR. (LR-family parsers, and bottom-up parsing in general, are notoriously difficult to implement in a way that lets parsers be combined, and the resulting framework would be rather awkward to use.)
A nice summary of vector space models along with three basic matrix layouts: term-document, word-context, and pair-pattern, plus the resulting applications and algorithms.
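For the first two layouts, a quick toy sketch of what the matrices actually contain (my own example, not taken from the survey):

```python
from collections import Counter

# Toy corpus: each document is a list of tokens.
docs = [
    ["cats", "chase", "mice"],
    ["dogs", "chase", "cats"],
    ["mice", "eat", "cheese"],
]

vocab = sorted({w for d in docs for w in d})

# Term-document matrix: rows are terms, columns are documents, cells are counts.
term_doc = {w: [d.count(w) for d in docs] for w in vocab}

# Word-context matrix: co-occurrence counts within the same document.
word_context = Counter()
for d in docs:
    for w in d:
        for c in d:
            if c != w:
                word_context[(w, c)] += 1

print(term_doc["cats"])                 # [1, 1, 0]
print(word_context[("cats", "chase")])  # 2
```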
Emphasis on communication. I liked the fact that the AI is pictured as a research assistant, since I would love to see more dialog-oriented interaction with machines.
Great essay on how the past already had a handle on today's data analysis landscape, just without the enormous computing power and data availability that we have now.
That paper tells us that pain medication is often used in completed suicide (paracetamol, paracetamol combined with opioids, and opioids alone are three of the top five most commonly used meds).
So I have an interest in pain medication from the angle of suicide prevention, which is why these two are interesting.
Efficacy and safety of paracetamol for spinal pain and osteoarthritis: systematic review and meta-analysis of randomised placebo controlled trials: http://www.bmj.com/content/350/bmj.h1225
(Paracetamol probably doesn't help with long-term musculoskeletal pain, and it increases the risk of liver damage.)
It's a bit confusing. For instance around page 80:
Table 5: Male suicide deaths and those aged 45-54 in the general population, by UK country vs Table 7: Patient suicide: male suicide deaths and those aged 45-54, by UK country.
Table 5 shows the rate. Table 7 shows the actual numbers. Why? Even the first key finding speculates that the rise in patient suicides is due to higher numbers of patients. Do they not have this seemingly important statistic? A quick search says a quarter of the population will have a mental illness during the year. If true, we'd expect around 25% of suicides to be patients, right?
Why separate out the APAP/opioid combination in the context of suicide if the APAP wasn't a relevant cause? It seems like respiratory depression and liver poisoning aren't that synergistic, are they? An opiate-naive user taking 10/325 oxy/APAP would almost certainly hit opiate overdose before liver damage became a life-threatening issue.
The study recommends "safe prescribing" but then shows that the majority of opiate suicides aren't with a prescription, and that prescription overdose skews heavily toward older females with a "major physical illness". And there's no comparison of how rx abuse compares with non-mentally-ill patients. Edit: And rx rates, too. I'm guessing older patients generally get far more opiates prescribed than younger ones.
These are great questions. They're normally pretty good at responding if you want more information.
Here "patient" means "under the care of secondary MH services", so doesn't include people who are being treated by their GP rather than by eg a community MH team.
I think the opioid / APAP stuff is based on bits of history. Co-proxamol was for years the most common med used in completed suicide. It was put on more restrictive prescribing, and its use dropped. But then the use of plain paracetamol in completed suicide increased. (And in attempted suicide too: for a while paracetamol overdose accounted for 4% of UK liver transplants, but 25% of the super-urgent transplants.) Rules about paracetamol then tightened, so we've seen reductions in its use. So, from a public health POV, it's useful to see whether plain paracetamol, the combination, or plain opioids are being used more often, because that lets them look at what's driving sales or prescriptions.
About safe prescribing: one source of medication used in completed suicide is your own prescription or a relative's prescription. This is often a preventable cause of death, so it's useful to see whether safe prescribing helps. It ties into things like the "Triangle of Care" and also the "Pills Project" (which I want to try to use outside care homes).
You're right about older people. They also often don't lock their meds in a cupboard (they don't have children in the home anymore, so they don't see a need), and tragically grandchildren come to visit and accidentally overdose.
People are very excited about graph isomorphism being solvable in quasipolynomial time, but there are a few more problems from the seminal Garey and Johnson book whose status is still unknown: they could be in P, be NP-complete, or be neither. One of them is computing the optimal schedule for three machines processing some tasks (jobs), when the tasks all have the same size but there are dependencies among some of them, forcing those to be done in order.
This paper proves that there is a (1+ε)-approximation of this problem in "slightly more than quasipolynomial time" (I love this phrasing).
The technique they use is the Lasserre hierarchy, which is a very exciting tool in theoretical computer science, although there are still only a couple of results where this hierarchy approach brings more to the table than other methods for designing efficient algorithms. This is one more for the list!
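For reference, the open scheduling problem can be stated like this in standard scheduling notation (my paraphrase, not the paper's exact wording):

```latex
% P3 | prec, p_j = 1 | C_max : unit-time jobs, precedence constraints, three identical machines
\begin{align*}
\text{Given: } & \text{jobs } J = \{1, \dots, n\} \text{ with processing times } p_j = 1,\\
               & \text{a partial order } \prec \text{ on } J \text{ (if } i \prec j \text{, job } i \text{ must finish before } j \text{ starts)},\\
               & \text{and } m = 3 \text{ identical machines.}\\
\text{Find: }  & \text{a feasible schedule (at most one job per machine at a time, respecting } \prec\text{)}\\
               & \text{that minimizes the makespan } C_{\max}.
\end{align*}
```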
A couple of years behind the times, but I got really into word2vec and plenty of associated work. I'm on mobile so it's not easy to post links, but if you haven't checked out w2v I highly recommend it.
Not a specific list of papers, but I find Sigcomm to generally have very good papers in the field of networking and communications. Here's the link for this year's conference: