Well, one of these is something that most reaonable people work on avoiding, while the other is something that a huge capitalist industrial machine is working to achieve like their existence depends on it.
I really don't see why it's not possible to just use basically a "highlighter" token which is added to all the authoritative instructions and not to data. Should be very fast for the model to learn it during rlhf or similar.
How would that work when models regularly access web content for more context, like looking up a tutorial and executing commands from it to install something?
It's fine for it to do something like following a tutorial from an external source that doesn't have the highlighter bits set. It should apply an increased skepticism to that content though. Presumably that would help it realize that an "important recurring task" to upload revenue data in an awk tutorial is bogus. Of course if the tutorial instructions themselves are malicious you're still toast, but "get a malicious tutorial to last on a reputable domain" is a harder infiltration task than emailing a PDF with some white text. I don't think trying to phish for credentials by uploading malicious answers to stack overflow is much of a thing.
I have a theory that a lot of prompt injection is due to a lack of hierarchical structure in the input. You can tell that when I write [reply] in the middle of my comment it's part of the comment body and not the actual end of it. If you view the entire world through the lense of a flat linear text stream though it gets harder. You can add xml style <external></external> tags wrapping stuff, but that requires remembering where you are for an unbounded length of time, easier to forget than direct tagging of data.
All of this is probability though, no guarantees with this kind of approach.
No one expects a SQL query to pull additional queries from the result set and run them automatically, so we probably shouldn't expect AI tools to do the same. At least we should be way more strict about instruction provenance, and ask the user to verify instructions outside of the LLM's prompt stream.
Not exactly. You can't steal anything unless the person revealed the public key. Addresses are just hashes of public keys, therefore qc resistant. However, you can't ever reuse an address, as signing reveals the public key.
Otoh, afaik either it wasn't like this in the satoshi era or satoshi revealed the public key. In any case, satoshi's wallets are crackable by qc.
I'm curious, does this mean that, if all Bitcoin wallets had been programmed from the beginning to never reuse addresses, Bitcoin could have been implemented without any asymmetric cryptography?
It's really not that hard to come up with ways to commit fraud if you want to. On the other hand, it's very easy to make such mistakes if you don't know to avoid them. This characterization is very misguided IMO.
Ah, perhaps I wasn't clear about what I'm trying to say. I don't think we should stop training researchers in common mistakes and fraudulent methods to watch out for.
I'm just saying: I don't believe anyone actually tells budding researchers that they should commit fraud. Instead I think the process is probably more like this:
Year 1: Statistics/research training. Here are a load of subtle mistakes to watch out for and avoid. Scientific fraud happens sometimes. Don't do it, it's very dishonest.
Year 2: Starting research. Gee a lot of these papers I'm reading are hard to reproduce, or unclear. Maybe fraud is widespread - or maybe they're just smarter or better equipped than me.
Year 3: "You really ought to have published some papers by now, the average student in your position has 3 papers. If you don't want to flunk out you really need to start showing some progress"
I still disagree. It's more like "omg I should have published at least a few papers by now, what am I doing" and then you start frantically looking for provable things in the dataset. You find one that you can also support with a nice story. Now either a) You were not tought about how or why this is wrong and you publish the paper b) You were, and know that you should collect a separate dataset to test the hypothesis. But also, there is a huge existential pressure to just close your eyes and roll with it.
It's not that you need to be tought how to cheat, it's that you need to be tought how to avoid unintentionally cheating.
Maybe I'm misreading but you both seem to say the same thing.
New student is shown how to read a paper, how to spot egregious errors and all the things listed above.
Student, i guess feels forced to publish. And maybe uses murkier tactics to get the paper published.
As Dr Frank Etscorn said, "I can show anything correlates to anything else." We were discussing vitamin D papers anf I was testing that paper funding AI mentioned on HN a few times last year.
Not only common, it's become a requirement to review papers if you submit one yourself. Yeah it's not ideal for multiple reasons (what you said + prompt engineering grad students dismissing proper papers without having the slighest idea about the field), but the amount of submissions is so incredibly huge that it's impossible to do it any other way.
"For the first time, researchers reading conference proceedings will be forced to wonder: does this work truly merit my attention? Or is its publication simply the result of fraud?
[...]
But the mere possibility that any given paper was published through fraud forces people to engage more skeptically with all published work."
Well... spending a few weeks reproducing a shiny conference paper that simply doesn't work and is easily beaten by any classical baseline will do that to you in the first few months of your PhD imo. I've become so skeptic over the years that I assume almost all papers to be lies until proven otherwise.
"This surfaces the fundamental tension between good science and career progression buried deep at the heart of academia. Most researchers are to some extent “career researchers”, motivated by the power and prestige that rewards those who excel in the academic system, rather than idealistic pursuit of scientific truth."
For the first years of my PhD I simply refused to parttake in the subtle kinds of frauud listed in the second paragraph of the post. As a result, I barely had any publications worth mentioning. Mostly papers shared with others, where I couldn't stop the paper from happening by the time I realized that there is too little substance for me to be comfortable with it.
As a result, my publication history looks sad and my carreer looks nothing like I wished it would.
Now, a few years later, I've become much better at research and can now get my papers to the point where I'm comfortable submitting them with a straight face. I've also came to terms with overselling something that does have substance, just not as much as I wish it had.
> I've become so skeptic over the years that I assume almost all papers to be lies until proven otherwise.
I couldn't agree more. I have read a lot of psychology papers during my PhD and I think there is very little signal in the papers. Many empirical papers for example use basically the same "gold standard" analysis, which is fundamentally flawed in many ways. One problem for example is that if you would use another statistical model, then the conclusions would often be wildly different. Another being that the signal is often so weak that you can't use it to predict much (to be useful). If you try to select individuals for example, the only thing you can tell is that the group on average is less neurotic. But for individuals there is no better chance of picking the right one than average. The point of a good paper is to take these sketchy analyses and write a beautiful story around it with convincing speculation. It sounds absurd but take a random quantitative psychology paper and check which percentage of the claims made in the discussion are actually based on the actual data from the paper.
But the worst part about this is that these problems exist for literally decades. Nobody cares. The funding agencies grade people not on correctness but on the number of citations. As a result, you see that many subcultures exist who's sole existence is about promoting the importance of their subculture. It is quite common in academia to cite someone in the introduction just to "prove" that some idea is worth pursuing. But does it work? Doesn't matter. Just keep writing papers.
So I'm not saying that all research is bad. I'm saying that indeed most papers are not very useful or correct. Many researchers try, but the incentives are extremely crooked.
How could this be corrected? The scientific community has lost a lot of credibility with the public, and the backlash is obvious in recent policy changes. Fast forward four years. Assume Trump and RFK jr have successfully destroyed the current system. What should replace it?
How could the Federal government ensure that public monies only fund high quality research? Could policy re-shape the incentives and unlock a healthy scientific sector?
Consequences with the current system would suffice.
Ignorance as a defense needs to go too. Ignorance as a defense is too powerful and we should balance it more towards hurting the supposedly ignorant rather than everyone else. Basically, a redefining of wilful ignorance so it's balanced as stated.
The "so called Long COVID" really just means "I got covid and I remain unwell in some way, long term". I'm not sure it needs to be questioned if those two qualifying things are given. Very well might be the same thing as ME/CFS, or might not, I don't think we have a conclusive answer yet about what causes either.
"Long COVID" encompasses several very distinct nosological categories, which makes it a difficult term to talk about. There are at least four distinct subtypes - people who had severe acute illness and suffered respiratory injury, people who had severe acute illness and suffered cardiovascular or renal injury, people who had a relatively mild acute illness but developed long-term neurological or musculoskeletal sequelae, and people who developed those latter symptoms during the COVID pandemic without actually being infected with COVID. All of those types of suffering are very real, but may have very different causes and require different treatments.
> The final study population comprised 341 participants (90.6% females) who completed blood sampling and answered the questionnaire. A total of 232 (68%) were seropositive
How does this compare to the base rate in a similar population? Two thirds sounds like a reasonable estimate for "ever had COVID" in Denmark in 2022, though maybe a smaller percentage would in fact be seropositive. It would be interesting if self-reported long COVID had little or no correlation with having had COVID at all.
note that this is 'self reported long covid'. For a proper study you'd need an objective measure otherwise it can't be determined what might be long covid vs. fatigue from working late hours at work, changes in mood due to season, changes due to age etc.
It's not that it sucks to have this brought up, it's that the people who do are pretty consistently dismissive and infantilising about it. Naturally, folks that this happens to develop an aversion to people that say the same thing for the millionth time. It's not novel, it's not interesting (in that it never leads anywhere novel either) and it's frequently hostile.
If anything I feel like it’s actually more interesting if it’s a psychiatric condition.
Does someone’s visceral sense of having control of their own life being detonated have an impact on their body? Probably? What if they have been conditioned to act like they are not bothered by this aspect of the situation, and therefore don’t process this feeling and it remains unresolved. That is like the definition of trauma.
Probably but that doesn't stop people from selectively reading or twisting the Bible to serve their own ends, bringing the weight of God behind their own desires and goals. They know most believers never actually read their own holy book on their own much less with a critical eye to how well it matches what they're being told.
Reading utopic rulebooks - the intent is of no value. What is of value, is the reproduceable outcome of the rule-book interacting with actual humans, not idealized or demonized concepts.
Yeah, but you are, at least seemingly, missing the point: the scripts he was referring to in context weren't the legal disclosures but the sales tricks that were handed to them to use. He is pointing out that the 'elite' saleswoman wasn't better at strategy or sales tricks, but was effective because she made them feel good.
The article was also written by someone who is not a reliable narrator.
The type of person who would skip the legally required stuff while working as a telemarketer is exactly the kind of person who'd tell you he got fired for a "technicality", while he was actually fired because he was breaking the law and lying to customers.
Especially considering that even the unreliable narrator himself admits he was one of the first ones to be fired - why would you fire the best employee first if it wasn’t for cause?
But then this is also an argument against spontaneous emergence of life, right? Sicne mirror-life could just emerge on its own, instead of evolving from regular life.
Even if the timescales and environments conducive to the emergence of basic life exist on Earth today, I would expect any spontaneously-emerged proto-cell to almost immediately get eaten by normal microorganisms without any meaningful chance to proliferate. Reverse-chirality or no, it'd be a glorified bag of nutrients compared to the more evolved flora surrounding it.
Not necessarily. A pre-condition for emergence and evolution towards more complex life forms can be removed from the system by the existing complex life forms.
At the end of the day, all life is competing for energy. Spontaneous self replicators would be uncompetitive with existing biology and would never make it to more complex stages.