Hacker News
Who cares about plagiarism? (worksinprogress.co)
41 points by apophatic on Feb 2, 2022 | 66 comments


The entire point of academia is to extend the current state of knowledge. When judging whether a paper extends knowledge, the reader must understand 1) what the prior work is and 2) how the author has extended it. Without citations of prior work, how is the reader supposed to tell what the author has done that is new?

Not only that, reproducibility is core to research. Without having detailed records of where ideas are originating (i.e., citations) research is much less reproducible.

There are actually many examples of papers getting accepted because the authors claim something new by using new terminology, but ignore all prior work in the field. The reviewers are not experts in that subfield, so they accept the paper without realizing it is not novel.


> The entire point of academia is to extend the current state of knowledge.

Maybe it once was, but for the vast majority of people who enter academia the entire point is to increase their take-home pay throughout their life.

Academia isn't some noble endeavor but a desperate game that people play in an attempt to win enough money that they can comfortably live their meager lives and hopefully raise their children to have better lives than they had.

Until economic necessity is decoupled from participating in the academic process, the idea that academia serves some kind of abstract goal like the pursuit of knowledge will remain an elaborate lie that people tell themselves to justify their life choices regarding academia.


Academia is a rather poor choice if your goal is income (industry pays much better). I agree that academia isn't as idealistic as one often presents it, but the principal currency in academia isn't money but prestige. Plagiarism and other forms of academic fraud are largely motivated by an attempt to look smarter and more productive than one is in order to win recognition from one's peers.


I'm curious: are there any references to cite for your generalization? I am in desperate agreement with you, but sources are needed! (Irony unintended)


Requiring citation is not counteracting "open science", in fact most of the "open science" movements _require_ citations, e.g. if you use an open dataset, you have to cite it.

I really don't get the point the article is trying to make. Copying others' ideas and introductions and methods is practically a core tenet of science and every paper out there does it. No one is discouraging you from copying anyone else's articles (publishers would be another topic), they're just asking for a citation. Is a citation so hard?

If you claim that requiring citations is not "open" then practically no software license is "open".


I think the article is making a point through hyperbole. Citations have become a scoreboard on the back end. On the front end they've become more important than the content itself.

A bad citation is viewed as some sort of academic crime. Bad content -- if actually judged -- is largely ignored.

Perhaps I'm just scarred from my student days when I put a superscript at the end of the wrong sentence, which left the following sentence uncited gasp. A big deal was made of this that I was plagiarizing the material. They were also upset that I wasn't visibly concerned enough about this high crime and that I only viewed it as a typo. Seriously, I added the superscript one sentence early. There was never any feedback on the work itself.


> I really don't get the point the article is trying to make.

It is essentially in agreement with your position here.


The core problem AFAICT is the academic world's obsession with publication in the first place. If publication is the priority and at the heart of the reward system then plagiarism is a big deal. If the goal is producing useful results then it would be less of a big deal, though attribution still seems appropriate. If they would embrace replication studies as much as novelty this would also be less of an issue.

Seriously, the world doesn't need papers for the sake of papers. It needs useful information and confirmation that information published is useful. Or if not directly useful, at least accurate as documenting something.


Plagiarism would still be a problem in a world without papers for the sake of papers. Someone I knew abandoned his PhD after plagiarism of his unpublished work-in-progress rendered his largely-completed thesis effectively useless. He never really found his way after that, and although his suicide had a number of causes, this did not help.

Maybe you are thinking that the PhD mill is itself problematic, but there comes a point where saying "if things were different, this would not be a problem" becomes a way to avoid the issue.


Papers are not written for the sake of writing papers. Papers are written for the sake of getting grants. Publishing is a proxy for increasing your chances of getting grants. Papers with more citations get bigger grants. The whole publish or perish idea is working at the wrong level; it's really get grants or perish.

The reason no one does replication studies is that no one offers grants to fund them. If you want replication studies, lobby to pay for them at the federal level, because that's where everyone's grants come from. If that money becomes available, you'll have as many replication studies as you could dream of.


In general, even outside of a strictly academic setting (e.g. books), there would seem to be a lot of good reasons to attribute when possible. Does that mean that every concept needs to be traced back to its origin--even if that's possible? Mostly not. But especially if you're lifting a paragraph wholesale from someone, yes you should footnote it or otherwise attribute it.

This is of course harder in other media.


As a hobbyist song writer I think about this subject often. There's also "accidental plagiarism". There are tens if not hundreds of millions of songs out there with melodies composed of only 12 distinct tones. And there's no way to check if that melody you just came up with doesn't exist already. In fact, chances are it does.

Paul McCartney said during one interview that when he wrote the melody of "Yesterday", he wasn't sure if it was his or if he had overheard it somewhere, so he spent 2 weeks playing it to various people. Consider the fact that Yesterday's melody is fairly elaborate and it was the 60s. Tens of millions of songs have been written since. The conventional 3 minute song format has simply run out of melodies. So either songs need to adopt the more complex approach of Bohemian Rhapsody (unlikely) or plagiarism is absolutely inevitable.


With the robot 9000 project I got the impression people underestimate the number of permutations and how easy it generally is to generate them on the fly.

You don't just have "only 12 tones" but also tone length, pauses, melody length, etc.
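
As a rough back-of-the-envelope sketch of that point: the 4 note lengths and the 8-note phrase below are assumptions picked purely for illustration, and rests and octaves are ignored.

```python
# Count distinct note sequences when each note has a pitch and a duration.
# Assumptions (illustrative only): 12 pitches, 4 possible note lengths,
# and a short phrase of 8 notes. Rests and octaves are ignored.

def count_sequences(pitches: int, durations: int, length: int) -> int:
    """Number of sequences of `length` notes, each with a pitch and a duration."""
    return (pitches * durations) ** length

pitch_only = count_sequences(12, 1, 8)   # pitches alone
with_rhythm = count_sequences(12, 4, 8)  # pitches combined with note lengths

print(f"pitch only:  {pitch_only:,}")
print(f"with rhythm: {with_rhythm:,}")
```

Adding just one extra dimension (note length) multiplies the count by 4 for every note in the phrase. This counts raw sequences only and says nothing about how many of them would sound like music.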


Of course. But plagiarism, or close similarity is what we're talking about here. If I take someone's melody and stretch a few tones, that doesn't really make it a different melody as viewed by law or society in general.

Regarding the permutations - the absolute majority of randomly generated melodies will sound like shit, complete gibberish. Like, think worse than impro jazz. So when you only leave the "useful" ones, that number is drastically smaller.


> If I take someone's melody and stretch a few tones, that doesn't really make it a different melody as viewed by law or society in general.

You are literally describing the technique by which Western polyphony developed new melodies!

For centuries composers took the exact intervals of a pre-existing popular melody, had one singer sing it at a glacial pace, then had one or more other singers fill in the gaps with new sprightly melodies.

For example, consider:

https://en.wikipedia.org/wiki/Missa_L%27homme_arm%C3%A9

Same title, same exact super-slow melody (a.k.a. cantus firmus), yet no plagiarism in the sense you're using it.

Josquin's mensural canon[1] flies out of neverminder's nostril at the last possible moment:

Go go go!

Goooooo!

Go Go!

neverminder sneezes

1: https://en.wikipedia.org/wiki/Missa_L%27homme_arm%C3%A9_supe...


I remember hearing an interview with Dave Brubeck on NPR, in which he mentioned that the melody of Schutz's "Oh, Sacred Head" (also found in Bach's St. Matthew's Passion) was from a folk melody, a song about a guy being jilted.


A story about what happens when very accessible recording tech, automated plagiarism-search, dirt cheap mass storage & sharing, a relatively small set of combinations of sounds that people find appealing, and long copyright terms collide.

http://spiderrobinson.com/melancholyelephants.html

IMO we're already about where this story projects we'd end up surprisingly fast under that set of circumstances, and are just kinda in denial about it.


>> melodies composed of only 12 distinct tones

Yeah yeah the "only so many notes in music" argument.

As a hobbyist songwriter I'm surprised you would say this knowing what semi-tones are, as well as how identical phrases can be entirely different melodies when the backing chords are taken into account.


12, and that includes the semitones. Very easy to verify that on a piano, wouldn't you agree? I'm talking about conventional western music here. Also, changing the backing chords doesn't actually change the melody, it just makes it sound different. I'm not a lawyer, but I don't think this kind of argument would fly in court. In fact there are historical precedents, like Huey Lewis vs. the Ghostbusters song.


Copyright only covers melody and words, not harmony. Weird quirk of our system.


Why does the distinction of "traditional western music" matter?


Traditional western music assigns 12 notes to an octave using a tuning known as Twelve Tone Equal Temperament[1]. These twelve tones include whole tones and semitones. The vast majority of western music is composed exclusively using these twelve notes.

I believe the GGP meant to refer to “microtones,” which are frequencies that don’t fall into the Twelve Tone Equal Temperament tuning system. These are commonly used in non-western music (Indian music is a typical example), and occasionally in some western music styles (blues, some jazz, some sub-genres of rock, maybe), but you’re not going to find microtones in most western music.

The use of microtones would greatly expand the set of possible melodies, but those melodies would also sound very weird to someone who has grown up listening to western music. Also, these legal cases are not decided by precisely comparing the relative frequencies of notes in two different melodies.

Relating all this to the article, I personally feel like music is not a good parallel for the topic of academic plagiarism since the idea is the whole point of academic papers, whereas the point of music is harder to pin down, but it’s certainly not only to expand the set of melodies and chords used in songs.

Anyways, for more on music theory, I’d suggest Adam Neely’s YouTube channel [2].

[1] https://en.m.wikipedia.org/wiki/12_equal_temperament

[2] such as this video: https://m.youtube.com/watch?v=ghUs-84NAAU


From your explanation, it feels like a pretty artificial constraint. Why do "western" musicians say "oh, we have no new melodies" when they simply don't include some very real possibilities, such as other tones?

It's a bit odd from my perspective


It is artificial, in the sense that the western twelve tone scale is essentially arbitrary (why not 6? why not 16?), and it is a constraint in the sense that what most people consider “in tune” is defined by what they’ve gotten used to hearing. Culturally, in western music, that is twelve notes to an octave.

People from all cultures can and do use notes from outside those twelve tones in music. In western music, sometimes you’ll notice (check out King Gizzard and the Lizard Wizard[1]), and sometimes you probably won’t (that “raspiness” in blues music is usually because they’re singing/playing a note slightly “out of tune” aka “microtonally”).

Music and “what sounds good” is a cultural construct, and I’d encourage you to check out music from around the world as well as experimental microtonal artists if you’re interested in hearing what happens when you aren’t tied to the idea of twelve notes in an octave.

[1] https://m.youtube.com/watch?v=U72rbtrufws


Personally I quite enjoy when my expectations are broken by interesting patterns in music. I enjoy a lot of 'experimental' stuff. Again this is just my personal preference, but most of the popular music in the US I find boring because it all seems to be rehashes of the same patterns, just getting louder.

I wish more people were exposed to a wider variety of types and patterns in music, there's a lot of great sounding music out there which breaks the usual mold played on radio stations


I'd guess personal preference of the musicians themselves is part of the reason microtones aren't more common in "western" music. Most of the musicians that I personally know at least seem to prefer creating/performing music that sounds good to them. And compared to non-musicians, most of them are also more aware of and bothered by notes that are "out of tune", especially ones with "classical" training.


> I wish more people were exposed to a wider variety of types and patterns in music, there's a lot of great sounding music out there which breaks the usual mold played on radio stations

The 12 tone standard didn't just come out of nowhere, it is dominant for a reason. It developed throughout the entire existence of humanity. Think about it, currently the scientific consensus is that singing actually came before speaking. It's those 12 tones that create the best sounding melodies and harmonies. The greatest composers of all time used them. Maybe you're a guy who likes impro jazz (which coincidentally uses micro tones), but most people don't.


But again, isn't this primarily a "western" phenomenon? That makes me think it isn't quite so clear cut

Side note: yes, I love love jazz in all its forms ^^


We know it goes all the way back to the ancient Greeks, don't know its prior origins or if it developed in parallel in other places.


The use of microtones, or rather lack thereof, is what I'm referring to (from a comment by pythko):

>I believe the GGP meant to refer to “microtones,” which are frequencies that don’t fall into the Twelve Tone Equal Temperament tuning system. These are commonly used in non-western music (Indian music is a typical example), and occasionally in some western music styles (blues, some jazz, some sub-genres of rock, maybe), but you’re not going to find microtones in most western music.


>>> (why not 6? why not 16?)

12 is the simplest scale whose intervals match the harmonic sequence. It's a technology.


The 12-TET certainly has some neat mathematical properties, but my point is that it’s not the only way to decide what “notes” are. It’s an interesting debate on whether there’s something fundamental to this particular approach that sounds good to the human ear, but I think it’s clear that there’s a hefty cultural component.

It’s also worth noting that modern 12-TET is not the series of whole number ratios that someone might expect from basing it off a harmonic series [1]. It’s an approximation based off a logarithmic scale. 12 Tone Just (or “Pure”) Intonation sounds pretty weird, and in my opinion, bad. If people had been making music with 12 Tone Just Intonation for the last millennium, maybe it would sound good to me!

[1] https://en.wikipedia.org/wiki/12_equal_temperament#Just_inte...
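
A small sketch of the gap between the logarithmic 12-TET steps and whole-number ratios. The just ratios used here are one common 5-limit choice, which is an assumption; other just tunings pick different ratios for some intervals.

```python
# Compare 12-tone equal temperament (12-TET) with small whole-number ratios.
# The just ratios below are one common 5-limit choice (an assumption; other
# just tunings use different ratios for some intervals).
import math
from fractions import Fraction

# interval name -> (number of semitones, just-intonation ratio)
JUST_RATIOS = {
    "major third": (4, Fraction(5, 4)),
    "perfect fourth": (5, Fraction(4, 3)),
    "perfect fifth": (7, Fraction(3, 2)),
    "octave": (12, Fraction(2, 1)),
}

def tet_ratio(semitones: int) -> float:
    """Frequency ratio of an interval in 12-TET: each semitone is 2**(1/12)."""
    return 2 ** (semitones / 12)

def cents(ratio: float) -> float:
    """Size of a frequency ratio in cents (1200 cents per octave)."""
    return 1200 * math.log2(ratio)

def deviation_cents(semitones: int, just: Fraction) -> float:
    """How far the 12-TET interval is from the just ratio, in cents."""
    return cents(tet_ratio(semitones)) - cents(float(just))

for name, (semitones, just) in JUST_RATIOS.items():
    print(f"{name:15s} 12-TET {tet_ratio(semitones):.4f}  "
          f"just {float(just):.4f}  "
          f"diff {deviation_cents(semitones, just):+.2f} cents")
```

Under these assumptions the fifth and fourth land within about 2 cents of their just ratios, while the major third is off by roughly 14 cents, and only the octave matches exactly.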


Indeed, temperament is its own technology. I assume that the music, technology, and culture co-evolved. And nothing stays constant for a millennium. Temperaments were developed and refined as music began making fuller use of the available scales and chords. Yet each system had to be within the capabilities of musicians to tune their own instruments. A harpsichord had to be tuned before every performance. Equal temperament had to wait until instruments (the modern piano) were stable enough to stay in tune between visits of the technician.

To make just temperament sound "good" even relative to the ears of another time period, probably required playing music written for that particular tuning.

As a double bassist, playing pitches repeatably enough to claim any specific temperament would be a lifelong challenge, if it's even attainable. Most of the time, musicians don't think about temperaments. We try to play in tune and sound good.


They do, but it's difficult because instruments aren't built for it, and people don't like it. You're free to use any tone you like to make the music that you want to make.


Perhaps people don't like it because they aren't familiar with it? Maybe all it would take is artists slowly incorporating new tones and patterns for people to become accustomed to them.



Looking back I meant to say "microtones" but yeah I'm pretty smug, sue me.


I hear Richard Hendricks made a website for that.


Yeah. "Anton" would be handy now. If it was real.


You know what I like? When I’m reading something and some window pops in my eye interrupting me with a beg to subscribe to some bullshit.

No wait. I hate that. Why would you think people want that? Why do we build crap like that? Why do we tolerate it? This is our community, we are the ones who build it.

Let’s take back control and crush crappy behavior like that.


You already can by blocking JavaScript execution with add-ons like NoScript. And if the site's script must be run, uBlock Origin can be configured to auto-hide elements that pop in.


Browsers have built-in settings for allowing or disallowing notification requests.


> science is owned by the community rather than any individual person

the problem is that the community is a group of many individual persons, ultimately it is the individual persons who by actively knowing the 'science' embody science itself.

the science is not exclusive to any individual or group (though nowadays, "important" science is really restricted because nobody wants the Chinese copying from them).

As I see it, the gist of the whole issue is establishing property rights over pieces of culture.

Why would anybody do this? well it seems to me that the biggest driver is that if things are exclusively owned, then they can be traded in some kind of marketplace.

The problem with this attitude is that culture (e.g. science) is not something that can be exclusively owned (a problem shared by digital things).

IMO, it all comes down to people wanting to be paid (with royalties and/or recognition) for what they did a long time ago.

with a practical intent, I think that the question which needs an answer is how can we have a marketplace to trade things which are not exclusively owned?

or maybe we should reconsider the general practice of continuing to reward people for things they already did?


Ignoring plagiarism to me is like ignoring broken windows and other minor crimes. At first it may sound practical but the end result is not good.


A horrible article that rationalizes the collectivism that is also prevalent in software these days:

10% of productive people create 90% of everything. The 90% unproductive people engage in ideology, politics, defamation and rationalize why the output is actually theirs. They are the public face of the stolen economic or intellectual output.

As a productive person, just say "no"!


It's about the cost of credit. References are free in academia. In music crediting means paying royalties. In academia the assumption is that citing is fair use up to a point (a paragraph? a page?). Since songs are much shorter, the window for "fair use" is also much shorter (5 seconds? 10 seconds?).


I think this article would be better titled "Who cares about plagiarism in science publishing"? It gets off on the wrong foot by starting with an anecdote about Bob Dylan, when that really has nothing to do with the rest of the article. Music and science can produce very different arguments about why you should care/not care (e.g. the argument that plagiarism allows bad actors to get ahead in academia and then pollute science makes no sense with respect to folk music). Yet another case with its own set of arguments would be plagiarism in a school setting.


Perhaps it would be useful to think about plagiarism in quantitative terms, rather than yes/no. There are hundreds, if not thousands of papers published in obscure journals that are word-for-word copies of other people's work. And even more papers where entire figures and paragraphs were lifted without attribution (or modification). I suppose one might argue that this is not a problem, because the journals are typically obscure (more reputable journals now run software to look for duplication), but I think that taking someone else's published work and calling it your own damages science. As does re-publishing the same results multiple times under slightly different titles (it's more difficult to refute a finding if it has been "replicated").

There is no doubt that "plagiarism" can be ambiguous ("self-plagiarism??") and copying with modification can be justified, but there are much more blatant examples that I think clearly cause harm.


Grant that the obsession people have with citing everything - even basic stuff that everyone knows, like old poetry - is quite absurd. Just put it in italics or quotation marks or something. Either people are smart enough to figure it out, or they're too stupid.

That being said, I think there is the broader issue of discoverability. As a practical matter, it's very convenient, if I have a paper on X, that I can look at the citations and get good links to resources on Y. This isn't really a matter of making the author of this or that paper happy either, it's about giving the reader some useful information.

And I just don't see how to replicate that without citations.

That's not to say the social science style (Davis, 1953) isn't obnoxious, though - there it really seems (Johnson, 2010) to be more about name-dropping the big guys (Smith, 2007), than about putting a small unobtrusive number in a superscript.


I certainly don't, I view everything I've ever written as MIT-equivalent. Feel free to copypaste what I've written without my express permission or rewrite it as you see fit. Does not matter to me whether this is to create spammy blogposts, commercial ventures, or for your school essay.


This. I don't publish code I think is useful. I realize that's contrary to the open source ethos, and that many people would consider it hypocritical that I would then use open source without compunction - but I view open sourcing as a choice, and closed-sourcing as an equally valid choice.

What I would never do, or never use, are restrictive/copyleft licensing terms. There don't begin to be enough hours in the day to spend even one second arguing about copyright with geeks. You get dirty, and the pigs like it, etc.


> What I would never do, or never use, are restrictive/copyleft licensing terms.

Maybe just consider them a form of closed-source (where you can look at the source) instead of raging about them?


I view my statement and your statement as effectively the same thing.


Maybe we can make plagiarism a non-issue.

Of course, everyone builds on the work of others. The problem with plagiarism is that an author builds on the work of others without citing.

However, with computers and the data we have, I think we are pretty close to the point where a computer program could quickly do similarity searches and automatically generate all the citations.

A lot of this technology already exists in plagiarism detectors, which can detect uncited sources. We just have to make it add those sources as citations.

It could kind of be like auto format for scientific papers. The analogy would be that instead of getting into a discussion about formatting and tabs vs spaces on every pull request, we just have an auto format that runs for every pull request and removes that contention.

We might be able to do the same with citations and make plagiarism a thing of the past.
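
A minimal sketch of the similarity-search step, assuming word 3-gram "shingles" and Jaccard overlap as the similarity measure. The corpus, the source labels, and the 0.2 threshold are all made up for illustration; real plagiarism detectors are considerably more sophisticated.

```python
# Suggest citations by text overlap: a toy version of "auto format for citations".
# Technique: word n-gram shingles compared with Jaccard similarity.
# The corpus, labels, and threshold are hypothetical, chosen for illustration.
import re

def shingles(text: str, n: int = 3) -> set:
    """Set of overlapping n-word tuples from the lowercased text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def jaccard(a: set, b: set) -> float:
    """Jaccard similarity: size of intersection over size of union."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)

def suggest_citations(passage: str, corpus: dict, threshold: float = 0.2) -> list:
    """Labels of corpus entries whose overlap with the passage meets the threshold."""
    target = shingles(passage)
    scored = [(jaccard(target, shingles(text)), label)
              for label, text in corpus.items()]
    return [label for score, label in sorted(scored, reverse=True)
            if score >= threshold]

corpus = {
    "Source A (hypothetical)":
        "Equal temperament divides the octave into twelve equal semitones.",
    "Source B (hypothetical)":
        "Replication studies are rarely funded by grant agencies.",
}
passage = ("As is well known, equal temperament divides the octave "
           "into twelve equal parts.")

print(suggest_citations(passage, corpus))
```

This only catches near-verbatim reuse. Matching paraphrased ideas, which is what citation is usually about, would need something well beyond shingle overlap.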


I don't want to cite, don't want to be cited, don't want to care about any of it. If that means I have to subset my code down, so be it. It's worth it to not have to care about all of that unproductive mess.


Higher Ed IT instructor / Lawyer here. I mostly don't in terms of how academia does it?

I think the issue is "attribution" vs "you're supposed to be original."

It's funny, because of course in law, literally only the first thing matters and the second is completely unnecessary and not part of the equation -- when it comes to adjudicating real life matters. I find that this stretches over to the academic work I do and how I grade it.

I generally try to do something like -- look, show me via writing (and frequently other media, screenshots, whatever) that you've done work in looking at this stuff. But I don't really need or expect you to "come up with something new." -- mostly because, what could "new" possibly mean here?


The author of the article appears to have conflated plagiarism (a moral offense) and copyright infringement (an economic crime). The two aren't the same: you can be 100% Free and open, and still be very much anti-plagiarism.


I just finished a CS class in Information Security that was so strict about plagiarism that you had to cite your source for every little bit of information you looked up on Google while you were programming. This wasn't just code you were copying (which was strictly forbidden unless it was pseudocode), but also basic things you googled, like YouTube videos on concepts covered in class such as public key encryption. The works cited page I submitted with my last assignment was at least 5 pages long. Meanwhile in the professional world, it's a completely different set of rules.


I was actually quite surprised to find out that self-plagiarism is also a thing, which will get you punished in academia.

It was something which I had done many, many times myself - where I'd just copy/paste boilerplate stuff from my older reports or papers, into my new papers.

A couple of years after graduation, I read articles about students getting caught and expelled for self-plagiarism in things like home exams.


But in academia we need to know the source of an idea in order to research its roots. I look at citations as the hyperlinks of academia, so using ideas without citation doesn't sit well with me. And the more reproducible/legitimate a paper is, the more citations it usually has; that is an important metric for judging the quality of any research paper.


It's like BSD attribution licenses - pretty soon you have megabytes of license file that is a pain to maintain and that nobody ever views or verifies. Why not maintain an encyclopedia of significant discoveries in the field and let insignificant trivia be free for all? We have the latter with Stack Overflow and it works.


Citation provides access to background info about an idea. Some of that info could weigh in favor of, or against your conclusions. Since it's inconceivable that a typical study is based on 100% original ideas, failure to cite raises a reasonable suspicion that you're hiding something.



If an artist is clearly a master of his or her craft (e.g. Dylan, Picasso), I don't mind plagiarism so much.

However, I found this passage wrong-headed:

> “Academic plagiarism norms”, Frye writes, “are primarily an inefficient and illegitimate form of extra-legal academic rent-seeking that should be ignored.” And although he doesn’t quite call those who disagree with him “wussies and pussies”, he does disparagingly refer to the “plagiarism police”, which includes anyone who tries to identify academic plagiarism and somehow punish the plagiarist.

I completely disagree. I am a non-academic who takes published academic work seriously. My number one discovery mechanism for new knowledge is not arbitrary web searches or even a library catalogue, but citations. A strong norm against plagiarism in academia (where everyone really is standing on the shoulders of giants) is essential for academic knowledge to be available among the public, not to be cloistered away, only available to a learned elite.

Academic research already presents so many accessibility barriers to the public (stiff & boring writing, copious jargon/symbols, excessive abstraction, lack of data for reproducibility, false published conclusions and even publisher paywalls to name a few), I'd hate to see academia create another by normalizing plagiarism.


https://xkcd.com/978/ is one reason to care.

If many papers state the same thing and don't cite each other, a reader could reasonably conclude that thing must be true, as so many people have independently verified/discovered it.

The well of knowledge on Wikipedia, and the Internet as a whole, is forever poisoned due to journalists not consistently citing sources.


Thirty plus years ago Biden dropped out of the presidential primary when it was shown he was a plagiarist. Today, he’s president. So it’s safe to say that no one cares about plagiarism.



