Hacker News

The site is being crushed by traffic right now -- but without reading the article, I've also found that some stories I thought were important were scrubbed from HN's front page just about as soon as I saw them (when I doubled back to read the comments)...

While I realize I'm not entitled to explanations, some transparency would be appreciated. Maybe it could even be automatic: whenever a mod forcibly removed something from the front page, they could leave a comment and it'd show up on some page?

[EDIT] - After reading the article, if a mod did indeed take down the post because it discussed reverse engineering the rank algorithm, I think that's pretty naive. Security through obscurity isn't a thing, and the better response is just to make a better algorithm, not try and suppress knowledge about it.

I say this naively myself, as I've never had to maintain a ranking algorithm with this many users depending on it (or any at all, for that matter), but surely the problem isn't intractable?



> Security through obscurity isn't a thing, and the better response is just to make a better algorithm, not try and suppress knowledge about it.

Obscuring an algorithm or making it more tedious to reverse may not make it perfectly secure, but that's not the goal. It's not like actual information security, where loss of the encryption keys means your product is broken or your database is on the Internet. You're just trying to minimize the workload on humans who act as a back-up for the few posts that slip through.

If an email spam detection algorithm was public, spammers could precisely craft their content to slip through. If the heuristics for showing a CAPTCHA were public, bots could automate their requests to avoid it. If a ranking algorithm was public, people who might financially benefit from the front-page traffic could force content there through vote rings and sock puppets.

If the algorithm is secret, far fewer will be able to do so, and this small fraction of abusers can be handled by humans.
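To make the vote-ring concern concrete, here's a toy sketch using a commonly cited approximation of a news-site ranking formula (score decays with age under a gravity exponent). The formula, point counts, and timings here are all assumptions for illustration, not HN's actual code. The point is that if the exact formula were public, an abuser could compute precisely how much coordinated voting is needed:

```python
# Illustrative only: an assumed news-ranking formula, not HN's real one.
def rank(points, age_hours, gravity=1.8):
    """Higher points rank higher; rank decays as the story ages."""
    return (points - 1) / (age_hours + 2) ** gravity

# With the formula in hand, a vote ring can solve for the minimum number
# of early votes needed to outrank an organic story:
organic = rank(points=40, age_hours=6)
for votes in range(1, 100):
    if rank(points=votes, age_hours=1) > organic:
        print(f"{votes} early votes outrank a 40-point, 6-hour-old story")
        break
```

Under these assumed parameters, a handful of coordinated early votes outranks a much older organic story -- exactly the calculation that secrecy makes harder and more error-prone for an attacker.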


You can always find reasons why something has to be a certain way. But at the end of the day, you can't expect to attract and retain curious hackers/entrepreneurs in this way. The author presumably spent some time compiling the original story, the math symbols and graphs are quite nice for instance, just to have it removed without explanation. It's not a very good way to treat people.


There is no proof it was removed by anything other than the mechanisms all stories are subject to. I myself am not convinced at all.


Does it matter? The author seems representative of exactly the type of user you want in a community like this. Long-standing account, high karma, produces decent comments and great content (the visualizations in the Swype article are another example [0]) and even puts HN comments on the blog. Whatever mechanism ends up discouraging people like this in favor of trigger-happy flaggers is broken. Fortunately for the community the author didn't go "fuck it", but instead wrote another sensible blog post, which makes them an even better community member.

[0] http://sangaline.com/post/finding-an-optimal-keyboard-layout...


See also, a discussion a couple weeks ago about how people can "buy" upvotes on HN and how HN takes steps against that: https://news.ycombinator.com/item?id=13676362


Once it's public (even unofficially), people will begin to rely on the current implementation and then complain if it is ever tweaked or changed. They build towards the value it grants and then complain that the value was arbitrarily stolen from them.


That's for sure. I rarely visit SEO websites or forums, but the few times I did I saw lots of complaints from people who seemed to think that Google owed them something after it tweaked its page ranking algorithms.


But aren't the people who are relying on that value precisely the people you want to keep out?


The debate is not really whether the algorithm should be made public or not, it's about the method used to try and keep it secure.

If the algorithm can be reverse engineered, then trying to suppress knowledge that is already "out there" will only create an illusion of it being a secret, and the fewer people who know about it, the more damage they can potentially do (i.e. the more they can financially benefit from their knowledge).

It's the same as with information security - if you discover an exploitable bug then chances are someone else has already discovered it too (or can discover it any time) so making it public is one of the most sensible things you can do.


It's not the same as with exploitable bugs, because exploitable bugs are fundamentally preventable. Not necessarily in aggregate; but individually, all bugs can essentially be patched given enough time or effort. There's no benefit to keeping them secret if their threats can be neutralized.

As I outlined in another comment in this thread, algorithms that do not offer or adopt significant authorization constraints (as quantified by time/monetary costs) cannot be "fixed." This is fundamentally why reverse engineering e.g. HMAC signing algorithms, search results ranking, spam filtering or front page listing algorithms is possible. The generous usability requirements do not allow for authorization that would mitigate reversing the algorithm, even when it's not embedded in an untrustworthy client.

Suppression is essentially all you can do to prevent reverse engineering, and suppressing the knowledge of how to reverse engineer an algorithm is in effect the same as suppressing the algorithm itself.


I think the difference is that one can make verifiably secure software. Can one make a verifiably ungameable ranking algo for a news site?


What does "verifiably" mean for you? Are you talking about provable security?

First establish an upper bound, worst case scenario cost (as a function of time + resources) to fully reverse engineer the algorithm. Use that as the comparison benchmark, and if you can come up with a design that eliminates any reverse engineering efforts with fewer costs than worst-case, you've done it.

Here's where that breaks down: "ungameable" is not precise enough to establish worst-case bounds for, in the same way that we can establish worst-case bounds for breaking an MD5 hash (brute-force it - what does "brute-force it" mean for gaming a ranking algorithm, or reverse engineering more generally?). Other than that, were you to come up with such a measurement, it would almost assuredly increase the costs of reverse engineering to infeasibility by increasing the authorization controls in place and decreasing the usability requirements.
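For contrast, here's what a quantifiable worst-case bound looks like for brute-forcing a hash. The attacker throughput is an assumed figure for illustration; the point is that the arithmetic exists at all, whereas no analogous calculation exists for "gaming a ranking algorithm":

```python
# Sketch: a worst-case cost bound for a preimage search on a 128-bit hash
# (e.g. MD5's output size). The throughput figure is an assumption.
hash_bits = 128
hashes_per_second = 1e12           # assumed attacker throughput
seconds_per_year = 60 * 60 * 24 * 365

worst_case_tries = 2 ** hash_bits  # size of the preimage search space
years = worst_case_tries / hashes_per_second / seconds_per_year
print(f"Worst-case preimage search: ~{years:.2e} years")
```

That upper bound is what lets you compare designs against "brute-force it". "Ungameable" has no equivalent search space to count.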


Well, probably not for such a broad description of "ungameable", but there is an entire research field dedicated to trying to come up with such algorithms/systems: mechanism design!

https://en.wikipedia.org/wiki/Mechanism_design


I'm pretty new here, so take this with a grain of salt. My impression of this place is that it's a pretty highly regulated forum. In particular, it's designed to serve the needs of YC, and is not particularly free. That's the good and the bad of HN. It's a walled garden essentially, but instead of a wall it has very busy gardeners.

Honestly, it's why I like it, but I also recognize that behind all of this is a business, not a chatroom.

Edit: To clarify: there are so many places online to speak whatever is on your mind. What's lacking are places to have a decent conversation, any conversation, that can remain a conversation. How many places like this, with this many users, exist? I can't think of many.


The naive thing is coming up with 17 complicated theories about this instead of just emailing the mods and asking. They're very responsive.


No, the naive thing is keeping this kind of crap in private messages instead of in public where we the community are owed an explanation.

I wonder how responsive the mods will be when I publish just how hackable the YC infrastructure is next week.

To heck with responsible disclosure since HN mods apparently don't believe in such a thing.

It's fair game, boys. This is how the mods want to treat us, the users can respond in kind.


You do know this is a nerd message board on the internet not a re-enactment of '300', right?


We're on a discussion board web site. Using it to discuss the subject would seem the obvious place, right? Perhaps the mods could use their own site and be responsive to everyone...


> whenever a mod removed something forcibly from front

Mods (as far as I know, I'm not one) never forcibly remove things from the front page. These are almost always the result of user flagging.


Those are first marked "flagged" as a warning, and then "dead" when the user flags reach a high enough level.

This article however is about stories that disappear without receiving either.


As others have pointed out, the [flagged] annotation means heavily flagged. A story can be downweighted off the front page by user flags long before [flagged] shows up. Indeed that's what happened to the submission the OP is complaining about. Moderators never saw it.


When does the flag label appear though?

Here's an example of a submission that does not include the [flagged] label, but which has a moderator saying it got flagged, and that the flags are the reason it's not on the front page. https://news.ycombinator.com/item?id=13741276


Just a few flags from high-karma accounts can knock a submission off the front page if it isn't getting enough counter up-votes. It won't be marked as flagged.


Flags seem to apply downward pressure before they build enough to get the "flagged" label.


Flags downrank a post, a post can drop off the front page well before 'flagged' shows up.
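As an illustration only -- this is a hypothetical penalty model, not HN's actual code, and both constants are made up -- a multiplicative per-flag penalty shows how a story can sink well before any label threshold is reached:

```python
# Hypothetical model: flags apply a multiplicative penalty long before a
# "[flagged]" label would appear. Both constants are assumptions.
FLAG_PENALTY = 0.4            # assumed score multiplier per flag
FLAGGED_LABEL_THRESHOLD = 6   # assumed flag count before the label shows

def effective_score(points, age_hours, flags, gravity=1.8):
    base = (points - 1) / (age_hours + 2) ** gravity
    return base * (FLAG_PENALTY ** flags)

clean = effective_score(points=120, age_hours=2, flags=0)
flagged = effective_score(points=120, age_hours=2, flags=3)
# Three flags (half the assumed label threshold) cut the score to ~6%
# of its unflagged value -- off the front page, no label in sight.
print(clean, flagged, flagged / clean)
```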


There can be the reverse, though: a controversial post may get flagged repeatedly, but mods can disable the effect of those flags so the post stays up.


I am of the mind that mods can also prevent submissions from an account from ever being displayed in the page rankings.

Many months ago, I had several submissions that were visible when I was logged in, and yet when I accessed HN as a guest, they were not in the submission ranking.


That does happen. It's always worth talking to the mods to sort it out. There's a bunch of machinery to detect vote rings, spam, etc., and sometimes it unfairly blocks the wrong stuff.

From what I've seen mods are happy to fix any problems.


The moderators bury things on the front page pretty routinely.


tptacek:

> The moderators bury things on the front page pretty routinely.

That's a pretty big (and vague!) accusation to just throw out there, especially in light of this comment elsewhere: https://news.ycombinator.com/item?id=13858327

dang:

> As others have pointed out, the [flagged] annotation means heavily flagged. A story can be downweighted off the front page by user flags long before [flagged] shows up. Indeed that's what happened to the submission the OP is complaining about. Moderators never saw it.


Read Dan's comment at the top of this thread.


You mean this comment, right? https://news.ycombinator.com/item?id=13858395


I agree with you overall regarding transparency. It would be nice to see removed pages listed somewhere, or perhaps just the fact that it was done with a counter and maybe a timestamp. But as someone who reverse engineers a new mobile app nearly once a week these days: candidly, you can summarize nearly the entire discipline as security through obscurity :).

You use the word "suppress", but we generally can't use words stronger than that in security. We deal in abstractions that are hard to reasonably quantify, so we do so by approximations of monetary or temporal costs, and frequently both. We also try to limit absolutes. Reverse engineering exists as a discipline because 1) contrary to popular opinion, security through obscurity is valid, if incomplete and 2) there's frankly not much more that you can do in many cases other than obfuscation.

There are situations where algorithm secrecy gains you nothing defensively and is actually a strategic disadvantage, such as in encryption or hashing. But in situations where you fundamentally cannot discriminate between authorized users, such as in email (spam), search results (SEO) or, here, Hacker News (front page), you cannot rely on the strength of the algorithm to properly discriminate between users, because that's not its intended purpose. In these situations, obscurity is essentially your only remaining option.

To be fair, the Hacker News moderators have more control over whether the ranking algorithm gets reversed, as it runs on a remote server they control rather than being embedded in a client deployed to inherently untrustworthy hands. And once the information is out, it's out. But I don't agree that they have a trust or transparency imperative to keep that sort of submission on the front page. Even if the information exists, there's no reason to make it even more accessible. They can remove it and also improve the ranking algorithm.

If you want to design a general purpose web application without significantly reducing usability, functionality that is not restricted through authentication or higher levels of authorization is susceptible to reverse engineering. Being that the ranking algorithm is not client-side, there are fundamental protections we cannot bypass, but much of it is still inherently obfuscation. There is rate-limiting of course, and you have to log in, but the inherent inputs and outputs can still be somewhat flexibly assessed over reasonable timespans because there are hard usability requirements in place.

The tl;dr: algorithms which cannot be gamed because their inputs have significant quantifiable and controllable time/monetary costs do not require secrecy - these are excellent for implementing authorization. Algorithms which do not have such costs are not appropriate for authorization and, unless also paired with significant authorization constraints, require some degree of obscurity.
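A sketch of that distinction: an HMAC can be completely public because forging a tag has a quantifiable worst-case cost (recovering the key), while nothing comparably expensive gates the inputs to a ranking formula. Using Python's standard library (the message here is just a placeholder):

```python
# The HMAC construction is public; secrecy lives entirely in the key.
import hashlib
import hmac
import os

key = os.urandom(32)  # the only secret in the system

def sign(message: bytes) -> bytes:
    return hmac.new(key, message, hashlib.sha256).digest()

def verify(message: bytes, tag: bytes) -> bool:
    # Constant-time comparison to avoid timing side channels.
    return hmac.compare_digest(sign(message), tag)

tag = sign(b"story-id:12345")
assert verify(b"story-id:12345", tag)
assert not verify(b"story-id:99999", tag)  # forgery fails without the key
```

Publishing this code costs the defender nothing; publishing a ranking formula hands the attacker the whole game.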


"While I realize I'm not entitled to explanations, some transparency would be appreciated."

No, as a community we are all owed a public explanation.

And if stcb and dang can't provide it, they need to step down and make room for someone who can.


We're not owed anything. This is a free community and there's no contract, implied or otherwise.


The internet is big enough for you or anyone else to start their own clone. They don't have to step down for that.



