
> Most early-stage startups use the best practice of “delete=1”

Who are you people who can’t/won’t actually delete something from your DBs?



DBs, memcaches, tape backups, offsite storage, log files, etc.

Past that, deleting things from databases is sometimes hard. Suppose, for example, I delete userX, and userX was the founder of a number of forums, chat rooms, groups, or Facebook pages that are linked to userX. Do those groups and forums and things count as 'belonging' to userX? If userX happened to be the guy who created /r/news, do we delete that subreddit, and all of the content therein?

What if userX was a paying member? Do you delete all his old invoices? How do you make sure that doing so still allows you to balance your books?

There are indeed real-world scenarios wherein just deleting a user and cascading that delete throughout the system breaks things. In some cases, it might be better to replace userX's personal details with 'AnonymousUserX', but then that might leave behind content they've generated, which you then have to replace with "DELETED CONTENT" or some other stub, which causes complications.
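
To make the anonymize-instead-of-delete idea concrete, here is a rough sketch using Python's built-in sqlite3. The table and column names (users, posts, invoices, amount_cents) and the function anonymize_user are made up for illustration, not anyone's real schema. The point is only that the user row survives, so the invoices still have something to point at.

```python
import sqlite3

def anonymize_user(conn, user_id):
    """Scrub personal details but keep the row, so invoices and
    other records that reference the user stay valid."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "UPDATE users SET name = ?, email = NULL WHERE id = ?",
            (f"AnonymousUser{user_id}", user_id),
        )
        # Replace user-generated content with a stub.
        conn.execute(
            "UPDATE posts SET body = 'DELETED CONTENT' WHERE author_id = ?",
            (user_id,),
        )

# Hypothetical schema and sample data, in memory only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, email TEXT);
    CREATE TABLE posts (id INTEGER PRIMARY KEY,
                        author_id INTEGER REFERENCES users(id), body TEXT);
    CREATE TABLE invoices (id INTEGER PRIMARY KEY,
                           user_id INTEGER REFERENCES users(id),
                           amount_cents INTEGER);
    INSERT INTO users VALUES (7, 'Alice Example', 'alice@example.com');
    INSERT INTO posts VALUES (1, 7, 'hello world');
    INSERT INTO invoices VALUES (1, 7, 1999);
""")
anonymize_user(conn, 7)
```

This still leaves the hard questions open (groups the user founded, quoted replies, backups); it only covers the two tables it touches.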


Absolutely. It just isn’t easy.


I only know of two groups of people who do this: the incompetent and the just plain dishonest.

Either they argue that it is hard to design a database that allows deletion or anonymization, or they're in the business of selling data, won't delete anything, and would rather lie to their users and customers.

I would be interested to know if there is any other argument for this.


The problem is dealing with software and databases that weren't designed for deleting and anonymization. I do not envy all the developers who are going to have to rewrite crappy legacy code to be compliant.


What if the contents of that DB get pre-rendered to disk or memory for caching (e.g. prerendering a bunch of HTML)? Do you blow those away? Which ones? What if it turns out to be a substantial number of cache records you need to blow away? What's gonna be the performance impact of that?

I agree with the OP. People who assume this shit is easy haven't really thought about the problem much at all. There is a lot of data stored out there in ways that weren't really designed to be mutable.
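
One way to answer "which ones?" is to record, at render time, which users' data went into each cached page. Below is a minimal in-memory sketch of that idea; PageCache, put, and purge_user are hypothetical names, not the API of memcached or any real cache library, and a real deployment would have to keep this index somewhere durable.

```python
from collections import defaultdict

class PageCache:
    def __init__(self):
        self._pages = {}                        # cache key -> rendered HTML
        self._keys_by_user = defaultdict(set)   # user id -> keys holding their data

    def put(self, key, html, user_ids):
        """Store a rendered page and remember whose data it contains."""
        self._pages[key] = html
        for uid in user_ids:
            self._keys_by_user[uid].add(key)

    def get(self, key):
        return self._pages.get(key)

    def purge_user(self, user_id):
        """Drop every cached page that included this user's data.
        Returns how many keys were targeted."""
        keys = self._keys_by_user.pop(user_id, set())
        for key in keys:
            self._pages.pop(key, None)
        return len(keys)

cache = PageCache()
cache.put("/thread/1", "<p>hi from userX</p>", ["userX", "userY"])
cache.put("/thread/2", "<p>hello</p>", ["userY"])
purged = cache.purge_user("userX")
```

The cost shows up exactly where the comment worries it will: a prolific user can appear on a large share of pages, so one purge can mean a lot of re-rendering.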


You have 30 days to comply, so if you regenerate those cached pages regularly (e.g. once every 3 weeks) you're fine.
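
A back-of-the-envelope check of that arithmetic, assuming the 30-day figure above and assuming the worst case where a page is rendered just before the row is deleted from the DB. The function name and the db_delete_delay parameter are made up for illustration.

```python
from datetime import timedelta

COMPLIANCE_WINDOW = timedelta(days=30)  # figure quoted in the comment above

def meets_deadline(regen_interval, db_delete_delay=timedelta(0)):
    """Worst case: the stale page outlives the DB delete by one full
    regeneration interval, on top of however long the delete itself took."""
    return db_delete_delay + regen_interval <= COMPLIANCE_WINDOW

fast = meets_deadline(timedelta(weeks=3))
slow = meets_deadline(timedelta(weeks=3), db_delete_delay=timedelta(days=10))
```

So three weeks works only if the database delete itself happens within about nine days of the request.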



