Hacker News | jinqueeny's comments

I am so sorry about that. The website was down due to some server issues after launch. The problem has been fixed. Hopefully it now works as you expected.


Sorry. I will restrain myself.


Sorry if the post sounds too marketing-heavy. I'm sharing this case study because it addresses a common technical challenge: scaling a high-traffic application (Kwai, a video platform) from traditional MySQL to a distributed database system. It describes the journey from managing 300+ MySQL shards to consolidating into a single 400TB cluster using OceanBase, handling over 1 million QPS during peak loads. While the post is oriented towards promoting OceanBase, it resonated with developers because it provides specific technical details about real-world scaling problems, architecture decisions, and performance metrics that many engineering teams face when growing beyond traditional MySQL setups.


This comment pegs my AI-slop-o-meter scale high. No real human genuinely talks like this. Perhaps sharing specific problems you ran into in your single host to distributed sql journey that this product solved for you would be useful? That would be much more interesting than a post about a marketing page and Someone Else’s Problem.


Your AI-slop-o-meter needs calibration


AI-slop-o-makers were trained heavily on marketing-speak and they often sound very similar.


Are you working for OceanBase?


It has a lot of use cases in the mining industry, such as monitoring mine water, ventilation systems, fluids in tanks and pipes, distance and angle measurements, remote operation of actuators and equipment, environmental monitoring, etc.


Oops, sorry. Here it is: https://github.com/myscale/ChatData


In short, Apache Pulsar beats Kafka:

- 2.5x Maximum Throughput Compared to Kafka

- 100x Lower Single-digit Publish Latency than Kafka

- 1.5x Faster Historical Read Rate than Kafka


Disclaimer: PingCAPer here. Thanks for your attention to PingCAP and TiDB. It’s true that we have been on the HN front page many times - that’s incredible recognition for the value of our technical content, and we are humbled and thankful. We take great pride in our work, and have always been committed to valuable, accurate and insightful content.

As for referring to our PVLDB paper as “the first industry paper”, thanks to @karsinkk for pointing that out. We admit that the original description was inaccurate and have just updated it to read “This is the first paper in the industry to describe the Raft-based implementation of a distributed Hybrid Transactional/Analytical Processing (HTAP) database.”

However, we have never claimed to be “the best, the fastest.” If you find such occurrences and think they are inaccurate, please share them with us and we will correct them ASAP. As pointed out in https://pingcap.com/blog/9-why's-to-ask-when-evaluating-a-di..., “There's no single technology that can be the elixir to all your problems. The database realm is no different. If your data can fit on a single MySQL instance without too much pressure on your server, or if your performance requirement for complex queries isn't high, then a distributed database may not be a good choice. Choosing to use a distributed database typically means additional maintenance cost, which may not be worthwhile for small workloads.”

Regarding your comparison between TiDB and MySQL, there are differences between the two, as described here: https://opensource.com/article/18/11/key-differences-between.... If you are still interested in trying out TiDB with your real-life workload, we are very happy to help you. Please reach out to us in our community Slack: https://slack.tidb.io/invite?team=tidb-community&channel=eve...


I may have had an axe to grind when TiDB didn't fit our workload so I apologize for being over the top there. You have good marketing, but I'm wary of companies like yours that maintain such a pervasive social media influence. It's a fine line between creative marketing and deception on social media where identities are nebulous things.


TiDB has a similar case study: Queries over 1.3 Trillion Rows of Data Within Milliseconds of Response Time at Zhihu.com https://pingcap.com/success-stories/lesson-learned-from-quer...

The latest stats for the same scenario (already-read posts) at Zhihu are:

- 2.6 Trillion Rows

- 560TB data

- 200 TiKV instances


Very cool! Thanks, I will look into it.


The link to the paper (at VLDB 2020) is now available: http://www.vldb.org/pvldb/vol13/p3072-huang.pdf


I am wondering whether you have considered open-source NewSQL database solutions like TiDB or CockroachDB. They have the best of both traditional RDBMS and NoSQL and would be the perfect choice for hyper-growth scenarios.

TiDB could be considered a scale-out MySQL and CRDB a scale-out Postgres. Both are Spanner-inspired solutions that can help avoid manual sharding.

