
> ... but then people try to overdo it and deal with UTF-8 before having the basic algo working.

Wait... We all know (?) that Java was conceived before Unicode 3.1 came out, and that its designers somehow thought 16-bit chars would forever be enough to hold Unicode codepoints, hence Java strings are now a mess of both char and codepoint methods. But what's UTF-8 got to do with your question, even after a basic algo is working?

If I'm using Java's charAt(...) on both strings, what's the encoding got to do with checking whether the substring is present? Who cares that one character may have a codepoint encoded using more than one Java char primitive? It's either encoded the same way in both strings or it won't match, right?
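To make that concrete, here's a rough sketch of what I mean. The class name NaiveSearch and its indexOf helper are made up for illustration; it's just the textbook naive scan over char units, not anything from the original question. It assumes both strings are well-formed UTF-16, in which case a match can't start in the middle of a surrogate pair because high and low surrogates live in disjoint ranges.

```java
class NaiveSearch {
    // Hypothetical helper: returns the index (in char units) of the first
    // occurrence of needle in haystack, or -1 if there is none.
    static int indexOf(String haystack, String needle) {
        int n = haystack.length();
        int m = needle.length();
        for (int i = 0; i + m <= n; i++) {
            int j = 0;
            // Compare char by char; a surrogate pair is simply two chars
            // that must both be equal, like any other two chars.
            while (j < m && haystack.charAt(i + j) == needle.charAt(j)) {
                j++;
            }
            if (j == m) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // U+1F600 is outside the BMP, so it takes two chars: D83D DE00.
        String haystack = "ab\uD83D\uDE00cd";
        String needle = "\uD83D\uDE00c";
        System.out.println(indexOf(haystack, needle)); // 2
        System.out.println(haystack.indexOf(needle));  // 2
    }
}
```

Note the returned index counts chars, not codepoints, which only matters if the caller cares about the position rather than presence.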

I'm confused.

EDIT: you said it's not fancy, so you're not looking for people to match combining codepoints in one string against another codepoint in the other string, or things like that, right?
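For reference, this is the "fancy" case I have in mind, as a sketch only. CombiningDemo and containsNormalized are names I made up; the one real API here is java.text.Normalizer. It assumes NFC normalization is an acceptable notion of equivalence, which is a design choice and not the only one (NFD or NFKC would behave differently).

```java
import java.text.Normalizer;

class CombiningDemo {
    // Hypothetical helper: normalize both strings to NFC before searching,
    // so "e" + U+0301 and the precomposed U+00E9 compare as equal.
    static boolean containsNormalized(String haystack, String needle) {
        String h = Normalizer.normalize(haystack, Normalizer.Form.NFC);
        String n = Normalizer.normalize(needle, Normalizer.Form.NFC);
        return h.contains(n);
    }

    public static void main(String[] args) {
        String decomposed = "cafe\u0301";  // 'e' followed by COMBINING ACUTE ACCENT
        String precomposed = "caf\u00E9";  // single codepoint U+00E9

        // A raw char-by-char comparison sees different sequences.
        System.out.println(decomposed.contains(precomposed));           // false
        // After normalization both sides are the same sequence.
        System.out.println(containsNormalized(decomposed, precomposed)); // true
    }
}
```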



> What's UTF-8 got anything to do with your question, even after a basic algo is working?

ربما يجب عليك التفكير أكثر قليلا في السؤال. ("Perhaps you should think a little more about the question.")



