People don't seem to understand that you cannot have word-for-word translation between Japanese and Latin-rooted languages. There are also grammatical and syntactical rules that you cannot get right by an educated guess, and "educated guesses" are exactly what Google Translate makes...
See, Google Translate doesn't care whether the person using it knows that in proper names, such as the name of a person, the Kanji read in a completely different way than in regular dictionary words.
There is no translation for this class of words, just as there is none for names like Jason, Tom, Samantha, etc.
In such cases one needs to memorize the reading of the character, which reads that way only when it is used in proper names. If the Kanji of a person's name have multiple readings, such as 淳子 (Atsuko, Kyoko or Junko), that person would use what is called Furigana: a Kana-only reading showing the sound of that particular Kanji. This happens when you are filling in a form, in a contract, on IDs, etc.
In other words, feeding people's names to Google shows only one thing: that person doesn't know what they are doing.
Sadako is the proper name of a popular, iconic character in Japanese culture, the ghost of a murderous little girl, and it is based on a real Japanese person who was said to be a psychic and lived in the 1900s. They made at least one Japanese horror movie out of that story, and even Hollywood had a go at it! Now, how many girls do you think have been named Sadako by their parents in Japan?
Once more, Google Translate doesn't give a damn if the person using it doesn't know what they are doing.
In regular writing, a number of suffixes are employed to mark proper names: さま, どの, さん, くん, ちゃん and more. Every suffix has a different and unequivocal use case. Each can be written in Hiragana, as I just did, or in Kanji: 様 reads さま, 殿 reads どの, etc. For these words too there isn't a clear-cut translation, because they are chosen for the implications they carry. For instance, 様 (さま) is used when referring to customers or clients, but in a business letter you would use 殿 (どの) with similar significance.
Most amusing is 淳子殿 translated as "Mr. Reiko". In the bottom-left corner of Google Translate's left pane there is one of the possible correct readings of 淳子殿, namely Junko-dono. But on the right it messes it right up. That demonstrates that Google Translate has no brain.
By the way, Reiko is written 麗子 as a proper name... and so are Rika, Akirako, Yoshiko and Urarako. Get the point? You need Furigana.
About gyoza: that's the name of a very popular kind of Chinese dumpling made in Japan. There is no Kanji for it; it's written in Katakana only, as チャオズ. But since it's a Chinese word, you can use the two sounds ぎょう and ざ in Hiragana to write the two corresponding Kanji, 餃子. To understand why, you need to read my post further down the page. It seems Google Translate cannot read the first Kanji right.
If you feel so inclined, mopreme, read my post down below, which covers your original blog post extensively and thoroughly.
Some statements you have made about the Japanese language are ambiguous at best. Let's clear them up, but first let's be precise with the terminology. Pronunciation and sound are not interchangeable terms. Every sound, or sequence of sounds, in Japanese is associated with a specific character. It happens that some Kanji have multiple sounds, or that a group of Kanji share the same sounds, but how each one is read is still univocal. There is no pronunciation involved. To give a quick example, the sound of あ is one and only one, while in English the letter `a` is pronounced differently in pal, Paul, paediatrics, and so on. For the sake of simplicity it is a widespread, sloppy practice to use "reading", "pronunciation" and "intonation" interchangeably, but you should know the difference. In English, pronunciation changes based on the preceding and/or following characters. There's no such concept in Japanese. The sounds of the Japanese language are dictated distinctly by Hiragana. Therefore what you have is a "reading". A reading is made of one or more sounds. But once again, readings and sounds do not have multiple pronunciations. I hope I was able to shed some light on this complicated subject. It's a nuance, but it makes an important difference.
Quoting: "I should note that there are two different alphabetical sorting orders in Japanese. For this article I am going to use the a i u e o (あいうえお) sort order."
Alphabetical order as we know it exists in Japan too, and there is only one. The order of the Hiragana works exactly like our alphabetical order: a preordered sequence of characters from A to Z is methodologically equivalent to the Hiragana from あ to ん, although no words start with を or ん, the last two characters of the alphabet. The Katakana alphabet is exactly like Hiragana; only the `design` of the characters changes. It is used only to write, in Japanese sounds, words imported from non-Japanese languages (Chinese and Korean are exceptions because you can use Kanji to write them, although when writing just Chinese or Korean sounds you will still be using Katakana).
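To make the "same alphabet, different design" point concrete, here is a minimal Python sketch. It assumes Unicode code point order is a good enough stand-in for the あ-to-ん order, which is only roughly true (voiced and small Kana, and the long-vowel mark ー, need proper collation rules). The function names are mine, not taken from any library.

```python
# Katakana letters sit exactly 0x60 code points above their Hiragana
# counterparts in Unicode (ア is U+30A2, あ is U+3042).
KATAKANA_TO_HIRAGANA_OFFSET = 0x60

def to_hiragana(text: str) -> str:
    """Fold Katakana letters (ァ..ヶ) to Hiragana; leave everything else as is."""
    return "".join(
        chr(ord(ch) - KATAKANA_TO_HIRAGANA_OFFSET) if "ァ" <= ch <= "ヶ" else ch
        for ch in text
    )

def kana_sort(words: list[str]) -> list[str]:
    """Sort Kana words by their Hiragana form (an approximation of あ-to-ん order)."""
    return sorted(words, key=to_hiragana)

words = ["ねこ", "オプション", "あめ", "さくら"]
print(kana_sort(words))  # ['あめ', 'オプション', 'さくら', 'ねこ']
```

Folding both scripts to one before comparing is what lets a Katakana word such as オプション land between あめ and さくら instead of after every Hiragana word.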
When talking about `ordering` we normally refer to the sequence in which we list the Kanji. Listing them by their Hiragana sounds is the most common way. They can also be listed by what in English are called radicals (the foundational shapes composing the Kanji ideograph), by stroke count, by their Chinese reading (the Japanese sound of the equivalent Chinese Kanji), or even by frequency.
In Japanese, `ordering` by meaning is also frequently used; Kanji may or may not be listed, and where they are not, only Hiragana is used to write all the words. This method is the closest Japanese equivalent to what we are used to calling an English dictionary. More here: https://en.wikipedia.org/wiki/Japanese_dictionary
I think it's already clear where the problem of sorting lies. Kanji, Hiragana and Katakana, though different-looking alphabets, each have a well-defined scope and interconnection in the Japanese language. Kanji are words, and thus have meaning; Hiragana gives the Kanji a vocalization and also fulfils the crucial grammatical role, providing the language with adverbs, prepositions, particles, determiners, verb conjugations and more. Katakana usage is confined to words foreign in nature, hence it exclusively represents sounds, carrying no meaning. Consequently, when reading Japanese on a generic subject you encounter mostly Kanji, some Hiragana connecting them (it needs to be said that very common words are often written in Hiragana only, e.g. こんにちは means Hello) and sporadic Katakana where an imported word is used.
This alternation matters because in Japanese there is hardly any punctuation at all: yes, there's a full stop, a comma, and a way to enclose direct speech, but no concept of empty space between words. Japanese can be laid out, and read, in more than one direction. Traditionally it's read top to bottom, with the columns running from right to left. Modern books are often read as ours would be, although sometimes you turn the pages backward to continue reading, e.g. in comics.
The English alphabet is used for very technical terms or for proper names, where the context may sometimes be better served by romaji. But that's not a rule whatsoever; Katakana is employed more often. The adoption of the Western alphabet (romaji in Japanese) is similar to how you would take up French, Spanish, Italian or German words in your writing. (continued)
Quoting from "Sorting Settings":
"In this example you can see ABC and katakana are separated. Kanji are then separated from katakana. There were no hiragana in this list[...]"
The Hiragana on that list are the characters と and の. On that particular list, の expresses correlation and と conveys the mere meaning of `and`. For instance 地域 と 言語 の オプション (notice I have inserted the `spaces` myself for clarity's sake; there should be none) is a very instructive example. Those are actually three distinct nouns you are trying to order as one: 地域 reads ちいき (Hiragana for `chiiki`) and means area; 言語 reads げんご (Hiragana for `gengo`) and means language; オプション reads opushon and is, you guessed it, option in English. Note I didn't write that オプション means option. オプション IS the word `option` written with Japanese sounds or, in other words, written in the Japanese script, i.e. Katakana.
So 地域と言語のオプション can be translated as "Regional and Language options".
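If you want to see those pieces fall apart programmatically, here is a rough Python sketch that splits the string wherever the script changes. It assumes only the basic Unicode blocks for Hiragana, Katakana and Kanji (CJK Unified Ideographs) and ignores extensions and half-width forms; `script_of` and `script_runs` are names I made up for illustration.

```python
from itertools import groupby

def script_of(ch: str) -> str:
    """Classify one character by its Unicode block (basic ranges only)."""
    code = ord(ch)
    if 0x3040 <= code <= 0x309F:
        return "hiragana"
    if 0x30A0 <= code <= 0x30FF:
        return "katakana"
    if 0x4E00 <= code <= 0x9FFF:
        return "kanji"
    return "other"

def script_runs(text: str) -> list[tuple[str, str]]:
    """Group consecutive characters that belong to the same script."""
    return [(script, "".join(chars)) for script, chars in groupby(text, key=script_of)]

for script, run in script_runs("地域と言語のオプション"):
    print(script, run)
# kanji 地域
# hiragana と
# kanji 言語
# hiragana の
# katakana オプション
```

The script changes mark the word boundaries here only because the two Hiragana are particles; a word written with Okurigana would need more than this to be kept together.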
Quoting from "Sorting Names"
"It is very possible to have different people with the same name write their name in different character sets. The traditional way of writing the Japanese name of Ayumi would be written in kanji; a modern, stylish way would be to write it in hiragana, and a second generation Japanese-American might write their name in katakana or the alphabet."
The Japanese always write their names in Kanji. They don't use different character sets. When the Kanji composing their names can take several sounds, they write alongside them what's called Furigana, the reading of the Kanji, usually as a subscript or superscript. Furigana is written in Hiragana (not Katakana, as you stated later on), though on the Internet, where websites could potentially be read by a non-Japanese crowd, Katakana might be used in some cases. Nonetheless, as I said earlier, Hiragana and Katakana differ only in script style, if you wish to call it that. So which script is in use is not a relevant issue for foreigners.
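For sorting names, this suggests keeping the Furigana next to the Kanji and ordering on the Furigana alone. A minimal Python sketch follows, with made-up records and a hypothetical `Name` type; the readings for 淳子 and 麗子 are the ones mentioned in this thread, and plain code point order is assumed to be close enough to Kana order for the example.

```python
from typing import NamedTuple

class Name(NamedTuple):
    kanji: str     # how the name is written
    furigana: str  # Hiragana reading supplied by the person

# Hypothetical records: the same Kanji, 淳子, appears with three readings.
people = [
    Name("淳子", "じゅんこ"),
    Name("麗子", "れいこ"),
    Name("淳子", "あつこ"),
    Name("淳子", "きょうこ"),
]

def sort_by_reading(names: list[Name]) -> list[Name]:
    """Order names by their Furigana; the Kanji play no part in the comparison."""
    return sorted(names, key=lambda name: name.furigana)

for person in sort_by_reading(people):
    print(person.furigana, person.kanji)
# あつこ 淳子
# きょうこ 淳子
# じゅんこ 淳子
# れいこ 麗子
```

Without the Furigana column there is no way to tell the three 淳子 apart, let alone put them in order.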
All the other statements are a matter of opinion, save for the last: the Japanese do sometimes write their names in romaji, aka the "ABC alphabet", especially when they are dealing with foreigners at any level. Although why being a 2nd-gen Japanese-American entails writing one's own name in Katakana beats me. It's a tiny bit like saying, forgive me here, that a 2nd-gen Italian-American would write their name in Latin.
Quoting from "Kanji - The Real Problem":
"Kanji have multiple pronunciations, determined by the context in which it appears.[...]Only from the context in which the kanji appears do you know how to pronounce it."
That's like saying the pronunciation of the word 'pool' is determined by whether the context refers to water or balls. It is not the pronunciation but the meaning of the word that the context changes. We can all agree on that.
Single-Kanji words change meaning based on the context but, like the word 'pool', their reading stays the same. E.g. あめ (reads ame) means 'candy' if written 飴 or 'rain' if written 雨. In this case the Kanji itself controls its meaning and reading, not the context. When a single Kanji is a verb, the meaning and reading change with what's called Okurigana (Hiragana written after the Kanji to serve the purpose of conjugation), while the Kanji remains invariant. For example 着く and 着る read つく (tsuku) and きる (kiru) respectively; the former means 'to arrive' and the latter 'to wear'. Single Kanji aren't at all as you describe them.
Compound Kanji words follow a different ruleset, built upon how many Kanji they contain. Generally speaking, one Kanji in a two-Kanji word has multiple readings depending on which word it appears in and where it appears in that word. You can learn the rules, or you can get used to them just by seeing them used very frequently. This case alone is as you've described: you need to know the context in which the Kanji lives. But since the meaning also changes with the reading, you may be able to catch the overall meaning of the sentence without being able to read that single word. Besides, compound Kanji words with multiple readings aren't that common, and they generally represent common words that take little effort to memorize.
Compound Kanji words composed of more than two Kanji read unequivocally in one way, as English words do, with very few exceptions.
More here https://ja.wikipedia.org/wiki/%E5%90%8C%E5%BD%A2%E7%95%B0%E9...
If anything, what could, quote, "[...]keeps students up nights studying for years[...]" isn't the multiple Kanji readings, but the fact that you need to know between 2000 and 3000 Kanji and the combinations of them that build words (mostly two-Kanji words). So it's like having permutations with repetition (ordered arrangements) of 3000 syllables that make words in pairs or singly.
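To put a number on that, here is the arithmetic as a tiny Python sketch. It takes 3000, the upper end of the range, and counts candidate forms only; it is an upper bound, since the vast majority of pairs are not actual words.

```python
KANJI_COUNT = 3000  # upper end of the 2000-3000 range

singles = KANJI_COUNT      # one-Kanji candidates
pairs = KANJI_COUNT ** 2   # ordered pairs, repetition allowed

print(pairs)            # 9000000
print(singles + pairs)  # 9003000
```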
Quoting "Here is an example: 私は私立大学で勉強しています。[...]A second year Japanese student could figure this out. For a computer, this is a very difficult problem."
The choice is particularly unfortunate. This isn't difficult at all for a computer, granted you understand the Japanese language. Let's dissect this sentence:
私 は 私立大学 で 勉強しています。
私 わたし (reads watashi) is the English pronoun 'I'. A computer instantly knows this because only the watashi reading/meaning can be followed by the Hiragana は. That's something a 6-year-old Japanese child knows, and something you would learn in your first months or less of Japanese studies.
Conversely, a computer knows instantly that the reading/meaning isn't watashi when it sees that the character following 私 is another Kanji. The reading of this compound, 私立, can only be しりつ (reads shiritsu) out of a staggering total of 2 possible combined readings, and that's only because 立 has two usable Chinese readings, りゅう and りつ (the third would require the Kanji to stand alone), while 私 has only one, し. Kanji usually have one Japanese reading and one or more Chinese readings, governed by strict rules on which reading group has to be used. Coding it isn't that much of a headache.
As soon as the computer realizes that the third character, following the first two Kanji, is a Kanji as well, the range of possible readings bottoms out. That's also because the first two Kanji already make a word, which means 'private' in English; as often happens with Kanji words longer than two characters, they are composed of multiple words, just as some long words in Western languages are. With the same approach the computer instantly finds the reading of the two Kanji 大学, だいがく (reads daigaku), which means university, another very common noun. I think you've already got the gist of it. The last word, 勉強しています, the computer knows instantly is a verb because of the unique okurigana しています (reads shiteimasu), the present continuous of "to do", and 勉強 is both extremely common and has a unique reading, べんきょう (reads benkyou), which is the noun 'study'. "I'm studying at a private university": even a machine translation would be accurate here.
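A rough Python sketch of that walk-through, under loud assumptions: the lexicon is a toy holding only the four entries discussed here, and the strategy is a plain greedy longest match, which stands in for the reading rules and is not how a real morphological analyzer works. `LEXICON` and `read_sentence` are illustrative names of mine.

```python
# Toy lexicon: just the words dissected above, with their readings.
LEXICON = {
    "私": "わたし",
    "私立": "しりつ",
    "大学": "だいがく",
    "勉強": "べんきょう",
}
MAX_WORD_LEN = max(len(word) for word in LEXICON)

def read_sentence(sentence: str) -> str:
    """Replace known Kanji words with their readings, longest match first."""
    out = []
    i = 0
    while i < len(sentence):
        # Try the longest candidate first, so 私立 wins over the lone 私.
        for length in range(MAX_WORD_LEN, 0, -1):
            chunk = sentence[i:i + length]
            if chunk in LEXICON:
                out.append(LEXICON[chunk])
                i += len(chunk)
                break
        else:
            # Not in the lexicon (Hiragana, punctuation): copy it through.
            out.append(sentence[i])
            i += 1
    return "".join(out)

print(read_sentence("私は私立大学で勉強しています。"))
# わたしはしりつだいがくでべんきょうしています。
```

The first 私 is followed by は, so only the single-Kanji entry matches; the second is followed by 立, so the two-Kanji entry is found first and the watashi reading never comes up.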
I think the point is that there is no use in sorting words written in the three different Japanese alphabets all together in one list. Microsoft knows it so well it has yet to implement it. In your final thoughts you completely fail to understand that you don't need to attack the problem through "pronunciations". You only have to treat the Kanji with a different approach and transliterate Hiragana/Katakana into romaji, which was already done a long time ago. I hope at least you're going to quit using "pronunciation" in favor of "reading" by the time you're done reading this post. If ever.