The Kanji Project
Ruby, XSLT
Being a lazy person, I decided to prioritise my study of kanji by studying the most frequently occurring ones first.
To find out which those were, I set a spider loose on some Japanese websites (the kind I actually read), counting the occurrences of each kanji until it had gathered about a million data points. I then generated a report in XML and ran it through some XSLT to produce a styled list of the two thousand most frequent characters.
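The spider script itself isn't reproduced here, but the counting step could look roughly like the following Ruby sketch. It assumes the page text has already been fetched and decoded to UTF-8. The method names `count_kanji` and `top_kanji` are made up for illustration, and `\p{Han}` (the Unicode Han script property) is used as a stand-in for "is this a kanji?".

```ruby
# Tally how often each Han character appears in a piece of text.
# Kana, Latin letters and punctuation do not match \p{Han} and are skipped.
# Passing the same `counts` hash in again accumulates totals across pages.
def count_kanji(text, counts = Hash.new(0))
  text.scan(/\p{Han}/) { |kanji| counts[kanji] += 1 }
  counts
end

# Return the n most frequent characters as [character, count] pairs,
# most frequent first; ties are broken by code point order.
def top_kanji(counts, n = 2000)
  counts.sort_by { |kanji, count| [-count, kanji] }.first(n)
end

# Tiny demonstration on a made-up sentence rather than real crawl data.
counts = count_kanji("日本語の本を読む。日本の本。")
top_kanji(counts, 3).each { |kanji, count| puts "#{kanji}\t#{count}" }
```

A real crawl would call `count_kanji` once per downloaded page with the same hash, stopping once the total passes whatever threshold is wanted.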
The list
Each entry on the list is cross-referenced to the corresponding entry in WWWJDIC so that you can easily look up the details of an unfamiliar character.
Here is the list of the top 2000 kanji. It’s a quarter of a megabyte in size, and may tax some browsers. I’m serving it compressed to save my bandwidth.
The raw data
I’ve had a few requests for the raw data, which I’m happy to oblige. It’s a gzipped XML file, and should be fairly easy to process into any format you desire.
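As a rough idea of what that processing might look like, here is a Ruby sketch that reads gzipped XML and prints tab-separated values. The layout of the real file isn't described on this page, so the `<kanji char="..." count="..."/>` elements and the `read_counts` method are purely hypothetical; adjust the XPath and attribute names to match whatever the file actually contains.

```ruby
require "zlib"
require "stringio"
require "rexml/document"

# HYPOTHETICAL layout: assumes one <kanji char="..." count="..."/> element
# per character. The real file's element and attribute names may differ.
def read_counts(gzipped_io)
  xml = Zlib::GzipReader.new(gzipped_io).read
  doc = REXML::Document.new(xml)
  counts = {}
  doc.elements.each("//kanji") do |el|
    counts[el.attributes["char"]] = el.attributes["count"].to_i
  end
  counts
end

# Demonstration on a made-up, in-memory sample instead of the real download.
sample_xml = '<kanji-list><kanji char="本" count="4"/><kanji char="日" count="2"/></kanji-list>'
sample = StringIO.new(Zlib.gzip(sample_xml))
read_counts(sample).each { |kanji, count| puts "#{kanji}\t#{count}" }
```

For a file on disk, opening it in binary mode with `File.open(path, "rb")` and passing the handle to `read_counts` should work the same way.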
2005-08-10 17:43 UTC. Comments: 4.
Snowtweety
Wrote at 2007-07-27 16:22 UTC using Safari 419.3 on Mac OS X:
What a great project! I found it helpful for refreshing my knowledge of Chinese Radicals since they share some of the same characters.
a ruby nuby
Wrote at 2007-11-06 06:09 UTC using Opera 9.24 on Windows 2000:
Nice. I see the top 100 on kanji-a-day dot com as well. Any chance of seeing the Ruby script that made it all possible?
cheers
walter
Anki-guy
Wrote at 2009-04-06 04:56 UTC using Firefox 3.0.8 on Windows XP:
How does your list compare to the official joyo kanji? Or to the JLPT kanji?
Oukila Mohamed Yassin
Wrote at 2009-12-19 09:33 UTC using Firefox 3.5.6 on Windows XP:
You’re awesome