Yu Hua Fun Fact

By Eric Abrahamsen, published December 22, 2008, 7:01p.m.

According to Yu Hua, a professor of Chinese once ran his book 许三观卖血记 (translated as Chronicle of a Blood Merchant) through a data cruncher and counted the number of distinct characters Yu Hua had used in writing it. The grand total was 486. Is that even possible?

Update: I asked Yu Hua for more details, he went digging, and it turns out the figure was quite wrong. The actual numbers are 1,909 distinct characters for Chronicle of a Blood Merchant and 1,907 for To Live. That is far more than 486, but still far fewer than you'd expect for two of the more influential novels of the past couple of decades.


Comments

# 1.   

How many total characters in the entire text? That will give us a ratio.

 Canaan Morse, November 24, 2008, 5:20p.m.

# 2.   

An excerpt of 许三观卖血记 in 当代文学100篇 was the first thing I ever really read through in Chinese without cracking a dictionary. I remember thinking: "余华 knows exactly the same 500 characters as I do."

 Dylan, November 24, 2008, 9:11p.m.

# 3.   

Someone's putting someone on. I ran the first ten chapters through SIL's Unicode character count tool and came up with 1308 unique hanzi. That's from an online version, so it might not be entirely correct, but a brief look through the list didn't reveal anything suspicious.

(I had to load the UnicodeCCount results into Excel to get a total count; someone decent in Perl could probably modify the script to generate the stats directly.)

zhwj, November 25, 2008, 1:45a.m.
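For anyone who wants to repeat zhwj's count without the Excel step, and to get the total-to-unique ratio Canaan asked about, here is a minimal Python sketch. It is not the SIL tool or the professor's method. The function names are made up for illustration, and it assumes "hanzi" means any code point in the CJK Unified Ideographs block (U+4E00 to U+9FFF) or Extension A (U+3400 to U+4DBF), so punctuation, Latin letters and digits are ignored.

```python
from collections import Counter


def is_hanzi(ch: str) -> bool:
    """Assumption: a hanzi is any code point in the CJK Unified
    Ideographs block or CJK Extension A. Rarer extensions are ignored."""
    cp = ord(ch)
    return 0x4E00 <= cp <= 0x9FFF or 0x3400 <= cp <= 0x4DBF


def hanzi_stats(text: str):
    """Return (total hanzi, distinct hanzi, total/distinct ratio, counts)."""
    counts = Counter(ch for ch in text if is_hanzi(ch))
    total = sum(counts.values())
    unique = len(counts)
    ratio = total / unique if unique else 0.0
    return total, unique, ratio, counts


if __name__ == "__main__":
    # Tiny made-up sample; the punctuation marks are not counted.
    sample = "许三观卖血记,许三观卖血。"
    total, unique, ratio, counts = hanzi_stats(sample)
    print(f"total={total} unique={unique} ratio={ratio:.2f}")
    # The most frequent characters, most common first.
    print(counts.most_common(3))
```

To run it on a whole novel, read the text from a UTF-8 file and pass the string to `hanzi_stats`. As zhwj notes, an online copy may contain stray characters, so the result is only as good as the source text.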

# 4.   

Hmm, okay, it does sound a little fishy. I'll ask him and see if I can get a precise source for the statistic. Googling/Baidu-ing didn't immediately turn up anything…

Eric Abrahamsen, November 25, 2008, 2:36a.m.

# 5.   

Via the MCLC mailing list: "The lost generation" By Raymond Zhou (China Daily) http://www.chinadaily.com.cn/cndy/2008-11/21/content_7226279.htm

Micah Sittig, November 25, 2008, 2:49p.m.

# 6.   

This emphasis on numbers is a bit humorous. Since the founding of the People's Republic, military terminology, PCthink and (useless) statistics have never fallen out of style.

I've been living in China since the early 1990s, and sometimes I become insensitive to how flat mainstream Chinese lingo can be. Perhaps 5 years ago, I got a copy of 《蛋白质女孩》, and just couldn't put it down. It's no classic, just a fun look at two thirty-something Taipei guys talking about their largely unsuccessful 爱情物语.

As I finished it -- in a record 2 or 3 days -- I realized what, besides the wit, kept me glued to the book: the language. Written by Tom Wang (王文华), a Taiwanese who was (is?) Disney's creative director in Taipei, it's heavily influenced by the language he heard in the US when earning an MBA.

Parts of it sound a lot like rap "light" in Chinese. Mind you, almost no English words in the book, and no Chinglish either. This is Chinese, but very contemporary, very natural, much of it rhyming to boot.

Reading it, I couldn't help thinking to myself: Written Chinese can be so hip, so witty, so...fresh.

I'm not sure he used more than a thousand different characters either.

 Bruce, December 23, 2008, 7:22a.m.

# 7.   

I agree entirely: I think what this statistic suggests is that what Chinese literature needs is not necessarily a great new mission, or loftier goals, but to examine its own foundations, to break things apart a little, and try something new...

 Eric Abrahamsen, December 23, 2008, 7:36a.m.

# 8.   

Put bluntly, 规范汉语 and compelling literature have nothing to do with one another...

 Bruce Humes, December 23, 2008, 10:50a.m.

# 9.   

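"只有那个写作,虽然那个考大学没考上,但是文革中也没有好好学习,但是也认识了大概有四五千个汉字了,我估计那时候最多认识那么多汉字,然后就开始那个写小说,所以为什么到后来就是很多中国的那些批评家们表扬我,说我的语言简洁嘛,我告诉他们是我认识的字不多," -Yu Hua

(Roughly: "There was only writing. I failed the university entrance exam, and I didn't study properly during the Cultural Revolution, but I still knew maybe four or five thousand characters; I figure that was the most I knew back then. Then I started writing fiction. So that's why, later on, a lot of Chinese critics praised me and said my language was spare. I told them it's because I don't know many characters.")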

Isaac, March 2, 2009, 3:58a.m.

# 10.   

Neat quote, Isaac.

Are the words from your reporter's notebook, or...

Bruce www.bruce-humes.com

 Bruce, March 2, 2009, 11:33p.m.

# 11.   

It's from this 2005 interview on Phoenix TV's Face to Face with Celebrities program, Bruce.

jdmartinsen, March 3, 2009, 1:12a.m.
