26 Keys vs. Jiugongge: 45 Years of Chinese Input Methods

Compared with Wubi, a shape-based code, Pinyin is a sound-based code with a lower barrier to entry. More formidable still, Pinyin has the entire nine-year compulsory education system behind it.

01 Where did Wubi come from?

In December 1978, a national academic conference on Chinese character encoding was held in Qingdao.

This "Qingdao Conference" brought together scholars from the Chinese Academy of Sciences and universities across the country. The question they debated seems almost unimaginable today:

How do you get Chinese characters into a computer?

Today, when Sogou and iFLYTEK are everywhere and young people type at lightning speed, it is hard to imagine that Chinese input was a bottleneck technology for China in the 1970s and 1980s.

The reason is simple: there are too many Chinese characters.

The GB2312 character set released in 1980 contained 6,763 Chinese characters, while GB18030-2005, released in 2005, contains as many as 70,244.

[Image: the national standard GB18030 character set]

Moreover, unlike Western alphabets, Chinese characters have no built-in ordering like "ABCD".

At the time, there was no ready-made way to map this mass of unordered characters onto the keys of a keyboard. As a result, Chinese information processing, and input in particular, lagged far behind the West.

At the 1984 Los Angeles Olympic Games, AFP reporters were shocked to see journalists from China still writing their stories by hand: of the 7,000 journalists present, only the Chinese were still writing by hand!

Claims that Chinese characters were "backward" were common in academic circles at the time. Some of the more extreme scholars even argued that, in the computer age, the way forward for Chinese was to replace square characters with a phonetic alphabet.

Arguments for "romanizing Chinese characters", "replacing characters with Pinyin" and even "dual scripts" were endless. Some schools even experimented with teaching entirely in Pinyin.

On the eve of the computer boom, the problem of Chinese character input had to be solved.

Experts and professors put forward different proposals at the conference. The mainstream view was to develop a keyboard designed specifically for Chinese characters.

At the time there were three kinds of dedicated Chinese keyboards. The first was the whole-character keyboard, the "large keyboard".

That is, thousands of Chinese characters were laid out by radical on a keyboard the size of a table. Simple!

If one keyboard was not enough, you could add a second and a third.

The advantage of this scheme is obvious: it is direct and has no learning cost. Beyond that, the drawbacks are just as obvious: touch typing is impossible, and input is extremely slow, topping out at about 10 characters per minute. On top of that, a single keyboard cost thousands of dollars. Bulky and expensive, it was simply not practical.

The second scheme was the "main and auxiliary keyboard", modeled on a Japanese design.

It consists of a main keyboard and an auxiliary keyboard. The auxiliary keyboard selects a zone, and the keys of the main keyboard then enter specific characters within that zone.

This scheme is an improvement on the first: the equipment is smaller, both hands can be used, and typing is much faster.

But it introduces a huge flaw: typists must remember which zone each character belongs to and where it sits within that zone.

Even with only a few thousand characters, that is an enormous learning cost, so this scheme had basically no chance of wide adoption.

The third scheme was the 256-key "Chinese character keyboard", the so-called medium keyboard, proposed by Academician Wang Xuan, the father of Chinese laser phototypesetting.

[Image: Wang Xuan]

This scheme breaks all Chinese characters down into more than 1,000 strokes and radicals and places them on 256 keys. By combining these "parts" according to a character's structure, you can assemble the character you want like Lego.

This keyboard was the first to use "character decomposition", that is, the "shape code" category of Chinese character encoding.

The later Wubi and Zhengma methods, as well as the Cangjie input method created by Zhu Bangfu in 1976, are all shape codes.

[Image: the Cangjie input method]

Compared with the first two schemes, the medium keyboard was clearly far more advanced, but it was still far from perfect.

After all, the English QWERTY keyboard, with only 26 letter keys, was compact and already in use worldwide. Anyone would rather use something ready-made if they could, but Chinese characters really are a special case.

In the same year as the Qingdao Conference, Wang Yongmin of the Nanyang Science and Technology Commission in Henan Province also began working on Chinese character input.

[Image: Wang Yongmin]

Like the mainstream at the time, he initially hoped to build a keyboard dedicated to Chinese characters.

To that end, Wang Yongmin decomposed the more than 12,000 characters in the Modern Chinese Dictionary one by one, classified and counted them, and distilled more than 600 character-forming components, which he named "roots". He then selected 125 of them according to how frequently they occurred.

Through these efforts, he cut the number of keys on a Chinese keyboard from 256 to 188, and then to 62. By 1982 he had developed a 36-key scheme.

What is more, all of this was done without a computer, purely by manual counting.

At this point, two roads lay before him.

One was to promote the 36-key scheme. It was working by 1982 and, as the most advanced scheme of its day, it could realistically have been popularized.

The other was to go back to the research and keep compressing the key count until it fit the 26 letter keys of a QWERTY keyboard.

Bear in mind that the fewer keys there are, the more duplicate codes appear, that is, different characters that map to the same key sequence. Having to pick the right character from a list badly hurts efficiency, so by this stage every further reduction was very hard.
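To make the trade-off concrete, here is a toy simulation. It is not Wang Yongmin's actual statistics: it assumes 6,000 made-up "characters", each a random sequence of two to four roots drawn from 125 roots, and a hypothetical layout that puts root r on key r % n_keys. The names make_toy_characters and duplicate_rate are mine, purely for illustration.

```python
import random
from collections import Counter


def make_toy_characters(n_chars=6000, n_roots=125, seed=0):
    """Build distinct fake characters, each a tuple of 2 to 4 root ids."""
    rng = random.Random(seed)
    chars = set()
    while len(chars) < n_chars:
        length = rng.randint(2, 4)
        chars.add(tuple(rng.randrange(n_roots) for _ in range(length)))
    return sorted(chars)


def duplicate_rate(chars, n_keys):
    """Share of characters whose key sequence is shared with another one."""
    # Hypothetical layout (an assumption): root r sits on key r % n_keys.
    codes = [tuple(root % n_keys for root in roots) for roots in chars]
    counts = Counter(codes)
    clashing = sum(1 for code in codes if counts[code] > 1)
    return clashing / len(chars)


toy = make_toy_characters()
for keys in (125, 62, 36, 26):
    print(keys, round(duplicate_rate(toy, keys), 3))
```

With 125 keys every root has its own key, so no two toy characters collide. Once several roots have to share a key, distinct root sequences start to produce the same keystrokes, and the rate can only rise from zero.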

But Wang Yongmin chose the harder path.

To reduce duplicate codes, after nearly a year of research he invented the "last-stroke character-shape identification code", which encodes the last stroke of each character and cut the duplicate rate by an order of magnitude.

On August 28, 1983, Wang Yongmin's Wubi Zixing ("five-stroke character shape") input method, Wubi for short, was born.

Wubi's first strength is speed.

Because strokes are divided into five categories that are assigned to zones of the keyboard, a user can pin down a character with at most four keystrokes.

With almost no duplicate codes, users rarely need to pick from a candidate list. With practice they can even achieve "double-blind" input, typing without looking at either the keyboard or the screen.

Moreover, compared with the large and medium keyboards, Wubi has fewer keys and a lower learning cost.

Many people born in the 1980s can probably still recite the Wubi root mnemonic: "王旁青头戋五一, 土士二干十寸雨" and so on. Skilled Wubi users do not even need it; they type from muscle memory.

Second, Wubi is fully compatible with the QWERTY keyboard, so no new hardware is needed.

When Wubi was first released, precisely because it did not need a special keyboard, some in academia even accused it of "worshipping and fawning on foreign things". But the market and users voted with their feet.

In 1984, the State Science and Technology Commission and the Commission of Science, Technology and Industry for National Defense successively issued official "red-header documents" promoting Wubi throughout the armed forces.

In 1986, Wubi entered the United Nations through the Ministry of Foreign Affairs. In the same year, the right to use it was sold to DEC of the United States for hundreds of thousands of dollars, making it the first patented computer technology exported from China to the United States.

IBM, Microsoft and Apple later purchased patent licenses as well.

Wubi is still widely used in Southeast Asian countries such as Malaysia and Singapore.

In China, Wubi became the first Chinese input method that a whole generation encountered and learned.

In the 1990s, before computers were widespread, using one was regarded as a vocational skill. Training courses sprang up all over China, and Wubi was their most important subject.

Many domestic keyboards of that era were printed not only with English letters but also with Wubi roots, to make learning easier.

So why did this "modern-day movable type" end up losing to Pinyin input, becoming a byword for "niche" and a relic of its era?

02 Why Wubi lost to Pinyin

In 1995, the Intelligent ABC input method was born.

It is a classic Pinyin input method. Anyone who has used Intelligent ABC must be a veteran netizen like me.

But a classic is still just a classic. To be honest, compared with today's Pinyin input methods, the Intelligent ABC of those days had nothing intelligent about it.

All it did was offer a large lexicon in which users could look up the words and phrases matching the Pinyin they typed.

Even that lexicon was not very practical. Lexicons of the time were compiled mainly from high-frequency words in official media, newspapers and magazines, so they did not match how ordinary people actually wrote, and there was no way to update them online.

Features that are now essential in a Pinyin input method, such as predictive input, fuzzy Pinyin and remembering frequently used words, were out of the question.

As an Intelligent ABC user, I developed a habit over the years: I memorized which candidate page my frequently used words were on.

And because the lexicon was never updated, we sometimes resorted to clumsy tricks to type faster. For example, to type "like tea", the quickest way was to type words containing "like" and "tea" separately and then delete the extra characters.

In short, Pinyin input at that time was nowhere near as efficient as Wubi.

Why was Pinyin input inefficient?

One reason, as mentioned earlier, is that Pinyin has a very high duplicate-code rate. Ignoring tones, there are only a little over 400 syllables, and they have to cover thousands of Chinese characters.

This means that with Pinyin you not only type but also pick characters, and typing the code is often quicker than finding the character.

In particular, some syllables, such as zhi, chi and shi, correspond to a large number of homophones.

If your name contains an uncommon character pronounced yi and you can only use Pinyin, congratulations: the GB18030 character set contains 407 characters whose only reading is yi. Typing is a moment of fun; hunting for the character is a funeral.
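A quick back-of-the-envelope calculation shows the scale of the problem. It uses the 6,763 characters of GB2312 and the 407 yi characters quoted in this article. Rounding "over 400" syllables down to exactly 400 and showing 9 candidates per page are my own assumptions.

```python
import math

GB2312_CHARACTERS = 6763   # figure quoted for GB2312
TONELESS_SYLLABLES = 400   # assumption: "over 400" rounded down
CANDIDATES_PER_PAGE = 9    # assumption: size of a candidate window


def average_homophones(n_characters, n_syllables):
    """Average number of characters sharing one toneless syllable."""
    return n_characters / n_syllables


def pages_to_scan(n_candidates, per_page=CANDIDATES_PER_PAGE):
    """Worst-case number of candidate pages to flip through."""
    return math.ceil(n_candidates / per_page)


print(round(average_homophones(GB2312_CHARACTERS, TONELESS_SYLLABLES), 1))  # 16.9
print(pages_to_scan(407))  # 46
```

Even with the small GB2312 set, a syllable maps to roughly 17 characters on average. Under these assumptions, an unlucky yi could sit as far as 46 pages deep.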

The first of the three design principles for shape codes put forward by Wang Yongmin is the "compatibility principle": the duplicate-code rate must be low, preferably within 2%, so that codes are as unique as possible. With its huge number of homophones, Pinyin clearly fails this test.

Shuangpin (double Pinyin) input appeared later to improve efficiency. But as long as a method is based on Pinyin, the duplicate-code problem cannot be solved.

In addition, Wubi's key layout was deliberately designed.

As its key chart shows, it is likewise based on the QWERTY keyboard, yet the typing load on the two hands is roughly balanced.

Pinyin input, by contrast, keeps the right hand busy while the left hand is on holiday. And if you touch type with standard fingering, you will find that the strong left index finger has little to do, while the weak little finger carries a heavy load.

Simply put, Pinyin typing is not very ergonomic.
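A rough way to check this yourself is to count how many keystrokes land on each hand. The sketch below assumes the standard touch-typing hand assignment on QWERTY, and hand_load is a hypothetical helper of my own. Two short phrases prove nothing statistically, so treat it as a measuring stick rather than evidence.

```python
# Standard touch-typing split of the 26 letter keys (an assumption of this sketch).
LEFT_HAND = set("qwertasdfgzxcvb")
RIGHT_HAND = set("yuiophjklnm")


def hand_load(pinyin):
    """Return (left, right) keystroke counts for a string of Pinyin letters."""
    letters = [ch for ch in pinyin.lower() if ch.isalpha()]
    left = sum(ch in LEFT_HAND for ch in letters)
    right = sum(ch in RIGHT_HAND for ch in letters)
    return left, right


print(hand_load("woaini"))     # (2, 4)
print(hand_load("nishiwode"))  # (4, 5)
```

Running it over a large Pinyin corpus would be the fair test. The helper only does the counting.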

Of course, the blame for this really belongs to the QWERTY keyboard, whose key layout is itself unreasonable. I will come back to that later.

So in terms of both efficiency and ergonomics, Wubi was far ahead of Pinyin input.

What really rescued Pinyin input from this awkward position was storage and the Internet.

Pinyin input depends heavily on its lexicon, and a lexicon needs a lot of storage. In the 1980s, when memory was measured in kilobytes and disk space in megabytes, there was simply not enough room for one.

Only in the 1990s did mainstream computers reach gigabyte-class hard disks and tens of megabytes of memory, which allowed Pinyin lexicons to grow ever larger.

The larger the lexicon, the more words can be matched, and the higher the efficiency.

The Internet brought even greater change.

In 2006, the Sogou input method was born. In basic principle it differed little from Intelligent ABC or Ziguang Pinyin. The key difference was that it vastly expanded its lexicon using Sogou's search engine.

The Pinyin a user typed was no longer matched only against a local lexicon, but against the vocabulary of the entire Chinese Internet.

Sogou also added whole-sentence input and predictive input. Typing a whole sentence at once is far more efficient than typing one word at a time, and often you do not even have to type the full Pinyin, because the system completes it for you.

It also records the words you use most often, so it becomes more convenient the more you use it.
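The "learns as you type" behavior can be sketched in a few lines. This is only a toy model under my own assumptions, not how Sogou actually works: ToyPinyinIME, its tiny lexicon and the base frequencies are all made up, and a committed word simply jumps ahead of words the user has picked less often.

```python
from collections import defaultdict


class ToyPinyinIME:
    """Ranks candidates by user picks first, then by a static base frequency."""

    def __init__(self, lexicon):
        self.lexicon = lexicon                # {pinyin: {word: base_frequency}}
        self.user_counts = defaultdict(int)   # {(pinyin, word): times committed}

    def candidates(self, pinyin):
        words = self.lexicon.get(pinyin, {})
        return sorted(
            words,
            key=lambda w: (-self.user_counts[(pinyin, w)], -words[w], w),
        )

    def commit(self, pinyin, word):
        """Record that the user picked this word for this Pinyin."""
        self.user_counts[(pinyin, word)] += 1


# Hypothetical lexicon with invented frequencies.
ime = ToyPinyinIME({"shi": {"是": 100, "时": 60, "事": 50, "市": 30}})
print(ime.candidates("shi"))  # ['是', '时', '事', '市']
ime.commit("shi", "市")
print(ime.candidates("shi"))  # ['市', '是', '时', '事']
```

A real input method would blend the two signals rather than letting a single pick override everything, but the idea is the same: the candidate list reorders itself around your habits.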

These features made the efficiency of Pinyin input take off. Sogou Input Method grabbed 90% of the market within a year, and even after Tencent, Baidu and Google joined the fight, its share never fell below 70%.

Objectively speaking, though, none of these input methods made Pinyin itself more precise or lowered its duplicate-code rate. They simply used big data, big lexicons and network features to brute-force better matching.

It reminds me of a joke about programmers:

What do you do when a product is not fast enough? A programmer has two choices. One is to work hard for six months writing a very complicated program to improve performance. The other is to spend those six months on vacation, sleeping at home. By the time you wake up, top-end hardware has dropped in price, and you solve the problem by simply swapping the hardware.

The real advantage of Pinyin over Wubi is its lower barrier to entry.

A Chinese character is a trinity of shape, sound and meaning. Of the three, we usually grasp sound and meaning first: we learn to listen and speak before we learn to read and write.

Moreover, we often think in sound. When we think, there is a voice in our head.

Compared with Wubi's shape code, Pinyin, as a sound code, has a lower barrier and is closer to our instincts.

More formidable still, Pinyin has the entire nine-year compulsory education system behind it.

Pinyin input is not free of a learning curve, but few people need to learn Pinyin specifically for it.

We start learning a, o, e, i, u, ü in our very first Chinese class in the first grade of primary school, and Pinyin stays with us in Chinese class all the way to the college entrance examination.

With the consolidation rate of nine-year compulsory education above 90% in China, Pinyin practically comes factory-installed in this generation of young Chinese.

Interestingly, quite a few Wubi users were born in the 1960s. They did not receive a full Pinyin education but can read and write, so these middle-aged and older users are actually more comfortable with Wubi.

But against China's increasingly complete education system and its huge population of Pinyin users, Wubi never stood a chance.

This large user base, with its commercial potential, is also why Internet giants have been willing to pour people and money into Pinyin input methods, even though input methods are a marginal market.

In the mobile Internet era, Pinyin truly secured its throne.

03 Jiugongge vs. full keyboard

I still remember that when I was at school, boys and girls alike bought their school uniforms a size too large, with sleeves long enough to hide their hands. Half of them did it to hide earphones and listen to music in class; the other half tucked their phones up their sleeves and texted each other during lessons.

Except for me. No girl was texting me, so I used my phone in class to follow live text commentary of NBA games.

There were no big-screen smartphones then, but because phones had a physical nine-key keypad, almost every student was a master of blind typing.

With the spread of smartphones, those of us who grew up on the nine-key pad found ourselves using a full keyboard again.

The history of mobile phone input begins in the 1990s.

In 1993, Nokia released the first GSM mobile phone that could send SMS messages.

From then on, the mobile phone was no longer just a telephone; it had entered the age of text.

But what really turned the phone into an efficient tool for text communication was an input technology called T9.

In Western countries, telephone numbers used to contain both digits and letters, so the number keys of fixed-line phones were labeled with letters. The letters were not only for typing; they also helped users remember telephone numbers.

This layout is known as the E.161 keypad. Interestingly, for a long time it had no Q or Z.

In 1995, an American company named Tegic Communications was founded. It specialized in technology for small electronic devices and was later acquired by AOL. This is the company that developed the T9 input method.

T9 used predictive text, the word-prediction feature mentioned earlier, well before Sogou did, and it supports not only Pinyin but also stroke input.

This technology was a qualitative leap for typing on mobile phones.
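The core idea behind predictive keypad input can be sketched as a dictionary lookup keyed by digit sequences. This is a minimal illustration, not Tegic's actual implementation: the word list is made up, the helper names are mine, and the keypad mapping is the modern layout that includes Q and Z.

```python
from collections import defaultdict

# Modern keypad letter layout (assumed here; early keypads lacked Q and Z).
KEYPAD = {"2": "abc", "3": "def", "4": "ghi", "5": "jkl",
          "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz"}
LETTER_TO_DIGIT = {ch: digit for digit, letters in KEYPAD.items() for ch in letters}


def to_digits(word):
    """Translate a word into the digit keys that spell it, one press per letter."""
    return "".join(LETTER_TO_DIGIT[ch] for ch in word.lower())


def build_index(words):
    """Group dictionary words by the digit sequence that produces them."""
    index = defaultdict(list)
    for word in words:
        index[to_digits(word)].append(word)
    return index


index = build_index(["good", "home", "gone", "hood", "hello"])
print(index["4663"])   # ['good', 'home', 'gone', 'hood']
print(index["43556"])  # ['hello']
```

One press per letter is enough because the dictionary resolves the ambiguity. When several words share a sequence, as with 4663, the user still has to pick one, which is the keypad version of the duplicate-code problem.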

In the late 1990s, T9 began to be widely adopted by Samsung, Nokia and other handset makers. It is what we commonly call "Jiugongge", the nine-square grid.

I remember that when Jiugongge was at its peak, it spawned plenty of number codes. For example, 962464 stands for "I love you" and 647449633 stands for "You are mine".
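These codes are simply the Pinyin typed on a nine-key pad: "wo ai ni" and "ni shi wo de". A small sketch reproduces both, assuming the modern letter layout (with Q on 7 and Z on 9) and a hypothetical helper pinyin_to_keys:

```python
# Modern keypad letter layout (an assumption of this sketch).
KEYPAD = {"2": "abc", "3": "def", "4": "ghi", "5": "jkl",
          "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz"}
LETTER_TO_DIGIT = {ch: digit for digit, letters in KEYPAD.items() for ch in letters}


def pinyin_to_keys(pinyin):
    """Convert toneless Pinyin to keypad digits, ignoring spaces."""
    return "".join(LETTER_TO_DIGIT[ch] for ch in pinyin.lower() if ch != " ")


print(pinyin_to_keys("wo ai ni"))      # 962464
print(pinyin_to_keys("ni shi wo de"))  # 647449633
```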

Perhaps this was the cheesy romance of the post-80s and post-90s generations.

The marriage of the full keyboard and the mobile phone dates back to 1996.

That year, Nokia released the Nokia 9000, the world's first mobile phone with a full keyboard.

Its design resembles a laptop. From the outside it looks like an ordinary mobile phone, but open the lid and you find a large screen and a complete QWERTY keyboard.

As a PDA-class product for business users, the Nokia 9000 could send and receive email, and it also had a browser for surfing the Internet.

In 1996, this was practically black magic.

For a long time, the full keyboard was regarded as standard equipment on business phones.

Later, BlackBerry, Palm and Nokia's E-series all targeted business users, and many of their models had full keyboards. Slide-out keyboards were also popular for a while, as on Nokia's N97.

It was not until 2007, with the release of the iPhone, that this changed.

For a long time, Apple stuck stubbornly to the virtual full keyboard. It was only with iOS 7 in 2013 that Apple began to support Jiugongge, and with iOS 8 the following year it opened up to third-party input methods.

Meanwhile, mainstream smartphone makers long set the full keyboard as the default input mode, which is how full-keyboard input finally became the mainstream of the mobile Internet era.

This also set off a long-running debate: which is more efficient, Jiugongge or the full keyboard?

What follows is just my personal take, so take it or leave it.

I have read a lot of threads on this, and both sides have their fans. Jiugongge users say it is easier to use with one hand and less prone to mis-taps, which makes blind typing easier.

Full-keyboard users retort that phone screens are now big enough that mis-taps are rare, and that typing with both hands on a full keyboard is obviously more accurate and more efficient.

At this point Shuangpin users chime in that once you have practiced and gotten used to it, Shuangpin is the fastest.

And the Wubi diehards say all of them are amateurs, and none is as fast as Wubi.

But I have found that most people are not really arguing about efficiency at all.

I prefer Jiugongge simply because I was born in the 1990s and used Jiugongge on a Nokia before smartphones came along.

For people whose first phone had a full keyboard, an input method that matches the computer keyboard is obviously more natural. Why bother learning Jiugongge?

Take my dad. In his day there was no nine-year compulsory education, and many people never learned Pinyin properly. His first experience of typing was Wubi, so there was no way he would switch to Pinyin.

Input methods do differ in efficiency. But in everyday chatting and writing, such small differences can be ignored entirely.

What we are after is not efficiency at all. We are just defending our habits and the era we grew up in.

Often, when I am about to argue with someone, I suddenly realize that what drives my frantic search for arguments online is not a commitment to being right but a commitment to my own side.

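An input method is merely a tool for communication and output. Just as we never judge the value of speech by how fast it is spoken, "what to write" matters more than "how fast to write".

This empty pursuit of "efficiency" often turns into a kind of inefficiency.

It reminds me of a period when I was obsessed with "productivity" tools and often recommended software I liked on WeChat Moments.

I studied Luhmann's card-note method, tried notion, combined flomo, obsidian, craft and others to build a knowledge structure, and paid for both Mubu and xmind. For writing software alone, I went through ulysses, metion, bear, iawriter, typora and more, all for a tiny gain in "efficiency".

Then one day I came across a post whose author, a veteran user of productivity software, compared the strengths and weaknesses of each tool in detail and shared their own experience.

The post ended with: Now I have started using pen and paper.

I suddenly felt as if I were standing in the Sword Tomb, reading Dugu Qiubai's inscription: "After forty, I was no longer bound to any object; grass, wood, bamboo and stone could all serve as swords. Cultivating from there, I gradually approached the state where having no sword beats having one."

Now that, damn it, is true mastery.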