Sunday, December 21, 2008

Talk to foreigners (even aliens) in your language seamlessly

Proposal
Build an automatic translation mechanism into instant messaging systems (and other Internet communication systems), as if a simultaneous interpreter were there.
It would be exciting to master three languages. However, no one can master every language, and even if an interpreter who could existed, hiring him or her would be expensive. The Internet lets people communicate easily from every corner of the world, and if you were equally likely to talk to anyone on Earth, the chance that the other person speaks a foreign language would be much higher than the chance that they speak yours. If we had such an automatic translation system, as if a simultaneous interpreter were always ready, negotiation would be much simpler.

The problem at present is that a machine cannot translate as well as an average interpreter. Languages do not map one-to-one onto each other, and machines are not good at analyzing a sentence and choosing which words to use and in what order. This is why interpreting is still a career.

Whenever a computer is not good at something, we can try to train it in the hope that it will eventually learn and master it. The training material would come from thousands of millions of people. In image recognition, for example, tags from Internet users help a lot in classifying and recognizing pictures. If, one day, computers could seamlessly translate one language into another, we would be happy that we no longer need to learn a second language, and equally sad that we would never again be driven to learn any language other than our native one.

Currently, machine translation in instant messaging systems is still no more than experimental, but at least it could provide some information that might be helpful to people talking across languages. And it is fun to laugh at silly computers.

Wednesday, December 17, 2008

File names you can NOT create on Windows

Try to create a file named NUL.txt on Windows XP, and you will probably get an error message.

Many characters are not allowed in file names on Windows. These include
< > : " / \ | ? *

In addition, a number of reserved device names should be avoided. These include
NUL, CON, PRN, AUX, COM1, COM2, ..., LPT1, LPT2, ...

Also, using these device names unintentionally might introduce a security vulnerability.
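
The two rules above can be checked before a file is created. Here is a minimal sketch in Python, not a complete validator. The function name is_safe_windows_filename is made up for illustration. The ranges COM1 to COM9 and LPT1 to LPT9 are an assumption that goes beyond the lists above, based on Microsoft's file naming documentation. The check also assumes that a reserved name followed by an extension (such as NUL.txt) is still treated as reserved, which matches the NUL.txt example.

```python
# Characters from the list above that Windows rejects in file names.
FORBIDDEN_CHARS = set('<>:"/\\|?*')

# Reserved device names. COM1-COM9 and LPT1-LPT9 are an assumed range.
RESERVED_NAMES = (
    {"NUL", "CON", "PRN", "AUX"}
    | {f"COM{i}" for i in range(1, 10)}
    | {f"LPT{i}" for i in range(1, 10)}
)


def is_safe_windows_filename(name: str) -> bool:
    """Return True if `name` avoids the forbidden characters and device names."""
    if not name:
        return False
    if any(ch in FORBIDDEN_CHARS for ch in name):
        return False
    # Compare only the part before the first dot, case-insensitively,
    # so that "nul.txt" is rejected just like "NUL".
    stem = name.split(".", 1)[0].rstrip(" ").upper()
    return stem not in RESERVED_NAMES
```

With this sketch, is_safe_windows_filename("NUL.txt") and is_safe_windows_filename("a?b.txt") return False, while is_safe_windows_filename("notes.txt") returns True. Running such a check on any name that comes from user input is one way to avoid touching a device name by accident.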

Saturday, December 13, 2008

Dimensional Modeling vs Interval Tree

The concept of Dimensional Modeling is similar to the Interval Tree (Segment Tree). Both speed up queries by storing information at different levels. The difference is that in Dimensional Modeling the levels are defined by users, while in an Interval Tree the levels are usually defined by a complete binary tree. An Interval Tree usually solves one-dimensional problems, but it can also be applied to 2D problems; it is not limited to one dimension.
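
To make "storing information at different levels" concrete, here is a minimal sketch of a one-dimensional segment tree for range sums in Python. The class name SegmentTree, its method names, and the choice of sum as the stored aggregate are all assumptions for illustration. Each internal node keeps the sum of its two children, so a query combines a few precomputed sums instead of adding up every element in the range.

```python
class SegmentTree:
    """Range-sum segment tree stored in a flat list (iterative, bottom-up)."""

    def __init__(self, values):
        self.n = len(values)
        # Leaves live at indices n .. 2n-1; internal nodes at 1 .. n-1.
        self.tree = [0] * self.n + list(values)
        for i in range(self.n - 1, 0, -1):
            self.tree[i] = self.tree[2 * i] + self.tree[2 * i + 1]

    def update(self, index, value):
        """Set values[index] = value and refresh every level above it."""
        i = index + self.n
        self.tree[i] = value
        i //= 2
        while i >= 1:
            self.tree[i] = self.tree[2 * i] + self.tree[2 * i + 1]
            i //= 2

    def query(self, left, right):
        """Return the sum of values[left:right] (half-open range)."""
        total = 0
        lo, hi = left + self.n, right + self.n
        while lo < hi:
            if lo & 1:          # lo is a right child: take it and step right
                total += self.tree[lo]
                lo += 1
            if hi & 1:          # hi is exclusive: step left, then take it
                hi -= 1
                total += self.tree[hi]
            lo //= 2
            hi //= 2
        return total
```

For example, after tree = SegmentTree([5, 3, 8, 6, 2, 7]), the call tree.query(1, 4) adds 3 + 8 + 6 and returns 17. The stored per-level sums play a role similar to the pre-aggregated levels a user defines in Dimensional Modeling, except that here the levels come from the binary tree shape.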