October 29, 2008

Natural Language Processing

'Pentagon!' - One word - that's normally all that is communicated between commuters each morning as drivers pick up riders for the HOV lanes here in Northern VA. With three occupants, a car can use a special section of I-395 reserved for high-occupancy vehicles. It's a fast way to get to work, and the process is known as 'slugging' here in the suburbs of our nation's capital.

After the cry of 'Pentagon!' out the window, two people will get into the driver's vehicle, perhaps mutter 'Thanks!', and then no other words will be spoken until they get out at the end of the ride. Again, a quick 'Thanks!', 'Thank you' or 'Thanks for the ride!' is called out as the transaction is completed.

It's a fairly complex transaction from the standpoint of an outsider, but once the protocol is picked up, those of us who participate understand it completely. 90% of the transactions are completed with fewer words than this sentence contains. Humans communicate in short bursts for many of the repeated transactions in their daily lives. It got me thinking about communication, the Turing Test, computers and, oddly, my 7th grade English classes on thesis paper construction.

In English class, we learned to write different types of papers. The construction algorithm of an argument was explained to us in simple terms {statement of hypothesis, supporting argument sentences, closing resolution/restatement of hypothesis}. We were taught how to construct sentences, how to construct paragraphs, and so on.
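To make that classroom recipe concrete, here is a rough Python sketch of the three-part structure. The function name `build_argument` and the 'In short, ...' closing phrase are my own placeholders, not anything taken from a real NLP library, and a real generator would need far more than string joining.

```python
def build_argument(hypothesis: str, supporting_points: list[str]) -> str:
    """Assemble a paragraph as: hypothesis, supporting sentences, restatement.

    Hypothetical helper for illustration only.
    """
    if not hypothesis:
        raise ValueError("a hypothesis is required")
    # Restate the hypothesis as the closing sentence, lowercasing its first letter.
    closing = "In short, " + hypothesis[0].lower() + hypothesis[1:]
    return " ".join([hypothesis, *supporting_points, closing])


paragraph = build_argument(
    "Slugging saves commuters time.",
    [
        "The HOV lanes move faster than the regular lanes.",
        "Riders and drivers skip the small talk.",
    ],
)
print(paragraph)
```

The point is only that the structure is mechanical: the hard part, choosing what to argue and why, is not in the template at all.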

Much of what we do in our day-to-day lives is driven by our unique needs and goals. Thinking of speech in terms of dopamine receptors and language centers may help us engineer a natural language processor that seems more human and is able to pass the Turing Test. Additionally, we need to take into account concepts such as 'shame' and 'fear' to give our AI different personalities. I know that I put together sentences that I never utter because of these modifiers (not wanting to be punched in the nose or start an altercation, saving face, etc.).
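One way to picture those 'shame' and 'fear' modifiers is as a filter that sits between the sentences a program composes and the ones it actually says. The sketch below is purely illustrative: the class names, the 0-to-1 risk scores, and the tolerance rule are all assumptions of mine, and scoring a real sentence for risk is the unsolved part.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    text: str
    confrontation_risk: float  # 0.0 (harmless) to 1.0 (likely to start a fight)
    embarrassment_risk: float  # 0.0 (safe) to 1.0 (likely to lose face)


@dataclass(frozen=True)
class Personality:
    fear: float   # 0.0 (fearless) to 1.0 (avoids all confrontation)
    shame: float  # 0.0 (shameless) to 1.0 (avoids all embarrassment)


def utterable(candidates: list[Candidate], who: Personality) -> list[str]:
    """Return the sentences this personality would actually say out loud."""
    spoken = []
    for c in candidates:
        # A sentence is suppressed when its risk exceeds what the personality tolerates.
        too_scary = c.confrontation_risk > 1.0 - who.fear
        too_shameful = c.embarrassment_risk > 1.0 - who.shame
        if not (too_scary or too_shameful):
            spoken.append(c.text)
    return spoken


candidates = [
    Candidate("Thanks for the ride!", 0.0, 0.0),
    Candidate("You drive terribly.", 0.9, 0.3),
]
timid = Personality(fear=0.8, shame=0.8)
bold = Personality(fear=0.0, shame=0.0)
print(utterable(candidates, timid))
print(utterable(candidates, bold))
```

Different settings of the two numbers give different 'personalities' from the same pool of composed sentences, which is roughly what I have in mind.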

I saw something about a semantic dictionary that I probably need to look into; it would help with some of the sentence/paragraph construction logic. But I think much of the work that hasn't been done is on the 'motivation' side of speech analysis. For passing the Turing Test, the motivation is clear - convince the human that you are human (even though you are not). I'll need to do some research into the thinking/speaking patterns used in deceit. Maybe I can start by analyzing the structure of political speeches and news interviews, working my way up [or down] to criminal defendant arguments on the stand.

A library of speech structures/patterns needs to be created that fits different motivations for speech, so that language is more real to the computer and motivation becomes more of a driving force in the structure of the speech itself. Hopefully the semantic dictionary will help avoid the faux pas of not understanding hidden meaning in sentences by providing further context for the words that are used. Perhaps even irony and back-handed compliments could be detected in analyzing input (or better yet, delivered by the program itself).
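As a first stab at what such a library might look like, here is a tiny Python sketch that maps a motivation to a list of sentence templates. The motivation names, the templates, and the `render` function are all made up for illustration; a real library would be learned from transcripts rather than typed in by hand.

```python
# Hypothetical pattern library: motivation -> sentence templates.
PATTERNS: dict[str, list[str]] = {
    "request_ride": ["{destination}!"],
    "gratitude": ["Thanks!", "Thank you.", "Thanks for the ride!"],
    "persuade": ["I believe {claim}.", "Consider that {claim}."],
}


def render(motivation: str, variant: int = 0, **slots: str) -> str:
    """Pick a template for the given motivation and fill in its slots."""
    templates = PATTERNS.get(motivation)
    if templates is None:
        raise KeyError(f"no patterns recorded for motivation {motivation!r}")
    # Wrap around so any variant number selects a valid template.
    template = templates[variant % len(templates)]
    return template.format(**slots)


print(render("request_ride", destination="Pentagon"))
print(render("gratitude", variant=2))
```

Even this toy version covers the whole slugging exchange, which says something about how little language some transactions need once the motivation is known.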

Just my thoughts for today...later

