Effective Touch Computing

15 Jun 2015 . Machine Learning

I am not gonna lie: I am a little intoxicated, as I just came back after having a pint with Aaron. While speaking with him on issues transcending humans and talking casually about how life is just a “pattern that reproduces”, it struck me, as a pure result of random neurons firing, that maybe I should approach T5 text entry as a reinforcement learning problem.

You ask, what the fuck is T5 text entry? The more technologically inclined of you might have already guessed that it's the classic T9 text entry reduced to 5 keys. And why is this appealing to me? I believe the world is moving towards touch screens (if it hasn't already). I know you are thinking, “But we already have swipe typing, which now ships built-in on Android and iPhone”, but that is still a keyboard based on QWERTY.

In a world filled with touch screens, we ideally want to tap our fingers anywhere on the screen and be able to type. That's where T5 text entry (or, as I like to call it, the “Wonderboard”, coined by my good friend and colleague David Coulthard as an alternative to keyboard) comes in. The first step is to design letter patterns on 5 keys that minimize the number of keys pressed (eventually, screen taps) while still allowing proper English sentences (for starters). I pondered this idea for a while but failed to make any progress designing a proper loss function to minimize.

The reinforcement learning approach is based on an emulated human typist getting a reward on random English (say, Wikipedia) text. It should get the reward after typing whole sentences, as opposed to after sequential keys, and the objective is to minimize the number of keys pressed. To be a bit more technical: a state space of ~26 to the power of 5, initialized with equal probability, should converge, and modern computers can deal with that kind of space, right?
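A minimal sketch of what that state could look like, assuming a simple table of per-letter probabilities over the 5 keys (the names here are my own illustration, not anything I have implemented):

```python
import string

import numpy as np

# Hypothetical state: each of the 26 letters gets a probability
# distribution over the 5 keys, initialized uniformly at 1/5 each.
letters = string.ascii_lowercase
probs = {c: np.full(5, 1.0 / 5) for c in letters}

# During emulated typing, the agent samples a key for each letter
# from that letter's current distribution.
rng = np.random.default_rng(0)
key_for_a = rng.choice(5, p=probs["a"])
```

The learning step would then nudge each letter's distribution towards the key choices that accumulated the least penalty over whole sentences.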

Let us suppose that each letter can be chosen from 5 different keys with different probabilities (initially, every letter has a uniform probability distribution over all the keys). For each sentence the agent emulates typing, the penalty at the end is based on how often the same key was pressed for consecutive letters during the emulation. For example, if the agent tries to write the word “apple” and picks “a” using key 2, and then for “p” it ends up choosing key 2 again, a penalty of +1 is added to the accumulated penalty. Note, however, that if the same letter appears in consecutive positions, no penalty should be imposed (obviously). This is looking promising.
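To make the penalty rule concrete, here is a quick sketch, assuming a fixed letter-to-key assignment for one emulated word (the `key_of` mapping and the function name are my own, purely for illustration):

```python
def penalty(word, key_of):
    """Count consecutive presses of the same key for *different* letters.

    key_of maps each letter to the key the agent chose for it. Repeated
    identical letters (the 'pp' in 'apple') incur no penalty, per the
    rule above.
    """
    total = 0
    for prev, cur in zip(word, word[1:]):
        if prev != cur and key_of[prev] == key_of[cur]:
            total += 1
    return total


# With 'a' and 'p' both on key 2, "apple" incurs a penalty of 1
# (the a -> p transition); the p -> p repeat is free.
key_of = {"a": 2, "p": 2, "l": 4, "e": 1}
penalty("apple", key_of)  # → 1
```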

This is all for now; I'll update this post as and when I get more ideas. I will also probably start reading (a hard copy, thanks to Aaron for dumping his books off on me) Reinforcement Learning: An Introduction by Sutton and Barto.

Update 2015-07-22: Presentation slides I made for ef.