This paper addresses two obstacles to accurate gesture recognition on mobile devices. First, gesture recognition performance is highly dependent on feature selection, but the optimal features typically vary from gesture to gesture. Second, diverse user behaviors and mobile environments result in extremely large intra-class variations. We tackle these issues by introducing a new network layer, called an adaptive hidden layer (AHL), which generalizes a hidden layer in deep neural networks and dynamically generates an activation map conditioned on the input. To this end, an AHL is composed of multiple neuron groups and an extra selector. The neuron groups compile multi-modal features captured by mobile sensors, while the selector adaptively picks a plausible group for each input sample. The AHL is end-to-end trainable and can generalize an arbitrary subset of hidden layers. With a series of AHLs, the expressive power of exponentially many forward paths allows us to choose proper multi-modal features in a sample-specific fashion and to resolve the problems caused by the unfavorable variations in mobile gesture recognition. The proposed approach is evaluated on a benchmark for gesture recognition and on a newly collected dataset, where it achieves superior performance. Copyright © 2018, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.
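The structure described in the abstract, several neuron groups plus a selector that picks one group per input sample, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the class name `AdaptiveHiddenLayerSketch`, the linear selector, the ReLU activation, the hard argmax choice, and the helper `num_forward_paths` are all assumptions made for illustration, since the abstract does not specify these details.

```python
import numpy as np


class AdaptiveHiddenLayerSketch:
    """Illustrative only: several neuron groups and a selector that
    picks one group per input sample (all details are assumptions)."""

    def __init__(self, in_dim, out_dim, num_groups, seed=0):
        rng = np.random.default_rng(seed)
        # One weight matrix and bias per neuron group (hypothetical parameterization).
        self.W = rng.normal(0.0, 0.1, size=(num_groups, in_dim, out_dim))
        self.b = np.zeros((num_groups, out_dim))
        # Selector modeled as a linear scorer over groups (assumed form).
        self.Ws = rng.normal(0.0, 0.1, size=(in_dim, num_groups))

    def forward(self, x):
        # x has shape (batch, in_dim).
        scores = x @ self.Ws                    # (batch, num_groups)
        chosen = scores.argmax(axis=1)          # one group index per sample
        # ReLU activations of every group: (batch, num_groups, out_dim).
        pre = np.einsum("bi,gio->bgo", x, self.W) + self.b
        all_acts = np.maximum(0.0, pre)
        # Keep only the activation map of the group chosen for each sample.
        out = all_acts[np.arange(x.shape[0]), chosen]
        return out, chosen


def num_forward_paths(num_groups, num_layers):
    # Stacking layers that each choose among num_groups groups yields
    # num_groups ** num_layers distinct forward paths.
    return num_groups ** num_layers


if __name__ == "__main__":
    layer = AdaptiveHiddenLayerSketch(in_dim=6, out_dim=4, num_groups=3)
    x = np.random.default_rng(1).normal(size=(5, 6))
    out, chosen = layer.forward(x)
    print(out.shape, chosen)
    print(num_forward_paths(num_groups=3, num_layers=4))
```

The hard argmax used here is not differentiable, so an end-to-end trainable layer of the kind the abstract describes would need some other selection mechanism; the sketch only shows the sample-specific routing and why the number of forward paths grows exponentially with the number of stacked layers.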
|Original language||American English|
|Title of host publication||32nd AAAI Conference on Artificial Intelligence, AAAI 2018|
|Number of pages||9|
|State||Published - 2018|