 , , . , HAL 9000 . « 2001 »
1976 , «» 
, (. 
"Hearing lips and seeing voices", Nature 264, 746-748, 23 December 1976, doi: 10.1038/264746a0).
— . , (, ), . . . , .
. , , , , .. , , 
HAL 9000.
, . , , , .
— . , . c 
( ) . , . , . 
17±12% 30 21±11% ( ).
— , . , , - . . , . , , .
. 
LipNet , , .
 "please" () "lay" () , , ()
LipNet: LSTM (long short-term memory). . (Connectionist Temporal Classification, CTC), , , .
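As a rough illustration of the CTC idea named above, here is a minimal sketch of greedy CTC decoding: take the most probable symbol in each video frame, merge consecutive repeats, then drop the blank symbol. The blank character `_`, the toy alphabet, and the function names are assumptions made for this example; this is not LipNet's actual decoder.

```python
from itertools import groupby

BLANK = "_"  # hypothetical blank symbol chosen for this sketch


def ctc_collapse(frame_labels, blank=BLANK):
    """Merge consecutive repeated labels, then remove blanks."""
    merged = (label for label, _ in groupby(frame_labels))
    return "".join(label for label in merged if label != blank)


def greedy_decode(frame_probs, alphabet, blank=BLANK):
    """Pick the most probable symbol for each frame, then collapse."""
    best = [max(range(len(alphabet)), key=row.__getitem__) for row in frame_probs]
    return ctc_collapse([alphabet[i] for i in best], blank)


# Toy per-frame distributions over the alphabet ["_", "l", "a", "y"].
alphabet = ["_", "l", "a", "y"]
frame_probs = [
    [0.10, 0.70, 0.10, 0.10],  # "l"
    [0.10, 0.60, 0.20, 0.10],  # "l" again (merged with the previous frame)
    [0.80, 0.10, 0.05, 0.05],  # blank
    [0.10, 0.10, 0.70, 0.10],  # "a"
    [0.10, 0.10, 0.10, 0.70],  # "y"
]
decoded = greedy_decode(frame_probs, alphabet)
```

The blank matters for doubled letters: `t`, `o`, `_`, `o` collapses to "too", whereas `t`, `o`, `o` would collapse to "to". This is how the per-frame output can be aligned with text without frame-level labels.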
LipNet. T, - () (STCNN), . (), LSTM. LSTM SoftMax
GRID 93.4%. ( ), .
| Method | Dataset | Size | Output | Accuracy |
|---|---|---|---|---|
| Fu et al. (2008) | AVICAR | 851 |  | 37.9% |
| Zhao et al. (2009) | AVLetter | 78 |  | 43.5% |
| Papandreou et al. (2009) | CUAVE | 1800 |  | 83.0% |
| Chung & Zisserman (2016a) | OuluVS1 | 200 |  | 91.4% |
| Chung & Zisserman (2016b) | OuluVS2 | 520 |  | 94.1% |
| Chung & Zisserman (2016a) | BBC TV | >400000 |  | 65.4% |
| Wand et al. (2016) | GRID | 9000 |  | 79.6% |
| LipNet | GRID | 28853 |  | 93.4% |
GRID :
command(4) + color(4) + preposition(4) + letter(25) + digit(10) + adverb(4),.
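To see how constrained this grammar is, the sketch below counts the sentences it can produce and draws a random one. The category sizes come from the formula above; the concrete word lists are an assumption based on the published GRID corpus vocabulary, and the helper names are made up for this example.

```python
import random
from math import prod

# Word lists are assumed from the GRID corpus description; only the
# category sizes (4, 4, 4, 25, 10, 4) are taken from the formula above.
GRID_GRAMMAR = {
    "command": ["bin", "lay", "place", "set"],
    "color": ["blue", "green", "red", "white"],
    "preposition": ["at", "by", "in", "with"],
    "letter": [c for c in "abcdefghijklmnopqrstuvwxyz" if c != "w"],
    "digit": ["zero", "one", "two", "three", "four",
              "five", "six", "seven", "eight", "nine"],
    "adverb": ["again", "now", "please", "soon"],
}


def total_sentences(grammar=GRID_GRAMMAR):
    """Number of distinct sentences: the product of the category sizes."""
    return prod(len(words) for words in grammar.values())


def sample_sentence(rng, grammar=GRID_GRAMMAR):
    """Draw one word per category, in grammar order."""
    return " ".join(rng.choice(words) for words in grammar.values())


count = total_sentences()  # 4 * 4 * 4 * 25 * 10 * 4 = 64000
example = sample_sentence(random.Random(0))
```

With only 64,000 possible sentences, each built from six fixed slots, the task is much narrower than open-vocabulary lip reading.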
, 93,4% — - , . , . , .
LipNet .
ICLR 2017 
4 2016 .