Minimally invasive surgeries (MIS) offer clear advantages for patients but require more specially trained personnel than conventional open procedures. Modern advances in robotics make it possible to ease these requirements by introducing robotic assistants into the operation. For robot-assisted solo MIS, however, appropriate methods of robot control are crucial. This work presents the design of a flag language, a novel human-to-robot communication method based on real-time image processing of the endoscope video during the intervention. The proposed system combines autonomous positioning of the endoscope holder with recognition of the surgeon's intention through analysis of surgical instrument postures. The proposed algorithm has been evaluated on an experimental setup as well as on video clips from laparoscopic operations. Experimental results show that the flag posture can be detected satisfactorily at a speed of 12 frames per second, and that the system is robust under moderate lighting and noise conditions.