We develop an interface that bridges crowd simulation and natural language processing so that casual users can produce crowd animation from text input. The interface uses a parser and a tagger to analyze simple English sentences and convert them into intermediate data structures that encapsulate the essential elements of a crowd. Processing proceeds in five stages: preprocessing, parsing of input sentences, crowd generation, animation adjustment, and crowd animation. The system supports basic behaviors including standing, walking, running, escaping, being attracted, and queuing. We conducted a user experience study to evaluate the interface. The results show that casual users find the interface easy to use for producing crowd animation.
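
To make the idea of an intermediate data structure concrete, the following is a minimal sketch of how a simple English sentence might be reduced to the essential elements of a crowd. It is not the system's implementation: the `CrowdSpec` fields, the `parse_sentence` function, and the keyword tables are all hypothetical names chosen for illustration, and plain keyword matching stands in for the parser and tagger.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Assumption: word stems mapped to the basic behaviors listed in the abstract.
BEHAVIOR_STEMS = {
    "stand": "standing",
    "walk": "walking",
    "run": "running",
    "escap": "escaping",
    "attract": "being attracted",
    "queu": "queuing",
}

# Assumption: a small table of number words; digits are handled separately.
NUMBER_WORDS = {
    "one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
    "six": 6, "seven": 7, "eight": 8, "nine": 9, "ten": 10,
    "twenty": 20, "thirty": 30, "fifty": 50, "hundred": 100,
}

PREPOSITIONS = {"to", "toward", "towards", "from", "at"}
ARTICLES = {"a", "an", "the"}


@dataclass
class CrowdSpec:
    """Hypothetical intermediate structure describing one crowd."""
    count: Optional[int] = None      # number of agents
    agent: Optional[str] = None      # kind of agent, e.g. "people"
    behavior: Optional[str] = None   # one of the basic behaviors
    target: Optional[str] = None     # goal or attractor, if mentioned


def parse_sentence(sentence: str) -> CrowdSpec:
    """Fill a CrowdSpec from a simple sentence using keyword matching."""
    tokens = re.findall(r"[a-z0-9]+", sentence.lower())
    spec = CrowdSpec()
    for i, tok in enumerate(tokens):
        # The first number found is the crowd size; the next word names the agents.
        if spec.count is None and (tok.isdigit() or tok in NUMBER_WORDS):
            spec.count = int(tok) if tok.isdigit() else NUMBER_WORDS[tok]
            if i + 1 < len(tokens):
                spec.agent = tokens[i + 1]
            continue
        # The first word that starts with a known stem sets the behavior.
        if spec.behavior is None:
            for stem, behavior in BEHAVIOR_STEMS.items():
                if tok.startswith(stem):
                    spec.behavior = behavior
                    break
        # The first non-article word after a preposition is taken as the target.
        if tok in PREPOSITIONS and spec.target is None:
            rest = [t for t in tokens[i + 1:] if t not in ARTICLES]
            if rest:
                spec.target = rest[0]
    return spec


if __name__ == "__main__":
    print(parse_sentence("Twenty people walk to the gate"))
    print(parse_sentence("5 students are queuing at the counter"))
```

A structure of this kind separates language analysis from simulation: later stages such as crowd generation and animation adjustment would only need to read the fields, regardless of how the sentence was analyzed.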