Despite great success in learning representations for image data, learning stochastic latent features from natural language via variational inference remains challenging. The difficulty in stochastic sequential learning stems from posterior collapse: an autoregressive decoder tends to become so strong during optimization that the latent variables carry little information. To compensate for this weakness in the learning procedure, a sophisticated latent structure is required to assure good convergence, so that random features are sufficiently captured for sequential decoding. This study presents a new variational recurrent autoencoder (VRAE) for sequence reconstruction. Two complementary encoders, a long short-term memory (LSTM) network and a pyramid bidirectional LSTM, are merged in a hierarchical latent variable model to capture global and local dependencies, respectively. Experiments on Penn Treebank and Yelp 2013 demonstrate that the proposed hierarchical VRAE learns complementary representations and mitigates posterior collapse in stochastic sequential learning. The performance of the recurrent autoencoder is substantially improved in terms of perplexity.
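To make the two-encoder hierarchy concrete, the following is a minimal PyTorch sketch of the kind of architecture the abstract describes, not the authors' exact model: a plain LSTM encoder produces a global latent, a pyramid bidirectional LSTM (which halves the time resolution by merging adjacent steps) produces local latents conditioned on the global one, and both are fed to an autoregressive LSTM decoder. All class and variable names (`HierVRAE`, `PyramidBiLSTM`, dimensions) are illustrative assumptions.

```python
import torch
import torch.nn as nn

class PyramidBiLSTM(nn.Module):
    """Bidirectional LSTM that halves the time resolution by
    concatenating adjacent step pairs before encoding (local view).
    Illustrative sketch, not the paper's exact architecture."""
    def __init__(self, emb_dim, hid_dim):
        super().__init__()
        self.lstm = nn.LSTM(emb_dim * 2, hid_dim, batch_first=True,
                            bidirectional=True)

    def forward(self, x):                       # x: (B, T, E), T even
        B, T, E = x.shape
        x = x.reshape(B, T // 2, 2 * E)         # merge adjacent frames
        out, _ = self.lstm(x)                   # (B, T/2, 2H)
        return out

class HierVRAE(nn.Module):
    """Hypothetical hierarchical VRAE: an LSTM encoder yields a global
    latent z_g; a pyramid BiLSTM yields per-segment local latents z_l
    conditioned on z_g; an LSTM decoder reconstructs the sequence."""
    def __init__(self, vocab, emb_dim=64, hid=128, z_dim=32):
        super().__init__()
        self.emb = nn.Embedding(vocab, emb_dim)
        self.enc_g = nn.LSTM(emb_dim, hid, batch_first=True)
        self.enc_l = PyramidBiLSTM(emb_dim, hid)
        self.to_g = nn.Linear(hid, 2 * z_dim)              # mu, logvar (global)
        self.to_l = nn.Linear(2 * hid + z_dim, 2 * z_dim)  # local given global
        self.dec = nn.LSTM(emb_dim + 2 * z_dim, hid, batch_first=True)
        self.out = nn.Linear(hid, vocab)

    @staticmethod
    def reparam(stats):
        mu, logvar = stats.chunk(2, dim=-1)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()
        return z, mu, logvar

    def forward(self, tokens):                  # tokens: (B, T), T even
        e = self.emb(tokens)
        _, (h, _) = self.enc_g(e)               # global summary state
        z_g, mu_g, lv_g = self.reparam(self.to_g(h[-1]))       # (B, z)
        loc = self.enc_l(e)                     # (B, T/2, 2H)
        cond = torch.cat(
            [loc, z_g.unsqueeze(1).expand(-1, loc.size(1), -1)], -1)
        z_l, mu_l, lv_l = self.reparam(self.to_l(cond))        # (B, T/2, z)
        # upsample local latents back to T steps and merge with global
        z_l = z_l.repeat_interleave(2, dim=1)                  # (B, T, z)
        z = torch.cat([z_l, z_g.unsqueeze(1).expand_as(z_l)], -1)
        h_dec, _ = self.dec(torch.cat([e, z], -1))
        logits = self.out(h_dec)                # (B, T, vocab)
        kl = lambda m, lv: -0.5 * (1 + lv - m.pow(2) - lv.exp()).sum(-1)
        return logits, kl(mu_g, lv_g).mean() + kl(mu_l, lv_l).sum(1).mean()
```

In this sketch, conditioning the local latents on the global one and feeding the latents at every decoder step are the kinds of structural choices used to keep the latent path informative against a strong autoregressive decoder.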