This paper presents a non-binary LDPC decoder based on stochastic arithmetic. Although the previous stochastic works reduce the complexity of check node by transforming the convolution of the SPA algorithm to the finite field summation, the stochastic decoder still has a implementation bottleneck due to large storage introduced by the variable node process. Considering a balance between algorithm level and implementation level, we propose a shortened TFM architecture as well as its updating criterion. A compare-and-alter counter architecture is also proposed to avoid sorting among counters which decide the decoded codeword. With these features, the proposed (136, 68) fully-parallel stochastic NB-LDPC decoder over GF(32) implemented in UMC 90-nm can achieve 120 Mb/s throughput while operating under 455 MHz with 740 k gate counts which are only 10 % of the original TFM decoder.