We propose Distributed Ensemble Support Vector Machine (DESVM), an ensemble scheme with a parallel computational structure, to overcome the practical difficulties of training large-scale nonlinear Support Vector Machines (SVMs). The dataset is split into many stratified partitions. Because each partition may still be too large for conventional SVM solvers, we apply the reduced kernel trick to generate a nonlinear SVM classifier for each partition; each classifier can be treated as an approximation model based on a partial dataset. We then use a linear SVM classifier to fuse the nonlinear SVM classifiers generated from all data partitions. In this linear SVM training model, each nonlinear SVM classifier is treated as an 'attribute' or an 'expert'. In the ensemble phase, DESVM generates a fusion model that is a weighted combination of the nonlinear SVM classifiers, which can be interpreted as a weighted voting decision made by a group of experts. We test the proposed method on five benchmark datasets. The numerical results show that DESVM is competitive in accuracy and achieves a high speed-up. Thus, DESVM can be a powerful tool for binary classification problems with large-scale datasets that are not linearly separable.
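The pipeline described above (stratified partitioning, one reduced-kernel classifier per partition, linear fusion) can be illustrated with a minimal sketch. It is not the authors' implementation: all function names are hypothetical, the Gaussian kernel and its width `gamma` are assumed, the reduced kernel is approximated by a random subset of each partition used as the kernel basis, and a Pegasos-style subgradient solver stands in for whatever SVM solver the method actually uses. Training the fusion model on the expert outputs over the full training set is likewise an assumption.

```python
import math
import random

def stratified_partitions(labels, k, rng):
    """Split sample indices into k partitions that preserve the class ratio."""
    parts = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            parts[pos % k].append(i)  # deal each class round-robin
    return parts

def rbf(a, b, gamma):
    """Gaussian kernel value between two points (kernel choice is assumed)."""
    return math.exp(-gamma * sum((p - q) ** 2 for p, q in zip(a, b)))

def dot(w, x):
    return sum(a * b for a, b in zip(w, x))

def train_linear(X, y, rng, epochs=20, lam=0.01):
    """Stand-in linear SVM: Pegasos-style subgradient descent on hinge loss.
    The bias is handled as a constant 1.0 feature appended by the caller."""
    w, t = [0.0] * len(X[0]), 0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)
            violated = y[i] * dot(w, X[i]) < 1
            w = [(1 - eta * lam) * wj for wj in w]
            if violated:
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
    return w

def train_expert(X, y, rng, n_basis=10, gamma=1.0):
    """Nonlinear classifier for one partition using a reduced kernel basis."""
    basis = rng.sample(X, min(n_basis, len(X)))  # small random kernel basis
    feats = [[rbf(x, z, gamma) for z in basis] + [1.0] for x in X]
    w = train_linear(feats, y, rng)
    return lambda x: dot(w, [rbf(x, z, gamma) for z in basis] + [1.0])

def train_desvm(X, y, k=4, seed=0):
    """Train one expert per partition, then fuse them with a linear model."""
    rng = random.Random(seed)
    parts = stratified_partitions(y, k, rng)
    experts = [train_expert([X[i] for i in p], [y[i] for i in p], rng)
               for p in parts]
    # Each expert's real-valued output becomes one 'attribute' of the fusion model.
    fused = [[f(x) for f in experts] + [1.0] for x in X]
    w = train_linear(fused, y, rng)
    predict = lambda x: 1 if dot(w, [f(x) for f in experts] + [1.0]) >= 0 else -1
    return parts, w, predict

def make_rings(n, rng):
    """Toy data: two concentric rings, which are not linearly separable."""
    X, y = [], []
    for i in range(n):
        label = 1 if i % 2 == 0 else -1
        r = (0.5 if label == 1 else 1.5) + rng.uniform(-0.1, 0.1)
        a = rng.uniform(0, 2 * math.pi)
        X.append((r * math.cos(a), r * math.sin(a)))
        y.append(label)
    return X, y

X, y = make_rings(200, random.Random(1))
parts, weights, predict = train_desvm(X, y, k=4)
accuracy = sum(predict(x) == t for x, t in zip(X, y)) / len(X)
print(f"partition sizes: {[len(p) for p in parts]}, training accuracy: {accuracy:.2f}")
```

In this sketch the experts are trained sequentially, but each call to `train_expert` depends only on its own partition, so the per-partition training could in principle run in parallel. The first `k` entries of `weights` play the role of the voting weights assigned to the experts, and the last entry is the bias term.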