A central challenge in cloud computing is balancing load across thousands of virtual machines (VMs) in a large datacenter. In this paper, we propose a novel decentralized load balancing architecture called TLDLB (two-level decentralized load balancer). Its decentralized design provides the scalability and high availability needed to serve more cloud users. We also propose a neural network-based dynamic load balancing algorithm, called NN-DWRR (neural network-based dynamic weighted round-robin), which dispatches a large number of requests to the VMs that actually provide the services. NN-DWRR combines monitoring of VM load metrics (CPU, memory, network bandwidth, and disk I/O utilization) with neural network-based load prediction to adjust the weight of each VM. Experimental results indicate that NN-DWRR can be applied to a large cloud datacenter and that, in terms of average response time, it is 1.86 times faster than the WRR, 1.49 times faster than the Capacity-based, and 1.21 times faster than the ANN-based load balancing algorithms. In addition, TLDLB can reduce the SLA (service-level agreement) violation rate by activating VMs from a spare VM pool in time.
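
As a rough illustration of how predicted load could drive the weights in a dynamic weighted round-robin dispatcher, consider the minimal Python sketch below. It is not the implementation evaluated in this paper. The `predict_load` function is a placeholder that averages the latest sample of each metric where the proposed algorithm would use a trained neural network. The names `VM`, `update_weights`, `pick_vm`, and `max_weight`, the linear mapping from predicted load to weight, and the smooth weighted round-robin selection rule are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# The four load metrics named in the abstract (key names are hypothetical).
METRICS = ("cpu", "memory", "network", "disk_io")


@dataclass
class VM:
    name: str
    # Recent utilization samples per metric, each value in [0, 1].
    history: Dict[str, List[float]] = field(default_factory=dict)
    weight: int = 1
    current: int = 0  # running counter used by smooth weighted round-robin


def predict_load(vm: VM) -> float:
    """Placeholder predictor (assumption): mean of the latest sample per metric.

    A trained neural network would replace this function.
    """
    latest = [vm.history[m][-1] for m in METRICS]
    return sum(latest) / len(latest)


def update_weights(vms: List[VM], max_weight: int = 10) -> None:
    """Assign higher dispatch weights to VMs with lower predicted load."""
    for vm in vms:
        free = 1.0 - predict_load(vm)
        vm.weight = max(1, round(free * max_weight))


def pick_vm(vms: List[VM]) -> VM:
    """Smooth weighted round-robin: select VMs in proportion to their weights."""
    total = sum(vm.weight for vm in vms)
    for vm in vms:
        vm.current += vm.weight
    chosen = max(vms, key=lambda vm: vm.current)
    chosen.current -= total
    return chosen


# Usage: a lightly loaded VM and a heavily loaded VM.
vms = [
    VM("vm-a", {m: [0.2] for m in METRICS}),
    VM("vm-b", {m: [0.8] for m in METRICS}),
]
update_weights(vms)  # vm-a gets weight 8, vm-b gets weight 2
counts = {"vm-a": 0, "vm-b": 0}
for _ in range(10):
    counts[pick_vm(vms).name] += 1
print(counts)  # {'vm-a': 8, 'vm-b': 2}
```

In this sketch, calling `update_weights` periodically (for example, after each monitoring interval) would let the dispatch ratio follow the predicted load, which is the general idea behind adjusting weights dynamically instead of fixing them in advance.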