Vacant parking space detection is a challenging vision task because of outdoor lighting variation and perspective distortion. Previous methods relied on camera geometry and the projection matrix to select the image region of each space for status classification. With suitable hand-crafted features, outdoor lighting variation and perspective distortion can be handled well. However, when parking displacement, non-uniform car sizes, and inter-object occlusion are also considered, the problem becomes considerably harder. To overcome these problems, we propose a deep learning framework that infers the parking status, with two contributions. First, we integrate a convolutional spatial transformer network (STN) to crop the local image area adaptively according to car size and parking displacement. Second, to address inter-object occlusion, we group three neighboring spaces as a unit and design a multi-task loss function that jointly considers the status estimation of the target space and its two neighbors. This loss forces the network to learn occlusion patterns while estimating space status. The results show that our system reduces the detection error rate and thereby improves overall accuracy.
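The joint loss over a three-space unit can be illustrated with a minimal sketch. The abstract does not give the exact form of the loss, so the per-space binary cross-entropy, the `neighbor_weight` parameter, and the function names below are assumptions made for illustration only, not the formulation used in this work.

```python
import math


def binary_cross_entropy(p_occupied, label, eps=1e-12):
    """Cross-entropy for one space; label is 1 (occupied) or 0 (vacant)."""
    # Clamp the probability so log() never receives 0.
    p = min(max(p_occupied, eps), 1.0 - eps)
    return -(label * math.log(p) + (1 - label) * math.log(1.0 - p))


def multi_task_loss(probs, labels, neighbor_weight=0.5):
    """Joint loss for a unit ordered (left neighbor, target, right neighbor).

    neighbor_weight is a hypothetical hyperparameter that scales the two
    neighbor terms relative to the target term.
    """
    left, target, right = (
        binary_cross_entropy(p, y) for p, y in zip(probs, labels)
    )
    return target + neighbor_weight * (left + right)


# Predicted occupancy probabilities and ground-truth labels for one unit.
loss = multi_task_loss(probs=(0.9, 0.2, 0.8), labels=(1, 0, 1))
```

Because the neighbor terms share the same input crop as the target term, a network trained on such a loss is penalized when it misreads a neighboring car that partially covers the target space, which is one plausible way the occlusion patterns mentioned above could be learned.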