In the deep-submicron era, wire delay is becoming the bottleneck in the pursuit of high system clock speeds. Several distributed register (DR) architectures have been proposed to cope with this problem by keeping most wires local. In this paper, we propose the distributed register-file microarchitecture with inter-island delay (DRFM-IID). Taking this delay into account makes the synthesis task inherently more complicated than synthesis that ignores inter-island delay, because uncertain interconnect latency is very likely to have a serious impact on overall system performance. We therefore also develop a performance-driven architectural synthesis framework targeting DRFM-IID, which takes the number of inter-island transfers, transfer criticality, and resource utilization into account to achieve better optimization results. Experimental results show that the latency and the number of inter-island transfers can be reduced by 26.9% and 37.5% on average, respectively; the latter is a common indicator of the power consumption of on-chip communication.