In Internet-of-Things (IoT)-driven smart-world systems, real-time crowdsourced databases held by multiple distributed servers can be aggregated to extract dynamic statistics from a larger population, yielding more reliable knowledge for society. In particular, distributed servers in a decentralized network can perform real-time collaborative statistical estimation by disseminating statistics computed from their separate databases. Although no raw data are shared, the real-time statistics can still compromise the privacy of crowdsourcing participants. A traditional differential privacy (DP) mechanism can mitigate this concern by perturbing the statistics at each timestamp and independently for each dimension, but this straightforward approach can incur a large utility loss on real-time, multidimensional crowdsourced data. Moreover, real-time broadcasting imposes significant communication overhead on the whole network. To tackle these issues, we propose a novel privacy-preserving and communication-efficient decentralized statistical estimation algorithm (DPCrowd), which exploits the temporal correlations in real-time crowdsourced data so that servers need only intermittently share DP-protected parameters with their one-hop neighbors. Then, by further considering spatial correlations, we develop an enhanced algorithm, DPCrowd+, to handle multidimensional infinite crowd-data streams. Extensive experiments on several data sets demonstrate that DPCrowd and DPCrowd+ significantly outperform existing schemes in providing accurate and consensus estimation with rigorous privacy protection and high communication efficiency.
- Communication efficiency
- crowdsourced data
- decentralized statistical estimation
- differential privacy (DP)
- real time
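
To illustrate why the traditional per-timestamp, per-dimension DP baseline mentioned in the abstract can lose utility, the following is a minimal sketch. It is not DPCrowd or DPCrowd+. It assumes a Laplace mechanism, a finite window of T timestamps, a per-release sensitivity of 1, and a total budget split evenly across all T x d releases under sequential composition. The names `laplace_noise` and `naive_dp_stream` are hypothetical and introduced only for this illustration.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # is distributed as Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def naive_dp_stream(stats, total_epsilon, sensitivity=1.0, seed=0):
    """Perturb every timestamp and every dimension independently.

    stats: list of T rows, each row a list of d statistics.
    Assumption: the total budget is divided evenly over all T * d
    releases, so each release gets total_epsilon / (T * d).
    Returns the noisy statistics and the Laplace scale that was used.
    """
    rng = random.Random(seed)
    num_steps = len(stats)
    num_dims = len(stats[0])
    eps_each = total_epsilon / (num_steps * num_dims)
    scale = sensitivity / eps_each
    noisy = [[x + laplace_noise(scale, rng) for x in row] for row in stats]
    return noisy, scale


if __name__ == "__main__":
    # 50 timestamps, 2 dimensions, total budget 1.0 (illustrative values).
    stream = [[10.0, 20.0] for _ in range(50)]
    noisy, scale = naive_dp_stream(stream, total_epsilon=1.0)
    print("Laplace scale per release:", scale)
    print("first noisy row:", noisy[0])
```

Under these assumptions the noise scale grows linearly with both the number of timestamps and the number of dimensions, so for the illustrative values above the scale is roughly 100 while the true statistics are 10 and 20. This is the kind of degradation that motivates exploiting temporal and spatial correlations instead of perturbing every release independently.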