Keyword: distributed data : Search

47 Results for: Keyword: distributed data

Searched The ACM Guide to Computing Literature (3,856,353 records)

Showing 1 - 20 of 47 Results

research-article
February 2025
Sequoia: An Accessible and Extensible Framework for Privacy-Preserving Machine Learning over Distributed Data
- Kaiqiang Xu,
- Di Chai,
- Junxue Zhang,
- Fan Lai,
- Kai Chen
Proceedings of the ACM on Management of Data (PACMMOD), Volume 3, Issue 1, Article No.: 74, Pages 1–27, https://doi.org/10.1145/3709742

Privacy-preserving machine learning (PPML) algorithms use secure computation protocols to allow multiple data parties to collaboratively train machine learning (ML) models while maintaining their data confidentiality. However, current PPML frameworks ...
Total Citations: 0; Total Downloads: 47; Last 12 Months: 47; Last 6 weeks: 47
research-article
Free
January 2023
Least squares model averaging for distributed data
The Journal of Machine Learning Research (JMLR), Volume 24, Issue 1, Article No.: 215, Pages 10235–10293

Divide and conquer is a common strategy applied in big data. Model averaging has a natural divide-and-conquer feature, but its theory has not been developed in big data scenarios. The goal of this paper is to fill this gap. We propose two ...
Total Citations: 0; Total Downloads: 56; Last 12 Months: 56; Last 6 weeks: 10
research-article
Free
January 2023
Distributed nonparametric regression imputation for missing response problems with large-scale data
The Journal of Machine Learning Research (JMLR), Volume 24, Issue 1, Article No.: 68, Pages 2961–3012

Nonparametric regression imputation is commonly used in missing data analysis. However, it suffers from the "curse of dimension". The problem can be alleviated by the explosive sample size in the era of big data, while the large-scale data size presents ...
Total Citations: 0; Total Downloads: 80; Last 12 Months: 80; Last 6 weeks: 20
research-article
June 2022
NTP-VFL - A New Scheme for Non-3rd Party Vertical Federated Learning
- Di Zhao,
- Ming Yao,
- Wanwan Wang,
- Hao He,
- Xin Jin
ICMLC '22: Proceedings of the 2022 14th International Conference on Machine Learning and Computing, Pages 134–139, https://doi.org/10.1145/3529836.3529841

Vertical Federated Learning (FL) handles decentralized, vertically partitioned data about common entities. While most existing privacy-preserving federated learning algorithms require a third party (TP) as an intermediary data accessor to coordinate ...
Total Citations: 4; Total Downloads: 86; Last 12 Months: 8; Last 6 weeks: 0
research-article
May 2020
Data Sharing via Differentially Private Coupled Matrix Factorization
- Beyza Ermiş,
- A. Taylan Cemgil
ACM Transactions on Knowledge Discovery from Data (TKDD), Volume 14, Issue 3, Article No.: 28, Pages 1–27, https://doi.org/10.1145/3372408

We address the privacy-preserving data-sharing problem in a distributed multiparty setting. In this setting, each data site owns a distinct part of a dataset and the aim is to estimate the parameters of a statistical model conditioned on the complete ...
Total Citations: 10; Total Downloads: 301; Last 12 Months: 23; Last 6 weeks: 2
research-article
September 2019
Efficient privacy-preserving recommendations based on social graphs
RecSys '19: Proceedings of the 13th ACM Conference on Recommender Systems, Pages 78–86, https://doi.org/10.1145/3298689.3347013

Many recommender systems use association rule mining, a technique that captures relations between user interests and recommends new probable ones accordingly. Applying association rule mining raises privacy concerns, as user interests may contain ...
Total Citations: 9; Total Downloads: 776; Last 12 Months: 30; Last 6 weeks: 1
research-article
Public Access
March 2019
Fast Approximate Score Computation on Large-Scale Distributed Data for Learning Multinomial Bayesian Networks
ACM Transactions on Knowledge Discovery from Data (TKDD), Volume 13, Issue 2, Article No.: 14, Pages 1–40, https://doi.org/10.1145/3301304

In this article, we focus on the problem of learning a Bayesian network over distributed data stored in a commodity cluster. Specifically, we address the challenge of computing the scoring function over distributed data in an efficient and scalable ...
Total Citations: 4; Total Downloads: 774; Last 12 Months: 211; Last 6 weeks: 29
research-article
January 2019
Training Normal Bayes Classifier on Distributed Data
Procedia Computer Science (PROCS), Volume 150, Issue C, Pages 389–396, https://doi.org/10.1016/j.procs.2019.02.068
The paper describes an approach to parallelization of Normal Bayes classifier training algorithm for distributed data. In the process of distributed data analysis and the algorithm performance, the results fail to join properly. Due to this, the ...
Total Citations: 0
research-article
May 2017
Efficient Matrix Sketching over Distributed Data
PODS '17: Proceedings of the 36th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems, Pages 347–359, https://doi.org/10.1145/3034786.3056119

A sketch or synopsis of a large dataset captures vital properties of the original data while typically occupying much less space. In this paper, we consider the problem of computing a sketch of a massive data matrix A ∈ ℝ^{n×d}, which is distributed across ...
Total Citations: 1; Total Downloads: 374; Last 12 Months: 17; Last 6 weeks: 2
research-article
May 2016
Management of distributed big data for social networks
- Carson K. Leung,
- Hao Zhang
CCGRID '16: Proceedings of the 16th IEEE/ACM International Symposium on Cluster, Cloud, and Grid Computing, Pages 639–648, https://doi.org/10.1109/CCGrid.2016.107

In the current era of big data, high volumes of a wide variety of valuable data can be easily collected and generated from a broad range of data sources of different veracities at a high velocity. Due to the well-known 5V's of these big data, many ...
Total Citations: 3; Total Downloads: 56; Last 12 Months: 5; Last 6 weeks: 0
research-article
June 2015
Communication-Efficient Computation on Distributed Noisy Datasets
- Qin Zhang
SPAA '15: Proceedings of the 27th ACM symposium on Parallelism in Algorithms and Architectures, Pages 313–322, https://doi.org/10.1145/2755573.2755575

This paper gives a first attempt to answer the following general question: Given a set of machines connected by a point-to-point communication network, each having a noisy dataset, how can we perform communication-efficient statistical estimations ...
Total Citations: 6; Total Downloads: 181; Last 12 Months: 3; Last 6 weeks: 0
survey
April 2015
Classification Framework of MapReduce Scheduling Algorithms
ACM Computing Surveys (CSUR), Volume 47, Issue 3, Article No.: 49, Pages 1–38, https://doi.org/10.1145/2693315

A MapReduce scheduling algorithm plays a critical role in managing large clusters of hardware nodes and meeting multiple quality requirements by controlling the order and distribution of users, jobs, and tasks execution. A comprehensive and structured ...
Total Citations: 49; Total Downloads: 1,658; Last 12 Months: 17; Last 6 weeks: 0
Supplementary Material: a49-tiwari-apndx.pdf
article
February 2015
Privacy-Preserving Naïve Bayesian Classifier-Based Recommendations on Distributed Data
- Cihan Kaleli,
- Huseyin Polat
Computational Intelligence (COMI), Volume 31, Issue 1, Pages 47–68, https://doi.org/10.1111/coin.12012

Data collected for recommendation purposes might be distributed among various e-commerce sites, which can collaboratively provide more accurate predictions. However, because of privacy concerns, they might not want to work together. If privacy measures ...
Total Citations: 7
Article
October 2014
The Communication Complexity of Distributed epsilon-Approximations
- Zengfeng Huang,
- Ke Yi
FOCS '14: Proceedings of the 2014 IEEE 55th Annual Symposium on Foundations of Computer Science, Pages 591–600, https://doi.org/10.1109/FOCS.2014.69

Data summarization is an effective approach to dealing with the "big data" problem. While data summarization problems traditionally have been studied in the streaming model, the focus is starting to shift to distributed models, as distributed/parallel ...
Total Citations: 1
Article
June 2014
Privacy-Preserving Kriging Interpolation on Distributed Data
- Bulent Tugrul,
- Huseyin Polat
Proceedings of the 14th International Conference on Computational Science and Its Applications (ICCSA 2014), Volume 8584, Pages 695–708, https://doi.org/10.1007/978-3-319-09153-2_52

Kriging is one of the most preferred geostatistical methods in many engineering fields. Basically, it creates a model using statistical properties of all measured points in the region, where a prediction value is sought. The accuracy of the kriging ...
Total Citations: 0
research-article
April 2012
The ERC webdam on foundations of web data management
WWW '12 Companion: Proceedings of the 21st International Conference on World Wide Web, Pages 211–214, https://doi.org/10.1145/2187980.2188011

The Webdam ERC grant is a five-year project that started in December 2008. The goal is to develop a formal model for Web data management that would open new horizons for the development of the Web in a well-principled way, enhancing its functionality, ...
Total Citations: 1; Total Downloads: 99; Last 12 Months: 0; Last 6 weeks: 0
research-article
March 2012
Privacy preserving distributed DBSCAN clustering
EDBT-ICDT '12: Proceedings of the 2012 Joint EDBT/ICDT Workshops, Pages 177–185, https://doi.org/10.1145/2320765.2320819

DBSCAN is a well-known density-based clustering algorithm which offers advantages for finding clusters of arbitrary shapes compared to partitioning and hierarchical clustering methods. However, there are few papers studying the DBSCAN algorithm under ...
Total Citations: 18; Total Downloads: 448; Last 12 Months: 25; Last 6 weeks: 7
Article
December 2011
An approach to access the distributed data based on the multi-agent system for interoperability
FGIT'11: Proceedings of the Third international conference on Future Generation Information Technology, Pages 215–222, https://doi.org/10.1007/978-3-642-27142-7_25

In this paper, we present an approach to access the distributed data for interoperability in the distributed environments based on the multi-agent system that is designed on the proposed structure of multi-agent by FIPA (IEEE Foundation for Intelligent ...
Total Citations: 0
research-article
October 2011
Data mining without data: a novel approach to privacy-preserving collaborative distributed data mining
- Vikas Ashok,
- Ravi Mukkamala
WPES '11: Proceedings of the 10th annual ACM workshop on Privacy in the electronic society, Pages 159–164, https://doi.org/10.1145/2046556.2046578

With the proliferation of organizations that independently collect various types of data, with the growing awareness of corporations and public to keep their sensitive data private, and with the ever-increasing need of government and corporate policy ...
Total Citations: 4; Total Downloads: 466; Last 12 Months: 4; Last 6 weeks: 0
Article
September 2011
Privacy-Preserving Trust-Based Recommendations on Vertically Distributed Data
- Cihan Kaleli,
- Huseyin Polat
ICSC '11: Proceedings of the 2011 IEEE Fifth International Conference on Semantic Computing, Pages 376–379, https://doi.org/10.1109/ICSC.2011.43

Providing recommendations on trusts between entities is receiving increasing attention lately. Customers may prefer different online vendors for shopping. Thus, their preferences about various products might be distributed among multiple parties. To ...
Total Citations: 3