Federated Principal Component Analysis (FPCA) is proposed to compute the principal components of a given data set in a distributed fashion, where there are several workers (distributed devices) and one master (a data center). This framework, called federated learning, allows a user to obtain the global model without the workers ever sharing their local data, which is essential for protecting data privacy.
An optimization algorithm called the alternating direction method of multipliers (ADMM) is used to achieve federated learning. To start with, the objective function of PCA can be formulated as:
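A sketch of this formulation, assuming the projection-approximation objective referenced later in this document, where $X_i$ is the data held by worker $i$, $w_i$ is its local first PC, and $z$ is the shared global model (the notation here is ours):

$$
\min_{\{w_i\},\, z}\ \sum_{i=1}^{N} \left\lVert X_i - X_i w_i w_i^\top \right\rVert_F^2
\qquad \text{s.t.}\quad w_i = z,\quad i = 1, \dots, N.
$$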
By ADMM we can then derive the worker's and master's algorithms, where z denotes the global model and u is the dual variable introduced by ADMM:
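Under standard scaled-form consensus ADMM with penalty parameter $\rho$, the updates would look like the sketch below (the exact subproblem solved by each worker may differ in the original implementation):

$$
\begin{aligned}
w_i^{k+1} &= \arg\min_{w_i}\ \left\lVert X_i - X_i w_i w_i^\top \right\rVert_F^2 + \frac{\rho}{2}\left\lVert w_i - z^k + u_i^k \right\rVert_2^2 && \text{(worker $i$, local)}\\
z^{k+1} &= \frac{1}{N}\sum_{i=1}^{N}\left( w_i^{k+1} + u_i^k \right) && \text{(master, aggregate and broadcast)}\\
u_i^{k+1} &= u_i^k + w_i^{k+1} - z^{k+1} && \text{(worker $i$, dual update)}
\end{aligned}
$$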
We can illustrate the idea of FPCA by the toy example below:
In the toy example, there are 10 workers, each holding its own share of the data. The scattered dots are the global data, which is assumed never visible in full to any single worker. The thick yellow line represents the first global PC, and the other 10 lines are the first PCs computed by each worker. At the beginning (iter = 0), the PC each worker initializes points in a different direction from the others. By iteration 10, however, the workers have already reached a consensus solution without ever seeing the whole data.
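The following MATLAB sketch reproduces the spirit of this toy example under the consensus-ADMM updates above; the penalty `rho`, the step size, the projected-gradient inner solver, and all variable names are our illustrative choices rather than the original code:

```matlab
% Minimal FPCA sketch: 10 workers, 2-D data, first PC via consensus ADMM.
rng(0);
N = 10; d = 2; m = 30;                    % 10 workers, 2-D local data
rho = 10; step = 1e-3; inner = 50;        % illustrative hyperparameters
X = cell(N,1); W = cell(N,1); U = cell(N,1);
for i = 1:N
    X{i} = randn(m,1) * [2 1] + 0.3*randn(m,d);  % correlated local data
    w = randn(d,1); W{i} = w / norm(w);          % random initial local PC
    U{i} = zeros(d,1);                           % scaled dual variable
end
z = randn(d,1); z = z / norm(z);                 % initial global model

for k = 1:10                                     % 10 FPCA rounds
    for i = 1:N                                  % --- worker update ---
        S = X{i}' * X{i};                        % local scatter matrix
        for t = 1:inner                          % projected gradient on
            g = -2*S*W{i} + rho*(W{i} - z + U{i});   % -w'Sw + (rho/2)||w-z+u||^2
            W{i} = W{i} - step*g;
            W{i} = W{i} / norm(W{i});            % keep unit norm
        end
        if W{i}' * z < 0, W{i} = -W{i}; end      % heuristic: fix PC sign
    end
    z = mean([W{:}] + [U{:}], 2);                % --- master: average ---
    z = z / norm(z);                             % broadcast unit-norm model
    for i = 1:N
        U{i} = U{i} + W{i} - z;                  % --- dual update ---
    end
end
```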
- macOS: 11.2.1
- CPU: Intel Core i5 at 2.9GHz
- RAM: 8 GB 2133 MHz LPDDR3
- Matlab: 2019b
- Iris: http://archive.ics.uci.edu/ml/datasets/Iris/
- Breast cancer: https://archive.ics.uci.edu/ml/datasets/Breast+Cancer+Wisconsin+(Diagnostic)
- Bank: https://archive.ics.uci.edu/ml/datasets/Bank+Marketing
- Synthetic: https://archive.ics.uci.edu/ml/datasets/synthetic+control+chart+time+series
We compare our FPCA method against SVD, currently the most common way to compute PCs. The evaluation metric is the cosine similarity between the first PC derived by FPCA and the one given by SVD. The comparison is performed on three real datasets and one synthetic dataset. Each experiment is repeated 10 times; the results are shown in the convergence plots below:
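A hypothetical check of this metric, continuing from the toy-example sketch above (the cell array `X` and consensus vector `z` are taken from that code):

```matlab
% Cosine similarity between the FPCA consensus z and the first
% right singular vector of the pooled, centered data.
Xall = cat(1, X{:});                     % stack all workers' samples
Xc   = Xall - mean(Xall, 1);             % center, as PCA requires
[~, ~, V] = svd(Xc, 'econ');             % SVD of the global data
cosSim = abs(z' * V(:, 1));              % both vectors are unit norm
fprintf('cosine similarity to SVD PC = %.4f\n', cosSim);
```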
In our work, PCA is adapted to a federated learning setting, which not only solves the distributed machine learning problem but also ensures data privacy. Federated PCA is built by formulating the projection approximation method as a consensus optimization problem and solving it with ADMM. In the application scenario, the data is distributed across multiple workers, which compute their local models and submit them to a master; the master integrates the models and broadcasts the result back to the workers. Throughout training the data is always kept private and unshared, which guarantees data privacy. In addition, when a dataset is too large to be stored on a single device, the federated setting allows PCA to be carried out across multiple devices. Through a series of experiments we find that the first PC computed by FPCA is consistent with the one computed by SVD, which validates our method.