DOI: 10.1145/3511808.3557176 (CIKM Conference Proceedings)
short-paper

A System for Time Series Feature Extraction in Federated Learning

Published: 17 October 2022

ABSTRACT

Federated learning (FL), which enables collaborative learning without revealing raw data, is an emerging topic in privacy-preserving machine learning. Based on our experience with thousands of real-world applications, time series feature extraction plays a significant role in improving model quality. In this work, we propose a system that automatically integrates time series feature extraction into the training of FL models. Our experiments show that adopting time series feature extraction improves model accuracy (AUC) by 3% on average and increases recall by 10% in recommender systems. We have open-sourced the project at https://github.com/4paradigm/tsfe and provide a step-by-step demonstration of how users can build their own FL pipeline that extracts time series features. Demonstration video: https://youtu.be/UW27dWT-ays
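To make "time series feature extraction" concrete, the following is a minimal sketch of one common form of it: trailing-window aggregates computed per entity. It is not the TSFE implementation and does not use the OpenMLDB or FATE APIs; the event schema `(user_id, timestamp, amount)`, the function name `window_features`, and the 7-day window are all assumptions made for illustration.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def window_features(events, window=timedelta(days=7)):
    """Aggregate each user's events over a trailing time window.

    events: list of (user_id, timestamp, amount) tuples (hypothetical schema).
    Returns one feature dict per input event, in input order.
    """
    # Group events by user, remembering each event's original position.
    by_user = defaultdict(list)
    for idx, (user, ts, amount) in enumerate(events):
        by_user[user].append((ts, amount, idx))

    features = [None] * len(events)
    for user, rows in by_user.items():
        rows.sort(key=lambda r: r[0])  # chronological order per user
        start = 0
        for end, (ts, _amount, idx) in enumerate(rows):
            # Advance the left edge until it lies inside the trailing window.
            while rows[start][0] < ts - window:
                start += 1
            amounts = [a for _, a, _ in rows[start:end + 1]]
            features[idx] = {
                "user_id": user,
                "count_in_window": len(amounts),
                "sum_in_window": sum(amounts),
                "max_in_window": max(amounts),
            }
    return features


if __name__ == "__main__":
    demo = [
        ("u1", datetime(2022, 1, 1), 10.0),
        ("u2", datetime(2022, 1, 2), 5.0),
        ("u1", datetime(2022, 1, 3), 20.0),
        ("u1", datetime(2022, 1, 20), 7.0),
    ]
    for row in window_features(demo):
        print(row)
```

Each output row depends only on the same user's current and earlier events, so in this sketch the aggregates can be computed by whichever party holds the raw event log, without sharing that log.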

Supplemental Material

CIKM22-demo063.mp4

This demonstration walks through the whole process of extracting time series features and using them for training in federated learning. Based on 4Paradigm's experience with thousands of real-world applications, time series feature extraction plays a significant role in improving model quality. We propose a system, TSFE, that securely integrates time series feature extraction into training for vertical federated learning. The system is built on two industrial solutions: OpenMLDB and FATE. Experiments show that adopting time series feature extraction improves model accuracy (AUC) by 3% on average. The project is open-sourced at https://github.com/4paradigm/tsfe.

4Paradigm was founded in September 2014. We offer platform-centric AI solutions for developing end-to-end, enterprise-class AI products that uncover hidden patterns in data and comprehensively enhance decision-making capabilities. We are the largest player by revenue in the platform-centric decision-making enterprise AI market in China.
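The accuracy figure above is reported as AUC. As a reminder of what that metric measures, here is a minimal sketch that computes ROC AUC directly from its pairwise definition: the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one, with ties counted as one half. The function name `roc_auc` and the sample labels and scores are illustrative assumptions and are not taken from the paper's experiments.

```python
def roc_auc(labels, scores):
    """ROC AUC from its pairwise definition (quadratic time; fine for small inputs).

    labels: iterable of 0/1 ground-truth labels.
    scores: iterable of model scores, higher meaning "more likely positive".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # a tie counts as half a win
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Illustrative values only: three of the four positive/negative pairs are ordered correctly.
    print(roc_auc([1, 0, 1, 0], [0.9, 0.8, 0.4, 0.1]))
```

Because AUC depends only on how scores rank positives against negatives, it is insensitive to the classification threshold, which makes it a convenient way to compare a model trained with extracted features against one trained without them.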

