Cristian Borcea

New Jersey Institute of Technology, USA

Keynote Topic: Federated Learning for Mobile Sensing Data

Abstract: Federated Learning (FL) has emerged as a new distributed machine learning paradigm that enables privacy-aware training and inference on mobile devices with help from the cloud. FL has the potential to enable a wide range of new mobile apps that benefit from running machine learning models on mobile sensing data. The privacy-sensitive raw data is used for local training on the devices, and only the model parameters are transferred to the cloud, where a global model is aggregated and shared with all mobile devices. This keynote talk presents our ongoing work on FL systems and applications. First, we describe FLSys, an end-to-end FL system designed to achieve energy efficiency, tolerance to communication failures, and scalability. In addition, different FL models, accessed concurrently by different apps, are able to work with different FL aggregation methods in the cloud. A common API is provided for third-party app developers to train FL models. FLSys is implemented on Android and the AWS cloud. We demonstrate FLSys in the context of human activity recognition (HAR) in the wild, with data collected from the phones of 100+ students. We propose a novel HAR-Wild model, which is based on a skipped Convolutional Neural Network model with a data augmentation mechanism to mitigate the non-Independent and Identically Distributed (non-IID) data problem that negatively affects FL model training. We conduct extensive experiments on Android phones and Android emulators, showing that FLSys and HAR-Wild achieve good model utility and practical system performance in terms of training time and resource consumption on the phones. Second, we present a system for fine-grained location prediction (FGLP) of mobile users, based on GPS traces collected on the phones. FGLP has two components: an FL framework and a prediction model. The framework runs on the phones of the users and also on a server that coordinates learning from all users in the system. The framework represents the user location data as relative points in an abstract 2D space, which enables learning across different physical spaces. The model merges Bidirectional Long Short-Term Memory (BiLSTM) and Convolutional Neural Networks (CNN), where BiLSTM learns the speed and direction of the mobile users, and CNN learns information such as user movement preferences. Our experimental results, using a dataset with over 600,000 users, demonstrate that FGLP outperforms baseline models in terms of prediction accuracy for pedestrians and bicyclists. In addition, benchmark results on several types of Android phones demonstrate FGLP's feasibility in real life. We conclude this talk with lessons learned from building FL systems and applications, and with challenges that still need to be overcome in order to deploy FL models in real life.
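
The aggregation step described in the abstract (devices train locally, and only model parameters reach the cloud) can be illustrated with a minimal sketch. It assumes the widely used federated averaging rule, in which each client's parameters are weighted by its number of local training samples. The abstract states that FLSys supports different aggregation methods and does not say which one is used, so the function name `fed_avg` and the flat-list parameter format are hypothetical choices made for illustration only.

```python
from typing import List, Sequence


def fed_avg(client_weights: Sequence[Sequence[float]],
            client_sizes: Sequence[int]) -> List[float]:
    """Sample-weighted average of client parameter vectors (FedAvg-style).

    Assumption: every client reports its parameters as a flat list of
    floats of the same length, plus the number of local samples it used.
    """
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one sample count per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total number of samples must be positive")

    dim = len(client_weights[0])
    global_weights = [0.0] * dim
    for weights, n_samples in zip(client_weights, client_sizes):
        if len(weights) != dim:
            raise ValueError("all clients must share the same model shape")
        # Clients with more local data contribute proportionally more.
        for i, value in enumerate(weights):
            global_weights[i] += value * n_samples / total
    return global_weights


# Two hypothetical phones; the second one trained on three times more data.
new_global = fed_avg([[1.0, 2.0], [3.0, 4.0]], [1, 3])
```

In this sketch the raw sensing data never appears on the server side: the only inputs are parameter vectors and sample counts, which mirrors the privacy argument made in the abstract.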



Bio: Cristian Borcea is a Professor of Computer Science and the Associate Dean for Strategic Initiatives in the Ying Wu College of Computing at New Jersey Institute of Technology, USA. He also holds a Visiting Professor appointment at the National Institute of Informatics, Tokyo, Japan. Cristian has over 20 years of experience in the fields of mobile computing & sensing; ad hoc & vehicular networks; and cloud & distributed systems. His current research is at the intersection of mobile computing and machine learning. He has published over 100 papers in top international journals and conferences, and his research has been covered in over 20 media articles in the past few years. Cristian has served as Technical Program Chair or General Chair of conferences such as IEEE MDM, IEEE Mobile Cloud, and EAI Mobiquitous. Cristian received his PhD in Computer Science from Rutgers University, USA. More information: http://cs.njit.edu/~borcea.

Arthur Gervais

Imperial College London, UK

Keynote Topic: High-Frequency Trading on Decentralized On-Chain Exchanges

Abstract: Decentralized exchanges (DEXs) allow parties to participate in financial markets while retaining full custody of their funds. However, the transparency of blockchain-based DEXs, in combination with the latency for transactions to be processed, makes market manipulation feasible. For instance, adversaries could perform front-running, the practice of exploiting (typically non-public) information that may change the price of an asset for financial gain. In this talk we formalize, analytically exposit and empirically evaluate an augmented variant of front-running: sandwich attacks, which involve front- and back-running victim transactions on a blockchain-based DEX. We quantify the probability of an adversarial trader being able to undertake the attack, based on the relative positioning of a transaction within a blockchain block. We find that a single adversarial trader can earn a daily revenue of several thousand USD when performing sandwich attacks on one particular DEX, Uniswap, an exchange with over 5M USD daily trading volume as of June 2020. In addition to a single-adversary game, we simulate the outcome of sandwich attacks under multiple competing adversaries, to account for the real-world trading environment. This talk is based on a paper at the IEEE Symposium on Security & Privacy (S&P) 2021. The preprint is available: https://arxiv.org/pdf/2009.14021.pdf.
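
The mechanics of a sandwich attack can be sketched on a constant-product market maker of the kind Uniswap uses, where the product of the two pool reserves is preserved by each swap and a 0.3% fee is charged on the input. This is a simplified illustration, not the model from the paper: it ignores transaction fees paid to miners, the victim's slippage limit, and competing adversaries. The function names and all reserve and trade sizes below are hypothetical.

```python
def swap(reserve_in: float, reserve_out: float, amount_in: float,
         fee: float = 0.003):
    """Swap on a constant-product pool; returns (amount_out, new_in, new_out)."""
    amount_in_after_fee = amount_in * (1 - fee)
    # Output is chosen so that reserve_in * reserve_out does not decrease.
    amount_out = (reserve_out * amount_in_after_fee
                  / (reserve_in + amount_in_after_fee))
    # The full input (fee included) stays in the pool.
    return amount_out, reserve_in + amount_in, reserve_out - amount_out


def sandwich(reserve_x: float, reserve_y: float, victim_in: float,
             attacker_in: float, fee: float = 0.003):
    """Return (attacker profit in X, amount of Y the victim receives)."""
    # 1. Front-run: the attacker buys Y with X, pushing the price of Y up.
    attacker_y, rx, ry = swap(reserve_x, reserve_y, attacker_in, fee)
    # 2. The victim's pending trade executes at the worse price.
    victim_y, rx, ry = swap(rx, ry, victim_in, fee)
    # 3. Back-run: the attacker sells the Y it bought back into the pool.
    attacker_x, ry, rx = swap(ry, rx, attacker_y, fee)
    return attacker_x - attacker_in, victim_y


# Hypothetical pool with 1000 X and 1000 Y; victim and attacker each trade 100 X.
profit, victim_received = sandwich(1000.0, 1000.0, 100.0, 100.0)
baseline_received, _, _ = swap(1000.0, 1000.0, 100.0)
```

With these illustrative numbers the attacker ends up with more X than it started with, and the victim receives less Y than it would have in an unattacked pool (`baseline_received`). If the victim trade is absent, the attacker only loses the two swap fees.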


Bio: Arthur Gervais is a Lecturer (equivalent to Assistant Professor) at Imperial College London. He is passionate about information security and has worked on blockchain-related topics since 2012, with a recent focus on Decentralized Finance (DeFi).

Haibo Chen

Shanghai Jiao Tong University, China

Keynote Topic: Low-latency Serverless Computing: Characterization, Optimization and Outlook

Abstract: Serverless computing promises cost-efficiency and elasticity for highly productive software development. To achieve this, a serverless computing platform must address two challenges: strong isolation between function instances, and low startup latency to ensure a good user experience. In this talk, I will first present a characterization of state-of-the-art serverless platforms and derive several key metrics, which collectively form a systematic methodology and a benchmark called ServerlessBench. Then, I will show how serverless platforms can be optimized for (sub-)millisecond startup latency for both normal and confidential serverless computing. Finally, I will present an outlook on the challenges and opportunities of serverless computing, including how to make it secure and efficient for joint cloud computing.

Bio: Haibo Chen is a Distinguished Professor at Shanghai Jiao Tong University, where he directs both the Institute for Parallel and Distributed Systems and the Engineering Research Center of Ministry of Education for Domain Operating Systems. Haibo is a recipient of the National Distinguished Young Fund and an ACM Distinguished Scientist. His main research areas are operating systems and distributed systems, and he has won the First Prize of Technical Invention by the Ministry of Education, the China Youth Science and Technology Award, the President's Award of Shanghai Jiao Tong University, the CCF Young Scientist Award, the National Excellent Doctoral Dissertation Award, etc. He is currently the Chair of ACM ChinaSys and the Vice Chair of the Special Committee on System Software of the Chinese Computer Society. He serves as an editorial board member and co-chair of Special Sections of the ACM flagship magazine Communications of the ACM, and as an editorial board member of ACM Transactions on Storage. He was the co-chair of the ACM SOSP 2017 conference, the area chair of ACM CCS 2018 for System Security, and a member of the ACM SIGSAC award committee. According to csrankings.org, he is tied for first in the world in the number of papers published in top conferences (SOSP/OSDI, EuroSys, Usenix ATC and FAST) in the field of operating systems in the last 5 years (2015-2020), and he has received Best Paper Awards from ASPLOS, EuroSys, VEE, ICPP and other reputed academic conferences.

Paolo Tonella

Università della Svizzera italiana, Switzerland

Keynote Topic: DL Testing: the What, the How, and the Why

Abstract: The literature on deep learning (DL) testing has grown exponentially in recent years, due to the widespread adoption of deep neural networks in safety- and business-critical domains, such as autonomous driving, financial trading and medical diagnosis. Many novel techniques have been proposed to assess the robustness of DL-based systems facing execution conditions that are missing or under-represented in the data used to train them. In this talk, I will discuss the "what", "how" and "why" of DL faults: what types of faults affect a DL system, how they can be exposed, and why a given test input can indeed expose them. More specifically, I will describe a fault taxonomy obtained from multiple sources, such as software repository and forum mining, as well as interviews with developers. I will introduce an input generation technique, based on the notion of frontier of behaviors, which can expose DL faults. Finally, I will describe a feature map representation of the input space that provides a human-interpretable characterization of failure-inducing inputs.

Bio: Paolo Tonella is a Full Professor at the Faculty of Informatics and at the Software Institute of Università della Svizzera italiana (USI) in Lugano, Switzerland. He is an Honorary Professor at University College London, UK, and an Affiliated Fellow of Fondazione Bruno Kessler, Trento, Italy, where he was Head of Software Engineering until mid 2018. Paolo Tonella holds an ERC Advanced grant as Principal Investigator of the project PRECRIME. He has written over 150 peer-reviewed conference papers and over 50 journal papers. His H-index (according to Google Scholar) is 59. He is or has been on the editorial boards of the ACM Transactions on Software Engineering and Methodology, the IEEE Transactions on Software Engineering, Empirical Software Engineering (Springer), and the Journal of Software: Evolution and Process (Wiley). His current research interests are in software testing, in particular approaches to ensure the dependability of machine learning based systems, automated testing of cyber physical systems, and test oracle inference and improvement.

Taghi Khoshgoftaar

Florida Atlantic University, USA

Keynote Topic: Medicare Fraud Detection and Big Data Challenges

Abstract: The U.S. Medicare program provides affordable health insurance to the elderly population and to individuals with select disabilities. Unfortunately, there is a significant amount of fraud, waste, and abuse within the Medicare system that costs taxpayers billions of dollars and puts beneficiaries' health and welfare at risk. In this presentation, we demonstrate how machine learning techniques can be used to automate fraud detection using historical Medicare claims data that is publicly available through the Centers for Medicare and Medicaid Services. We cover various challenges and solutions related to working with big data, severe class imbalance, and high-dimensional categorical variables, and we provide methods to interpret results through feature importance and ranking. The techniques presented in this talk have enabled us to maximize our Medicare fraud detection results. Most importantly, many of the techniques presented in this study can be extended to similar machine learning applications that are characterized by big data and severe class imbalance.

Bio: Dr. Taghi M. Khoshgoftaar is the Motorola Endowed Chair Professor in the Department of Computer and Electrical Engineering and Computer Science at Florida Atlantic University and the Director of the NSF Big Data Training and Research Laboratory. His research interests are in big data analytics, data mining and machine learning, health informatics and bioinformatics, social network mining, fraud detection, and software engineering. He has published more than 800 refereed journal and conference papers in these areas. He was the conference chair of the IEEE International Conference on Machine Learning and Applications (ICMLA 2016 and ICMLA 2019). He is the Co-Editor-in-Chief of the Journal of Big Data. He has served on organizing and technical program committees of various international conferences, symposia, and workshops. He has also served as North American Editor of the Software Quality Journal and was on the editorial boards of the journals Multimedia Tools and Applications, Knowledge and Information Systems, and Empirical Software Engineering. He is on the editorial boards of the journals Software Quality, Software Engineering and Knowledge Engineering, and Social Network Analysis and Mining. For selected publications, see his Google Scholar profile: https://scholar.google.com/citations?user=-PgNSCAAAAAJ&hl=en.