Recent years have witnessed a surge of interest in applying machine learning to the design of networked systems. I will discuss my recent efforts along these lines in the context of two fundamental challenges: routing and congestion control. Specifically, I will present a general framework for online-learning-based Internet congestion control, and also a novel, deep-learning-based approach to optimizing traffic flow between data centers. Time permitting, I will also discuss recent results on how the safe deployment of such learning-augmented systems can be accommodated.
Michael Schapira is a professor of Computer Science at Hebrew University. His research focuses on the design and analysis of (Inter)network architectures and protocols and, in particular, on the interface of networking and machine learning. Prior to joining Hebrew U, he was a visiting scientist at Google NYC’s Infrastructure Networking Group and a postdoctoral researcher at UC Berkeley, Yale University, and Princeton University. He is a recipient of the Wolf Foundation’s Krill Prize, faculty awards from Microsoft, Google, and Facebook, IETF/IRTF Applied Networking Research Prizes, and the IEEE Communications Society William R. Bennett Prize.
More and more data-intensive applications, e.g., micro-service architectures and machine learning workloads, move from on-premise deployments to the cloud. Traditional cloud security mechanisms focus on strict isolation, but these applications require the efficient yet secure sharing of data between components and services. In this talk, I will explore how we can use a new hardware security feature, memory capabilities, to design a cloud stack that bridges the tension between isolation and sharing. Memory capabilities constrain memory accesses, and they can be used to provide a VM-like isolation mechanism, cVMs, that can share data more efficiently than containers. Memory capabilities can also increase memory efficiency in the cloud by safely de-duplicating application components. I will discuss our experience in building a cloud stack using memory capabilities on the CHERI architecture, as implemented by Arm’s Morello hardware.
Peter Pietzuch is a Professor of Distributed Systems at Imperial College London, where he leads the Large-scale Data & Systems (LSDS) group (https://lsds.doc.ic.ac.uk). His research work focuses on the design and engineering of scalable, reliable and secure data-intensive software systems, with a particular interest in performance, data management and security issues. In addition, he is the Director of Research in the Department of Computing and a Co-Director for Imperial’s I-X initiative on AI, data and digital (https://ix.imperial.ac.uk). Recently, he has served as the Chair of the ACM SIGOPS European Chapter (EuroSys) and the Programme Committee Chair for ICDCS 2018. Before joining Imperial College London, he was a post-doctoral Fellow at Harvard University. He holds PhD and MA degrees from the University of Cambridge.
Deterministic databases provide strong serializability while avoiding concurrency-control related aborts by establishing a serial ordering of transactions before their execution. Other benefits include good distributed transaction scaling, and simpler replication and failure recovery since transactions can be replayed deterministically. However, prior deterministic databases scaled poorly under skewed and contended transactional workloads, and the capacity of main memory typically limited their size.
In this talk, I will discuss our work on Caracal, a novel deterministic database that performs well under both skew and contention. Like prior designs, we batch transactions into epochs and execute the transactions in an epoch in a predetermined order. However, we eschew partitioning, commonly used to avoid contention, in favour of a shared-memory design that handles skewed workloads more effectively. We manage contention both during the concurrency-control initialization phase that establishes the execution order for a batch of transactions and during the execution phase, with two novel optimizations that are enabled by the batched deterministic execution model. With these optimizations, Caracal scales well and outperforms existing deterministic schemes in most workloads by 1.9x to 9.7x.
We further show how to integrate non-volatile main memory (NVMM) into Caracal to support larger data sets at a lower cost per gigabyte, and to support faster failure recovery. We describe a novel dual-version checkpointing scheme that takes advantage of deterministic execution, epoch-based processing, and NVMM’s byte addressability to avoid persisting all updates to NVMM.
Angela Demke Brown is a Full Professor in the Department of Computer Science at the University of Toronto. She received her MSc from the University of Toronto and her PhD from Carnegie Mellon University, where her thesis work on compiler-based memory management for out-of-core applications received the Carnegie Mellon Doctoral Dissertation Award and was nominated for the ACM Doctoral Dissertation Award. Her research interests span operating systems, file systems, databases, and programming language runtimes with a focus on the performance and reliability of memory and storage management. Her work has received Best Paper awards at USENIX OSDI, USENIX FAST, and IEEE IPDPS. Dr. Brown was an IBM Faculty Fellow and Visiting Scientist from 2004 to 2007; in 2005 her research group received the IBM Centre for Advanced Studies “Team of the Year” award. She has previously held a NetApp Faculty Fellowship and was a Visiting Researcher at Microsoft Research UK in 2015-16. She has served as Program co-Chair for OSDI, FAST and HotStorage, as well as serving on the USENIX Board of Directors and as an Associate Editor for ACM Transactions on Computer Systems. She is a member of ACM and USENIX.
Day 1: Monday, June 5
09:00 Welcome and registration
09:40 Opening session
10:00 Keynote #1: Towards Learning-Powered Networked Systems, Michael Schapira, Hebrew University of Jerusalem
11:00 Break
11:20 Session A: File Systems
Chair: Gala Yadgar (Technion)
- DPFS: DPU-Powered File System Virtualization
Peter-Jan Gootzen, Jonas Pfefferle and Radu Stoica, IBM Research Zurich; Animesh Trivedi, VU Amsterdam
- F3: Serving Files Efficiently in Serverless Computing (Best Paper)
Alex Merenstein, Stony Brook University; Vasily Tarasov, IBM Research; Ali Anwar, University of Minnesota; Scott Guthridge, IBM Research; Erez Zadok, Stony Brook University
- Mimir: Finding Cost-efficient Storage Configurations in the Public Cloud
Hojin Park, Greg R. Ganger and George Amvrosiadis, Carnegie Mellon University
12:20 Lunch
13:40 Session B: KV stores
Chair: Sam H. Noh (Virginia Tech)
- TurboHash: A Hash Table for Key-value Store on Persistent Memory
Xingsheng Zhao, Chen Zhong and Song Jiang, University of Texas at Arlington
- Exploiting Hybrid Index Scheme for RDMA-based Key-Value Stores
Shukai Han, Mi Zhang, Dejun Jiang and Jin Xiong, Institute of Computing Technology, Chinese Academy of Sciences
- Iterator Interface Extended LSM-tree-based KVSSD for Range Queries
Seungjin Lee, Chang-Gyu Lee and Donghyun Min, Sogang University; Inhyuk Park and Woosuk Chung, SK hynix; Anand Sivasubramaniam, Pennsylvania State University; Youngjae Kim, Sogang University
14:40 till ~ 21:30 Social Event – Caesarea tour and dinner
(15:00-16:00, 16:00-17:00 – Computer Science Escape Room alternative)
Day 2: Tuesday, June 6
08:45 Welcome and registration
09:15 Keynote #2: Improving Cloud Security with Hardware Memory Capabilities, Peter Pietzuch (Imperial College London)
10:15 Break
10:35 Session C: SSDs
Chair: Aviad Zuck (Technion)
- ConfZNS: A Novel Emulator for Exploring Design Space of ZNS SSDs
Inho Song and Myounghoon Oh, Dankook University; Bryan S. Kim, Syracuse University; Seehwan Yoo, Jaedong Lee and Jongmoo Choi, Dankook University
- Elastic RAID: Implementing RAID over SSDs with Built-in Transparent Compression
Zheng Gu, Jiangpeng Li, Yong Peng, Yang Liu and Tong Zhang, ScaleFlux
- BOOSTER: Rethinking the erase operation of low-latency SSDs to achieve high throughput and less long latency
Takumi Fujimori and Shuou Nomura, Kioxia Corporation
11:35 Break
11:50 Session D: Misc
Chair: Orna Agmon Ben-Yehuda (CRI, Haifa University)
- Optimizing Memory Allocation for Multi-Subgraph Mapping on Spatial Accelerators
Lei Lei, Decai Pan and Li Lin, Chongqing University; Peng Ouyang, TsingMicro Co. Ltd.; Xuelaing Du, Beijing Baidu Netcom Science Technology Co., Ltd.; Dajiang Liu, Chongqing University
- Anomaly Detection on IBM Z Mainframes: Performance Analysis and More
Erik Altman and Benjamin Segal, IBM
- Predicting GPU Failures With High Precision Under Deep Learning Workloads
Heting Liu and Zhichao Li, ByteDance Inc.; Cheng Tan, Northeastern University; Rongqiu Yang, ByteDance Inc.; Guohong Cao, Pennsylvania State University; Zherui Liu and Chuanxiong Guo, ByteDance Inc.
12:50 Lunch
14:00 Session E: Highlights
Chair: Shir Landau-Feibish (Open University Israel)
- Starlight: Fast Container Provisioning on the Edge and over the WAN (NSDI 2022)
Jun Lin Chen and Daniyal Liaqat, University of Toronto; Moshe Gabel, York University / University of Toronto; Eyal de Lara, University of Toronto
- SwiSh: Distributed Shared State Abstractions for Programmable Switches (NSDI 2022)
Lior Zeno, Technion; Dan R. K. Ports, Jacob Nelson, and Daehyeok Kim, Microsoft Research; Shir Landau Feibish, The Open University of Israel; Idit Keidar, Arik Rinberg, Alon Rashelbach, Igor De-Paula, and Mark Silberstein, Technion
- Privbox: Faster System Calls Through Sandboxed Privileged Execution (USENIX ATC 2022)
Dima Kuznetsov and Adam Morrison, Tel Aviv University
- Scaling Open vSwitch with a Computational Cache (NSDI 2022)
Alon Rashelbach, Ori Rottenstreich and Mark Silberstein, Technion
15:20 Break
15:40 Keynote #3: Herding wild cats: Performance and persistence in the Caracal deterministic database, Angela Demke Brown (University of Toronto)
16:40 Closing remarks
16:55 – 19:00 Poster session and Reception (CS Taub building – hall)
Day 3: Wednesday, June 7
10:30 Systor SC + chairs meeting
11:30 Joint Systor and Technion CE-club seminar (CS Taub building – 401):
DyTIS: A Dynamic Dataset Targeted Index Structure Simultaneously Efficient for Search, Insert, and Scan, Sam Noh (Virginia Tech)