Estimators reduce the memory footprint of maintaining network
statistics, while keeping the estimation error of each flow proportional to its size. This is unlike sketches and other approximate
algorithms that only guarantee an error proportional to the entire stream size. In this work we present the CELL algorithm that
combines estimators with efficient flow representation to obtain
superior memory reduction compared to the state of the art.
We also extend CELL to the sliding window model, which prioritizes recent data over older data, by presenting two variants named RAND-CELL and SHIFT-CELL.
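To make the estimator concept concrete, below is a minimal C sketch of a Morris-style probabilistic counter, a classic building block for per-flow estimators; it is illustrative only and is not the CELL algorithm itself. Only a small exponent is stored per flow, and the estimation error scales with the flow's own count rather than with the entire stream.

/* Minimal sketch (not CELL): a Morris-style probabilistic counter.
 * The stored exponent needs only O(log log n) bits, and the estimate's
 * error is proportional to the flow's own count. */
#include <stdio.h>
#include <stdlib.h>

struct estimator { unsigned c; };        /* one small exponent per flow */

/* Increment with probability 2^-c, so c grows only logarithmically. */
static void est_increment(struct estimator *e)
{
    double p = 1.0 / (double)(1ULL << e->c);
    if ((double)rand() / RAND_MAX < p)
        e->c++;
}

/* Unbiased estimate of the number of increments observed so far. */
static double est_query(const struct estimator *e)
{
    return (double)(1ULL << e->c) - 1.0;
}

int main(void)
{
    struct estimator e = { 0 };
    for (long i = 0; i < 1000000; i++)
        est_increment(&e);
    printf("true: 1000000, estimate: %.0f\n", est_query(&e));
    return 0;
}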
The management of large data centre (DC) network infrastructure confronts Network Reliability Engineers (NREs) with challenges. A single DC at a modern cloud services provider can host thousands of network devices. The syslog messages generated by these devices are an important type of monitoring data for detecting and diagnosing failures. Devices in a single DC produce millions of syslog messages per day in a variety of formats.
We present an alternative approach developed over the
last few years with the NREs working on IBM Cloud’s networks. DeCorus-NSA assists NREs in three ways. First, it
detects incidents without the need to specify rules manually. Second, it groups large numbers of individual alerts
into a smaller number of higher-level incidents. And finally,
DeCorus-NSA supports NREs with root cause analysis by
extracting additional context.
Cloud data lakes are a modern approach for storing large
amounts of data in a convenient and inexpensive way. The
main idea is the separation of compute and storage layers.
However, to perform analytics on the data in this architecture, the data must be moved from the storage layer to the compute layer over the network for each calculation. This hurts calculation performance and requires substantial network bandwidth. We are exploring different approaches for adding indexing to cloud data lakes, with the goal of reducing the amount of data read from storage and, as a result, improving query execution time.
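As a rough illustration of the idea (a hypothetical min/max data-skipping index, not necessarily the indexing scheme we are building), the following C sketch records per-object value ranges so the compute layer can skip objects that cannot match a query predicate and thus avoid transferring them over the network.

/* Hypothetical data-skipping sketch: one min/max entry per data object. */
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

struct object_index {
    const char *object_name;   /* e.g., a Parquet object key in the lake */
    long        min_value;     /* column minimum recorded at write time  */
    long        max_value;     /* column maximum recorded at write time  */
};

/* An object may contain matching rows only if target falls in its range. */
static bool may_contain(const struct object_index *idx, long target)
{
    return target >= idx->min_value && target <= idx->max_value;
}

int main(void)
{
    struct object_index index[] = {
        { "part-0000.parquet",   1,  99 },
        { "part-0001.parquet", 100, 199 },
        { "part-0002.parquet", 200, 299 },
    };
    long target = 150;                        /* predicate: value == 150 */

    for (size_t i = 0; i < sizeof index / sizeof index[0]; i++) {
        if (may_contain(&index[i], target))
            printf("read %s\n", index[i].object_name);  /* fetch over the network */
        else
            printf("skip %s\n", index[i].object_name);  /* no transfer needed     */
    }
    return 0;
}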
HPC Application Optimisation in SODALITE
Kalman Meth (IBM Research - Haifa); Alfio Lazzaro and Nina Mujkanovic (HPE HPC/AI EMEA Research Lab); Maria Carbonell (ATOS); Dragan Radolovic, Daniel Vladusic, and Joao Pita Costa (XLAB); Elisabetta Di Nitto (Politecnico di Milano)
We propose to tackle the complexity of deploying and operating modern applications on heterogeneous HPC and cloud-based systems by providing application developers and infrastructure operators with tools to abstract their application and infrastructure requirements.
We propose enabling continuous performance optimisation
of distributed hybrid applications in heterogeneous cloud,
Edge, and HPC environments by employing an intelligent
re-deployment feedback loop.
Persistent Memory (PM) is a new class of device that provides faster access than conventional storage devices such as SSDs. Among the several methods available for accessing files on PM, a combination of filesystem direct access (DAX) and mmap() is used to take full advantage of its native capabilities: it avoids the buffer cache and allows PM to be accessed with byte granularity.
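As a minimal C sketch of this access path (the mount point /mnt/pmem is a hypothetical example, and error handling is kept short), the program below maps a file on a DAX-mounted filesystem and stores into it directly with byte granularity; a production program would typically flush with libpmem (CLWB plus a fence) rather than msync().

/* Minimal DAX + mmap() sketch; compile with: cc -D_GNU_SOURCE dax_mmap.c */
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#ifndef MAP_SHARED_VALIDATE               /* newer kernels/glibc expose these */
#define MAP_SHARED_VALIDATE 0x03
#endif
#ifndef MAP_SYNC
#define MAP_SYNC 0x80000
#endif

int main(void)
{
    const size_t len = 4096;
    int fd = open("/mnt/pmem/example", O_CREAT | O_RDWR, 0644); /* hypothetical DAX mount */
    if (fd < 0 || ftruncate(fd, len) < 0) {
        perror("open/ftruncate");
        return 1;
    }

    /* MAP_SYNC guarantees a true DAX mapping: stores bypass the buffer
     * cache and go to the persistent media. Fall back to MAP_SHARED if
     * the filesystem does not support it. */
    char *pmem = mmap(NULL, len, PROT_READ | PROT_WRITE,
                      MAP_SHARED_VALIDATE | MAP_SYNC, fd, 0);
    if (pmem == MAP_FAILED)
        pmem = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (pmem == MAP_FAILED) {
        perror("mmap");
        return 1;
    }

    strcpy(pmem, "hello, PM");            /* byte-granular store, no buffer cache */
    msync(pmem, len, MS_SYNC);            /* portable flush for the sketch */

    munmap(pmem, len);
    close(fd);
    return 0;
}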
Manycore systems enable massively parallel I/O in a single server thanks to their large number of cores. Among file I/O operations in a file system, C. Lee et al. [1] applied a range lock in F2FS for parallel data I/O and showed scalable performance. However, little research has been done on metadata I/O scalability.
To investigate this, we analyzed unlink() and its related data structures in F2FS. File metadata in F2FS (the inode) is called a Node and is identified by a nid. Nodes are stored in an on-disk structure called the Node Address Table (NAT), which is cached in memory together with a pool of free nids. F2FS keeps a certain number of free nids for fast nid allocation during create(). Every time unlink() is called, the number of free nids in the pool is checked; if it is insufficient, a Free nid Scan is performed to secure sufficient free nids.
We evaluated the I/O performance when multiple threads call unlink() in F2FS on a manycore system, and it shows no performance scalability. From this analysis, we identified that a large critical section (CS) in Free nid Scan, protected by a mutex lock, is the leading cause of the scalability bottleneck.
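The simplified C sketch below (hypothetical code, not the actual F2FS source) illustrates the structure of this bottleneck: every unlink() checks the free-nid pool under a single mutex, and when the pool runs low the entire Free nid Scan executes inside that critical section, serializing all threads.

/* Hypothetical, simplified view of the free-nid handling described above. */
#include <pthread.h>

#define FREE_NID_LOW_WATERMARK 128       /* illustrative threshold */

struct free_nid_pool {
    pthread_mutex_t lock;                /* one lock for the whole pool */
    unsigned int    nr_free;             /* cached count of free nids   */
};

static struct free_nid_pool pool = {
    .lock = PTHREAD_MUTEX_INITIALIZER,
    .nr_free = 0,
};

/* Refill the in-memory pool by scanning the on-disk NAT.
 * In the real file system this walks NAT blocks; here it is a stub. */
static void free_nid_scan(struct free_nid_pool *p)
{
    /* ... long NAT traversal, all inside the caller's critical section ... */
    p->nr_free = 2 * FREE_NID_LOW_WATERMARK;
}

/* Called on every unlink(): check the pool and replenish it if low.
 * The whole scan runs under one mutex -- the large critical section
 * identified as the scalability bottleneck. */
void unlink_check_free_nids(void)
{
    pthread_mutex_lock(&pool.lock);
    if (pool.nr_free < FREE_NID_LOW_WATERMARK)
        free_nid_scan(&pool);
    pthread_mutex_unlock(&pool.lock);
}

int main(void)
{
    unlink_check_free_nids();            /* in practice, called by many threads */
    return 0;
}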
Container frameworks have been gaining popularity in recent years, with container native storage being one of the fastest growing segments. According to an IDC report [1], 90% of applications on cloud platforms and over 95% of new microservices are being deployed in containers. The growth of container native storage is largely driven by stateful applications [2, 3], the mainstay of enterprise IT environments. As organizations increasingly adopt containerized deployments, they must also address data protection to maintain business continuity.
Ransomware is software that uses encryption to disable access to data until a ransom is paid, and such attacks have increased steeply in recent years. The best current practices to minimize the impact of ransomware attacks include periodic backups and air-gapped immutable copies. However, undetected attacks can corrupt data before it is backed up, rendering the backups unusable. Detecting ransomware attacks quickly and flagging the damaged content enables fast recovery and business continuity. We present some features of our ransomware attack detection algorithms, prototyped and run in a sandboxed but realistic environment, which successfully detected live ransomware attacks obtained from open-source repositories.
The ability of medical professionals to efficiently process vast amounts of data is critical. We present a solution that offers cloud-secure analytics on healthcare data, utilizing FHIR, the latest standard from the HL7 organization for the exchange of healthcare data, together with Apache Parquet Modular Encryption.