|
Introduction
In-network computing offloads server-based applications to programmable devices. However, network devices are very resource-constrained compared to servers. This project aims to scale in-network computing algorithms further via distributed deployment. This project can also be used as an extension of the in-network ML project.
|
|
Phase I: Distributed Deployment
|
|
DINC: Toward Distributed In-Network Computing
Changgang Zheng, Haoyue Tang, Mingyuan Zang, Xinpeng Hong, Aosong Feng, Leandros Tassiulas, and Noa Zilberman
ACM CoNEXT'23 & Proceedings of the ACM on Networking, 2023
[Acceptance Rate: 24/129=18.6%]
Paper |
BibTex |
Code
Research has focused on enabling on-device functionality, with limited consideration to distributed in-network computing. This paper explores the applicability of distributed computing to in-network computing and presents DINC, a framework enabling distributed in-network computing, generating deployment strategies, overcoming resource constraints and providing functionality guarantees across a network.
|
|
Presented "MUTA: Enabling Multi-Task Neural Network Inference in Programmable Data-Planes" on the 22nd of May, 2025.
Kaiyi Zhang, Changgang Zheng, Nancy Samaan, Ahmed Karmouch, and Noa Zilberman
IEEE HPSR 2025 [CCF B] [Best Paper Award]
Paper |
BibTex
We introduce MUTA; a novel in-network multi-task learning solution. MUTA enables executing multiple inference tasks concurrently in the data-plane, without exhausting available resources.
|
|
Design, Implementation, and Deployment of Multi-Task Neural Networks in Programmable Data-Planes
Kaiyi Zhang, Changgang Zheng, Nancy Samaan, Ahmed Karmouch, and Noa Zilberman
IEEE Transactions on Network and Service Management (TNSM)
Link |
BibTex
We introduce MUTA, a novel in-network multi-task learning framework that enables concurrent inference of multiple tasks in the data-plane, without exhausting available resources. MUTA enhances scalability by supporting distributed deployment, where different layers of a multi-task model can be offloaded across multiple switches.
|
|