Layer size distribution

Memory-aware and context-aware multi-DNN inference on the edge

Masa is a memory-aware multi-DNN scheduling framework for edge devices that ensures low response times without modifying models. It leverages inter/intra-network dependencies and context to cut latency by up to 90% on low-memory devices.

Pervasive and Mobile Computing · July 2022 · Bart Cox, Robert Birke, Lydia Y Chen
Architecture of Masa

Masa: Responsive multi-DNN inference on the edge

Masa is a responsive, memory-aware multi-DNN execution framework: an on-device middleware that models inter- and intra-network dependencies and leverages the complementary memory usage of each layer.

IEEE PerCom · April 2021 · Bart Cox