Memory-Aware Scheduling

Memory-aware and context-aware multi-DNN inference on the edge

Masa is a memory-aware multi-DNN scheduling framework for edge devices that ensures low response times without modifying models. It leverages inter/intra-network dependencies and context to cut latency by up to 90% on low-memory devices.

Code Paper