Abstract: LLM serving is shifting from cloud to edge due to privacy concerns over user interaction data. However, mobile devices struggle with very limited computing power and memory, requiring ...