Facial expression recognition (FER) is a crucial task for human emotion analysis and has attracted wide interest in the fields of computer vision and affective computing. Conventional convolution-based FER methods rely on the powerful pattern-abstraction capability of deep models, but they cannot exploit the semantic information behind significant facial regions identified in physiological anatomy and cognitive neurology. In this work, we propose a novel approach to expression feature learning called the Semantic Graph-based Dual-Stream Network (SG-DSN), which constructs a graph representation to model key appearance and geometric facial changes as well as their semantic relationships. A dual-stream network (DSN) with stacked graph convolutional attention blocks (GCABs) is introduced to automatically learn discriminative features from this graph representation and to predict expressions. Experiments on three lab-controlled datasets and two in-the-wild datasets demonstrate that the proposed SG-DSN achieves competitive performance compared with several state-of-the-art methods.
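The abstract does not specify the internals of a GCAB, so the following is only a minimal illustrative sketch of the general idea it names: graph convolution with attention-weighted neighbor aggregation over node features (e.g. features attached to facial landmarks). All names (`gcab_layer`, the GAT-style scoring with an attention vector `a`) are assumptions for illustration, not the authors' implementation.

```python
import math

def gcab_layer(features, adj, w, a):
    """Illustrative graph convolutional attention layer (not the paper's GCAB).

    features: list of node feature vectors, each of length d_in
    adj:      adjacency list; adj[u] = neighbors of node u (include u for a self-loop)
    w:        d_in x d_out projection matrix
    a:        attention vector of length 2 * d_out (GAT-style scoring, assumed here)
    """
    # linear projection of every node's feature vector
    h = [[sum(f[i] * w[i][j] for i in range(len(f))) for j in range(len(w[0]))]
         for f in features]
    out = []
    for u in range(len(features)):
        nbrs = adj[u]
        # unnormalized attention score per neighbor from concatenated features
        scores = []
        for v in nbrs:
            z = h[u] + h[v]  # list concatenation = feature concatenation
            e = sum(a[k] * z[k] for k in range(len(z)))
            scores.append(math.exp(max(min(e, 50.0), -50.0)))  # clamp for stability
        s = sum(scores)
        alpha = [sc / s for sc in scores]  # softmax-normalized weights
        # attention-weighted aggregation of projected neighbor features
        out.append([sum(alpha[i] * h[v][j] for i, v in enumerate(nbrs))
                    for j in range(len(h[0]))])
    return out
```

With a zero attention vector the weights reduce to a uniform average over each node's neighborhood, which makes the layer easy to sanity-check by hand; in the paper's setting such layers would be stacked inside each stream of the dual-stream network.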