The ability to recognize facial expressions will be an important characteristic of next-generation human-computer interfaces. Towards this goal, we propose a novel RBM-based model that effectively learns the relationships (or transformations) between image pairs associated with different facial expressions. The proposed model disentangles these transformations (e.g. pose variations and facial expressions) by encoding them into two different hidden sets, namely facial-expression morphlets and non-facial-expression morphlets. The first hidden set encodes facial-expression morphlets through a factored four-way sub-model conditioned on label units. The second hidden set encodes non-facial-expression morphlets through a factored three-way sub-model. With this strategy, the proposed model can learn transformations between image pairs while disentangling facial-expression transformations from non-facial-expression transformations. This is achieved using an algorithm dubbed Quadripartite Contrastive Divergence. Reported experiments demonstrate the superior performance of the proposed model compared to state-of-the-art methods. © 2015 Elsevier Ltd. All rights reserved.
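
To make the two sub-models described above more concrete, the following is a minimal sketch of how hidden-unit activations could be computed in a factored multi-way model. It assumes the standard factored gated-RBM form, in which each input group is projected onto a shared set of factors and the projections are multiplied factor by factor; it is not the authors' exact formulation. All names (`project`, `hidden_probs`, `w_x`, `w_y`, `w_l`), the layer sizes, and the sharing of input projections between the two sub-models are hypothetical choices made for illustration, and the training procedure is not shown.

```python
import math
import random


def project(weights, vec):
    """Project a vector onto factors: sum_i weights[i][f] * vec[i] for each factor f."""
    n_factors = len(weights[0])
    return [sum(weights[i][f] * vec[i] for i in range(len(vec)))
            for f in range(n_factors)]


def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))


def hidden_probs(factor_inputs, w_hid, bias):
    """P(h_k = 1 | inputs) for a factored multi-way model (assumed form).

    factor_inputs: one list of per-factor projections per input group
    (two groups give a three-way model, three groups a four-way model).
    w_hid[k][f]: hidden-to-factor weights; bias[k]: hidden biases.
    """
    n_factors = len(factor_inputs[0])
    # Multiply the projections of all input groups, factor by factor.
    prod = [math.prod(group[f] for group in factor_inputs)
            for f in range(n_factors)]
    return [sigmoid(bias[k] + sum(w_hid[k][f] * prod[f] for f in range(n_factors)))
            for k in range(len(w_hid))]


def random_matrix(rows, cols, rng, scale=0.1):
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


# Hypothetical toy sizes: 6 pixels per image, 3 expression labels,
# 4 factors, 2 hidden units per hidden set.
rng = random.Random(0)
n_pix, n_lab, n_fac, n_hid = 6, 3, 4, 2
x = [rng.random() for _ in range(n_pix)]   # first image of the pair
y = [rng.random() for _ in range(n_pix)]   # second image of the pair
label = [0.0, 1.0, 0.0]                    # one-hot expression label

w_x = random_matrix(n_pix, n_fac, rng)
w_y = random_matrix(n_pix, n_fac, rng)
w_l = random_matrix(n_lab, n_fac, rng)
fx, fy, fl = project(w_x, x), project(w_y, y), project(w_l, label)

# Three-way sub-model: hidden units gated by the image pair only.
h_non_expr = hidden_probs([fx, fy], random_matrix(n_hid, n_fac, rng), [0.0] * n_hid)
# Four-way sub-model: hidden units additionally conditioned on the label.
h_expr = hidden_probs([fx, fy, fl], random_matrix(n_hid, n_fac, rng), [0.0] * n_hid)
```

In this sketch the only structural difference between the two hidden sets is the extra label projection in the product, which is one plausible reading of "conditioned on label units".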