First impressions of personality traits can be inferred from non-verbal behaviours such as head pose, body posture, and hand gestures. Enabling social robots to infer the apparent personalities of their users from such non-verbal cues would allow robots to adapt to their users, a further step towards the personalisation of human–robot interactions. Deep learning architectures such as residual networks, 3D convolutional networks, and long short-term memory networks have been applied to classify human activities and actions in computer vision tasks. These same architectures are beginning to be applied to the study of human emotions and personality, focusing mainly on facial features in video recordings. In this work, we exploit body language cues to predict apparent personality traits for human–robot interactions. We customised four state-of-the-art neural network architectures for the task and benchmarked them on a dataset of short side-view videos of dyadic interactions. Our results show the potential of deep learning architectures to predict apparent personality traits from body language cues. Although performance varied between models and personality traits, our results show that these models could still predict individual personality traits, as exemplified by the results on the conscientiousness trait.