This thesis presents resources that enhance solutions to several Natural Language Processing (NLP) tasks, demonstrates that deep models learn abstractions, through cross-lingual transferability, and shows how deep learning models trained on idioms can enhance open-domain conversational systems. The challenges of open-domain conversational systems are many and include bland, repetitive utterances, lack of utterance diversity, lack of training data for low-resource languages, shallow world knowledge and non-empathetic responses, among others. These challenges contribute to the non-human-like utterances from which open-domain conversational systems suffer. They have hence motivated active research in Natural Language Understanding (NLU) and Natural Language Generation (NLG), given the very important role conversations (or dialogues) play in human lives.

The methodology employed in this thesis involves an iterative set of scientific methods. First, a systematic literature review identifies the state-of-the-art (SoTA) and the gaps in current research, such as the challenges mentioned earlier. Subsequently, the work follows the seven stages of the Machine Learning (ML) life-cycle: data gathering (or acquisition), data preparation, model selection, training, evaluation with hyperparameter tuning, prediction and model deployment. For data acquisition, relevant datasets are acquired or created, using benchmark datasets as references, and their data statements are included. Specific contributions of this thesis are the creation of the Swedish analogy test set for evaluating word embeddings and the Potential Idiomatic Expression (PIE)-English idioms corpus for training models in idiom identification and classification. To create a benchmark, this thesis performs human evaluation on the generated predictions of some SoTA ML models, including DialoGPT.
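The analogy evaluation mentioned above can be illustrated with a minimal sketch: an embedding is queried with "a is to b as c is to ?", and the answer is the vocabulary word closest (by cosine similarity) to the vector b - a + c. The toy two-dimensional vectors below are purely illustrative and are not drawn from the Swedish analogy test set.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def solve_analogy(emb, a, b, c):
    """Return the word d maximising cos(emb[d], b - a + c),
    excluding the three query words themselves."""
    target = [vb - va + vc for va, vb, vc in zip(emb[a], emb[b], emb[c])]
    candidates = (w for w in emb if w not in {a, b, c})
    return max(candidates, key=lambda w: cosine(emb[w], target))

# Hypothetical embedding space for the classic example
# "man : king :: woman : queen"
emb = {
    "man":   [1.0, 0.0],
    "woman": [1.0, 1.0],
    "king":  [2.0, 0.1],
    "queen": [2.0, 1.1],
}
```

A test set of such quadruples scores an embedding by the fraction of analogies it solves correctly; for example, `solve_analogy(emb, "man", "king", "woman")` returns `"queen"` here.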
As different individuals may not agree on all the predictions, the Inter-Annotator Agreement (IAA) is measured. A typical method for measuring IAA is Fleiss' kappa; however, it has a number of shortcomings, including high sensitivity to the number of categories being evaluated. Therefore, this thesis introduces the credibility unanimous score (CUS), which is more intuitive, easier to calculate and seemingly less sensitive to changes in the number of categories being evaluated. The results of human evaluation and the comments from evaluators provide valuable feedback on the existing challenges within the models, creating the opportunity to address such challenges in future work.

The experiments in this thesis test two hypotheses: 1) an open-domain conversational system that is idiom-aware generates more fitting responses to prompts containing idioms, and 2) deep monolingual models learn some abstractions that generalise across languages. To investigate the first hypothesis, this thesis trains English models on the PIE-English idioms corpus for classification and generation. For the second hypothesis, it explores cross-lingual transferability from English models to Swedish, Yorùbá, Swahili, Wolof, Hausa, Nigerian Pidgin English and Kinyarwanda. From the results, the thesis’ additional contributions mainly lie in 1) confirmation of the hypothesis that an idiom-aware open-domain conversational system generates more fitting responses to prompts containing idioms, 2) confirmation of the hypothesis that deep monolingual models learn some abstractions that generalise across languages, 3) the introduction of CUS and its benefits, 4) insight into the energy-saving and time-saving benefits of more optimal embeddings from relatively smaller corpora, and 5) public access to the model checkpoints developed in this work. We further discuss the ethical issues involved in developing robust, open-domain conversational systems.
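The baseline IAA measure discussed above, Fleiss' kappa, can be sketched from its standard definition: observed agreement averaged over items, corrected for the agreement expected by chance from the marginal category proportions. The rating matrix below is illustrative only and does not come from the thesis' evaluations; note how the chance-correction term depends on the category marginals, which is one source of the sensitivity to the number of categories mentioned above.

```python
def fleiss_kappa(ratings):
    """Fleiss' kappa for a rating matrix.

    ratings[i][j] = number of annotators who assigned item i to
    category j; every item is assumed to be rated by the same
    number of annotators.
    """
    N = len(ratings)          # number of items
    n = sum(ratings[0])       # annotators per item
    k = len(ratings[0])       # number of categories

    # Mean per-item agreement P_bar
    P_bar = sum(
        (sum(c * c for c in row) - n) / (n * (n - 1)) for row in ratings
    ) / N

    # Chance agreement P_e from the marginal category proportions
    p_j = [sum(row[j] for row in ratings) / (N * n) for j in range(k)]
    P_e = sum(p * p for p in p_j)

    return (P_bar - P_e) / (1 - P_e)

# Illustrative data: 3 items, 3 annotators, 2 categories
example = [[3, 0], [0, 3], [2, 1]]
```

For the example matrix, `fleiss_kappa(example)` gives 0.55; unanimous ratings on every item yield a kappa of 1.0.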
Parts of this thesis have already been published as peer-reviewed journal and conference articles.