Georgios Spithourakis

Projects & News

Numbers in Language Modelling

Text often contains numbers to convey specific information in various domains, from everyday life ("John is 1.75 meters tall") to scientific and clinical documents ("severe dilation of the left ventricle with EDV=355ml"). There is a relation between the words and numbers we use, as seen in the figure above. In this example, we have extracted pairs of numbers (clinical measurements) and words (descriptions of the severity of a clinical condition: "non", "mild", "severe") from clinical reports and estimated the distribution of words given numbers and that of numbers given words.
In language modelling, numbers are often treated as out-of-vocabulary words (or masked under an "UNKNOWN NUMBER" category), so their informational content is lost. The goal of this project is to investigate and evaluate extensions to language models that allow them to incorporate numeric information. We find that extending the input of language models to include the magnitude of numeric tokens can lead to improvements in perplexity and in the downstream tasks of semantic error correction (Spithourakis et al., 2016a) and text prediction (Spithourakis et al., 2016b).
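As a rough illustration of what "extending the input to include the magnitude of numeric tokens" could look like, the sketch below appends each token's numeric value to its embedding vector. It is a minimal example under stated assumptions, not the implementation from the cited papers: the names `numeric_value`, `ground_inputs`, `toy_embed`, and the tiny embedding table are all hypothetical.

```python
import re

# Matches plain integers and decimals such as "355" or "1.75" (an assumption;
# real clinical text would need a richer number pattern).
NUMBER = re.compile(r"^[+-]?\d+(\.\d+)?$")


def numeric_value(token):
    """Return the token's magnitude if it parses as a number, else 0.0."""
    return float(token) if NUMBER.match(token) else 0.0


def ground_inputs(tokens, embed):
    """Append each token's numeric value to its embedding vector.

    `embed` is a hypothetical lookup mapping a token to a list of floats.
    """
    return [embed(tok) + [numeric_value(tok)] for tok in tokens]


# Toy embedding table (illustrative only): all numbers share one <num> vector,
# so without the extra dimension their magnitude would be lost.
TABLE = {"<num>": [0.0, 1.0], "<unk>": [0.0, 0.0], "EDV": [1.0, 0.0]}


def toy_embed(token):
    if NUMBER.match(token):
        return list(TABLE["<num>"])
    return list(TABLE.get(token, TABLE["<unk>"]))


inputs = ground_inputs(["EDV", "=", "355", "ml"], toy_embed)
```

Here the vector for "355" carries the shared number embedding plus the value 355.0 in the last dimension, while non-numeric tokens get 0.0 there. In practice the value would likely need scaling before being fed to a neural model.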

Human Code
Computer Tongue

Together with poet Zena Edwards we have sought to explore the spectrum between artificial and human creativity. We have already organised two masterclasses, where participants have created poems through traditional poetry writing exercises (e.g. freeflow, ekphrasis) and through an AI-inspired interactive simulation, where participants pretended to be neurons in a poetry-generating artificial neural network. The events have also included invited talks and performances by improv theatre human/AI duet Piotr Mirowski and A.L.E.X., musician Xana, tech poet Dan Simpson, and academic and language expert Mandana Seyfeddinipur.
This project has been supported by Apples and Snakes (a big thanks to Daniela Paolucci!) and UCL's public engagement "Train and Engage" programme. More information can be found on Zena's Tumblr, blog, and website.

Publications

Conference Proceedings

G. Spithourakis, S. Riedel. Numeracy for Language Models: Evaluating their Ability to Predict Numbers. ACL 2018. [paper]
@article{mostafazadeh2017image, title={Image-grounded conversations: Multimodal context for natural question and response generation}, author={Mostafazadeh, Nasrin and Brockett, Chris and Dolan, Bill and Galley, Michel and Gao, Jianfeng and Spithourakis, Georgios P and Vanderwende, Lucy}, journal={arXiv preprint arXiv:1701.08251}, year={2017} }

N. Mostafazadeh, C. Brockett, B. Dolan, M. Galley, J. Gao, G. Spithourakis, L. Vanderwende. Image-Grounded Conversations: Multimodal Context for Natural Question and Response Generation. IJCNLP 2017. [paper] [data]
@article{spithourakis2016numerically, title={Numerically Grounded Language Models for Semantic Error Correction}, author={Spithourakis, Georgios P and Augenstein, Isabelle and Riedel, Sebastian}, journal={arXiv preprint arXiv:1608.04147}, year={2016} }

G. Spithourakis, I. Augenstein, S. Riedel. Numerically Grounded Language Models for Semantic Error Correction. EMNLP, 2016. [paper]
@article{li2016persona, title={A persona-based neural conversation model}, author={Li, Jiwei and Galley, Michel and Brockett, Chris and Spithourakis, Georgios P and Gao, Jianfeng and Dolan, Bill}, journal={arXiv preprint arXiv:1603.06155}, year={2016} }

J. Li, M. Galley, C. Brockett, G. Spithourakis, J. Gao, B. Dolan. A Persona-Based Neural Conversation Model. ACL, 2016. [paper]

Workshop Proceedings

@article{spithourakis2016clinical, title={Clinical Text Prediction with Numerically Grounded Conditional Language Models}, author={Spithourakis, Georgios P and Petersen, Steffen E and Riedel, Sebastian}, journal={EMNLP 2016}, pages={6}, year={2016} }

G. Spithourakis, S. E. Petersen, S. Riedel. Clinical Text Prediction with Numerically Grounded Conditional Language Models. EMNLP workshop, 2016. [paper]

G. Spithourakis, S. E. Petersen, and S. Riedel. Harnessing the predictive power of clinical narrative to resolve inconsistencies and omissions in EHRs. Short paper and poster presentation in 2nd Workshop on Machine Learning for Clinical Data Analysis, Healthcare and Genomics, NIPS 2014.

Journals

@article{spithourakis2015amplifying, title={Amplifying the learning effects via a forecasting and foresight support system}, author={Spithourakis, Georgios P and Petropoulos, Fotios and Nikolopoulos, Konstantinos and Assimakopoulos, Vassilios}, journal={International Journal of Forecasting}, volume={31}, number={1}, pages={20--32}, year={2015}, publisher={Elsevier} }

G. Spithourakis, F. Petropoulos, K. Nikolopoulos and V. Assimakopoulos. Amplifying the learning effects via a forecasting and foresight support system. International Journal of Forecasting, 31(1):20-32, 2015. [paper]
@article{spithourakis2014systemic, title={A systemic view of the ADIDA framework}, author={Spithourakis, Georgios P and Petropoulos, Fotios and Nikolopoulos, Konstantinos and Assimakopoulos, Vassilios}, journal={IMA Journal of Management Mathematics}, volume={25}, number={2}, pages={125--137}, year={2014}, publisher={OUP} }

G. Spithourakis, F. Petropoulos, K. Nikolopoulos and V. Assimakopoulos. A systemic view of the ADIDA framework. IMA Journal of Management Mathematics, 25(2): 125-137, 2014. [paper]
@article{petropoulos2013empirical, title={Empirical heuristics for improving intermittent demand forecasting}, author={Petropoulos, Fotios and Nikolopoulos, Konstantinos and Spithourakis, Georgios P and Assimakopoulos, Vassilios}, journal={Industrial Management \& Data Systems}, volume={113}, number={5}, pages={683--696}, year={2013}, publisher={Emerald Group Publishing Limited} }

F. Petropoulos, K. Nikolopoulos, G. Spithourakis and V. Assimakopoulos. Empirical heuristics for improving intermittent demand forecasting. Industrial Management & Data Systems, 113(5):683-696, 2013. [paper]
@article{spithourakis2011improving, title={Improving the performance of popular supply chain forecasting techniques}, author={Spithourakis, Georgios P and Petropoulos, Fotios and Babai, M Zied and Nikolopoulos, Konstantinos and Assimakopoulos, Vassilios}, journal={Supply Chain Forum: an international journal}, volume={12}, number={4}, pages={16--25}, year={2011}, publisher={Taylor \& Francis} }

G. Spithourakis, F. Petropoulos, M.Z. Babai, K. Nikolopoulos and V. Assimakopoulos. Improving the performance of popular supply chain forecasting techniques: an empirical investigation. Supply Chain Forum: an International Journal, 12(4):16-25, 2011. [paper]

Arxiv

@article{riedel2017simple, title={A simple but tough-to-beat baseline for the Fake News Challenge stance detection task}, author={Riedel, Benjamin and Augenstein, Isabelle and Spithourakis, Georgios P and Riedel, Sebastian}, journal={arXiv preprint arXiv:1707.03264}, year={2017} }

B. Riedel, I. Augenstein, G. Spithourakis, S. Riedel. A simple but tough-to-beat baseline for the Fake News Challenge stance detection task. arXiv 2017. [paper]
