In our Machine Learning research projects, we are interested in using the tools of Natural Language Processing (NLP). In the example shown here, we analyze Twitter content related to City Logistics in order to create a map of concepts.
The use of unsupervised learning algorithms for dimensionality reduction (truncated singular value decomposition), clustering (k-means) and manifold learning (t-distributed Stochastic Neighbor Embedding) allowed us to automatically analyze more than 110 000 tweets and generate the following “map of concepts”:
Click here for a full screen visualisation.
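The three unsupervised steps named above can be sketched with scikit-learn as follows. This is only a minimal illustration under stated assumptions, not the code used in the study: the TF-IDF vectorization, the toy tweets, the function name map_of_concepts and every parameter value (number of components, number of clusters, perplexity) are hypothetical choices made for the example.

```python
from sklearn.cluster import KMeans
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.manifold import TSNE


def map_of_concepts(texts, n_components=5, n_clusters=3, perplexity=5.0, seed=0):
    """Return (cluster label per text, 2-D coordinates per text)."""
    # Assumption: texts are turned into sparse TF-IDF vectors
    # (the source does not say which vectorizer was used).
    tfidf = TfidfVectorizer(stop_words="english").fit_transform(texts)
    # Dimensionality reduction: truncated SVD works directly on sparse input.
    reduced = TruncatedSVD(n_components=n_components, random_state=seed).fit_transform(tfidf)
    # Clustering: k-means assigns each text to one of n_clusters groups.
    labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit_predict(reduced)
    # Manifold learning: t-SNE projects the reduced vectors to 2-D for plotting.
    # The perplexity must be smaller than the number of texts.
    tsne = TSNE(n_components=2, perplexity=perplexity, init="random", random_state=seed)
    coords = tsne.fit_transform(reduced)
    return labels, coords


# Hypothetical toy data standing in for the collected tweets.
tweets = [
    "urban freight delivery trucks block the bike lane again",
    "cargo bikes are a clean option for last mile delivery",
    "new low emission zone will restrict diesel delivery vans",
    "parcel lockers reduce failed home deliveries in the city",
    "night time deliveries could cut congestion downtown",
    "electric vans join the urban freight fleet this year",
    "loading bays are always occupied by parked cars",
    "city council debates a congestion charge for trucks",
    "drones for parcel delivery still face strict regulation",
    "e-commerce growth puts pressure on city logistics",
    "consolidation centres help retailers share freight trips",
    "noise from early morning deliveries annoys residents",
]

labels, coords = map_of_concepts(tweets)
for text, label, (x, y) in zip(tweets, labels, coords):
    print(f"cluster {label}  ({x:8.2f}, {y:8.2f})  {text}")
```

Plotting the coordinates with one colour per cluster label gives a small-scale analogue of a map of concepts. On a real corpus the number of components and clusters would need to be tuned, and t-SNE results vary with the perplexity and the random seed.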
This work was submitted to the 11th International Conference on City Logistics. Download presentation, Go to publication.
For more information, you can contact Simon Tamayo.
Abstract
City Logistics is characterized by multiple stakeholders that often have different objectives and constraints and, therefore, different views of this complex system. Nowadays, social media is one of the biggest channels of public expression, and it is often used to communicate opinions and content related to City Logistics. The main idea of this research is that analyzing content from mainstream social media, such as Twitter, could help in understanding how people see urban logistics. This paper proposes a methodology for collecting content related to City Logistics from Twitter and applying Machine Learning techniques, more specifically unsupervised learning and Natural Language Processing (NLP), to perform content and sentiment analysis. The proposed methodology is applied to a set of 110 000 tweets containing City Logistics key terms that were posted from 2007 to 2018. The results made it possible to build an interest map of concepts related to City Logistics and a sentiment assessment to determine whether City Logistics posts are positive, negative or neutral.
Keywords: City Logistics, Machine Learning, natural language processing, Twitter.
Acknowledgment
This research is inspired by the works of Olson and Neal [1] and Kruchten [2], who proposed an analysis of the preferences of users posting on the website reddit.
[1] R. S. Olson and Z. P. Neal, “Navigating the massive world of reddit: using backbone networks to map user interests in social media,” PeerJ Comput. Sci., vol. 1, p. e4, May 2015.
[2] N. Kruchten, “Data Science and (Unsupervised) Machine Learning with scikit-learn,” in Montreal Python, 2014.