MACHINE LEARNING APPLIED TO THE ANALYSIS OF URBAN LOGISTICS TWEETS

In our research projects on Machine Learning, we are interested in the use of Natural Language Processing (NLP) tools. In the example shown here, we analyze Twitter content related to urban logistics to build a concept map.

Using unsupervised learning algorithms such as dimensionality reduction (truncated singular value decomposition), clustering (k-means) and manifold learning (t-distributed Stochastic Neighbor Embedding), we automatically analyzed more than 110,000 tweets to generate the map below:

Click here for a full-screen visualization.
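
The three steps named above (truncated SVD, k-means, t-SNE) can be chained roughly as follows. This is only a minimal sketch, assuming scikit-learn and a TF-IDF representation of the text; the toy corpus, the function name `map_concepts` and all parameter values are hypothetical and are not taken from the study, which may also have embedded terms rather than whole tweets.

```python
from sklearn.cluster import KMeans
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.manifold import TSNE

# Toy stand-in corpus (hypothetical tweets, NOT the study's data).
tweets = [
    "delivery trucks double parked in the bike lane again",
    "cargo bikes are great for last mile delivery downtown",
    "new low emission zone will restrict freight trucks",
    "parcel lockers reduce failed home delivery attempts",
    "electric vans for urban freight delivery are coming",
    "night delivery pilot cuts congestion in the city center",
    "loading zones are always full so trucks block traffic",
    "drone delivery trial for parcels starts next month",
    "urban consolidation center opens to reduce truck trips",
    "e-commerce growth means more delivery vans on our streets",
    "cargo bikes replace vans for parcels in the old town",
    "freight traffic and congestion pricing debate continues",
]


def map_concepts(texts, n_components=5, n_clusters=3, perplexity=3.0, seed=0):
    """Return (cluster labels, 2-D coordinates) for a list of texts."""
    # Assumption: bag-of-words TF-IDF features; the source does not say
    # how the tweets were vectorized.
    tfidf = TfidfVectorizer(stop_words="english").fit_transform(texts)

    # Dimensionality reduction on the sparse matrix with truncated SVD.
    reduced = TruncatedSVD(
        n_components=n_components, random_state=seed
    ).fit_transform(tfidf)

    # Group the reduced vectors with k-means.
    labels = KMeans(
        n_clusters=n_clusters, n_init=10, random_state=seed
    ).fit_predict(reduced)

    # Project to 2-D with t-SNE; perplexity must stay below the sample count.
    coords = TSNE(
        n_components=2, perplexity=perplexity, init="random", random_state=seed
    ).fit_transform(reduced)
    return labels, coords


labels, coords = map_concepts(tweets)
for text, label, (x, y) in zip(tweets, labels, coords):
    print(f"cluster {label}  ({x:8.2f}, {y:8.2f})  {text}")
```

Each tweet ends up with a cluster label and a 2-D position, which is the raw material for a scatter-plot style map. On a corpus of around 110,000 tweets the number of SVD components, the number of clusters and the perplexity would all need to be much larger and tuned by inspection.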

This work has been submitted for presentation at City Logistics (11th International Conference on City Logistics).

For more information, you can contact Simon Tamayo.

Summary of the work

City Logistics is characterized by multiple stakeholders that often have different objectives and constraints and, therefore, different views of this complex system. Nowadays, social media is one of the biggest channels of public expression, and it is often used to communicate opinions and content related to City Logistics. The main idea of this research is that analyzing content from mainstream social media, such as Twitter, could help in understanding how people see urban logistics. This paper proposes a methodology for collecting content related to City Logistics from Twitter and applying Machine Learning techniques, more specifically unsupervised learning and Natural Language Processing (NLP), to perform content and sentiment analysis. The proposed methodology is applied to a set of 110,000 tweets containing City Logistics key terms that were posted from 2007 to 2018. The results allowed building an interest map of concepts related to City Logistics and a sentiment assessment to determine whether City Logistics posts are positive, negative or neutral.

Keywords: City Logistics, Machine Learning, natural language processing, Twitter.

Acknowledgment

This research is inspired by the work of Olson and Neal [1] and Kruchten [2], who proposed an analysis of user preferences posted on the website reddit.

[1]  R. S. Olson and Z. P. Neal, “Navigating the massive world of reddit: using backbone networks to map user interests in social media,” PeerJ Comput. Sci., vol. 1, p. e4, May 2015.

[2]  N. Kruchten, “Data Science and (Unsupervised) Machine Learning with scikit-learn,” in Montreal Python, 2014.