CROSS-LANGUAGE TEXT CLASSIFICATION WITH CONVOLUTIONAL NEURAL NETWORKS FROM SCRATCH

Musbah Zaid Enweiji, Taras Lehinevych, Andrey Glybovets

Abstract


Cross-language classification is an important task in multilingual learning, where documents in different languages often share the same set of categories. The main goal is to reduce the labeling cost of training a classification model for each individual language. This article proposes a novel approach that uses convolutional neural networks for multilingual text classification. The model learns a representation of the knowledge gained from the training languages. Moreover, the method also works for a new language that was not used in training. The results of an empirical study on a large dataset of 21 languages demonstrate the robustness and competitiveness of the presented approach.


Keywords


text classification; convolutional neural network; cross-language text classification; multilingual classification; transfer learning; inductive transfer; supervised learning; artificial neural network





DOI: http://dx.doi.org/10.21303/2461-4262.2017.00304





Copyright (c) 2017 Musbah Zaid Enweiji, Taras Lehinevych, Andrey Glybovets

This work is licensed under a Creative Commons Attribution 4.0 International License.

ISSN 2461-4262 (Online), ISSN 2461-4254 (Print)