Authorship Identification using Generalized Features and Analysis of Computational Method

Authors

  • Smita M. Nirkhi G. H. Raisoni College of Engineering, Nagpur, India
  • R. V. Dharaskar Disha Technical Campus, Raipur, India
  • V. M. Thakare Department of Computer Science, University Campus, Amravati, India

DOI:

https://doi.org/10.14738/tmlai.32.1064

Keywords:

Author identification, support vector machine, feature extraction, classification

Abstract

Authorship identification is used in forensic analysis and the humanities to identify the author of anonymous text used for communication. It can be achieved by selecting textual features that characterize writing style. Because textual features are the key elements of authorship identification, it is important to analyze them and identify the most promising ones. This paper identifies and analyzes promising generalized features and computational methods for authorship identification. Experiments on the authorship identification task show that a support vector machine classifier, used as the computational method, achieves better results with the identified generalized feature set.
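
As a rough illustration of what extracting textual features of writing style can look like, the sketch below turns a text into a small numeric feature vector. The feature choices, the function-word list, and the name `stylometric_features` are assumptions made for this example only; they are not the generalized feature set evaluated in the paper, which is not enumerated in the abstract.

```python
import re
import string

# Hypothetical function-word list, chosen only for illustration.
FUNCTION_WORDS = ("the", "of", "and", "to", "in", "a", "that", "is")


def stylometric_features(text):
    """Return a small, illustrative dictionary of writing-style features."""
    # Lowercased word tokens (letters and apostrophes only).
    words = re.findall(r"[A-Za-z']+", text.lower())
    # Naive sentence split on terminal punctuation; empty pieces are dropped.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]

    # Guard against division by zero on empty input.
    n_words = max(len(words), 1)
    n_chars = max(len(text), 1)

    features = {
        "avg_word_length": sum(len(w) for w in words) / n_words,
        "avg_sentence_length": len(words) / max(len(sentences), 1),
        "vocabulary_richness": len(set(words)) / n_words,
        "punctuation_ratio": sum(c in string.punctuation for c in text) / n_chars,
    }
    # Relative frequency of each function word.
    for fw in FUNCTION_WORDS:
        features["freq_" + fw] = words.count(fw) / n_words
    return features
```

Feature vectors of this kind, computed for texts of known authorship, could then serve as training input for a classifier such as a support vector machine, with the author as the class label.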

Published

2015-05-02

How to Cite

Nirkhi, S. M., Dharaskar, R. V., & Thakare, V. M. (2015). Authorship Identification using Generalized Features and Analysis of Computational Method. Transactions on Engineering and Computing Sciences, 3(2), 41. https://doi.org/10.14738/tmlai.32.1064