Authorship Identification using Generalized Features and Analysis of Computational Method

Smita M Nirkhi, R. V. Dharaskar, V. M. Thakare

Abstract


Authorship identification is used in forensic analysis and the humanities to identify the author of anonymous text used for communication. It relies on selecting textual features that capture an author's writing style. Because these features are central to authorship identification, it is important to analyze them and identify the most promising ones. This paper identifies and analyzes promising generalized features and computational methods for authorship identification. Experiments on the authorship identification task show that a support vector machine classifier, used as the computational method, achieves better results with the identified generalized feature set.
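To make the idea of textual features concrete, the following is a minimal sketch of how a few common stylometric measurements might be extracted from a text. It is an illustration only: the abstract does not list the paper's generalized feature set, so the function name `stylometric_features`, the chosen features, and the `FUNCTION_WORDS` list are all assumptions.

```python
import re
import string

# Hypothetical function-word list, chosen for illustration only;
# the paper's actual feature set is not given in the abstract.
FUNCTION_WORDS = ("the", "of", "and", "to", "in", "that", "is", "it")


def stylometric_features(text):
    """Return a small dict of writing-style features for one text."""
    # Lowercased word tokens (letters and apostrophes only).
    words = re.findall(r"[a-z']+", text.lower())
    # Rough sentence split on terminal punctuation, ignoring empty pieces.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]

    # Guard against division by zero on empty input.
    n_words = max(len(words), 1)
    n_chars = max(len(text), 1)

    features = {
        "avg_word_length": sum(len(w) for w in words) / n_words,
        "avg_sentence_length": len(words) / max(len(sentences), 1),
        "vocabulary_richness": len(set(words)) / n_words,
        "punctuation_ratio": sum(c in string.punctuation for c in text) / n_chars,
    }
    # Relative frequency of each function word.
    for fw in FUNCTION_WORDS:
        features["fw_" + fw] = words.count(fw) / n_words
    return features
```

Feature dictionaries of this kind, computed for texts of known authorship, could then be converted to numeric vectors and passed to a classifier such as a support vector machine.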


Keywords


Author identification, support vector machine, feature extraction, classification






DOI: http://dx.doi.org/10.14738/tmlai.32.1064





______________________________________________________________________________

Transactions on Machine Learning and Artificial Intelligence; ISSN (online) 2054-7309

Copyright Society for Science and Education, United Kingdom