General Information
    • ISSN: 1793-8244
    • Frequency: Semiyearly
    • DOI: 10.18178/JACN
    • Editor-in-Chief: Dr. Ka Wai Gary Wong
    • Executive Editor: Ms. Nina Lee
    • Abstracting/Indexing: EI (INSPEC, IET), Electronic Journals Library, Ulrich's Periodicals Directory, EBSCO, ProQuest, and Google Scholar.
    • E-mail: jacn@ejournal.net
Editor-in-chief
Dr. Ka Wai Gary Wong
Division of Information and Technology Studies, Faculty of Education, The University of Hong Kong.
It's an honor to serve as the editor-in-chief of JACN. I'll work together with the editors and reviewers to help the journal progress.
JACN 2017 Vol.5(2): 53-58 ISSN: 1793-8244
DOI: 10.18178/JACN.2017.5.2.240

A Corpus of Email Headers with Personal Privacy Protection

Yongchao Wang, Xiao Zhao, Feihang Ge, Yuyan Chao, and Lifeng He
Abstract—Because emails are private, it is hard to acquire enough authentic data to build an email corpus. Email headers, however, do not include the message body and therefore carry less private information. An email header contains the sender, the recipients, and other key information about the sending process, which makes it valuable for related research. This paper proposes, for the first time, an approach to building a corpus of email headers. The idea is to encrypt sensitive data with the Secure Hash Algorithm (SHA) while collecting key fields from email headers. Volunteers can examine all corpus data themselves to confirm that no private information remains. For ease of use, each record in the corpus is labeled with the number of recipients, the sending and receiving geographical locations, and the user's social attributes such as country, language, job, and profession; some of these social attributes are obtained through questionnaires. The corpus can be applied to research fields such as community discovery, user relationship analysis, email classification, and spam recognition. Moreover, the proposed method can also be applied to other corpus collection work where users' privacy must be protected.

Index Terms—Corpus, mail header, SHA, privacy protection, corpus labeling, social network, spam.
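
The collection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the choice of SHA-256, the set of fields treated as sensitive (`From`, `To`, `Cc`), the optional salt, and the names `hash_value` and `anonymize_header` are all assumptions made for illustration, since the abstract specifies none of them.

```python
import hashlib
from email import message_from_string
from email.utils import getaddresses

# Assumption: these are the fields treated as sensitive; the paper may use others.
SENSITIVE_FIELDS = ("From", "To", "Cc")


def hash_value(value: str, salt: str = "") -> str:
    """Return the SHA-256 hex digest of a normalized value (SHA variant is assumed)."""
    normalized = value.strip().lower()
    return hashlib.sha256((salt + normalized).encode("utf-8")).hexdigest()


def anonymize_header(raw_header: str, salt: str = "") -> dict:
    """Replace addresses in a raw email header with hashes and add a recipient count."""
    msg = message_from_string(raw_header)
    record = {}
    for field in SENSITIVE_FIELDS:
        # getaddresses splits each header value into (display name, address) pairs.
        pairs = getaddresses(msg.get_all(field, []))
        record[field] = [hash_value(addr, salt) for _, addr in pairs if addr]
    # Label the record with the number of recipients, as the abstract describes.
    record["num_recipients"] = len(record["To"]) + len(record["Cc"])
    # Non-sensitive field kept as-is (an assumption for this sketch).
    record["Date"] = msg.get("Date")
    return record


if __name__ == "__main__":
    sample = (
        "From: Alice <alice@example.com>\n"
        "To: bob@example.com, Carol <carol@example.com>\n"
        "Cc: dave@example.com\n"
        "Date: Mon, 01 May 2017 10:00:00 +0900\n"
        "\n"
    )
    print(anonymize_header(sample))
```

Because the same address always maps to the same digest, sender and recipient relationships remain analyzable (for example, for community discovery) even though the addresses themselves are no longer readable. Volunteers could inspect such records before submission to confirm that no plain-text addresses remain.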

Yongchao Wang is with the Faculty of Information Science and Technology, Aichi Prefectural University, Nagakute 480-1198, Aichi, Japan, and the School of Computer Science and Engineering, Xi’an University of Technology, Xi’an 710048, Shaanxi, China (e-mail: wyc@xaut.edu.cn). Xiao Zhao is with the Artificial Intelligence Institute, College of Electrical and Information Engineering, Shaanxi University of Science and Technology, Xi’an 710021, China (e-mail: zhaoxiao@sust.edu.cn). Feihang Ge is with the Faculty of Information Science and Technology, Aichi Prefectural University, Nagakute 480-1198, Aichi, Japan, and with the Zhejiang College of Construction, China (e-mail: gfhang@163.com). Yuyan Chao is with the Graduate School of Environment Management, Nagoya Sangyo University, Owariasahi 488-8711, Japan (e-mail: chao@nagoya-su.ac.jp). Lifeng He is with the Graduate School of Information Science and Technology, Aichi Prefectural University, Nagakute 480-1198, Japan, and Artificial Intelligence Institute, College of Electrical and Information Engineering, Shaanxi University of Science and Technology, Xi’an 710021, Shaanxi, China (e-mail: helifeng@ist.aichi-pu.ac.jp).


Cite: Yongchao Wang, Xiao Zhao, Feihang Ge, Yuyan Chao, and Lifeng He, "A Corpus of Email Headers with Personal Privacy Protection," Journal of Advances in Computer Networks, vol. 5, no. 2, pp. 53-58, 2017.

Copyright © 2008-2018. Journal of Advances in Computer Networks.  All rights reserved.