General Information
    • ISSN: 1793-8244 (Print)
    • Abbreviated Title:  J. Adv. Comput. Netw.
    • Frequency: Semiyearly
    • DOI: 10.18178/JACN
    • Editor-in-Chief: Professor Haklin Kimm
    • Executive Editor: Ms. Cherry Chan
    • Abstracting/Indexing: EBSCO, ProQuest, and Google Scholar.
    • E-mail: jacn@ejournal.net

JACN 2017 Vol.5(2): 53-58 ISSN: 1793-8244
DOI: 10.18178/JACN.2017.5.2.240

A Corpus of Email Headers with Personal Privacy Protection

Yongchao Wang, Xiao Zhao, Feihang Ge, Yuyan Chao, and Lifeng He

Abstract—Because emails are private information, it is hard to acquire enough authentic data to build a corpus of emails. Email headers, however, do not include email bodies and thus carry less private information. An email header, which contains the recipient, the sender, and much other key information about the email sending process, is highly valuable for related research. This paper proposes, for the first time, an idea for building a corpus of email headers. The idea is to encrypt sensitive data via the Secure Hash Algorithm (SHA) when collecting key fields in email headers. All corpus data can be examined by the volunteers themselves to confirm that no private information remains. For ease of use, each record in this corpus is labeled with the number of recipients, the sending and receiving geographical locations, and the user's social attributes such as country, language, job, and profession, where some of the users' social attributes are obtained through questionnaires. The corpus can be applied to research fields such as community discovery, users' relationship analysis, email classification, and spam email recognition. Moreover, the method for building a corpus of email headers proposed in this paper can also be applied to other corpus data collection work where protection of users' privacy is necessary.
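As an illustration of the idea described in the abstract, the following is a minimal sketch of hashing sensitive header fields before storing a record. It is not the authors' implementation: the abstract does not state which SHA variant, which fields, or which normalization the paper uses, so SHA-256, the field list `SENSITIVE_FIELDS`, the lowercasing step, and the names `sha_digest` and `anonymize_header` are all assumptions made for this example.

```python
import hashlib
from email import message_from_string
from email.utils import getaddresses

# Assumption: these are the fields treated as sensitive in this sketch.
SENSITIVE_FIELDS = ("From", "To", "Cc")


def sha_digest(value: str) -> str:
    """Return the SHA-256 hex digest of a normalized address (assumed variant)."""
    normalized = value.strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def anonymize_header(raw_header: str) -> dict:
    """Parse a raw email header and replace each address with its digest."""
    msg = message_from_string(raw_header)
    record = {}
    for field in SENSITIVE_FIELDS:
        # getaddresses yields (display name, address) pairs; the display
        # name is dropped and only the hashed address is kept.
        pairs = getaddresses(msg.get_all(field, []))
        record[field] = [sha_digest(addr) for _, addr in pairs if addr]
    # Label the record with the number of recipients, as the abstract describes.
    record["num_recipients"] = len(record["To"]) + len(record["Cc"])
    # The Date field is kept as-is here; whether it counts as sensitive
    # is a design decision this sketch does not settle.
    record["Date"] = msg.get("Date")
    return record


if __name__ == "__main__":
    sample = (
        "From: Alice <alice@example.com>\n"
        "To: bob@example.com, Carol <carol@example.com>\n"
        "Date: Sun, 1 Jan 2017 00:00:00 +0000\n"
        "\n"
    )
    print(anonymize_header(sample))
```

Because the same address always maps to the same digest, sender and recipient relationships remain analyzable across records while the addresses themselves are not stored. An unsalted hash of a guessable address can still be matched against a list of candidate addresses, so a real deployment would likely need a keyed or salted scheme.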

Index Terms—Corpus, mail header, SHA, privacy protection, corpus labeling, social network, spam.

Yongchao Wang is with the Faculty of Information Science and Technology, Aichi Prefectural University, Nagakute 480-1198, Aichi, Japan, and the School of Computer Science and Engineering, Xi’an University of Technology, Xi’an 710048, Shaanxi, China (e-mail: wyc@xaut.edu.cn). Xiao Zhao is with the Artificial Intelligence Institute, College of Electrical and Information Engineering, Shaanxi University of Science and Technology, Xi’an 710021, China (e-mail: zhaoxiao@sust.edu.cn). Feihang Ge is with the Faculty of Information Science and Technology, Aichi Prefectural University, Nagakute 480-1198, Aichi, Japan, and with the Zhejiang College of Construction, China (e-mail: gfhang@163.com). Yuyan Chao is with the Graduate School of Environment Management, Nagoya Sangyo University, Owariasahi 488-8711, Japan (e-mail: chao@nagoya-su.ac.jp). Lifeng He is with the Graduate School of Information Science and Technology, Aichi Prefectural University, Nagakute 480-1198, Japan, and Artificial Intelligence Institute, College of Electrical and Information Engineering, Shaanxi University of Science and Technology, Xi’an 710021, Shaanxi, China (e-mail: helifeng@ist.aichi-pu.ac.jp).


Cite: Yongchao Wang, Xiao Zhao, Feihang Ge, Yuyan Chao, and Lifeng He, "A Corpus of Email Headers with Personal Privacy Protection," Journal of Advances in Computer Networks, vol. 5, no. 2, pp. 53-58, 2017.

Copyright © 2008-2024. Journal of Advances in Computer Networks.  All rights reserved.