Please use this identifier to cite or link to this item: http://hdl.handle.net/10553/42220
DC Field | Value | Language
dc.contributor.author | Cabrera-León, Ylermi | en_US
dc.contributor.author | García Báez, Patricio | en_US
dc.contributor.author | Suárez-Araujo, Carmen Paz | en_US
dc.date.accessioned | 2018-10-23T17:15:11Z | -
dc.date.available | 2018-10-23T17:15:11Z | -
dc.date.issued | 2019 | en_US
dc.identifier.isbn | 978-3-319-99282-2 | en_US
dc.identifier.issn | 1860-949X | en_US
dc.identifier.uri | http://hdl.handle.net/10553/42220 | -
dc.description.abstract | Spam, or unsolicited messages sent in bulk, is one of the threats that affect email and other media. Its sheer volume causes considerable losses of money and time. A solution to this issue is presented: a hybrid anti-spam filter based on unsupervised Artificial Neural Networks (ANNs). It consists of two steps, preprocessing and processing, each based on a different computation model: programmed and neural (using Kohonen SOM), respectively. The system was optimized on a dataset built with ham from “Enron Email” and spam from two different sources: traditional (user’s inbox) and spamtrap-honeypot. The preprocessing was based on 13 thematic categories found in spam and ham, Term Frequency (TF) and three versions of Inverse Category Frequency (ICF). A total of 1260 system configurations were analyzed with the most widely used performance measures, and the best ones achieved AUC > 0.95. Results were similar to those of other researchers on the same corpus, although they use different Machine Learning (ML) methods and a number of attributes several orders of magnitude greater. The system was further tested with different datasets, characterized by heterogeneous origins, dates, users and types, including samples of image spam. In these new tests the filter obtained 0.75 < AUC < 0.96. The degradation in performance can be explained by the differences in the characteristics of the datasets, particularly their dates. This phenomenon, called “topic drift”, commonly affects all classifiers and, to a larger extent, those that use offline learning, as this system does; it is especially relevant in adversarial ML problems such as spam filtering. | en_US
dc.language | eng | en_US
dc.publisher | 1860-949X | en_US
dc.relation.ispartof | Studies in Computational Intelligence | en_US
dc.source | Studies in Computational Intelligence [ISSN 1860-949X], v. 792, p. 239-262 | en_US
dc.subject | 3325 Telecommunications technology | en_US
dc.subject | 120304 Artificial intelligence | en_US
dc.subject.other | Spam filtering | en_US
dc.subject.other | Artificial neural networks | en_US
dc.subject.other | Self-organizing maps | en_US
dc.subject.other | Thematic category | en_US
dc.subject.other | Term frequency | en_US
dc.subject.other | Inverse category frequency | en_US
dc.subject.other | Topic drift | en_US
dc.subject.other | Adversarial machine learning | en_US
dc.title | E-mail spam filter based on unsupervised neural architectures and thematic categories: design and analysis | en_US
dc.type | info:eu-repo/semantics/bookPart | es
dc.type | BookPart | es
dc.identifier.doi | 10.1007/978-3-319-99283-9_12 |
dc.identifier.scopus | 85054370983 |
dc.contributor.authorscopusid | 57192423564 |
dc.contributor.authorscopusid | 6506952458 |
dc.contributor.authorscopusid | 6603605708 |
dc.description.lastpage | 262 | -
dc.description.firstpage | 239 | -
dc.relation.volume | 792 | -
dc.investigacion | Engineering and Architecture | en_US
dc.type2 | Book chapter | en_US
dc.identifier.external | 1860-949X | -
dc.utils.revision |  | en_US
dc.identifier.ulpgc |  | es
dc.description.sjr | 0.215 |
dc.description.sjrq | Q4 |
item.fulltext | No full text | -
item.grantfulltext | none | -
crisitem.author.dept | GIR IUCES: Computación inteligente, percepción y big data | -
crisitem.author.dept | IU de Cibernética, Empresa y Sociedad (IUCES) | -
crisitem.author.dept | GIR IUCES: Computación inteligente, percepción y big data | -
crisitem.author.dept | IU de Cibernética, Empresa y Sociedad (IUCES) | -
crisitem.author.dept | GIR IUCES: Computación inteligente, percepción y big data | -
crisitem.author.dept | IU de Cibernética, Empresa y Sociedad (IUCES) | -
crisitem.author.dept | Departamento de Informática y Sistemas | -
crisitem.author.orcid | 0000-0001-5709-2274 | -
crisitem.author.orcid | 0000-0002-9973-5319 | -
crisitem.author.orcid | 0000-0002-8826-0899 | -
crisitem.author.parentorg | IU de Cibernética, Empresa y Sociedad (IUCES) | -
crisitem.author.parentorg | IU de Cibernética, Empresa y Sociedad (IUCES) | -
crisitem.author.parentorg | IU de Cibernética, Empresa y Sociedad (IUCES) | -
crisitem.author.fullName | Cabrera León, Ylermi | -
crisitem.author.fullName | García Baez, Patricio | -
crisitem.author.fullName | Suárez Araujo, Carmen Paz | -
Appears in Collections: Book chapter
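
The abstract mentions preprocessing with Term Frequency (TF) and Inverse Category Frequency (ICF) over thematic categories, but this record does not define the three ICF versions used in the chapter. As a rough illustration only, the sketch below assumes one common formulation, ICF(t) = log(C / cf(t)), where C is the number of categories and cf(t) is the number of categories whose vocabulary contains term t. The function name `tf_icf` and the toy category vocabularies are hypothetical and are not taken from the chapter.

```python
import math
from collections import Counter


def tf_icf(tokens, category_vocab):
    """Weight the terms of one message by TF * ICF.

    tokens: list of str, the words of a single message.
    category_vocab: dict mapping a category name to its set of terms.
    Assumption: ICF(t) = log(C / cf(t)); the chapter's own variants may differ.
    """
    n_categories = len(category_vocab)
    counts = Counter(tokens)
    total = len(tokens)
    weights = {}
    for term, count in counts.items():
        # cf: number of categories whose vocabulary contains the term
        cf = sum(1 for vocab in category_vocab.values() if term in vocab)
        if cf == 0:
            continue  # the term belongs to no category, so it is ignored
        icf = math.log(n_categories / cf)
        weights[term] = (count / total) * icf
    return weights


# Toy data: three made-up categories instead of the 13 used in the chapter.
categories = {
    "finance": {"loan", "money"},
    "health": {"pills", "money"},
    "tech": {"software"},
}
weights = tf_icf(["money", "loan", "loan", "hello"], categories)
```

With this formulation, a term that appears in few categories ("loan") is weighted more heavily than one shared by several ("money"), and a term found in every category gets weight zero.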

Items in accedaCRIS are protected by copyright, with all rights reserved, unless otherwise indicated.