Conference

Authors: Belsis P., Fragos K., Gritzalis S., Skourlas C.
Title: SF-HME system: A Hierarchical Mixtures-of-Experts classification system for Spam Filtering
Conference: ACM SAC 2006 21st ACM Symposium on Applied Computing – Computer Security Track
Editors: G. Bella, P. Ryan
Ed: No
Eds: Yes
Pages: 354-360
To appear: No
Month: April
Year: 2006
Location: Dijon, France
Publisher: ACM Press
Link: http://dl.acm.org/ft_gateway.cfm?id=1141360&ftid=361583&dwn=1&CFID=265073195&CFTOKEN=10352292
File name:
Abstract: Many linear statistical models have recently been proposed in the text classification literature and evaluated on the Unsolicited Bulk Email filtering problem. Despite their popularity, which stems from their simplicity and relative ease of interpretation, the linearity assumption about the data samples is inappropriate in practice, because it cannot capture the apparent non-linear relationships that characterize these samples. In this paper, we propose SF-HME, a Hierarchical Mixture-of-Experts system that attempts to overcome limitations common to other machine-learning-based approaches when applied to spam mail classification. After reducing the dimensionality of the data with the Simba algorithm for feature selection, we evaluated our SF-HME system on a publicly available corpus of emails with very high similarity between legitimate and bulk email, and thus low discriminative potential, on which traditional rule-based filtering approaches achieve considerably lower precision. The results confirm that our SF-HME method outperforms the other machine learning approaches, which exhibited lower recall.