Brand Impersonation Detection By Knowledge Verification On Text Containing Hyperlinks

Abstract

A system and method are disclosed for verifying the authenticity of a message using the hyperlinks embedded in it. The method checks that the hyperlinks in the text point to sites belonging to the companies or brands the text portrays. Verification relies on a knowledge base of company and brand information, from which the known sites, pages, and domains of those companies or brands are retrieved. The method then estimates a probabilistic measure of identity based on matches between the companies or brands mentioned in the text and the outgoing links from the text. In a variation, the method may additionally use contextual information from the message text to compute a coverage score. The coverage score is then input into a machine learning model that uses it, along with other known indicators of genuineness, to determine whether the message or Web page is genuine.
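
The matching step can be illustrated with a minimal Python sketch. It is not the disclosed implementation: the knowledge base entries, the function names, the keyword-based brand detection, and the definition of the score as the fraction of outgoing links that resolve to a known domain of a mentioned brand are all assumptions made for illustration.

```python
import re
from urllib.parse import urlparse

# Hypothetical knowledge base: brand name -> known domains.
# These entries are illustrative placeholders, not real data.
KNOWLEDGE_BASE = {
    "examplebank": {"examplebank.com"},
    "shopmart": {"shopmart.example", "shopmart-cdn.example"},
}

URL_PATTERN = re.compile(r"https?://[^\s<>]+")


def extract_link_hosts(text):
    """Return the lowercase hostnames of all http(s) links found in the text."""
    hosts = []
    for url in URL_PATTERN.findall(text):
        host = urlparse(url).hostname
        if host:
            hosts.append(host)
    return hosts


def mentioned_brands(text, kb):
    """Return brands from the knowledge base named in the text outside of URLs.

    Assumption: a simple whole-word keyword match stands in for whatever
    brand recognition a real system would use.
    """
    prose = URL_PATTERN.sub(" ", text).lower()
    return [b for b in kb if re.search(r"\b" + re.escape(b) + r"\b", prose)]


def host_matches(host, domain):
    """True if host is the domain itself or one of its subdomains."""
    return host == domain or host.endswith("." + domain)


def coverage_score(text, kb=KNOWLEDGE_BASE):
    """Fraction of outgoing links that point to a known domain of a mentioned brand.

    Returns None when the text has no links or mentions no known brand,
    since the check is not applicable in that case.
    """
    hosts = extract_link_hosts(text)
    brands = mentioned_brands(text, kb)
    if not hosts or not brands:
        return None
    known = set().union(*(kb[b] for b in brands))
    matched = sum(any(host_matches(h, d) for d in known) for h in hosts)
    return matched / len(hosts)


genuine = "Your ExampleBank statement is ready: https://www.examplebank.com/statements"
spoofed = (
    "ExampleBank security alert! Verify now at "
    "http://examplebank.login-verify.example/reset "
    "or see https://examplebank.com/help"
)
genuine_score = coverage_score(genuine)
spoofed_score = coverage_score(spoofed)
```

In this sketch the genuine message scores 1.0, while the spoofed message scores 0.5 because one of its two links uses the brand name only as a subdomain label of an unrelated domain. A score of this kind would be one input feature among others to the downstream model, not a verdict on its own.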
