Palo Alto Networks, Inc.
Pattern-based malicious URL detection
Last updated:
Abstract:
To perform pattern-based detection of malicious URLs, patterns are first generated from known URLs to build a pattern repository. A URL is first normalized and parsed, and keywords are extracted and stored in an additional repository of keywords. Tokens are then determined from the parsed URL and tags are associated with the parsed substrings. Substring text may also be replaced with general identifying information. Patterns generated from known malicious and benign URLs satisfying certain criteria are published to a pattern repository of which can be accessed during subsequent detection operations. During detection, upon identifying a request which indicates an unknown URL, the URL is parsed and tokenized to generate a pattern. The repository of malicious URL patterns is queried to determine if a matching malicious URL pattern can be identified. If a matching malicious URL pattern is identified, the URL is detected as malicious.
Utility
28 Jul 2020
16 Aug 2022