Zscaler, Inc.
Utilizing Machine Learning for dynamic content classification of URL content
Last updated:
Abstract:
Systems and methods include obtaining data from Uniform Resource Locator (URL) transactions monitored by a cloud-based system; labeling the data for the URL transactions with a category of a plurality of categories that describe the content of a page associated with the URL; performing preprocessing of raw Hypertext Markup Language (HTML) files for the URL transactions; extracting features from the preprocessed raw HTML files; and creating a machine learning model based on the features, wherein the machine learning model is configured to score content associated with an unknown URL to determine a category of the plurality of categories.
Status:
Application
Type:
Utility
Filing date:
21 Oct 2020
Issue date:
3 Mar 2022
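
Illustrative sketch:
The steps named in the abstract (labeling pages with a category, preprocessing raw HTML, extracting features, building a model, and scoring content from an unknown URL) can be illustrated with a minimal Python sketch. The abstract does not say which features or which model type are used, so everything specific here is an assumption made for illustration: bag-of-words token counts as features, a multinomial naive Bayes classifier with add-one smoothing as the model, and the hypothetical names `preprocess`, `extract_features`, and `NaiveBayesCategorizer`. It is not the method claimed in the application.

```python
import math
import re
from collections import Counter, defaultdict
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects visible text, skipping <script> and <style> bodies."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def preprocess(raw_html):
    """Preprocessing step: reduce raw HTML to lowercase visible text."""
    parser = _TextExtractor()
    parser.feed(raw_html)
    parser.close()
    return " ".join(parser.chunks).lower()


def extract_features(text):
    """Feature extraction step: bag-of-words counts (assumed feature set)."""
    return Counter(re.findall(r"[a-z]+", text))


class NaiveBayesCategorizer:
    """Assumed model type: multinomial naive Bayes with add-one smoothing."""

    def fit(self, labeled_pages):
        # labeled_pages: iterable of (raw_html, category) pairs.
        self.word_counts = defaultdict(Counter)
        self.doc_counts = Counter()
        self.vocab = set()
        for raw_html, category in labeled_pages:
            feats = extract_features(preprocess(raw_html))
            self.word_counts[category].update(feats)
            self.doc_counts[category] += 1
            self.vocab.update(feats)
        return self

    def score(self, raw_html):
        """Return a log-probability score per category for one page."""
        feats = extract_features(preprocess(raw_html))
        total_docs = sum(self.doc_counts.values())
        scores = {}
        for category, n_docs in self.doc_counts.items():
            counts = self.word_counts[category]
            denom = sum(counts.values()) + len(self.vocab)
            log_p = math.log(n_docs / total_docs)
            for word, n in feats.items():
                log_p += n * math.log((counts[word] + 1) / denom)
            scores[category] = log_p
        return scores

    def predict(self, raw_html):
        """Pick the highest-scoring category for content of an unknown URL."""
        scores = self.score(raw_html)
        return max(scores, key=scores.get)
```

In use, `NaiveBayesCategorizer().fit(pairs)` takes `(raw_html, category)` pairs, and `predict(raw_html)` returns the best-scoring category for a page fetched from a URL that has not been seen before. A production system would likely need richer features and a stronger model than this sketch.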