Zscaler, Inc.
Utilizing Machine Learning for dynamic content classification of URL content

Last updated:

Abstract:

Systems and methods include obtaining data from Uniform Resource Locator (URL) transactions monitored by a cloud-based system; labeling the data for the URL transactions with a category of a plurality of categories that describe the content of a page associated with the URL; performing preprocessing of raw Hypertext Markup Language (HTML) files for the URL transactions; extracting features from the preprocessed raw HTML files; and creating a machine learning model based on the features, wherein the machine learning model is configured to score content associated with an unknown URL to determine a category of the plurality of categories.
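The steps named in the abstract (labeling, preprocessing of raw HTML, feature extraction, model creation, scoring of an unknown URL) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the abstract does not specify the preprocessing rules, the feature set, or the model type. This sketch uses visible-text extraction, bag-of-words counts, and a multinomial naive Bayes classifier, and the names `preprocess`, `extract_features`, `NaiveBayesModel`, and the sample training pages are hypothetical, not taken from the application.

```python
import math
import re
from collections import Counter, defaultdict
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects visible text, skipping <script> and <style> bodies."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def preprocess(raw_html):
    """Preprocessing step (assumed): reduce raw HTML to lowercase visible text."""
    parser = _TextExtractor()
    parser.feed(raw_html)
    parser.close()
    return " ".join(parser.chunks).lower()


def extract_features(text):
    """Feature extraction step (assumed): bag-of-words token counts."""
    return Counter(re.findall(r"[a-z]+", text))


class NaiveBayesModel:
    """Multinomial naive Bayes with add-one smoothing (an assumed model choice)."""

    def fit(self, labeled_pages):
        self.word_counts = defaultdict(Counter)  # category -> token counts
        self.doc_counts = Counter()              # category -> number of pages
        for raw_html, category in labeled_pages:
            self.doc_counts[category] += 1
            self.word_counts[category].update(extract_features(preprocess(raw_html)))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def score(self, raw_html):
        """Return a log-score per category for the content of an unknown URL."""
        features = extract_features(preprocess(raw_html))
        total_docs = sum(self.doc_counts.values())
        scores = {}
        for category, n_docs in self.doc_counts.items():
            counts = self.word_counts[category]
            denom = sum(counts.values()) + len(self.vocab)
            log_p = math.log(n_docs / total_docs)  # category prior
            for word, n in features.items():
                if word in self.vocab:  # tokens never seen in training are ignored
                    log_p += n * math.log((counts[word] + 1) / denom)
            scores[category] = log_p
        return scores

    def classify(self, raw_html):
        """Pick the highest-scoring category."""
        scores = self.score(raw_html)
        return max(scores, key=scores.get)


# Made-up pages standing in for labeled URL transactions.
TRAINING = [
    ("<html><body><h1>Football scores</h1><p>The team won the match.</p></body></html>", "sports"),
    ("<html><body><p>League match results and team standings</p></body></html>", "sports"),
    ("<html><body><h1>Stock market</h1><p>Shares and bond prices fell.</p></body></html>", "finance"),
    ("<html><body><p>Bank raises interest rates; market reacts</p></body></html>", "finance"),
]

model = NaiveBayesModel().fit(TRAINING)
predicted = model.classify("<html><script>var x=1;</script><p>The team lost the match</p></html>")
```

With this toy training set, the unknown page is assigned the "sports" category, because its visible tokens overlap only with the sports pages and the script body is discarded during preprocessing. A production system would presumably use far more data and richer features; the sketch is meant only to show how the stages connect.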

Status:

Application

Type:

Utility

Filing date:

21 Oct 2020

Issue date:

3 Mar 2022