Meta Platforms, Inc.
URL normalization

Last updated:

Abstract:

In one embodiment, a method includes receiving a plurality of uniform resource identifiers (URI's) associated with a particular domain. Each of the URI's identifies a content page comprising one or more signature elements. The method further includes, for each URI in the plurality of URI's, successively testing the URI to identify a core of the URI and any unnecessary elements of the URI. The core of the URI is sufficient to retrieve a version of the content page including all of its signature elements. The method additionally includes, for each URI in the plurality of URI's, updating a set of rules based on the identified core and the identified unnecessary elements. The set of rules establishes a normalized version of the URI.

Status:
Grant
Type:

Utility

Filling date:

30 May 2019

Issue date:

26 Oct 2021