Bank of America Corporation
Logical upstream preprocessing at edge node of data to be stored in a data lake
Abstract:
Data destined for storage in a data lake is preprocessed upstream of the data lake, such as at an edge node. The preprocessing includes filtering out data deemed unnecessary for subsequent analytical use. An initial intelligent determination is made as to whether a data feed is to be preprocessed (i) at the data lake or (ii) upstream of the data lake, such as at an edge node. Once upstream preprocessing has been selected, a second intelligent determination is made as to which edge node is to perform the preprocessing. The choice of edge node is based on the response times between the application server and the edge nodes and on the bandwidth usage of the network transmitting the data feed.
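The edge-node selection and filtering steps described in the abstract might be sketched as follows. This is an illustration only, not the claimed method: the abstract names the two inputs (response time and network bandwidth usage) but does not say how they are combined, so the weighted-cost rule, the weights, the `EdgeNode` type, and the function names `select_edge_node` and `filter_records` are all assumptions. Field-level filtering is likewise an assumed interpretation of "filtering".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EdgeNode:
    """Hypothetical description of a candidate edge node."""
    name: str
    response_time_ms: float  # measured between the application server and this node
    bandwidth_usage: float   # fraction of network capacity in use, 0.0 to 1.0


def select_edge_node(nodes, rt_weight=0.5, bw_weight=0.5):
    """Return the node with the lowest weighted cost.

    Assumed scoring rule: response time is normalized by the slowest
    candidate, then combined linearly with bandwidth usage.
    """
    if not nodes:
        raise ValueError("no candidate edge nodes")
    max_rt = max(n.response_time_ms for n in nodes) or 1.0

    def cost(node):
        return (rt_weight * (node.response_time_ms / max_rt)
                + bw_weight * node.bandwidth_usage)

    return min(nodes, key=cost)


def filter_records(records, needed_fields):
    """Keep only the fields assumed to be needed for later analytics."""
    return [{k: v for k, v in r.items() if k in needed_fields} for r in records]


if __name__ == "__main__":
    candidates = [
        EdgeNode("edge-a", response_time_ms=40.0, bandwidth_usage=0.80),
        EdgeNode("edge-b", response_time_ms=60.0, bandwidth_usage=0.30),
        EdgeNode("edge-c", response_time_ms=120.0, bandwidth_usage=0.10),
    ]
    chosen = select_edge_node(candidates)
    print(chosen.name)
    feed = [{"id": 1, "amount": 9.5, "debug_trace": "..."}]
    print(filter_records(feed, {"id", "amount"}))
```

With equal weights, the sketch favors a node that balances moderate latency against a lightly loaded network path; setting `bw_weight=0.0` reduces it to picking the fastest-responding node.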
Utility
4 Oct 2021
12 Jul 2022