Patent application title:

System and method for distributed cluster configuration monitoring and management

Publication date:

2022-01-18

Application number:

16/022,644

Filed date:

2018-06-28

✅ Patent granted

Patent number:

US 11,228,491 B1

Grant date:

2022-01-18

Examiner:

Cheikh T Ndiaye

Agent:

Rutan & Tucker, LLP

Adjusted expiration:

2038-09-18

Smart Summary: A system helps monitor and manage the settings of multiple computers working together to detect cyber threats. It uses a shared data storage to keep a standard configuration that all computers should follow. Each computer has a management tool that checks if its settings match the standard configuration. If any computer's settings are different, it sends this information back to the shared storage. This allows security managers to decide whether to fix the settings or allow the differences.

Abstract:

A cyber-threat detection system that maintains consistency in local configurations of one or more computing nodes forming a cluster for cyber-threat detection is described. The system features a distributed data store for storage of at least a reference configuration and a management engine deployed within each computing node, including the first computing node and configured to obtain data associated with the reference configuration from the distributed data store. From such data, the management engine is configured to detect when the shared local configuration is non-compliant with the reference configuration, and upload information associated with the non-compliant shared local configuration into the distributed data store. Upon notification, the security administrator may initiate administrative controls to allow the non-compliant shared local configuration or modify the shared local configuration to be compliant with the reference configuration.

Inventors:

Alexander Otvagin, Campbell, CA, United States
Alexey Yakymovych, Milpitas, CA, United States

Assignee:

FireEye Security Holdings US LLC, Milpitas, CA, United States

Applicant:

FireEye, Inc., Milpitas, CA, United States

Classification:

H04L41/0853 » CPC main

Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks; Configuration management of networks or network elements; Retrieval of network configuration; Tracking network configuration history by actively collecting configuration information or by backing up configuration information

H04L41/0816 » CPC further

Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks; Configuration management of networks or network elements; Configuration setting characterised by the conditions triggering a change of settings the condition being an adaptation, e.g. in response to network events

H04L41/0873 » CPC further

Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks; Configuration management of networks or network elements; Checking the configuration Checking configuration conflicts between network elements

H04L41/0893 » CPC further

H04L67/1097 » CPC further

Network arrangements or protocols for supporting network services or applications; Protocols in which an application is distributed across nodes in the network for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS]

G06F15/173 IPC

Digital computers in general ; Data processing equipment in general; Combinations of two or more digital computers each having at least an arithmetic unit, a program unit and a register, e.g. for a simultaneous processing of several programs; Interprocessor communication using an interconnection network, e.g. matrix, shuffle, pyramid, star, snowflake

Description

FIELD

Embodiments of the disclosure relate to the field of cybersecurity and distributed computing. More specifically, one embodiment of the disclosure relates to a scalable, cyber-threat detection system managed to reduce operational errors caused by node misconfiguration and enhance cluster scalability.

GENERAL BACKGROUND

Network devices provide useful and necessary services that assist individuals in business and in their everyday lives. Given the growing dependence on these services, increased security measures have been undertaken to protect these network devices against cybersecurity attacks (hereinafter, “cyberattacks”). These cyberattacks may involve an attempt to gain access to content stored on one or more network devices for illicit (i.e., unauthorized) purposes or an attempt to adversely influence the operability of a network device. For instance, the cyberattack may be designed to alter functionality of a network device (e.g., ransomware), steal sensitive information or intellectual property, or harm information technology or other infrastructure.

One type of security measure that is growing in popularity involves the deployment of compute clusters. A “compute cluster” (hereinafter, referred to as “cluster”) is a scalable cyber-threat detection architecture that includes multiple computing nodes that collectively perform analytics on received objects (e.g., data extracted from network traffic, files, etc.) to determine if these objects are malicious or non-malicious. Stated differently, the computing nodes are configured to analyze the received objects and determine whether such objects are part of a cyberattack (e.g., a likelihood of the object being associated with a cyberattack greater than a prescribed threshold). An example of a cluster is described in detail in U.S. patent application Ser. No. 15/283,128 entitled “Cluster Configuration Within A Scalable Malware Detection System,” filed Sep. 30, 2016, the entire contents of which are incorporated by reference herein.

Clusters are central to large scale computing and cloud computing. However, for a cluster deployment, each computing node within a cluster is subject to operational errors caused by (i) third party disruptive activities (e.g., cyberattack), (ii) hardware or software failures, or (iii) misconfiguration that may be caused by a failed installation or an errant reconfiguration of a computing node, a failed or accidental software update, or the like. These operational errors may lead to inconsistent behavior of the cluster, and thus, depending on which computing node is handling an analysis of an object, the cluster may provide unreliable or inconsistent analytic results.

In some conventional implementations, cyber-threat detection systems are configured with a centralized cluster management system that periodically communicates directly with each computing node to detect operational errors and prevent unreliable cluster operability caused by configuration mishaps. However, the use of a centralized management system limits scalability, as throughput issues arise as the number of computing nodes within the cluster increases. Additionally, conventional cyber-threat detection systems typically require the configuration and/or reconfiguration of computing nodes in a cluster to be performed through direct communications with the central cluster management system, limiting cluster management to a single point of failure.

BRIEF DESCRIPTION OF THE DRAWINGS

Embodiments of the invention are illustrated by way of example and not by way of limitation in the figures of the accompanying drawings, in which like references indicate similar elements and in which:

FIG. 1 is a block diagram of an exemplary embodiment of a cluster-based cyber-threat detection system including a configuration management framework.

FIG. 2 is a block diagram of an exemplary embodiment of a computing node of a cluster forming part of the cyber-threat detection system.

FIG. 3 is a block diagram of an exemplary embodiment of the management system operating with the configuration management framework deployed in the cyber-threat detection system.

FIG. 4 is a logical representation of an exemplary embodiment of the cyber-threat detection system including the configuration management framework of FIG. 1.

FIG. 5 is a logical representation of an exemplary embodiment of the configuration management framework provided by a cluster of the cyber-threat detection system of FIG. 1.

FIG. 6A is a block diagram of a first exemplary embodiment of the cluster formation scheme directed to removal of a requesting computing node from a cluster.

FIG. 6B is a block diagram of a second exemplary embodiment of the cluster formation scheme directed to removal of a computing node based on an initiated command by a different computing node.

FIG. 7 is a flow diagram of the operations conducted by the configuration management framework of FIG. 1.

DETAILED DESCRIPTION

I. Overview

Embodiments of the present disclosure generally relate to a distributed, configuration management framework that relies on interoperating management engines deployed within the computing nodes of the cluster to reduce operational errors, increase scalability, and ease cluster management. The operability of each computing node is based, at least in part, on its local configuration; namely, information stored within the computing node that is directed to properties (e.g., settings, parameters, permissions, etc.) that control operability of the computing node. The local configuration may include (i) “shared local configuration,” namely one or more portions of the local configuration data each directed to a different functionality that is commonly shared by computing nodes operating within the same cluster; and (ii) “private local configuration,” namely a portion of the local configuration data that is specific to the particular computing node (e.g., common properties). Besides the information associated with the common properties described above, the shared local configuration may further include metadata directed to the properties, such as the monitoring method of the properties, configuration values assigned to different properties, and the method utilized in modifying the configuration values.

The configuration management framework reduces operational errors that may be experienced by the cyber-threat detection system through the installation of a management engine into each computing node capable of forming and/or joining with a cluster. Each management engine within a computing node is configured to periodically or aperiodically analyze the shared local configuration for that computing node and detect when the shared local configuration is non-compliant with a reference (or golden) configuration. Non-compliance may be detected when information (e.g., configuration values, etc.) within the shared local configuration (hereinafter, “shared local configuration data”) is inconsistent with corresponding information within the reference configuration (hereinafter, “reference configuration data”). This level of inconsistency may be absolute, without allowing any discrepancies between the shared local configuration data and the reference configuration data, or may allow certain tolerances (e.g., allowable prescribed differences) that may be set based on property type.

It is contemplated, however, as another embodiment, that the level of inconsistency between shared local configuration data and the reference data may allow certain tolerances (e.g., the tolerance may be dependent on property type), and thus, a difference in configuration data may not be deemed “inconsistent.” However, it is further contemplated that a difference in other properties (e.g., certain permissions) maintained in the shared local configuration and the reference configuration may lead to a non-compliance finding.
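
The absolute and tolerance-based comparisons described above can be illustrated with a minimal Python sketch. The function name `find_violations`, the property names, and the representation of a tolerance as a maximum numeric difference are assumptions made for illustration only; the disclosure does not prescribe a particular data format or comparison routine.

```python
from typing import Any, Dict, List


def find_violations(shared_local: Dict[str, Any],
                    reference: Dict[str, Any],
                    tolerances: Dict[str, float]) -> List[str]:
    """Return the names of properties whose local value is non-compliant.

    A property listed in `tolerances` (hypothetical) may differ numerically
    by up to that amount; every other property must match exactly.
    """
    violations = []
    for name, expected in reference.items():
        actual = shared_local.get(name)
        tolerance = tolerances.get(name)
        numeric = (
            isinstance(actual, (int, float))
            and isinstance(expected, (int, float))
            and not isinstance(actual, bool)
            and not isinstance(expected, bool)
        )
        if tolerance is not None and numeric:
            # Tolerance-based comparison for this property type.
            if abs(actual - expected) > tolerance:
                violations.append(name)
        elif actual != expected:
            # Absolute comparison: any difference is a violation.
            violations.append(name)
    return violations


# Hypothetical reference and shared local configuration data.
reference = {"scan_timeout_s": 60, "os_type": "linux", "sandbox_enabled": True}
local = {"scan_timeout_s": 65, "os_type": "linux", "sandbox_enabled": False}

print(find_violations(local, reference, {"scan_timeout_s": 10}))
```

In this sketch the timeout difference falls within its assumed tolerance, while the differing permission-like flag is still reported, mirroring the distinction drawn above between tolerated properties and properties for which any difference leads to a non-compliance finding.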

As described below, both the shared local configuration and the reference configuration are changeable, so the description may refer to certain states of the configurations as “current” or “next” shared local (or reference) configurations based on their current or modified (next) state.

Representing the current shared local configuration expected for each of the computing nodes within the same cluster, the reference configuration may include (i) a set (one or more) of properties of the computing node to be monitored (e.g., what logic for threat detection analytics is enabled or disabled, Operating System “OS” type, etc.); (ii) the method by which the properties are monitored (e.g., read/write monitoring, read only monitoring, etc.); (iii) the value assigned to each property (e.g., integer value, string value, etc.); and/or (iv) the method for modifying the configuration value assigned to each of the properties (e.g., call, function, Application Programming Interface “API”, etc.). Herein, the reference configuration may be stored in a distributed (shared) data store and may be configurable. For one embodiment, the distributed data store may be memory contained within each computing node collectively operating as the distributed data store (e.g., each computing node contains a portion of the data in the distributed data store but the distributed data store appears via a user interface as a single storage). For another embodiment, the distributed data store may be a separate, addressable memory that is shared between or segmented and allocated to the computing nodes forming a cluster.
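
As a rough illustration of the four elements of the reference configuration listed above, the following Python sketch models each monitored property with its assigned value, its monitoring method, and a callable used to modify it. The names `PropertySpec`, `MonitorMode`, and `make_setter`, and the reading of “read/write monitoring” as permission for the engine to modify the local value, are assumptions for illustration and are not taken from the disclosure.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any, Callable, Dict


class MonitorMode(Enum):
    """(ii) Method by which a property is monitored."""
    READ_ONLY = "read_only"
    READ_WRITE = "read_write"


@dataclass
class PropertySpec:
    value: Any                                       # (iii) value assigned to the property
    monitor: MonitorMode                             # (ii) monitoring method
    modifier: Callable[[Dict[str, Any], Any], None]  # (iv) method for modifying the value


def make_setter(name: str) -> Callable[[Dict[str, Any], Any], None]:
    """Build a hypothetical modifier that writes a value into a local configuration."""
    def setter(local_config: Dict[str, Any], value: Any) -> None:
        local_config[name] = value
    return setter


# (i) The set of monitored properties, keyed by hypothetical property names.
reference_configuration = {
    "os_type": PropertySpec("linux", MonitorMode.READ_ONLY, make_setter("os_type")),
    "static_analysis_enabled": PropertySpec(
        True, MonitorMode.READ_WRITE, make_setter("static_analysis_enabled")
    ),
}

# A node's shared local configuration that has drifted on one property.
local = {"os_type": "linux", "static_analysis_enabled": False}

for name, spec in reference_configuration.items():
    drifted = local.get(name) != spec.value
    if drifted and spec.monitor is MonitorMode.READ_WRITE:
        spec.modifier(local, spec.value)  # bring the property back into compliance

print(local)
```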

More specifically, according to one embodiment of the disclosure, each management engine is adapted to (i) monitor the shared local configuration of its computing node, (ii) identify whether any shared local configuration data is non-compliant (e.g., inconsistent) with corresponding reference configuration data, and (iii) report the detected non-compliance being a configuration violation for a particular cluster. In some cases, the configuration violation, caused by a deviation from the reference configuration, would result in a false positive or false negative verdict (determination) or increased latency in detection of a cyberattack.

The reporting of the configuration violation may be accomplished by the management engine uploading the non-compliant, shared local configuration data (or metadata representing the non-compliant, shared local configuration data) into the distributed data store of the particular cluster. More specifically, according to one embodiment of the disclosure, a management engine of a computing node operating as a configuration lead node (e.g., the first computing node to join the cluster or otherwise an automatically elected or determined node of the cluster) generates an inconsistency report directed to its own configuration compliance or non-compliance with the reference configuration (e.g., node health) as well as to configuration inconsistencies for the entire cluster realized by aggregating the shared local configuration data for all of the computing nodes. Each management engine of the other computing nodes generates an inconsistency report for itself (e.g., node health). The configuration lead node makes the cluster-wide inconsistency report available to the management system, which makes the inconsistency report available to a security administrator.
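
The per-node and cluster-wide inconsistency reports described above might be assembled along the following lines. This is a sketch under stated assumptions: the report layout, the function names `node_report` and `cluster_report`, and the node identifiers are hypothetical, and the distributed data store is stood in for by an ordinary dictionary of shared local configurations.

```python
from typing import Any, Dict

Config = Dict[str, Any]


def node_report(shared_local: Config, reference: Config) -> Dict[str, Any]:
    """Per-node report (node health): properties that differ from the reference."""
    diffs = {
        name: {"expected": expected, "actual": shared_local.get(name)}
        for name, expected in reference.items()
        if shared_local.get(name) != expected
    }
    return {"healthy": not diffs, "violations": diffs}


def cluster_report(configs_by_node: Dict[str, Config], reference: Config) -> Dict[str, Any]:
    """Cluster-wide report built by the lead node from every node's shared local configuration."""
    nodes = {node: node_report(cfg, reference) for node, cfg in configs_by_node.items()}
    non_compliant = sorted(node for node, rep in nodes.items() if not rep["healthy"])
    return {
        "healthy": not non_compliant,
        "non_compliant_nodes": non_compliant,
        "nodes": nodes,
    }


# Hypothetical reference configuration and shared local configurations.
reference = {"os_type": "linux", "sandbox_enabled": True}
configs = {
    "node-1": {"os_type": "linux", "sandbox_enabled": True},
    "node-2": {"os_type": "linux", "sandbox_enabled": False},
}

report = cluster_report(configs, reference)
print(report["non_compliant_nodes"])
```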

Using a predefined API (hereinafter, “cluster management API”), the management system may conduct a polling operation to retrieve cluster status information (e.g., name, health, size, etc.) along with status information for the computing nodes within the particular cluster (e.g., node name, node health, network address of the node, etc.). The status information for the non-compliant computing node may identify a configuration violation and provide an IT administrator (e.g., a security administrator) with access to the non-compliant, shared local configuration data via an administrative portal (pulling data). Alternatively, upon detecting a configuration violation, the management system initiates transmission of the non-compliant, shared local configuration data via the portal (pushing data). Herein, the security administrator may be an analyst or an automated system that relies on preconfigured rule sets, machine learning models, or other artificial intelligence schemes (e.g., artificial neural networks, etc.) to determine how to handle a configuration violation through rule enforcement, remediation using repair instructions to return the shared local configuration back to a prior state that is in compliance with the reference configuration, leave the cluster, or the like.

The configuration management framework further reduces operational errors by at least the management engine for the first computing node (serving as the configuration lead node) being adapted to (i) acquire shared local configurations for other computing nodes within the cluster, (ii) identify whether the shared local configuration of another computing node (e.g., a second computing node) is non-compliant with (e.g., differs from) the reference configuration, and (iii) report detected non-compliance by the second computing node. The reporting of non-compliance by the second computing node may be accomplished as described above, namely the first computing node uploading the detected, non-compliant shared local configuration data for the second computing node (or metadata representing the non-compliant shared local configuration data) into the distributed data store. This non-compliant, shared local configuration data (or representative metadata) may be appended to the shared local configuration data stored within the distributed data store. Access to the non-compliant shared local configuration data (or representative metadata) is made available to the security administrator via a push or pull data delivery scheme, as described above.

Responsive to determining that the shared local configuration is non-compliant, the first computing node may execute repair instructions that return the shared local configuration back to a prior state that is in compliance with the reference configuration. Alternatively, in other cases, the first computing node may issue a leave command to remove itself from the cluster. The issuance of the leave command may depend on the degree of non-compliance and/or the non-compliant properties.
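
One way to picture the choice between repair and leaving the cluster is the small Python sketch below. The policy itself is an assumption: the disclosure states only that the leave command may depend on the degree of non-compliance and/or the non-compliant properties, so the `critical` property set, the `max_repairable` threshold, and the function name `choose_action` are hypothetical.

```python
from typing import Iterable, Set


def choose_action(violations: Iterable[str],
                  critical: Set[str],
                  max_repairable: int) -> str:
    """Pick a response to detected non-compliance.

    Returns "none" when compliant, "leave" when a critical property is
    violated or too many properties have drifted, and "repair" otherwise.
    """
    violated = list(violations)
    if not violated:
        return "none"
    touches_critical = any(name in critical for name in violated)  # which properties
    too_many = len(violated) > max_repairable                      # degree of non-compliance
    if touches_critical or too_many:
        return "leave"
    return "repair"


# Hypothetical property names and thresholds.
critical_properties = {"cluster_permissions"}
print(choose_action(["scan_timeout_s"], critical_properties, max_repairable=3))
print(choose_action(["cluster_permissions"], critical_properties, max_repairable=3))
```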

It is noted that, rather than the first computing node requesting removal of itself or another computing node, the management system may issue commands for such removal and change the listing of computing nodes forming the cluster to account for misconfigured computing nodes. Also, in some cases, the misconfiguration may be due to a security breach where the computing node goes rogue (misconfigured) and removal of an untrusted computing node is desired for the health of the cluster and the enterprise network as a whole.

The distributed, configuration management framework enhances scalability of the cluster by the management engines using a cascading, “multicast” communication scheme. One type of “multicast” communication scheme relies on a Gossip communication protocol. In lieu of relying on a single point (e.g., a centralized management system) in managing configurations for each computing node within a cluster, the distributed, configuration management framework disseminates messages between neighboring computing nodes. This multicast communication scheme allows the security administrator, via the management system, to upload intended updates and/or modifications to the reference configuration within the distributed data store. Alternatively, the reference configuration updates may be detected by computing nodes including logic that monitors for changes in internal services provided by a computing node (e.g., local configuration changes after successful testing at a computing node, etc.) and any intended changes in service may cause an update to the reference configuration within the distributed data store. Thereafter, the updating of the reference configuration may cause other computing nodes, over time, to detect that their shared local configuration is non-compliant and modify their shared local configuration accordingly.
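
The cascading dissemination between neighboring computing nodes can be sketched as a simple simulation. This is a simplification rather than the Gossip protocol itself: gossip implementations typically choose peers at random, whereas the sketch below uses fixed ring neighbors so that its behavior is deterministic. The version numbers, the `fanout` parameter, and the function names are assumptions for illustration.

```python
from typing import List


def gossip_round(versions: List[int], fanout: int) -> List[int]:
    """One synchronous round of neighbor-to-neighbor dissemination.

    Each node pushes the reference-configuration version it currently holds
    to its next `fanout` neighbors on a ring; a receiver keeps the highest
    version it has seen.
    """
    n = len(versions)
    updated = list(versions)
    for sender, version in enumerate(versions):
        for step in range(1, fanout + 1):
            receiver = (sender + step) % n
            updated[receiver] = max(updated[receiver], version)
    return updated


def rounds_to_converge(versions: List[int], fanout: int) -> int:
    """Count the rounds until every node holds the newest version."""
    if fanout < 1:
        raise ValueError("fanout must be at least 1")
    target = max(versions)
    rounds = 0
    while any(v != target for v in versions):
        versions = gossip_round(versions, fanout)
        rounds += 1
    return rounds


# Six nodes; only the first has seen reference configuration version 2.
print(rounds_to_converge([2, 1, 1, 1, 1, 1], fanout=2))  # prints 3
```

No node communicates with every other node directly, yet the update reaches the whole cluster in a few rounds, which is the scalability property attributed above to the multicast scheme.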

II. Terminology

In the following description, certain terminology is used to describe features of the invention. In certain situations, terms “logic,” “engine,” “component” and “client” may be representative of hardware, firmware and/or software that is configured to perform one or more functions. As hardware, logic (or engine or component or client) may include circuitry having data processing or storage functionality. Examples of such circuitry may include, but are not limited or restricted to a processor such as a microprocessor, one or more processor cores, a programmable gate array, a microcontroller, an application specific integrated circuit, a digital signal processor (DSP), field-programmable gate array (FPGA), wireless receiver, transmitter and/or transceiver circuitry, combinatorial logic, or any other hardware element with data processing capability. The circuitry may include memory operating as non-persistent or persistent storage.

Logic (or engine or component or client) may be software in the form of one or more software modules. The software modules may be executable code in the form of an executable application, an API, a subroutine, a function, a procedure, an applet, a servlet, a routine, source code, object code, a shared library/dynamic load library, or one or more instructions. These software modules may be stored in any type of a suitable non-transitory storage medium, or transitory storage medium (e.g., electrical, optical, acoustical or other form of propagated signals such as carrier waves, infrared signals, or digital signals). Examples of non-transitory storage medium may include, but are not limited or restricted to a programmable circuit; a semiconductor memory; non-persistent storage such as volatile memory (e.g., any type of random access memory “RAM”); persistent storage such as non-volatile memory (e.g., read-only memory “ROM”, power-backed RAM, flash memory, phase-change memory, etc.), a solid-state drive, hard disk drive, an optical disc drive, or a portable memory device. As firmware, the executable code is stored in persistent storage.

A “network device” generally refers to either a physical electronic device featuring data processing and/or network connection functionality or a virtual electronic device being software that virtualizes certain functionality of the physical network device. Examples of a network device may include, but are not limited or restricted to, a server, a mobile phone, a computer, a standalone cybersecurity appliance, a network adapter, an industrial controller, an intermediary communication device (e.g., router, firewall, etc.), a virtual machine, or any other virtualized resource.

The term “object” generally relates to content having a logical structure or organization that enables it to be classified during threat analysis. The content may include an executable (e.g., an application, program, code segment, a script, dynamic link library “dll” or any file in a format that can be directly executed by a computer such as a file with an “.exe” extension, etc.), a non-executable (e.g., a storage file; any document such as a Portable Document Format “PDF” document; a word processing document such as a Word® document; an electronic mail “email” message, web page, etc.), or simply a collection of related data. The object may be retrieved from information in transit (e.g., a plurality of packets) or information at rest (e.g., data bytes from a storage medium).

The term “message” generally refers to information placed in a prescribed format and transmitted in accordance with a suitable delivery protocol or information provided to (or made available from) a logical data structure such as a prescribed API in order to perform a prescribed operation. Examples of a delivery protocol include, but are not limited or restricted to, Gossip protocol, User Datagram Protocol (UDP), or the like. Hence, each message may be in the form of one or more packets, a frame, an instruction such as a command, or any other series of bits having the prescribed, structured format.

The term “computerized” generally represents that any corresponding operations are conducted by hardware in combination with software and/or firmware. In certain instances, the terms “compare,” “comparing,” “comparison,” or other tenses thereof generally mean determining if a match (e.g., identical or a prescribed level of correlation) is achieved.

The term “transmission medium” generally refers to a physical or logical communication link (or path) between two or more network devices. For instance, as a physical communication path, wired interconnects in the form of electrical wiring, optical fiber, cable, or bus trace may be used. For a wireless interconnect, wireless transmitter/receiver logic supporting infrared or radio frequency (RF) transmissions may be used.

Finally, the terms “or” and “and/or” as used herein are to be interpreted as inclusive or meaning any one or any combination. As an example, “A, B or C” or “A, B and/or C” mean “any of the following: A; B; C; A and B; A and C; B and C; A, B and C.” An exception to this definition will occur only when a combination of elements, functions, steps or acts are in some way inherently mutually exclusive.

As this invention is susceptible to embodiments of many different forms, it is intended that the present disclosure is to be considered as an example of the principles of the invention and not intended to limit the invention to the specific embodiments shown and described.

III. Distributed, Configuration Management Framework

Referring to FIG. 1, an exemplary block diagram of a cyber-threat detection system 100 is shown. The threat detection system 100 includes one or more computing nodes 110₁-110_M (M≥1), where each computing node 110₁, . . . , or 110_M is communicatively coupled to a management system 120. As an illustrative example, selected subsets of the computing nodes 110₁-110₆ may be grouped to form one or more clusters 130₁-130_N (M≥N≥1, e.g., N=2), where each of the clusters 130₁-130₂ performs threat detection analytics on objects received for analysis. The threat detection analytics may be used to determine the likelihood (e.g., probability) of a received object 140 being malicious and/or part of a cyberattack based on behavioral analyses of the received object 140 (or of components processing the received object 140) during execution of the received object 140 and/or analyses of the content of the received object 140 without execution of the received object.

More specifically, according to one embodiment of the disclosure, a “computing node” (e.g., any of the computing nodes 110₁-110₆) may be implemented as a physical network device (e.g., server, computer, etc.) configured to analyze received objects 140 and determine whether the received objects 140 are part of a cyberattack. Alternatively, the computing node may be implemented as a virtual network device (e.g., software adapted to perform the functionality of the computing node), or a combination thereof. To analyze a received object 140, one of the computing nodes 110₁-110₄ (e.g., the first computing node 110₁) within a selected cluster (e.g., first cluster 130₁) is selected for conducting an in-depth analysis of the received object 140 based on a variety of factors, for example, processing capability of an object analysis engine 250₁ (see FIG. 2) deployed within the first computing node 110₁ (described below) and/or software profile of the first computing node 110₁. Otherwise, such processing may be handled by a different computing node 110₂-110₄ within the cluster 130₁.

As an illustrative example, the first cluster 130₁ is formed to include a first plurality of computing nodes 110₁-110₄ while a second cluster 130₂ is formed to include a second plurality of computing nodes 110₅-110₆. Besides threat analytics, each computing node within a cluster (e.g., cluster 130₁) may subscribe to a configuration service supported by the configuration management framework (hereinafter, “configuration management service”) to monitor and maintain the local configuration of these computing nodes. For clarity's sake, the following description may focus on the architecture and operations of one of the computing nodes 110₁-110₆ within a corresponding cluster that subscribes to the configuration management service, such as the first computing node 110₁ within the first cluster 130₁ for example, when describing the formation of the cluster and management of the computing node configurations within a cluster. It is contemplated, however, that the other computing nodes 110₂-110₆ may feature the same or substantially similar architecture as the first computing node 110₁ and perform similar operations to support the cluster services (e.g., object analytics, etc.) offered by their cluster 130₁ or 130₂.

Referring now to FIG. 2, each of the first plurality of computing nodes 110₁-110₄ forming the cluster 130₁, such as computing node 110₁ for example, may include one or more processors 205₁, one or more interfaces 210₁ and a memory 220₁ communicatively coupled together by one or more transmission mediums 230. The interface(s) 210₁ may operate as a network interface configured to receive the object 140 as well as communications from the management system 120, as described below. The computing nodes 110₁-110₄ may be deployed as physical network devices, although any or all of the first plurality of computing nodes 110₁-110₄ forming the cluster 130₁ may be virtualized and implemented as software modules that communicate with each other via a selected communication protocol.

For this illustrative example, the memory 220₁ may include a management engine 240₁, an object analysis engine 250₁, a local configuration 260₁, a reference (or golden) configuration 270, and/or credentials 280 to access a predefined API (hereinafter, “cluster management API” 290). The cluster management API 290 is structured to receive executable commands to form a cluster (create cluster) or destroy the formation of a cluster (delete cluster).

Herein, the local configuration 260₁ includes information associated with the setting, monitoring and/or modifying of properties that control operability of the first computing node 110₁. The local configuration 260₁ features (i) shared local configuration 262₁ representing one or more portions of configuration data directed to different functionality that is commonly shared by computing nodes operating within the same cluster 130₁ and (ii) private local configuration 264₁ representing configuration information that is specific to the particular computing node. The shared local configuration 262₁ is stored within a portion 222 of the memory 220₁ with shared access by any of the computing nodes 110₁-110₄ as well as the management system 120. This shared memory 222 operates as part of a “distributed data store,” where the logical representation of the shared access is illustrated in FIG. 1. The shared local configuration 262₁ is monitored by the management engine 240₁ for non-compliance with the reference configuration 270, which is the expected configuration for each of the computing nodes 110₁-110₄ within the cluster 130₁.

As further shown in FIG. 2, the object analysis engine 250₁, for illustrative purposes, includes logic that is capable of conducting an in-depth analysis of the received object 140 for cyber-security threats. For example, the object analysis engine 250₁ may include one or more virtual machines (hereinafter, “VM(s)” 255). Each of the VM(s) 255 may be provisioned with different guest image bundles that include a plurality of software profiles as represented by a different type of operating system (OS) and/or a different type and/or version of application program. Hence, the operability of computing node 110₁, in particular the object analysis engine 250₁ deployed therein, is at least partially based on its local configuration 260₁.

Referring to both FIGS. 1-2, the management engine 240₁, when executed, is capable of evaluating whether the shared local configuration 262₁is compliant (e.g., consistent) with the reference configuration 270. It is noted that, for the newly formed cluster 130₁, the shared local configuration 262₁, being a portion of the local configuration 260₁of the first computing node 110₁, may initially operate as the reference configuration 270 for the cluster 130₁. During operation, the shared local configuration 262₁and the reference configuration 270 may be altered.

Also, the management engine 240₁ is capable of evaluating, potentially through an automatically elected configuration lead node, whether shared local configurations 262₂-262₄ of other computing nodes 110₂-110₄ within its cluster 130₁ are compliant with the reference configuration 270. Compliance between the shared local configurations 262₁-262₄ and the reference configuration 270 improves correlation between results produced from object analysis engines 250₁-250₄ within the computing nodes 110₁-110₄ on identical or similar received objects 140. The management engine 240₁ is further responsible for inter-operations with one or more “neighboring” computing nodes (e.g., computing nodes 110₂-110₃), as described below.

Referring back to FIG. 1, focusing on the operations of the first cluster 130₁ for clarity's sake, the computing nodes 110₁-110₄ within the first cluster 130₁ may be located within the same sub-network (no routing between nodes). As shown, the computing nodes 110₁-110₄ may be positioned at various locations on a transmission medium 152 that is part of a network 150 (e.g., connected at various ingress points on a wired network or positioned at various locations for receipt of wireless transmissions) and receive objects included within traffic propagating over the transmission medium 152. The “traffic” may include an electrical transmission of certain objects, such as files, email messages, executables, or the like. For instance, each of the computing nodes 110₁, . . . , or 110₄ may be implemented as a standalone network device, as logic implemented within a network device, as logic integrated into a firewall, or as software running on a network device.

More specifically, according to one embodiment of the disclosure, the first computing node 110₁may be implemented as a network device (or installed within a network device) that is coupled to the transmission medium 152 directly (not shown) or is communicatively coupled with the transmission medium 152 via an interface 154 operating as a data capturing device. According to this embodiment, the data capturing device 154 is configured to receive the incoming data and subsequently process the incoming data, as described below. For instance, the data capturing device 154 may operate as a network tap (in some embodiments with mirroring capability) that provides at least one or more objects (or copies thereof) extracted from data traffic propagating over the transmission medium 152. Alternatively, although not shown, the first computing node 110₁may be configured to receive files or other objects automatically (or on command), accessed from a storage system.

As further shown in FIGS. 1-2, the computing nodes 110₁-110₄may be positioned in close proximity, perhaps within a server farm or facility. As described above, it is contemplated that any or all of clusters 130₁-130_N(e.g., first cluster 130₁and/or second cluster 130₂) may be virtualized and implemented as software, where the computing nodes 110₁-110₄are software modules that communicate with each other via any selected communication protocol (e.g., Gossip or other UDP-based protocol, etc.). For this virtualized deployment, one or more of the computing nodes within a cluster (e.g., the first computing node 110₁within the first cluster 130₁) may be implemented entirely as software for uploading into a network device and operating in cooperation with an operating system running on the network device. For this implementation, a software-based computing node is configured to operate in a manner that is substantially similar or identical to a hardware-based computing node.

Additionally, according to this embodiment of the disclosure, with respect to the first cluster 130₁, each of the computing nodes 110₁-110₄ is communicatively coupled to a distributed data store 170. The distributed data store 170 may be deployed as a separate data store to store at least the shared local configurations 262₁-262₄ and the reference configuration 270, which are accessible by the computing nodes 110₁-110₄. Alternatively, as shown, the distributed data store 170 may be provided as a collection of synchronized memories within the computing nodes 110₁-110₄ (e.g., data stores that collectively form distributed data store 170). Hence, the portion of memory (data store 222) may be configured to individually store the shared local configuration 262₁ for computing node 110₁ along with the reference configuration 270. The other synchronized data stores may be configured to individually store their corresponding shared local configurations 262₂-262₄ for computing nodes 110₂-110₄ along with the reference configuration 270.

Referring still to FIG. 1, the management system 120 assists in formation of each cluster 130₁, . . . , or 130_N (e.g., cluster 130₁), and after such formation, the management system 120 initiates operations to confirm shared local configuration consistency between the computing nodes 110₁-110₄. Also, the management system 120 maintains communications with the cluster 130₁ in support of cluster-based services. Stated differently, after formation of the cluster 130₁, the management system 120 is configured to discontinue communications with the computing nodes 110₁-110₄ on a per node basis; instead, the management system 120 communicates with the cluster 130₁ on a per cluster basis.

Referring now to FIG. 3, the management system 120 may include one or more processors 300, one or more interfaces 310 and a memory 320, which are communicatively coupled together by one or more transmission mediums 330. The management system 120 may be deployed as a type of physical network device, although the management system 120 may be virtualized and implemented as software modules that communicate with one or more clusters (e.g., cluster 130₁and 130₂) via a selected communication protocol.

For this illustrative example, the interface(s) 310 may operate as a network interface configured to access one or more distributed data stores (e.g., distributed data store 170) maintained within the clusters (e.g., clusters 130₁-130₂) managed by the management system 120. The processor 300 is a multi-purpose processing component as described above, which is configured to execute logic, such as a cluster formation engine 340 and a management client engine 350 for example, stored within a non-transitory storage medium operating as the memory 320. Herein, the memory 320 may further store cluster-based information related to how to access a particular cluster, such as a data store including a listing 360 of computing node addresses associated with a particular cluster and credentials 280 to access the cluster management API 290 (hereinafter, “API credentials 280”).

Referring to FIGS. 1-3, the cluster formation engine 340 is responsible for assisting in the formation of clusters, as described below. According to one embodiment of the disclosure, the cluster formation engine 340 receives a request for cluster creation from an authorized user via an administrative portal 370. The cluster creation request may include the API credentials 280, which may be subsequently stored within the management system 120 as shown above. Upon receipt of the cluster creation request via the administrative portal 370, the cluster formation engine 340 may initiate commands 380 to the cluster management API 290 to form and/or modify the computing node composition of the clusters 130₁-130₂.

The management client engine 350 is adapted to acquire configuration status (e.g., presence of any metadata identifying non-compliance of a shared local configuration with the reference configuration) from the distributed data stores maintained within the managed cluster(s). In particular, according to one embodiment of the disclosure, the management client engine 350 is responsible for periodically or aperiodically polling the distributed data stores (e.g., distributed data store 170 of FIG. 1) for configuration status. During this polling operation, responsive to detecting a configuration violation, the management client engine 350 provides a security administrator with access to data representative of the non-compliant configuration parameter(s) via the administrative portal 370 (pulling data). Alternatively, responsive to detecting a configuration violation, the management client engine 350 may initiate transmissions of data (e.g., alert or report) identifying the configuration violation (e.g., cluster ID, computing node name and/or IP address, inconsistent configuration parameters, etc.) to the security administrator via the administrative portal 370 (pushing data).
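The polling operation described above can be sketched as follows. This is an illustration only: the distributed data store is modeled as a plain dictionary keyed by computing node name, and the function name `poll_configuration_status` and the entry fields `ip` and `violations` are assumptions made for this example, not names from the disclosure.

```python
from typing import Any, Dict, List


def poll_configuration_status(cluster_id: str,
                              data_store: Dict[str, Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Scan each node's entry for non-compliance metadata and build alerts.

    Each entry is assumed to look like:
        {"ip": "<address>", "violations": {property: {"local": ..., "reference": ...}}}
    An empty or absent "violations" mapping means the node is compliant.
    """
    alerts = []
    for node_name in sorted(data_store):
        entry = data_store[node_name]
        violations = entry.get("violations") or {}
        if violations:
            alerts.append({
                "cluster_id": cluster_id,
                "node": node_name,
                "ip": entry.get("ip"),
                "inconsistent_parameters": dict(violations),
            })
    return alerts
```

The returned alerts could either be held for retrieval through the administrative portal (pulling data) or be transmitted to the security administrator (pushing data); the sketch leaves that choice to the caller.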

As an optional function, the management client engine 350 may attempt to remediate non-compliance. As an illustrative example, upon detecting the shared local configuration 262₁of the first computing node 110₁is non-compliant with the reference configuration 270, the management client engine 350 may retrieve repair instructions (not shown) from the distributed data store 170. Thereafter, the management client engine 350 may execute the repair instructions to overwrite the non-compliant data and return the shared local configuration 262₁into compliance with the reference configuration 270.

Additionally, the management client engine 350 is adapted to initiate updates to the reference configuration 270 of FIGS. 1-2 within a supported cluster (e.g., reference configuration 270 within the cluster 130₁). Such updates would cause each of the computing nodes 110₁-110₄within the cluster 130₁to update its shared local configuration 262₁-262₄, respectively. More specifically, upon updating the reference configuration 270, such as changing a setting, permission or a parameter (e.g., cluster ID, computing node, IP address, etc.), the management engines 240₁-240₄for each of the computing nodes 110₁-110₄, during their periodic or aperiodic evaluation, would detect that the shared local configurations 262₁-262₄for the computing nodes 110₁-110₄are non-compliant with the reference configuration 270. Depending on the remediation procedure selected, the management engines 240₁-240₄may alter the local configuration services to adjust their setting, permission or parameter to be consistent with the updates in the reference configuration 270. Alternatively, the management engines 240₁-240₄may prompt the non-compliant computing nodes 110₁-110₄to leave the cluster 130₁, as described below and illustrated in FIGS. 6A-6B.

Although not illustrated in detail, some or all of the logic forming the management system 120 may be located at an enterprise's premises (e.g., located as any part of the enterprise's network infrastructure whether located at a single facility utilized by the enterprise or at a plurality of facilities). As an alternative embodiment, some or all of the management system 120 may be located outside the enterprise's network infrastructure, generally referred to as public or private cloud-based services that may be hosted by a cybersecurity provider or another entity separate from the enterprise (service customer). Obtaining a high degree of deployment flexibility, embodiments of the management system 120 may also feature a “hybrid” solution, where some logic associated with the management system 120 may be located on premises and other logic of the management system 120 may operate as a cloud-based service. This deployment may be utilized to satisfy data privacy requirements that may differ based on access, use and storage of sensitive information (e.g., personally identifiable information “PII”) requirement for different geographical locations.

IV. Cluster Configuration Management

Referring to FIG. 4, a logical representation of an exemplary embodiment of a configuration management framework 400 deployed within the cyber-threat detection system 100 of FIG. 1 is shown. Herein, a plurality of clusters 130₁-130_Nmay be configured in accordance with the distributed, configuration management framework 400. As shown, for this embodiment of the disclosure, the configuration management framework 400 includes a management service 410 for a first cluster 130₁and another management service 412 for a second cluster 130₂, management engines 240₁-240₄deployed within corresponding computing nodes 110₁-110₄, and a management client engine 350 (e.g., virtual or physical logic) deployed within the management system 120 or deployed within each of the multiple management systems as shown (hereinafter, “management system(s) 120₁-120_K,” where K≥1).

According to one embodiment of the disclosure, the configuration management framework 400 is designed so each computing node 110₁-110₄ monitors and maintains compliance (e.g., consistency) between its shared local configuration 262₁-262₄ and the reference (golden) configuration 270. Each shared local configuration 262₁-262₄ partially controls operability of its computing node 110₁-110₄, and the reference configuration 270 represents the expected configuration for each of the computing nodes within the same cluster. Additionally, the configuration management framework 400 further controls interoperability of the computing nodes 110₁-110₄ by propagating modifications to the configuration data through each of the shared local configurations 262₁-262₄ in response to updating the reference configuration 270, for example.

More specifically, the management service 410 supports communications between the computing nodes 110₁-110₄and one or more management system(s) 120₁-120_Kin accordance with a selected messaging protocol. Similarly, management service 412 supports communications between the computing nodes within a second cluster 130₂and one or more management system(s) 120₁-120_K. As shown, the management service 410 may utilize the distributed data store 170, where changes made to the reference configuration 270 maintained within the distributed data store 170 may cause such changes to be propagated to the computing nodes 110₁-110₄being part of the configuration management framework 400, although some of the computing nodes belonging to the cluster 130₁(not shown) may operate separately and their local configuration is not monitored.

As an illustrative example, in some situations, the first computing device 110₁may initiate a request to modify the reference configuration 270. This request may be initiated upon completion of a successful testing phase of a modified shared local configuration 262₁, which has been permitted to be non-compliant with the reference configuration 270. Once the request has been authenticated (e.g., message relayed to a security administrator via the management system 120 to modify the reference configuration 270 has been approved), the first computing device 110₁modifies the reference configuration 270. The modification of the reference configuration 270 may be accomplished by overwriting a portion of the reference configuration 270 with the inconsistent configuration parameters or by overwriting the entire reference configuration 270 with the modified shared local configuration 262₁for example.

Upon completing the modification of the reference configuration 270, the shared local configurations 262₂-262₄of the computing nodes 110₂-110₄are now non-compliant (e.g., inconsistent) with the modified reference configuration 270. As a result, given that the computing nodes 110₂-110₄subscribe to the configuration management service, the modification of the reference configuration 270 prompts corresponding changes to the shared local configurations 262₂-262₄of the computing nodes 110₂-110₄to be made in order to remain compliant.

Additionally, each computing node 110₁ . . . 110₄ includes a management engine 240₁ . . . 240₄ being logic that monitors for compliance between the reference configuration 270 and each of the shared local configurations 262₁-262₄ maintained on the computing nodes 110₁ . . . 110₄. Upon detecting non-compliance between a shared local configuration (e.g., shared local configuration 262₂) and the reference configuration 270 for example, the management engine (e.g., management engine 240₂) updates that shared local configuration 262₂ stored within the distributed data store 170 by including the differences between the shared local configuration 262₂ and the reference configuration 270.

Periodically or aperiodically, at least one of the management system(s) 120₁-120_K polls the distributed data store 170 (or a configuration lead node) for the current state of one or more of the computing nodes 110₁-110₄ to uncover differences between any of the shared local configurations 262₁-262₄ and the reference configuration 270. Upon detecting non-compliance by any of the shared local configurations 262₁-262₄, the management system(s) 120₁-120_K may generate a display accessible to a security administrator that identifies the non-compliance and allows the security administrator to initiate administrative controls that (i) temporarily ignore the non-compliance, (ii) prompt the reference configuration 270 to alter the non-compliant shared local configurations 262₁ . . . and/or 262₄, or (iii) cause one of the computing nodes 110₁-110₄ (e.g., computing node 110₄) to initiate a leave command in efforts to remove the non-compliant, shared local configuration from the cluster 130₁. This enables greater flexibility in more detailed analysis of the content.

Furthermore, upon receipt of data for updating the local configuration for each of the computing nodes 110₁-110₄ (hereinafter, “local configuration update”) to be shared between computing nodes within the cluster 130₁, a management engine for the receiving computing node (e.g., management engine 240₂ of the computing node 110₂) propagates the data to its neighboring computing nodes (e.g., computing nodes 110₁ and 110₃). The “neighboring” computing nodes may be determined based, at least in part, on network coordinates or round-trip time with other computing nodes within the cluster. A “round-trip time” is a measured delay between transmission of a signal and return of a response signal from a responding computing node. A predetermined number of computing nodes with the lowest round-trip latency are considered to be the neighboring computing nodes for a particular computing node. The exchange occurs in an iterative manner, where the neighboring computing nodes may propagate the local configuration update.
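Neighbor selection by round-trip time and the iterative propagation of a local configuration update can be sketched as follows. The function names, the node labels, and the breadth-first traversal order are assumptions chosen for illustration; the disclosure does not prescribe a particular traversal.

```python
from collections import deque
from typing import Dict, List


def select_neighbors(round_trip_ms: Dict[str, float], count: int) -> List[str]:
    """Pick the `count` peers with the lowest measured round-trip time.

    Ties are broken by node name so the result is deterministic.
    """
    ranked = sorted(round_trip_ms, key=lambda node: (round_trip_ms[node], node))
    return ranked[:count]


def propagate_update(origin: str, neighbors_of: Dict[str, List[str]]) -> List[str]:
    """Return the order in which nodes receive an update starting at `origin`.

    Each node forwards the update to its own neighbors; a node that has
    already received the update does not receive it again.
    """
    received = {origin}
    order = [origin]
    pending = deque([origin])
    while pending:
        node = pending.popleft()
        for neighbor in neighbors_of.get(node, []):
            if neighbor not in received:
                received.add(neighbor)
                order.append(neighbor)
                pending.append(neighbor)
    return order
```

For example, with hypothetical measurements `{"n1": 2.0, "n3": 1.5, "n4": 9.0}` and a neighbor count of two, `select_neighbors` returns `["n3", "n1"]`.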

Referring now to FIG. 5, a logical representation of an exemplary embodiment of communications among components of the configuration management (system) framework 400 deployed within the cluster 130₁ of FIG. 1 is shown. Herein, the cluster 130₁ includes the computing nodes 110₁-110₄ with the distributed data store 170. As shown, the distributed data store 170 maintains the shared local configurations 262₁-262₄ for each of the computing nodes 110₁-110₄ and the reference configuration 270. Additionally, each of the management engines 240₁-240₄ within the computing nodes 110₁-110₄ periodically or aperiodically evaluates whether its shared local configuration 262₁-262₄ is compliant with the reference configuration 270. This evaluation is described below and operations of the management engine 240₁ of the first computing node 110₁ are illustrated in the logical representation of FIG. 5, although the management engines 240₂-240₄ associated with the computing nodes 110₂-110₄ within the cluster 130₁ would perform similar operations concurrently with or at least independent from the configuration management described below.

As shown, the management engine 240₁ of the first computing node 110₁ includes a management controller 500, a distributed data store (DDS) client 510, and one or more local management clients 520₁-520_R (R≥1) that are designed to access configuration data from corresponding local management services 530₁-530_R. More specifically, the local management services 530₁-530_R are services running on the first computing node 110₁ that are directed to operability of the first computing node 110₁. Stated differently, each of the local management services 530₁-530_R may maintain (store) shared local configuration data in different forms: plain text file, relational database, customized database such as an operating system management database. For each type of the configuration form, the management system 120 uses a client to read, write, or receive notification about changes to configuration data. Read operations are used by the client to monitor changes of the configuration values. For instance, a first local management service 530₁ may be directed to database management being performed by the first computing node 110₁ while a second local management service 530₂ may be directed to file management being performed by the first computing node 110₁.

According to one embodiment of the disclosure, each of these management services 530₁-530_R provides an API 540₁-540_R from which the corresponding local management clients 520₁-520_R may monitor for changes in shared local configuration data associated with these services. Upon detecting any changes in the shared local configuration data, such as a change to shared local configuration data associated with the first management service 530₁ for example (hereinafter, “changed configuration data 550”), the local management client 520₁ provides the changed configuration data 550 to the management controller 500. For instance, the local management client 520₁ may temporarily store the changed configuration data 550 until read by the management controller 500 during a polling operation. Alternatively, the local management client 520₁ may “push” the changed configuration data 550 to the management controller 500. Herein, the initial configuration file 555 includes descriptions that identify what properties of the shared local configuration (formed by local management services 530₁-530_R) and/or the reference configuration 270 should be monitored.

The management controller 500 compares the changed configuration data 550 to a portion of the reference configuration 270, which is received from the distributed data store 170 via the DDS client 510. In the event that the changed configuration data 550 is inconsistent with the portion of the reference configuration 270, rendering the shared local configuration data 262₁non-compliant with the reference configuration 270, the management controller 500 may be configured to address this non-compliance in accordance with any number of configuration enforcement schemes. For instance, according to one embodiment of the disclosure, the management controller 500 may be configured to automatically return the shared local configuration data 262₁back to its prior state upon detecting that it is non-compliant with the current reference configuration 270. Alternatively, the management controller 500 may mark the first computing node 110₁as not healthy, so the node 110₁will not be used for processing, or the node 110₁could be detached from the cluster.
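The automatic enforcement schemes described above can be sketched as follows. The names `Enforcement`, `NodeState`, and `enforce` are hypothetical. As a simplifying assumption, "returning to the prior state" is modeled as restoring the value held in the reference configuration, which presumes the node was compliant before the change.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any, Dict, List


class Enforcement(Enum):
    REVERT = "revert"                  # restore the compliant value automatically
    MARK_UNHEALTHY = "mark_unhealthy"  # stop using the node for processing


@dataclass
class NodeState:
    shared: Dict[str, Any]  # shared local configuration of this node
    healthy: bool = True


def enforce(state: NodeState, changed: Dict[str, Any],
            reference: Dict[str, Any], scheme: Enforcement) -> List[str]:
    """Compare changed properties to the reference and apply one scheme.

    Returns the sorted names of the non-compliant properties (empty if none).
    """
    non_compliant = [k for k, v in changed.items()
                     if k not in reference or reference[k] != v]
    if not non_compliant:
        return []
    if scheme is Enforcement.REVERT:
        for key in non_compliant:
            if key in reference:
                state.shared[key] = reference[key]
            else:
                state.shared.pop(key, None)  # property unknown to the reference
    elif scheme is Enforcement.MARK_UNHEALTHY:
        state.healthy = False
    return sorted(non_compliant)
```

Detaching the node from the cluster, the third option mentioned above, is not covered by this sketch.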

Alternatively, the management controller 500 may be configured to perform operations that enable a security administrator to address the configuration violation, which may include providing recommendations or reporting on remedial actions via an administrative portal or transmitted alert. Herein, the management controller 500 may generate representative data 560 of the non-compliance and upload the representative data 560 for storage with the shared local configuration data 262₁ in the distributed data store 170. Upon monitoring the distributed data store 170, the management system 120 detects a change in the shared local configuration data 262₁ (i.e., the addition of the representative data 560) and reports the configuration violation to a security administrator, who determines how to proceed. The security administrator may return a message instructing the management system 120 to signal the management controller 500 to (i) ignore the inconsistent shared local configuration data 262₁ for now, (ii) return the changed configuration data 550 back to its prior state in compliance with the reference configuration 270, or (iii) alter the reference configuration 270 with the changed configuration data 550.
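The three administrator responses listed above can be sketched as a small dispatch routine. The names `Decision` and `apply_decision` are hypothetical, the configurations are modeled as plain dictionaries, and the "prior state" is again assumed to be the value held in the reference configuration.

```python
from enum import Enum
from typing import Any, Dict


class Decision(Enum):
    IGNORE = "ignore"                    # (i) leave the inconsistency in place for now
    REVERT = "revert"                    # (ii) return changed data to its compliant state
    ALTER_REFERENCE = "alter_reference"  # (iii) adopt the change in the reference


def apply_decision(decision: Decision, shared: Dict[str, Any],
                   reference: Dict[str, Any], changed: Dict[str, Any]) -> None:
    """Apply the administrator's decision to the configurations in place."""
    if decision is Decision.IGNORE:
        return
    if decision is Decision.REVERT:
        for key in changed:
            if key in reference:
                shared[key] = reference[key]
            else:
                shared.pop(key, None)  # property does not exist in the reference
    elif decision is Decision.ALTER_REFERENCE:
        reference.update(changed)
```

After `ALTER_REFERENCE`, the other computing nodes in the cluster would become non-compliant with the modified reference configuration until they adopt the same change.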

According to another embodiment of the disclosure, the management system 120 may alter the reference configuration 270. The DDS client 510 is configured to monitor the distributed data store 170, notably the reference configuration 270, for changes. Upon detecting changes in data associated with the reference configuration 270 (hereinafter, “changed reference configuration data 570”), the DDS client 510 provides the changed reference configuration data 570 to the management controller 500. For instance, the DDS client 510 may temporarily store the changed reference configuration data 570 until read by the management controller 500 during a polling operation. Alternatively, the DDS client 510 may “push” the changed reference configuration data 570 to the management controller 500.
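The two delivery modes described above, temporary storage until polled and "push" delivery, can be sketched with a single client class. The class name `DDSClient` and its methods are illustrative assumptions; this sketch also detects only added or modified properties, not deleted ones.

```python
from typing import Any, Callable, Dict, List, Optional

_MISSING = object()  # sentinel for a property not seen before


class DDSClient:
    """Hypothetical client that watches the reference configuration for changes."""

    def __init__(self, on_change: Optional[Callable[[Dict[str, Any]], None]] = None):
        self._on_change = on_change  # set for push mode, None for polling mode
        self._last_seen: Dict[str, Any] = {}
        self._pending: List[Dict[str, Any]] = []

    def observe(self, reference: Dict[str, Any]) -> None:
        """Compare a fresh snapshot with the previous one and record changes."""
        changed = {k: v for k, v in reference.items()
                   if self._last_seen.get(k, _MISSING) != v}
        self._last_seen = dict(reference)
        if not changed:
            return
        if self._on_change is not None:
            self._on_change(changed)       # push to the management controller
        else:
            self._pending.append(changed)  # hold until the controller polls

    def read_pending(self) -> List[Dict[str, Any]]:
        """Return and clear the changes held for a polling controller."""
        pending, self._pending = self._pending, []
        return pending
```

A controller that polls would call `read_pending()` on its own schedule, while a controller that prefers push delivery would pass its handler as `on_change`.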

The management controller 500 compares the changed reference configuration data 570 to the shared local configuration data 262₁, which is gathered from the local management services 530₁-530_R by the local management clients 520₁-520_R. Upon detecting that the shared local configuration data 262₁ is non-compliant with the reference configuration 270, the management controller 500 may be configured to alter the shared local configuration data 262₁ to be consistent with the reference configuration 270 as one type of remedial action (remediation). Another remedial action, where ease and/or timeliness in dealing with the misconfiguration of a computing node is a primary concern, may cause the computing node to leave the cluster, as illustrated in FIGS. 6A-6B. Removal of the computing node may be initiated by command (while the computing node is healthy or addressing a misconfigured computing node) or upon request of the misconfigured computing node.

Referring now to FIG. 6A, a block diagram of an exemplary embodiment of the cluster formation in which a particular computing node (e.g., first computing node 110₁) leaves the cluster 130₁ is shown. For this embodiment, to leave the cluster 130₁, the first computing node 110₁ may issue a leave command 600 to the cluster management API 290, which may be provided by another computing node within the cluster 130₁ (e.g., second computing node 110₂) operating as a proxy to cluster-wide services provided by the cluster 130₁. The issuance of the leave command 600 by the first computing node 110₁ may be based on non-compliance (e.g., inconsistency) between the shared local configuration 262₁ of the first computing node 110₁ and the reference configuration 270, where the shared local configuration 262₁ may require immediate removal or more in-depth analysis before being placed into compliance with the reference configuration 270 (e.g., more requisite time needed than allotted to cure non-compliance, potential tampering of the provisioning of the first computing node 110₁, etc.). As a result, in response to the leave command 600 from the first computing node 110₁, the cluster management API 290 will remove the IP address of the first computing node 110₁ from its listing of computing nodes forming the first cluster 130₁ and the first computing node 110₁ will cause removal of the shared local configuration data 262₁ from the distributed data store 170.

Referring now to FIG. 6B, a block diagram of an exemplary embodiment of the cluster formation controlled by the first computing node 110₁, operating as the configuration lead node, in causing another computing node (e.g., third computing node 110₃) to leave the cluster 130₁ is shown. As described, each computing node (e.g., computing node 110₁) may acquire the shared local configurations 262₂, 262₃ and 262₄ associated with each corresponding computing node 110₂, 110₃, and 110₄ within its cluster 130₁. As a result, the first computing node 110₁ is able to identify non-compliance (e.g., inconsistency) between configuration data associated with a shared local configuration 262₂, 262₃ or 262₄ (e.g., shared local configuration 262₃) and configuration data of the reference configuration 270. In some cases, the uncovered non-compliance may signify that the third computing node 110₃ is non-responsive (e.g., failed), which may cause the first computing node 110₁ to generate the leave command 620. Unlike the leave command 600 of FIG. 6A, the leave command 620 would identify the third computing node 110₃ as the entity to be removed from the cluster 130₁.

As a result, in response to the leave command 620 from the first computing node 110₁, the cluster management API 290 will remove the IP address of the third computing node 110₃from its listing of computing nodes forming the first cluster 130₁and the first computing node 110₁will cause removal of the shared local configuration data 262₃of the third computing node 110₃from the distributed data store 170.
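The effect of the two leave commands of FIGS. 6A-6B can be sketched as follows. The class `ClusterManagementAPI` and the function `issue_leave` are hypothetical stand-ins; the actual commands and signatures of the cluster management API 290 are not specified here. The node listing and the distributed data store are modeled as dictionaries.

```python
from typing import Any, Dict, Optional


class ClusterManagementAPI:
    """Hypothetical stand-in that keeps the listing of node name -> IP address."""

    def __init__(self, members: Dict[str, str]):
        self.members = dict(members)

    def leave(self, node_name: str) -> Optional[str]:
        """Remove a node from the listing; return its IP, or None if unknown."""
        return self.members.pop(node_name, None)


def issue_leave(api: ClusterManagementAPI, data_store: Dict[str, Any],
                issuer: str, target: Optional[str] = None) -> Optional[str]:
    """Process a leave command.

    With no target, the issuer removes itself (as in FIG. 6A). With a target,
    the issuer, acting as configuration lead node, removes another node (FIG. 6B).
    In both cases the removed node's shared local configuration is deleted
    from the data store.
    """
    node_to_remove = issuer if target is None else target
    removed_ip = api.leave(node_to_remove)
    data_store.pop(node_to_remove, None)
    return removed_ip
```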

Referring to FIG. 7, a flow diagram of the operations conducted by the configuration management system of FIG. 5 is shown. Herein, one or more computing nodes within a cluster subscribe to or operate in conjunction with a configuration management service (item 700). Each of the computing nodes (e.g., the management engine within each) monitors for a change in the reference configuration (items 710 and 715). Responsive to detection of an authorized change in the reference configuration, each of the computing nodes determines whether its shared local configuration is compliant with the reference configuration (items 720-725). Such a determination may involve a management controller within each computing node comparing configuration data within the reference configuration to corresponding configuration data within its shared local configuration. For each computing node, if the shared local configuration is non-compliant (e.g., inconsistent) with the reference configuration, the management controller modifies the shared local configuration to be compliant with the reference configuration (item 730). If the shared local configuration is compliant with the reference configuration, the configuration management analysis ends (item 740).

Additionally, or in the alternative, each of the computing nodes monitors for changes in its corresponding shared local configuration (items 750-755). Responsive to detection of a change in the shared local configuration within one of the computing nodes (hereinafter, “detecting computing node”), the management controller within the detecting computing node compares the configuration data within the shared local configuration to corresponding configuration data within the reference configuration (items 760-765). If the shared local configuration is non-compliant with the reference configuration, the management controller stores the inconsistent configuration data into the distributed data store to be accessed by the management system (item 770).

Thereafter, if the change to the shared local configuration is authorized, the management controller may receive instructions to (i) ignore alteration of the changed shared local configuration data at this time or (ii) alter the reference configuration with the changed shared local configuration data (items 780, 782, 784). If the change is unauthorized or no update to the reference configuration is desired, the management controller of the non-compliant computing node may receive instructions to alter the shared local configuration data and return the changed shared local configuration data back to its prior state and in compliance with the reference configuration (item 790). If the shared local configuration is in compliance with the reference configuration, the configuration management analysis ends (item 795).

In the foregoing description, the invention is described with reference to specific exemplary embodiments thereof. However, it will be evident that various modifications and changes may be made thereto without departing from the broader spirit and scope of the invention as set forth in the appended claims.

Claims

What is claimed is:

1. A computing node deployed within a cluster including a data store maintaining at least a reference configuration representing an expected configuration for each computing node within the cluster, the computing node comprising:

one or more processors;

a memory communicatively coupled to the one or more processors, the memory comprises

a first client that, when executed by the one or more processors, is configured to obtain data associated with the reference configuration,

a second client that, when executed by the one or more processors, is configured to obtain data associated with a shared local configuration for the computing node maintained within a first local management service via an application programming interface (API) provided by the first local management service, wherein the data associated with the shared local configuration corresponds to a first portion of local configuration data that is associated with functionality commonly shared by a plurality of computing nodes forming the cluster and is separate from private local configuration data, the private local configuration data, corresponding to a second portion of the local configuration data, pertains to functionality specific to the computing node, the plurality of computing nodes comprises the computing node, and

a management controller that, when executed by the one or more processors, is configured to detect when the shared local configuration is non-compliant with the reference configuration and notify an administrator upon detecting that the shared local configuration is non-compliant with the reference configuration,

wherein the shared local configuration is non-compliant with the reference configuration when a difference between data associated with a property maintained as part of the shared local configuration and the data associated with a property maintained as part of the reference configuration exceeds a tolerance based on the property type.

2. The computing node of claim 1, wherein the first client is configured to monitor the data store for changes to the reference configuration and the second client is configured to monitor for changes to the shared local configuration.

3. The computing node of claim 2, wherein the data associated with the reference configuration is obtained from the data store.

4. The computing node of claim 2, wherein the management controller is configured to notify the administrator upon detecting that a portion of the data associated with the shared local configuration is non-compliant with a corresponding portion of the data associated with the reference configuration by at least uploading the portion of the data associated with the shared local configuration into the data store for subsequent access and analysis by the administrator.

5. The computing node of claim 2, wherein the second client, when executed by the one or more processors, obtains the data associated with the shared local configuration from one or more local management services running on the computing node.

6. The computing node of claim 5, wherein the second client, being a local management client, accesses the API provided by the first local management service of the one or more local management services to obtain a portion of the data associated with the shared local configuration to analyze for inconsistencies with a corresponding portion of the data associated with the reference configuration.

7. The computing node of claim 1, wherein the first client monitors for changes in the reference configuration stored within the data store and the second client monitors for changes in the local management services running on the computing node.

8. The computing node of claim 1, wherein the first client, when executed by the one or more processors, is further configured to obtain data associated with shared local configurations associated with a second computing node different than the computing node.

9. The computing node of claim 8, wherein the management controller, when executed by the one or more processors, is further configured to detect when the shared local configuration of the second computing node is non-compliant with the reference configuration and notify the administrator upon detecting that the shared local configuration of the second computing node is non-compliant with the reference configuration.

10. The computing node of claim 1, wherein the tolerance allows for prescribed differences between data associated with the property maintained as part of the shared local configuration and the data associated with the property maintained as part of the reference configuration.

11. A system for maintaining node configuration consistency throughout a cluster configured for cyber-threat detection, the system comprising:

a data store for storage of at least data associated with a reference configuration;

a management system communicatively coupled to the data store; and

one or more computing nodes communicatively coupled to the data store, the one or more computing nodes including a first computing node that comprises a management engine configured to (i) obtain the data associated with the reference configuration from the data store, (ii) obtain a first portion of shared local configuration data for the first computing node maintained within a first local management service via an application programming interface (API) provided by the first local management service, (iii) detect when the first portion of the shared local configuration data is non-compliant with the data associated with the reference configuration, and (iv) upload information associated with the non-compliant data associated with the first portion of the shared local configuration data into the data store,

wherein the first computing node is configured to remove itself from the cluster based on the first portion of the shared local configuration data being non-compliant with the data associated with the reference configuration.

12. The system of claim 11, wherein

the first computing node maintains local configuration data including the shared local configuration data being a first portion of the local configuration data associated with functionality shared by the one or more computing nodes and a second portion of the local configuration data associated with functionality specific to a particular computing node of the one or more computing nodes, and

the management system further to provide the first computing node with administrative controls to modify the shared local configuration data to be compliant with the data associated with the reference configuration.

13. The system of claim 11, wherein

the management engine of the first computing node to monitor the data store for changes to the data associated with the reference configuration and to monitor for changes to the shared local configuration data.

14. The system of claim 13, wherein the management engine of the first computing node is configured to notify the administrator upon detecting that the first portion of the shared local configuration data is non-compliant with the data associated with the reference configuration by at least uploading the first portion of the shared local configuration data into the data store for subsequent access and analysis by the administrator.

15. The system of claim 11, wherein the management engine of the first computing node to monitor for changes in the data associated with the reference configuration stored within the data store and monitor for changes in local management services, including the first local management service, running on the first computing node.

16. The system of claim 11, wherein the management engine of the first computing node is further configured to obtain a second portion of the data associated with shared local configuration data associated with a second computing node different than the first computing node.

17. The system of claim 11, wherein the management engine to detect when the shared local configuration data is non-compliant with the data associated with the reference configuration when data associated with a property maintained as part of the shared local configuration data is different than the data associated with a property maintained as part of the reference configuration.

18. The system of claim 11, wherein the management system is deployed as a cloud service in which the shared local configuration data being the first portion of the local configuration data associated with functionality shared by the one or more computing nodes with access to the cloud service.

19. The system of claim 18, wherein the one or more computing nodes are deployed within a public cloud including the management system operating as the cloud service.

20. The system of claim 11, wherein the first computing node, upon detecting a configuration violation in which the shared local configuration data is non-compliant with the data associated with the reference configuration, uploading the information by making the shared local configuration data available to the administrator via a portal.

21. The system of claim 20, wherein the configuration violation is handled by an automated system relying on preconfigured rule sets, one or more machine learning models, or an artificial neural network.

22. The system of claim 20, wherein the configuration violation is handled through a remediation using repair instructions to return the shared local configuration back to a prior state that is in compliance with the reference configuration.

23. The system of claim 20, wherein the particular computing node corresponds to the first computing node.

24. A computerized method for monitoring local configurations of computing nodes forming a cluster including a plurality of computing nodes that are configured to collectively perform cyber-threat detection analytics on received objects to determine if the received objects are malicious or non-malicious, the computerized method comprising:

obtaining data associated with a reference configuration;

obtaining data associated with a shared local configuration for a computing node of the plurality of computing nodes forming the cluster, wherein (i) the data associated with the shared local configuration is maintained within a first local management service accessible via an application programming interface (API) provided by the first local management service and (ii) the shared local configuration corresponds to a first portion of local configuration data associated with functionality shared by the plurality of computing nodes including the computing node while a private local configuration corresponding to a second portion of the local configuration data pertains to functionality specific to the computing node,

detecting when the shared local configuration is non-compliant with the reference configuration; and

notifying an administrator upon detecting that the shared local configuration is non-compliant with the reference configuration.

25. The computerized method of claim 24, wherein the computing node is configured to monitor for changes to the shared local configuration and obtain the data associated with the shared local configuration and the data associated with the reference configuration responsive to a change to the shared local configuration.

26. The computerized method of claim 25, wherein the data associated with the reference configuration is obtained from the data store.

27. The computerized method of claim 24, wherein the notifying of the administrator upon detecting that a portion of the data associated with the shared local configuration is non-compliant with a corresponding portion of the data associated with the reference configuration by at least uploading the portion of the data associated with the shared local configuration into a data store for subsequent access and analysis by the administrator.

28. The computerized method of claim 24, wherein the detecting whether the shared local configuration is non-compliant with the reference configuration is conducted within a public cloud network.

29. The computerized method of claim 28, wherein the obtaining of the data associated with the shared local configuration for the computing node of the cluster further comprising obtaining shared local configuration from each of the plurality of computer nodes of the cluster other than the computing node, wherein the shared local configuration corresponds to data associated with functionality shared by the plurality of computing nodes operating in the public cloud network.

30. The computerized method of claim 28, wherein the obtaining of the data associated with the shared local configuration for the computing node of the cluster further comprising obtaining shared local configuration from each of the plurality of computer nodes of the cluster other than the computing node, wherein the shared local configuration corresponds to data associated with functionality shared by the plurality of computing nodes accessing the management system deployed within the public cloud network.

31. The computerized method of claim 24, wherein the notifying of the administrator, upon detecting a configuration violation in which the shared local configuration is non-compliant, comprises providing access to the shared local configuration via a portal.

32. The computerized method of claim 31, wherein the configuration violation is handled by an automated system relying on preconfigured rule sets, one or more machine learning models, or an artificial neural network.

33. The computerized method of claim 31, wherein the configuration violation is handled through a remediation using repair instructions to return the shared local configuration back to a prior state that is in compliance with the reference configuration.
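Claims 1 and 10 recite non-compliance in terms of a difference that exceeds a tolerance based on the property type. One possible reading is sketched below purely for illustration. The property names, the type labels, the tolerance values, and the rule that non-numeric values must match exactly are all assumptions of this sketch, not limitations drawn from the claims.

```python
from numbers import Real
from typing import Any

# Hypothetical mapping from property name to property type.
PROPERTY_TYPES = {"scan_timeout_s": "duration", "ntp_server": "address"}

# Hypothetical tolerance per property type; unlisted types tolerate no difference.
TOLERANCE_BY_TYPE = {"duration": 5.0, "address": 0}


def is_noncompliant(name: str, local_value: Any, reference_value: Any) -> bool:
    """Return True when the difference exceeds the tolerance for the property type."""
    tolerance = TOLERANCE_BY_TYPE.get(PROPERTY_TYPES.get(name), 0)
    if isinstance(local_value, Real) and isinstance(reference_value, Real):
        # Numeric properties: compare the absolute difference with the tolerance.
        return abs(local_value - reference_value) > tolerance
    # Non-numeric properties: any difference counts as non-compliant.
    return local_value != reference_value
```

Under these assumed tolerances, a `scan_timeout_s` of 33 against a reference of 30 is treated as compliant, while a value of 40 is not.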

Resources

Images & Drawings included:

Fig. 01 through Fig. 08 - System and method for distributed cluster configuration monitoring and management

Sources:

United States Patent and Trademark Office (USPTO)
