Patent application title:

HVAC CONTROLLER PARAMETER DETERMINATION BASED ON SIGNAL STABILIZATION

Publication number:

US20260016182A1

Publication date:

2026-01-15

Application number:

18/968,062

Filed date:

2024-12-04

Smart Summary: Methods and systems are designed to improve how HVAC controllers operate by determining the best settings for them. First, they simulate the environment where the controller works and analyze its output to find the right parameters. They can also use real data from the actual environment, including signals from other controllers and feedback from the system. This data is assessed to refine the controller's settings further. The goal is to make the HVAC controller more stable and efficient in responding to temperature or other control signals.

Abstract:

Methods and systems for generating parameters and loading them into an HVAC controller are described herein. Setpoint functions are input into a simulation of an environment of the controller. An output of the controller within the simulation is evaluated, using an objective function based on oscillations of the output of the controller, to determine the parameters, which are then loaded into the controller. Alternatively or additionally, data corresponding to an environment in which the controller is implemented is received. The data includes information about a setpoint signal received from an external controller and feedback from a physical system controlled by the controller. The data is evaluated using a reward function to determine parameters, which are then loaded into, or used to augment existing parameters within, the controller. By using the techniques herein, the controller may be configured to stabilize the setpoint signal received from the external controller.

Inventors:

Stefan MISCHLER 15 🇨🇭 Wald, Switzerland
Volkher SCHOLZ 6 🇨🇭 Zürich, Switzerland
Babak MOHAJER 1 🇨🇭 Zurich, Switzerland
Neelaksh SINGH 1 🇩🇪 Tübingen, Germany

Joram LIEBESKIND 1 🇨🇭 Winterthur, Switzerland

Assignee:

BELIMO HOLDING AG 122 🇨🇭 Hinwil, Switzerland

Applicant:

BELIMO HOLDING AG 🇨🇭 Hinwil, Switzerland

Classification:

F24F11/64 » CPC main

Control or safety arrangements characterised by the type of control or by internal processing, e.g. using fuzzy logic, adaptive control or estimation of values; Electronic processing using pre-stored data

F24F11/46 » CPC further

Control or safety arrangements for purposes related to the operation of the system, e.g. for safety or monitoring; Improving electric energy efficiency or saving

Description

FIELD

This disclosure is directed to systems and methods for generating one or more parameters for a heating, ventilation, and air conditioning (HVAC) controller and for tuning an HVAC controller to stabilize a setpoint signal received by the HVAC controller from an external HVAC controller.

BACKGROUND

Control systems are used in many applications to manage behaviors of other devices or systems. For example, control systems are often used to control environmental conditions (e.g., temperature, humidity, air quality, air composition, etc.) within buildings or other spaces. The control systems usually receive indications of existing environmental conditions, e.g., from one or more sensors, and compare them with a particular setpoint to control equipment (e.g., dampers, valves, heaters, chillers, air handlers, etc.) to adjust or affect the environmental conditions.

Many control systems include cascaded control loops, where one or more controllers are controlled by setpoint signals (e.g., setpoint values over time) received from one or more external controllers. For example, a flow controller for a hydronic heating system may be controlled via a setpoint signal received from a thermostat. The one or more controllers and the one or more external controllers are coupled because outputs produced by one or more plants (e.g., physical systems) controlled by the one or more controllers (e.g., heaters, hydraulic circuits, valves, or dampers) become feedback for the one or more external controllers. Often, coupling between the one or more controllers and the one or more external controllers is very nonlinear, making such systems hard to predict and/or the one or more controllers difficult to tune. Furthermore, the setpoint signals received by the one or more controllers are often unstable (e.g., they may include many oscillations and/or direction changes) and/or are rapidly changing, which may lead to inefficient systems, premature wear of components (e.g., dampers, valves, or actuators), and/or decreased user satisfaction.

SUMMARY

A method of generating one or more parameters for an HVAC controller is described herein. The method includes receiving one or more setpoint functions and inputting the setpoint functions into a simulation of an HVAC controller environment. The method also includes evaluating an objective function on the simulation of the HVAC controller environment. The objective function is based on oscillations of an output of the HVAC controller in the simulation. The method further includes determining, based on the evaluating the objective function, the parameters for the HVAC controller. The method also includes loading the parameters into the HVAC controller.

A method of tuning an HVAC controller to stabilize a setpoint signal received by the HVAC controller from an external HVAC controller is also described herein. The method includes receiving data corresponding to an HVAC controller implementation environment. The data includes information about the setpoint signal and feedback from an HVAC physical system controlled by the HVAC controller. The method also includes evaluating a reward function based on the data and determining, based on the evaluating the reward function, a set of parameters for the HVAC controller. The method further includes loading the parameters into the HVAC controller.

Optimization systems configured to perform the methods discussed above are also described herein. Each of the optimization systems may be included within a computing device remote to the controller, a cloud computing device, the controller, or some combination thereof.

The foregoing summary is illustrative only and is not intended to be in any way limiting. In addition to the illustrative aspects, embodiments, and features described above, further aspects, embodiments, and features will become apparent by reference to the drawings and the following detailed description. In the drawings, like reference numbers indicate identical or functionally similar elements.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 illustrates an example of a cascaded control loop where HVAC controller parameter determination based on signal stabilization may be used.

FIG. 2 illustrates an example of the cascaded control loop of FIG. 1 used in an HVAC application.

FIG. 3 illustrates an example of an offline process of HVAC controller parameter determination based on signal stabilization.

FIG. 4 illustrates an example of an online process of HVAC controller parameter determination based on signal stabilization.

FIG. 5 illustrates an example of an offline optimization system configured to perform HVAC controller parameter determination based on signal stabilization.

FIG. 6 illustrates an example of an online optimization system configured to perform HVAC controller parameter determination based on signal stabilization.

FIG. 7 illustrates a flow chart of an example offline method of HVAC controller parameter determination based on signal stabilization.

FIG. 8 illustrates a flow chart of an example online method of HVAC controller parameter determination based on signal stabilization.

DETAILED DESCRIPTION

Overview

Controllers use many different control strategies. A commonly used control strategy, for example, is proportional-integral-derivative (PID) control. A PID controller works by adjusting the input according to the current error, where the error is defined as the difference between a desired and an actual output. For instance, in the situation of heating or cooling a room, the error is the difference between a desired temperature and a measured temperature (e.g. via a temperature sensor). The input can be adjusted proportionally to the current error (P), proportional to the historical past of the error (I), or proportional to the rate of change of the error (D). However, PID control is only one type of control strategy, and different controllers have different error optimization objectives depending on the type of control strategy deployed.
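As an illustration of the P, I, and D contributions described above, the following is a minimal sketch of a textbook discrete-time PID update. It is not taken from the disclosure: the class name `PID`, the gain values, and the fixed time step `dt` are assumptions chosen only for the example.

```python
from dataclasses import dataclass


@dataclass
class PID:
    """Textbook discrete-time PID controller (illustrative sketch only)."""
    kp: float                 # proportional gain (assumed value in the example)
    ki: float                 # integral gain (assumed value in the example)
    kd: float                 # derivative gain (assumed value in the example)
    dt: float                 # fixed sampling interval (assumption)
    integral: float = 0.0
    prev_error: float = 0.0

    def step(self, setpoint: float, measured: float) -> float:
        error = setpoint - measured                       # desired minus actual output
        self.integral += error * self.dt                  # accumulated past error (I)
        derivative = (error - self.prev_error) / self.dt  # rate of change of error (D)
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


# Example: desired room temperature 21.0, measured temperature 19.0.
pid = PID(kp=2.0, ki=0.5, kd=0.1, dt=1.0)
u = pid.step(setpoint=21.0, measured=19.0)
```

The gains `kp`, `ki`, and `kd` are the kind of controller parameters that the techniques described herein aim to determine.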

Unless otherwise indicated, the term controller(s) refers to internal controller(s) of a cascaded control loop. Often, controllers and external controllers within cascaded control loops (e.g., many HVAC control systems) are non-linearly coupled together, making such environments hard to model and/or the controllers difficult to tune. Furthermore, setpoint signals received by controllers are often unstable, which may lead to inefficient systems, premature wear of components, and/or decreased user satisfaction.

Methods and systems for generating parameters (e.g., optimized parameters) and loading them into an HVAC controller are described herein. Setpoint functions (e.g., inputs to an external controller) are input into a simulation of an environment of the controller. An output of the controller within the simulation is evaluated, using an objective function based on oscillations of the output of the controller, to determine the parameters, which are then loaded into the controller.

Alternatively or additionally, data corresponding to an environment in which the controller is implemented is received. The data includes information about a setpoint signal received from an external controller and feedback from a physical system controlled by the controller. The data is evaluated using a reward function to determine parameters (e.g., optimized parameters), which are then loaded into, or used to augment existing parameters (e.g., those determined using the technique above or determined using another technique) within the controller.

Both of the above techniques, alone or in combination, produce parameters configured to cause the controller to control a physical system (also known in the art as a ‘plant’) such that the setpoint signal received by the controller from the external controller is more stable (e.g., has less oscillation and/or fewer direction changes). In so doing, the control loop of the controller may be better optimized, thereby leading to better efficiency of the physical system, less wear of components within the physical system, and/or increased user satisfaction.

In the following description, numerous specific details are set forth, such as particular structures, components, materials, dimensions, processing steps and techniques, in order to provide an understanding of the various embodiments of the present disclosure. However, it will be appreciated by one of ordinary skill in the art that the various embodiments of the present application may be practiced without these specific details. In other instances, well-known structures or processing steps have not been described in detail in order to avoid obscuring the present disclosure.

Example Cascaded Control Loops

Cascade control is a form of feedback control that uses a specific arrangement of two or more control loops to control a process, e.g. a physical system. FIG. 1 illustrates an example of a cascaded control loop 100 where HVAC controller parameter determination based on signal stabilization may be used. A loop, as used herein, refers to a closed loop or feedback loop. The cascaded control loop 100 includes an inner feedback control loop 104 nested within an external feedback control loop 102. The external feedback control loop 102 and the inner feedback control loop 104 are coupled to form the cascaded control loop 100. For example, a setpoint 106 from the external feedback control loop 102 is used as an input for the inner feedback control loop 104, and an external output 108 output by the cascaded control loop 100 is fed back to the external feedback control loop 102.

An external setpoint 110 (e.g., a temperature setting, a pressure setting, a humidity setting, a flow setting, or a composition setting) corresponding to a desired target for the external output 108 (e.g., temperature, pressure, humidity, flow, or composition) is received by an external error generator 112. The external error generator 112 produces an external setpoint error 118 by comparing an external value 114 (e.g., temperature, pressure, humidity, flow, or composition) received from an external sensor 116 (e.g., temperature sensor, pressure sensor, humidity sensor, flow sensor, or composition sensor) that measures the external output 108 to the external setpoint 110.

The external setpoint error 118 is received by an external controller 120 (e.g., process controller, proportional, integral, and/or derivative (P, I, D, PI, PD, ID, or PID) controller, thermostat, or humidistat) which outputs the setpoint 106 (e.g., valve setting, electrical setting, motor setting, or physical setting) based on an external control law or algorithm (not shown in FIG. 1). It will be appreciated that any portions of the external feedback control loop 102 other than the external controller 120 may be part of the external controller 120. For example, the external setpoint 110 may be received by the external controller 120. Similarly, the external sensor 116 and/or the external error generator 112 may be part of the external controller 120.

The setpoint 106 output by the external controller 120 is received by an error generator 122 that produces a setpoint error 130 by comparing a value 124 (e.g., valve setting, electrical setting, motor setting, or physical setting) received from a sensor 126 (e.g., potentiometer, volt/amp/resistance/frequency sensor, or speed sensor) that measures part of a physical system 128 (e.g., heating/cooling valve, heat exchanger, fan, damper, humidifier, dehumidifier, or gas system) to the setpoint 106.

The setpoint error 130 is received by a controller 132 (e.g., P, I, D, PI, PD, ID, or PID controller, parameterizable controller, model predictive controller, artificial neural network controller, linear model controller, adaptive controller, process controller, valve controller, and/or motor controller) which controls the physical system 128 based on a control law 134 (e.g., P, I, D, PI, PD, ID, or PID, neural network, algorithm, machine-learned or machine-learning (ML) algorithm or network). The control law 134 utilizes one or more parameters (e.g., controller parameters) that affect the output from the controller 132. The output from the controller 132 may be any suitable control signal configured to affect an operation of the physical system 128. It will be appreciated that any portions of the inner feedback control loop 104 other than the controller 132 may be part of the controller 132. For example, the setpoint 106 may be received by the controller 132. Similarly, the sensor 126 and/or the error generator 122 may be part of the controller 132.

The physical system 128 receives the output from the controller 132 and produces the external output 108. By generating parameters for the control law 134 according to the techniques discussed herein, the controller 132 may be able to stabilize the setpoint 106 received from the external controller 120. Doing so may enable the physical system 128 to run more efficiently, operate with less wear on components, and/or produce the external output 108 with more favorable attributes.

FIG. 2 illustrates an example of the cascaded control loop 100 used in an HVAC application such as hydronic heating or cooling (e.g., a hydronic heating or cooling cascaded control loop 200). The hydronic heating or cooling cascaded control loop 200 is similar to the cascaded control loop 100 but specific to a hydronic heating or cooling system, as those are some of the most common HVAC environments.

A temperature setpoint 202 (e.g., an example of the external setpoint 110) corresponding to a desired target for the HVAC environment, e.g., a temperature output 204 (e.g., an example of the external output 108) is received by a temperature error generator 206 (e.g., an example of the external error generator 112). The temperature error generator 206 produces a temperature setpoint error 212 (e.g., an example of the external setpoint error 118) by comparing a temperature 208 (e.g., an example of the external value 114) received from a temperature sensor 210 (e.g., an example of the external sensor 116) that measures the temperature of the HVAC environment, e.g., the temperature output 204, to the temperature setpoint 202.

The temperature setpoint error 212 is received by a temperature controller 214 that is an example of the external controller 120 (e.g., thermostat) which outputs a flow setpoint 216 (e.g., an example of the setpoint 106) based on a temperature control law or algorithm (not shown in FIG. 2). It will be appreciated that the temperature setpoint 202 may be received by the temperature controller 214. Similarly, the temperature sensor 210 and/or the temperature error generator 206 may be part of the temperature controller 214.

The flow setpoint 216 is received by a flow error generator 218 (e.g., an example of the error generator 122) that produces a flow setpoint error 226 (e.g., an example of the setpoint error 130) by comparing a flow 220 received from a flow sensor 222 (e.g., an example of the sensor 126) that measures flow out of, or at a point within, a hydronic circuit 224 (e.g., pump, boiler, chiller, and/or valve) of the physical system 128 to the flow setpoint 216. The physical system 128 may include the hydronic circuit 224 and a heat exchanger 228 configured to exchange heat between the hydronic circuit 224 and air flow of the HVAC environment.

The flow setpoint error 226 is received from the flow error generator 218 by a flow controller 230 (e.g., an example of the controller 132, such as a process controller, P, I, D, PI, PD, ID, or PID controller, or valve controller) which controls the hydronic circuit 224 based on the control law 134. The output from the flow controller 230 may be any suitable control signal configured to influence operation of the hydronic circuit 224. The flow setpoint 216 may be received by the flow controller 230. Similarly, the flow sensor 222 and/or the flow error generator 218 may be part of the flow controller 230.

The hydronic circuit 224 receives the output from the flow controller 230 and provides flow and/or different temperature fluid to the heat exchanger 228. The heat exchanger 228 then produces the temperature output 204 (e.g., via heat transfer to an air stream entering a target environment). By generating the control law 134 according to the techniques discussed herein, the flow controller 230 may be able to stabilize the flow setpoint 216 received from the temperature controller 214. Doing so may enable the hydronic circuit 224 to run more efficiently, operate with less wear on components, and/or cause the temperature output 204 to have more favorable attributes.

Example Offline Parameter Determination

FIG. 3 illustrates an offline process 300 of HVAC controller parameter determination based on signal stabilization. The offline process 300 may be performed without information on an actual operating environment of the controller 132. For example, the offline process 300 may be performed prior to the controller 132 controlling the physical system 128 and/or before it goes “online.” The offline process 300 may, however, also be performed after the controller 132 has been implemented and/or while it is “online.”

An offline optimization system 302 that is communicatively coupled with the controller 132 includes an offline optimization module 304 that is implemented at least partially in hardware of the offline optimization system 302. The offline optimization module 304 is configured to generate offline parameters 306 (e.g., proportional (P), integral (I), and/or derivative (D) gains, neural network weights, adaptive weights, model predictive weights, and/or equation/algorithm values) that become at least a portion of the control law 134.

To generate the offline parameters 306, the offline optimization module 304 receives one or more setpoint functions 308. The setpoint functions 308 are time variant functions of the setpoint 106 or profiles of the setpoint 106 over time (e.g., setpoint versus time functions). In other words, the setpoint functions 308 represent examples of the setpoint 106 from the external controller 120 over time and may be considered as a modeled representation of the external controller 120. Each setpoint function 308 may correspond to a certain time frame (e.g., a week, a month, a year, etc.).

The setpoint functions 308 may be heuristically derived (e.g., chosen, created, or based on knowledge) and/or based on data. For example, the setpoint functions 308 may be derived from historical setpoint data 310. The historical setpoint data 310 comprises real-world setpoint data about setpoints 106 gathered from any number of respective environments/systems.

To derive setpoint functions 308 from the historical setpoint data 310, the historical setpoint data 310 may be split by time frames (e.g., into smaller time frames, by days, by weeks, etc.). The historical setpoint data 310 for the time frames may be cropped or padded to create a dataset with samples of equal length, thereby standardizing the dataset. There may be unknown factors that lead to several classes of setpoint profiles that share some similarities within each class. Accordingly, the dimensionality may be reduced such that only more informative dimensions are kept.
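The splitting and cropping/padding step can be sketched as follows. This is only one possible reading: the function name `split_into_frames` is hypothetical, and padding a short final frame by repeating its last value is an assumption, since the padding scheme is not specified.

```python
def split_into_frames(series, frame_len):
    """Split a flat list of setpoint samples into samples of equal length.

    Each frame is cropped to at most frame_len values; a shorter trailing
    frame is padded by repeating its last value (an assumed padding scheme).
    """
    frames = []
    for start in range(0, len(series), frame_len):
        frame = list(series[start:start + frame_len])
        if len(frame) < frame_len:
            frame += [frame[-1]] * (frame_len - len(frame))
        frames.append(frame)
    return frames


# Hypothetical historical setpoint values, split into frames of three samples.
history = [20.0, 20.5, 21.0, 21.0, 20.0, 19.5, 19.0]
dataset = split_into_frames(history, frame_len=3)
```

Every entry of `dataset` then has the same length, which is what a dimensionality-reduction step needs as input.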

To reduce the dimensionality of the historical setpoint data 310, various techniques (e.g., an autoencoder, PCA, LDA, t-SNE, or compression algorithm) may be used. For example, an autoencoder may be configured to learn a mapping from high-dimensional data to a low-dimensional representation as well as an inverse mapping. Two mappings, f_θ: ℝ^n → ℝ^c and g_ϕ: ℝ^c → ℝ^n, may be defined, where the encoder with parameters θ is denoted by f_θ and the decoder with parameters ϕ is denoted by g_ϕ. The data dimension is n and the latent dimension of the encoding is c. Values may be set such that c ≪ n.

The encoder may take a sample x ∈ ℝ^n and generate a low-dimensional representation z = f_θ(x). The input sample can then be approximately recovered as x ≈ x̂ = g_ϕ(z). The autoencoder may be trained by minimizing the reconstruction error, which may be done according to Equation 1.

$$\mathcal{L}_{AE}(\mathbf{X}, \theta, \phi) = \frac{1}{n} \sum_{i=1}^{n} \left\| \mathbf{x}_i - g_\phi\left(f_\theta(\mathbf{x}_i)\right) \right\|_2^2 \tag{1}$$

The mathematical program may be given by Equation 2.

$$\theta, \phi = \operatorname*{argmin}_{\theta, \phi} \left\{ \mathcal{L}_{AE}(\mathbf{X}, \theta, \phi) \right\} \tag{2}$$

The autoencoder may be trained on the historical setpoint data 310 to get z^i = f_θ(x^i). Equation 2 may then be minimized (e.g., by using an Adam optimizer) to produce encodings.
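To make the reconstruction objective concrete, the following is a minimal sketch of a linear autoencoder with a one-dimensional latent space (c = 1). Several simplifications are assumptions for illustration: the encoder and decoder are single weight vectors `w` and `v`, the loss is averaged over the samples, and plain gradient descent is swapped in for the Adam optimizer mentioned above. The data values are made up.

```python
def reconstruct(x, w, v):
    """Encode x to a scalar z (encoder f_theta), then decode it (decoder g_phi)."""
    z = sum(wi * xi for wi, xi in zip(w, x))
    return z, [vi * z for vi in v]


def loss(samples, w, v):
    """Mean squared reconstruction error over all samples."""
    total = 0.0
    for x in samples:
        _, x_hat = reconstruct(x, w, v)
        total += sum((xi - hi) ** 2 for xi, hi in zip(x, x_hat))
    return total / len(samples)


def train(samples, w, v, lr=0.01, steps=500):
    """Plain gradient descent on the reconstruction loss (stand-in for Adam)."""
    n, m = len(w), len(samples)
    for _ in range(steps):
        grad_w, grad_v = [0.0] * n, [0.0] * n
        for x in samples:
            z, x_hat = reconstruct(x, w, v)
            r = [xi - hi for xi, hi in zip(x, x_hat)]      # residual x - x_hat
            vr = sum(vi * ri for vi, ri in zip(v, r))
            for j in range(n):
                grad_v[j] += -2.0 * z * r[j] / m           # d loss / d v_j
                grad_w[j] += -2.0 * vr * x[j] / m          # d loss / d w_j
        w = [wj - lr * g for wj, g in zip(w, grad_w)]
        v = [vj - lr * g for vj, g in zip(v, grad_v)]
    return w, v


# Made-up samples that all lie along one direction, so one latent value suffices.
direction = [1.0, 2.0, 0.5]
samples = [[a * d for d in direction] for a in (-1.0, -0.5, 0.5, 1.0)]
w0, v0 = [0.1, 0.1, 0.1], [0.1, 0.1, 0.1]
initial_loss = loss(samples, w0, v0)
w, v = train(samples, w0, v0)
final_loss = loss(samples, w, v)
encodings = [reconstruct(x, w, v)[0] for x in samples]     # latent values z
```

A practical autoencoder would typically use nonlinear layers and a latent dimension larger than one; the sketch only shows how the encoder, decoder, and reconstruction error fit together.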

Once the encodings are generated, clustering on the latent encodings ⁺ can be performed to determine representative setpoint functions. Assuming that k ∈ ℤ₊ denotes the number of setpoint profile types, clustering techniques may be used to group the input samples into k subsets such that distances within clusters are minimized. For example, centroid-based methods may be used to find k cluster centers such that the distances to the centroids are minimized for each corresponding subset. As an example, a k-means algorithm may be used for setpoint profile clustering.
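A k-means pass over latent encodings might look like the sketch below. The deterministic initialization (evenly spaced samples as initial centers), the fixed iteration count, and the two-dimensional example encodings are assumptions made for the illustration.

```python
def dist2(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iters=20):
    """Basic k-means with a deterministic initialization (assumed for the sketch)."""
    step = len(points) // k
    centers = [points[i * step] for i in range(k)]
    labels = []
    for _ in range(iters):
        # Assign every point to its nearest center.
        labels = [min(range(k), key=lambda c: dist2(p, centers[c])) for p in points]
        # Move each center to the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return centers, labels


# Made-up latent encodings forming two visibly separate groups.
encodings = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2), (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
centers, labels = kmeans(encodings, k=2)
```

Each resulting center corresponds to one setpoint profile type, and samples near a center are candidates for representative setpoint functions.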

Thus, the setpoint functions 308 may be any number of setpoint functions and may be heuristically chosen or derived from the historical setpoint data 310 (e.g., via dimension reduction and/or clustering). The offline optimization module 304 may receive any number of setpoint functions 308 as inputs.

The offline optimization module 304 includes a simulation 312 of an environment of the controller 132. The simulation 312 may work on one or more of the setpoint functions 308 at a time. Within the simulation 312, datapoints (e.g., time points) of a setpoint function 308 may be received by an error generator 314 (e.g., similar to the error generator 122). The error generator 314 produces an error 318 (e.g., similar to the setpoint error 130) by comparing each datapoint of the setpoint function 308 with an output 316 fed back from an unknown physical system 324.

The error 318 is received by a simulated controller 320 (e.g., similar to controller 132) with a control law 322 (e.g., similar to control law 134) defined as u(t)=f(r(t),y(t),θ). The simulated controller 320 produces an output 334 that controls the unknown physical system 324 (similar to the physical system 128) within the simulation 312 to produce the output 316.
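The closed loop inside such a simulation can be sketched as below. The disclosure leaves the physical system unknown, so the first-order lag plant with time constant `tau`, the PI form of the control law, and all numeric values are assumptions made only so the loop can run.

```python
def simulate(setpoint_fn, theta, steps=200, dt=1.0, tau=5.0):
    """Run a setpoint function through a simulated controller and plant.

    theta = (kp, ki) parameterizes a PI control law (assumed form);
    the plant is a hypothetical first-order lag with time constant tau.
    """
    kp, ki = theta
    y, integral = 0.0, 0.0
    errors, controls = [], []
    for k in range(steps):
        r = setpoint_fn(k * dt)        # datapoint of the setpoint function
        e = r - y                      # error from the error generator
        integral += e * dt
        u = kp * e + ki * integral     # control law u(t) = f(r(t), y(t), theta)
        y += dt * (u - y) / tau        # plant output fed back on the next step
        errors.append(e)
        controls.append(u)
    return errors, controls


# Example: a constant setpoint of 1.0 and assumed gains kp=1.0, ki=0.1.
errors, controls = simulate(lambda t: 1.0, theta=(1.0, 0.1))
```

The recorded `errors` and `controls` are the signals on which an objective function can then be evaluated for a given parameter set.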

An optimizer 326 is configured to determine a set of optimal parameters θ from a plurality or range of parameters 328. Any number of the parameters 328 may be fed back into the simulation 312 as part of the optimization. The optimal parameters θ become the offline parameters 306 that are loaded onto the controller 132 as part of the control law 134.

To do so, the optimizer 326 is configured to, based on a continuous time representation, minimize an objective function 330 given by Equation 3

$$J(\theta) = \omega_1 \int \left| t\, e_{\text{control}}(t) \right| dt + \omega_2 \int \left| u(t+1) - u(t) \right| dt + \omega_3 \int \left| t\, e_{\text{oscillation}}(t) \right| dt + \omega_4 \int D(t)\, dt \tag{3}$$

where ω₁, ω₂, ω₃, ω₄ are weights for each term, e_control(t) is the error 318, e_oscillation(t)=e_control(t) during oscillations of the output 334 and 0 otherwise, and D(t) is a number of sign changes of the first derivative of the output 334 of the simulated controller 320 fed to the unknown physical system 324.

The objective function 330 may include a plurality of weighted terms. The first term of the objective function 330 is an Integral Time-weighted Absolute Error (ITAE) and may be used to penalize overshoots of the output 334. The second term of the objective function 330 calculates a total variation of the output 334 and may be used to minimize oscillations in the output 334. The third term of the objective function 330 is the ITAE measured during oscillations of the output 334. The final term of the objective function 330 may be based on direction changes of the output 334 and may be configured to ensure that the direction changes of the output 334 are minimized (e.g., due to sharp tuning of the simulated controller 320). The objective function 330 may contain any number of the terms above (e.g., it may contain fewer than four) and may contain more terms. Furthermore, in some embodiments, weights may be omitted or applied to only a subset of the terms given by Equation 3.
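A discrete-time approximation of the four terms can be sketched as follows. Two points are assumptions: the caller supplies a boolean mask `oscillating` marking the samples that count as oscillation (how oscillations are detected is not specified here), and the direction-change term counts each sign change of successive differences of the controller output once.

```python
def objective(t, e, u, oscillating, weights=(1.0, 1.0, 1.0, 1.0)):
    """Discrete approximation of the four weighted terms of the objective.

    t: sample times, e: control error, u: controller output,
    oscillating: assumed boolean mask marking oscillation samples.
    """
    dt = t[1] - t[0]                       # uniform sampling assumed
    w1, w2, w3, w4 = weights
    itae = sum(abs(tk * ek) for tk, ek in zip(t, e)) * dt
    du = [b - a for a, b in zip(u, u[1:])]
    total_variation = sum(abs(d) for d in du) * dt
    itae_osc = sum(abs(tk * ek)
                   for tk, ek, m in zip(t, e, oscillating) if m) * dt
    sign_changes = sum(1 for a, b in zip(du, du[1:]) if a * b < 0)
    return (w1 * itae + w2 * total_variation
            + w3 * itae_osc + w4 * sign_changes * dt)


# Made-up signals sampled once per time unit.
t = [0, 1, 2, 3, 4]
e = [1.0, 0.5, -0.2, 0.1, 0.0]
u = [0.0, 1.0, 0.6, 0.8, 0.8]
mask = [False, False, True, True, False]
J = objective(t, e, u, mask)
```

With unit weights, the made-up signals give contributions of about 1.2 (ITAE), 1.6 (total variation), 0.7 (ITAE during oscillation), and 2 (direction changes).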

The objective function 330 is calculated over the plurality or range of parameters 328 and over durations of the setpoint functions 308 that have been fed into the simulation 312. The objective function 330 may appear as an oracle model to an optimization routine performed by the optimizer 326.

To determine the offline parameters 306 from the parameters 328, a black-box optimization may be used, as differentiating the objective in Equation 3 with respect to the parameters 328 is intractable. Furthermore, evaluation of each set of parameters 328 may be computationally expensive as each set of parameters 328 may require a run of the simulation 312.

Instead of running the simulation 312 (e.g., evaluating the objective function 330) for each set of parameters 328 over the entire range of parameters 328, a Bayesian Optimization (BO) method may be used, as it may perform a more efficient parameter search (e.g., be less computationally expensive). The BO method may determine a region in a search space of the parameters 328. For example, a Radial Basis Function kernel added to a white kernel for Gaussian Process regression may be used with an Upper Confidence Bound acquisition function. As the region may not be close enough to the optimum, the result of the BO method may be further refined using a simplex algorithm (e.g., a Nelder-Mead optimization). The BO method and/or the simplex algorithm may produce the optimal parameters from the range of parameters 328 that become the offline parameters 306 of the control law 134.
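The two-stage idea (a coarse global search followed by simplex refinement) can be sketched as below. To stay self-contained, a coarse grid search is swapped in for the Bayesian Optimization stage, the Nelder-Mead routine is a simplified variant (reflection, expansion, inside contraction, shrink), and the quadratic `cost` is a toy stand-in for a simulation run scored with the objective function. All names and values are assumptions.

```python
import itertools


def cost(params):
    """Toy stand-in for one simulation run scored by the objective function."""
    kp, ki = params
    return (kp - 2.0) ** 2 + (ki - 0.3) ** 2


def coarse_search(f, grid):
    """Stand-in for the BO stage: best point on a coarse grid."""
    return min(itertools.product(*grid), key=f)


def nelder_mead(f, x0, step=0.5, iters=200):
    """Simplified Nelder-Mead simplex refinement around x0."""
    n = len(x0)
    simplex = [list(x0)] + [
        [x0[j] + (step if j == i else 0.0) for j in range(n)] for i in range(n)
    ]
    for _ in range(iters):
        simplex.sort(key=f)
        best, worst = simplex[0], simplex[-1]
        centroid = [sum(p[j] for p in simplex[:-1]) / n for j in range(n)]
        reflect = [c + (c - w) for c, w in zip(centroid, worst)]
        if f(reflect) < f(best):
            expand = [c + 2.0 * (c - w) for c, w in zip(centroid, worst)]
            simplex[-1] = expand if f(expand) < f(reflect) else reflect
        elif f(reflect) < f(simplex[-2]):
            simplex[-1] = reflect
        else:
            contract = [c + 0.5 * (w - c) for c, w in zip(centroid, worst)]
            if f(contract) < f(worst):
                simplex[-1] = contract
            else:  # shrink every vertex toward the best one
                simplex = [best] + [
                    [b + 0.5 * (p - b) for b, p in zip(best, q)]
                    for q in simplex[1:]
                ]
    return min(simplex, key=f)


start = coarse_search(cost, grid=[(0.0, 1.5, 3.0), (0.0, 1.0, 2.0)])
best = nelder_mead(cost, start)
```

The coarse stage only locates a promising region; the simplex stage then narrows in without needing gradients, which suits an objective that can only be evaluated by running a simulation.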

In the case where the setpoint functions 308 were determined using clustering based on the historical setpoint data 310, for each cluster, two setpoint functions 308 may be selected that are nearest their respective cluster center. One setpoint function 308 becomes part of a training set and the other setpoint function 308 becomes part of a test set. The training set may be used by the offline optimization module 304 to determine optimal parameters, while the test set may be used to evaluate performance of the determined optimal parameters.
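Selecting the two samples nearest each cluster center can be sketched as follows. The function name is hypothetical, and putting the nearest sample in the training set and the second nearest in the test set is an assumption; the sketch also assumes every cluster has at least two members.

```python
def split_train_test(encodings, labels, centers):
    """Per cluster, pick the two encodings nearest the center.

    Returns index lists: nearest goes to train, second nearest to test
    (an assumed assignment). Each cluster must have at least two members.
    """
    train, test = [], []
    for c, center in enumerate(centers):
        members = [i for i, l in enumerate(labels) if l == c]
        members.sort(
            key=lambda i: sum((a - b) ** 2 for a, b in zip(encodings[i], center))
        )
        train.append(members[0])
        test.append(members[1])
    return train, test


# Made-up one-dimensional encodings with known cluster labels and centers.
encodings = [(0.0,), (0.3,), (0.1,), (5.0,), (5.4,), (5.1,)]
labels = [0, 0, 0, 1, 1, 1]
centers = [(0.1,), (5.1,)]
train_idx, test_idx = split_train_test(encodings, labels, centers)
```

The returned indices refer back to the original setpoint samples, so the corresponding setpoint functions can be fed to the optimizer or held out for evaluation.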

In some embodiments, historical sensor data 332 (e.g., cloud data from implemented devices) may be used to better simulate the unknown physical system 324. For example, a machine-learning algorithm may generate characteristics of the unknown physical system 324 for use by the offline optimization module 304.

By using the above techniques, the offline optimization module 304 may determine the offline parameters 306 that are configured to stabilize an output of the controller 132. As a result, a setpoint signal (e.g., the setpoint 106 over time) received by the controller 132 when it is put into use (e.g., from the external controller 120) may also be stabilized. Doing so may also stabilize the controller output from the controller 132 which may enable the physical system 128 to operate better, more efficiently, and/or with less maintenance.

Example Online Process

FIG. 4 illustrates an online process 400 of controller parameter determination based on signal stabilization. The online process 400 may be performed using information on an actual operating environment of the controller 132. For example, the online process 400 may be performed after the controller 132 has been controlling the physical system 128 and/or after it goes “online.”

An online optimization system 402, which may be the same as or different from the offline optimization system 302, is communicatively coupled with the controller 132. The online optimization system 402 includes an online optimization module 404 that is implemented at least partially in hardware of the online optimization system 402. The online optimization module 404 is configured to generate online parameters 406 (e.g., proportional (P), integral (I), and/or derivative (D) gains, neural network weights, adaptive weights, model predictive weights, and/or equation/algorithm values) that become at least a second portion of the control law 134 or are used to update at least a first portion of the control law 134.

For example, the control law 134 may be according to Equation 4

$$u(t) = f(r_e(t), y_e(t), \theta) + g(r(t), \tilde{y}(t), \phi) \tag{4}$$

where f(r_e(t),y_e(t),θ) corresponds to an offline portion 408 of the control law 134 based on the offline parameters 306 and g(r(t),ỹ(t),ϕ) corresponds to an online portion 410 of the control law 134 based on the online parameters 406. The offline portion 408 and the online portion 410 may be the same or different types of control laws (e.g., P, I, and/or D control, ML control, neural network control, or other control algorithms). The offline parameters 306 may be generated as discussed above in regard to FIG. 3. Alternatively, the offline parameters 306 may be generated via another process (e.g., manual or auto-tuning or another optimization algorithm). It should be noted that some processes other than those discussed above in regard to FIG. 3 may generate the offline parameters 306 “online” based on the actual operating environment of the controller 132.
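The additive structure of Equation 4 can be sketched as a function that sums an offline portion and an online portion. Using a proportional law for both portions is purely an assumption for the example; the four inputs are passed through without the sketch assigning them any meaning beyond the symbols in Equation 4.

```python
def p_law(r, y, params):
    """Hypothetical proportional law, used here for both portions."""
    return params["kp"] * (r - y)


def make_control_law(f, theta, g, phi):
    """Return u = f(r_e, y_e, theta) + g(r, y_tilde, phi) as a callable."""
    def u(r_e, y_e, r, y_tilde):
        offline_portion = f(r_e, y_e, theta)   # based on offline parameters
        online_portion = g(r, y_tilde, phi)    # based on online parameters
        return offline_portion + online_portion
    return u


# Assumed parameter values for illustration only.
control_law = make_control_law(p_law, {"kp": 2.0}, p_law, {"kp": 0.5})
u_now = control_law(r_e=1.0, y_e=0.5, r=1.0, y_tilde=0.8)
```

Because the two portions are independent callables, they can be different types of control law, as the text notes.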

In some embodiments, the online parameters 406 may be used to update the offline parameters 306 instead of becoming part of the online portion 410. In other words, the online portion 410 of the control law 134 may not exist in such embodiments, and the online parameters 406 may be used instead to update values of the offline parameters 306 (e.g., determined using the offline optimization module 304). For example, the online optimization system 402 may receive existing offline parameters 306 from the controller 132, update them using the online parameters 406 (e.g., via averaging, replacement, or other statistical means), and send updated offline parameters 306 back to the controller 132.
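The update path described above (averaging or replacement of existing parameter values) might look like the following sketch. The function name `update_parameters`, the dictionary representation of the parameters, and the `weight` blending factor are assumptions for illustration and are not taken from the disclosure.

```python
def update_parameters(existing, online, mode="average", weight=0.5):
    """Return updated offline parameters given newly determined online values.

    existing, online: dicts mapping a parameter name to a float (assumed layout).
    mode: "average" blends old and new values; "replace" overwrites old values.
    weight: share given to the online value when averaging.
    """
    if mode not in ("average", "replace"):
        raise ValueError(f"unknown mode: {mode}")
    updated = dict(existing)  # leave the caller's dict untouched
    for name, new_value in online.items():
        if mode == "replace" or name not in existing:
            updated[name] = new_value
        else:
            updated[name] = (1.0 - weight) * existing[name] + weight * new_value
    return updated


offline_params = {"kp": 2.0, "ki": 0.5}
averaged = update_parameters(offline_params, {"kp": 3.0})
replaced = update_parameters(offline_params, {"kp": 3.0}, mode="replace")
```

Parameters that the online step did not touch (here `ki`) are passed through unchanged, so a partial update does not disturb the rest of the control law.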

To generate the online parameters 406, the online optimization module 404 receives observations 412 about the environment in which the controller 132 is implemented (e.g., from the controller 132 and/or other sensors/systems disposed within the environment of the controller 132). The observations 412 may be multidimensional and may contain information about the value 124, the setpoint 106, the setpoint error 130, and/or others at various times.

The online optimization module 404 may use a reward function 414 or objective function to determine the online parameters 406 based on the received observations 412. Signals are used to designate values as functions of time (e.g., time-series). For example, a time-series of the value 124 may be referred to as a value signal, a time-series of the setpoint 106 may be referred to as a setpoint signal, a time-series of the control value (e.g., to the physical system 128) may be referred to as a control signal, and a time-series of the setpoint error 130 may be referred to as the setpoint error signal.

The reward function 414 may be according to Equation 5

$$J_k(\phi) = \lambda_1 \lVert e_k \rVert_2 + \lambda_2 F_{\mathrm{osc},k} + \lambda_3 F_{\mathrm{dir},k} + \lambda_4 J_{\mathrm{set},k}(\phi) \tag{5}$$

where $e_k$ is the setpoint error 130 at time step k, and $\lambda_i$ for i = 1, . . . , 4 are weights of each component. $F_{\mathrm{osc},k}$ and $F_{\mathrm{dir},k}$ may be according to Equations 6 and 7, respectively,

$$F_{\mathrm{osc},k} = \begin{cases} 1 & \text{if the value signal is oscillating at time } k \\ 0 & \text{otherwise,} \end{cases} \tag{6}$$

$$F_{\mathrm{dir},k} = \begin{cases} 1 & \text{if } \dot{u}(t) \text{ changed sign at time } k \\ 0 & \text{otherwise,} \end{cases} \tag{7}$$

where $\dot{u}(t)$ is the first-order time derivative of the control signal. Equations 6 and 7 may be configured to penalize the control law 134 if the value signal oscillates over time and/or if $\dot{u}(t)$ makes too many changes in direction. $J_{\mathrm{set},t}(\phi)$ may be configured to enforce a stability of the setpoint signal.

The stability of the setpoint signal refers to how quickly the setpoint signal varies with time. Accordingly, a variation in the setpoint signal over time may be quantified and used as a penalty term. To quantify the variation in the setpoint signal, a magnitude of the first- and second-order derivatives of the setpoint signal may be used:

$$\left\lVert \frac{d}{dt} r_{\mathrm{ref},t} \right\rVert_p, \quad \text{and} \quad \left\lVert \frac{d^2}{dt^2} r_{\mathrm{ref},t} \right\rVert_p,$$

where $r_{\mathrm{ref},t}$ is the setpoint signal at time step t and $\lVert \cdot \rVert_p$ is the p-norm of some vector.

For example, the 2-norm of the first-order finite-difference approximation of the derivative may be used as the penalty term, according to Equation 8.

$$J_{\mathrm{set},t}(\phi) = \lVert r_{\mathrm{ref},t} - r_{\mathrm{ref},t-1} \rVert_2 \tag{8}$$
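The per-step terms of Equations 5 through 8 can be sketched for scalar signals as follows. Several choices here are assumptions rather than part of the disclosure: the oscillation detector (counting direction reversals of the value signal in a recent window), the use of finite differences in place of the time derivative of the control signal, the absolute value as the norm of a scalar error, and all function names.

```python
def is_oscillating(values, min_reversals=2):
    """F_osc,k (assumed detector): 1 if the recent window of the value signal
    reverses direction at least `min_reversals` times, else 0."""
    diffs = [b - a for a, b in zip(values, values[1:])]
    reversals = sum(1 for a, b in zip(diffs, diffs[1:]) if a * b < 0)
    return 1 if reversals >= min_reversals else 0


def direction_changed(u_prev2, u_prev, u_now):
    """F_dir,k: 1 if the finite-difference estimate of du/dt changed sign."""
    return 1 if (u_prev - u_prev2) * (u_now - u_prev) < 0 else 0


def setpoint_variation(r_now, r_prev):
    """J_set (Equation 8) for a scalar setpoint: |r_t - r_(t-1)|."""
    return abs(r_now - r_prev)


def step_cost(error, values, controls, setpoints, weights):
    """Weighted sum of the four terms of Equation 5 at one time step.

    values, controls, setpoints: recent samples, newest last
    (at least three control samples and two setpoint samples).
    weights: (lambda_1, lambda_2, lambda_3, lambda_4).
    """
    l1, l2, l3, l4 = weights
    return (l1 * abs(error)
            + l2 * is_oscillating(values)
            + l3 * direction_changed(*controls[-3:])
            + l4 * setpoint_variation(setpoints[-1], setpoints[-2]))


noisy = step_cost(0.5, [20, 21, 20, 21, 20], [0.5, 0.6, 0.4], [21.0, 21.5], (1, 2, 3, 4))
calm = step_cost(0.1, [20.0, 20.5, 20.8, 20.9], [0.5, 0.6, 0.7], [21.0, 21.0], (1, 2, 3, 4))
```

In the calm case only the error term contributes, which shows how the binary flags add a fixed penalty regardless of how large the oscillation is.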

A reinforcement learning (RL) technique, an online optimization-based technique (e.g., if the controller 132 is an online optimization-based controller) or an adaptive control technique (e.g., if the controller 132 is an adaptive controller that allows the control law to be updated) may be used. An RL technique may be used because of its capability to learn control policies that maximize the reward function 414 via policy search algorithms. Policy search algorithms are classes of RL techniques that directly optimize the policy (e.g., decision-making strategy) rather than merely relying on a value function. The policy search algorithms are configured to handle a wide range of reward functions tailored to different objectives. Accordingly, policy search algorithms may adjust to maximize a total reward determined by the online optimization module 404.

For example, a Soft Actor-Critic (SAC) RL algorithm may be used, as its off-policy nature makes it sample efficient. Furthermore, SAC maximizes the entropy of the action policy, which ensures that the policy does not place all probability mass on a single action in some states. Doing so may lead to better exploration and regularization in policy updates than other techniques.
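As a point of reference, the maximum-entropy objective that SAC is commonly described as optimizing may be written as shown below. The symbols $s_t$, $a_t$, $r$, $\alpha$, $\rho_\pi$, and $\mathcal{H}$ are standard RL notation rather than symbols of this disclosure, and identifying the per-step reward with the negative of the step cost $J_t(\phi)$ is an assumption made only for illustration.

$$J(\pi) = \sum_{t} \mathbb{E}_{(s_t, a_t) \sim \rho_\pi}\left[ r(s_t, a_t) + \alpha\, \mathcal{H}\big(\pi(\cdot \mid s_t)\big) \right], \qquad r(s_t, a_t) = -J_t(\phi)$$

The temperature $\alpha$ trades off reward against policy entropy: a larger $\alpha$ keeps the action distribution broader, which corresponds to the exploration and regularization behavior noted above.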

The RL approach may find online parameters 406 that minimize the discounted infinite-horizon expected cost (e.g., maximize a reward) at a given time step t

$$J(\phi) = \sum_{k=t}^{\infty} \gamma^{k-t} J_k(\phi) \tag{9}$$

where γ is a discount factor (e.g., between 0 and 1). It should be noted that the time weighting of the ITAE terms is captured through $F_{\mathrm{osc},k}$ and $F_{\mathrm{dir},k}$, but a constant penalty of 1 may be used instead of the errors so that the policy gradients do not take unbounded values, which helps ensure numerical stability. Furthermore, γ may be kept close to 1 to ensure stable policies (e.g., more weight to future rewards).
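In practice the infinite sum of Equation 9 can only be evaluated over a finite window of logged steps. A minimal sketch of that truncation follows; the function name `discounted_cost` and the default value of `gamma` are assumptions.

```python
def discounted_cost(step_costs, gamma=0.99):
    """Finite-horizon truncation of Equation 9.

    step_costs[0] is the cost at the current step t, step_costs[1] the cost
    at t + 1, and so on; each is weighted by gamma ** (k - t).
    """
    if not 0.0 <= gamma <= 1.0:
        raise ValueError("gamma is expected to lie between 0 and 1")
    total = 0.0
    for offset, cost in enumerate(step_costs):
        total += (gamma ** offset) * cost
    return total


short_sighted = discounted_cost([1.0, 1.0, 1.0], gamma=0.5)
undiscounted = discounted_cost([1.0, 1.0, 1.0], gamma=1.0)
```

If every step cost is bounded by some value J_max and γ is below 1, the part of the sum dropped after N steps is at most γ^N · J_max / (1 − γ), so a γ close to 1 calls for a longer window before the truncation becomes negligible.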

By using the above techniques, the online optimization module 404 may determine the online parameters 406 that are configured to stabilize the setpoint signal received by the controller 132. Doing so may also stabilize the control signal from the controller 132 which may enable the physical system 128 to operate better, more efficiently, and/or with less maintenance.

Example Systems

FIG. 5 illustrates an example of the offline optimization system 302 that is configured to perform HVAC controller parameter determination based on signal stabilization. The offline optimization system 302 may be included within the controller 132, a remote computing device 500 that is remote to the controller 132, a cloud computing device 502, or some combination thereof. The offline optimization system 302 includes at least one processing unit 504, at least one computer-readable storage medium 506, the offline optimization module 304, and a communication system 508.

The processing unit 504 (e.g., one or more of an application processor, central processor (CPU), graphics processor (GPU), microprocessor, digital-signal processor (DSP), or controller) executes instructions 510 (e.g., code) stored within the computer-readable storage medium 506 (e.g., a non-transitory storage device such as a hard drive, solid-state drive (SSD), flash memory, read-only memory (ROM), EPROM, or EEPROM) to cause the offline optimization system 302 to perform the techniques described herein. The instructions 510 may be part of an operating system and/or one or more applications of the offline optimization system 302.

The instructions 510 cause the offline optimization system 302 to act upon (e.g., create, receive, modify, delete, transmit, or display) data 512 (e.g., setpoint functions 308, application data, module data, sensor data, or I/O data). Although shown as being within the computer-readable storage medium 506, portions of the data 512 may be within a random-access memory (RAM) or a cache of the offline optimization system 302 (not shown). Furthermore, the instructions 510 and/or the data 512 may be remote to the offline optimization system 302.

The offline optimization module 304 (or portions thereof) may be comprised by the computer-readable storage medium 506 or be a stand-alone component (e.g., executed in dedicated hardware in communication with the processing unit 504 and computer-readable storage medium 506). For example, the instructions 510 may cause the processing unit 504 to implement or otherwise cause the offline optimization module 304 to determine the offline parameters 306 and cause them to be loaded onto the controller 132 as part of the control law 134.

The communication system 508 may be any wired or wireless communication system configured to communicate data between the controller 132, the remote computing device 500, and/or the cloud computing device 502. For example, the communication system 508 may be configured to transfer the offline parameters 306 from the remote computing device 500 or the cloud computing device 502 to the controller 132. The communication system 508 may also be configured to communicate data between the controller 132, the remote computing device 500, and/or the cloud computing device 502 and one or more wired or wireless networks, one or more databases, and/or the internet.

FIG. 6 illustrates an example of the online optimization system 402 that is configured to perform HVAC controller parameter determination based on signal stabilization. The online optimization system 402 may be included within the controller 132, a remote computing device 600 that is remote to the controller 132 (which may be the same as or different from the remote computing device 500), a cloud computing device 602 (which may be the same as or different from the cloud computing device 502), or some combination thereof. The online optimization system 402 includes at least one processing unit 604, at least one computer-readable storage medium 606, the online optimization module 404, and a communication system 608.

The processing unit 604 (e.g., one or more of an application processor, central processor (CPU), graphics processor (GPU), microprocessor, digital-signal processor (DSP), or controller) executes instructions 610 (e.g., code) stored within the computer-readable storage medium 606 (e.g., a non-transitory storage device such as a hard drive, solid-state drive (SSD), flash memory, read-only memory (ROM), EPROM, or EEPROM) to cause the online optimization system 402 to perform the techniques described herein. The instructions 610 may be part of an operating system and/or one or more applications of the online optimization system 402.

The instructions 610 cause the online optimization system 402 to act upon (e.g., create, receive, modify, delete, transmit, or display) data 612 (e.g., observations 412, application data, module data, sensor data, or I/O data). Although shown as being within the computer-readable storage medium 606, portions of the data 612 may be within a random-access memory (RAM) or a cache of the online optimization system 402 (not shown). Furthermore, the instructions 610 and/or the data 612 may be remote to the online optimization system 402.

The online optimization module 404 (or portions thereof) may be comprised by the computer-readable storage medium 606 or be a stand-alone component (e.g., executed in dedicated hardware in communication with the processing unit 604 and computer-readable storage medium 606). For example, the instructions 610 may cause the processing unit 604 to implement or otherwise cause the online optimization module 404 to determine the online parameters 406 and cause them to be loaded onto the controller 132 as part of the control law 134.

The communication system 608 may be any wired or wireless communication system configured to communicate data between the controller 132, the remote computing device 600, and/or the cloud computing device 602. For example, the communication system 608 may be configured to transfer the online parameters 406 from the remote computing device 600 or the cloud computing device 602 to the controller 132. The communication system 608 may also be configured to communicate data between the controller 132, the remote computing device 600, and/or the cloud computing device 602 and one or more wired or wireless networks, one or more databases, and/or the internet.

Example Methods

FIG. 7 illustrates a flow chart of an example offline method 700 of HVAC controller parameter determination based on signal stabilization. The offline method 700 may be performed by the offline optimization system 302. Steps of the offline method 700 may be rearranged, split, or combined without departing from the scope defined by the appended claims.

At 702, one or more setpoint functions are received. For example, the offline optimization module 304 may receive the setpoint functions 308.

At 704, the setpoint functions are input into a simulation of an HVAC controller environment. For example, the setpoint functions 308 may be run through the simulation 312.

At 706, an objective function is evaluated on the simulation of the environment. The objective function considers oscillations of an output of the HVAC controller in the simulation. For example, the objective function 330 may be evaluated on the simulation 312. Thus the objective function 330 contains at least one term based on oscillations of the output 334 of the simulated controller 320.

At 708, parameters for the HVAC controller are determined based on evaluating the objective function at 706. For example, the optimizer 326 may determine the offline parameters 306 based on evaluating the objective function 330.

At 710, the parameters are loaded into the HVAC controller. For example, if the offline optimization system 302 is separate from the controller 132, the offline parameters 306 may be sent to the controller 132 via the communication system 508 to become at least part of the control law 134.
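Steps 702 through 710 can be sketched end to end as follows. This is a toy illustration under stated assumptions: the first-order plant model, the PI controller, the two setpoint functions, the objective weights, and every function name are hypothetical, and an exhaustive grid search stands in for the optimizer 326 (it is not Bayesian or Nelder-Mead optimization).

```python
import itertools


def simulate(setpoint_fn, kp, ki, steps=200, dt=1.0, tau=10.0, gain=1.0):
    """Run a PI controller against an assumed first-order plant.

    Returns the logged setpoint errors and controller outputs.
    """
    y, integral = 0.0, 0.0
    errors, controls = [], []
    for k in range(steps):
        error = setpoint_fn(k * dt) - y
        integral += error * dt
        u = kp * error + ki * integral
        y += dt * (-y + gain * u) / tau  # forward-Euler plant update
        errors.append(error)
        controls.append(u)
    return errors, controls


def count_direction_changes(signal):
    """Number of sign changes in the finite difference of a signal."""
    diffs = [b - a for a, b in zip(signal, signal[1:])]
    return sum(1 for a, b in zip(diffs, diffs[1:]) if a * b < 0)


def objective(kp, ki, setpoint_fns, osc_weight=0.5):
    """Step 706: tracking error plus a penalty on controller-output reversals."""
    total = 0.0
    for fn in setpoint_fns:
        errors, controls = simulate(fn, kp, ki)  # step 704
        total += sum(abs(e) for e in errors)
        total += osc_weight * count_direction_changes(controls)
    return total


def determine_parameters(setpoint_fns, kp_grid, ki_grid):
    """Step 708: pick the grid point with the lowest objective value."""
    return min(itertools.product(kp_grid, ki_grid),
               key=lambda p: objective(p[0], p[1], setpoint_fns))


# Step 702: two hypothetical setpoint-versus-time functions (a step and a ramp).
setpoint_fns = [lambda t: 1.0, lambda t: min(t / 50.0, 1.0)]
kp, ki = determine_parameters(setpoint_fns, [0.5, 1.0, 2.0, 4.0], [0.01, 0.05, 0.2])
# Step 710: "load" the parameters into a stand-in for the controller.
controller = {"control_law": {"kp": kp, "ki": ki}}
```

The grid search keeps the sketch dependency-free; any derivative-free optimizer could replace `determine_parameters` without changing the simulation or the objective.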

FIG. 8 illustrates a flow chart of an example online method 800 of HVAC controller parameter determination based on signal stabilization. The online method 800 may be performed by the online optimization system 402. Steps of the online method 800 may be rearranged, split, or combined without departing from the scope defined by the appended claims.

At 802, data corresponding to an HVAC implementation environment is received. For example, the online optimization module 404 may receive the observations 412 about an environment in which the controller 132 is, or has been, implemented. The observations 412 may come from the controller 132 via a communication link between the controller 132 and the online optimization system 402.

At 804, a reward function is evaluated based on the data received at 802. For example, the reward function 414 may be evaluated based on the observations 412.

At 806, parameters for the HVAC controller are determined based on evaluating the reward function from 804. For example, the online optimization module 404 may determine the online parameters 406 by maximizing the reward function 414.

At 808, the parameters are loaded into the HVAC controller. For example, if the online optimization system 402 is separate from the controller 132, the online parameters 406 may be sent to the controller 132 via the communication system 608 (e.g., via the communication link) to become at least part of the control law 134 (e.g., replace or update existing parameters such as the offline parameters 306 in the offline portion 408 or augment the control law 134 with the online portion 410).
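Steps 802 through 808 can be illustrated with a simple perturb-and-evaluate loop. This sketch swaps in a random-search hill climb for the RL or adaptive techniques described earlier, and it uses a simulated loop as a stand-in for the real environment; the plant model, the single online gain `phi`, the cost weights, and all function names are assumptions.

```python
import random


def collect_observations(phi, steps=100):
    """Step 802 stand-in: run a simulated loop with online gain phi and log it.

    A real deployment would receive these observations from the controller.
    """
    y, errors, controls = 0.0, [], []
    for _ in range(steps):
        error = 1.0 - y
        u = (1.0 + phi) * error  # fixed offline P gain of 1 plus online correction
        y = 0.9 * y + 0.1 * u    # assumed first-order plant
        errors.append(error)
        controls.append(u)
    return {"errors": errors, "controls": controls}


def cost(observations, dir_weight=0.1):
    """Step 804: negative reward computed from the logged data."""
    errors, controls = observations["errors"], observations["controls"]
    diffs = [b - a for a, b in zip(controls, controls[1:])]
    changes = sum(1 for a, b in zip(diffs, diffs[1:]) if a * b < 0)
    return sum(abs(e) for e in errors) + dir_weight * changes


def tune_online(phi=0.0, iterations=50, step=0.5, seed=0):
    """Step 806: keep a random perturbation of phi only if it lowers the cost."""
    rng = random.Random(seed)
    best_cost = cost(collect_observations(phi))
    for _ in range(iterations):
        candidate = phi + rng.uniform(-step, step)
        candidate_cost = cost(collect_observations(candidate))
        if candidate_cost < best_cost:
            phi, best_cost = candidate, candidate_cost
    return phi, best_cost


initial_cost = cost(collect_observations(0.0))
phi, best_cost = tune_online()
# Step 808: augment a stand-in control law with the online portion.
controller = {"offline_portion": {"kp": 1.0}, "online_portion": {"phi": phi}}
```

Because a candidate is accepted only when it lowers the cost, the tuned value is never worse than the starting point on this simulated loop, although nothing here guarantees the same on a real system.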

EXAMPLES

Example 1: A method of generating one or more parameters for a heating, ventilation, and air conditioning (HVAC) controller, the method comprising: receiving one or more setpoint functions; inputting the setpoint functions into a simulation of an HVAC controller environment; evaluating an objective function on the simulation of the HVAC controller environment, the objective function being based on oscillations of an output of the HVAC controller in the simulation; determining, based on the evaluating the objective function, the parameters for the HVAC controller; and loading the parameters into the HVAC controller.

Example 2: The method of example 1, wherein the parameters are configured to stabilize the output of the HVAC controller.

Example 3: The method of example 1 or 2, wherein the setpoint functions are setpoint versus time functions that are heuristically derived.

Example 4: The method of any previous example, further comprising deriving the setpoint functions based on historical data, wherein the setpoint functions are setpoint versus time functions.

Example 5: The method of example 4, wherein each of the setpoint functions corresponds to a cluster of the historical data.

Example 6: The method of any previous example, further comprising receiving historical sensor data, wherein a physical system within the simulation is based on the historical sensor data.

Example 7: The method of any previous example, wherein the determining the parameters comprises using at least one of: a Bayesian Optimization, a Nelder-Mead Optimization, or a Machine Learning Technique.

Example 8: The method of any previous example, wherein: the objective function comprises a plurality of weighted terms; and at least one of the weighted terms corresponds to the oscillations of the output of the HVAC controller.

Example 9: The method of example 8, wherein another of the weighted terms corresponds to direction changes of the output of the HVAC controller.

Example 10: The method of any previous example, wherein the parameters comprise one or more of: a proportional gain, an integral gain, or a derivative gain.

Example 11: A method of tuning a heating, ventilation, and air conditioning (HVAC) controller to stabilize a setpoint signal received by the HVAC controller from an external HVAC controller, the method comprising: receiving data corresponding to an HVAC implementation environment, the data including information about the setpoint signal and feedback from an HVAC physical system controlled by the HVAC controller; evaluating a reward function based on the data; determining, based on the evaluating the reward function, one or more parameters for the HVAC controller; and loading the parameters into the HVAC controller.

Example 12: The method of example 11, wherein the reward function includes a term based on oscillations of the setpoint signal.

Example 13: The method of example 11 or 12, wherein the reward function includes a term based on direction changes of the setpoint signal.

Example 14: The method of any of examples 11-13, wherein the reward function includes a term based on a stability of the setpoint signal.

Example 15: The method of example 14, wherein the stability of the setpoint signal is based on a variation in the setpoint signal.

Example 16: The method of any of examples 11-15, wherein the loading the parameters comprises augmenting a control law of the HVAC controller such that the control law, after augmentation, comprises a first portion corresponding to other parameters and a second portion corresponding to the parameters.

Example 17: The method of any of examples 11-16, wherein the loading the parameters comprises updating existing parameters of the HVAC controller.

Example 18: The method of any of examples 11-17, wherein the determining the parameters comprises using a reinforcement learning (RL) technique or an online optimization-based technique.

Example 19: The method of any of examples 11-18, wherein the parameters comprise neural network weights.

Example 20: The method of any of examples 11-19, wherein the parameters comprise one or more of: a proportional gain, an integral gain, or a derivative gain.

Example 21: A system comprising: a processing unit configured to perform the method of any preceding example.

Example 22: A computer-readable storage medium comprising instructions that, when executed by a processing unit, cause the processing unit to perform the method of any of examples 1-20.

Example 23: An offline optimization system configured to parameterize a heating, ventilation, and air-conditioning (HVAC) controller, the offline optimization system comprising: a processing unit configured to: receive one or more setpoint functions; input the setpoint functions into a simulation of an HVAC controller environment; evaluate an objective function on the simulation of the HVAC controller environment, the objective function being based on oscillations of an output of the HVAC controller in the simulation; determine, based on the evaluation of the objective function, parameters for the HVAC controller; and load the parameters into the HVAC controller.

The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of this disclosure. As used herein, the singular forms “a,” “an,” and “the” are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms “include(s),” “comprise(s),” and/or “comprising,” when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.

The corresponding structures, materials, acts, and equivalents of all means or step plus function elements, if any, in the claims below are intended to include any structure, material, or act for performing the function in combination with other claimed elements as specifically claimed. The description of the present disclosure has been presented for purposes of illustration and description but is not intended to be exhaustive or limited to this disclosure in the form disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of this disclosure. The various embodiments were chosen and described in order to best explain the principles of this disclosure and the practical application, and to enable others of ordinary skill in the art to understand this disclosure for various embodiments with various modifications as are suited to the particular use contemplated.

Claims

What is claimed is:

1. A method of generating one or more parameters for a heating, ventilation, and air conditioning (HVAC) controller, the method comprising:

receiving one or more setpoint functions;

inputting the setpoint functions into a simulation of an HVAC controller environment;

evaluating an objective function on the simulation of the HVAC controller environment, the objective function being based on oscillations of an output of the HVAC controller in the simulation;

determining, based on the evaluating the objective function, the parameters for the HVAC controller; and

loading the parameters into the HVAC controller.

2. The method of claim 1, wherein the parameters are configured to stabilize the output of the HVAC controller.

3. The method of claim 1, wherein the setpoint functions are setpoint versus time functions that are heuristically derived.

4. The method of claim 1, further comprising deriving the setpoint functions based on historical data, wherein the setpoint functions are setpoint versus time functions.

5. The method of claim 4, wherein each of the setpoint functions corresponds to a cluster of the historical data.

6. The method of claim 1, further comprising receiving historical sensor data, wherein a physical system within the simulation is based on the historical sensor data.

7. The method of claim 1, wherein the determining the parameters comprises using at least one of: a Bayesian Optimization, a Nelder-Mead Optimization, or a Machine Learning Technique.

8. The method of claim 1, wherein:

the objective function comprises a plurality of weighted terms; and

at least one of the weighted terms corresponds to the oscillations of the output of the HVAC controller.

9. The method of claim 8, wherein another of the weighted terms corresponds to direction changes of the output of the HVAC controller.

10. The method of claim 1, wherein the parameters comprise one or more of: a proportional gain, an integral gain, or a derivative gain.

11. A method of tuning a heating, ventilation, and air conditioning (HVAC) controller to stabilize a setpoint signal received by the HVAC controller from an external HVAC controller, the method comprising:

receiving data corresponding to an HVAC controller implementation environment, the data including information about the setpoint signal and feedback from an HVAC physical system controlled by the HVAC controller;

evaluating a reward function based on the data;

determining, based on the evaluating the reward function, one or more parameters for the HVAC controller; and

loading the parameters into the HVAC controller.

12. The method of claim 11, wherein the reward function includes a term based on oscillations of the setpoint signal.

13. The method of claim 11, wherein the reward function includes a term based on direction changes of the setpoint signal.

14. The method of claim 11, wherein the reward function includes a term based on a stability of the setpoint signal.

15. The method of claim 14, wherein the stability of the setpoint signal is based on a variation in the setpoint signal.

16. The method of claim 11, wherein the loading the parameters comprises augmenting a control law of the HVAC controller such that the control law, after augmentation, comprises a first portion corresponding to other parameters and a second portion corresponding to the parameters.

17. The method of claim 11, wherein loading the parameters comprises updating existing parameters of the HVAC controller.

18. The method of claim 11, wherein the determining the parameters comprises using a reinforcement learning (RL), an online optimization-based, or an adaptive control technique.

19. The method of claim 11, wherein the parameters comprise neural network weights.

20. An offline optimization system configured to parameterize a heating, ventilation, and air-conditioning (HVAC) controller, the offline optimization system comprising:

a processing unit configured to:

receive one or more setpoint functions;

input the setpoint functions into a simulation of an HVAC controller environment;

evaluate an objective function on the simulation of the HVAC controller environment, the objective function being based on oscillations of an output of the HVAC controller in the simulation;

determine, based on the evaluation of the objective function, parameters for the HVAC controller; and

load the parameters into the HVAC controller.

Resources