ThreatSim: Securing Wattage When It’s Needed
Roger N. Anderson & Albert Boulanger
Lamont-Doherty Earth Observatory
New York, New York 10027
The terrorist events of 9/11, combined with the power shortages we are all experiencing across the country this summer, suggest that we as a country are not well prepared to meet what is sure to be a growing threat to our ability to deliver electricity where it is needed, when it is needed. Our power infrastructure, in particular, delivers the electricity that our entire economy depends upon. Take out the grid for more than about 12 hours (the operational maximum of most back-up power generation systems) and you shut down the Internet and stop all bank transfers and credit card and cash machine transactions. The pumps and compressors needed to transport drinking water, fill your car or truck with gasoline and diesel, and deliver natural gas through pipelines and storage facilities no longer work, and stop lights and other key components of the transportation system, such as railroad and subway power and signaling, fail. Most manufacturing comes to a standstill (c.f. NY Times,
Figure 1. Thermal image of power consumption in
Is it realistic to think that a coordinated attack on the national power grid could succeed in shutting down electricity across the country for a substantial time? We believe the threat is real, and that training in response and prevention should be among the highest priorities of the new Homeland Defense Department.
What does it take to build an accurate Electric Grid Threat Simulator? First and foremost, the topologies of the regional high voltage grids managed by Regional Transmission Operators (RTOs) and Independent System Operators (ISOs) must be combined in a single computer model with the local power grids and, more generally, the distribution networks managed by utilities such as ConEd in New York City, Keyspan on Long Island, PSE&G in New Jersey, and hundreds of other public and private generation and distribution companies across the country (Figure 2).
Figure 2. Northeastern high voltage electric grid on left is connected to local low voltage grids through vulnerable switching and transformer sub-stations (triangles at right).
An Electricity Threat Simulator
The burden of developing efficient electric grid threat simulators for war gaming must be shared not only by city, state and federal governments, the national labs, academia and electrical research institutions, but also by the power distribution, generation, and service companies themselves. It is they who will use the simulators to train their operators. Below we review the current state of affairs in Power Control Systems (PCS’s) in this country and discuss the need for anti-terrorist training and simulation software that will allow us to determine the true threats and appropriate responses to sustained, coordinated attacks on the electric grid.
Of particular concern are the PCS’s that control the production and distribution of electricity throughout the country. Below, we show that they will have to be very much more sophisticated and integrated than at present. It will take a whole new generation of technologies to unite the topologies of the electricity grid on its many scales. Luckily, much of this systems integration technology has been developed recently by the aerospace, automotive and manufacturing industries. The task at hand is its adoption by an electricity industry that is historically a late adopter of new technologies. That, in turn, will require importing experts and expertise into a corporate and governmental regulatory culture bred out of the electrical utilities of our grandparents: one that is notorious for being insular and slow to respond to technological change (not late, but the very last technological adopters).
9/11 has emphasized the need to fix the Electric Grid Threat Simulation and modeling system before the next attack. With this need in mind, let’s review the present state-of-preparedness of Power Control Systems, and compare them to more modern, integrated command-and-control systems in the military and from other industries. We argue that there is little doubt that fixes must come quickly or our very economic stability may be at risk.
Figure 3. Seamless communications among Power Control Systems of the RTO, ISO and local utilities will be required to redistribute electricity in case of future terrorist attacks against the grid. Pictured here is the Connecticut Valley Electric Exchange. Its computers do not communicate easily with those from the regional network managers.
Today’s Power Control System
Developing an adequate Electric Grid Threat Simulator to train operators of the multiple organizations required to respond to coordinated attacks is not a matter of simply joining the various computer systems that currently run the grids (c.f. Figure 3). As the Pentagon found in trying to integrate computer systems from the different armed forces, predictability declines as the integration tasks become more and more complex. Breakdowns occur that have not been foreseen from historical experiences with the smaller, more linear systems that in the past acted independently.
We participated in a detailed analysis of current and planned technical improvements in the transmission and generation Power Control System of a major regional electricity supplier considered to be a technological leader in the electricity industry. The grid under its management supplies a major urban area of the
The PCS is, in basic concept, really very simple. Since consumption is not known second-to-second (meters are not analyzed for consumption patterns, but are instead used only by the billing department), the computer merely balances the spin of the power generator turbines under its control to keep the AC of the grid as close to 60 Hz as possible. Any less and the computer revs up the RPM’s of the turbine generators; any more and the computer sells the excess power to the regional grid through its brokerage. Problems appear if the frequency of the AC in the transmission grid begins to drop below 59.99997 Hz (five nines) for computers, and below 59.997 Hz (three nines) for electric motors, and then “all hell breaks loose,” to quote an operator from the PCS.
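The balancing rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the vendor's control logic: the function name `balance_step`, the proportional gain, and the dead band (taken from the 59.997 Hz motor threshold quoted above) are all hypothetical.

```python
NOMINAL_HZ = 60.0


def balance_step(frequency_hz, turbine_rpm, gain_rpm_per_hz=50.0, deadband_hz=0.003):
    """Return (new_turbine_rpm, action) for one control step.

    gain_rpm_per_hz and deadband_hz are illustrative assumptions,
    not values taken from any real Power Control System.
    """
    error_hz = NOMINAL_HZ - frequency_hz
    if abs(error_hz) <= deadband_hz:
        # Close enough to 60 Hz: leave the turbines alone.
        return turbine_rpm, "hold"
    if error_hz > 0:
        # Frequency is low: rev up the turbine in proportion to the shortfall.
        return turbine_rpm + gain_rpm_per_hz * error_hz, "speed_up"
    # Frequency is high: keep the turbine speed and sell the excess power.
    return turbine_rpm, "sell_excess"


if __name__ == "__main__":
    for hz in (60.0, 59.9, 60.1):
        print(hz, balance_step(hz, 3600.0))
```

A real controller would also limit ramp rates and coordinate many generators at once; the sketch shows only the three-way decision the paragraph describes.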
The inflow and outflow of electricity is monitored in real-time at all Interconnect sites and at critical junctions of the company’s own transmission lines. The data are transmitted to the PCS every 2 seconds. Simultaneously, real-time costs are computed for all generators used to produce power for the company. A diverse mix of natural gas, coal, oil, steam and nuclear energy fuels these generators. Costs to produce power for all generators and fuel combinations are constantly compared with prices available from suppliers. The PCS automatically selects the cheapest alternative at any time for adding power to the grid.
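The "cheapest alternative" selection can be illustrated with a greedy merit-order dispatch, sketched below. The `Source` record, the `dispatch` function, and all prices and capacities in the usage example are hypothetical; the sketch assumes a single cost per source and ignores transmission limits and start-up costs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Source:
    name: str            # hypothetical label for a generator or outside supplier
    cost_per_mwh: float  # current cost of producing or buying one MWh
    available_mw: float  # capacity the source can add right now


def dispatch(sources, demand_mw):
    """Fill demand from the cheapest sources first.

    Returns (plan, shortfall_mw), where plan is a list of (name, mw) pairs.
    """
    plan = []
    remaining = demand_mw
    for source in sorted(sources, key=lambda s: s.cost_per_mwh):
        if remaining <= 0:
            break
        take = min(source.available_mw, remaining)
        if take > 0:
            plan.append((source.name, take))
            remaining -= take
    return plan, max(remaining, 0.0)


if __name__ == "__main__":
    sources = [
        Source("gas_unit", 40.0, 100.0),
        Source("coal_unit", 30.0, 50.0),
        Source("regional_market", 55.0, 1000.0),
    ]
    print(dispatch(sources, 120.0))
```

Re-running such a selection every time new cost data arrives mirrors the constant comparison of generator costs with supplier prices described above.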
In addition, the PCS manages a one-way, real-time Supervisory Control and Data Acquisition (SCADA) network that sends an additional 230,000 measurement inputs to the PCS every 30 seconds. For emergencies such as hurricanes and tornadoes, the PCS has computerized controls that extend directly into circuit breakers for computer banks and expensive electrical equipment of major business customers.
The problem is that both the software and hardware of the PCS were designed (and often built) decades ago under the assumption that excess power would always be readily available from other utilities on the regional grids. If the computer didn’t have enough generators to meet demand, it would purchase electricity from the regional grid at a fixed price. With de-regulation of the electricity industry, thousands of independent electricity producers are popping up all over the country to sell expensive power at times of high demand. In addition, choke points are popping up at critical and varying junctions of the electricity grid all over the nation.
Most operator tasks are not automated within the PCS, but depend upon the experience and awareness of the people themselves. The operational processes of the staff are procedure-based and well-documented, but are available only in paper manuals. The company does not use new software capabilities available for automating alarms, tracking work tags, and opening and closing circuit breakers remotely. No trend analysis or problem resolution is done computationally, nor is a data historian used (both common practices in other industries). The “technology cycle” for new computer software and hardware (still paired) has historically been 14-16 years, with the latest upgrade the most rapid in the company’s history (1988 to 2000). We found the PCS operators “bracing for a long next few years” and the software vendor lamenting the “incredibly long sales cycle in the power industry.”
Perhaps more critical in today’s world, while there is an excellent and well-practiced plan for restoring services after outages from natural disasters (common), there is still nothing about terrorism (not yet anticipated when the procedures were last updated in 2000). Training has become a special issue: the operational staff is “too busy” and has attended organizational and training sessions only erratically. The SCADA data used for training must be real-time, and cannot be replayed for instructional purposes. No case histories are used. There was no training simulator in this software update cycle, a casualty of budget cuts. It is ironic that the cost to maintain an up-to-date simulator became too high because of the rapidly changing configuration and complexity of the national power grid, and particularly because of the rapidly expanding power input into the company’s grid from independent power producers and customer co-generation facilities as a result of de-regulation.
The major drivers to operational costs of the company are Operations and Maintenance (O&M) of its facilities. Overhauls of generators and reconfigurations and modernizations of its power grid must be scheduled well in advance and coordinated with other regional suppliers in order to be transparent to customers. Software updates must be handled with particular care.
The company upgraded the PCS computer systems in 2001 to a client-server, UNIX architecture, supported by an Oracle database and modern graphical user interfaces (GUIs). However, the networkability of the system still leaves something to be desired. Its Ethernet is just now being upgraded to 100 Mbps, and top management, for security reasons, forbids use of the Internet for communications with the field and its own SCADA systems. The company Intranet is primitive at best, and no Microsoft products are found in the PCS at all (perhaps the last remaining industry for Bill Gates to conquer).
Operators are NOT utilizing many of the new features of the PCS software system. For example, the 2001 software design supports interoperability between the two types of UNIX workstations: one to control interaction between the company and outside power suppliers, and one for control of internal company power distribution. In spite of this feature, operators of one system cannot call up or interact with the other. Operators are trained to operate both systems, and they do rotate from one to the other on a regular schedule, but they are not allowed to let the computers communicate.
Work orders to substations and power linesmen are created on a computer, printed out, and then faxed to the field offices by the PCS operators. These work orders are not tracked further by the PCS, although it has the capability to manage electronic work orders and automatically send e-mails.
Use of a Threat Simulator in the PCS?
Optimization within the PCS is a manual process executed by experienced personnel without much computer help. In our Case Study, expert systems and Artificial Intelligence (AI) technologies for the complex scheduling required for power management “were looked at years ago by IBM. They tried to develop a prototype system. However, IBM declared their process to be too complex, and moved on to easier markets.” This analysis was done in 1985, and the power company still considers it valid. IBM’s opinion is that they tried to develop a prototype of too much of the operations at once, back then. New neural network and data mining technologies should make this a “very doable task in today’s computational world” according to IBM.
It is ironic that the added complexity of the system made the keeping of an accurate computer simulator expendable. That would be like an aerospace company saying that its new planes are too complex to create a training environment for pilots -- other than flying the machine itself.
A Coordinated Terrorist Attack on the Power Grid
A new generation of American engineers and managers must be trained in electricity production and distribution under threat from terrorism. An Electric Grid Threat Simulator for the PCS is required that will train in the complexities introduced by terrorism, combined with the coincident convergence of supply and demand across the electricity grid of
Automated variance detection, combined with “make-it-so” problems-to-solutions mappings, is a non-linear inverse problem, and operators need a simulator to learn how to solve it. The integration of technologies required for this cross-system optimization problem will itself require an unprecedented degree of interdisciplinary collaboration among the various operators of the topology of the grid, from local to regional, national, and international.
In a grid model with hundreds of thousands of failure points, training becomes problematic without proper computer simulators. The Electric Grid Threat Simulator must not be too general. It must focus on critical failures that have specific remedies. The chaining of these events is where the simulator becomes powerful. Each element of a transfer function that covers the regional, national, and local grid topologies can then be transformed into responses. Closure can then be computed. Global behavior is then determined from the synthesis of the component models. Put directly: what are the threats, and what are the failure points? These must be determined through what we call a “Learning Harness” wrapped over the topological models of the various scales of the grid.
Consider a coordinated attack on the local components of the power grid. In order, the attackers hit the microwave communications that carry SCADA data, then a power generator and a transformer substation, all within several minutes of each other. The cascading failures result in escalating problems throughout the local grid that do not at first affect the national high voltage grid (c.f. Peerenboom, 2001).
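How such a cascade chains from one component to the next can be illustrated with a toy forward model. The sketch below is not a power-flow calculation: it assumes, purely for illustration, that a failed node's load is split equally among its surviving neighbors and that a node fails once its load exceeds its capacity. The function name, node names, loads, and capacities are all hypothetical.

```python
from collections import deque


def simulate_cascade(neighbors, load, capacity, attacked):
    """Return the order in which nodes fail after the attacked nodes go down.

    neighbors: dict mapping node -> list of adjacent nodes
    load, capacity: dicts mapping node -> number (same arbitrary unit)
    attacked: iterable of nodes taken out directly, in attack order
    """
    load = dict(load)          # work on a copy so the caller's data is untouched
    failed = set()
    failure_order = []
    queue = deque(attacked)
    while queue:
        node = queue.popleft()
        if node in failed:
            continue
        failed.add(node)
        failure_order.append(node)
        survivors = [n for n in neighbors[node] if n not in failed]
        if not survivors:
            continue           # nowhere to shift the load; it is simply lost
        share = load[node] / len(survivors)
        for n in survivors:
            load[n] += share
            if load[n] > capacity[n]:
                queue.append(n)  # overloaded neighbor fails on a later step
    return failure_order


if __name__ == "__main__":
    neighbors = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
    load = {"A": 10.0, "B": 8.0, "C": 5.0, "D": 2.0}
    capacity = {"A": 12.0, "B": 10.0, "C": 14.0, "D": 30.0}
    print(simulate_cascade(neighbors, load, capacity, ["A"]))
```

Even this crude model shows the property that matters for training: a single well-chosen hit can propagate several hops before a component with enough spare capacity stops it.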
now, however, that this attack is followed a few minutes later by a coordinated
attack on the high voltage regional transmission grid. The first hit causes
Figure 4. Simulated terrorist attack on the Northeastern Power Grid. Note how quickly the problem spreads from
Design of an Electric Grid Threat Simulator
In general, few PCS simulation environments exist to train new engineers and managers about how to respond to crisis scenarios of any kind. The case study revealed that fault detection and tracking of what has failed, where, and when, remains dependent upon operator experience and “instinct”. We hope that the incentive for change got a significant boost on 9/11. There is no question that the PCS can be better supported by computer intelligence in the form of an Electric Grid Threat Simulator for war gaming.
As we said, we believe a Learning Harness is required in order to build such an Electric Grid Threat Simulation environment. The Learning Harness represents first a fundamental mapping of the business, security, environmental, and engineering processes and activities required to maintain and operate the grid under attack, and then a reinforcement learning feedback loop to optimize decisions across systems (Figure 5). This explicitly requires a concerted, confidential, unprecedented collaboration of all involved parties. These processes must be known in enough detail to develop computerized variance detection and to contribute weights to the “credit assignment” problem of what to do to anticipate and fix the problems caused by terrorists. Known solutions to problems are kept in a best practices data historian. The system must learn from mistakes by tracking performance metrics of previous actions, in much the manner of a chess, backgammon or checkers program. A key technology we use is the Suitability Matrix(sm). This is a linked set of matrix representations (a set of spreadsheets) that use generalized weights as the values of the cells. It maps the importance of an attribute (a problem) to possible decisions (a solution). These matrices are "populated" using reinforcement learning, a type of dynamic programming, which optimizes decision-making under uncertainty and over time (4D learning). Data gathering for such a system will provide the following:
- Eliminate the "wish I could have seen it coming" through multiple scenario planning
- Estimate risk on all decisions
- Identify solutions quickly
- Eliminate latency in getting the right actions to the right people
- Verify that actions are being executed properly in the field
Figure 5. The Electric Grid Threat Simulator must connect software applications that constantly re-compute each bullet indicated above – this framework has already been enacted for the Oil, Internet, and Aerospace industries (Bertsekas and Tsitsiklis, 1996).
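The idea of a matrix of weights that maps problems to decisions and is populated from feedback can be sketched as follows. This is a deliberately simplified stand-in, not the Suitability Matrix itself: it uses one table rather than a linked set, and a basic incremental value update rather than full dynamic programming. The class name, method names, learning rate, and the problem and decision labels are all hypothetical.

```python
class SuitabilityMatrixSketch:
    """One problems-by-decisions table of learned weights (illustrative only)."""

    def __init__(self, problems, decisions, learning_rate=0.5):
        self.decisions = list(decisions)
        self.learning_rate = learning_rate  # assumed value, not from the source
        # Every cell starts at zero: no decision is preferred before any feedback.
        self.weights = {p: {d: 0.0 for d in self.decisions} for p in problems}

    def best_decision(self, problem):
        """Return the decision with the highest weight for this problem."""
        row = self.weights[problem]
        return max(self.decisions, key=lambda d: row[d])

    def update(self, problem, decision, reward):
        """Move the cell toward the observed reward (a performance metric)."""
        old = self.weights[problem][decision]
        self.weights[problem][decision] = old + self.learning_rate * (reward - old)


if __name__ == "__main__":
    matrix = SuitabilityMatrixSketch(
        problems=["substation_down"],
        decisions=["reroute_power", "shed_load"],
    )
    matrix.update("substation_down", "shed_load", reward=1.0)
    matrix.update("substation_down", "reroute_power", reward=2.0)
    print(matrix.best_decision("substation_down"))
```

Each simulated incident supplies one reward, so repeated scenarios gradually shift the weights toward the responses that scored best on the chosen performance metric.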
The key foundation of our Threat Simulator is an adaptive feedback control system, which involves the solving of implicit and explicit inverse algorithms in a controller to minimize error and arrive at an optimal solution (Bertsekas and Tsitsiklis, 1996). We adopt a mixture of AI, operations research, and systems engineering to build our controller. AI works well with discrete, richly structured, and nonlinear problems, and control theory offers an overall framework for solving the linear and nonlinear components of the system-wide problem (Werbos, 1999, 2001). Certainty factors or probabilities that represent the ranking of alternatives can be adapted by the Electric Grid Threat Simulator learning system over time (e.g., Neuneier, 1995; Werbos, 1998, 1999). This online learning is key to a successful Electric Grid Threat Simulator. In sum, our Threat Simulator implements a unified framework for generating corrective actions that individuals and automated systems in the organization must take to align the business to reality in the face of multiple threats.
The Learning Harness uses metrics to gauge feedback and train for best responses. It uses a discrete forward model to compute event propagation. This is the same problem encountered on the Internet, where management programs do just that to automatically reroute message traffic when a router or a series of routers fails. The use of Codebooks within Internet fault detection software (Yemini et al, 1996, 1997, 2001) is a good example (see www.smarts.com). Causal analysis determines how each problem propagates through the topology, then “cost-to-go” simulation within a reinforcement-learning framework is used to determine automatic corrective actions. Update the Learning Harness several times with varying simulated disasters, and it learns the correct responses. Priorities in response are then developed by the system depending upon a damage metric (Anderson et al, 1996, 1998a,b,c, 1999, 2000, 2001a,b, 2002).
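The codebook idea can be illustrated with a small sketch: each known problem is assigned a code, a vector saying which symptoms (alarms) it is expected to cause, and an observed alarm pattern is decoded to the problem whose code is nearest in Hamming distance. The problem names, symptom columns, and codes below are invented for illustration and are not taken from any real codebook or product.

```python
# Columns (symptoms), in order. All hypothetical:
# 0: SCADA telemetry lost, 1: operator console timeout,
# 2: frequency dip, 3: breaker tripped, 4: transformer over-temperature
CODEBOOK = {
    "scada_link_down":  (1, 1, 0, 0, 0),
    "generator_trip":   (0, 0, 1, 1, 0),
    "substation_fault": (0, 1, 0, 1, 1),
}


def hamming(a, b):
    """Number of positions at which two equal-length vectors differ."""
    return sum(x != y for x, y in zip(a, b))


def diagnose(codebook, observed):
    """Return the problem whose code is closest to the observed symptoms."""
    return min(codebook, key=lambda problem: hamming(codebook[problem], observed))


if __name__ == "__main__":
    # One expected alarm is missing, yet the nearest code still identifies the cause.
    print(diagnose(CODEBOOK, (1, 0, 0, 0, 0)))
```

Choosing codes that are far apart from one another is what lets the match tolerate lost or spurious alarms, which matters when the communications carrying the alarms are themselves under attack.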
Anderson, R.N., Boulanger, A., Bagdonas, E., He, W., and Xu, L., Method for Identifying Subsurface Fluid Migration and Drainage Pathways in and Among Oil and Gas Reservoirs Using 3-D and 4-D Seismic Imaging, U.S. Patent 5,586,082, 1996.
Anderson, R.N., et al, Quantitative Tools link Portfolio Management with use of Technology, Oil Gas Journal, Nov. 30, p. 48, 1998a.
Anderson, R.N., Oil Production in the 21st Century, Sci. Am., 278, p. 86-91, 1998c.
Anderson, R.N., Esser, W., Energy Company as Advanced Digital Enterprise, American Oil & Gas Reporter, Jan. 2001a.
Anderson, R.N., Boulanger, A., He, W., Xu, L., Method and System for Automated Support of Real-Time 4D Business Decisions for the Upstream Petroleum Industry, U.S. Patent, applied for, 2001b.
Anderson, R.N., Boulanger, A., Mello, U., He, W., Wiggins, W., and Xu, L., 4-D Seismic Reservoir Simulation and Characterization Method and System, U.S. Patent, applied for, 2002.
Bertsekas, D.P., Tsitsiklis, J. N., Neuro-Dynamic Programming, Athena Scientific, 1996.
Neuneier, R., Optimal Strategies with density-Estimating Neural Networks, ICANN 95, Paris, 1995.
Peerenboom, J., Infrastructure interdependencies: Overview of concepts and terminology, Argonne National Laboratory, 2001.
Werbos, P.J., Elastic Fuzzy Logic System, U.S. Patent 5,751,915, 1998.
Werbos, P.J., Maximizing Long-Term Gas Industry Profits in Two Minutes using Neural Network Methods, IEEE Trans. on Systems, Man, and Cybernetics, Vol. 19, No. 2, 315-333, 1989; U.S. Patent 5,924,085, 1999.
Werbos, P.J., 3-Brain Architecture for an Intelligent decision and Control System, U.S. Patent 6,169,981, 2001.
Yemini, S., Kliger, S., Mozes, E., Yemini, Y., and Ohsie, D., High Speed and Robust Event Correlation. IEEE Communications, May, 1996.
Yemini, Y., Yemini, S., Kliger, S., Apparatus and Method for Analyzing and Correlating Events in a System using a Causality Matrix, U.S. Patent 5,661,668, 1997.
Yemini, Y., Yemini, S., Kliger, S., Apparatus and Method for Event Correlation and Problem Reporting, U. S. Patent 6,249,755, 2001.