ThreatSim: Securing Wattage When It’s Needed
Roger N. Anderson & Albert Boulanger
Lamont-Doherty Earth Observatory
New York, New York 10027
The terrorist events of 9/11, combined with the power shortages we are all experiencing across the country this summer, suggest that we as a country are not well prepared to meet what is sure to be a growing threat to our ability to deliver electricity where it’s needed, when it’s needed. Our power infrastructure delivers the electricity that our entire economy depends upon. Take out the grid for more than about 12 hours (the operational maximum of most back-up power generation systems) and you shut down the Internet and stop all bank transfers and credit card and cash machine transactions. The pumps and compressors needed to transport drinking water, fill your car or truck with gasoline and diesel, and deliver natural gas through pipelines and storage facilities no longer work. Stop lights and other key components of the transportation system, such as railroad and subway power and signaling, fail. Most manufacturing comes to a standstill (c.f. NY Times,

Figure 1. Thermal image of
power consumption in
Is it realistic to think that a coordinated attack on the national
power grid could succeed in shutting down electricity across the country for a
substantial time? We believe the threat is real, and that training in response
and prevention should be one of the highest priorities of the new Homeland Defense
Department.
What does it take to build an accurate Electric Grid Threat Simulator? First and foremost, the topologies of the regional high voltage grids managed by Regional Transmission Operators (RTO’s) and Independent System Operators (ISO’s) must be combined on the computer with local power grids, and more generally, distribution networks managed by utilities such as ConEd in New York City, Keyspan on Long Island, PSE&G in New Jersey, and hundreds of other public and private generation and distribution companies across the country (Figure 2).

Figure 2. Northeastern high voltage electric grid on left
is connected to local low voltage grids through vulnerable switching and
transformer sub-stations (triangles at right).
An Electricity Threat Simulator
The burden of developing efficient electric grid threat simulators
for war gaming must be shared by city, state and federal governments, the
national labs, academia and electrical research institutions, but also by the
power distribution, generation, and service companies themselves. It is they who
will use the simulators to train their operators. Below we review the current
state of affairs in Power Control Systems (PCS’s) in this country and discuss
the need for anti-terrorist training and simulation software that will allow us
to determine the true threats and appropriate responses to sustained,
coordinated attacks on the electric grid.
Of particular concern are the PCS’s that control the production and distribution of electricity throughout the country. Below, we show that they will have to be much more sophisticated and integrated than at present. It will take a whole new generation of technologies to unite the topologies of the electricity grid on its many scales. Luckily, much of this systems integration technology has been developed recently by the aerospace, automotive and manufacturing industries. The task at hand is its adoption by an electricity industry that is historically a late adopter of new technologies. That, in turn, will require experts and expertise imported into a corporate and governmental regulatory culture bred out of the electrical utilities of our grandparents: one that is notorious for being insular and slow to respond to technological change (not late, but the very last technological adopters).

9/11 has emphasized the
need to fix the Electric Grid Threat Simulation and modeling system before the next attack. With this need in mind, let’s review the
present state-of-preparedness of Power Control Systems, and compare them to
more modern, integrated command-and-control systems in the military and from
other industries. We argue that there is
little doubt that fixes must come quickly or our very economic stability may be
at risk.
Figure 3. Seamless communications among Power Control Systems of the RTO, ISO and local utilities will be required to redistribute electricity in case of future terrorist attacks against the grid. Pictured here is the Connecticut Valley Electric Exchange. Its computers do not communicate easily with those from the regional network managers.
Today’s Power Control System
Developing an adequate Electric Grid Threat
Simulator to train operators of the multiple organizations required to respond
to coordinated attacks is not a matter of simply joining the various computer
systems that currently run the grids (c.f. Figure 3). As the Pentagon found in
trying to integrate computer systems from the different armed forces,
predictability declines as the integration tasks become more and more complex.
Breakdowns occur that have not been foreseen from historical experiences with
the smaller, more linear systems that in the past acted independently.
We participated in a detailed
analysis of current and planned technical improvements in the transmission and
generation Power Control System of a major regional electricity supplier
considered to be a technological leader in the electricity industry. The grid under its management supplies a
major urban area of the
The PCS is, in basic concept, quite simple. Since consumption is not known second-to-second (meters are not analyzed for consumption patterns, but are instead used only by the billing department), the computer merely balances the spin of the power generator turbines under its control to keep the AC frequency of the grid as close to 60 Hz as possible. Any less and the computer revs up the RPMs of the turbine generators; any more and the computer sells the excess power to the regional grid through its brokerage. Problems appear if the frequency of the AC in the transmission grid begins to drop below 59.99997 Hz (five nines) for computers, and below 59.997 Hz (three nines) for electric motors, and then “all hell breaks loose,” to quote an operator from the PCS.
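The balancing rule described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the vendor's control algorithm: the function name `balance_step`, the proportional gain, and the deadband value are hypothetical choices made for the example.

```python
NOMINAL_HZ = 60.0  # target AC frequency of the grid


def balance_step(measured_hz, turbine_rpm, gain_rpm_per_hz=50.0, deadband_hz=0.00003):
    """Return (new_rpm, action) for one control step.

    Hypothetical proportional rule: if the frequency is below nominal,
    rev the turbines up in proportion to the shortfall; if it is above
    nominal, hold the RPM and flag the excess power for sale to the
    regional grid. The gain and deadband are illustrative assumptions.
    """
    error_hz = NOMINAL_HZ - measured_hz
    if abs(error_hz) <= deadband_hz:
        return turbine_rpm, "hold"
    if error_hz > 0:
        return turbine_rpm + gain_rpm_per_hz * error_hz, "rev_up"
    return turbine_rpm, "sell_excess"
```

A real controller would also have to account for generator inertia, ramp limits and multiple units; the sketch shows only the direction of each correction.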
The inflow and outflow of electricity is monitored in real-time at
all Interconnect sites and at critical junctions of the company’s own
transmission lines. The data are
transmitted to the PCS every 2 seconds. Simultaneously, real-time costs are
computed for all generators used to produce power for the company. A diverse mix of natural gas, coal, oil,
steam and nuclear energy fuels these generators. Costs to produce power for all generators and
fuel combinations are constantly compared with prices available from suppliers.
The PCS automatically selects the cheapest alternative at any time for adding
power to the grid.
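The cost comparison in this paragraph amounts to taking a minimum over every available source. A minimal sketch follows, assuming costs and prices are already expressed in the same unit (dollars per MWh here); the function name and the source names used with it are hypothetical.

```python
def cheapest_source(generator_costs, market_prices):
    """Pick the lowest-cost option among own generators and outside suppliers.

    Both arguments map a source name to a cost in dollars per MWh.
    All names and prices are illustrative placeholders, not company data.
    Returns (action, source_name, cost).
    """
    options = [(cost, "generate", name) for name, cost in generator_costs.items()]
    options += [(price, "purchase", name) for name, price in market_prices.items()]
    if not options:
        raise ValueError("no power sources available")
    cost, action, name = min(options)  # tuples compare on cost first
    return action, name, cost
```

For example, with own units at 42.0 and 35.5 and a supplier offering 38.0, the function returns the 35.5 unit; if the supplier's price drops to 30.0, it returns the purchase instead.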
In addition, the PCS manages a one-way, real-time Supervisory Control and Data Acquisition (SCADA) network that sends an additional 230,000 measurement inputs to the PCS every 30 seconds. For emergencies such as hurricanes and tornadoes, the PCS has computerized controls that extend directly into circuit breakers for computer banks and expensive electrical equipment of major business customers.
The problem is that both the software and
hardware of the PCS were designed (and often built) decades ago under the
assumption that excess power would always be readily available from other
utilities on the regional grids. If the
computer didn’t have enough generators to meet demand, it would purchase
electricity from the regional grid at a fixed price. With de-regulation of the
electricity industry, thousands of independent electricity producers are
popping up all over the country to sell expensive power at times of high
demand. In addition, choke points are
popping up at critical and varying junctions of the electricity grid all over
the nation.
Human-in-the-Loop
Most operator tasks are not automated within the PCS, but depend
upon the experience and awareness of the people themselves. The operational processes of the staff are
procedure-based and well-documented, but are available only in paper manuals.
The company does not use new software capabilities available for automating
alarms, work-tag tracking, and the opening and closing of circuit breakers
remotely. No trend analyses or problem resolution is done computationally, nor
is a data historian used (common practices in other industries). The
“technology cycle” for new computer software and hardware (still paired) has
historically been 14-16 years, with the latest upgrade the most rapid in
the company’s history (1988 to 2000). We found the PCS operators “bracing for a
long next few years” and the software vendor lamenting the “incredibly long
sales cycle in the power industry.”
Perhaps more critical in today’s world, while there is an
excellent and well practiced plan for restoration of services from natural
disaster outages (common), there is still nothing about terrorism (not yet
anticipated when the procedures were last updated in 2000). Training has become a special issue: the operational staff is “too busy” and has attended organizational and training sessions only erratically. The SCADA data used for training must be real-time and cannot be replayed for instructional purposes. No case histories are used. There was no
training simulator in this software update cycle, a casualty of budget cuts. It
is ironic that the cost to maintain an up-to-date simulator became too high
because of the rapidly changing configuration and complexity of the national
power grid, and particularly of the rapidly expanding power input into the
company’s grid from independent power producers and customer co-generation
facilities as the result of de-regulation.
The major drivers to operational
costs of the company are Operations and Maintenance (O&M) of its
facilities. Overhauls of generators and reconfigurations and modernizations of
its power grid must be scheduled well in advance and coordinated with other
regional suppliers in order to be transparent to customers. Software updates
must be handled with particular care.
The company upgraded the PCS computer
systems in 2001 to a client-server, UNIX architecture, supported by an Oracle
database, and modern Graphical User Interfaces (GUIs). However, the networkability of the system
still leaves something to be desired. Its Ethernet is just now being upgraded
to 100 Mbps, and top management, for security reasons, forbids use of the Internet
for communications with the field and its own SCADA systems. The company
Intranet is primitive at best, and no Microsoft products are found in the PCS at
all (perhaps the last remaining industry for Bill Gates to conquer).
Operators are NOT utilizing many of
the new features of the PCS software system. For example, the 2001 software
design supports interoperability between the two types of UNIX workstations:
one to control interaction between the company and outside power suppliers, and
one for control of internal company power distribution. In spite of this feature, operators of one
system cannot call up or interact with the other. Operators are trained to operate both
systems, and they do rotate from one to the other on a regular schedule, but
they are not allowed to let the computers communicate.
Work orders to substations and power linemen are created on a
computer, printed out, and then faxed to the field offices by the PCS
operators. These work orders are not
tracked further by the PCS, although it has the capability to manage electronic
work orders and automatically send e-mails.
Use of a Threat Simulator in the PCS?
Optimization within the PCS is a manual process executed by experienced personnel without much computer help. In our Case Study, expert systems and Artificial Intelligence (AI) technologies for the complex scheduling required for power management “were looked at years ago by IBM. They tried to develop a prototype system. However, IBM declared their process to be too complex, and moved on to easier markets.” This analysis was done in 1985, and the power company still considers it valid. IBM’s opinion is that they tried to develop a prototype of too much of the operations at once, back then. New neural network and data mining technologies should make this a “very doable task in today’s computational world” according to IBM.
It is ironic that the added complexity of the system made the
keeping of an accurate computer simulator expendable. That would be like an aerospace company
saying that its new planes are too complex to create a training environment for
pilots -- other than flying the machine itself.
A Coordinated Terrorist Attack on the Power Grid
A new generation of American
engineers and managers must be trained in electricity production and
distribution under threat from terrorism.
An Electric Grid Threat Simulator for the PCS is required that will
train in the complexities introduced by terrorism, combined with the coincident
convergence of supply and demand across the electricity grid of
Automated variance detection, combined with “make-it-so” problems-to-solutions mappings, is a non-linear inverse problem, and operators need a simulator to learn how to solve it. The integration of technologies required for this cross-system optimization problem will, in and of itself, require an unprecedented degree of interdisciplinary collaboration among the various operators of the grid topology, from local to regional, national and international.
In a grid model with hundreds of thousands of failure points, training becomes problematic without proper computer simulators. The Electric Grid Threat Simulator must not be too general. It must focus on critical failures that have specific remedies. The chaining of these events is where the simulator becomes powerful. Each element of a transfer function that covers the regional, national, and local grid topologies can then be transformed into responses. Closure can then be computed. Global behavior is then determined from the synthesis of the component models. Put directly: what are the threats, and what are the failure points? These must be determined through what we call a “Learning Harness” wrapped over the topological models of the various scales of the grid.
Consider a coordinated attack on the local components of the power grid. In order, the attackers strike the microwave communications that carry SCADA data, then a power generator and a transformer substation, all within several minutes of each other. The cascading failures result in escalating problems throughout the local grid that don’t at first affect the national high voltage grid (c.f., Peerenboom, 2001).
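A cascade of this kind can be sketched as a breadth-first walk over a dependency graph. The sketch below is a deliberately simplified assumption, not a power-flow model: it treats every dependency as having no redundancy, and the component names in the usage note are hypothetical.

```python
from collections import deque


def propagate_failures(feeds, attacked):
    """Breadth-first spread of outages through a dependency graph.

    feeds[x] lists the components that lose service when x fails.
    Simplifying assumption: no redundancy, so any failed upstream
    component takes down everything it feeds.
    Returns a dict mapping each failed component to the step at which
    it fails (attacked components fail at step 0).
    """
    failed_at = {node: 0 for node in attacked}
    queue = deque(attacked)
    while queue:
        node = queue.popleft()
        for downstream in feeds.get(node, []):
            if downstream not in failed_at:
                failed_at[downstream] = failed_at[node] + 1
                queue.append(downstream)
    return failed_at
```

Given a toy topology in which a substation feeds two feeders and one feeder supplies water pumps, attacking the substation marks the feeders as failing at step 1 and the pumps at step 2. A realistic simulator would replace the no-redundancy rule with load-flow and protection logic.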
Suppose
now, however, that this attack is followed a few minutes later by a coordinated
attack on the high voltage regional transmission grid. The first hit causes
problems in

Figure
4. Simulated terrorist attack on the Northeastern Power Grid. Note how quickly
the problem spreads from
Design of an Electric Grid Threat Simulator
In general, few PCS simulation environments exist to train new engineers and managers about how to respond to crisis scenarios of any kind. The case study revealed that fault detection and tracking of what has failed, where, and when, remains dependent upon operator experience and “instinct”. We hope that the incentive for change got a significant boost on 9/11. There is no question that the PCS can be better supported by computer intelligence in the form of an Electric Grid Threat Simulator for War Gaming.
As we said, we believe a Learning Harness is required in order to build such an Electric Grid Threat Simulation environment. The Learning Harness represents first a fundamental mapping of the business, security, environmental, and engineering processes and activities required to maintain and operate the grid under attack, and then a reinforcement learning feedback loop to optimize decisions across systems (Figure 5). This explicitly requires a concerted, confidential, unprecedented collaboration of all involved parties. These processes must be known in enough detail to develop computerized variance detection and contribute weights to the “credit assignment” problem of what to do to anticipate and fix the problems caused by terrorists. Known solutions to problems are kept in a best practices data historian. The system must learn from mistakes by tracking performance metrics of previous actions, in much the manner of a chess, backgammon or checkers program. A key technology we use is the Suitability Matrix(sm). This is a linked set of matrix representations (a set of spreadsheets) that use generalized weights as the values of the cells. It maps the importance of an attribute (a problem) to possible decisions (a solution). These matrices are “populated” using reinforcement learning, a type of dynamic programming, which optimizes decision-making under uncertainty and over time (4D learning). Data gathering for such a system will make it possible to:
- Eliminate the "wish I could have seen it coming" through multiple scenario planning
- Estimate risk on all decisions
- Identify solutions quickly
- Eliminate latency in getting the right actions to the right people
- Verify that actions are being executed properly in the field
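The idea of a weight matrix that maps problems to decisions and is populated from feedback can be illustrated with a toy example. This is not the authors' Suitability Matrix implementation: the class below is a hypothetical stand-in that uses a simple incremental-average update in place of a full reinforcement-learning procedure, and the problem and decision names are invented.

```python
class SuitabilityMatrix:
    """Toy problem-by-decision weight table (illustrative assumption only)."""

    def __init__(self, problems, decisions, initial_weight=0.0):
        # One row per problem, one cell per candidate decision.
        self.weights = {p: {d: initial_weight for d in decisions} for p in problems}

    def update(self, problem, decision, reward, learning_rate=0.5):
        """Move the cell's weight toward the observed reward."""
        old = self.weights[problem][decision]
        self.weights[problem][decision] = old + learning_rate * (reward - old)

    def best_decision(self, problem):
        """Return the decision with the highest learned weight for a problem."""
        row = self.weights[problem]
        return max(row, key=row.get)
```

After a simulated drill in which rerouting earns a reward of 1.0 and load shedding earns 0.2 for the same outage, `best_decision` returns the rerouting option. Repeating such updates over many simulated scenarios is what "populating" the matrix would mean in this simplified picture.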

Figure 5. The Electric Grid Threat Simulator must connect software applications that constantly re-compute each bullet indicated above; this framework has already been enacted for the Oil, Internet, and Aerospace industries (Bertsekas and Tsitsiklis, 1996).
The key foundation of our Threat Simulator is an adaptive feedback control system, which involves solving implicit and explicit inverse algorithms in a controller to minimize error and arrive at an optimal solution (Bertsekas and Tsitsiklis, 1996). We adopt a mixture of AI, operations research, and systems engineering to build our controller. AI works well with discrete, richly structured, and nonlinear problems, and control theory offers an overall framework for solving the linear and nonlinear components of the system-wide problem (Werbos, 1999, 2001). Certainty factors or probabilities that represent the ranking of alternatives can be adapted by the Electric Grid Threat Simulator learning system over time (e.g. Neuneier, 1995, Werbos, 1998, 1999). This online learning is key to a successful Electric Grid Threat Simulator. In sum, our Threat Simulator implements a unified framework for generating corrective actions that individuals and automated systems in the organization must take to align the business to reality in the face of multiple threats.
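The dynamic-programming idea behind such a controller can be illustrated with value iteration, which computes for each state the minimum total cost of reaching a goal. The sketch below is a textbook-style example on a tiny deterministic problem, offered as an assumption about how the optimization could be framed; the state names, actions and costs are hypothetical, and costs are assumed to be positive.

```python
def cost_to_go(transitions, terminal, discount=1.0, sweeps=100):
    """Value iteration for a small deterministic decision problem.

    transitions[state][action] = (next_state, immediate_cost), costs > 0.
    Returns (J, policy): J[state] is the minimum total cost to reach the
    terminal state, and policy[state] is the action that achieves it.
    """
    states = set(transitions) | {terminal}
    states |= {nxt for acts in transitions.values() for nxt, _ in acts.values()}
    J = {s: 0.0 for s in states}  # start from zero and improve by sweeping
    for _ in range(sweeps):
        for s, actions in transitions.items():
            J[s] = min(cost + discount * J[nxt] for nxt, cost in actions.values())
    policy = {
        s: min(actions, key=lambda a: actions[a][1] + discount * J[actions[a][0]])
        for s, actions in transitions.items()
    }
    return J, policy
```

With a "substation_down" state that can either be rerouted (cost 2, leading to a "degraded" state that costs 3 to repair) or repaired immediately (cost 10), the computed policy prefers rerouting followed by repair.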
The Learning Harness uses metrics to gauge feedback and train for best responses. It uses a discrete forward model to compute event propagation. This is the same problem encountered on the Internet, and there are management programs that do just that, automatically rerouting message traffic in case of a failure in a router or a series of routers. The use of Codebooks within Internet fault detection software (Yemini et al, 1996, 1997, 2001) is a good example (see www.smarts.com). Causal analysis determines how each problem propagates through the topology, then “cost-to-go” simulation within a reinforcement-learning framework is used to determine automatic corrective actions. Update the Learning Harness several times with varying simulated disasters, and it learns the correct responses. Priorities in response are then developed by the system depending upon a damage metric (Anderson et al, 1996, 1998a,b,c, 1999, 2000, 2001a,b, 2002).
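The codebook idea can be sketched as matching the set of observed alarms against a table of expected symptom signatures, one per problem. The example below is a simplified reading of that approach, not the commercial product's algorithm; the problem names and symptoms are hypothetical.

```python
def decode_problem(codebook, observed_symptoms):
    """Return the problem whose symptom signature is closest to the observation.

    codebook maps each problem to the set of symptoms (alarms) it is
    expected to cause. Distance is the number of symptoms that differ
    (Hamming distance), so a few lost or spurious alarms need not
    prevent a match. Ties are broken alphabetically by problem name.
    """
    observed = set(observed_symptoms)

    def distance(problem):
        return len(codebook[problem] ^ observed)  # symmetric difference

    return min(sorted(codebook), key=distance)
```

If a transformer fault is expected to raise a voltage sag, a breaker trip and a temperature alarm, then observing only the first two still decodes to the transformer fault, because that signature differs from the observation by a single symptom.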
Summary
The
electrical grid of
Bibliography:
Anderson, R.N., Boulanger, A., Bagdonas, E., He, W., and Xu, L., Method for Identifying Subsurface Fluid Migration and Drainage Pathways in and Among Oil and Gas Reservoirs Using 3-D and 4-D Seismic Imaging, U.S. Patent 5,586,082, 1996.
Anderson, R.N., et al, Quantitative Tools link Portfolio Management with use of Technology, Oil Gas Journal, Nov. 30, p. 48, 1998a.
Anderson, R.N., Oil Production in the 21st Century, Sci. Am., 278, p. 86-91, 1998c.
Anderson, R.N., Esser, W., Energy Company as Advanced Digital Enterprise, American Oil & Gas Reporter, Jan. 2001a.
Anderson, R.N., Boulanger, A., He, W., Xu, L., Method and System for Automated Support of Real-Time 4D Business Decisions for the Upstream Petroleum Industry, U.S. Patent, applied for, 2001b.
Anderson, R.N., Boulanger, A., Mello, U., He, W., Wiggins, W., and Xu, L., 4-D Seismic Reservoir Simulation and Characterization Method and System, U.S. Patent, applied for, 2002.
Bertsekas, D.P., Tsitsiklis, J. N., Neuro-Dynamic Programming, Athena Scientific, 1996.
Neuneier, R., Optimal Strategies with density-Estimating Neural Networks, ICANN 95, Paris, 1995.
Peerenboom, J., Infrastructure interdependencies: Overview of concepts and terminology, Argonne National Laboratory, 2001.
Werbos, P.J., Elastic Fuzzy Logic System, U.S. Patent 5,751,915, 1998.
Werbos, P.J., Maximizing Long-Term Gas Industry Profits in Two Minutes Using Neural Network Methods, IEEE Trans. on Systems, Man, and Cybernetics, Vol. 19, No. 2, 315-333, 1989.
Werbos, P.J., U.S. Patent 5,924,085, 1999.
Werbos, P.J., 3-Brain Architecture for an Intelligent decision and Control System, U.S. Patent 6,169,981, 2001.
Yemini, S., Kliger, S., Mozes, E., Yemini, Y., and Ohsie, D., High Speed and Robust Event Correlation. IEEE Communications, May, 1996.
Yemini, Y., Yemini, S., Kliger, S., Apparatus and Method for Analyzing and Correlating Events in a System using a Causality Matrix, U. S. Patent 5,661,668, 1997.
Yemini, Y., Yemini, S., Kliger, S., Apparatus and Method for Event Correlation and Problem Reporting, U. S. Patent 6,249,755, 2001.

