Three Big Band-Aids

Gay Gordon-Byrne | Aug 25, 2011

Self-healing, fault-tolerant, and redundant: these three terms are deeply deceptive marketing terms intended to hide the very real problems of electronic equipment in the field. The fact is that electronics break a lot. If they didn't break, networks wouldn't need to be self-healing, and SCADA systems wouldn't need to be fault tolerant or fully redundant. That these marketing concepts even exist is testimony to the weaknesses of the equipment. No one should seek these "features" in their equipment choices. The goal should be to deploy equipment that is so solid that equipment failure is a non-issue.

Self-Healing is a networking concept behind many designs intended to avoid loss of signal when a router or other relay device is not available to do its job. This works to get the data to the destination, but does not repair or replace the out of service node. Eventually the broken device has to be repaired and returned to service. Repair of each communications node, large or small, will require a technician and probably a "truck roll". The more inaccessible the locations (those atop power poles or radio towers), the more costly the repair.

Fault-Tolerant is a synonym for Redundant. A fault-tolerant system has additional components intended to take over in the event of hardware failure. As more and more applications are developed that require 7 x 24 availability, more and more equipment is being purchased with redundant parts. In these cases, high availability is being achieved by buying increasingly large band-aids that mitigate system downtime without improving reliability. Any failure in a fault-tolerant system still requires repair or replacement, or the system is no longer buffered against failure. If the equipment is located in a data center, the IT department is paying the vendor a hefty premium for maintenance services, on the order of 12% to 15% of original cost each year. If the equipment is located in the field, the O&M department will be dispatching a truck.

Truck rolls are extremely costly. NARUC estimated for purposes of discussion that each truck roll costs a utility $275 without considering the cost of parts. The more redundant parts that fail, the more costly the system. We all understand that having a spare tire in the trunk is great backup, but once the spare is in use one still has to replace the spare. We would strongly prefer not to have a flat. Everyone considering deployment of self-healing or redundant systems must take into account that these systems do not physically repair themselves, and must budget for repair costs just as if there were no fault-tolerant parts.
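The spare-tire arithmetic can be sketched in a few lines of Python. Only the $275 per truck roll comes from the NARUC estimate above. The fleet size, the 3% annual part failure rate, the function name, and the assumption that every failed part eventually needs its own visit are hypothetical, chosen purely for illustration.

```python
TRUCK_ROLL_COST = 275.0  # NARUC discussion estimate, parts not included


def annual_truck_roll_cost(units, parts_per_unit, annual_failure_rate):
    """Expected yearly truck-roll cost, assuming one visit per failed part.

    All three inputs are hypothetical planning numbers, not measured data.
    """
    expected_failures = units * parts_per_unit * annual_failure_rate
    return expected_failures * TRUCK_ROLL_COST


# Hypothetical fleet of 1,000 field devices, 3% of parts failing per year.
single = annual_truck_roll_cost(1000, 1, 0.03)     # no redundant part
redundant = annual_truck_roll_cost(1000, 2, 0.03)  # one redundant part each

print(f"single:    ${single:,.0f} per year")
print(f"redundant: ${redundant:,.0f} per year")
```

Under these assumptions the redundant design keeps the service running through a failure but has twice as many parts to fail, so the repair budget grows rather than shrinks.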

Locating equipment that is suitably solid is a tremendous challenge at this time. There are no "Consumer Reports" to guide choices and no central clearinghouse (yet) for utilities to share information. There are efforts afoot to create these essential collaborations using private resources, but in the meantime utilities are on their own to make good decisions. The best advice we can offer at this time is for utilities to evaluate all their current equipment as to the current failure rate of the hardware, and then continue to make the same calculations for every product in test or deployment to compare them on the basis of hardware failure rate. Even very small scale deployments will yield useful results when tracked over time.
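One possible way to put products on a common footing is an annualized failure rate: failures divided by unit-years in service. The Python sketch below is only an illustration of that calculation. The product names, counts, and the function name are hypothetical, and it assumes every unit of a product has been in service for the same number of months.

```python
def annualized_failure_rate(failures, units_deployed, months_in_service):
    """Hardware failures per unit-year of field service."""
    unit_years = units_deployed * months_in_service / 12.0
    if unit_years <= 0:
        raise ValueError("need at least some time in service")
    return failures / unit_years


# Hypothetical tracking data: (failures, units deployed, months in service)
products = {
    "Model A": (4, 200, 18),
    "Model B": (9, 150, 12),
}

rates = {
    name: annualized_failure_rate(*record)
    for name, record in products.items()
}

# List the most reliable product first.
for name in sorted(rates, key=rates.get):
    print(f"{name}: {rates[name]:.1%} failures per unit-year")
```

Normalizing by unit-years is what lets a small pilot and a large deployment be compared, which is why even small deployments become informative once they are tracked over time.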

Utilities interested in a wider collaboration and standardized sharing of equipment experience should contact their appropriate trade organization (APPA, NRECA, EEI, and regional), Grid21 -- a new non-profit outgrowth of the Gridwise Alliance, or TekTrakker.


Comments

Gay's article is sobering and very true.

Consider that utility companies can't easily buy highly reliable equipment, and don't have luxuries like Consumer Reports to compare products, because in essence their situation is far worse than that of consumers going shopping.

Firstly, utility companies are not regular shoppers the way consumers are. At any given time the total number of utility companies actively shopping is very limited. In other words the market for their equipment is very spotty and unpredictable. Therefore no one is very interested in researching products for this market the way Consumer Reports does for consumers.

Secondly, utility companies' resources for buying field electronics equipment must either come from rate base income or from government handouts. Both of these sources tend to be unpredictable at best, because of heavy regulation on rate increases, and politics set within the government of the day.

Thirdly, utility companies are usually worse than consumers when shopping for field electronics because they tend to want the most for the least cost - i.e. they count every penny when buying from vendors. Hence they are not prepared to pay for high reliability designed into field electronics (the way the military is routinely accustomed to doing) because this would add unpalatable extra costs.

Good examples are smart meters in AMI systems. These meters are being designed to operate with a field life expectancy of 10 years or less, and can often have an initial failure rate of several percent on initial deployment. But this is not too surprising when they are being sold to utility companies for well under $100, where manufacturers must make them at a cost of less than half of that to make a profit.

So utilities had better be prepared to make lots of truck rolls down the road if they want to get seriously into smart grid systems, or alternatively they must persuade governments to give them the resources to pay a lot more for their field electronics equipment.

The goal should be to deploy equipment that is so solid that equipment failure is a non-issue.

As they say in China and Japan.

Rots of ruck.

http://www.google.com/#hl=en&cp=24&gs_id=20&xhr=t&q=embedded+controller+forth+for+the+8051+family&qe=ZW1iZWRkZWQgY29udHJvbGxlciBmb3Ry&qesig=xd60NIcfXg3p-2eCbNMGQA&pkc=AFgZ2tlxWgkd6sF3iW2r0ZqQdLVoWEZ4uOd0s-fGXvTyoshJJ0FvTBoZ14aUBBmvt0ylvy8cr7RfnVaGj6zv-oDAeihddK3JKw&pf=p&sclient=psy&rlz=1R2ADRA_enUS416&source=hp&pbx=1&oq=embedded+controller+fotr&aq=0l&aqi=g-l2g-lv2&aql=&gs_sm=&gs_upl=&bav=on.2,or.r_gc.r_pw.&fp=b35c5f51d8a0f4a6&biw=1259&bih=836

I remember well the early ‘30s. I don’t even think I heard the word “electronics” until many years later. I guess it would have meant radio (vacuum tubes) to me then.

Outages were rare and brief. NW of Chicago in winter an outage would have been serious as during that era we had much winter weather below zero F and few furnaces worked without electricity. (Our radio ran on DC.) Only once do I remember that the gas oven was turned on to heat the house, sort of an adventure, with candles, for a little kid.

If Bob’s comment is cogent, why are we even using electronics at utilities, and do real live engineers of today actually want something like a Consumer Reports for industrial equipment? Embarrassing.

So you can understand I am a bit puzzled to learn that failing equipment has now become a problem. Seems it wasn’t 70-80 years ago.

Bill: If you must insist on cluttering up the threads with barely relevant links, please learn some rudimentary HTML.

To post a link, use the following, but replace {LCaret} with a left angle bracket and {RCaret} with a right angle bracket.

{LCaret}a href="http://www.your.link"{RCaret} Clickable Title {LCaret}/a{RCaret}

Don: Times change, eh?

Don,

I sympathize with the utility industry, and with yourself. But Len is absolutely right: "Times change, eh?" We're using electronics because smart grid is viewed as the best and only way to deal with the problems the grid is facing today, and in future.

So unless you can help the utility industry find a big pot of gold, I wouldn't hold your breath waiting for the grid going back to its past, or for smart grid electronics to become much more reliable.

BTW, if Len's IMEUC market proposals were implemented, a key feature is the ownership of all the smart meters would transfer from the local distribution utility company to its customers. Under this scenario customers could spend their own money on upgrading their meter hardware and software to whatever products appear on the CONSUMER markets. Unlike the resources of utility companies, there would be no limits to what consumers could spend on them.

IMEUC would effectively open the door to much more robust mass commercialization of smart meters and other smart grid components and systems. Many of the more mainstream high-tech companies would jump into it because the economic activity could / would become comparable to what routinely happens today with personal computers, smart phones, and gaming products. In essence this would happen because other smart grid systems would piggyback on the need for much greater smart meter functionality than the basic functions the utilities are buying in them today.

(Remember the smart meter is not only a metering device for energy billing, but also a communications portal into the grid from the customer, the customer’s personal computer or smart phone, smart loads like EV recharging and smart appliances, and of course small-scale distributed local generators like rooftop solar etc. etc. etc.)

As it stands today, most large mainstream high-tech companies stay out of the smart grid game because the market for its devices and products is too spotty and unpredictable, and evolves at a snail's pace when the utility companies are its only customers.

Cheers Don! Stock up on your candles.

Bob, Len, Bill, Don: Great comments - please keep them coming.

A few nuances of utility purchasing to consider. Utilities will be buying a lot more "smart" equipment a lot more frequently than they realize. The "smart" equipment being purchased today is going to be off the market and replaced by a new model in just a few months. (The supply chain for electronics moves very quickly through the production lines - and yes - most production is in China)

The monolithic purchase and support model of the past is not going to work for smart products. Utilities will be buying replacements continuously and the support model is going to change as a result. I predict utilities will have to adopt support and rolling refresh programs that will closely resemble the desktop and laptop support model in business. The cost to roll a truck to repair a meter is the same as swapping a meter. An upgraded model is almost certainly going to be the replacement of choice.


Utilities do not need to remain helpless when it comes to smart products and poor reliability. There is more than enough experience with products already in the field that, if shared, would immediately inform industry as to what is possible when it comes to reliability. I created the framework for executing this type of collaboration when I wrote TekTrakker - and after just a few months of data from multiple sources it was very clear that there is a huge spread of quality within smart products packaged as computers. I fully expect that there will be just as much variety with smart meters as there is with desktops.

With comparisons comes competition. Vendors will be anxious to market products with relatively higher reliability than their competitors. The race to improve products for reliability will happen - all that is missing is the comparison.

As to IMEUC - Consumer Reports type reporting would greatly inform this market as well.




The production of a viable database of smart hardware failure rates has to be far more dynamic and broad than could be provided through a publication - not to mention that the market

If utilities really need much higher reliability in their electronics than typical industrial electronics, then all they need do is arrange for "less functionality from a given piece of hardware". A CPU which has a barely acceptable 3 to 5 yr predicted life in a consumer product at 2.5 GHz will likely last essentially forever at 500 MHz and lower operating voltages. I can't think of anything in the smart grid field equipment requirement which couldn't be done easily and redundantly with a 32 bit CPU at 500 MHz.

Len, agreed there is no need for a state-of-art high performance CPU in utility field electronics.

There are also many other design measures that could be taken to improve reliability too - like adding redundancy to circuits, and coating their circuit boards with an environmental seal known as conformal coating or complete encapsulation known as potting. The latter greatly enhances the field lifetime when circuit boards are exposed to harsh outdoor conditions over many years. Redundancy in circuit design, and coating and sometimes potting are routinely done for electronics in airplanes and trains and some automotive modules that go under the hood. But to save a few dollars on cost I'll bet many of the utility field equipment vendors are skipping this.

Len - Good point and true; however, reliability of CPUs is only part of the problem. Smart meters all have power conditioning parts, capacitors, memory, communications links - all of which have their own reliability issues. Add to which electronics in general perform very poorly in conditions of power variability, heat, and moisture - all of which are present in most smart meter settings.

Capacitors are particularly short-lived. If they don't get fried by voltage surges, they will dry out and fail. One can over-spec capacitors to get additional life - but the price point of the product will increase.

As to redundancy - it helps keep systems up and running but does not prevent the truck roll for the repair. The best equipment is always the equipment that is the most reliable, even when buffered by redundancy.

Well all I know is that a smart grid component won't be treated anywhere near as harshly as a weigh scale controller in a food processing factory (high-pressure washdown and chemical scrubbing several times per day). Toledo and others have certainly figured that one out, as their and their competitors' products last essentially forever in those conditions. This issue is way overblown.

This would be a very bad time to be complacent about the durability and reliability of smart grid components. I may be "crying wolf" over reliability, but I'm in the reliability measurement business and have observed dramatic and sometimes catastrophic problems with electronics that are not being systematically measured.

For example, Dell produced nearly 12 million desktops for business use that had a capacitor leak problem that caused fires and data loss for thousands of units. They lost a couple of hundred million dollars in a legal settlement for having ignored the problem. The same type of problem could be buried in millions of smart meters and one would not know if that were "normal" or not without a baseline and continuous measurement.

Hoping, wishing, assuming that equipment will be suitably reliable for decades in the field is not a management strategy. Vendors, regulators, planners and financial partners are all holding back on implementations because of uncertainty over useful life and suitability. Removing the doubt, even if "overblown" would do much to accelerate adoption and innovation in deployments.

Measurement of reliability, and sharing of that information systematically is the best hope for the industry to grapple with the explosion of new equipment. This isn't my mantra exclusively - NETL/DOE is making a presentation at GridWeek in a few days on the topic of Performance Feedback Programs as a major element of smart grid implementation.

"One can over-spec capacitors to get additional life - but the price point of the product will increase."

Precisely Gay, exactly one of my points. Manufacturers who design the equipment intentionally avoid over-specifying components to save every penny they can on cost, due to price pressures from their customers the utility companies.

"Removing the doubt, even if "overblown" would do much to accelerate adoption and innovation in deployments."

Herein lies part of the problem with utility companies - if there is any doubt in their minds about reliability, they sit on the fence avoiding buying anything, sometimes for many years. They want to be able to buy all their field equipment with bulletproof reliability so they can forget about repairs AND forget about buying anything again for many years.

This situation often prevents new innovative technologies from being tested in large numbers in real-life field conditions and exposing any unforeseen reliability problems. More importantly it prevents the CONTINUOUS EVOLUTION of new technologies where manufacturers are allowed to continuously improve on them with regular new product design releases. Hence smart grid technology evolves at a snail's pace relative to most other industries.

Agreed Bob. The utility industry needs to get up off their butts and start innovating instead of sitting around waiting for something like "99.9999% reliability proven in large scale field deployments over a ten year period" in their electronic products. Once a circuit is proven in commercial, industrial and military usage for a reasonable time period, it should be sufficiently reliable to start field trial size deployments in the utility industry. Even if a few errors are then made, the worst case is a re-deployment of an improved design, and a few maintenance calls in the meantime.

It looks to me like utility executives are simply too gutless to take any risks at all. They should try running a maintenance department in a large private industry, where your job's on the line every morning if the plant doesn't start up, but still innovation is demanded / required else the plant will go obsolete in less time than a utility field trial takes.

"It looks to me like utility executives are simply too gutless to take any risks at all. They should try running a maintenance department in a large private industry, where your job's on the line every morning if the plant doesn't start up, but still innovation is demanded / required else the plant will go obsolete in less time than a utility field trial takes."

Wow !! I couldn't have said this any better myself Len. Right On !!

The second sentence is brutally true in many private industries. The huge difference in private industry is that companies usually have relentless competition to worry about, and if you don’t compete with them, your job often ends up going offshore somewhere. Distribution utility companies don't have ANY competition, as we all know too well, and typically won't move quickly on deploying anything new unless forced to do so by government mandates.

Strikes me that most existing distribution systems are pretty simple and reliable these days. I really do not see any good reason to complicate these systems with smart meters, which, as Gay pointed out, will invariably require "care-and-feeding".

Seems to me, the government is once again the enemy of "good enough" with increased costs being created for the peculiar notion that utilities must have instant knowledge of the customer's power consumption. Of course some (including me) believe the smart meters are nothing more than a Trojan horse to allow utilities to make more money by gouging the customer with "time-of-day" rates, as aided and abetted by bureaucrats attempting to force the customer to comply with their social engineering efforts to curb CO2 emissions.

Michael,

Agreed, the past system of metering using the reliable electromechanical meters and flat billing rates was “good enough” as long as there was sufficient peak demand generation capacity out there, and as long as there was money around to keep building generation to keep pace with demand growth. But we don't have these “luxuries” anymore, and recognizing this, our policymakers asked our utility industry how best to cope with this situation. Their answer was to charge higher rates during peak demand hours and thereby encourage customers to curtail demand during those times, in essence Time-Of-Use (TOU) billing.

To implement TOU billing however the utilities needed interval metering that smart meters are primarily designed for. Interval metering basically logs your energy consumption in as fine as 15-minute intervals, although most utilities implementing it are content to use 60-minute (hourly) interval consumption data. This among other things allows them to set TOU rates that can change from one level to the next at the top of a clock hour.
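As a rough illustration of how hourly interval data feeds a TOU bill, here is a minimal Python sketch. The peak window, both rates, and the sample readings are hypothetical and are not taken from any actual utility tariff. The rate is chosen per clock hour, so it changes only at the top of an hour.

```python
PEAK_HOURS = range(12, 18)  # hypothetical peak window: noon to 6 pm
PEAK_RATE = 0.20            # hypothetical peak rate, $/kWh
OFF_PEAK_RATE = 0.08        # hypothetical off-peak rate, $/kWh


def tou_bill(hourly_kwh):
    """Price one day of hourly interval readings (index = hour of day)."""
    if len(hourly_kwh) != 24:
        raise ValueError("expected 24 hourly readings")
    total = 0.0
    for hour, kwh in enumerate(hourly_kwh):
        rate = PEAK_RATE if hour in PEAK_HOURS else OFF_PEAK_RATE
        total += kwh * rate
    return total


# Hypothetical customer using a steady 1 kWh every hour.
flat_day = [1.0] * 24

# Same daily energy, but 3 kWh moved out of the peak window to the small hours.
shifted_day = list(flat_day)
for hour in (12, 13, 14):
    shifted_day[hour] -= 1.0
for hour in (1, 2, 3):
    shifted_day[hour] += 1.0

print(f"flat usage:    ${tou_bill(flat_day):.2f}")
print(f"shifted usage: ${tou_bill(shifted_day):.2f}")
```

Both days use the same 24 kWh, yet the bills differ, which is something a single cumulative meter reading could not support.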

As for instantaneous power demand in Watts, smart meters can report this at any time on (software commanded) request. But utilities generally aren't interested in a residential customer's power demand; they are only interested in their large usage customers like heavy industrial or commercial businesses. So they are not likely to eavesdrop on your home's instantaneous power demand.

(It's the residential consumer that is typically more interested in having a power meter to show in real time what their instantaneous power demand is, and what their cumulative energy usage and cumulative energy bill is at any given time.)

I will grant you also that the use of Time-Of-Use billing makes it much more difficult for consumers to resolve their energy bills, and thus opens the door for governments and the utilities to raise TOU rates such that your overall energy bills are higher than they were before on flat billing rates. Then if any customer complains about higher bills, they can simply be told to load shift their energy consumption to use more at off-peak hours in order to lower their bills. This may not be so easy though for many customers, so yes it in effect can hit some customers in the pocketbook much harder than others.

Save your pennies Michael, you're going to need them.

A strange twist on this article's subject is that smart metering and their AMI systems are not only meant to implement TOU billing, but also enable utility companies to increase the reliability of the grid itself. This is firstly because working smart meters are automatically a tool for outage management since they are real-time indicators of which customers have a power outage.

Secondly, the communication networks in the AMI system can potentially provide real-time status monitoring data of many other grid assets, including the real-time data for the grid to control / handle many more intermittent generation sources like wind and solar down the road.

One would think if utility companies are expecting to use smart meters and AMI systems for enhancing grid reliability, they would be willing to pay much more money for them to ensure smart meters and their AMI networks are ultra-reliable. Since this is apparently not the case today, I suspect we will only see TOU billing and perhaps some outage management implemented, and not many other uses for them until our utilities are prepared to spend a lot more money on their equipment.