Rise in hidden software glitches caused by programmer retirements

Undiscovered software glitches in complex systems are common, and one of the primary drivers is the loss of mainframe knowledge as the workforce retires. Software glitches are lurking in many large systems, particularly mainframe systems, and the COBOL programmers who understand the code best are retiring, according to Jeff Papows, author of the new book, "Glitch - The hidden impact of faulty software." In this interview, Papows describes how faulty software caused a huge charge to a debit card holder's account and why such mistakes are on the rise. He notes the three most pressing drivers for software glitches: loss of intellectual knowledge, market consolidation and the ubiquity of technology.

Today we're here with Jeff Papows talking about his new book, "Glitch - The hidden impact of faulty software" published by Pearson/Prentice Hall Professional.

SSQ: Jeff, the book opens by describing a situation in July, 2009, where a New Hampshire resident found a charge of over $23 quadrillion on his debit card. As the book goes on, you describe other "glitches." What exactly is your definition of a "glitch"?

Essentially, anything in the way of a digital or a software-related snafu, another word that describes the denial of some form of service or worse. In the July case you reference, as opposed to a denial of service, you get an error in the transaction that causes some other form of harm. Unfortunately, they're becoming more and more prevalent.

SSQ: I just wondered if a "glitch" was defined as something that was more serious – a certain loss of revenue or loss of life, for example.

Well, glitches run the gamut from minor annoyances, as in the case of the Verizon glitch last week, where Verizon ended up shelling out over $50 million worth of rebates because of a glitch in people's handsets that was causing them to be charged for data that they weren't actually accessing. I would call that an annoyance. As I talk about in the book, however, they are also on the other end of the spectrum, with things like the software glitch in Varian's medical equipment, which led to the over-radiation of dozens of cancer patients and ultimately very painful and tragic deaths for people who entrusted themselves to medical equipment in the healthcare industry and paid the ultimate price for the glitches that caused them to be over-radiated. So, it can run that entire spectrum. Unfortunately, we've reached a point where we're seeing more glitches of the devastating kind, not necessarily like the Varian one, but the things that we've seen with Toyota and the denial of access to funds at TD Bank for 14 days; it just seems to be the prevalent trend these days.

SSQ: You state that the three most pressing drivers are 1) loss of intellectual knowledge, 2) market consolidation and 3) the ubiquity of technology. On the first item, you primarily talk of the concern of COBOL developers retiring. I'm surprised this is such a concern. Are there a lot of these glitches occurring because of old legacy code?

There are. It's not a fashionable thing to talk about, actually, but the reality is about 72% of the world's financial transactions take place in legacy applications, to your point, that are resident on IBM mainframes and largely written in COBOL, which is a fairly arcane language by today's standards compared to languages like Java and C#. The reality is, for the first time in my career and in the history of our industry, because I grew up with it, a lot of people with those COBOL skills are either retiring or, even more sad, dying in some cases, and leaving our workforce. That codified human understanding and knowledge is walking out the doors and not being replaced.

SSQ: Are organizations looking into modernizing that code?

Mainframe modernization has been kind of a holy grail of concern for the information technology industry for a decade now, and the reality is, it's not happening. There's a certain sort of pragmatism associated with the fact that these large-scale mainframe and financial systems are going to be with us for a long time. But every effort I've ever been exposed to or read about secondhand, where we were going to port things from these older legacy applications and languages to open systems and whatnot, has failed miserably. The reality is the monumental scale of the amount of the world's application inventories that exist in these other paradigms is so significant, it turns out it's fairly naïve to think that we're going to find some magic, a technical key or genie that's going to allow us to transport that from its existing paradigm to a new one. By and large, it hasn't happened.

SSQ: Couldn't some of the same issues of loss of intellectual knowledge result, not just from employees retiring, but from outsourcing?

The outsourcing accelerates the trend, for sure. The other thing that's rather interesting is that after the bubble bursting in the era, the number of graduating computer science or even math majors from North American universities is down about 37%. So not only do you have outsourcing and retiring or at least more incestuous mobility of people associated with these legacy applications, we're not restocking that skills inventory or pool at the same rate and pace we were. And I can assure you the computer science and math and engineering graduates that are getting out of universities aren't apt to be studying things like COBOL and Fortran and Assembly. There's a whole range of much more hip and cool and romantic computer languages and social networking tools that probably are occupying our younger people's education curriculums as opposed to the ones that our financial infrastructure is dependent on.

SSQ: However, there are a lot of older people who are unemployed now. I know we recently published an article that was concerned about whether the older generation would be able to get jobs because of the changes in technology.

Yeah, that's an interesting point because the two concerns are at odds with each other. Actually, people with those more esoteric skills like COBOL should be in a seller's market. You would think there would be ample opportunities because, whatever the current demands are in information technology organizations across the Fortune 2000, the reality is the majority of our budgets and staffing plans will continue to be directed at maintaining what we have, and only the minority, 20-30% of our opportunity cycles, are in reply to the new and cool stuff. So the older generation who have those skills, particularly in an era where physical location should matter less, ought to be, I would argue, infinitely employable.

Join the conversation


I believe with all the trends out there, it is not a chance to consider those who retired to become more self-sufficient in the technology bubble, where it is a probable cause to understand the movement of code of COBOL which, if you think about it, you get the excitement of other codes like Utility management.
One doesn't have to be part of an early legacy system to see these issues. We are currently dealing with it while working on a product that is only ten years old. Without giving too much away, a prominent language that was in prolific use fifteen years ago has fallen farther down the list of what's hip and happening. The net result is that we find it a challenge to replace programmers who leave with people who have a similar skill set: knowledgeable in modern operating systems while still understanding this once top dog but now lesser applied language. My point being, it's not just COBOL systems and organizations having these problems.
In my current organization, we are struggling to find qualified engineers to work on our legacy code, which is now only twelve years old but features a once top dog language that has fallen out of fashion with the "hipster" coders of today. Long story short, it's not just a problem for COBOL organizations; Web 2.0 companies are dealing with it, too.
Retirements might be the catalyst but not the main cause. Lack of testing, rushing to production: that's what's causing software glitches.