By: Jeremy Vaughan
October 08, 2020
Introductory Note from Helen Beal, Chief Ambassador at DevOps Institute:
At DevOps Institute we focus laser-like on the Humans of DevOps, and I wanted to take a moment to highlight this story from our Ambassador, Jeremy. It shines a light on the impact that the technology we interact with has on our day-to-day lives, and shows that it can mean the difference between life and losing a loved one. In DevOps we seek to balance throughput and stability, and Jeremy teaches us here how critical it is to ensure we don’t compromise quality when seeking novelty: his daughter Jalen’s future depends on it. We are very grateful to Jeremy and his family for sharing their very personal experiences for the greater human good. Sharing these experiences also presents the opportunity to educate authoritative bodies on the benefits of collaborating and of updating regulatory compliance guidelines so that they align with our evolving product development practices.
Every day, my family puts our faith into the accuracy of healthcare software and the precision of medical devices. If they don’t work as promised, my daughter’s life is immediately at risk.
My daughter Jalen has been surviving Type 1 Diabetes since she was sixteen months old. Each day we rely on flawless communication between three pieces of technology: a continuous glucose monitor (CGM) to report blood glucose levels, mobile apps on our devices that convert that data into precision dosing instructions, and an electronic insulin pump that administers the exact amount of insulin Jalen needs at exactly the right time. With Type 1 Diabetes, the margin for error is too small to allow mistakes.
Unfortunately, machines and software do fail
Over the last eight years, we’ve been on the receiving end of some unintended but life-threatening consequences, all of which stem from a lack of governance, ineffective auditing, and poor operational oversight of both software and hardware. This is why I’ve spent the last ten years debating with software engineers about moral coding philosophy, ethics in software, professional responsibility, and, well, the inconvenient truth about code rot.
Now, we rely on two sophisticated products to keep Jalen alive: an automated glucose monitor with corresponding mobile applications and an insulin pump. The future is a closed-loop system to continuously monitor her blood glucose, calculate needed insulin, and automatically administer it with better-than-human precision. These machines could be a modern miracle. Until they don’t work.
After a forced upgrade on our mobile phones and the app that controls the CGM, features disappeared, critical alerts failed, and notifications stopped. We didn’t realize it until Jalen’s blood sugar hovered at 47 BG for over an hour while she slept. No alarms went off. Jalen awoke to her body shaking in response and, at nine years old, relied on instinct to save her own life by “juicing” herself with lemonade. We wondered if we were alone, but a quick look at the reviews on Google Play and the Apple App Store confirmed a widespread loss of trust. Technically speaking, we were on the receiving end of code rot.
What is code rot?
Code rot is the slow deterioration of software performance over time or its diminishing responsiveness that will eventually lead to software becoming faulty, unusable, or in need of an upgrade. If not upgraded, software can quickly become defective or completely obsolete.
Why does code rot happen?
Unfortunately, today’s IT teams are heavily burdened and preoccupied with perimeter-level cybersecurity and data breaches. While they’re busy checking outward-facing compliance boxes that simply catalog the current inventory of existing tools, they lose the time and ability to focus on solving systemic operational problems happening inside the organization. This is a huge blind spot, and it’s quickly becoming one of healthcare’s fastest-growing threats.
There are a few common challenges contributing to this growing threat:
- Growing code and technological complexity across digital environments and devices
- Delivery schedule and innovation pressures from the business
- Inexperienced or indifferent security and development teams working in silos
- Lack of FDA regulations for internal systemic resilience
Why aren’t the current regulations enough?
Progress in regulation is not translating to patient safety. The growing risk posed by Internet of Things (IoT)-powered medical devices has spurred new legislation and collaboration among the US Food & Drug Administration, the American Hospital Association, and the US Department of Commerce’s National Telecommunications and Information Administration. However, that legislation does nothing to address mounting technical debt.
Technical debt is a concept in software development that reflects the implied cost of additional rework caused by choosing a quick or easy solution now over a better approach that would take longer. Technical debt can be compared to monetary debt: when it’s not repaid, it can accumulate ‘interest,’ making it harder to implement changes later on. Unaddressed technical debt increases code rot. In medical software and devices, this technical debt translates to security and patient risk.
Doing the inside work. Addressing systemic operational health effectively.
Healthcare software and device manufacturers need to define what systemic operational health looks like for all stakeholders and then proactively work toward achieving it. We believe there are four pillars to a successful framework for operational health:
- Collaboration – Systemic operational health takes collaboration among security, IT, and engineering teams. It also takes treating internal risk with the same rigor and proactive discipline as external threats.
- Balance – Systemic operational health takes a commitment to balancing the pace of innovation with the ethical responsibility to monitor and maintain existing products that impact patients.
- Education – Systemic operational health takes educating policymakers and calling for real regulatory support from authoritative bodies.
- Automation – Finally, systemic operational health requires the right tools to keep up. Unfortunately, humans alone cannot keep pace. Systemic operational health requires smart automation tools that can surveil and detect issues in millions of lines of code — instantly.
As medical technology continues to enhance and even extend patient lives, we will find ourselves even more dependent on the software and devices that keep us well. When they work properly, we won’t even know they’re there. However, as that dependency grows, so must our vigilance around creating and maintaining operational oversight and governance for both software and hardware. With a clear understanding of the operational health challenges we all face, and with a shared framework to help us move forward, we can (and should) build a culture of digital ethics, trust, and transparency.
POST WRITTEN BY Jeremy Vaughan
Vaughan is the CEO & Co-Founder of Tauruseer, a Continuous Assurance Platform automating Cognitive GRC through existing code pipelines and cloud solutions to proactively aggregate siloed data, identify risk, and recommend the right actions to protect organizations from preventable loss caused by human, process, and product threats.