Vendor's secret 'fix' made critical app unusable during business hours

On Call Welcome to another installment of On Call, The Register's Friday column that tries to improve the health of the tech support ecosystem by sharing readers' sickening stories of bringing broken tech back from the brink.

This week, meet a reader we'll Regomize as "Raoul" who told us about his former life as a commercial Linux support consultant.

"We had a customer call up to ask us to provide a health check on their large server running a critical business application," he told On Call. Raoul was given a week to do the job.

The customer was a medical facility, and the software was a complex Java web app that handled jobs including patient scheduling, booking management, and even payments. It ran in VMs across three servers, one of which was a warm spare for the database, backed by a Postgres database and an external storage array.

The app wasn't well.

"During peak load in the morning, the system would grind to a halt and become unresponsive for anything up to half an hour," Raoul explained. Medical specialists, admin staff, and patients were all left waiting for the app to perform, a decidedly unhealthy situation.

When Raoul arrived to diagnose the problem, he found the on-site techies bickering.

"The virtualization people blamed the storage system, the storage people blamed the application, and the application developers blamed the OS," he said. The customer's relationship with the application's vendor was also toxic as somebody from the medical facility had made unkind remarks on the vendor's support forums. Threats of legal action followed.

The next day, Raoul arrived in time to see the application grind to a halt at around 10:00 AM. He checked the application server – which was fine – but did notice the database server was very busy.

Raoul kept his head down and on his third day on the job waited until the system locked up again.

"I headed straight to the database server and saw somebody was running a lengthy update task that locked down table rows, meaning all other transactions had to wait," he wrote.

Raoul dug a little deeper and learned that the vendor of the application had found a bug in their wares, and was patching database errors on the live system, during business hours, without telling their customer.

"I collected the evidence, wrote up my report, and took it to the management," he told On Call. "The app's developers confessed that they had known of the issue for months, had a fix 'almost ready to go,' and everything would be OK."

Raoul soon learned the situation was far from OK: during his last two days on the job, he decided to run a health check on the Postgres database itself.

"The production database stored medical data, personal information, and handled payments had no access controls," he told On Call. "It was configured 'ALL ALL ALL', so any user on any system could access any database as any user."

That revelation made Raoul feel ill.

"I nearly fell off my chair, reported that to the management, and they told me the developers said that was the required config, and were not concerned," he wrote.

"I went home with a mental note to never use that vendor," he concluded.

Has anyone hidden the cause of a technical problem from you? Don't hide your story from On Call! Click here to send email to On Call so we can share your story on a future Friday. ®
