Saturday, July 18, 2015

Pointing Fingers At Meetings

PACS is complicated. Really complicated. Really, REALLY complicated.

So when something goes wrong, there is a lot of investigation that needs to be done, and occasionally, there will be a lot of 'splainin' to do. And here and there, people point fingers at the perceived source of the trouble. Sometimes they are right, sometimes they aren't.

As you might guess, one of our systems is having a problem, and we aren't getting to the bottom of it. And there is some finger pointing. Since we are in the midst of working it through and do not yet know what's wrong, I won't name any entity, person, nation, President, planet, or vendor involved. You may feel free to guess, but I'm not telling. Wild horses couldn't drag the information out of me. And I hate horses. Well, I hate riding horses, which is close enough.

So. Just the facts, Ma'am.

The PACS in question has never operated quite perfectly. You might say that none of them do, and I would have to agree. But this one has had a few operational issues, including occasional complete outages, which fortunately are few and far between. And you might say all of them do that, and I would have to agree.

But the system in question has always had some little glitches I don't see from our other sites. In particular, we have noticed over the years that scrolling of large data-sets can be painfully slow. Surprisingly, turning off annotations will speed things up considerably. No one has been able to explain this adequately, although some have suggested that it is due to this particular architecture having to communicate back to the mother ship server for every command and mouse stroke and click. Could be. Also, SOME of us, and not all of us, experience very slow searches, worklist refreshes, and so on. This bad behavior has been attributed to bad individual user profiles, and rebuilding them may or may not cure the problem. And it has also been said that because some of us keep multiple worklists active, and all of those worklists have to have a little chat with the server upon each refresh, we are the ones slowing down our own workstations. Blame the victim. Love it. But it might still be true.
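The "chat with the mother ship on every click" theory is easy to put numbers on. Here is a back-of-the-envelope sketch (all figures are invented for illustration, not measurements from this system) of why one server round trip per UI event can make scrolling a large dataset crawl:

```python
# Hypothetical sketch: a "chatty" viewer that makes one server round trip
# per UI event pays the full network round-trip time (RTT) on every
# scroll step, so total wait grows linearly with the number of events.
def total_wait(events: int, rtt_ms: float, trips_per_event: int = 1) -> float:
    """Total time spent waiting on the network, in seconds."""
    return events * trips_per_event * rtt_ms / 1000.0

# Scrolling a 500-image stack with an (assumed) 40 ms internal RTT:
chatty = total_wait(events=500, rtt_ms=40)   # one round trip per image
batched = total_wait(events=1, rtt_ms=40)    # one round trip for the stack
print(f"chatty: {chatty:.1f} s, batched: {batched:.2f} s")
```

On these made-up numbers, the chatty design spends twenty seconds waiting where a batched one spends a few hundredths of a second — and it also explains why per-worklist refresh chatter would scale with the number of open worklists.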

We've been putting along like this for several years. I've complained, others have complained, things are looked into, things are tweaked, and sometimes we see some improvement. I do have to say the system has been usable, even with the glitches, except on those occasions when it dies completely, and then it isn't very usable at all. In the meantime, I'm informed that we are about the last site in this quadrant of the galaxy on our particular version of the product. Fits like an old shoe, I guess. We need an update, although no one is certain that will solve any of our problems.

Unfortunately, over the past month or so we have been seeing a definite deterioration in function. Things load slightly more slowly, then a bit slower still, and in some areas, including an ED and a remote site on the network, the speed drop has reached the point that the stations are inoperable. But here's a little clue for you electronic Sherlocks out there: stations accessing PACS via the Internet, outside the Enterprise, demonstrate adequate speed. We get better service from a home station on a 50 Mbps Time-Warner connection than from the gigabit Ethernet inside the hospital. Hmmmmm. Aside from the network connection, the only other difference between a station within the network and one outside it is that our main reading stations have digital voice (NOT speech-recognition) software integrated, while the ED stations do NOT.

To me, this all points to a network problem, possibly/probably compounded by the way this particular PACS architecture works with the network topology.

Now here's where we get into trouble. It seems that IT and the vendor have been working on the problem for a month or more. And they have come up with nothing. Well, I've been talking with folks on all sides of this, and that's not quite true.

The vendor has run tests from sample workstations at multiple sites, and lo and behold, the site with the most trouble has significantly slower transmission speeds back to the gateway than the site with less trouble. I'm leaving out a lot of detail, but that's the gist of it. Sounds like the answer, yes? No. IT has run tests on the network, and the report is that everything is perfect, nothing wrong, nothing to see here, move along. And so the finger points to the vendor.

The vendor, for its part, is bringing in everyone who knows anything about how the thing works, and promises to do everything possible to get to the bottom of the bottoming out.

And that, dear readers, is where we sit today. All sides are supposedly working furiously on the problem. I think (I hope) they all realize the mission-criticality of what it is they are fixing. Remember Dalai's First Law:  PACS IS the Radiology Department. Right now, our beloved department is impaired. I personally don't care whether this is the fault of the software vendor, the hardware vendor, IT, or if the janitor slopped a wet mop on a server. Our system is absolutely vital for patient care, and we cannot begin to tolerate anything less than 100% function. And we cannot tolerate anything less than 100% cooperation to get us back to 100% function.

I have had the good fortune of creating an international speaking career based on some of the foolishness I've seen over the years on every side of the PACS equation, and that includes ridiculous behavior on the part of vendors, IT types, and yes, even radiologists. As a crotchety, cantankerous, semi-retired curmudgeon, I can say with confidence that with most PACS problems, all parties have some degree of guilt. It is really bad when one side digs in its heels and declares: "It's not my problem!" But there is something even worse: MEETINGS! Many folks out there, often but not exclusively IT types who have come up through the IT bureaucracy and not via Radiology, know of one and only one way to handle a situation: We need to have a MEETING! Thus time and resources are wasted trying to get the Important People in the same room at the same time so they can all explain why something isn't their fault, and agree on the time of the next meeting. Yes, I know that's how things are done, but in my experience, it's more how things don't get done.

My very favorite example comes from my early days of dabbling in PACS, now over 20 years ago. I was sitting beside the one and only PACS administrator, a good friend of mine, while an IT person was droning on and on about how thus and so wouldn't work, how it couldn't possibly work, and how they should all meet again in a month to discuss why it was a bad idea. My friend whispered in my ear, "I did it and it works just fine!"

No doubt I've stepped on a few toes with this little rant. I mean no disrespect, and I dearly value the friendships I have made throughout the years among IT folks and vendors alike. But I have very little tolerance for things that get in the way of patient care. A failing PACS is right up there. Let's not add squabbling and finger-pointing to my sh*t list.

Besides, if I should disappear tomorrow, you all might have to deal with some of my former partners (now my bosses) instead, such as the fellow for whom Dalai's Twelfth Law was written:

At least I speak the IT language. Which might not be all that appreciated in the end.


sb said...

I don't know if this might be useful, but we have had a similar problem with outages with our PACS (from a four-letter vendor from the land of good chocolate). The fault was traced back to a corrupt ARP table on the T5120 PACS database server. This table maps the IP addresses of the various devices the DB communicates with to their MAC addresses. We still haven't found the root cause of this.

This is on a background of very poor performance since we went live in November 2011. We operate over a city-wide network and, despite a huge amount of work, remain seriously constrained in RIS and PACS performance. Fingers have been pointed at the network, servers, desktop policies, etc., etc. Their answer, of course, is to pour more money into upgrades... One of my colleagues has coined the slogan "Complexity without virtue," which perfectly describes the architecture we are forced to endure.

Anonymous said...

From the vendor....

"With our servers, the network team did insist on teaming the NIC cards. Initially this caused ARP poisoning. We saw in the database ARP table that the MAC addresses for the servers would "flip" back and forth.

To overcome this, we statically assigned the MAC addresses in the DB ARP table. We did think that this still might be an issue, and we have asked and are going to ask again to unteam the NIC cards for testing purposes.

You could share that information with that "unnamed" site ... maybe that can help them."
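The "flipping" the vendor describes is easy to spot by diffing snapshots of the ARP cache. A minimal sketch of the idea (hostnames, IPs, and MAC addresses below are invented; on a real server the snapshots would come from something like `arp -an`, taken a few seconds apart):

```shell
# Hypothetical sample data: two ARP-cache snapshots, each line "host mac".
# A NIC team that is ARP-poisoning the cache shows up as a host whose MAC
# differs between the snapshots.
snap1='pacsdb01 00:14:4f:aa:bb:01
gateway 00:14:4f:aa:bb:ff'
snap2='pacsdb01 00:14:4f:aa:bb:02
gateway 00:14:4f:aa:bb:ff'

# Combine both snapshots, drop exact duplicates, then flag any host that
# still appears on more than one line (i.e., with more than one MAC).
flapping=$(printf '%s\n%s\n' "$snap1" "$snap2" | sort -u | awk '{print $1}' | uniq -d)
echo "Hosts with flapping MACs: $flapping"

# The vendor's stopgap is to pin the entry statically, e.g. (requires
# root; address and MAC are invented):
#   arp -s 10.0.1.10 00:14:4f:aa:bb:01
```

Pinning the entry stops the flip-flopping but, as the vendor notes, it treats the symptom; unteaming the NICs for a test is what would confirm the cause.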

Iceman said...

So sorry that you continue to struggle with a half-assed PACS! It's been years now! Just crazy. Hang in there.