
Sunday, May 31, 2009

AVS is a really bad idea

For those not familiar with AVS, you may (or may not) know it as the Sun StorageTek Availability Suite. The basic idea is that you can have two storage devices attached to different hosts that stay in sync. This product forms the core of the CDP (continuous data protection - no, not that CDP) capabilities of the StorageTek and Sun Unified Storage Server product ranges.

Too bad it is completely and totally useless when using some of the standout features of Sun's current opus, ZFS.

Let's take one feature that shows ZFS in all its brilliance - resilvering. As detailed previously, replicating a resilver over AVS is going to be very bad for anyone who uses big disks and/or less than 10GbE connectivity between AVS nodes. Jim Dunham responded to this, saying he doesn't see a way of making AVS "smarter" or "ZFS aware". The man makes good points, but both he and the original post's author have totally missed the point.

Sun have done the wrong thing here with AVS:

  • AVS is a paid-for add-on for Solaris. Even though AVS is Sun's own product, from a licensing point of view it's just as bad as VxFS or Veritas Volume Manager. Way to go, Sun!
  • AVS in OpenSolaris is outdated. Really - why bother?
  • AVS + ZFS just doesn't work, ever. It's a half-hearted effort that provides support for getting your data from one box to another, but no support from the vendor for failover unless you buy some black boxes. Take up religion if you hope to fail back gracefully.
ZFS is transactional and has intelligent operation management at its core. Send those transactional operations across the wire. A device resilver should never have its block rewrites replicated to the passive host, and a scrub can happen in parallel with normal file system activity.

Take each of the above operations:

  • A scrub that completes successfully with corrections on the active host says nothing about disk consistency on the passive host. The operation needs to begin on both hosts.
  • Even if you have two dedicated, nerdy and doomed-to-be-virgins-forever system administrators in your colo replacing disks in the active and passive nodes at the same time, there will be a difference between the two new disks as ZFS begins to resilver. Your disks are definitely inconsistent, unless you're using hardware RAID...WHICH MANAGES TO ROB YOU OF MANY OF THE BENEFITS OF ZFS. For those in the audience that pledge an answer of "full AVS replication", see above. You fail.
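
For what it's worth, ZFS already ships the basic building block for transaction-level replication: incremental snapshot streams. Below is a minimal sketch, assuming a hypothetical pool (tank/data), a hypothetical standby host and pre-arranged ssh keys - the stream carries only the blocks that changed between two snapshots, so a resilver or a scrub on the active node adds nothing to the replication traffic:

```python
# A sketch only: periodic incremental replication with zfs send/receive.
# The pool name, snapshot names and standby host are all hypothetical.
import subprocess

POOL = "tank/data"
STANDBY = "standby-host"

def replicate(prev_snap, new_snap):
    """Ship only the transactions between two snapshots to the standby."""
    subprocess.check_call(["zfs", "snapshot", new_snap])
    send = subprocess.Popen(
        ["zfs", "send", "-i", prev_snap, new_snap], stdout=subprocess.PIPE)
    recv = subprocess.Popen(
        ["ssh", STANDBY, "zfs", "receive", "-F", POOL], stdin=send.stdout)
    send.stdout.close()                     # let receive see EOF when send finishes
    if recv.wait() != 0 or send.wait() != 0:
        raise RuntimeError("replication failed")

# e.g. replicate("tank/data@hourly-00", "tank/data@hourly-01")
```

It isn't the continuous, transaction-by-transaction replication argued for above, but it does show that the transaction-aware primitives already exist in the filesystem itself.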

If someone were to give me the means to satisfy my financial obligations and work only on ZFS, I would give you a solution to this problem inside of six months. Seriously, Sun - if your (admittedly brilliant) software engineers can deliver the marvel that is RAID-Z in 599 lines, then I can do it.

For those who didn't read the intro of the blog, yes I am a software engineer and yes I am proficient in C, system programming and mathematics. My assertions are justified.

It is a horrible injustice that the greatest filesystem ever is treated in this way.

Sunday, February 8, 2009

Do less work

It's better for all involved.

All forms of optimisation involve doing less work. There is no debating this.

Quicksort is about as good as a comparison sort can be on average. Sedgewick showed this mathematically (and his mentor, the great Knuth, agreed) in terms of the amount of work done and the movement of data elements required.
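
As a refresher, here is the whole algorithm as a minimal in-place sketch; the random pivot is what keeps the expected work at O(n log n) rather than the quadratic worst case:

```python
import random

def quicksort(items, lo=0, hi=None):
    """In-place quicksort: partition around a pivot, then recurse on both halves."""
    if hi is None:
        hi = len(items) - 1
    if lo >= hi:
        return
    pivot = items[random.randint(lo, hi)]   # random pivot avoids pathological inputs
    i, j = lo, hi
    while i <= j:
        while items[i] < pivot:
            i += 1
        while items[j] > pivot:
            j -= 1
        if i <= j:
            items[i], items[j] = items[j], items[i]
            i += 1
            j -= 1
    quicksort(items, lo, j)
    quicksort(items, i, hi)

data = [5, 3, 8, 1, 9, 2]
quicksort(data)
assert data == sorted(data)
```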

The rules are simple:

  • Don't poll - let the runtime / OS tell you when there is work to do
  • Respond to events, don't anticipate conditions.
  • Dispatch quickly
  • Decouple and hand off work units - seeing each one through to completion yourself only increases latency and reduces throughput
No number of threads or processors negates these rules. Sorry, Sun - Niagara doesn't relieve us of the need to think at least somewhat.
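
As a sketch of what those rules look like in practice, here is an event-driven loop using Python's standard selectors module; the socketpair stands in for a real client connection, and the queue is the hand-off point to a worker:

```python
# Sketch: event-driven dispatch with a hand-off to a worker, no polling.
import selectors
import socket
from queue import Queue
from threading import Thread

work_queue = Queue()              # decouple: the loop enqueues, a worker drains

def worker():
    while True:
        unit = work_queue.get()
        if unit is None:
            break
        print("processed", unit)  # the slow part happens off the event loop
        work_queue.task_done()

sel = selectors.DefaultSelector()
server_side, client_side = socket.socketpair()   # stand-in for a real connection
server_side.setblocking(False)

def on_readable(conn):
    data = conn.recv(4096)        # dispatch quickly: read, hand off, return
    if data:
        work_queue.put(data)

sel.register(server_side, selectors.EVENT_READ, on_readable)
Thread(target=worker, daemon=True).start()

client_side.sendall(b"one unit of work")

# The loop sleeps until the OS says there is something to do - no polling.
for key, _events in sel.select(timeout=1):
    key.data(key.fileobj)

work_queue.join()
```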

Sunday, July 13, 2008

malloc, free, repeat, repeat ad infinitum

In the current development climate, you'll be using one of two types of development environments:

  • An environment where the developer is responsible for allocating and freeing resources
  • An environment where the developer acquires resources, and (generally) the environment takes care of freeing them
Of course, it's not quite so clear-cut. Maintaining a static (or Shared, for you VB.Net developers) list of allocated resources will stop any sort of reclamation from happening, but the distinction is sufficient for this article.
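
As a minimal illustration of that caveat: in a garbage-collected runtime, anything reachable from a long-lived "static" collection is never reclaimed, however diligent the collector is. The names here are made up for the example:

```python
# Sketch: a module-level cache quietly defeats automatic reclamation.
_cache = []                      # lives as long as the process does

class Handle:
    def __init__(self, payload):
        self.payload = payload

def acquire(payload):
    handle = Handle(payload)
    _cache.append(handle)        # every handle stays reachable forever...
    return handle

for _ in range(10_000):
    acquire(bytearray(1024))     # ...so roughly 10 MB is retained and never collected
```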

Recently, I had to deal with a delightful issue involving a memory leak on every developer's favourite OS, Windows NT (2000, 2003, XP, Vista, 2008 etc).

For those not familiar with the way that the Windows kernel "works", here's a minimal introduction:

The Windows kernel has two important memory pools, both critical to the correct operation of the kernel. One is the paged pool, able to be swapped to disk as required. The other is the non-paged pool, permanently in RAM. If either of these pools is exhausted, stability will be impacted.
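
If you want to watch those two numbers without reaching for Perfmon, the Win32 GetPerformanceInfo call exposes them. A minimal ctypes sketch (Windows only, and obviously just an illustration):

```python
# Sketch: read the current paged and non-paged pool sizes via psapi.dll.
import ctypes
from ctypes import wintypes

class PERFORMANCE_INFORMATION(ctypes.Structure):
    _fields_ = [
        ("cb", wintypes.DWORD),
        ("CommitTotal", ctypes.c_size_t),
        ("CommitLimit", ctypes.c_size_t),
        ("CommitPeak", ctypes.c_size_t),
        ("PhysicalTotal", ctypes.c_size_t),
        ("PhysicalAvailable", ctypes.c_size_t),
        ("SystemCache", ctypes.c_size_t),
        ("KernelTotal", ctypes.c_size_t),
        ("KernelPaged", ctypes.c_size_t),
        ("KernelNonpaged", ctypes.c_size_t),
        ("PageSize", ctypes.c_size_t),
        ("HandleCount", wintypes.DWORD),
        ("ProcessCount", wintypes.DWORD),
        ("ThreadCount", wintypes.DWORD),
    ]

info = PERFORMANCE_INFORMATION()
info.cb = ctypes.sizeof(info)
if ctypes.windll.psapi.GetPerformanceInfo(ctypes.byref(info), info.cb):
    page = info.PageSize         # pool counters are reported in pages
    print("Paged pool:     %d MB" % (info.KernelPaged * page // (1024 * 1024)))
    print("Non-paged pool: %d MB" % (info.KernelNonpaged * page // (1024 * 1024)))
```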


This is why the UNIX world pushes most high-level tasks out into user space. Virus scanning, remote access, graphical interfaces, intrusion detection, complex firewall rules and logging (to name a few that come to mind) are all handled mainly in user space, where crashes and incorrect behaviour are handled with a process kill and respawn.

Windows handles things in a, how shall we say, different way. Drivers for the vast majority of even remotely low-level tasks do complex work inside kernel space. Network Associates, Symantec, Computer Associates and Roxio have all shipped drivers that caused massive stability problems. My employer has requirements mandating the use of packages from vendors that install an assortment of drivers doing non-trivial things in kernel space. In every case, these drivers have resulted in major stability problems.

This gives rise to two very important questions:

  • How hard can it be to do a decent amount of testing to find these problems, given they require me to reboot my servers every week to "work around" them?
  • Why aren't these drivers boring, stone-simple stubs that pass the required data to a user space process, so that a simple development error can't bring down the server in question?
This week, I was faced with an agonising decision. A server running Windows Server 2003 (Standard Edition, 32-bit) that normally shows paged pool usage of 30MB - 40MB was showing usage of 320MB.

After promptly panicking, I went looking for the cause. This has happened before, and I'd maximised the size of both pools. The idea wasn't to prevent the problem, but rather to at least get the chance to both see the problem and diagnose it. We've ensured that Pool Tagging was enabled on all of the Windows machines we take care of and that Poolmon was readily accessible.

As soon as I saw the problem I fired up Poolmon and saw that the tag for the biggest user of the pool (280 MB of usage) was SevI. A quick check showed this was Symantec's SymEvent driver. And this is where my frustration began:

  • The SymEvent driver isn't advertised as being included with or used by pcAnywhere. It's mainly associated with Norton AntiVirus and Symantec AntiVirus
  • Executing LiveUpdate against pcAnywhere doesn't update the driver
  • The latest version of pcAnywhere no longer uses the SymEvent driver
  • There isn't a program available that will clean up the paged pool usage and allow me to forgo rebooting a critical production server
As much as I disagree with Microsoft's business practices, they actually make very good tools for verifying drivers (PREfast is actually quite fantastic). You could also use Purify, or Valgrind if you can exercise the code on a platform it supports. Symantec have even developed compilers in their time.

Because of some programmer's inability to understand that you should free what you malloc, I was forced to reboot a critical server outside the "acceptable" hours of 3am - 6am. I applied the patch and rebooted, and am now expected to believe the problem is solved.

There are several worrying morals to this awful story:

  • Windows is too complicated. As my boss says (he's not a developer or a sysadmin), you should always seek to make things simpler
  • Many eyes make all bugs shallow. Open source software is held up high by geeks for a reason.
  • Don't use pcAnywhere. There is no possible justification for using it.
  • The companies that expect us to trust them with our host-based security systems also expect to cost us uptime in return, without consequence (NAI, Symantec and CA are all guilty)
  • We (those in the industry) are cynical because we're given valid reasons to be that way
So to those responsible for my pain, those at the biggest security companies in the world: thanks for bringing down our industry in the eyes of the outsiders. You owe those of us in the trenches an apology, and it's way overdue.

Monday, February 25, 2008

We are all connected - A tale of system failure

Over the last 48 hours, a coworker and I have had to deal with a number of system problems across the many heterogeneous setups we take care of.

This is not unusual in our line of work - we have in-house, third-party and software-as-a-service style systems to take care of. We've seen these symptoms manifest from the known causes in the past as well, but this incident served as a most excellent reminder.

One particular service we take care of does real-time information lookups against a third-party's systems. The interconnection between our gear and theirs is not all that complicated, but does require some understanding and care.

Anyway, the third-party in this case had some network problems that knocked out parts of their systems. The problems were such that no lookups could be done, thus the service on our side wouldn't work.

We have a number of companies we interact with via the same interconnection method, and a number of them share infrastructure on our end. As is so often the case, problems in one system have follow-on effects on the systems it is connected to. A busy database server can cause problems in an application's web tier, for example.

In this case, the interconnection method used means that, without due care and understanding, a lack of responsiveness from one lookup source can impact the delivery of the entire service.

All of this is due to one aspect of the system's overall design - a single choice made many years ago (before I or any of my workmates were involved - the system was acquired from a competitor).

This single choice was the result of vendor loyalty and narrow thinking. Vendor loyalty, in that technologies recommended by and/or created by a particular vendor were used without regard for what other systems and technologies would be involved. Narrow thinking, in that a single technology designed for a specific purpose was considered to be the best solution for the wide array of environments that were part of the project's brief from day one.

With so many systems connected to other systems, there are still worrying recurrences of old, well-worn mistakes. Three small facts should be kept in mind when designing any system that interacts with another:
  • Networks fail
  • Bandwidth is finite
  • Latency changes, constantly
Simple really.
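
A minimal sketch of what that due care looks like in code. The lookup_via_partner call is hypothetical, but the shape is the point: every remote call gets a budget, and blowing the budget degrades that one lookup rather than the entire service:

```python
# Sketch: guard a real-time lookup against a slow or dead upstream.
from concurrent.futures import ThreadPoolExecutor, TimeoutError

LOOKUP_BUDGET_SECONDS = 2.0                  # latency changes, constantly
_pool = ThreadPoolExecutor(max_workers=8)    # bandwidth (and threads) are finite

def lookup_via_partner(query):
    """Stand-in for the real third-party call; imagine network I/O here."""
    raise NotImplementedError

def guarded_lookup(query):
    future = _pool.submit(lookup_via_partner, query)
    try:
        return future.result(timeout=LOOKUP_BUDGET_SECONDS)
    except TimeoutError:
        return None                          # networks fail: degrade this lookup only
    except Exception:
        return None                          # the upstream blew up: same answer
```

Wrap every lookup source the same way and a dead partner costs you one feature, not the whole platform.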

The best move is no move at all

It's worryingly simple. In all of software development, system administration and database administration, the best move is so often no move at all. In fact, it often applies to life in general.

Yes, it sounds nonsensical and somewhat circular, but it's a statement that makes a great deal of sense. Thinking about it though:
  • As a software developer, it is extremely unlikely that you'll come across a problem that no one else has had to solve and solve repeatedly. The first rule of writing great code is not to - if libraries exist, use them!
  • As a sysadmin you are most probably administering systems that plenty of other people take care of, satisfying similar requirements to those that many of your contemporaries are burdened with. The chances that none of them have posted or blogged their solutions and thoughts are quite low.
  • As a DBA, unless you have an environment the rest of us envy, you'll have an environment the rest of us have relevant experience with - please use it.
It was Newton who said "If I have seen further it is by standing on the shoulders of giants" - the man was stupidly intelligent and listening to him makes an awful lot of sense.

If you can find a library, a set of scripts, a whitepaper or a blog entry/post that describes a suitable solution, you'll benefit. One of two things shall happen:
  • Your problem will be solved, and you can do something else
  • A possible approach will have been explored and excluded, narrowing your search
Additionally, either outcome will hopefully contribute further to the knowledge available to you and to those who do as you do. If you find a suitable "off-the-net" solution, the mere mention of the specifics of your environment as a comment or reply will extend the suitability of the solution for everyone else. If a solution you find does not fit, mentioning why will either result in the information being updated and adapted, or save others time through your contributed analysis.

Simple really - but as the great one did say, it's much harder to be simple.