Hansel and Gretel Architecture

Expanding on a random tweet of mine…

Ever since we gained the ability to run multiple applications/systems simultaneously, either on one machine or across machines, we’ve had the need to share data between applications. The general ledger system needs payroll data. The order processing system needs inventory data. The customer web site needs pricing data. And so it goes.

Over the decades various schemes and theories have been developed for solving this problem. Some of them were good long ago when no better solution was possible. Some of them are still good in limited scenarios. Some are the current state of the art. And as time goes on, I have no doubt someone will devise some method we haven’t considered or dreamed of yet.

One of the earlier techniques used was to have multiple applications use the same database. This allowed them to all see the same data, which made it possible for the general ledger system to get at the payroll data, and so forth. It saved costs, too, in an era when a server capable of hosting a database was expensive and difficult to maintain and administer, and when only a few dozen people in a couple of accounting-focused (or similar) departments needed to access the data. In short, it was a really good idea from a technical and business perspective.

But then machines got faster and cheaper. More and more people needed to have direct access to data. And we created more and more applications for different groups. Some of these groups had different ideas about what the different “entities” in the business domain meant. To some people, an order is a request to pull some item from inventory. To others it’s a request to generate a bill. Still others see it as historical information that can be used to determine future demand for marketing plans.

And then we figured out that if we needed the accounting folks to invoice a customer for an order, we could just put an InvoiceCustomer flag on the order table and set it to some value representing true. So we set it to “T” and hoped that they would then change it to something else when they generated the invoice. We never documented that we expected it to be “F”, we just assumed they would know.

Eventually, someone decided that we had added a lot of flags. Maybe we should find a way to make it easier so we didn’t have to bother the DBAs with a new state every few weeks. So we created a new field. We called it “OrderState”. Now we decided that an order that needed to be invoiced would have a value of “NI” (for needs to be invoiced) in the “OrderState” field. But we couldn’t get rid of the InvoiceCustomer flag, because we weren’t sure if any other application was using it.

Before long, there is a massive trail of poorly documented and specified fields. Nobody has a clear picture of why (or even if) they are required. They are used across a bunch of applications and reports, perhaps even some applications that aren’t being developed anymore. You don’t dare touch any of the items on this trail. They are like the breadcrumbs from an old folk tale – for various reasons they disappear or degrade over time until they are no longer able to show you the way home.

No matter what the data elements, no matter how tightly you control the database, no matter how well you document it, some of these breadcrumbs will materialize. The best way to combat this problem is to have a separate data store for every application/service. That way, you can know for sure how the application makes use of the data, and be sure that no one else is using the data differently. Don’t even allow another application to have read access. Require other systems to interface with the data via your application/service using messaging, pub/sub, or web services.
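To make that last point a little more concrete, here is a minimal sketch of the idea, not a real implementation. Everything in it is hypothetical: the MessageBus is a tiny in-process stand-in for whatever messaging or pub/sub infrastructure you actually use, and OrderService, InvoicingService, and OrderNeedsInvoice are made-up names. The point is only that the order data lives in a private store, and invoicing learns about orders from a published message instead of reading a flag in a shared table.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical message: the documented contract between the two services.
sealed class OrderNeedsInvoice
{
    public OrderNeedsInvoice(int orderId, decimal amount)
    {
        OrderId = orderId;
        Amount = amount;
    }

    public int OrderId { get; }
    public decimal Amount { get; }
}

// Hypothetical in-process stand-in for a real message bus or pub/sub system.
sealed class MessageBus
{
    private readonly List<Action<OrderNeedsInvoice>> _subscribers =
        new List<Action<OrderNeedsInvoice>>();

    public void Subscribe(Action<OrderNeedsInvoice> handler)
    {
        _subscribers.Add(handler);
    }

    public void Publish(OrderNeedsInvoice message)
    {
        foreach (var handler in _subscribers)
        {
            handler(message);
        }
    }
}

// Owns the order data. Nothing outside this class can read or write the store.
sealed class OrderService
{
    private readonly Dictionary<int, decimal> _orders = new Dictionary<int, decimal>();
    private readonly MessageBus _bus;

    public OrderService(MessageBus bus)
    {
        _bus = bus;
    }

    public void PlaceOrder(int orderId, decimal amount)
    {
        _orders[orderId] = amount;
        // Instead of setting an InvoiceCustomer flag and hoping, say so explicitly.
        _bus.Publish(new OrderNeedsInvoice(orderId, amount));
    }
}

// Keeps its own record of what it has invoiced, in its own store.
sealed class InvoicingService
{
    private readonly List<int> _invoicedOrderIds = new List<int>();

    public InvoicingService(MessageBus bus)
    {
        bus.Subscribe(Handle);
    }

    public IReadOnlyList<int> InvoicedOrderIds
    {
        get { return _invoicedOrderIds; }
    }

    private void Handle(OrderNeedsInvoice message)
    {
        _invoicedOrderIds.Add(message.OrderId);
    }
}
```

Wire it up by creating one MessageBus, passing it to both services, and calling PlaceOrder. The message type is now the only thing the two services share, so there is exactly one place to look when someone asks who depends on what.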

In a world where IThis shouldn’t be interchangeable with IThat…

Discovered something unexpected the other day. Not sure why I never saw it before, but I found out that if you are iterating through a generic collection of items where the collection is holding an interface, you can use a completely unrelated interface as your iterator.

Not sure what the heck I just said? Let’s look at some code:

    List<IAmAnInterface> stuff = new List<IAmAnInterface>
    {
        new AClass {AValue = "First"},
        new AClass {AValue = "Second"}
    };

    // The loop variable is a different interface than the collection's element type.
    foreach (IAmAnotherInterface item in stuff)
    {
        item.DoSomething();
    }

Believe it or not, this code will compile. IAmAnInterface and IAmAnotherInterface have nothing to do with each other, and AClass only implements IAmAnInterface:

    interface IAmAnInterface
    {
        void DoSomething();

        string AValue { get; set; }
    }

    interface IAmAnotherInterface
    {
        void DoSomething();

        string AValue { get; set; }
    }

    class AClass : IAmAnInterface
    {
        public void DoSomething()
        {
            Console.WriteLine("From AClass {0}", AValue);
        }

        public string AValue { get; set; }
    }

So, it looks like this shouldn’t compile, right? No way that it will work in the real world. But the compiler has no way of knowing that the objects in the stuff collection in the first snippet don’t implement both interfaces. It apparently has to allow for that possibility, which seems strange, but I bet if you were coming at it from the other perspective and had an object from a class that implemented both interfaces, you might find it strange if this didn’t work.

I’m not sure why this decision was made: it seems like it would cause more harm than good, because you will only find out about the problem at runtime when it fails. Of course, if you have a unit test or two around it all, you’ll find that out pretty quickly. Regardless, now I know that the compiler won’t catch this for me. And now you know too.
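For what it’s worth, here is a small sketch of what seems to be going on. As far as I can tell from the C# language specification, foreach inserts an explicit cast from the collection’s element type to the type of the loop variable, and an explicit cast from one interface to another is always allowed at compile time. The ForeachCastDemo class and its method names are my own, made up for illustration, and the interfaces are trimmed-down copies of the ones above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

interface IAmAnInterface { void DoSomething(); }
interface IAmAnotherInterface { void DoSomething(); }

class AClass : IAmAnInterface
{
    public string AValue { get; set; }

    public void DoSomething()
    {
        Console.WriteLine("From AClass {0}", AValue);
    }
}

// Hypothetical helper class, only here to hold the two examples.
static class ForeachCastDemo
{
    // Spells out the cast that foreach (IAmAnotherInterface item in stuff) hides.
    // Returns true if the cast blew up at runtime.
    public static bool ExplicitCastThrows(List<IAmAnInterface> stuff)
    {
        try
        {
            foreach (IAmAnInterface element in stuff)
            {
                // Compiles fine; fails at runtime if the object
                // does not actually implement IAmAnotherInterface.
                var item = (IAmAnotherInterface)element;
                item.DoSomething();
            }
            return false;
        }
        catch (InvalidCastException)
        {
            return true;
        }
    }

    // OfType<T>() skips elements that cannot be cast instead of throwing.
    public static int CountMatching(List<IAmAnInterface> stuff)
    {
        return stuff.OfType<IAmAnotherInterface>().Count();
    }
}
```

If you just want the compiler on your side, declare the loop variable with var. It then takes the collection’s element type, no cast is inserted, and there is nothing left to fail at runtime.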

Scrum 101 Materials

Thanks to everyone who attended my Pittsburgh Code Camp 2010.2 talk, “Scrum 101”. Great audience interaction – I think we all learned more than we would have if it had just been me rattling on.

I’ve uploaded my slides and the Excel workbook off of which I based my sprint burn down chart. You can find them here:

Slides

Workbook

The Dangers of Data Binding (part 3456 of infinity)…

So, when you have a form, it’s nice to bind data to the controls. In spite of my personal biases, there is nothing wrong with that.

But, you have to be careful about what you are binding to – if the object that you bind to the control does not go out of scope, the control will not, either. So, you will have a memory leak. The fun thing about this kind of memory leak is that it will also leak graphics handles, which are a finite resource on a machine. If you have a lot of controls on the form, you may actually crash the application fairly easily, even though there is memory to spare.

Here is an example of where this kind of data binding problem happened:

    uxComboBoxPackageTypes.DataSource = new BindingSource
        (PackageTypeDictionary.PackageTypes, null);

Note that PackageTypeDictionary.PackageTypes is a static dictionary. It will never go out of scope, as long as the application is running. So, in this case, every Med Detail screen that gets popped up in Visual Assign will stay around forever.

The solution, in this case, is to set the DataSource to null when the form closes, which breaks the binding:

    protected override void OnClosed(EventArgs e)
    {
        base.OnClosed(e);

        …

        uxComboBoxPackageTypes.DataSource = null;
    }
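If you want to see the general shape of this kind of leak without any WinForms in the way, here is a rough sketch. It assumes the retention happens through an event subscription held by the long-lived object, which is my simplification and not a description of what BindingSource does internally. LongLivedSource and Screen are hypothetical stand-ins for the static data source and the form.

```csharp
using System;

// Hypothetical long-lived object, standing in for a static data source.
static class LongLivedSource
{
    public static event EventHandler Changed;

    // How many handlers (and so how many subscriber objects) the source
    // is currently holding references to.
    public static int SubscriberCount
    {
        get { return Changed == null ? 0 : Changed.GetInvocationList().Length; }
    }
}

// Hypothetical short-lived object, standing in for a form or control.
sealed class Screen : IDisposable
{
    public Screen()
    {
        // From here on the static source references this instance,
        // so the garbage collector cannot reclaim it.
        LongLivedSource.Changed += OnChanged;
    }

    private void OnChanged(object sender, EventArgs e)
    {
    }

    public void Dispose()
    {
        // Breaking the link is the moral equivalent of DataSource = null.
        LongLivedSource.Changed -= OnChanged;
    }
}
```

Create a Screen and SubscriberCount goes up by one; dispose it and the count goes back down. Forget the Dispose call and every Screen you ever create stays reachable from the static source.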

To find this, I used YourKit (which helped me pin down what was leaking in the overall scheme) and WinDbg (where I got into the gritty details). There are lots of good resources on WinDbg out there, but I want to point out this excellent article, which really helped me figure out how to track down event chain leaks (and led me to where the real problem lay):

http://blogs.msdn.com/b/tess/archive/2006/01/23/net-memory-leak-case-study-the-event-handlers-that-made-the-memory-baloon.aspx

Pgh Code Camp 2010.2

It’s getting to be that time. Wait, what time? We always have Code Camp in the spring – the leaves are about to fall (or have started falling, if your trees are lame like mine) – surely Eric is on some sort of hallucinogens. Nope. This year we’re doing Code Camp twice – Code Camp 2010.2 will be October 16, 2010, once again in Pitt’s compsci building. Details can be found here.

Our call for speakers went out a while back, but there’s still room for more, so if you want to present on something, there’s still time. You can fill out the call for speakers form here: http://codecamppgh.wufoo.com/forms/code-camp-20102-call-for-speakers/.

Eric Kepes

Just some guy who cares about good software.
