What makes a good software library?

Last Updated: June 1, 2016 | Created: January 5, 2015

If you are a professional software developer then you will have used lots of software libraries/framework to help you build an application. In fact nowadays one of the key skills a developer need is pick the right software libraries to use in a given situation. I also expect most development teams have built their own software libraries to help them with some specific aspect of the work they do.

In this article I explore what I think makes a good, even great, software library. I would appreciate your thoughts on this so do add comments at the end.

NOTE: By software libraries I mean a software subsystem with some sort of Application Programming Interface (API). In the Microsoft/.NET approach software libraries are normally a Class Library with a known API, which are downloaded as a .dll file or most likely using NuGet. JavaScript libraries are normally a series of JavaScript files, possibly provided by Bower or Node.js npm, that again have some sort of API.

What motivated me to write this article?

I am building a number of .NET software libraries, one of which is an open-source project called GenericServices. Recently I built a small, but fairly complex test application to check out GenericServices and various other supporting libraries. This ‘stress test’ was really interesting in ways I did not expect (you can read more about what I did in this article).

In writing the code I found some really complex uses ‘just worked’ – in fact in a couple of cases I wrote the Unit Tests that I though should fail but they passed first time, which really threw me for a while. In other parts I had a problem in someone else’s software library and even when looking at their code I hadn’t a clue how to fix the problem.

All of this got me thinking about “what makes a good software library?” This is especially important to me as I want my libraries to be easy to use by anybody. I therefore been thinking about this for some months and I thought I would write down my thoughts.

Categorising software libraries

To help me in considering this issue I came up with categories for the types of software libraries I have come across. This helped me work out what I thought worked well and what didn’t, and more importantly why the library was like that.

Over a few months I came up with six categorisations for software libraries. They are:

1. Normal software libraries

These are the normal, run of the mill software libraries. Their APIs are predictable and maybe a bit boring, but they get the job done. However they are great for simple problems and can save you time.

I would say that some loggers, like Log4Net, are like this. Log4Net is a fine package and it does great job. I understand the problem of logging and I could write it myself, but this library has already implemented a great solution so I use that. Job done.

2. Verbose software libraries

These are run-of-the-mill software libraries. They work, but the API’s might be a bit long-winded or verbose.

One example of a library I use that has some aspects of this is Kendo UI. I pay good money to use this library because of what it provides, but the API can be hard work. It uses a JQuery type approach where everything is configurable. That can be useful but in Kendo UI there can be lots of things that need to be configured. For instance of Charts there are over a thousand configuration options which makes finding the right one quite hard work. In contrast NVD3.js charting has separate methods for each chart type which exposes the configurations needed by that chart.

3. Clever software libraries

These are software libraries that do really clever things, but as a developer you really need to know what you are doing. You get comments like “the trick is…”

While not technically a software library I think SQL, T-SQL in this case, is an example of a ‘clever’ library/API. T-SQL is the language for talking to a SQL server, which is a form of API, but called a Domain Specific Language. Using SQL I have built some amazing queries which perform complex modelling really quickly, but it is NOT simple. That is because the problem isn’t simple and the SQL API doesn’t try to hide it from you.

4. ‘Magic’ software libraries

These are the software libraries that do things that are useful, but they are so intricate that it is hard to understand what they are doing – I call these ‘magic’ software libraries. If they are written well they can be great, but if the underlying problem is complex then they are hard to write.

I think Entity Framework (EF) is a really good example of a good ‘magic’ software library. EF is database front-end library, known as an ORM, which tries to hide the complexities of the T-SQL API mention above to make it easier to work with databases. It is a great library, but it does take a while to really understand EF because there is so much ‘magic’ going on inside it.

5. ‘Mysterious’ software libraries

These are software libraries are really hard to understand. This could simply be because it’s a very complex API. However ‘magic’ software libraries can often become ‘mysterious’ when their standard ‘magic’ doesn’t work.

Entity Framework (EF) can suffer from this in places. For instance if you have a many-to-many relationship in a database then EF inserts an extra database table which is looks after to make it really easy to use. The problem is this ‘magic’ has some limitations and when it doesn’t work you need to know how to fix it. In actual fact the link to Updating a many to many relationship in entity framework on my site gets the highest number of hits of all of my posts.

6. Elegant software libraries

These are software libraries that are a joy to use. They may do simple or clever things, but they are a) short and to the point and b) you are clear on what is happening. This what all of us strive to achieve when we write a software library.

One example of this is Microsoft’s LINQ. LINQ is a data query language which I think is really elegant. Yes, there are a few complicated bits in it, like Grouping, but most of the time it’s easy to string together a series of commands to achieve quite complex things. It is clean, clear and only mentions the things you need.

NOTE: I haven’t included a ‘must work’ category as I take that as a prerequisite. However some libraries, especially ones that have not been used much, can be suspect. Caveat emptor.

Having finished these software categories let us look at some particular issues that we should consider when building software libraries.

Think about how the developer will use your library

Here are some ideas to make you as the library writer think about the experience the end-user developer has when using your library.

1. Think from the user’s perspective

As the developer of a library we tend to be focused on the implementation, not on the user. That makes the library writer think about what he/she wants to achieve, not how the user wants to use it. Let me give you an example from my own experience of developing my GenericServices library.

When I started development of GenericServices I saw that the access to the database and the calling of business logic had some similarities. I therefore ‘saved time’ by writing code that handled both. Problem is the code was a little bit complex and a little bit opaque to the end user.

This became apparent when I started to us it. Even I might hesitate for a second or two about a certain term, so what would another developer feel like. I therefore split the two libraries, with GenericServices focused on the core problem of database access. Yes, there is some duplication but it is much clearer to the user/developer and two libraries are much simpler.

2. Focus on the most important use-cases

A Use Case is a term to define how a user, in this case a software developer, will use your tools, in this case the API of your software library, to achieve a specific action. I found it very useful to order the use cases.

The most obvious way to order is on how often a particular use case was likely to be used. Sometimes there isn’t this sort of split, but often there is – sometimes called the 80/20 rule. If you do have this split then you need to make the API for the ‘normal’ (80%) cases short and easy. The ‘special’ (20%) use cases still need to be supported, but the API can have a few more lines of code to archive it so that the ‘normal’ use cases stay simple. This can make a big difference to how many lines of code a developer will need to write in complete project.

In the case of my GenericServices library I had another ordering based on the architecture used in the data layer: a CRUD design or a Domain-Driven Design (DDD) design. These two design approaches have very different needs when it comes to adding or changing data. DDD is more difficult to support because the data layer imposes some rules that GenericServices has to abide by. I managed to support DDD while having no impact on the simpler CRUD usages, which was great. Maybe you have another perspective/ordering that will help you focus your API.

3. Principal of least astonishment

I came across the term ‘Principal of least astonishment’ in one of Addy Osmani blogs, although he didn’t coin the term. Addy explains this as follows:

Try not to violate the principle of least astonishment, i.e. the users of your component API shouldn’t be surprised by behaviour. Hold this true and you’ll have happier users and a happier team.

I really like this term. It sums up the need for the The API to be a) obvious b) consistent and c) in tune with what developers expects from a library.

To that end if you are developing for the Microsoft .NET arena I would recommend the book ‘Framework Design Guidelines: Conventions, Idioms, and Patterns for Reusable .NET Libraries: Conventions, Idioms, and Patterns for Reuseable .NET Libraries (Microsoft .Net Development)’. This book is packed with well researched guidelines which will make your .NET library look more like the standard .NET libraries.

NOTE: This book was written in 2008 so it does not include LINQ Lambda and the newer async/await, both of which I think change some of the guidelines. However I still recommend this book.

Complex problems demand a lot of thought

If your problem is complex then you have to be really careful when contemplating writing a good library. What you are normally doing in a library is providing a set of commands (API) that help you do a job more quickly than if you wrote the actual code yourself. This often requires you to ‘abstract’ the underlying problem into a set of commands.

With small problems like a logger then the abstraction isn’t that great, but the more complex the problem the more you are abstracting the problem. The problems come if you abstraction is not perfect, or as it is called ‘a leaky abstraction’. In the post ‘The Law of Leaky Abstractions‘ Joel talks about abstractions. He says:

All non-trivial abstractions, to some degree, are leaky.

Abstractions fail. Sometimes a little, sometimes a lot. There’s leakage. Things go wrong. It happens all over the place when you have abstractions.

Get the abstraction right and you have a ‘magic’, possibly Elegant design. Get it wrong and you can have a ‘Mysterious’ software library that people will give up on. Here are a few ideas I have on helping with this problem:

  • Include ‘get out of jail’ access points
    By this I mean the library tries to abstract things but it still provides direct access to the underlying system. Entity Framework does this by including access to direct SQL commands for the times when its abstraction gets in the way.
  • Don’t bite off more than you can chew
    Sometimes a complex problem can be broken down into sub-problems. This allows you to build a few smaller, focused libraries rather than one big complex library. I have used this a few times and found it has worked well for me.
  • If all else fails, don’t abstract too much. The T-SQL mentioned earlier does this. It does not abstract the problem much at all which leaves it up to the developer to learn its complexities of the problem space. That way it minimises the leakiness of the abstraction.

Striving for Elegance

Many years ago I read a brilliant book by Eric Evans called Domain-Driven Design. In this book Eric devotes a whole section called “Refactoring towards deeper insight”. His message is we are used to refactoring to make the software better, but what about refactoring to improve what the software does? My experience is that making a good software library comes from a series of “deeper insights” following by some key refactoring.

The development of my GenericServices library proceeded with a mixture of hard works and “deeper insights”. Below is a list of the “deeper insight” moments that I think improved the library.

  1. The service layer has a lot of nearly duplicate code. I could build a library to do that.
  2. Generics would be a really good approach to use.
  3. The class signatures are hard to use, especially with dependency injection. I need to make them simpler.
  4. Having database and business logic together is confusing. I should split them.
  5. The stress test (see next section below) has shown up some issues I need to improve.
  6. I need computed parameters. Let’s use DelegateDecompiler to do that.

I already have my eyes on a few more changes:

  1. I could simplify even further the class signatures.
  2. Domain-Driven Design databases are important to me. I could improve the support for that.

Of course I would love to have been able to go immediately to the final design, but just I don’t think I could have done that. You have to live with a library and see it being used to find out what is wrong. In fact my eldest son Ben, who is a good programmer in his own right, says “Build one to throw away”. It does seem like you need to build something to see what is wrong before you can make it better – well I do at least.

Stress your library to find the pinch points

I built hundreds of Unit Tests and even build a whole demo web site to show GenericServices in action. This weeded out lots of issues but because I wrote the code it was subconsciously biased towards the GenericServices way of doing things.

Normally the ‘stress test’ is on a real project of some size. However I didn’t have a big project on the go at that stage so I decided to stress test GenericServices myself by using someone else’s database. I used the Microsoft AdventureWorks2012Lite database which is a) not designed by me and b) has some database features I had not come across before. That really pushed my library in ways I had not done before, and revealed some areas that needed more work.I recommend anyone who is trying to perfect a software library to ‘stress test’ their library.

The users of the library will test your library, but not as well as you can and they aren’t likely to feed back if they just find it too hard. For the investment of only two weeks I could really dig into what worked/did not work myself and I think it radically improved my library. Maybe you can stress your library on a real project you are working on, but allow time to tweak the library.

You can read about some of the things I found in the article called ‘Using Entity Framework with an Existing Database: User Interface’ on the Simple-Talk web site.

The end

I hope you have enjoyed reading this article and I expect you have your own views/experiences. I would love to hear them as it is only by sharing ideas do we get better at what we do.