John von Neumann, the “father” of computers as we know them, once said: “There’s no point in being exact about something if you don’t even know what you’re talking about.” I like that quote, and I’ll tell you why.

I have delivered courses on requirements management and requirements gathering again and again. That is no surprise, as bad requirements are the main reason projects fail. The ways to end up with “bad requirements” are countless, and I am not going to detail them today. Today, I want to look at an interesting phenomenon in natural language. I had to work on natural language during my PhD, and I have kept an eye on it ever since. As I am also interested in learning languages (I learnt Hungarian out of curiosity and for the intellectual challenge), I keep a second eye on it. So, what do my eyes tell me?

The other day I bought “The Story of Writing” by Andrew Robinson at the British Museum (I love the British Museum, and its bookshop kills my wallet every time I go). The introduction discusses the different writing systems of the world. We all know that learning Chinese is far more difficult than learning English. That may be obvious, but the explanation of why is still interesting. I quote:

All scripts that are full writing – that is, a ‘system of graphic symbols that can be used to convey any and all thought’ (to quote John DeFrancis, a distinguished American student of Chinese) – operate on one basic principle, contrary to what most people think, some scholars included. Both alphabets and the Chinese and Japanese scripts use symbols to represent sounds (i.e. phonetic signs); and all writing systems use a mixture of phonetic and semantic signs. What differs – apart from the outward forms of the symbols, of course – is the proportion of phonetic to semantic signs. The higher the proportion the easier it is to guess the pronunciation of a word. In English the proportion is high, in Chinese it is low. Thus English spelling represents English speech sound by sound more accurately than Chinese characters represent Mandarin speech; but Finnish spelling represents the Finnish language better than either of them. The Finnish script is highly efficient phonetically, while the Chinese (and Japanese) script is phonetically seriously deficient.

So, in short, you can read Finnish accurately once you know the rules, and you cannot do that nearly as easily with English. Here is a diagram (from DeFrancis and Unger) that places the different writing systems on a theoretical continuum between pure phonography and pure logography.

What it says is that the closer a writing system is to pure phonography, the easier it is to read a text and produce the right sounds. Interestingly enough, I spent most of my life believing that French was harder to read than English, when it is the opposite. And believe me, as a Frenchman who adopted English as his working language, I have learnt my share of words that you cannot pronounce unless you already know them. But that is another story.

So, what we discover here is that every writing system is made of phonetic and semantic signs, and that the phonetic side is far from perfect. Now, what about the semantic side? Back to our requirements: what matters to us is the information conveyed from one person to another, that is, the semantics. And there the situation is even worse than on the phonetic side! What we mean when we produce a sentence differs depending on the context. It differs depending on your culture (with the very same words). It differs depending on your age, your experience, the time of day, and so on. Let’s take a couple of examples:

Suppose you are in your doctor’s waiting room with your kid. You have been waiting for a while, and your child is getting bored and looking for some fun to kill the time. First she takes magazines and tears them apart. Then she moves chairs around, making noise and disturbing everybody who is trying their best to read. You take your child aside and say: “That’s enough. I am tired of you. I don’t know you anymore!” Every one of you, readers, understands that you don’t literally mean “I don’t know you!” but rather “I am ashamed, and I wish people did not know we are related.”

Now take the same sentence after you have been cheated by someone you really counted on, someone who has disappointed you like never before. You now say: “I don’t know you anymore!” In this case you mean something like: “I do not want any contact with you anymore. If I ever have any, I will behave as if we were strangers.”

The example I often use in my training sessions is the expression “I’m going to kill you!” Clearly, this expression rarely means what the words literally say, which is quite fortunate! We read the meaning from the intonation, the facial expression, the body language, the precise context in which it is said, and so on.

So where is this leading? It is leading to the fact that most projects that fail because of poor requirements rely on written natural language only, which means you remove intonation, facial expression, body language and so on. Only words remain. And words are so weak at being precise! These projects try to prevent the unexpected by writing down every detail of the project to death when, in fact, the more you write, the more inaccuracy you add. This is intrinsic to natural language. There is no way you can define all the details of a software project in writing alone and be accurate. So there must be other ways to do it. And indeed there are multiple solutions. In no particular order: use diagramming techniques like UML, increase the level of communication between business stakeholders and the IT team, implement documentation reviews, produce a project or company dictionary, get people to know each other and understand each other’s roles, remove silos and ivory towers within the project, and so on. I’ll surely pick some of these for a future post.

Keep in mind: natural language is treacherous, and it was not made to be understood without added human context. If it were, then after all these years of trying to write laws with precision, we would no longer need to refer to case law. We still do, which means we are still unable to write a law that says precisely what we mean. So why would we manage it for software?

Let’s think about IT!