Beyond C#7 - Your Definitive Guide
Following the development of the C# language over the last two decades has been amazing. When Microsoft unveiled C# 1.0 at the dawn of the 21st century, industry veterans were skeptical, seeing it as little more than a boring alternative to Java. James Gosling essentially called it a stupid imitation of his brainchild. Despite the odds, the team behind the C# language, led by one Anders Hejlsberg (who doesn't even have a beard), has continued to impress us with its ingenuity and pragmatic direction.
Figure 1: programming language designers by facial hair. Image: Wired
Figure 2: StackOverflow's survey result showing C# as one of the most loved programming languages
With each new C# version, there's a clear trend toward functional programming. In fact, C#'s continued relevance over the years owes a lot to the ideas it has adopted from the functional world. Imperative and OOP features still have their place in the language, but they're increasingly being retrofitted with, or glossed over by, functional and declarative constructs.
In this article, we will go over some important 'pillars' of functional programming to see how the C# team has incorporated them into their language and take a look at what they might be planning for the future.
Generics and type inference
Generics, or parametric polymorphism, let programmers handle many kinds of data with a single piece of code. And doing more with less is always a good thing. Generics are the foundation upon which many advanced features are built. They are the bread and butter of any self-respecting language with a static type system.
Although generics are relatively simple to use from a working programmer's perspective, any language designer knows that they are a huge feat of engineering to implement. Java got generics a bit earlier than C#, but its implementation suffers from type erasure: the type parameter is known only to the compiler, and it is lost (erased) when the code is compiled to JVM bytecode.
C# generics shipped with C# 2.0 in 2005, the result of a collaboration with Microsoft Research in Cambridge, UK, a place famous for functional programming research. The team decided to make the effort to implement reified generics, which means type information is preserved all the way down to the CLR. It is said that, without that great undertaking, later development of C# would have been severely limited. For example, there would be no LINQ in C#3, no Task Parallel Library in C#4, and no async/await in C#5. Amusingly, Don Syme, who worked on C# generics, went on to lead the design of F#.
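To make the idea of reified generics concrete, here is a minimal sketch. The class and method names (ReifiedDemo, NameOf) are made up for this illustration; the point is only that the type argument is still available at run time.

```csharp
using System;
using System.Collections.Generic;

public static class ReifiedDemo
{
    // Hypothetical helper: T is a real type at run time, so typeof(T) works here.
    public static string NameOf<T>() => typeof(T).Name;

    public static void Main()
    {
        Console.WriteLine(NameOf<int>());      // Int32
        Console.WriteLine(NameOf<string>());   // String

        // The two constructed list types remain distinct at run time.
        object ints = new List<int>();
        Console.WriteLine(ints is List<int>);     // True
        Console.WriteLine(ints is List<string>);  // False
    }
}
```

Under type erasure, the last two checks could not tell the two list types apart, since the element type would no longer exist at run time.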
C#3 in 2007 introduced the var keyword and better generic type inference in method invocation, where the compiler infers the type arguments from the arguments being passed. While not specific to generics, this feature removes a lot of verbosity when declaring generic variables. Amusingly, many developers mistook this for C# having a dynamic type system, which it totally doesn't.
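A small sketch of both kinds of inference follows. The Pair helper is hypothetical and exists only to show type arguments being inferred from a call.

```csharp
using System;
using System.Collections.Generic;

public static class InferenceDemo
{
    // Hypothetical helper: T1 and T2 are inferred from the arguments at the call site.
    public static KeyValuePair<T1, T2> Pair<T1, T2>(T1 first, T2 second)
        => new KeyValuePair<T1, T2>(first, second);

    public static void Main()
    {
        // Without var, the type would have to be written twice:
        // Dictionary<string, List<int>> scores = new Dictionary<string, List<int>>();
        var scores = new Dictionary<string, List<int>>();
        scores["alice"] = new List<int> { 90, 85 };
        Console.WriteLine(scores["alice"].Count);   // 2

        var pair = Pair("answer", 42);   // inferred as KeyValuePair<string, int>
        Console.WriteLine(pair.Value + 1);   // 43

        // var is still statically typed; the next line would not compile:
        // scores = "not a dictionary";
    }
}
```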
C#4 in 2010 added covariance and contravariance, which allowed greater reuse of generic types.
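As a rough illustration of what variance buys, the sketch below relies on the built-in IEnumerable<T> (covariant) and Action<T> (contravariant). The CountAll method is invented for the example.

```csharp
using System;
using System.Collections.Generic;

public static class VarianceDemo
{
    // Hypothetical method that only knows about objects.
    public static int CountAll(IEnumerable<object> items)
    {
        var count = 0;
        foreach (var item in items) count++;
        return count;
    }

    public static void Main()
    {
        // Covariance: a sequence of strings can be used as a sequence of objects.
        IEnumerable<string> names = new List<string> { "Ada", "Grace" };
        Console.WriteLine(CountAll(names));   // 2

        // Contravariance: a handler for any object can be used as a handler for strings.
        Action<object> printAny = o => Console.WriteLine(o);
        Action<string> printString = printAny;
        printString("hello");   // hello
    }
}
```

Before C#4, the call to CountAll would have required copying or casting the list first.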
Future development in this area includes default values for generic type parameters and better type inference in specific cases. There are more exciting things in the functional world, like automatic generalization and Hindley-Milner type inference, that make generic code really ergonomic in languages like F# or Haskell. However, due to its C++ and OOP lineage, there's little chance that C# will develop those two features.
Functions as first class citizens
Have you ever wondered what would happen if we could declare functions as easily as declaring a string, or combine functions as naturally as adding two integers? Haskell users have been enjoying functions as first-class citizens like that for a long time. C# got the first part, declaring functions, with lambda expressions in version 3, and they were instrumental to the success of LINQ. Method group conversion, together with using static in C#6, makes passing functions around as delegates that much easier.
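Here is a minimal sketch of functions being treated as values in C#. All variable names are illustrative; the library calls (Select, int.Parse, Math.Abs) are standard.

```csharp
using System;
using System.Linq;
using static System.Math;   // C#6: static members usable without the class name

public static class FirstClassDemo
{
    public static void Main()
    {
        // A lambda stored in a variable, like any other value.
        Func<int, int> square = x => x * x;

        var squares = new[] { 1, 2, 3 }.Select(square).ToArray();
        Console.WriteLine(string.Join(",", squares));   // 1,4,9

        // Method group conversion: int.Parse is passed directly as a delegate.
        var parsed = new[] { "10", "20" }.Select(int.Parse).ToArray();
        Console.WriteLine(parsed.Sum());   // 30

        // With "using static System.Math", Abs can be named without a prefix.
        Func<int, int> magnitude = Abs;
        Console.WriteLine(magnitude(-5));   // 5
    }
}
```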
Haskell and F# also have dedicated operators to compose and chain functions together. They also have automatic currying (named after Haskell Curry), which turns a function with any number of parameters into a chain of one-parameter functions that are easy to compose. Because of this pair of features, we often see a style in functional code where the programmer prepares a bunch of small, testable functions, weaves them into a chain of computation, and finally pipes in the data at the last step.
This coding style is highly testable and tremendously reduces the need for elaborate mocking. This is in contrast to the OOP approach, where, in the words of Joe Armstrong of Erlang fame: "You wanted a banana but what you got was a gorilla holding the banana and the entire jungle."
Extension methods in C#3 help a bit in this regard, letting us write code that resembles a chain of computation.
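As a sketch of that style, the extension methods below (Words, Capitalize, JoinWith) are invented for this example and are not part of the BCL. They let a small text transformation read left to right as a chain.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TextExtensions
{
    // Hypothetical extension methods, made up for this illustration.
    public static IEnumerable<string> Words(this string text)
        => text.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);

    public static string Capitalize(this string word)
        => char.ToUpper(word[0]) + word.Substring(1);

    public static string JoinWith(this IEnumerable<string> parts, string separator)
        => string.Join(separator, parts);
}

public static class ChainDemo
{
    // Each step feeds the next, reading like a pipeline.
    public static string ToTitle(string input)
        => input.Words().Select(w => w.Capitalize()).JoinWith(" ");

    public static void Main()
    {
        Console.WriteLine(ToTitle("beyond  c sharp seven"));   // Beyond C Sharp Seven
    }
}
```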
Unfortunately, once again, due to its OOP lineage and the many overloaded methods in the BCL, C# will probably never see anything like native currying or built-in function composition. We will have to look to user-land libraries like the excellent lang-ext for that need.
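To give a feel for what such libraries provide, here is a hand-rolled sketch of composition and currying. The helper names Then and Curry are assumptions made for this example; they are not the API of any particular library.

```csharp
using System;

public static class FuncExtensions
{
    // Hypothetical helper: run "first", then feed its result to "second".
    public static Func<A, C> Then<A, B, C>(this Func<A, B> first, Func<B, C> second)
        => x => second(first(x));

    // Hypothetical helper: turn a two-parameter function into nested one-parameter functions.
    public static Func<A, Func<B, C>> Curry<A, B, C>(this Func<A, B, C> f)
        => a => b => f(a, b);
}

public static class ComposeDemo
{
    public static void Main()
    {
        Func<int, int, int> add = (a, b) => a + b;
        Func<int, int> addTen = add.Curry()(10);   // partially applied
        Func<int, int> doubleIt = x => x * 2;
        Func<int, string> show = x => "result: " + x;

        // Build the chain first, pipe in the data at the last step.
        var pipeline = addTen.Then(doubleIt).Then(show);
        Console.WriteLine(pipeline(5));   // result: 30
    }
}
```

The explicit Func type annotations are the price paid here: C# cannot infer a delegate type for a bare lambda the way F# or Haskell infer function types.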
Everything is an expression
Most constructs in a programming language can be divided into two kinds: statements and expressions. Statements are all about side effects, while expressions are all about values. Expressions are typically shorter and more composable than statements, which depend on a specific ordering to make sense. At the far end of the functional spectrum, there is the LISP family of languages (Scheme, Clojure), where everything is an expression. Any program, however complex, can be composed from a tree of small, reasonable and testable expressions. A high degree of composability is also a good thing.
Obviously, C# will never go that far, and it will never become a full-fledged functional language. However, its designers have historically paid attention to making the language more expressive. Many features have enabled developers to replace statements with expressions where it makes sense:
C#3 introduced lambda expressions, which allowed functions to be passed around freely, eliminating the need for boilerplate in the form of intermediary objects. This revolutionized the way APIs in libraries and frameworks were written for C#.
C#3 also added object and collection initializer expressions, making it very succinct to declare complex data structures.
In C#6, we can start replacing method bodies with lambda-like expression bodies, making class declarations really terse.
C#7 extended the is expression with patterns. This is an important foundation for powerful pattern matching down the road.
The out keyword, widely regarded as an ugly hack for multiple returns, can now declare its variable right inside the call, removing the need for a clunky variable declaration in front of it.
For better or worse, throw is now also an expression in C#7.
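The features listed above can be sketched together in one small example. The Point class and the Require and Describe methods are hypothetical names chosen for illustration.

```csharp
using System;
using System.Collections.Generic;

public class Point
{
    public int X { get; set; }
    public int Y { get; set; }

    // C#6: expression-bodied member.
    public int Sum => X + Y;
}

public static class ExpressionsDemo
{
    // C#7: throw used as an expression.
    public static string Require(string value)
        => value ?? throw new ArgumentNullException(nameof(value));

    public static string Describe(object input)
    {
        // C#7: "is" with a pattern tests the type and binds a variable in one expression.
        if (input is Point p) return "point with sum " + p.Sum;

        // C#7: the out variable is declared inline.
        if (input is string s && int.TryParse(s, out var n)) return "number " + n;

        return "unknown";
    }

    public static void Main()
    {
        // C#3: object and collection initializers.
        var points = new List<Point> { new Point { X = 1, Y = 2 } };
        Console.WriteLine(Describe(points[0]));   // point with sum 3
        Console.WriteLine(Describe("42"));        // number 42
        Console.WriteLine(Require("ok"));         // ok
    }
}
```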
For future versions, there are many proposals in the same vein:
As part of the pattern matching proposal, there will be a match expression, which will be able to replace if and switch statements in many cases.
For easy cloning and modifying of existing data structures, there will be a with expression as part of the record proposal.
All in all, we're looking at a very "expressive" C# in the near future.
Powerful type system
In functional languages, there is a trio of features that blend together in harmony so that working with data becomes a really pleasant experience:
An algebraic type system, which allows users to succinctly compose a new type from existing ones, then create instances of that type just as easily.
Pattern matching, for testing whether a piece of data matches a certain shape and size of a type.
Destructuring, to make extracting pieces of data from a container just as natural as putting them in.
Looking at the activity on GitHub, we can see that the designers of C# are definitely working toward this goal.
Algebraic data type:
One criticism of classical OOP and its type system is that it is too rigid and inflexible, since the only natural way to re-use types is to extend them. Very often we see the textbook OOP hierarchy of cats, dogs, birds and humans inheriting from animals break spectacularly in real-world projects.
When faced with an existential crisis about types, some folks came up with questions similar to the ones that were asked about functions: just as we can add or multiply integers, can we do the same with types? It turns out that we can, and the results are product types and sum types.
A product type is the result of multiplying types together. Classes in traditional OOP languages are product types. However, they were originally designed to house many things aside from data, like methods, events and indexers, so they can be unnecessarily verbose when used as plain data types. The designers of C# are adding tuples and records to the language to help in this regard.
A tuple is basically an ordered sequence of values of different types. Tuples were introduced to C# back in version 4, but they were clunky and almost nobody used them. Tuples in C#7 get a lot of compiler love, making it really easy to put data in a tuple, pass it around and extract data from it. Expect lots of code where multiple return values or intermediary classes are eliminated and replaced with tuples in the future.
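A short sketch of C#7 tuples in use follows. The Range method is hypothetical; it returns two values without an out parameter or a helper class.

```csharp
using System;
using System.Linq;

public static class TupleDemo
{
    // C#7: a tuple return type with named elements.
    public static (int Min, int Max) Range(int[] values)
        => (values.Min(), values.Max());

    public static void Main()
    {
        var range = Range(new[] { 3, 9, 4 });
        Console.WriteLine(range.Min);   // 3

        // Extract both values in one step.
        var (min, max) = Range(new[] { 3, 9, 4 });
        Console.WriteLine(max - min);   // 6
    }
}
```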
A record is a simple container of named values. It's like a class, but really succinct, and it typically doesn't have any methods. There's a really promising proposal for records in future C#.
A sum type is the result of adding types together. It's also known as a tagged union, a discriminated union in F#, or a case class in Scala. It is useful for expressing a lot of tricky data situations, for example:
A function returning some data or none
An operation that either evaluates to a value successfully or encounters an error
An API that returns a 301 redirect or JSON content
The various actions that can be performed on an interface (a button click, a key down...)
Traditional OOP languages also have sum types in the form of enums, but they are often underused and underpowered. C#'s current enum implementation is severely limited, as its cases cannot carry data of different types. In contrast, newer languages like Swift and Rust got their enums right, and they can be used as tagged unions.
There's a proposal for proper sum types under the name of discriminated union in C# and there's clear intent from the team to start work on it some time in the future.
Another area where a functional type system really shines is the banishment of the billion-dollar mistake: the null reference. Null is dangerous in Java and C# because the type system allows it to be a member of every reference type. Null claims to support the contract of a type, but that claim is a lie. Null is like a time bomb that blows up when we try to use it, often at the most embarrassing moment. In contrast, functional languages use the sum type Maybe or Option to handle missing values. An Option represents a value that can be either something or none. Unlike null, none is not a member of any other type, it doesn't claim to support any contract, and it will never blow up, because the compiler forces you to check for it whenever you work with an Option.
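C# has no built-in Option, but the idea can be approximated by hand. The Option<T> struct below is a hypothetical, deliberately minimal sketch; library versions are richer, and unlike in F# the compiler does not enforce exhaustiveness here beyond the shape of the Match method.

```csharp
using System;

// Hypothetical minimal Option type, written for this illustration only.
public struct Option<T>
{
    private readonly bool hasValue;
    private readonly T value;

    private Option(T value) { this.value = value; hasValue = true; }

    public static Option<T> Some(T value) => new Option<T>(value);
    public static Option<T> None => default(Option<T>);

    // The only way to reach the value is to handle both cases.
    public R Match<R>(Func<T, R> some, Func<R> none)
        => hasValue ? some(value) : none();
}

public static class OptionDemo
{
    public static Option<int> ParseInt(string text)
        => int.TryParse(text, out var n) ? Option<int>.Some(n) : Option<int>.None;

    public static void Main()
    {
        Console.WriteLine(ParseInt("42").Match(n => "got " + n, () => "nothing"));    // got 42
        Console.WriteLine(ParseInt("abc").Match(n => "got " + n, () => "nothing"));   // nothing
    }
}
```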
C# can never go back on null without breaking backward compatibility. However, that doesn't mean the designers can't find ways to make handling nulls easier. C#6 added the null-conditional operator ?. for safely accessing members of a possibly null reference. We've seen developers using this feature to great success in the wild. Future versions of the C# compiler will likely be able to track the usage of nulls and warn users where appropriate.
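A quick sketch of the ?. operator, using made-up Customer and Address classes:

```csharp
using System;

public class Address { public string City { get; set; } }
public class Customer { public Address Address { get; set; } }

public static class NullDemo
{
    // C#6: ?. evaluates to null instead of throwing when the left side is null.
    public static string CityOf(Customer customer)
        => customer?.Address?.City ?? "unknown";

    public static void Main()
    {
        Console.WriteLine(CityOf(null));             // unknown
        Console.WriteLine(CityOf(new Customer()));   // unknown
        Console.WriteLine(CityOf(new Customer { Address = new Address { City = "Hanoi" } }));   // Hanoi
    }
}
```

Without the operator, the same method would need two nested null checks before reading City.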
Destructuring is only marginally useful on its own, but it becomes much more powerful when paired with pattern matching. After we have tested that the incoming data matches a certain shape and size, we can immediately break it apart and do useful things with its members. The deconstruction protocol landed in C#7, and any type can opt in to being deconstructed.
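As a sketch of the C#7 protocol, the hypothetical Person class below opts in by providing a Deconstruct method.

```csharp
using System;

public class Person
{
    public string Name { get; }
    public int Age { get; }

    public Person(string name, int age) { Name = name; Age = age; }

    // C#7: a type with a suitable Deconstruct method can be deconstructed.
    public void Deconstruct(out string name, out int age)
    {
        name = Name;
        age = Age;
    }
}

public static class DeconstructDemo
{
    public static void Main()
    {
        // Taking the data out mirrors putting it in.
        var (name, age) = new Person("Ada", 36);
        Console.WriteLine(name + " is " + age);   // Ada is 36
    }
}
```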
F# already has full pattern matching, and C# is going to get this nice thing soon. When the feature lands, we can expect if and switch statements to gradually fade away, as the coming match expression is a much more powerful and succinct construct.
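The syntax of the proposed match expression is not shown here, since it is still a proposal. As a rough indication of the direction, this sketch uses what C#7 already ships: type patterns and when clauses inside a switch statement. The Shape classes are made up for the example.

```csharp
using System;

public abstract class Shape { }
public class Circle : Shape { public int Radius { get; set; } }
public class Rectangle : Shape { public int Width { get; set; } public int Height { get; set; } }

public static class PatternDemo
{
    public static string Describe(Shape shape)
    {
        switch (shape)
        {
            // A type pattern plus a guard; cases are tried in order.
            case Rectangle r when r.Width == r.Height:
                return "square of side " + r.Width;
            case Rectangle r:
                return "rectangle of area " + r.Width * r.Height;
            case Circle c:
                return "circle of radius " + c.Radius;
            case null:
                return "nothing";
            default:
                return "unknown shape";
        }
    }

    public static void Main()
    {
        Console.WriteLine(Describe(new Rectangle { Width = 2, Height = 2 }));   // square of side 2
        Console.WriteLine(Describe(new Rectangle { Width = 2, Height = 3 }));   // rectangle of area 6
        Console.WriteLine(Describe(new Circle { Radius = 1 }));                 // circle of radius 1
        Console.WriteLine(Describe(null));                                      // nothing
    }
}
```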
For programmers looking to enter the field, C# is a good starting point in terms of its practicality and the paradigms it has adopted. C# has evolved a great deal since its inception, and we can learn a lot from its journey. Following that journey is beneficial for your long-term development as a programmer, whether you stay with C# or plan to make the leap to other "real" functional languages like F#, Scala or Elm.
Do you agree or disagree? Tell us what you think below.