.NET Mutterings: Const and Immutable types

Recently on the DevelopMentor CLR list there has been a discussion about the string type. Initially it was about why it is immutable and what immutable means; the discussion then spiraled into how to create your own immutable type.

Well, after someone kindly posted an example, someone else countered it, saying that with reflection you can get around this, and thus the type is not strictly immutable.
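To make the objection concrete, here is a rough sketch of the kind of thing being argued about. The Money type and its amount field are hypothetical names of my own, not the example posted on the list. The type offers no way to change its state through its public interface, yet FieldInfo.SetValue can still be pointed at the private readonly field. Whether that write is honoured may depend on the runtime version, so treat the second half as an illustration of the argument rather than guaranteed behaviour.

```csharp
using System;
using System.Reflection;

// Hypothetical immutable type: state is assigned once, in the constructor.
public sealed class Money
{
    private readonly decimal amount;

    public Money(decimal amount) { this.amount = amount; }

    public decimal Amount { get { return amount; } }

    // "Changing" the value hands back a new instance instead of mutating this one.
    public Money Add(decimal delta) { return new Money(amount + delta); }
}

public static class ReflectionDemo
{
    public static void Main()
    {
        Money m = new Money(10m);

        // Look up the private field by name and try to overwrite it.
        FieldInfo field = typeof(Money).GetField(
            "amount", BindingFlags.Instance | BindingFlags.NonPublic);
        field.SetValue(m, 99m);

        // If the runtime allowed the write, this no longer shows the original 10.
        Console.WriteLine(m.Amount);
    }
}
```

Nothing in the public interface of Money invites this; the caller has to go looking for a private field by name, which is rather the point.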

It was at this point that I started to scratch my head in amazement as to why this discussion was happening. Why? Well, sure, they are correct, but you have to ask yourself as a programmer how far you go to show intent. In my opinion the const keyword in C++ was great: I could return a const reference if I so wished, hinting to the consumer of that reference that they can only safely use the object in a read-only context, which allows my implementation to perhaps share that object with multiple clients.

Now if that C++ programmer wishes to cast it to a mutable, i.e. non-const, reference, let them do it. As the developer of that component I wash my hands of it: I've done everything reasonably possible to enforce correct usage of my type, and the user has decided they know best. In other words, the contract that I defined has been broken.

In my opinion, when defining a type that should be immutable, a far greater failing would be to say, "Well, I can't guarantee immutability via the interface, so I won't bother." Having the flexibility that I described above for C++ can make implementations far simpler. For example, a read-only type has no thread-safety issues to worry about, and it gives you the opportunity to share the object with multiple consumers.
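As a small illustration of the sharing point, the sketch below hands one instance of a read-only type to several threads with no locking at all. Temperature and SharingDemo are hypothetical names invented for this example; the only assumption is that every field is readonly and set in the constructor, so there is nothing for the threads to race on.

```csharp
using System;
using System.Threading;

// Hypothetical read-only type: nothing can change after construction.
public sealed class Temperature
{
    private readonly double celsius;

    public Temperature(double celsius) { this.celsius = celsius; }

    public double Celsius { get { return celsius; } }

    public double Fahrenheit { get { return celsius * 9.0 / 5.0 + 32.0; } }
}

public static class SharingDemo
{
    // Every thread reads the same shared instance; no lock is taken.
    public static double[] ReadFromManyThreads(Temperature shared, int threadCount)
    {
        double[] results = new double[threadCount];
        Thread[] threads = new Thread[threadCount];

        for (int i = 0; i < threadCount; i++)
        {
            int slot = i; // per-iteration copy so each thread writes its own slot
            threads[i] = new Thread(() => { results[slot] = shared.Fahrenheit; });
            threads[i].Start();
        }

        // Wait for every reader to finish before looking at the results.
        foreach (Thread t in threads)
        {
            t.Join();
        }
        return results;
    }

    public static void Main()
    {
        Temperature boiling = new Temperature(100.0);
        foreach (double f in ReadFromManyThreads(boiling, 4))
        {
            Console.WriteLine(f);
        }
    }
}
```

Had Temperature exposed a setter, each of those reads would need a lock or a defensive copy, which is exactly the complexity an immutable type lets you skip.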

As far as I know there are very few useful languages where the contract can't be broken in some way at runtime. In fact, Jon Skeet posted a very nice example of how the most immutable of types, the string, can be compromised in a very interesting fashion.

    using System;
    using System.Text;

    class Test
    {
        static void Main()
        {
            string x = "hello";

            // Needs the /unsafe compiler switch.
            unsafe
            {
                // Pin the string and overwrite its characters in place.
                fixed (char* c = x)
                {
                    for (int i = 0; i < 5; i++)
                    {
                        c[i] = '!';
                    }

                    // "hello" doesn't get printed...
                    Console.WriteLine("hello");
                }
            }
        }
    }

To understand why the example above produces such less-than-obvious results, the following code may throw some light on it.

It is by no means exhaustive, just two obvious tests:

    char[] chars = new char[] { 'a', 'b', 'c', 'd' };

    // Two strings built at run time from the same characters
    string lhs = new String(chars);
    string rhs = new String(chars);
    Console.WriteLine("{0} and {1} are shared : {2}", lhs, rhs, Object.ReferenceEquals(lhs, rhs));

    // Two identical literals
    lhs = "foo";
    rhs = "foo";
    Console.WriteLine("{0} and {1} are shared : {2}", lhs, rhs, Object.ReferenceEquals(lhs, rhs));

The above snippet gives false for the first result and true for the second. Static/literal strings are burned into the assembly and the same object is used each time (for the same assembly).

Even placing the same string in a different source file for the same assembly results in sharing the reference. The process of building an assembly is smart enough to recognise that the same literal string is being used, and ensures that only one entry is placed into the assembly, so only one String object is created.
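The same kind of sharing can be requested by hand for strings built at run time, using String.Intern. The sketch below is only an illustration; InternDemo and its method names are made up for this example. Two separately built copies of "foo" are distinct objects, but interning both yields a single reference. The last line of Main compares the interned copy with the literal; I would expect them to match when the code is JIT-compiled, but I have left the outcome unstated because literal interning can be relaxed for precompiled images.

```csharp
using System;

public static class InternDemo
{
    // Build a fresh "foo" at run time; this is never the literal's object.
    public static string BuildFoo()
    {
        return new string(new char[] { 'f', 'o', 'o' });
    }

    // Interning two equal strings always returns one shared reference.
    public static bool InternedCopiesShareReference()
    {
        string first = string.Intern(BuildFoo());
        string second = string.Intern(BuildFoo());
        return object.ReferenceEquals(first, second);
    }

    public static void Main()
    {
        string a = BuildFoo();
        string b = BuildFoo();

        // Two run-time copies: equal contents, different objects.
        Console.WriteLine(object.ReferenceEquals(a, b));

        // After interning, both names refer to the pooled instance.
        Console.WriteLine(InternedCopiesShareReference());

        // Compare the pooled instance with the literal from the assembly.
        Console.WriteLine(object.ReferenceEquals(string.Intern(a), "foo"));
    }
}
```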

You can see this via ILDASM: fire it up, hit CTRL-M, scroll to the bottom and you will find the User Strings table.

    User Strings
    -------------------------------------------------------
    70000001 : (29) L"{0} and {1} are shared : {2}"
    7000003d : ( 3) L"foo"

I wonder if Jon's trick could be used to hide the true values of some sensitive strings you are burning into your assembly, since the IL for the last print statement looks like:

    IL_0037: nop
    IL_0038: ldstr "hello"
    IL_003d: call void [mscorlib]System.Console::WriteLine(string)
    IL_0042: nop