Tags: dotnet, configuration, github

I hate configuration files! They contain highly sensitive information and there is always a risk of them leaking out. Especially when I read horror stories like this, I remind myself to be more careful about it. It is particularly important when starting with a private repository and later converting it to open source.

There are a few approaches, and I will try to make a pros and cons list for each of them, which will hopefully make the decision easier.

Method 1: Ignore config files completely

Simply add the config files to .gitignore before ever checking them in. This is not really a viable option, but to cover all bases I wanted to include it as well.

Pros

  • Very easy to implement
  • Easy to be consistent (use the same gitignore everywhere)

Cons

  • When someone checks out the project initially it wouldn’t compile because of the missing config file.
  • For a new user there is no way to figure out what the actual config should look like
  • For internal applications, you have to maintain local copies and handle distribution among developers manually.

Method 2: Check in configuration with placeholders

Check in the initial file with placeholder values, then run the following command:

git update-index --assume-unchanged <file>

After this you can put the actual values in and use the config as usual, but Git will ignore those changes. This way, when someone checks out the project, they can at least compile it.
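The flag can be toggled back off when you genuinely need to commit a change to the file:

git update-index --no-assume-unchanged <file>

Revert the sensitive values, check in the structural change, then re-apply the --assume-unchanged flag.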

Pros

  • Less disruptive as the project can be compiled

Cons

  • When you make a change to the config file (e.g. add or rename keys) you can’t check those changes in either, without first toggling the flag off as shown above

Method 3: Ignore config file, check in a sample file instead

This is essentially a merger of Methods 1 and 2. Maintain two files (e.g. App.config and App.config.sample). Ignore App.config from the get-go and only check in the .sample file. Structurally it will be exactly the same as App.config, just without the confidential values.
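A minimal sketch of the setup, assuming the file names above. The ignore rule goes into .gitignore:

App.config

and a new developer copies the sample right after cloning:

copy App.config.sample App.config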

Pros

  • The sample file shows a new user exactly what the config should look like

Cons

  • Won’t compile by default; checking out requires an extra step (a small one, but still)
  • Both files need to be kept in sync manually
  • Still no online copy of the actual config file available

Method 4: Reference to another configuration file

A slight variation of the previous method. .NET allows cascading and linking of configuration files. For example, say this is my App.config file:

<configuration>
	<appSettings file="Actual.config">
		<add key="Twilio.AccountSID" value="SAMPLE SID" />
		<add key="Twilio.AuthToken" value="SAMPLE TOKEN" />
		<add key="Non-sensitive-info" value="some value" />
	</appSettings>
</configuration>

I can link the appSettings section to another file named Actual.config, which would look like this:

<appSettings>
  <add key="Twilio.AccountSID" value="REAL 123456" />
  <add key="Twilio.AuthToken" value="REAL Token" />
  <add key="Non-sensitive-info" value="some value" />
</appSettings>

Actual.config never goes into source control. When the application cannot find Actual.config, it just uses the values in App.config. When Actual.config is present, its values override the sample values.
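The cascading is completely transparent to the application code: ConfigurationManager returns the merged view, so the code never needs to know whether Actual.config was present.

// requires a reference to System.Configuration
// returns "SAMPLE SID" without Actual.config, "REAL 123456" when it is present
var accountSid = ConfigurationManager.AppSettings["Twilio.AccountSID"];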

Pros

  • Only the keys with sensitive values need to be hidden; non-sensitive settings stay in source control

Cons

  • Doesn’t work with custom config sections (the optional file attribute is specific to appSettings)
  • Still no online copy of the actual config file available

Method 5: Keep the actual config in a private repository

The idea is to maintain a private repository that contains the config files and add it as a submodule:

git submodule add git@github.com:volkanpaksoy/private-config.git

And link to that configuration file in the app.config as in Method 4.
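Assuming the submodule is cloned into a private-config folder, the file attribute simply points into it (a sketch; the referenced file has to end up next to the compiled .exe.config, e.g. via a copy-to-output build step):

<configuration>
	<appSettings file="private-config\Actual.config">
		<add key="Twilio.AccountSID" value="SAMPLE SID" />
	</appSettings>
</configuration>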

Pros

  • Actual values are online and accessible

Cons

  • Requires a paid account for private repositories
  • More complicated

Verdict

As I already have a paid GitHub account, the first con of Method 5 doesn’t apply to me, so I think I will go with that one. Otherwise the actual values would be stored only on the local disk and would eventually get lost. Looks like there is no convenient and easy way of doing it after all. If you have any suggestions, I’m all ears.

As I said, I hate configuration files!

Tags: csharp, development

Generally we don’t need to understand how every single line of code is compiled, but sometimes it helps to have a look at what’s going on under the hood. The code we write doesn’t always translate directly to IL (Intermediate Language), especially once compiler optimizations come into play. It also helps to see what really happens when we use the language’s shorthand notations. So I decided to explore the IL and see how simple constructs are compiled. Some of them are very simple and commonly known, but some were new to me, so I compiled all of them in this post.

Toolkit

I used a disassembler and decompilers to analyze the output. When we compile a .NET program, it’s converted into IL code, and using a disassembler we can view that IL. Decompiling is the process of regenerating C# code from an assembly; since IL is hard to read, getting the C# code back makes the analysis easier. I used the following tools to view the IL and C# code:

  • ILDASM: The main tool I used (IL Disassembler). It’s installed with Visual Studio, so it doesn’t require anything special. It’s hard to make sense of IL, but it’s generally helpful to get an idea about what’s going on.
  • Telerik JustDecompile: A decompiler takes an assembly and re-generates .NET code from it. It may not fully return the original source code, and sometimes it optimizes things, so it may skew the results a little bit. But it’s generally helpful when you need to see a cleaner representation.
  • ILSpy: Same functionality as JustDecompile. I used it to compare the decompiled results.

So without further ado, here’s how some language basics we use are compiled:

01. Constructor

When you create a new class the template in Visual Studio creates a simple blank class like this:

namespace DecompileWorkout
{
    class TestClass
    {
    }
}

and we can create instances of this class as usual:

var newInstance = new TestClass();

It works even though we don’t have a constructor in our source code. The reason it works is that the compiler adds the default constructor for us. When we decompile the executable we can see it’s in there:
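Rendered back as C#, the compiler-generated constructor is nothing more than a public parameterless one, roughly:

class TestClass
{
    public TestClass()
    {
    }
}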

But when we add a constructor that accepts parameters the default constructor is removed:
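For example, with only a parameterized constructor in the source, the decompiled output contains just that one constructor, and new TestClass() no longer compiles:

class TestClass
{
    public TestClass(string name)
    {
    }
}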

It makes sense; otherwise all classes would have a parameterless constructor no matter what you do.

02. Properties

Properties have been supported since forever and they are extremely handy. In the olden days we used to create a field plus getter and setter methods that access it, such as:

private string _name;
public void SetName(string name)
{
    _name = name;
}
public string GetName()
{
    return _name;
}

Now we can get the same result with a single line:

public string Name { get; set; }

What happens under the hood is that the compiler generates the backing field and the getter/setter methods for us. For example, if we disassemble a class containing the property above, we’d get something like this:
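A rough C# rendering of that disassembly (the angle brackets in the generated names are not legal C#, so they can only show up in decompiled output):

private string <Name>k__BackingField;

public string get_Name()
{
    return this.<Name>k__BackingField;
}

public void set_Name(string value)
{
    this.<Name>k__BackingField = value;
}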

Apart from the weird naming (I guess it’s to avoid naming collisions) it’s exactly the same code.

The backing field is only generated when this shorthand form is used. If we implement the getter/setter like this

public string Name 
{
    get
    {
        throw new NotImplementedException();
    }
    set
    {
        throw new NotImplementedException();
    }
}

Then the generated IL contains only the methods and not the backing field.

One last thing to note is that we cannot have fields on interfaces but we are allowed to have properties. This works because a property on an interface compiles to just a pair of getter/setter method declarations; no backing field is involved.
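For example:

interface IPerson
{
    // compiles to get_Name() / set_Name() method declarations; no backing field
    string Name { get; set; }
}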

03. Using statement

A handy shortcut to use when dealing with IDisposable objects is the using statement. To see what happens when we use using I came up with this simple code:
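Something along these lines (a StreamWriter here, but any IDisposable would do):

using (var writer = new StreamWriter("test.txt"))
{
    writer.WriteLine("Hello");
}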

After decompiling:
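It comes back as roughly the following:

StreamWriter writer = new StreamWriter("test.txt");
try
{
    writer.WriteLine("Hello");
}
finally
{
    if (writer != null)
    {
        ((IDisposable)writer).Dispose();
    }
}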

So basically it just surrounds the code with a try-finally block and calls the object’s Dispose method if it’s not null. Writing the using statement is much handier than spelling this block out by hand, but in the end there is nothing magical about it; it’s just a shortcut for disposing of objects safely.

04. var keyword and anonymous types

The var keyword is used to declare an implicitly typed variable. The compiler infers the type from the context and generates the code accordingly, so variables defined with var are still strongly typed. Initially I resisted using it as a shortcut, because sparing us some typing is not its main purpose: it was introduced at the same time as anonymous types, which makes sense, because without such a keyword you couldn’t declare a variable to hold an anonymous object. But I find myself using it more and more just for the implicit typing. Looking around, I see that I’m not the only one, so if it’s just laziness at least I’m not the only one to blame!

So let’s check out how this simple code, with one var and one anonymous type, compiles:

public class UsingVarAndAnonymousTypes
{
    public UsingVarAndAnonymousTypes()
    {
        var someType = new SomeType();
        var someAnonymousType = new { id = 1, name = "whatevs" };
    }
}

Decompiling the code doesn’t show the anonymous type side of the story, but we can see that the var keyword is simply replaced with the actual class type.

To see what happens with anonymous types let’s have a look at the IL:
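A rough C# rendering of what shows up in the IL (simplified; the real class is generic over the property types and also overrides Equals, GetHashCode and ToString):

internal sealed class <>f__AnonymousType0<Tid, Tname>
{
    private readonly Tid <id>i__Field;
    private readonly Tname <name>i__Field;

    public Tid id { get { return this.<id>i__Field; } }
    public Tname name { get { return this.<name>i__Field; } }

    public <>f__AnonymousType0(Tid id, Tname name)
    {
        this.<id>i__Field = id;
        this.<name>i__Field = name;
    }
}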

For the anonymous type, the compiler created a class for us with a very user-friendly name: <>f__AnonymousType0`2. It also generated readonly private fields and getters only, which means the type is immutable: we cannot change the values once the object is initialized.

05. Async and await

Async/await makes asynchronous programming much easier for us. It hides a ton of complexity, and I think it makes sense to invest some time into investigating how it works in order to use it properly. First of all, marking a method async doesn’t make everything asynchronous automagically. If you don’t use await inside the method, it will run synchronously (the compiler will generate a warning about it). When you call an async method, execution continues without blocking the calling thread; if the call is not awaited, you get a Task instead of the actual expected result. The Task represents an ongoing operation running in the background while the main flow carries on. This is great, as you can fire up a bunch of tasks and then wait for the results at the point where you actually need them. For example:

public async Task DownloadImageAsync(string imageUrl, string localPath)
{
    using (var webClient = new WebClient())
    {
        Task task = webClient.DownloadFileTaskAsync(imageUrl, localPath);

        var hashCalculator = new SynchronousHashCalculator();

        // Do some more work while download is running

        await task;
        using (var fs = new FileStream(localPath, FileMode.Open, FileAccess.Read))
        {
            byte[] testData = new byte[new FileInfo(localPath).Length];
            int bytesRead = await fs.ReadAsync(testData, 0, testData.Length);
            
            // Do something with testData
        }
    }
}

In this example, a file is downloaded asynchronously. Execution continues after the call has been made, so we can do other, unrelated work while the download is in progress. Only when we need the actual result (opening the downloaded file, in this instance) do we await the task. If we didn’t await, we might try to access the file before it’s completely written and closed, resulting in an exception.

One confusing aspect may be calling await on the same line, as with fs.ReadAsync in the above example. It looks synchronous, as we are still waiting on that line, but the difference is that the calling thread is not blocked. In a GUI application, for example, the UI would still be responsive.

So let’s decompile the assembly to see how it works behind the scenes. This is one of the times redundancy paid off: JustDecompile failed to generate C# code for the class using async/await, so I had to switch to ILSpy.

It generates a struct implementing the IAsyncStateMachine interface. The “meaty” part is the MoveNext method:

It’s very hard to read compiler-generated code but I think this part is interesting:

switch (this.<>1__state)
{
case 0:
	taskAwaiter = this.<>u__$awaiter7;
	this.<>u__$awaiter7 = default(TaskAwaiter);
	this.<>1__state = -1;
	break;
case 1:
	goto IL_102;
default:
	this.<task>5__2 = this.<webClient>5__1.DownloadFileTaskAsync(this.imageUrl, this.localPath);
	this.<hashCalculator>5__3 = new SynchronousHashCalculator();
	taskAwaiter = this.<task>5__2.GetAwaiter();
	if (!taskAwaiter.IsCompleted)
	{
		this.<>1__state = 0;
		this.<>u__$awaiter7 = taskAwaiter;
		this.<>t__builder.AwaitUnsafeOnCompleted<TaskAwaiter, AsyncMethods.<DownloadImageAsync>d__0>(ref taskAwaiter, ref this);
		flag = false;
		return;
	}
	break;
}
taskAwaiter.GetResult();
taskAwaiter = default(TaskAwaiter);

If the task has not completed yet, the state is set to 0, the awaiter is stored, and AwaitUnsafeOnCompleted schedules MoveNext to run again once the task finishes; the method then returns without blocking. On that next call, the switch hits case 0, the state machine picks the awaiter back up and execution falls through to taskAwaiter.GetResult(), resuming right after the await.

06. New C# 6.0 Features

C# 6.0 was released just a few days ago. There aren’t many groundbreaking features, as the team at Microsoft stated; mostly the new features are shorthand notations that make code easier to read and write. I decided to have a look at the decompiled code of some of the new constructs.

Null-conditional operators

This is a great remedy for the never-ending null checks. For example, in the code below:

var people = new List<Person>();
var name = people.FirstOrDefault()?.FullName;

If the list is empty, name will be null. If we didn’t have this operator, accessing FullName on the null result would throw a NullReferenceException, so we would have to do the null checks on our own. The decompiled code for the above block looks like this:

private static void Main(string[] args)
{
    string fullName;
    Program.Person person = (new List<Program.Person>()).FirstOrDefault<Program.Person>();
    if (person != null)
    {
        fullName = person.FullName;
    }
    else
    {
        fullName = null;
    }
}

As we can see, the generated code is graciously carrying out the null check for us!

String interpolation

This is one of my favourites: now we can put the parameters directly inside a string instead of using placeholders. Expanding on the example above, imagine the Person class is like this:

class Person
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public string FullName => $"{FirstName} {LastName}";
}

The decompiled code looks familiar though:

public string FullName
{
    get
    {
        return string.Format("{0} {1}", this.FirstName, this.LastName);
    }
}

This is exactly how I would do it if we didn’t have this new feature.

Auto-property initializers

We can set the default value for a property on the declaration line now, like this:

public string FirstName { get; set; } = "Volkan";

The decompiled code varies by tool here: JustDecompile displays exactly the same statement as above, whereas ILSpy stays closer to the generated IL:

public Person()
{
	this.<FirstName>k__BackingField = "Volkan";
	this.<BirthDate>k__BackingField = new DateTime(1966, 6, 6);
	base..ctor();
}

So what it does is set the private backing field in the constructor to the default value we assigned.

Expression bodied members

We can now provide the body of a function as an expression following the declaration, like this:

public int GetAge() => (int)((DateTime.Now - BirthDate).TotalDays / 365);

The decompiled code is not very fancy:

public int GetAge()
{
    TimeSpan now = DateTime.Now - this.BirthDate;
    return (int)(now.TotalDays / 365);
}

It simply puts the expression back in the body.

Index initializers

It’s a shorthand for index-based initialization, such as:

Dictionary<int, string> dict = new Dictionary<int, string>
{
    [1] = "string1",
    [2] = "string2",
    [3] = "string3" 
};

And the decompiled code is no different from what we would write by hand today:

Dictionary<int, string> dictionary = new Dictionary<int, string>();
dictionary[1] = "string1";
dictionary[2] = "string2";
dictionary[3] = "string3";

As I mentioned, there is nothing fancy about the new features in terms of the generated IL, but some of them will surely save a lot of time.

07. Compiler code optimization

It is interesting to see that decompilers apply optimizations of their own. Before I went into this I thought they would just reverse engineer whatever is in the assembly, but it turns out that’s not the case.

First, let’s have a look at the C# compiler’s behaviour. In Visual Studio, the Optimize code flag can be found under Project Properties -> Build:

In order to test code optimization I created a dummy method like this:

public class UnusedVariables
{
    public void VeryImportantMethod()
    {
        var testClass = new TestClass("some value");
        TestClass testClass2 = null;
        var x = "xyz";
        var n = 123;
    }
}

First I compiled it with the default settings:

Then disassembled the output:

I’m no expert in reading IL code, but it’s obvious that the locals section contains two TestClass locals, a string and an integer.

When I compiled it with the optimization option (/o):
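From the command line that is roughly the following (the file name is just an assumption here; /o is the short form of /optimize):

csc /o+ UnusedVariables.cs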

I got the following when disassembled:

There’s no trace of the value types (the string and the int), and the TestClass local with the null value is gone too. But the first TestClass instance, which is allocated on the heap rather than the stack, is still created even though nothing references it. It’s interesting to see that difference.

When I decompiled both versions with ILSpy and JustDecompile, the unused local variables were removed every time. Apparently the decompilers do their own optimization as well.

Conclusion

It was a fun exercise to investigate and see the actual generated code. It helped me to better understand complex constructs like async/await, and it’s also good to see that there’s nothing to fear about the new features of C# 6.0!

Tags: csharp, development, aws, s3, rss

Recently I got burned by another Windows update and my favorite desktop podcatcher, Miro, stopped functioning. I was already planning to develop something on my own so that I wouldn’t have to back up OPMLs manually (I’m sure there are neat solutions out there already, but again I couldn’t resist the temptation of DIY!). So I started exploring RSS feeds and the ways to produce and consume them. My first toy project is a feed generator.

Implementation

I developed a console application in C# that can be scheduled to generate the feed and upload it to AWS S3. Source code is here.

Apparently the .NET Framework has had a System.ServiceModel.Syndication namespace since version 3.5, which contains all the tools needed to consume and create an RSS feed with a few lines of code. The core part of the application is the method that generates the actual feed:

public SyndicationFeed GetFeed(List<Article> articles)
{
    SyndicationFeed feed = new SyndicationFeed(_feedServiceSettings.Title, _feedServiceSettings.Description, new Uri(_feedServiceSettings.BaseUri));
    feed.Title = new TextSyndicationContent(_feedServiceSettings.Title);
    feed.Description = new TextSyndicationContent(_feedServiceSettings.Description);
    feed.BaseUri = new Uri(_feedServiceSettings.BaseUri);
    feed.Categories.Add(new SyndicationCategory(_feedServiceSettings.Category));

    var items = new List<SyndicationItem>();
    foreach (var article in articles
        .Where(a => a.ispublished)
        .Where(a => a.ispubliclyvisible)
        .OrderByDescending(a => a.publisheddate))
    {
        var item = new SyndicationItem(article.title, 
            article.bodysnippet,
            new Uri (string.Format(_feedServiceSettings.ArticleUrl, article.slug)),
            article.articleid.ToString(),
            article.publisheddate);
        
        item.Authors.Add(new SyndicationPerson("", article.authorname, string.Format(_feedServiceSettings.UserUrl, article.user.username)));
        items.Add(item);
    }
    
    feed.Items = items;
    return feed;
}

The feed itself is independent of the format (RSS or Atom). The classes that do the actual formatting derive from the abstract SyndicationFeedFormatter class: Rss20FeedFormatter and Atom10FeedFormatter. The format is read from the config file, so the application supports both formats.

public SyndicationFeedFormatter CreateFeedFormatter(SyndicationFeed feed)
{
    string feedFormat = _feedSettings.FeedFormat;
    switch (feedFormat.ToLower())
    {
        case "atom": return new Atom10FeedFormatter(feed);
        case "rss": return new Rss20FeedFormatter(feed);
        default: throw new ArgumentException("Unknown feed format");
    }
}

and the publisher service writes out the formatted feed:

var memStream = new MemoryStream();
var settings = new XmlWriterSettings(){ Encoding = Encoding.UTF8 };
using (var writer = XmlWriter.Create(memStream, settings))
{
    feedFormatter.WriteTo(writer);
}
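The upload side is not shown above, but with the AWS SDK for .NET it boils down to a single PutObject call. A minimal sketch, with a made-up bucket name and key (AmazonS3Client and PutObjectRequest live in the Amazon.S3 and Amazon.S3.Model namespaces):

memStream.Position = 0; // rewind; the XmlWriter left the stream at its end
using (var s3Client = new AmazonS3Client()) // credentials are resolved from config/profile
{
    var request = new PutObjectRequest
    {
        BucketName = "my-feed-bucket", // placeholder
        Key = "feed.xml",              // placeholder
        InputStream = memStream,
        CannedACL = S3CannedACL.PublicRead // readers need public access (see the IAM notes below)
    };
    s3Client.PutObject(request);
}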

I added the output of the API call as a JSON file under samples. I also implemented a fake API client called OfflineFeedClient, which reads the response from a file instead of actually making an API call. It comes in handy if you don’t have an Internet connection or a valid API key. To use it in offline mode you have to change the line that creates the client from this

var client = new FeedClient(configFactory.GetApiSettings());

to this

var client = new OfflineFeedClient(configFactory.GetOfflineClientSettings());

Lessons learned / Implementation Notes

Since the motivation behind the project was to learn more about handling RSS by myself, here are a few things that I learned and used:

  • I created an IAM account that has write-only access to the bucket the feed is stored in. That works with the default permissions, which make the uploaded object private. But since the RSS reader service needs access to the feed, I had to upload the file with public access. Apparently changing the ACL requires a separate permission, namely s3:PutObjectAcl. The weird thing is that just replacing s3:PutObject with s3:PutObjectAcl didn’t work either; they both had to be allowed. So after a few retries the final policy shaped up like this:
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "Stmt1418647210000",
            "Effect": "Allow",
            "Action": [
                "s3:PutObject",
                "s3:PutObjectAcl"
            ],
            "Resource": [
                "arn:aws:s3:::{BUCKET_NAME}/*"
            ]
        }
    ]
}
  • In this implementation I used Visual Studio’s neat feature Paste JSON As Classes.

Edit -> Paste Special -> Paste JSON As Classes

First I captured the API response with Fiddler, then created a blank .cs file and used this option to generate the classes to deserialize the response into. Using strongly typed objects can easily become a daunting task if you are wrapping a whole API, so I’d rather use dynamic objects like this:

var response = client.Execute<dynamic>(request);
var latestArticles = ((IEnumerable) response.Data.payload.articles).Cast<dynamic>()
                        .OrderByDescending(a => a.publisheddate)
                        .Select(a => new 
                        { 
                            slug = a.slug,
                            id = a.articleid,
                            title = a.title,
                            publishedDate = a.publisheddate,
                            ispublished = a.ispublished,
                            isvisible = a.ispubliclyvisible
                        });

This works fine, but the problem in this case was the hyphens in some of the JSON property names, which are not valid in C# identifiers. I can get around it with strongly typed objects by specifying the property name explicitly, such as:

[JsonProperty("is-published?")]
public bool ispublished { get; set; }

But I cannot do that in the dynamic version. I’ll put a pin in it and move on for now, but I have a feeling it will come back to haunt me in the future!

  • The default output of the RSS feed passes validation but gets 3 warnings. I’m sure they can be safely ignored, but just out of curiosity I researched a little bit to see if I could pass with flying colors. Two of the three warnings were:

    line 1, column 39: Avoid Namespace Prefix: a10 [help] <rss xmlns:a10="http://www.w3.org/200 …

    line 12, column 5302: Missing atom:link with rel="self" [help] … encompassing the Ch</description></item></channel></rss>

I found the solution on StackOverflow (not surprisingly!) and made a few changes in the formatter factory:

case "rss":
{
    var formatter = new Rss20FeedFormatter(feed);
    formatter.SerializeExtensionsAsAtom = false;
    XNamespace atom = "http://www.w3.org/2005/Atom";
    feed.AttributeExtensions.Add(new XmlQualifiedName("atom", XNamespace.Xmlns.NamespaceName), atom.NamespaceName);
    feed.ElementExtensions.Add(new XElement(atom + "link", new XAttribute("href", _feedSettings.FeedUrl), new XAttribute("rel", "self"), new XAttribute("type", "application/rss+xml")));
    return formatter;
}

I liked the fact that I only had to make changes in one place, so the factory could return a customized formatter instead of the default one and the rest of the application didn’t care at all. Unfortunately, though, the fix required the publish URL of the feed. I got around it by adding it to FeedSettings in the config, but now the S3 settings and the feed settings need to be changed in tandem.

My idea was to make it like a pipeline, so that the feed generator didn’t have to care how and where the feed is published, but this fix contradicts that approach a little bit. Unfortunately it doesn’t look possible to use variables in the config file, which would have let me generate Feed.Url from the other settings.

  • The 3rd warning was encoding-related. If I don’t explicitly specify a charset, the API uses ISO-8859-1. I tried playing around with a few headers to get the response in UTF-8, but the solution came from a friend: the Accept-Charset header. Adding the header fixed that issue as well:
request.AddHeader("Accept-Charset", "UTF-8");

Conclusion

The generated Atom feed doesn’t pass validation yet, but I will handle that later on. Since Atom is the newer format, I think I’ll go with it in the future. So far it’s good to know that it’s fairly easy to play with RSS/Atom feeds in C#, so it was a fun experiment after all…
