dev csharp, neo4j, cypher, json

I had a simple JSON file that contained a bunch of users with their followers, which looked like this:

  {
    "user-id": 2,
    "username": "user_2",
    "avatar": "URL",
    "following": [
      "user_10",
      "user_6",
      "user_1"
    ],
    "followers": [
      "user_10"
    ]
  }

It felt like a good exercise would be to import that data into a graph database as it wasn’t something I had done before.

As my go-to language is C# and I had some experience with Cypher before, my initial instinct was to develop a tool to generate Cypher statements from the JSON.
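For reference, the JSON gets loaded into a _userList collection before the generation step. Here is a minimal sketch of how that might look with Json.NET (the User class, its property names and the file path are my own; only the JSON field names come from the file above):

using System.Collections.Generic;
using System.IO;
using Newtonsoft.Json;

// Hypothetical model matching the JSON above; the "user-id" field is
// mapped to a C#-friendly property name via JsonProperty.
public class User
{
    [JsonProperty("user-id")]
    public int userid { get; set; }
    public string username { get; set; }
    public string avatar { get; set; }
    public List<string> following { get; set; }
    public List<string> followers { get; set; }
}

// Assuming users.json contains an array of these objects:
// _userList = JsonConvert.DeserializeObject<List<User>>(File.ReadAllText("users.json"));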

First I created the nodes:

private void GenerateNodesCypher()
{
    string filename = @"..\..\output\nodes.cql";
    var output = new StringBuilder();

    foreach (var user in _userList)
    {
        string s = $"CREATE ({user.username}:User {{ userid: {user.userid} , username: '{user.username}', avatar: '{user.avatar}'  }} ); ";
        output.AppendLine(s);
    }

    File.WriteAllText(filename, output.ToString());
}

and then the relationships:

private void GenerateRelationshipsCypher()
{
    string filename = @"..\..\output\relationships.cql";
    var output = new StringBuilder();
    int n = 0;
    foreach (var user in _userList)
    {
        foreach (var following in user.following)
        {
            string s = $"MATCH (a), (b) WHERE a.username = '{user.username}' AND b.username = '{following}' CREATE (a)-[:FOLLOWING]->(b); ";
            output.AppendLine(s);
            n++;
        }
    }

    File.WriteAllText(filename, output.ToString());
}
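One caveat with this string-interpolation approach (for both files): a single quote or backslash in a username or avatar URL would break the generated Cypher. A small escaping helper along these lines would make the generator more robust; this is just a sketch and not part of the original tool (Cypher string literals support backslash escapes):

// Escape backslashes and single quotes so interpolated values cannot
// break out of the single-quoted Cypher string literal.
private static string EscapeCypher(string value)
{
    if (value == null) return string.Empty;
    return value.Replace(@"\", @"\\").Replace("'", @"\'");
}

// e.g. ... username: '{EscapeCypher(user.username)}', avatar: '{EscapeCypher(user.avatar)}' ...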

So I ended up with two .cql files that looked like this:

CREATE (user_1:User { userid: 1 , username: 'user_1', avatar: 'URL' }); 
CREATE (user_2:User { userid: 2 , username: 'user_2', avatar: 'URL' }); 
CREATE (user_3:User { userid: 3 , username: 'user_3', avatar: 'URL' }); 

and

MATCH (a), (b) WHERE a.username = 'user_1' AND b.username = 'user_26' CREATE (a)-[:FOLLOWING]->(b); 
MATCH (a), (b) WHERE a.username = 'user_1' AND b.username = 'user_53' CREATE (a)-[:FOLLOWING]->(b); 
MATCH (a), (b) WHERE a.username = 'user_2' AND b.username = 'user_6' CREATE (a)-[:FOLLOWING]->(b); 

and I used neo4j-shell to execute the files and import the data. But it was no bed of roses. I’ll list the problems I faced along the way and how I got around them so that this experience might be helpful for other people as well:

Issues along the way and lessons learned

Running multiple statements on Neo4J browser

First I tried to run the CREATE statements using the Neo4j browser, which turned out to be problematic because it cannot run multiple statements terminated with semicolons. So I removed the semicolons, but then it started giving me this error:

WITH is required between CREATE and MATCH

I found a workaround for that on SO. So the following works:

MATCH (a), (b) WHERE a.username = 'user_1' AND b.username = 'user_14' CREATE (a)-[:FOLLOWING]->(b)
WITH 1 as dummy
MATCH (a), (b) WHERE a.username = 'user_1' AND b.username = 'user_22' CREATE (a)-[:FOLLOWING]->(b)

Now the problem was that if the data was a bit dirty (for instance, if user_14 didn't exist), it stopped executing the rest and no other relationships were created. I had a few nodes like that, so this method didn't work for me after all.

Starting the shell was not as easy as I’d imagined

I installed Neo4j using the default settings and, to start the shell, just navigated to the installation directory and ran the batch file. I got an error instead of my shell:

C:\Program Files\Neo4j Community\bin>Neo4jShell.bat

The system cannot find the path specified.

Error: Could not find or load main class org.neo4j.shell.StartClient

Apparently the way to run the shell is through the Neo4j server application, by clicking Options -> Command Prompt.

Neo4J command prompt

This launches the Neo4j Command Prompt, and then the rest is easy:

neo4jshell -file nodes.cql

neo4jshell -file relationships.cql

“Exiting with unterminated multi-line input”

The final issue was that the statements in a .cql file must be terminated with a semicolon in order for the shell to execute them. Otherwise it gives the above warning and quits.

So after all this hassle my data is in the Neo4j database, ready to be queried to death!

Next I'll investigate converting the data into CSV and achieving the same results with the LOAD CSV command, and look into visualizing this data.


dev csharp, asp_net, mvc

In ASP.NET, TempData is one of the mechanisms used to pass data from controller to the view. In this post I’ll dive into its source code to investigate its behaviour.

What is TempData

TempData is an instance of TempDataDictionary that is used to pass data from the controller to the view.

Lifespan

The lifespan of TempData is rather unusual: It lives for one request only. In order to achieve this it maintains two HashSets to manage keys as well as the data dictionary:

private Dictionary<string, object> _data;
private HashSet<string> _initialKeys = new HashSet<string>(StringComparer.OrdinalIgnoreCase);
private HashSet<string> _retainedKeys = new HashSet<string>(StringComparer.OrdinalIgnoreCase);

When we read some data using the indexer or the TryGetValue method, it removes that key from the _initialKeys collection.

public bool TryGetValue(string key, out object value)
{
    _initialKeys.Remove(key);
    return _data.TryGetValue(key, out value);
}

The actual dictionary that holds the data is intact at this point. That's why we can read the same data consecutively without any issues. It only removes the key from the _initialKeys collection, basically marking it to be deleted when the data is persisted.

Peek and keep

If we want the values in TempData to last longer we can use the Peek and Keep methods. What Peek does is return the value without removing the key from _initialKeys:

public object Peek(string key)
{
    object value;
    _data.TryGetValue(key, out value);
    return value;
}

Alternatively, we can call the Keep method. Similarly, it doesn't manipulate the data directly but just marks the key to be persisted by adding it to the _retainedKeys collection.

public void Keep(string key)
{
    _retainedKeys.Add(key);
}

The parameterless overload of the Keep method adds all keys in the _data dictionary to _retainedKeys.
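To see the difference in practice, here is a small hypothetical controller (the action and key names are made up for illustration):

public class DemoController : Controller
{
    public ActionResult First()
    {
        // Stored now, available to the next request
        TempData["message"] = "Hello from First";
        return RedirectToAction("Second");
    }

    public ActionResult Second()
    {
        object peeked = TempData.Peek("message"); // read without touching _initialKeys
        object read = TempData["message"];        // read and mark the key for removal
        TempData.Keep("message");                 // retain it for one more request
        return View();
    }
}

Without the Keep (or Peek) call, the key would be pruned when the dictionary is saved at the end of the request, which is what the next section looks at.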

Which keys to persist

So as seen above, when we get values and call the Peek/Keep methods, the operations are carried out on the _initialKeys and _retainedKeys collections, and nothing happens to the actual data. These operations take effect when _data is actually saved:

public void Save(ControllerContext controllerContext, ITempDataProvider tempDataProvider)
{
    _data.RemoveFromDictionary((KeyValuePair<string, object> entry, TempDataDictionary tempData) =>
        {
            string key = entry.Key;
            return !tempData._initialKeys.Contains(key) 
                && !tempData._retainedKeys.Contains(key);
        }, this);

    tempDataProvider.SaveTempData(controllerContext, _data);
}

Before the data is passed on to the provider it's pruned: the keys that exist in neither _retainedKeys (the keys we explicitly told it to keep) nor _initialKeys (the keys that have not been touched so far, or were only accessed through the Peek method) are removed.

Providers

By default, TempData uses session variables to persist data from one request to the next. Serializing and deserializing the data is carried out by an object implementing ITempDataProvider; by default the SessionStateTempDataProvider class provides this functionality. This happens in the CreateTempDataProvider method of the Controller class:

protected virtual ITempDataProvider CreateTempDataProvider()
{
    return Resolver.GetService<ITempDataProvider>() ?? new SessionStateTempDataProvider();
}

This also means we can replace the provider with our own custom class. For demonstration purposes I wrote my own provider which uses a simple text file to persist TempData:

public class TextFileTempDataProvider : ITempDataProvider
{
    internal readonly string FileName = Path.Combine(HttpContext.Current.Server.MapPath(@"~/App_Data"), @"TempData.txt");

    public IDictionary<string, object> LoadTempData(ControllerContext controllerContext)
    {
        if (File.Exists(FileName))
        {
            string json = File.ReadAllText(FileName);
            return Newtonsoft.Json.JsonConvert.DeserializeObject<Dictionary<string, object>>(json);
        }

        return new Dictionary<string, object>(StringComparer.OrdinalIgnoreCase);
    }
    
    public void SaveTempData(ControllerContext controllerContext, IDictionary<string, object> values)
    {
        string json = Newtonsoft.Json.JsonConvert.SerializeObject(values);
        File.WriteAllText(FileName, json);
    }
}

In order to use this class it needs to be assigned to TempDataProvider in the controller's constructor:

public FirstController()
{
    TempDataProvider = new TextFileTempDataProvider();
}
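Alternatively, since CreateTempDataProvider is virtual (as shown above), the provider can be supplied by overriding it instead of setting the property in the constructor. A sketch of the same idea:

public class FirstController : Controller
{
    protected override ITempDataProvider CreateTempDataProvider()
    {
        // Use the text file provider instead of the default SessionStateTempDataProvider
        return new TextFileTempDataProvider();
    }
}

It could also be registered with MVC's dependency resolver, since CreateTempDataProvider checks Resolver.GetService<ITempDataProvider>() before falling back to the default.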

Of course it's not a bright idea to use the disk for such operations; this is just for demonstration purposes and makes it easier to observe the behaviour.

Conclusion

Oftentimes I've found knowledge about the internals of a construct useful. Applications and frameworks are getting more complex each day, adding more layers and hiding the complexity from the consumers. That's great because we can focus on the actual business logic and the application we are building, but when we get stuck it can take quite a while to figure out what's going on. Having in-depth knowledge of the internals can save a lot of time.

dev tfs, team_foundation_server, ci, continuous_integration, alm, application_lifecycle_management

Setting up continuous integration environment with TFS is quite easy and free. In this post I’ll go over the details of setting up a CI environment. The tools I will use are:

  • Visual Studio 2015 Community Edition
  • TFS 2015 Express
  • AWS EC2 instance (To install TFS)
  • AWS SES (To send notification mails)

Step 01: Download and install TFS Express 2015

I always prefer the offline installer, so I downloaded the whole ISO just in case (it's 891MB, so it might be a good time for a coffee break!).

Installation is standard next-next-finish so nothing particular about it. Accepting all defaults yields a working TFS instance.

Step 02: Connect to TFS

From the Team menu select Manage Connections. This will open Team Explorer, where we can choose the TFS instance.

It can connect to TFS Online or GitHub. For this example I will use the TFS instance we have just installed. We first need to add it to the TFS server list.

If you are using EC2 like I did don’t forget to allow inbound traffic on port 8080 beforehand.

Note that we don’t have a Team Project yet, what we have connected is the Project Collection. A collection is an additional abstraction layer used to group related projects. Using Default Collection generally works out just fine for me.

Step 03: Configure workspace

Since a local copy of the source needs to be stored on the development machine, we have to map a local folder to the project.

Just pick a local empty folder. Leaving “$/” means the entire collection will be mapped to this folder.

Click Map & Get and you will have a blank workspace.

Step 04: Create team project

Now we can create a team project on TFS. Click Home -> Projects and My Teams, then New Team Project.

Just following the wizard's defaults selects the Scrum methodology and Team Foundation Version Control as the source control system. Starting with TFS 2013, Git is also supported.

Several minutes later, our team project is ready.

Step 05: Clone and start developing

I chose Git as the version control system. If you use TFVC the terminology you'll see is a bit different, but since the main focus of this post is establishing continuous integration it doesn't matter much as long as you can check in the source code.

So now that we have a blank canvas let’s start painting! I added a class library with a single class like the one below:

public class ComplexMathClass
{
    public int Add(int x, int y)
    {
        return x + y;
    }

    public double Divide(int x, int y)
    {
        // Integer division: dividing by zero here throws DivideByZeroException,
        // which the tests below rely on.
        return x / y;
    }
}

and 3 unit test projects (NUnit, XUnit and MSTest).

NUnit tests:

public class ComplexMathClassTests
{
    [TestCase(1, 2, ExpectedResult = 3)]
    [TestCase(0, 5, ExpectedResult = 5)]
    [TestCase(-1, 1, ExpectedResult = 0)]
    public int Add_TwoIntegers_ShouldCalculateCorrectly(int x, int y)
    {
        var cmc = new ComplexMathClass();
        return cmc.Add(x, y);
    }

    [Test]
    // [ExpectedException(typeof(DivideByZeroException))]
    public void Divide_DenominatorZero_ShouldThrowDivideByZeroException()
    {
        var cmc = new ComplexMathClass();
        double result = cmc.Divide(5, 0);
    }
}

XUnit tests:

public class ComplexMathClassTests
{
    [Theory]
    [InlineData(1, 2, 3)]
    [InlineData(0, 5, 5)]
    [InlineData(-1, 1, 0)]
    public void Add_TwoIntegers_ShouldCalculateCorrectly(int x, int y, int expectedResult)
    {
        var cmc = new ComplexMathClass();
        int actualResult = cmc.Add(x, y);
        Assert.Equal<int>(expectedResult, actualResult);
    }

    [Fact]
    public void Divide_DenominatorZero_ShouldThrowDivideByZeroException()
    {
        var cmc = new ComplexMathClass();
        cmc.Divide(5, 0);
    //  Assert.Throws<DivideByZeroException>(() => cmc.Divide(5, 0));
    }
}

MSTest tests:

[TestClass]
public class ComplexMathClassTests
{
    [TestMethod]
    // [ExpectedException(typeof(DivideByZeroException))]
    public void Divide_DenominatorZero_ShouldThrowDivideByZeroException()
    {
        var cmc = new ComplexMathClass();
        cmc.Divide(5, 0);
    }
}

Unfortunately MSTest still doesn't support parameterized tests, which is a shame IMHO. That's why I was never a big fan of it, but I added it to this project for the sake of completeness.

I commented out the lines that expect the exception in all the tests to make them fail. So now the setup is complete: we have a working project with failing tests. Our goal is to get alert notifications about the failing tests whenever we check in our code. Let's proceed to the next steps to accomplish this.

Step 06: Building on the server

As of TFS 2015, Build Configuration has been renamed to XAML Build Configuration for some reason and moved under the Additional Tools and Components node, but everything else seems to be the same.

The default configuration installs one build controller and one agent for the Default Collection, so for this project we don't have to add or change anything.

Each build controller is dedicated to a team project collection. The controller performs lightweight tasks such as determining the name of the build and reporting its status. The agents are the heavy lifters and carry out the processor-intensive work of the build process. Each agent is controlled by a single controller.

In order to build our project on the build server we first need a build definition. We can create one by selecting Build -> New Build Definition.

  • One important setting is the trigger: by default the build is not triggered automatically, which means soon enough it will wither and die, all forgotten! To automate the process we have to select the Continuous Integration option.

  • In order to automatically trigger the build, the branch that is to be built must be added to the monitored branches list in the Source Settings tab.

  • The last required setting is failing the build when a test fails. It's a bit buried, so you have to go to 3.1.1 Test Source and set "Fail build on test failure" to true.

Step 07: Notifications

At this point our build definition is triggered automatically, but we don't get any notifications if the build fails (due to a compilation error, for example). TFS comes with an application called Build Notifications. It pops up an alert, but it needs to be installed on every developer's machine, so I don't like this solution.

A better approach is enabling e-mail notifications. In order to do that, first we need to set up Email Alert Settings for TFS. In the Team Foundation Server Express Administration Console select Application Tier and scroll down to the "Email Alert Settings" section. Enter the SMTP credentials and server info here.

You can also send a test email to verify your settings.

The second and final stage is to enable project-based alerts. In the Visual Studio Team Explorer window select Home -> Settings -> Project Alerts. This will pop up a browser and redirect to the alert management page. There, select "Advanced Alert Management Page". On the advanced settings page there are a few pre-set notification conditions, and build failure is at the top of the list!

I intentionally broke the build and checked in my code, and within a minute, as expected, I received the build failure notification email.

With that we have automated the notifications. We already set the build to fail upon test failure in the previous step. The final step is to run the tests on the build server to complete the circuit.

Step 08: Run tests on build server

Running tests on the server is very simple. For NUnit all we have to do is install the NUnitTestAdapter package:

Install-Package NUnitTestAdapter

After I committed and pushed my code with the failing test, I got the notification within a minute.

I uncommented the ExpectedException line, and the build succeeded.

For xUnit the solution is similar: just install the xUnit runner NuGet package and check in the code.

Install-Package xunit.runner.visualstudio

MSTest works out of the box, so you don't have to install anything.

In the previous version of TFS I had to install Visual Studio on the build server, as advised in this MSDN article. It seems that in TFS 2015 you don't have to do that. The only benefit of using MSTest (as far as I can see, at least) is that it's baked in, so you don't have to install extra packages: you create a Unit Test project and all your tests are immediately visible in the Test Explorer and automatically run on the server.

Wrapping up

Continuous integration is a crucial part of the development process. Team Foundation Server is a powerful tool with lots of features. In this post I tried to cover the basics, from installation to setting up a simple CI environment. I'll try to cover more features in the future, but my first priority will be the new AWS services CodeDeploy, CodeCommit and CodePipeline. As I'm trying to migrate everything to AWS, having all projects hosted and managed there would make more sense in my case.
