dev csharp, fake, test, data

Generating high-quality test data can have an impact on the accuracy of the tests overall. In this post I’ll show using a helpful C# library called Bogus

Showcase project: Bank Statement Generator

In this example I’ll generate fake bank statements. Normally they come in CSV files and have the following model:

public class BankStatementLine
{
    public DateTime TransactionDate { get; set; }
    public string TransactionType { get; set; }
    public string SortCode { get; set; }
    public string AccountNumber { get; set; }
    public string TransactionDescription { get; set; }
    public decimal? DebitAmount { get; set; }
    public decimal? CreditAmount { get; set; }
    public decimal Balance { get; set; }
}

I’ll use Bogus to generate realistic fake statement lines and finally save it as a CSV and see if it looks real.

Rules and restrictions

I want the fields in the model above conform to certain set of rules to be realistic:

  • Transaction Date must be within a certain range I provide as bank statements are generated for a date range.
  • Dates should be incremental and not random
  • Sort Code must be in the following format: NN-NN-NN and must be the same for the entire statement.
  • Account number must be an 8-digit number and same for the entire statement.
  • Transaction Description must be free text
  • Debit Amount and Credit Amount must be decimal numbers but only one of them can be present at any given line
  • Transaction Type must be one of the pre-defined values and also some types can be for credit and some for debit only.
  • Balance should be sum of all debit and credit amounts plus the first balance in the statement. So this value is dependent on the values that come before it.
  • The number of lines in a statement should be random.

Rule implementations

Some rules stated above are very straightforward and easy to implement. These are some samples of what Bogus is capable of. For the full documentation check out the GitHub repository.

Date range support

Generating a date between a range is simple:

.RuleFor(x => x.TransactionDate, f => f.Date.Between(startDate, endDate))

Enum and array support

For Transaction Type I want to select a random value from a list of set values. This can be done in 2 ways: By using an enum or an IEnumerable.

var transactionTypes = new[] { "FPO", "DEB", "DB", "FPI" };

and in the rule description it can be used as

.RuleFor(x => x.TransactionType, f => f.PickRandom(transactionTypes) )

Another way is using enums such as:

public enum TransactionType
{
    FPO,
    DEB,
    DB,
    FPI
}

and the rule becomes:

 .RuleFor(x => x.TransactionType, f => f.PickRandom<TransactionType>().ToString())

In my final implementation I used selecting from a list of objects. You can check out the sample code to see that version.

Number range

For the account number I need an 8-digit number which can be achieved with something like this rule:

.RuleFor(x => x.AccountNumber, f => f.Random.Long(100000000, 99999999).ToString())

Bogus API also has builtin support for account number so the following is a more elegant and expressive way of achieving the same:

.RuleFor(x => x.AccountNumber, f => f.Finance.Account())

Formatting string

Formatting Sort Code can be achieved by Field.Random.Replace method

.RuleFor(x => x.SortCode, f => f.Random.Replace("##-##-##"))

Similar to account number, it also has built-in support for sort code:

.RuleFor(x => x.SortCode, f => f.Finance.SortCode())

Null values

In my case in some fields I’d like to have null values too. This can be achieved by OrNull extension method. For example, in the code below it generates %20 of DebitAmount values null.

.RuleFor(x => x.DebitAmount, f => f.Random.Decimal(0.00m, 9999.00m).OrNull(f, 0.2f))

Common fields

In my case some values in each statement line repeat throughout the entire statement such as account number and sort code. To achieve that I created a “base” statement line and every fake statement line used these shared fields instead of generating new ones.

var commonFields = new Faker<BankStatementLine>()
    .RuleFor(x => x.AccountNumber, f => f.Finance.Account())
    .RuleFor(x => x.SortCode, f => f.Finance.SortCode())
    .Generate();


var fakeTransactions = new Faker<BankStatementLine>()
    .StrictMode(true)
    .RuleFor(x => x.AccountNumber, commonFields.AccountNumber)
    .RuleFor(x => x.SortCode, f => commonFields.SortCode)
	...
	...

Random number of objects

It’s more realistic to have varying number of lines in statements. With Generate method you can specify the exact number of items you want to generate which is good for unit tests. For my purposes I just wanted to create random of rows in each statement as I only needed the data to be imported. This can be achieved by GenerateBetween:

var statementLines = fakeTransactions.GenerateBetween(10, 20);

Dependent values

The tricky part in this scenario was the dependent values. Normally when you use RuleFor extension method it generates the value for that field alone in isolation. In my case, one restriction was Debit Amount and Credit Amount could not both have values in the same line. Also Balance depends on these values and needs to be calculated in each line.

As far as I can tell there’s no built-in support to define these dependencies. Based on my tests I was able to achieve this in 2 ways

  1. Update the values accordingly in FinishWith extension method
  2. Use Rules extension method to define multiple rules at once and implement the restrictions inside it.

I think the latter is a better solution as FinishWith sounds more like clean up, logging or similar extra activity where Rules sound more like actual business logic implementation.

So with that in mind my rules for Debit Amount, Credit Amount and Balance fields looked like this:

.Rules((f, x) =>
{
    var debitAmount = (decimal?)f.Random.Decimal(1, 100).OrNull(f, 1.0f - statementconfig.DebitTransactionRatio);
    if (debitAmount.HasValue) // Is it a debit transaction?
    {
        x.CreditAmount = null;
        x.DebitAmount = debitAmount.Value;
        balance -= x.DebitAmount.Value;

        x.TransactionType = f.PickRandom(TransactionType.AllTransactionTypes
            .Where(tt => tt.Direction == TransactionDirection.Debit || tt.Direction == TransactionDirection.DebitOrCredit)
            .Select(tt => tt.Code));
    }
    else
    {
        var creditAmount = f.Random.Decimal(1, 100);
        x.DebitAmount = null;
        x.CreditAmount = creditAmount;

        balance += x.CreditAmount.Value;

        x.TransactionType = f.PickRandom(TransactionType.AllTransactionTypes
            .Where(tt => tt.Direction == TransactionDirection.Credit || tt.Direction == TransactionDirection.DebitOrCredit)
            .Select(tt => tt.Code));
    }

    x.Balance = balance;
});

A caveat with this approach is that I cannot use StrictMode anymore as it complains about those 3 fields having null values. It specifically mentions that in the exception. If you use Rules you’re on your own to ensure that all fields are populated properly.

Another drawback of setting multiple rules at once is that it can easily make the code harder to read. Fortunately for me, the author of the library Brian Chavez kindly reviewed the code and suggested some refactorings one of which proved it was still possible to use RuleFor method and strict mode. I’ve updated the final source code with these refactorings. So with individual rules the implementation looks like this:

.RuleFor(x => x.DebitAmount, f =>
{
    return (decimal?)f.Random.Decimal(1, 100).OrNull(f, 1.0f - statementconfig.DebitTransactionRatio);
})
.RuleFor(x => x.CreditAmount, (f, x) =>
{
    return x.IsCredit() ? (decimal?)f.Random.Decimal(1, 100) : null;
})
.RuleFor(x => x.TransactionType, (f, x) =>
{
    if (x.IsCredit())
    {
        return RandomTxCode(TransactionDirection.Credit); ;
    }
    else
    {
        return RandomTxCode(TransactionDirection.Debit);
    }

    string RandomTxCode(TransactionDirection direction)
    {
        return f.PickRandom(TransactionType.AllTransactionTypes
            .Where(tt => tt.Direction == direction || tt.Direction == TransactionDirection.DebitOrCredit)
            .Select(tt => tt.Code));
    }
})
.RuleFor(x => x.Balance, (f, x) =>
{
    if (x.IsCredit())
        balance += x.CreditAmount.Value;
    else
        balance -= x.DebitAmount.Value;

    return balance;
});

IsDebit and IsCredit methods referred to above are extension methods defined like this:

public static class Extensions
{
   public static bool IsCredit(this BankStatementLine bsl)
   {
      return bsl.DebitAmount is null;
   }
   public static bool IsDebit(this BankStatementLine bsl)
   {
      return !IsCredit(bsl);
   }
}

Random text

For the transaction description for now I’ll go with random Lorem Ipsum texts. Bogus has support for this too

.RuleFor(x => x.TransactionDescription, f => f.Lorem.Sentence(3))

I probably will need to use a fixed list of descriptions soon but for the time being it’s fine. Also as shown below it’s very easy to switch to that too.

Incremental values

Similar to balance being dependent on the previous values, transaction date is also dependent as it needs to go in an incremental fashion. I couldn’t find built-in support for this so implemented it using my own shared variable like this:

.RuleFor(x => x.TransactionDate, f =>
{
    lastDate = lastDate.AddDays(f.Random.Double(0, statementconfig.TransactionDateInterval));
    if (lastDate.Date > statementconfig.EndDate)
    {
        lastDate = statementconfig.EndDate;
    }
    return lastDate;
})

Putting It All Together

So let’s see the output with the help of another nice library called Console Tables

Source Code

Sample application can be found under blog/GeneratingTestDataWithBogus folder in the repository.

Resources

dockerdevops

In this post I’d like to cover a wider variery of Docker commands and features. My goal is to be able to use it as a quick reference while working with Docker without duplicating the official documentation. It consists of the notes I took at various times.

Docker commands

  • docker ps: Shows the list of running containers
  • docker ps -a: Shows all containers including the stopped ones
  • docker images: Shows the list of local images
  • docker inspect {container id}: Gets information about the container. It can be filtered by providing –format flag.

    Example:

        Command: docker inspect --format="{{ .Image }}" abcdabcdabcd
    	
      Output: sha256:abcdabcdabcdabcdabcdabcd.....
    

    The data is in JSON format so it can be filtered by piping to jq.

    Example:

      docker inspect 7c1297c30159 | jq -r '.[0].Path'
    
  • docker rm {container id or name}: Deletes the container

    run command accepts –rm parameter which deletes the container right after it finishes.

    To delete all the containers the following command can be used:

      docker rm $(docker ps -aq) 
    

    where docker ps -aq only returns the container ids.

  • docker rmi {image id or name}: Seletes the image (provided that there are no containers created off that image)

    To delete all images the following command can be used:

      docker rmi $(docker images -q)
    

    This command is equivalent of

      docker image rm {image name or id}
    
  • docker build: Creates a new image from Dockerfile

    It is equivalent of

      docker image build ...
    

    It can be used to build images straight from GitHub:

    Example:

      docker build https://github.com/docker/rootfs.git#container:docker
    
  • docker history {image name}: Shows the history of an image

  • docker run -it {image name} bash: Runs a container in interactive mode with bash shell

    Same as:

      docker container run -it ...
    
  • docker start/stop: Starts/stops an existing container

    If a command is specified the first time the container was run, then it uses the same command when it’s started.

    Example:

      docker run -it {image name} bash
    

    This creates a container in interactive mode

    We can exit the container by running exit in the terminal (leaving the container running in the background) ???? (or Ctrl P+Q??). We can then stop the container:

      docker stop {container id}
    

    Stop command sends a signal to the process with pid = 1 in the container to stop gracefully. Docker gives the container 10 seconds to stop before it forcefully stops it itself.

    We can start it again with interactive flag to run bash again docker run -i {container id}

    This runs bash automatically because that was the command we used while creating the container.

  • docker run -p {host port}:{container port}: Forwards a host port to container

  • docker run -v {host path}:{container path}: Maps a path on host to a path on container.

    If the container path exists it will be overwritten by the contents of host path. In other words, the container will see the host files while accessing that path incestead of its local files.

  • docker diff: Lists the differences between the container and the base image. Shows the added, deleted and changed files.

  • docker commit: Creates a new image based on the container

  • docker logs {container id}: Shows the logs of the container.

    Docker logging mechanism listens to stdout and stderr. We can link other logs to stdout and stderr so that we can view them using docker logs {container id}

    Example:

      RUN ln -sf /dev/stdout /var/log/nginx/access.log && ln -sf /dev/stderr /var/log/nginx/error.log
    
  • docker network:

    docker network create {network name}: Creates a new network

    docker network ls: Lists all networks

    docker network inspect {network name}: Gets information of a specific network

    docker run –network {network name}: Creates a container inside the specified network

    Containers can address each other by name within the network.

  • docker container exec: Can be used to run commands in a running container. Also can be used to connect to the container by running a shell

    Example:

    docker container exec {container id} sh

It starts a new process inside the container

Docker Compose

A tool for defining and running multi-container Docker applications.

  • Instead of stand-alone containers, “services” are defined in “docker-compose.yml” file
  • docker-compose up command will create the entire application with all the components (containers, networks, volumes etc)
  • docker-compose config: It is used to verify a docker-compose yml file.
  • docker-compose up -d: Starts the services in detached mode. Uses the current directory name to prepend the objects (networks, volumes, containers etc) created.
  • docker-compose ps: Shows the full stack with names, commands, states and ports
  • docker-compose logs -f: Shows logs of the whole stack. tails and shows the updates.
  • docker-compose down: Stops and removes the containers and networks. Volumes are NOT deleted.

Environment variables can be used in docker-compose.yml files in ${VAR_NAME} format.

  • docker-compose -f {filename}: Used to specify non-default config files.
  • Multiple YML files can be used and can be extended using “extends:” key in the following format:

      extends: 
          file: original.yml
          service: service name
    

General Concepts

Container: Isolated area of an OS with resource usage limits applied. VMs virtualize hardware resources, containers virtualize OS resources (file system, network stack etc)

Namespaces: Isolation (like hypervisors)

Control groups: Grouping objects and setting limits. In Windows they are also called Job Objects.

Daemon: Includes REST API.

containerd: Handles container operations like start, stop etc.. on Windows, Compute Services, includes runc

runtime: OCI (Linux -> runc) runc creates the container than exits

containerd cannot create containers by itself. That capability is with the OCI layer.

Windows containers are 2 kinds:

  • Native containers
  • Hyper-V

  • Layering
    • Base Layer: At the bottom (OS Files & objects)
    • App Code: Built on top of base layer

    Each layer is located under /var/lib/docker/aufs/diff/{layer id}

    Layer id is different from the hash value of the layer

  • Images live in registries. Docker defaults to Docker Hub

  • Docker Trusted Registry comes with Docker Enterprise Edition

  • Official images can be addressed as docker.io/image name or just the image name

  • Naming convention: Registry / Repo / Image (Tag)

  • Registry: docker.io is default
  • Image: latest is default

  • Image layers are read-only. Containers have their read-write layers. Container adds a writable layer on top of the read-only layers from the image. If they need to change something on a read-only layer, they use copy-on-write operation and create a copy of the existing layer and apply the change on that new layer. Anytime container needs to change a file, creates a copy of the file in its own writable layer and changes it there.

  • Cmd vs entrypoint: runtime arguments overrides default arguments. Entrypoint runtime arguments are appended

  • Dockerfile commands
    • FROM is always the first instruction
    • Good practice to label with maintainer email address
    • RUN: Executes command and creates a layer
    • COPY: Copies files into image in a new layer
    • EXPOSE: Specifies which port to liste to. Doesn’t create a new layer. Adds metadata.
    • ENTRYPOINT: Specifies what to run when a container starts.
    • WORKDIR: Specifies the default directory. Doesn’t create layer.
  • Multi-stage builds
    • Helps to remove unnecessary packages / files from the final production image.
    • It can get specific files from intermediary builds

Notes

  • A container will run one process only
  • If the command is a background process, the container will stop right after it ran. It needs to be something like a shell and not like a web server.
  • As of v1.12 Docker runs in either single-image mode or swarm mode.
  • Recent Docker versions support logging drivers. They forward the logs to external systems such as dusk of, splunk etc (–log-driver and –log-opts)

Resources

dev asp.net, dotnet_core

Developing projects is fun and a great way to learn new stuff. The things is everything moves so fast, you constantly need to maintain the projects as well. In that spirit, I decided to re-write my old AngularJS DomChk53.

First, I developed an API. This way I won’t have to maintain my AWS Lambda functions. Even though nobody uses them, still opening a public API poses risks and maintenance hassle. Now everybody can run their own system locally using their own AWS accounts.

API Implementation

Full source code of the API can be found in DomChk53 repo under aspnet-api folder. There’s not much benefit in covering the code line by line but I want to talk about some of the stuff that stood out in one way or the other:

CS5001: Program does not contain a static ‘Main’ method” error

I started this API with the intention of deploying it to Docker so right after I created a new project I tested Docker deployment but it kept giving the error above. After some searching I found that the Dockerfile was in the wrong folder! I moved it one level up as suggested in this answer and it worked like a charm!

Reading config values

I appreciated the ease of using configuration files. IConfiguration provider is provided out-of-the-box which handles appsettings.json config file.

So for example I was able to inject it easily:

public TldService(IConfiguration configuration, IFetchHtmlService fetchHtmlService)
{
    this.configuration = configuration;
    this.fetchHtmlService = fetchHtmlService;
}

and use it like this:

var awsDocumentationUrl = configuration["AWS.DocumentationURL"];

Using dependency injection

ASP.NET Core Web API has built-in dependency injection so when I needed to use it all I had to do was register my classes in ConfigureServices method:

// This method gets called by the runtime. Use this method to add services to the container.
public void ConfigureServices(IServiceCollection services)
{
    services.AddMvc().SetCompatibilityVersion(CompatibilityVersion.Version_2_1);

    services.AddTransient<ITldService, TldService>();
    services.AddTransient<IFetchHtmlService, FetchHtmlService>();
}

Documentation

Adding Swagger documentation was a breeze. I just followed the steps in the documentation and now I have a pretty documentation. More on that in the usage section below.

AWS Implementation

Even though the code will run locally in a Docker container, you still need to setup an IAM user with the appropriate permissions. So make sure the policy below is attached to the user:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": "route53domains:CheckDomainAvailability",
            "Resource": "*"
        }
    ]
}

Usage

Nice thing about Swagger is it can used as a full-blown API client as well. So instead of developing a dummy client I just used Swagger UI for simple domain availability enquiries. This is until I develop a better frontend on my own. For now at least it works.

To use Swagger, simply run the Domchk53.API application and change the URL to https://localhost:{port number}/docs, port number being whatever your local web server is running on.

curl usage:

Windows:

curl -i -X POST -H "Content-Type: application/json" -d "{\"domainList\": [\"volkan\"], \"tldList\": [\"com\", \"net\"]}" https://localhost:5001/api/domain

Mac:

curl --insecure -X POST -H "Content-Type: application/json" -d '{"domainList": ["volkan"], "tldList": ["com", "net"]}' https://localhost:5001/api/domain

Example Test data:

This is just a quick reference to be used as template:

{
  "domainList": ["domain1", "domain2"],
  "tldList": ["com", "net"]
}
{ "domainList": ["volkan"], "tldList": ["com", "net"] }

Resources