...

Creating a simple Blog Engine

We had a blog on WordPress for some time, and WordPress is OK. My only problem was that every time I wanted to write about something, I wouldn't, because of WordPress and its special behavior. Inserting code snippets is extremely hard, even with the rich plugin ecosystem. Writing posts is hard too: I like composing my posts offline, and there is no good desktop client that plays nicely with WordPress. I could make some of them work more or less OK (Windows Live Writer, MS Word), but they still wouldn't render the posts the way I would see them on the site, so I would end up publishing a post as a draft and then going through the boring process of editing it until it looked fine. Maybe I'm just a control freak; I'm sure most people don't have this problem.

What I wanted is something really simple, not requiring any heavy "blog engines" with incompatible plugins and constant upgrades, and where I can understand what's going on.

The Art of Simplicity

I edit a lot of documentation in Markdown and think it's a beautiful format. Simple yet powerful. It's sort of like writing in Notepad: there are no distractions and the only focus is on content and text. Yet you do have formatting options if you need them. They are very basic, like making text bold or italic, but how many times have you used anything else in powerful editors like Word for writing your thoughts out? Hey, it even supports tables and, if you're a developer, code highlighting. Hyperlinks and pictures too, and some stuff I've never even used yet.

I'm writing this very post in Markdown, using Visual Studio 2017 with the Markdown Editor extension by Mads Kristensen. You'd think it's overkill, but I tried many editors and couldn't find a more functional and powerful option.

Here is a screenshot of what it looks like:

Obviously, those Markdown files are just text files on disk, in a folder under Dropbox or any other cloud storage (I use OneDrive for Business). So what I'd ideally like is to just write posts in local files and have them published on my company's website with nice formatting, just like GitHub does it. That was the original idea, and here is how I've implemented it.

KISS OneDrive

Our website is hosted in Microsoft Azure, built with ASP.NET Core and running on Linux. Why Linux? Because we can. .NET Core is new and I wanted to play with it, and because I'd never hosted a website on Linux I thought I'd give it a go. Anyway, this was done some time ago, when .NET Core was in early preview, so I won't go into details, other than to say it was an amazing experience.

The first problem was getting the OneDrive folder picked up by the website and keeping the two in sync. I thought about using the OneDrive API, but that wasn't simple, and simplicity was the whole plan. The OneDrive API requires user interaction to authenticate and a place to store auth tokens, so I discarded that option. On top of that, it was slow compared to my final solution.

I've decided to sync my folder to an Azure Storage account, under a separate container, by simply mirroring the folder structure. It looks like this:

where:

  • post.md file is the actual post content
  • title.jpg is the title image of the post, appearing in the post preview
  • the rest of the files are referenced from post.md as embedded images

the resulting post preview on the "blogs" page looks like this:

and the actual blog post page:

the resulting blob storage:

To keep them in sync I wrote a simple C# console program which reads the local folder and pushes the files to blob storage.

That was brilliant, because working with Azure from .NET is a piece of cake, with so many SDKs and options available from Microsoft.
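To give an idea of what "mirroring" means here, below is a minimal sketch of how a local file path could be turned into a blob ID. The `BlobIdSketch` class is hypothetical and is not part of the utility. The leading slash is an assumption, based on the three-slash check in `GetDateTuple` in the full source further down.

```csharp
using System;

// Hypothetical helper, for illustration only.
public static class BlobIdSketch
{
    // Maps a local file path to a blob ID by mirroring the folder structure,
    // e.g. C:\blog\2017\03\05\post.md under root C:\blog -> /2017/03/05/post.md
    public static string ToBlobId(string rootFolder, string fullPath)
    {
        if (!fullPath.StartsWith(rootFolder, StringComparison.OrdinalIgnoreCase))
            throw new ArgumentException("file is not under the root folder", nameof(fullPath));

        string relative = fullPath.Substring(rootFolder.Length);

        // Normalise Windows separators and keep exactly one leading slash.
        return "/" + relative.Replace('\\', '/').Trim('/');
    }
}
```

Because this is plain string manipulation, it behaves the same whether the utility runs on Windows or Linux.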

Of course that wasn't enough, because pages would load slowly. How do I get the list of blog posts and pictures without iterating through that tree? The simple solution (and it turned out to be the one) was to use Tables within Azure Storage. And because I already had the console program to sync blobs, I simply made it create an index in table storage as well. The resulting table looks similar to the picture:

A few notes here:

  • I partition the table by year and month. This might not be an optimal strategy but it works in this case.
  • RowKey is the day of the month of a post. We will simply assume there won't be more than one blog post in a day, which is absolutely true!
  • Title of the post is extracted from post.md, from the first line of the text file
  • Tags is line 2 of every post I write; it's simply a comma-separated list of keywords prefixed with "tags:"
  • PreviewText is the first 300 characters of the blog post content, displayed on the blog list page (see details on this later).
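The notes above boil down to a handful of string operations. Here is a minimal sketch of them. It is not the code the utility uses (that one is in the full source below and relies on Markdig); the `PostIndexSketch` class and its method names are made up for illustration, and it assumes a post whose first line is the title and whose second line starts with `tags:`.

```csharp
using System;
using System.Linq;

// Hypothetical helper, for illustration only.
public static class PostIndexSketch
{
    // PartitionKey groups posts by year and month, e.g. "2017-03".
    public static string PartitionKey(DateTime posted) =>
        $"{posted.Year}-{posted.Month:D2}";

    // RowKey is the day of the month, e.g. "05" (one post per day assumed).
    public static string RowKey(DateTime posted) => $"{posted.Day:D2}";

    // Title is the first line with the leading '#' markers removed.
    public static string Title(string markdown) =>
        Lines(markdown)[0].Trim(' ', '#');

    // Tags come from the second line, "tags: a, b, c".
    public static string[] Tags(string markdown)
    {
        string[] lines = Lines(markdown);
        if (lines.Length < 2 || !lines[1].StartsWith("tags:", StringComparison.Ordinal))
            return new string[0];

        return lines[1].Substring(5)
            .Split(new[] { ',' }, StringSplitOptions.RemoveEmptyEntries)
            .Select(t => t.Trim())
            .ToArray();
    }

    // Split on '\n' and trim '\r' so both Windows and Unix line endings work.
    private static string[] Lines(string markdown) =>
        markdown.Split('\n').Select(l => l.TrimEnd('\r')).ToArray();
}
```

Zero-padding the month and day keeps the keys sortable as plain strings, which is what table storage compares.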

The Website

Now that all the information is in the cloud in our private storage account, I need to render it somehow. Again, I remembered that this cool guy Mads Kristensen has probably written his extension in .NET, so I went to his GitHub repo to find out what he used for dealing with Markdown. It turned out to be a very nice library, Markdig, which is exactly what I've used.

Here is a workflow:

  1. When you open the "blog" page, which displays the list of blog posts, I simply read the table storage where the index of posts is stored. I read them all at once, because it's unlikely we will ever have so many blog posts that they won't fit on one page. Otherwise I'd have to come up with a better partitioning policy.
  2. When you open the actual blog post, I load the table row entry and the blog post main file, parse it with Markdig and do a bit of post-processing (replacing relative image links with absolute links to the files in blob storage).
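The image post-processing in step 2 is really just a URL rewrite. A minimal sketch of the idea, separate from Markdig, could look like the following. `ImageLinkSketch` and the `baseUrl` parameter are hypothetical names for illustration; the real version, which walks the parsed document, is `FixImageLinks` in the full source.

```csharp
using System;

// Hypothetical helper, for illustration only.
public static class ImageLinkSketch
{
    // Leaves absolute URLs alone and points relative ones at the post's folder.
    public static string ToAbsolute(string url, string baseUrl, int year, int month, int day)
    {
        if (Uri.IsWellFormedUriString(url, UriKind.Absolute)) return url;

        string root = baseUrl.TrimEnd('/');
        return $"{root}/{year}/{month:D2}/{day:D2}/{url}";
    }
}
```

With this, a post can reference an image by file name only, while images hosted elsewhere keep their original address.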

Basically that's it, I never thought it would be that easy and just work!

Technical details

Sync Utility

First of all, I've created a small shared library to use in both the sync utility and the website. It's a .NET Core library with a blog management interface, a model class and an implementation. The interface is:

public interface IBlogService
{
    /// <summary>
    /// Sync Azure Storage with local folder (push in one direction)
    /// </summary>
    void Sync();

    /// <summary>
    /// Retrieve all blog posts
    /// </summary>
    IEnumerable<BlogPost> GetLatest(int pageNo, int pageSize);

    /// <summary>
    /// Get one post by ID
    /// </summary>
    BlogPost GetPost(PostId id);

    /// <summary>
    /// Get post HTML content
    /// </summary>
    string GetHtmlContent(PostId id);
}

Before we go into details, it's worth mentioning that I've used the Storage.Net library, which my company developed to abstract storage operations regardless of provider. In my case it exposes local disk and Azure as storage providers behind the same interface, so things become trivial.

When I call Sync, blobs are copied from local disk to Azure blobs by simply copying data streams from one interface to another. The appropriate information is also extracted from each post to populate the table data.
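The Sync implementation in the full source skips a file when the remote blob already exists with the same size. Stripped of the storage calls, that decision can be sketched as below. `SyncPlanSketch` is a hypothetical name, and the dictionaries of blob ID to size stand in for whatever metadata the storage library returns.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, for illustration only.
public static class SyncPlanSketch
{
    // Returns the IDs of local files that are missing remotely or differ in size.
    public static List<string> FilesToUpload(
        IDictionary<string, long> localSizes,
        IDictionary<string, long> remoteSizes)
    {
        return localSizes
            .Where(local =>
                !remoteSizes.TryGetValue(local.Key, out long remoteSize) ||
                remoteSize != local.Value)
            .Select(local => local.Key)
            .OrderBy(id => id, StringComparer.Ordinal)
            .ToList();
    }
}
```

Comparing sizes only is cheap, but an edit that leaves the file length unchanged will not be picked up. Comparing a content hash would be a stricter alternative if that ever becomes a problem.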

Website

The website implementation is trivial: it simply references the library and calls .GetLatest() on the blog list page and .GetPost() + .GetHtmlContent() on the blog post page.
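One thing the website has to do before calling the library is turn the request path into a year, month and day. The sketch below shows one way that could be done, assuming URLs shaped like the `PostUrl` property in the full source (`/blog/{year}/{month}/{day}/{title}`). `PostRouteSketch` is a hypothetical name and not taken from the actual website.

```csharp
using System.Globalization;

// Hypothetical helper, for illustration only.
public static class PostRouteSketch
{
    // Parses "/blog/2017/03/05/Some-Title" into year, month and day.
    public static bool TryParse(string path, out int year, out int month, out int day)
    {
        year = month = day = 0;

        string[] parts = path.Trim('/').Split('/');
        if (parts.Length < 4 || parts[0] != "blog") return false;

        // NumberStyles.None accepts digits only, so signs and spaces are rejected.
        return int.TryParse(parts[1], NumberStyles.None, CultureInfo.InvariantCulture, out year)
            && int.TryParse(parts[2], NumberStyles.None, CultureInfo.InvariantCulture, out month)
            && int.TryParse(parts[3], NumberStyles.None, CultureInfo.InvariantCulture, out day);
    }
}
```

The three numbers are all a `PostId` needs; the title part of the URL is only there for readability.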

Full Source Code

I'm attaching the full source code here. Remember it was bodged together in a day and probably looks much nicer by now.

BlogPost.cs

public class BlogPost
{
    public BlogPost(string title, DateTime posted, string previewText, string[] tags)
    {
        Title = title;
        Posted = posted;
        PreviewText = previewText;
        Tags = tags;
    }

    internal BlogPost(TableRow row)
    {
        Title = row["Title"];
        Posted = row["Posted"];
        PreviewText = row["PreviewText"];
        Tags = ((string)row["Tags"]).Split(new[] { ',' }, StringSplitOptions.RemoveEmptyEntries);
    }

    internal TableRow ToTableRow()
    {
        var row = new TableRow(
        $"{Posted.Year}-{Posted.Month,2:D2}",
        $"{Posted.Day,2:D2}");

        row["Title"] = Title;
        row["Posted"] = Posted;
        row["PreviewText"] = PreviewText;
        row["Tags"] = string.Join(",", Tags);

        return row;
    }

    public string Title { get; }

    public DateTime Posted { get; }

    public string PreviewText { get; }

    public string[] Tags { get; }

    #region [ Helpers ]

    public string TitleUrl => $"http://i.isolineltd.com/blog/{Posted.Year}/{Posted.Month,2:D2}/{Posted.Day,2:D2}/title.jpg";

    public string PostUrl => $"/blog/{Posted.Year}/{Posted.Month,2:D2}/{Posted.Day,2:D2}/{Title.Replace(" ", "-")}";

    public string PostedFormattted => Posted.ToString("D");

    public string[] TagsFormatted => new[] { "Uncategorized" };

    #endregion
}

PostId.cs

public class PostId
{
    public PostId(int year, int month, int day)
    {
        this.Year = year;
        this.Month = month;
        this.Day = day;
    }

    public int Year { get; }

    public int Month { get; }

    public int Day { get; }
}

AzureStorageBlogService.cs

   public class AzureStorageBlogService : IBlogService
   {
      private static readonly LogMagic.ILog log = LogMagic.L.G(typeof(AzureStorageBlogService));
      private const string RemoteImageUriFormat = "http://i.isolineltd.com/blog/{0}/{1,2:D2}/{2,2:D2}/{3}";
      private readonly ITableStorage _tables;
      private readonly IBlobStorage _destBlobs;
      private readonly IBlobStorage _srcBlobs;
      private const string EntityName = "blog";

      public AzureStorageBlogService(NetworkCredential azureStorageCredential, string localFolder)
      {
         _destBlobs = StorageFactory.Blobs.AzureBlobStorage(azureStorageCredential.UserName, azureStorageCredential.Password, EntityName);
         _tables = StorageFactory.Tables.AzureTableStorage(azureStorageCredential.UserName, azureStorageCredential.Password);

         _srcBlobs = localFolder == null ? null : StorageFactory.Blobs.DirectoryFiles(new DirectoryInfo(localFolder));
      }

      public void Sync()
      {
         //detect local files
         var localFiles = _srcBlobs.List(null)
            .Where(id => GetDateTuple(id) != null)
            .GroupBy(GetDateTuple)
            .ToList();

         //upload files
         foreach (var group in localFiles)
         {
            string dateTuple = group.Key;
            List<string> files = group.ToList();

            //upload all the files remotely
            foreach (string fileId in files)
            {
               BlobMeta fromMeta = _srcBlobs.GetMeta(fileId);
               BlobMeta toMeta = _destBlobs.GetMeta(fileId);

               if (toMeta != null && toMeta.Size == fromMeta.Size)
               {
                  log.D("sizes match, skipping upload");
               }
               else
               {
                  log.D("uploading {0}...", fileId);

                  using (Stream sl = _srcBlobs.OpenStreamToRead(fileId))
                  {
                     _destBlobs.UploadFromStream(fileId, sl);
                  }
               }
            }
         }

         //update table
         var rows = new List<TableRow>();
         foreach (var group in localFiles)
         {
            string[] tuple = group.Key.Split('/');

            string postId = group.First(f => f.EndsWith("post.md"));
            string title, preview;
            string[] tags;
            string content = _srcBlobs.DownloadText(postId);

            GetMarkdownParts(content, out title, out preview, out tags);

            var post = new BlogPost(
               title,
               new DateTime(int.Parse(tuple[0]), int.Parse(tuple[1]), int.Parse(tuple[2])),
               preview,
               tags);

            rows.Add(post.ToTableRow());
         }
         log.D("updating table...");
         _tables.InsertOrReplace(EntityName, rows);

         log.D("all done");
      }

      private static string GetDateTuple(string blobId)
      {
         int slashCount = blobId.Count(ch => ch == '/');

         if (slashCount != 3) return null;

         return blobId.Substring(0, blobId.LastIndexOf('/')).Trim('/');
      }

      private static void GetMarkdownParts(string content, out string title, out string preview, out string[] tags)
      {
         MarkdownDocument doc = Markdown.Parse(content);

         SourceSpan headerSpan = doc[0].Span;
         title = content.Substring(headerSpan.Start, headerSpan.Length).Trim(new[] { ' ', '#' });

         SourceSpan tagsSpan = doc[1].Span;
         string tagsString = content.Substring(tagsSpan.Start, tagsSpan.Length).Trim();
         if(tagsString.StartsWith("tags:"))
         {
            tags = tagsString.Substring(5).Split(new[] { ',' }, StringSplitOptions.RemoveEmptyEntries).Select(tag => tag.Trim()).ToArray();
         }
         else
         {
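            //use an empty array rather than null so that string.Join in BlogPost.ToTableRow doesn't throw
            tags = new string[0];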
         }

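         //clamp the preview window so that posts shorter than 300 characters don't throw
         int previewStart = Math.Min(content.Length, headerSpan.Length + tagsString.Length + Environment.NewLine.Length * 2);
         string previewMd = content.Substring(previewStart, Math.Min(300, content.Length - previewStart)).Trim();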
         preview = Markdown.ToHtml(previewMd).StripHtml().Trim();
      }

      public IEnumerable<BlogPost> GetLatest(int pageNo, int pageSize)
      {
         int year = DateTime.UtcNow.Year;
         int month = 12;

         const int minYear = 2015;

         var result = new List<TableRow>();

         while (year > minYear)
         {
            //get records
            string pk = GetPostPrimaryKey(year, month);

            //get rows
            IEnumerable<TableRow> rows = _tables.Get(EntityName, pk);
            result.AddRange(rows);

            //iterate down
            month -= 1;
            if (month == 0)
            {
               month = 12;
               year -= 1;
            }
         }

         return result.Select(r => new BlogPost(r)).OrderByDescending(p => p.Posted).ToList();
      }

      public BlogPost GetPost(PostId id)
      {
         TableRow row = _tables.Get(EntityName, GetPostPrimaryKey(id.Year, id.Month), GetPostRowKey(id.Day));

         return new BlogPost(row);
      }

      private static string GetPostPrimaryKey(int year, int month)
      {
         return $"{year}-{month,2:D2}";
      }

      private static string GetPostRowKey(int day)
      {
         return $"{day,2:D2}";
      }

      public string GetHtmlContent(PostId id)
      {
         string blobId = $"{id.Year}/{id.Month,2:D2}/{id.Day,2:D2}/post.md";

         string markdown = _destBlobs.DownloadText(blobId);

         //the first line is always a title, we already have it so strip it out!
         //search for '\n' rather than Environment.NewLine so both Windows and Unix line endings work
         markdown = markdown.Substring(markdown.IndexOf('\n') + 1);

         //parse markdown
         MarkdownDocument doc = Markdown.Parse(markdown);

         //process images
         FixImageLinks(doc, id);
         FixExternalLinks(doc);

         //convert to HTML
         var sb = new StringBuilder();
         using (var writer = new StringWriter(sb))
         {
            var renderer = new HtmlRenderer(writer);
            renderer.Render(doc);
         }

         return sb.ToString();
      }

      private static void FixImageLinks(MarkdownDocument doc, PostId id)
      {
         foreach (LinkInline link in doc.Descendants().OfType<LinkInline>().Where(link => link.IsImage))
         {
            //for non-absolute URIs link to blob storage
            if (!Uri.IsWellFormedUriString(link.Url, UriKind.Absolute))
            {
               string uri = string.Format(RemoteImageUriFormat, id.Year, id.Month, id.Day, link.Url);
               link.Url = uri;

               HtmlAttributes attributes = link.GetAttributes();
               if (attributes == null)
               {
                  attributes = new HtmlAttributes();
               }

               if(attributes.Classes == null)
               {
                  attributes.Classes = new List<string>();
               }

               attributes.Classes.Add("img-responsive");
               attributes.Classes.Add("shadow3");

               link.SetAttributes(attributes);
            }
         }
      }

      private static void FixExternalLinks(MarkdownDocument doc)
      {
         //basically I'd like links to open in a new window (note this applies to every non-image link)
         foreach(LinkInline link in doc.Descendants().OfType<LinkInline>().Where(link => !link.IsImage))
         {
            HtmlAttributes attributes = link.GetAttributes();
            if (attributes == null)
            {
               attributes = new HtmlAttributes();
            }

            attributes.AddProperty("target", "_blank");
            link.SetAttributes(attributes);
         }
      }
   }

Hope you find this useful.
