17 July 2009

The evils of BOMs

With my application currently using over 20,000 lines of JavaScript spread over dozens of files, I figured it was about time to do some concatenation and minification and timestamping and contraflabulation. Concatenated and squished, I end up with a half-meg JavaScript file. Smegging hell.

So this is all well and good. Except that during concatenation using Windows command line tools (type and echo and copy), the UTF-8 BOM (byte order mark, the three bytes EF BB BF) that Visual Studio frivolously squirts into the start of every file gets concatenated too, so the combined file ends up with a BOM in the middle of it wherever two files meet. And web browsers just choke on it.
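
To make the problem concrete, here is a minimal sketch that builds two "saved with BOM" files in memory and joins them byte for byte, which is roughly what the command line tools do. The class and method names (BomDemo, SavedWithBom, Join, HasBomAt) are made up for illustration, and so are the file contents.

```csharp
using System;
using System.IO;
using System.Text;

static class BomDemo
{
    // The bytes an editor produces when it saves text as "UTF-8 with BOM".
    public static byte[] SavedWithBom(string text)
    {
        var encoding = new UTF8Encoding(true);
        return Join(encoding.GetPreamble(), encoding.GetBytes(text));
    }

    // Plain byte-for-byte concatenation of two buffers.
    public static byte[] Join(byte[] a, byte[] b)
    {
        var stream = new MemoryStream();
        stream.Write(a, 0, a.Length);
        stream.Write(b, 0, b.Length);
        return stream.ToArray();
    }

    // True if the UTF-8 BOM (EF BB BF) starts at the given offset.
    public static bool HasBomAt(byte[] data, int offset)
    {
        byte[] bom = new UTF8Encoding(true).GetPreamble();
        if (offset < 0 || offset + bom.Length > data.Length) return false;
        for (int i = 0; i < bom.Length; i++)
        {
            if (data[offset + i] != bom[i]) return false;
        }
        return true;
    }

    static void Main()
    {
        byte[] first = SavedWithBom("var a = 1;\n");
        byte[] second = SavedWithBom("var b = 2;\n");
        byte[] joined = Join(first, second);

        Console.WriteLine(BitConverter.ToString(new UTF8Encoding(true).GetPreamble())); // EF-BB-BF
        Console.WriteLine(HasBomAt(joined, 0));            // True: the harmless one at the start
        Console.WriteLine(HasBomAt(joined, first.Length)); // True: the stray one mid-file
    }
}
```

The BOM at offset 0 is the one a decoder expects and skips. The second one sits in the middle of the script, and that is the one that causes the trouble.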

So after pissing around for many hours trying to get around the issue from the command line, I threw in the towel and just wrote a little command line tool of my own to ditch the BOM from a file, and optionally append the resulting file to another. Code is below. It's C#, and can be compiled for most platforms using Mono (probably). If requested, I can post a compiled Windows command line executable.

The bit that does the work:

string SourceFileName = "source.js";
string DestFileName = "withoutBom.js";
bool Append = true;

// ReadAllText detects the BOM and leaves it out of the returned string.
var text = File.ReadAllText(SourceFileName);
// UTF8Encoding(false) makes the writer emit no BOM of its own.
using (var streamWriter = new StreamWriter(DestFileName, Append, new UTF8Encoding(false)))
{
    streamWriter.Write(text);
}
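
One caveat with the snippet above: ReadAllText decodes the file and the writer re-encodes it, so a source file that is not really UTF-8 can come out altered. A different technique, not the one the tool uses, is to work on the raw bytes and only drop a leading EF BB BF. A minimal sketch, where BomBytes, StripUtf8Bom and CopyWithoutBom are hypothetical names:

```csharp
using System;
using System.IO;

static class BomBytes
{
    // Returns the data without a leading UTF-8 BOM (EF BB BF), if there is one.
    public static byte[] StripUtf8Bom(byte[] data)
    {
        bool hasBom = data.Length >= 3
            && data[0] == 0xEF && data[1] == 0xBB && data[2] == 0xBF;
        if (!hasBom) return data;

        byte[] result = new byte[data.Length - 3];
        Array.Copy(data, 3, result, 0, result.Length);
        return result;
    }

    // Writes (or appends) the source file to dest, minus any leading BOM.
    public static void CopyWithoutBom(string source, string dest, bool append)
    {
        // The whole source is read before dest is opened, so source == dest is safe.
        byte[] clean = StripUtf8Bom(File.ReadAllBytes(source));
        FileMode mode = append ? FileMode.Append : FileMode.Create;
        using (var output = new FileStream(dest, mode, FileAccess.Write))
        {
            output.Write(clean, 0, clean.Length);
        }
    }
}
```

This only handles the UTF-8 BOM at the very start of a file. It leaves every other byte untouched, whatever the encoding.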


And here it is wrapped in lots of fluff to make it into a command line app (yes, it looks like crap. Use http://couponmeister.com/beautify.aspx to uncrapify it):

using System;
using System.IO;
using System.Text;

namespace RemoveBom
{
class Program
{
private static void CommandHelp(string message)
{
Console.WriteLine("\n " + message + "\n");
Console.WriteLine(" -----------------------------------------------------------------------------");
Console.WriteLine();
Console.WriteLine(" RemoveBom.exe will remove the UTF-8 BOM (byte order mark) from a file");
Console.WriteLine();
Console.WriteLine(" Usage: RemoveBom.exe [-a] sourceFileName [destinationFileName]");
Console.WriteLine();
Console.WriteLine(" sourceFileName Name of file containing BOM character");
Console.WriteLine();
Console.WriteLine(" destinationFileName Name of file to write to");
Console.WriteLine(" If not specified, the source file is modified");
Console.WriteLine();
Console.WriteLine(" -a Append the cleaned sourceFile to the destination file");
Console.WriteLine(" If not specified, the destination file is overwritten");
Console.WriteLine();
Console.WriteLine(" -----------------------------------------------------------------------------");
}


private class Arguments
{
public string SourceFileName;
public string DestFileName;
public bool Append;

public Arguments(string[] args)
{
switch (args.Length)
{

// RemoveBom.exe sourceFileName
case 1:
this.SourceFileName = args[0];
this.DestFileName = args[0];
this.Append = false;
break;

// RemoveBom.exe sourceFileName destFileName
case 2:
if (args[0] == "-a")
throw new Exception("When specifying -a switch, a destination filename must be given");
this.SourceFileName = args[0];
this.DestFileName = args[1];
this.Append = false;
break;

// RemoveBom.exe -a sourceFileName destFileName
case 3:
if (args[0] != "-a")
throw new Exception("Invalid parameters");
this.SourceFileName = args[1];
this.DestFileName = args[2];
this.Append = true;
break;

// Zero arguments, or more than three
default:
throw new Exception(args.Length == 0 ? "No arguments were provided" : "Too many arguments were provided");
}
}
}


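private static void RemoveBom(Arguments args)
{
// ReadAllText detects the BOM and leaves it out of the returned string.
var text = File.ReadAllText(args.SourceFileName);
// UTF8Encoding(false) makes the writer emit no BOM of its own.
using (var streamWriter = new StreamWriter(args.DestFileName, args.Append, new UTF8Encoding(false)))
{
streamWriter.Write(text);
}
}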


static void Main(string[] args)
{
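try
{
var arguments = new Arguments(args);
RemoveBom(arguments);
}
catch (Exception e)
{
CommandHelp(e.Message);
// Non-zero exit code so a calling build script can tell something went wrong.
Environment.ExitCode = 1;
}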
}

}
}
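
The -a switch is there so a build script can call the tool once per file and append each one to the bundle. If the build step is already C#, the same loop can be done in one go. A minimal sketch, assuming a list of source files is at hand; Bundler and Concatenate are hypothetical names, and the newline written between files is my own addition so that one file's last line cannot run into the next file's first line.

```csharp
using System.IO;
using System.Text;

static class Bundler
{
    // Writes all source files into one output file, dropping each file's BOM.
    public static void Concatenate(string[] sources, string output)
    {
        // UTF8Encoding(false): the bundle itself gets no BOM either.
        using (var writer = new StreamWriter(output, false, new UTF8Encoding(false)))
        {
            foreach (string source in sources)
            {
                // ReadAllText skips a leading BOM while decoding.
                writer.Write(File.ReadAllText(source));
                writer.WriteLine();
            }
        }
    }
}
```

Something like Bundler.Concatenate(new[] { "a.js", "b.js" }, "bundle.js") would then replace the chain of command line calls. The file names here are placeholders.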