I’ve long wanted to write a Bencoding library. Bencoding is a format for serializing objects such as text, integers, lists, and dictionaries into a single piece of text. It’s often used for transporting configuration files.
Bencoding is most famous for its use in the BitTorrent protocol, where it forms the basis for .torrent files. I wanted to create a C# library to support all known Bencoding object types. You can read more about the encoding here.
As a bonus, I also tried out the Team Foundation Server on CodePlex while coding the project.
The Objects
There are four Bencode object types:
- Byte strings – e.g., "Hello World"
- Integers – e.g., 1, 42, -999
- Lists – ordered elements like ("hello", 42)
- Dictionaries – key/value mappings, e.g., { "Hello" => "Mike" }
Each object has a specific format:
- String: 5:Hello (5-byte string)
- Integer: i42e
- List: l5:helloi42ee (list of “hello” and 42)
- Nested list: l5:Hellol12:second level16:second string :Pee
- Dictionary: d5:Hello4:Mikee
- Dictionary with keys in lexicographical order: d1:ai52e1:b8:Object 1e
The next task was designing classes to support these structures.
The Class Layout
To maintain good OOP practices, I created a base type that all Bencode types derive from. This allowed lists and dictionaries to store base-type objects. The base is abstract, requiring child classes to override methods.
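As a rough illustration of that layout, here is a minimal sketch. The names BencodeNode, IntNode, TextNode, and ListNode are hypothetical stand-ins, not the library's actual types, and the string type assumes UTF-8 text when it computes the length prefix.

```csharp
using System.Collections.Generic;
using System.Globalization;
using System.Text;

// Hypothetical base type: every Bencode object derives from it.
public abstract class BencodeNode
{
    // Each concrete type must supply its own Bencoded form.
    public abstract string Encode();
}

public sealed class IntNode : BencodeNode
{
    public long Value { get; }
    public IntNode(long value) { Value = value; }

    // Integers are written as i<digits>e, independent of the current culture.
    public override string Encode() =>
        "i" + Value.ToString(CultureInfo.InvariantCulture) + "e";
}

public sealed class TextNode : BencodeNode
{
    public string Value { get; }
    public TextNode(string value) { Value = value; }

    // Assumption: the text is transported as UTF-8, so the prefix is its UTF-8 byte count.
    public override string Encode() =>
        Encoding.UTF8.GetByteCount(Value) + ":" + Value;
}

public sealed class ListNode : BencodeNode
{
    // Children are stored as the base type, so any node (including lists) can be nested.
    public List<BencodeNode> Items { get; } = new List<BencodeNode>();

    public override string Encode()
    {
        var sb = new StringBuilder("l");
        foreach (var item in Items) sb.Append(item.Encode());
        return sb.Append('e').ToString();
    }
}
```

With these types, new ListNode { Items = { new TextNode("hello"), new IntNode(42) } }.Encode() yields l5:helloi42ee, matching the list example above.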
Encoding
Each object overrides a virtual Encode() method, allowing recursive encoding across nested structures.
Decoding
I added a static Decode() method in the base type to parse a string recursively, decoding one object at a time.
Error Handling
To handle malformed inputs, I defined these exceptions:
- Except_Error – generic errors
- Except_Error_String
- Except_Error_Int
- Except_Error_Dict
- Except_Error_List
All inherit from Except_Error and expose the failing string segment.
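A hierarchy like this might be sketched as follows. The class names come from the list above, but the Segment property, the constructor signatures, and the messages are assumptions made for illustration; the real library may expose the failing segment differently.

```csharp
using System;

// Base exception for all Bencode parsing problems.
public class Except_Error : Exception
{
    // Assumed property: the part of the input that could not be parsed.
    public string Segment { get; }

    public Except_Error(string message, string segment, Exception inner = null)
        : base(message, inner)
    {
        Segment = segment;
    }
}

// One subclass per object type, so callers can react to a specific failure.
public class Except_Error_Int : Except_Error
{
    public Except_Error_Int(string segment, Exception inner = null)
        : base("Malformed integer", segment, inner) { }
}

public class Except_Error_String : Except_Error
{
    public Except_Error_String(string segment, Exception inner = null)
        : base("Malformed string", segment, inner) { }
}
```

Because every specific exception derives from Except_Error, a single catch (Except_Error ex) block handles all of them, while a caller that only cares about integers can catch Except_Error_Int.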
The Code
The code is available at CodePlex source. Click “Browse” to see the latest revision. Look for the BencodeLibrary project.
Notable files:
- Base.cs – the base type
- Bencode.Dict.cs – dictionary handling
Operator Overloads
I implemented overloads for ==, !=, and Equals() to compare values meaningfully:
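public static bool operator ==(BString first, BString second)
{
    // Check for nulls first so that comparisons such as "x == null" do not throw.
    if (ReferenceEquals(first, null) || ReferenceEquals(second, null)) return ReferenceEquals(first, second);
    return first.Value == second.Value;
}
public static bool operator !=(BString first, BString second) => !(first == second);
public override bool Equals(object obj)
{
    if (obj == null || GetType() != obj.GetType()) return false;
    return this == (BString)obj;
}
// Objects that compare equal must also return the same hash code.
public override int GetHashCode() => Value?.GetHashCode() ?? 0;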
For lists:
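public static bool operator ==(BList first, BList second)
{
    // Same reference (or both null) is equal; a single null operand is not.
    if (ReferenceEquals(first, second)) return true;
    if (ReferenceEquals(first, null) || ReferenceEquals(second, null)) return false;
    if (first.Value.Count != second.Value.Count) return false;
    for (int x = 0; x < first.Value.Count; x++)
    {
        // Equals() dispatches to each element's own value comparison.
        if (!Equals(first.Value[x], second.Value[x])) return false;
    }
    return true;
}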
Encode
Each object implements its Encode() method:
Integer:
public override string Encode() => $"i{_data.ToString(BaseType.GlobalizationCultureInfo)}e";
String:
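// Note: _data.Length counts characters, while Bencode length prefixes count bytes; the two agree only for single-byte (ASCII) text.
public override string Encode() => $"{_data.Length}:{_data}";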
List:
public override string Encode()
{
    var res = new StringBuilder("l");
    foreach (var item in _childs) res.Append(item.Encode());
    res.Append("e");
    return res.ToString();
}
Dictionary:
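public override string Encode()
{
    var res = new StringBuilder("d");
    // Sort keys ordinally: the default string comparer is culture-sensitive and could reorder keys.
    foreach (var item in _childs.OrderBy(kv => kv.Key.Value, StringComparer.Ordinal))
    {
        res.Append(item.Key.Encode());
        res.Append(item.Value.Encode());
    }
    res.Append("e");
    return res.ToString();
}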
This supports recursive encoding of complex structures.
Decode
The Decode(ref string Input) method checks the first character of the string and parses accordingly:
- 'i' for integers
- 'l' for lists
- 'd' for dictionaries
- digits for strings
Each branch recursively decodes any child objects. Errors are caught and rethrown as specific exceptions with context.
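To make the dispatch concrete, here is a simplified sketch of such a decoder. It is not the library's code: BencodeSketch is a hypothetical name, it returns plain .NET objects (long, string, List<object>, Dictionary<string, object>) rather than the library's types, it throws FormatException instead of the custom exceptions, and it treats length prefixes as character counts, which is only correct for ASCII input.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

public static class BencodeSketch
{
    // Decodes one object from the front of 'input' and removes the consumed text from it.
    public static object Decode(ref string input)
    {
        if (string.IsNullOrEmpty(input)) throw new FormatException("Unexpected end of input");
        char c = input[0];

        if (c == 'i') // integer: i<digits>e
        {
            int end = input.IndexOf('e');
            if (end < 0) throw new FormatException("Unterminated integer: " + input);
            long value = long.Parse(input.Substring(1, end - 1),
                NumberStyles.AllowLeadingSign, CultureInfo.InvariantCulture);
            input = input.Substring(end + 1);
            return value;
        }
        if (c == 'l') // list: l<items>e
        {
            input = input.Substring(1);
            var list = new List<object>();
            while (input.Length > 0 && input[0] != 'e') list.Add(Decode(ref input));
            ConsumeEnd(ref input);
            return list;
        }
        if (c == 'd') // dictionary: d<key><value>...e, keys must be strings
        {
            input = input.Substring(1);
            var dict = new Dictionary<string, object>();
            while (input.Length > 0 && input[0] != 'e')
            {
                var key = Decode(ref input) as string;
                if (key == null) throw new FormatException("Dictionary keys must be strings");
                dict[key] = Decode(ref input);
            }
            ConsumeEnd(ref input);
            return dict;
        }
        if (c >= '0' && c <= '9') // string: <length>:<text>
        {
            int colon = input.IndexOf(':');
            if (colon < 0) throw new FormatException("Missing ':' in string: " + input);
            int length = int.Parse(input.Substring(0, colon),
                NumberStyles.None, CultureInfo.InvariantCulture);
            if (colon + 1 + length > input.Length)
                throw new FormatException("String is shorter than its length prefix");
            string text = input.Substring(colon + 1, length);
            input = input.Substring(colon + 1 + length);
            return text;
        }
        throw new FormatException("Unknown type prefix '" + c + "'");
    }

    // Removes the 'e' that terminates a list or dictionary.
    private static void ConsumeEnd(ref string input)
    {
        if (input.Length == 0 || input[0] != 'e') throw new FormatException("Missing terminating 'e'");
        input = input.Substring(1);
    }
}
```

Passing the input by ref lets each recursive call hand the unconsumed remainder back to its caller, which is how a list or dictionary branch knows where its next child starts. Decoding d1:ai52e1:b8:Object 1e with this sketch gives a dictionary mapping "a" to 52 and "b" to "Object 1", and leaves the input string empty.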
The Result
I now have a fully functional library to encode and decode Bencoded strings. It’s object-oriented and comes with Visual Studio unit tests.
The code is hosted on TFS via CodePlex. The project is in alpha, and I’m watching for feedback in the discussion forums.