
Going Native - C#
"I belong to the warrior in whom the old ways have joined the new."
Inscription on the sword wielded by Captain Nathan Algren, The Last Samurai
From the JVM to the CLR
This is the third part in a series on calling native code from high-level languages. I've been interested in making useful code locked away in native libraries more widely available, and took this opportunity to finally look into how it's done.
Here is a description of the native library I'm calling in this series.
After struggling through getting Java's Foreign Function and Memory API to work, I wasn't sure what to expect from .NET. Nevertheless, C# is the next language I'm most familiar with, so I went ahead and plunged in.
The approach I followed is Explicit P/Invoke, outlined on the Microsoft Learn website. That page provides good background and an outline of the process and the alternatives. In reality it was so easy that I got by with just conversations with ChatGPT.
The Basics
I started by declaring structs that mirrored the (public) structs in the native libraries:
[StructLayout(LayoutKind.Sequential)]
private struct Rashunal
{
    public int numerator;
    public int denominator;
}

[StructLayout(LayoutKind.Sequential)]
private struct GaussFactorization
{
    public IntPtr PInverse;
    public IntPtr Lower;
    public IntPtr Diagonal;
    public IntPtr Upper;
}
The attributes indicate that the structs are laid out in memory with each field directly following the previous one. IntPtr is the .NET type that represents a pointer-sized value, used here as a pointer to some native memory location. You'll see it again!
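To make that concrete, here is a minimal standalone sketch (not code from the project) of how a sequentially laid-out struct round-trips through native memory via an IntPtr; it uses Marshal.AllocHGlobal in place of the native malloc that appears later:

using System;
using System.Runtime.InteropServices;

// Sketch: copy a managed Rashunal into unmanaged memory and read it back
// through an IntPtr.
var original = new Rashunal { numerator = 3, denominator = 4 };
IntPtr ptr = Marshal.AllocHGlobal(Marshal.SizeOf<Rashunal>());
try
{
    Marshal.StructureToPtr(original, ptr, fDeleteOld: false);
    var copy = Marshal.PtrToStructure<Rashunal>(ptr);
    Console.WriteLine($"{copy.numerator}/{copy.denominator}"); // prints 3/4
}
finally
{
    Marshal.FreeHGlobal(ptr);
}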
Then the native functions are declared with ordinary C# parameter and return types, with attributes that say which library each one lives in and what the native entry point is called. The methods (and the class) are declared partial because the marshalling implementation is generated at compile time by the LibraryImport source generator. By convention the C# function and the native function have the same name, but that's not required.
[LibraryImport("rashunal", EntryPoint = "n_Rashunal")]
private static partial IntPtr n_Rashunal(int numerator, int denominator);

[LibraryImport("rmatrix", EntryPoint = "new_RMatrix")]
private static partial IntPtr new_RMatrix(int height, int width, IntPtr data);

[LibraryImport("rmatrix", EntryPoint = "RMatrix_height")]
private static partial int RMatrix_height(IntPtr m);

[LibraryImport("rmatrix", EntryPoint = "RMatrix_width")]
private static partial int RMatrix_width(IntPtr m);

[LibraryImport("rmatrix", EntryPoint = "RMatrix_get")]
private static partial IntPtr RMatrix_get(IntPtr m, int row, int col);

[LibraryImport("rmatrix", EntryPoint = "RMatrix_gelim")]
private static partial IntPtr RMatrix_gelim(IntPtr m);
Then the native methods can be called alongside normal C# code. I'll present the code in reverse of the order it actually runs when factoring a matrix with the native library.
public static CsGaussFactorization Factor(Model.CsRMatrix m)
{
    var nativeMPtr = AllocateNativeRMatrix(m);
    var fPtr = RMatrix_gelim(nativeMPtr);
    var f = Marshal.PtrToStructure<GaussFactorization>(fPtr);
    var csF = new CsGaussFactorization
    {
        PInverse = AllocateManagedRMatrix(f.PInverse),
        Lower = AllocateManagedRMatrix(f.Lower),
        Diagonal = AllocateManagedRMatrix(f.Diagonal),
        Upper = AllocateManagedRMatrix(f.Upper),
    };
    NativeStdLib.Free(nativeMPtr);
    NativeStdLib.Free(fPtr);
    return csF;
}
First I call a method to allocate a native matrix (shown below), and then I call RMatrix_gelim on it, which returns a pointer to a native struct. Since that struct is part of the public native interface, it can be unmarshaled into a C# object with the Marshal.PtrToStructure<GaussFactorization> call. Then the native matrix pointers inside the factorization are used to construct managed matrices through the AllocateManagedRMatrix calls (also below). Finally, since the native matrix pointer and the factorization pointer were allocated by the native code, they have to be freed by calls to the native free method, which is also covered below.
private static IntPtr AllocRashunal(int num, int den)
{
    IntPtr ptr = NativeStdLib.Malloc((UIntPtr)Marshal.SizeOf<Rashunal>());
    var value = new Rashunal { numerator = num, denominator = den };
    Marshal.StructureToPtr(value, ptr, false);
    return ptr;
}

private static IntPtr AllocateNativeRMatrix(Model.CsRMatrix m)
{
    int elementCount = m.Height * m.Width;
    IntPtr elementArray = NativeStdLib.Malloc((UIntPtr)(IntPtr.Size * elementCount));
    unsafe
    {
        var pArray = (IntPtr*)elementArray;
        for (int i = 0; i < elementCount; ++i)
        {
            var element = m.Data[i];
            var elementPtr = AllocRashunal(element.Numerator, element.Denominator);
            pArray[i] = elementPtr;
        }
        var rMatrixPtr = new_RMatrix(m.Height, m.Width, elementArray);
        for (int i = 0; i < elementCount; ++i)
        {
            NativeStdLib.Free(pArray[i]);
        }
        NativeStdLib.Free(elementArray);
        return rMatrixPtr;
    }
}
Allocating a native RMatrix required native memory allocations, both for the individual Rashunals and for an array of Rashunal pointers. In a pattern that seems familiar by now, I wrapped those calls in a NativeStdLib class that I promise to get to very soon. Allocating a Rashunal involves declaring a managed Rashunal struct, allocating native memory for it, and marshaling the struct into that native memory. The unsafe block is needed to treat the block of memory allocated for the pointer array as an actual array instead of a block of unstructured memory. To get this to compile I had to add <AllowUnsafeBlocks>True</AllowUnsafeBlocks> to the PropertyGroup in the project file. Finally, I have to free both the individually allocated native Rashunals and the array of pointers to them, since new_RMatrix makes copies of them all.
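As an aside, the unsafe block (and the <AllowUnsafeBlocks> setting) could probably be avoided by writing the pointers through the Marshal API instead of raw pointer arithmetic. A sketch of the same fill loop, not the code I actually used:

// Sketch: populate the native pointer array with Marshal.WriteIntPtr
// instead of casting elementArray to IntPtr* in an unsafe block.
for (int i = 0; i < elementCount; ++i)
{
    var element = m.Data[i];
    var elementPtr = AllocRashunal(element.Numerator, element.Denominator);
    // Write the pointer at byte offset i * IntPtr.Size into the array.
    Marshal.WriteIntPtr(elementArray, i * IntPtr.Size, elementPtr);
}

The matching reads for the cleanup loop would use Marshal.ReadIntPtr(elementArray, i * IntPtr.Size).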
private static Model.CsRMatrix AllocateManagedRMatrix(IntPtr m)
{
    int height = RMatrix_height(m);
    int width = RMatrix_width(m);
    var data = new CsRashunal[height * width];
    for (int i = 1; i <= height; ++i)
    {
        for (int j = 1; j <= width; ++j)
        {
            var rPtr = RMatrix_get(m, i, j);
            var r = Marshal.PtrToStructure<Rashunal>(rPtr);
            data[(i - 1) * width + (j - 1)] = new CsRashunal { Numerator = r.numerator, Denominator = r.denominator };
            NativeStdLib.Free(rPtr);
        }
    }
    return new Model.CsRMatrix { Height = height, Width = width, Data = data, };
}
After all that, converting a native RMatrix back into a managed one is not very interesting. The native RMatrix_get method returns a newly-allocated copy of the Rashunal at a position in the RMatrix, so it has to be freed the same way as before.
Ok, finally, as promised, here is the interface to loading the native standard library methods:
using System.Reflection;
using System.Runtime.InteropServices;

namespace CsRMatrix.Engine;

public static partial class NativeStdLib
{
    static NativeStdLib()
    {
        NativeLibrary.SetDllImportResolver(typeof(NativeStdLib).Assembly, ResolveLib);
    }

    private static IntPtr ResolveLib(string libraryName, Assembly assembly, DllImportSearchPath? searchPath)
    {
        if (libraryName == "c")
        {
            if (RuntimeInformation.IsOSPlatform(OSPlatform.Windows))
                return NativeLibrary.Load("ucrtbase.dll", assembly, searchPath);
            if (RuntimeInformation.IsOSPlatform(OSPlatform.Linux))
                return NativeLibrary.Load("libc.so.6", assembly, searchPath);
            if (RuntimeInformation.IsOSPlatform(OSPlatform.OSX))
                return NativeLibrary.Load("libSystem.dylib", assembly, searchPath);
        }
        return IntPtr.Zero;
    }

    [LibraryImport("c", EntryPoint = "free")]
    internal static partial void Free(IntPtr ptr);

    [LibraryImport("c", EntryPoint = "malloc")]
    internal static partial IntPtr Malloc(UIntPtr size);
}
The platform-specific switching and filenames are pretty ugly, but neither ChatGPT nor I could find a way around it. At least it's confined to a single method in a single class in the project.
ChatGPT really wanted there to be library-specific ways to free Rashunals and factorizations. Then those methods could be declared and called the same way as the new_* methods. But I remained stubborn and said I didn't want to change the source code of the libraries. I was willing to recompile them as needed, but not to change the source code or the CMake files. Eventually, we found this way of handling the standard native library calls.
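For illustration only, the declarations ChatGPT had in mind would have looked something like this; the *_free entry points here are hypothetical and don't exist in the real libraries, which is exactly why this route would have meant changing their source:

// Hypothetical declarations: these entry points do not exist in the actual
// rashunal/rmatrix libraries. They show how library-provided free functions
// would have been imported, just like the new_* functions.
[LibraryImport("rashunal", EntryPoint = "Rashunal_free")]
private static partial void Rashunal_free(IntPtr r);

[LibraryImport("rmatrix", EntryPoint = "RMatrix_free")]
private static partial void RMatrix_free(IntPtr m);

[LibraryImport("rmatrix", EntryPoint = "GaussFactorization_free")]
private static partial void GaussFactorization_free(IntPtr f);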
Getting the right C runtime filename on Windows and getting this to compile and work was a little challenging. The C# code and the native code need to match exactly in operating system (obviously), architecture (64-bit vs. 32-bit), and configuration (Debug vs. Release). It took a few more steps than what I went through when compiling the JNI code.
Compiling on Windows
Windows is very careful about who can free memory: memory can only be freed by the same C runtime that allocated it. Practically, that meant I needed to make sure I was allocating and freeing memory from the same runtime with the same C runtime model. That meant I needed to compile with the multi-threaded DLL (/MD) flag instead of the default multi-threaded (/MT) compiler flag. I also needed to use the right filename to link the libraries to, which ChatGPT initially thought was msvcrt. So I modified the steps to compile the library and checked its headers, imports, and dependencies. This again is in an x64 Native Tools Command Prompt for VS 2022.
>cmake .. -G "NMake Makefiles" ^
    -DCMAKE_BUILD_TYPE=Release ^
    -DCMAKE_INSTALL_PREFIX=C:/Users/john.todd/local/rashunal ^
    -DCMAKE_C_FLAGS_RELEASE="/MD /O2 /DNDEBUG"
>nmake
>nmake install
>cd /Users/john.todd/local/rashunal/bin
>dumpbin /headers rashunal.dll | findstr machine
            8664 machine (x64)

>dumpbin /imports rashunal.dll | findstr free
                  18 free

>dumpbin /dependents rashunal.dll
I didn't see msvcrt.dll, but did see VCRUNTIME140.DLL instead. ChatGPT said, "Ah, that's okay, that's actually better. msvcrt is the old way, ucrt (Universal CRT) is the new way." Then linking to "ucrtbase" in the NativeStdLib utility class (as shown above) worked.
Like with JNI, I had to add the Rashunal and RMatrix library directories to the PATH, and then it worked!
> $env:PATH += ";C:\Users\john.todd\local\rashunal\bin;C:\Users\john.todd\local\rmatrix\bin"
> dotnet run C:\Users\john.todd\source\repos\rmatrix\driver\example.txt
Reading matrix from C:/Users/john.todd/source/repos/rmatrix/driver/example.txt
Starting Matrix:
[ {-2/1} {1/3} {-3/4} ]
[ {6/1} {-1/1} {8/1} ]
[ {8/1} {3/2} {-7/1} ]

PInverse:
[ {1/1} {0/1} {0/1} ]
[ {0/1} {0/1} {1/1} ]
[ {0/1} {1/1} {0/1} ]

Lower:
[ {1/1} {0/1} {0/1} ]
[ {-3/1} {1/1} {0/1} ]
[ {-4/1} {0/1} {1/1} ]

Diagonal:
[ {-2/1} {0/1} {0/1} ]
[ {0/1} {17/6} {0/1} ]
[ {0/1} {0/1} {23/4} ]

Upper:
[ {1/1} {-1/6} {3/8} ]
[ {0/1} {1/1} {-60/17} ]
[ {0/1} {0/1} {1/1} ]
What's even more exciting is that when I committed this to GitHub and pulled it down on Linux and macOS, it also just worked (on macOS after adding the install directories to DYLD_LIBRARY_PATH, similarly to what I had to do with JNI).
Optimization
Remembering to free pointers allocated by native code isn't so bad. I had to do it in Java with the FFM API and when writing the libraries in the first place. But ChatGPT suggested an optimization to have the CLR do it automatically. After reassuring it many times that the new_*, RMatrix_get, and RMatrix_gelim native methods return pointers to newly-allocated copies of the relevant entities and not pointers to the entities themselves, it said this was the perfect application of the SafeHandle pattern. Who can pass that up?
First I wrote some wrapper classes for the pointers returned from the native code:
internal abstract class NativeHandle : SafeHandle
{
    protected NativeHandle() : base(IntPtr.Zero, ownsHandle: true) { }

    protected NativeHandle(IntPtr existing, bool ownsHandle)
        : base(IntPtr.Zero, ownsHandle)
        => SetHandle(existing);

    public override bool IsInvalid => handle == IntPtr.Zero;

    protected override bool ReleaseHandle()
    {
        NativeStdLib.Free(handle);
        return true;
    }
}

internal sealed class RashunalHandle : NativeHandle
{
    internal RashunalHandle() : base() { }

    internal RashunalHandle(IntPtr existing, bool ownsHandle)
        : base(existing, ownsHandle) { }
}

internal sealed class RMatrixHandle : NativeHandle
{
    internal RMatrixHandle() : base() { }

    internal RMatrixHandle(IntPtr existing, bool ownsHandle)
        : base(existing, ownsHandle) { }
}

internal sealed class GaussFactorizationHandle : NativeHandle
{
    internal GaussFactorizationHandle() : base() { }

    internal GaussFactorizationHandle(IntPtr existing, bool ownsHandle)
        : base(existing, ownsHandle) { }
}
Then I had most of the native and managed code use the handles as parameters and return values instead of the pointers returned by the native code:
[DllImport("rashunal", EntryPoint = "n_Rashunal")]
private static extern RashunalHandle n_Rashunal(int numerator, int denominator);

[DllImport("rmatrix", EntryPoint = "new_RMatrix")]
private static extern RMatrixHandle new_RMatrix(int height, int width, IntPtr data);

[DllImport("rmatrix", EntryPoint = "RMatrix_height")]
private static extern int RMatrix_height(RMatrixHandle m);

[DllImport("rmatrix", EntryPoint = "RMatrix_width")]
private static extern int RMatrix_width(RMatrixHandle m);

[DllImport("rmatrix", EntryPoint = "RMatrix_get")]
private static extern RashunalHandle RMatrix_get(RMatrixHandle m, int row, int col);

[DllImport("rmatrix", EntryPoint = "RMatrix_gelim")]
private static extern GaussFactorizationHandle RMatrix_gelim(RMatrixHandle m);

private static Model.CsRMatrix AllocateManagedRMatrix(RMatrixHandle m)
{
    int height = RMatrix_height(m);
    int width = RMatrix_width(m);
    var data = new CsRashunal[height * width];
    for (int i = 1; i <= height; ++i)
    {
        for (int j = 1; j <= width; ++j)
        {
            using var rPtr = RMatrix_get(m, i, j);
            var r = Marshal.PtrToStructure<Rashunal>(rPtr.DangerousGetHandle());
            data[(i - 1) * width + (j - 1)] = new CsRashunal { Numerator = r.numerator, Denominator = r.denominator };
        }
    }
    return new Model.CsRMatrix { Height = height, Width = width, Data = data, };
}
Note the switch from LibraryImport to DllImport on the native method declarations. LibraryImport is newer and generally preferred, but for some reason it couldn't do the automatic marshaling of returned pointers into handles the way DllImport can.
I didn't like the repeated boilerplate code in the concrete subclasses of NativeHandle. I wanted to just use NativeHandle as a generic, i.e. NativeHandle<Rashunal>, but that didn't work: P/Invoke signatures can't use generic types. ChatGPT said I needed a concrete class to marshal the native pointer into, and that the structs I declared in the adapter wouldn't do it. That's also why the parameterless constructors are needed (the marshaling code uses them to construct the handles), even though they don't do anything but defer to the base class. So be it.
Now there's no need to explicitly free the pointers returned from RMatrix_get, n_Rashunal, new_RMatrix, and RMatrix_gelim. There are still some places where I have to remember to free memory, such as the array of Rashunal pointers allocated in AllocateNativeRMatrix. I tried to get rid of the calls to ptr.DangerousGetHandle() when I need to marshal a pointer into a struct, but apparently they are unavoidable.
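If I wanted to at least keep DangerousGetHandle() in one place, one option (a sketch, not something in the current code) would be a small hypothetical helper on NativeHandle that reads a struct out of the handle while holding a reference on it:

// Hypothetical helper on NativeHandle: confine DangerousGetHandle() to a
// single method that copies a struct out of the native memory.
internal T ReadStructure<T>() where T : struct
{
    bool addedRef = false;
    try
    {
        // Keep the handle alive (not released) while we read from it.
        DangerousAddRef(ref addedRef);
        return Marshal.PtrToStructure<T>(DangerousGetHandle());
    }
    finally
    {
        if (addedRef)
            DangerousRelease();
    }
}

With that, the loop body in AllocateManagedRMatrix would read var r = rPtr.ReadStructure<Rashunal>(); instead of marshaling through DangerousGetHandle() at every call site.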
Reflection
After struggling so much with FFM, I was pleasantly surprised by how easy it was to work with C# and its method of calling native code. Interspersing the native calls with the managed code was pretty fun and easy, especially after refactoring to use handles to automatically dispose of allocated memory. It was a little tricky figuring out when I still had to marshal pointers into structs or vice versa, but the compiler and ChatGPT helped me figure it out pretty quickly.
So far, if given the choice of how to call my native libraries, C# and the CLR is definitely how I would do it.