Feature Exchange - Better FFI

Background

The biggest new part of LiveCode 8 is LiveCode Builder, a new English-like programming language designed to extend the feature set of LiveCode itself. LiveCode Builder has been designed so that it can eventually replace C++ as the implementation language for all the LiveCode features you use and enjoy today.

Compared to LiveCode Script, LiveCode Builder is a stricter language with a more expressive type system, allowing it to be used to build the very high-level features you use every day. It has been designed to ensure that, in the future, LiveCode Builder code can be compiled to native code with similar performance to C++ code that does the same thing.

One of the key features of LiveCode Builder is its ‘foreign function interface’, or FFI. A ‘foreign function’ is a function which has been written in C and compiled into a loadable shared library (or DLL, on Windows). LiveCode Builder lets you define ‘foreign handlers’ that let you call these functions directly from LiveCode Builder code.

This feature is used extensively in the implementation of LiveCode Builder itself! Most of the features exposed through syntax in LiveCode Builder are implemented in C++ (with C function signatures) and then referenced from the LiveCode Builder standard library modules. The ability to hook into foreign code in this fashion enables LiveCode Builder to be used to implement very high-level features (which you access using LiveCode Script) which directly use operating system APIs or pre-existing third-party libraries.

Foreign functions are designed to be called by the language they are implemented in. This means that they know nothing about the high-level types which LiveCode Builder uses, and which you are familiar with from using LiveCode Script. Making such functions easily usable requires a bridge between the types of value which Builder understands and the types which the foreign language understands.
A good example of this can be seen by considering perhaps the most used value type you see in LiveCode Script and Builder: the string. In the LiveCode world a string is a sequence of characters which you can easily manipulate; you don’t have to worry about representation or memory management, as the LiveCode VM takes care of all of that for you. In contrast, many C functions expect strings as a sequence of bytes or shorts with either a separately specified length or a terminating NUL byte.

So, if you have a C function which expects a string as a parameter, you need to map the high-level string abstraction LiveCode uses to the specific kind of string that particular C function understands. The C function might also want to ‘keep’ the string you give it rather than just look at it temporarily, and this means ideas of lifetime and ownership of the underlying representation come into play.

In addition to these considerations, the conventions in terms of the actual type of string a C function might expect, or what it expects to do with it, can’t be detected automatically by looking at the C function! The information often only exists elsewhere, such as in the reference documentation for that particular function, or the documented conventions established for the library as a whole.

The mapping of types of value from one programming language’s world to another programming language’s (or ‘bridging’) is perhaps the hardest problem to solve when you want this level of interoperability.

Currently, LiveCode Builder tries to solve the bridging problem to some extent, but the reality is that you still have to do a lot of work to utilise all but the simplest of C functions. The foreign function interface in LiveCode Builder is perfectly usable, but it is still a lot harder to use than we’d like, and a lot harder than it should be in a programming language which bears the name LiveCode.
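To make the string example concrete, here is a minimal C sketch of one such mapping: turning a length-counted string into the NUL-terminated form many C functions expect. The `counted_string` type and the `counted_to_c_string` function are hypothetical names invented for illustration; they are not LiveCode types or APIs, and a real bridge would also have to deal with text encoding.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a high-level string: bytes plus an explicit
 * length, with no terminator. This is not a real LiveCode type. */
typedef struct {
    const char *bytes;
    size_t length;
} counted_string;

/* Copy a counted string into a freshly allocated NUL-terminated buffer.
 * Returns NULL if allocation fails, or if the input contains an embedded
 * NUL byte, which a NUL-terminated C string cannot represent.
 * The caller owns the result and must release it with free(). */
char *counted_to_c_string(counted_string s)
{
    if (s.length > 0 && memchr(s.bytes, '\0', s.length) != NULL)
        return NULL;

    char *copy = malloc(s.length + 1);
    if (copy == NULL)
        return NULL;

    if (s.length > 0)
        memcpy(copy, s.bytes, s.length);
    copy[s.length] = '\0';
    return copy;
}
```

Even this small helper has to settle the questions raised above: which representation the callee wants, what happens to values that representation cannot express, and who is responsible for freeing the copy.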
It is this ‘bridging’ problem that we would like to solve, and which has prompted this Feature Exchange proposal.

Bridging to C

Here are some examples of C APIs and how they ideally would map to Builder.

libuuid - libuuid_generate

This really useful function generates a UUID and is present on most UNIX-based systems in a library called ‘libuuid’. In C, the signature (description of parameter types and return type) of ‘libuuid_generate’ is:

    typedef unsigned char uuid_t[16];
    void libuuid_generate(uuid_t uuid);

These two lines of C basically do the following:

● Declare a new type ‘uuid_t’ which is a sequence of exactly 16 ‘unsigned chars’.
● Declare that the ‘libuuid_generate’ function takes a parameter of type ‘uuid_t’ and has no return value.

This sounds all well and good, and surely that should be all you need to know to be able to call the function from a higher-level language such as Builder. Unfortunately, it is not! There are two pieces of missing information.

1. What is the content of those 16 ‘unsigned chars’? It is a matter of informal convention that you typically use ‘unsigned char’ in C to denote a byte, and as such the ‘uuid_t’ type is most likely describing a sequence of 16 bytes. However, this is just an informal convention: whether this is the case or not depends on what the intent of the function is and the conventions that the writers of libuuid decided on. They could equally well have chosen ‘char’ and not ‘unsigned char’ and it would be just as correct. The intended meaning will be part of the documentation of the function within the library. It is not part of the language syntax itself.

2. What does the function actually do with its uuid parameter? Does it want to read from it, does it want to write to it, or does it want to do both? Again, this is something which is only explained in its documentation. In this case it is quite easy to figure out.
Because this is generating a UUID, it would be reasonable to expect that the function will write a new UUID to the parameter it has been given.

With the actual C syntax and these two extra pieces of information, it becomes clear that the ideal Builder handler would look like:

    handler libuuid_generate() returns Data

Once you understand both sides (the handler we’d like in Builder, and the C function definition with its extra information), you can work out how to ‘glue’ the two sides together so Builder can call the C function seamlessly and safely.

libsqlite - sqlite3_prepare

This complex function prepares a piece of SQL for execution by SQLite. It has the following C signature:

    typedef struct sqlite3 sqlite3;
    typedef struct sqlite3_stmt sqlite3_stmt;

    int sqlite3_prepare(
        sqlite3 *db,
        const char *zSql,
        int nByte,
        sqlite3_stmt **ppStmt,
        const char **pzTail);

There are several pieces of extra information which can be found in the documentation for the function:

● The zSql parameter should be a string which has been encoded using UTF-8.
● The nByte parameter is the maximum number of bytes to consider in the zSql parameter, up to the first zero byte (if present).
● ppStmt is the memory address where the created sqlite3_stmt reference should be stored.
● pzTail is the memory address where the point in zSql that the parser got to should be stored, because the function only compiles the first SQL statement in zSql.
● The return value is an error code. If it is SQLITE_OK then it succeeded, and otherwise it failed and the return parameters will not have been touched.

The trickiest part here is the pzTail return value. At the C level, this return value will be a memory address inside the zSql string, but since you don’t really want to be dealing with memory addresses directly, this needs to end up being an offset (in characters) into the input SQL string.
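The pointer-to-offset conversion described above can be sketched in plain C. The helper name `utf8_offset_in_chars` is hypothetical, and the sketch assumes that ‘characters’ means Unicode code points and that the buffer holds valid UTF-8; a real bridge might need a different notion of character.

```c
#include <stddef.h>

/* Count the UTF-8 encoded code points between `start` and `tail`, where
 * `tail` points into the same buffer at or after `start` (as pzTail does
 * relative to zSql). Continuation bytes have the bit pattern 10xxxxxx and
 * are skipped, so each code point is counted once, at its lead byte.
 * Assumes the buffer holds valid UTF-8. */
size_t utf8_offset_in_chars(const char *start, const char *tail)
{
    size_t count = 0;
    for (const char *p = start; p < tail; p++) {
        if (((unsigned char)*p & 0xC0) != 0x80)
            count++;
    }
    return count;
}
```

For pure ASCII SQL the result equals the byte difference `tail - start`; the two only diverge when the consumed statement contains multi-byte characters.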
Taking all this information into account, a reasonable Builder signature for the function could be:

    handler sqlite3_prepare(
        in db as Pointer,
        in zSql as String,
        out ppStmt as Pointer,
        out pzTail as Integer) returns Integer

Of course, this still leaves the question of how to glue the two levels (Builder and C) together.

Windows API - GetCurrentDirectory

The conventions used in different foreign libraries vary, and many have their own specific patterns which are usually uniformly applied. A good example of this is the way the Windows Win32 API typically handles returning variable-length strings. Consider the GetCurrentDirectoryW function from the Win32 API:

    typedef uint32_t DWORD;
    typedef wchar_t *LPWSTR;
    DWORD GetCurrentDirectoryW(DWORD nBufferLength, LPWSTR lpBuffer);

As before, there are a few bits of further information you need to understand how this function works:

● The caller must pass in lpBuffer a buffer big enough to hold the result, including the terminating NUL char, and pass its length in nBufferLength.
● The caller can determine the size of the buffer needed by calling the function with nBufferLength of 0 and lpBuffer as NULL.
● If the function fails, 0 is returned and the error code can be obtained by calling GetLastError(); otherwise it returns the number of chars written into the buffer.

There is a pattern here which is used a lot in the Win32 API. You call an API which returns a variable-length buffer with 0 for the length and NULL for the buffer, and the function tells you how big the buffer needs to be. This essentially leads to a ‘double-call’ approach for many of these APIs: you first call them to find out how much memory you need to allocate, and then you call again to actually get the result.
Thus, a nice Builder signature for this function would be this:

    handler GetCurrentDirectoryW(out rDir as String) returns Boolean

We would want the need to call the foreign function twice to be hidden at the Builder level, and the return value to simply indicate whether the call was successful.
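As a rough illustration of what such glue has to do, here is a C sketch of a wrapper that hides the double call and reports only success or failure. So that it runs on any platform, it calls `mock_get_current_directory`, a hypothetical stand-in that imitates the calling convention described above and returns a fixed, made-up path; it is not the Win32 function, and the wrapper name `get_current_directory` is likewise an assumption.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <wchar.h>

/* Hypothetical stand-in imitating the convention described in the text.
 * It is NOT the Win32 function and always reports a fixed, made-up path. */
static uint32_t mock_get_current_directory(uint32_t nBufferLength, wchar_t *lpBuffer)
{
    static const wchar_t dir[] = L"C:\\Projects";
    uint32_t needed = (uint32_t)(sizeof dir / sizeof dir[0]); /* includes NUL */

    if (lpBuffer == NULL || nBufferLength < needed)
        return needed;              /* buffer too small: report required size */

    wmemcpy(lpBuffer, dir, needed);
    return needed - 1;              /* chars written, excluding the NUL */
}

/* Hide the two calls behind one function that only reports success.
 * On success, *out_dir is a malloc'd NUL-terminated string that the
 * caller owns and must release with free(). */
bool get_current_directory(wchar_t **out_dir)
{
    /* First call: ask how large the buffer must be. */
    uint32_t needed = mock_get_current_directory(0, NULL);
    if (needed == 0)
        return false;

    wchar_t *buffer = malloc(needed * sizeof *buffer);
    if (buffer == NULL)
        return false;

    /* Second call: actually fetch the string. */
    uint32_t written = mock_get_current_directory(needed, buffer);
    if (written == 0 || written >= needed) {
        /* Failed, or the required size grew between the two calls. */
        free(buffer);
        return false;
    }

    *out_dir = buffer;
    return true;
}
```

The wrapper still leaves the caller with a raw wide-character buffer; a Builder-level bridge would additionally convert that buffer into a String and free it, so that none of the memory management is visible to the script author.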