(A cheesy homepage for Justin Collins)
gdbm for JRuby

Motivation

(Warning: boring story ahead. Skip to next section if you want.)

Quite a while ago, I thought it would be kind of nice if my Ruby MUD server would run on JRuby, since JRuby might be better at dealing with long-running server processes. Unfortunately, at the time, EventMachine did not have a Java version, so I decided to wait for that to work.

A while later, there was some kind of Java release for EventMachine. I tried it out, but then ran into another problem: gdbm is provided as a C extension for the standard Ruby distribution. So, I gave up again.

Then, along came Ruby FFI and I thought, “Hm…wonder if I could use that…” and then I gave up because I am not very good at C, even just trying to read it.

THEN, I saw this blog post about the next version of JRuby (1.4) and decided maybe I should do something about it, after all. So I did.

ffi-gdbm

The result is a gdbm library which should be fully compatible with the standard C extension distributed with Ruby, including reading and writing gdbm files which can be read/written by either implementation.

Requirements

Although I had hoped to make this library compatible with any Ruby FFI implementation, unfortunately it only works with JRuby, and it needs at least version 1.4. Perhaps this will change as the other Ruby FFI implementations get fancier.

Installation

You can check out ffi-gdbm on GitHub. All you actually need is the gdbm.rb file.

Alternatively, you can install it as a gem:

jgem install gdbm --source http://gemcutter.org

Make sure you require "rubygems" if you do use the gem version.

Usage

You can use this library just as you would the standard library that comes with Ruby. See here for documentation.

Aftermath

Unfortunately, the Java version of EventMachine is still incomplete, so it’s a bit of a letdown, although kams does work at least nominally using it. Hopefully, EventMachine will one day be completely stable under JRuby.


What I have enjoyed lately

Lately, there has been a particular activity which I consistently look forward to with excitement. That is writing libraries for my own programming language. Even if it is just wrapping existing libraries, there is something really cool about enabling a language to do “real world” type things, such as communicating with other processes or generating HTML or making a full-blown GUI.


Announcing the Fledgling Languages List

Since I have started working on my own little programming language, I have repeatedly made attempts to take a look around at other little languages that people may be working on. However, either my Google-fu is weak or it is just difficult to find these little guys.

Part of the difficulty is that most people do not want to see your little half-finished, half-working language. A recent discussion on Reddit showed that people can be very reactionary to such languages. Now, there is Esolang, but that site is more focused on languages that are purposefully unusable.

Thus, I have created a new site to be the home of fledgling languages, appropriately entitled The Fledgling Languages List. Anyone is welcome to submit their own language or languages they know about for listing. Each language has its own little page with a description and people can leave comments or feedback for each language.

I am very open to suggestions or feedback concerning the site, especially if something is broken.


Motivation

Soon I will be beginning my fourth year of graduate school. The thought of that makes me a bit nervous. With only a single paper published and the summer rapidly slipping away, I am a bit apprehensive about my progress. While I do have a good idea of my research area and am at least getting a bit of stuff down for it, I worry that I will be slow in getting stuff done, and therefore not have time to do something I can be proud of. My hope, truly, is to be able to do something that matters. Otherwise, what is the point?

I have recently seen a couple videos which have gotten me a little more excited about things, though. The first was this interview with Neil deGrasse Tyson. He talks a considerable amount about how he has managed to get to where he is now (well-known astrophysicist). The other video is this one of Leonard Kleinrock (“Father of the Internet”) giving a talk entitled “My Life and My Work.” Really interesting and inspiring, in my opinion. (You can download it from here like I had to, since it kept stopping and restarting on me.)

I really wish to do “big stuff,” something that will matter. But it seems most of the people around me just want to graduate, get their degree, and get out of here. Of course, so do I, but I would like to be leaving while having accomplished something of importance, even if it is just a little bit of importance.


Operator Precedence and the Shunting Yard Algorithm

Previously, I had pretty much avoided implementing any kind of operator precedence for Brat and covered up my reluctance by saying that was just how it was. So binary operations were strictly left-to-right affairs. But that makes things like 1 + 3 * 4 very disappointing. Since I keep feeling like Brat should at least pretend to be a real language, I decided I really needed to do something about this.

What is so hard about it? It doesn’t seem difficult at first. It seems like there ought to be a nice recursive solution to the problem. And there probably is, but if you are using a typical recursive descent parser, this recursion will not be possible.

There are a few ways to go at this issue, but I wanted the simplest and easiest to deal with. A while ago, I had been thinking about this and ran across a reference to the shunting yard algorithm. At the time, I couldn’t really see how it was related. This time around, I looked at it a bit more closely.

The purpose of the algorithm is basically to convert expressions which use infix notation (typical math notation with operators in between their arguments) into Reverse Polish Notation. Why would you want to do that? Well, because RPN is explicit about which order operations are evaluated in. It is also pretty straightforward to implement (which is the important part, really).

Let’s take a look at the two main parsing rules, written in Treetop, which deal with binary operations. I have elided the related code:

  rule binary_operation
    lhs:binary_operation_chain rhs:expression { ... }
  end

  rule binary_operation_chain
    (lhs:binary_lhs_expression ~space op:operator ~space)+  { ... }
  end

The first rule will match any binary operations. The second rule is a helper, which matches a list of binary operations, all except for the far right expression. There may have been some explicit reason why I separated these before, but it remains convenient in the code to do so.

The second rule will essentially return a list of alternating values and operators, just what the shunting yard algorithm needs. (Well, actually it’s simpler than the algorithm needs.) This list will be fed into the algorithm, which will spit out a new list of values and operators, now in the proper RPN form.
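To make the conversion step concrete, here is a minimal sketch in Ruby. It is not Brat's actual code. It assumes the input is a flat list that alternates value, operator, value, with operators written as Ruby symbols, every operator binary and left-associative, and no parentheses. The `PRECEDENCE` table and the `precedence` and `to_rpn` names are made up for illustration.

```ruby
# Hypothetical precedence table: a higher number binds tighter.
PRECEDENCE = { :* => 2, :/ => 2, :+ => 1, :- => 1 }.freeze

# Operators missing from the table get the lowest precedence (0).
def precedence(operator)
  PRECEDENCE.fetch(operator, 0)
end

# tokens alternates value, operator, value, ... and starts and ends
# with a value, so the position alone tells values from operators.
def to_rpn(tokens)
  output = []
  operators = []

  tokens.each_with_index do |token, index|
    if index.even?
      # Values go straight to the output.
      output << token
    else
      # Left-associative: first flush stacked operators that bind
      # at least as tightly as the incoming one.
      while operators.any? && precedence(operators.last) >= precedence(token)
        output << operators.pop
      end
      operators << token
    end
  end

  # Whatever is left on the stack comes off top first.
  output.concat(operators.reverse)
end

to_rpn([1, :+, 3, :*, 4])  # => [1, 3, 4, :*, :+]
```

Because there are no parentheses or function calls to track, the operator stack is the only bookkeeping needed, which is why this version is shorter than the textbook algorithm.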

The parser then “evaluates” the RPN expression. It essentially runs through the algorithm for evaluating RPN, outputting code to evaluate expressions and using temporary values. In other words, the RPN algorithm is not explicitly emitted in the code, but the intent is in the order that things are evaluated.
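Roughly, that "evaluation" could look like the following Ruby sketch. It walks an RPN list with a stack, but instead of computing a result it emits one line of code per operator and pushes the name of a temporary. The output syntax, the `t1`, `t2` temporary names, and the `emit_rpn` function are all hypothetical, and operators are assumed to be Ruby symbols so they can be told apart from values.

```ruby
# Walks an RPN list and returns [lines_of_code, name_of_result].
# Operators are Symbols; anything else is treated as a value.
def emit_rpn(rpn)
  stack = []
  lines = []
  temp_count = 0

  rpn.each do |token|
    if token.is_a?(Symbol)
      # The right-hand side was pushed last, so it pops first.
      rhs = stack.pop
      lhs = stack.pop
      temp_count += 1
      temp = "t#{temp_count}"
      lines << "#{temp} = #{lhs} #{token} #{rhs}"
      # The temporary stands in for the sub-expression from here on.
      stack << temp
    else
      stack << token.to_s
    end
  end

  [lines, stack.pop]
end

lines, result = emit_rpn([1, 3, 4, :*, :+])
# lines  => ["t1 = 3 * 4", "t2 = 1 + t1"]
# result => "t2"
```

The emitted lines never mention RPN at all. The precedence survives only in the order the temporaries are computed.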

The whole thing ends up being about 50 lines of code, which is far less than I thought I would be looking at for this.

Anyhow, just wanted to share. I’m pretty excited that Brat has this feature now, especially since it can be implemented in such a flexible way. Operator precedence is stored in a hash table which just associates each operator with an integer representing its precedence level, so it is quite simple to add more operators (operators not in the table are just given the lowest precedence, though, so arbitrary operators are still completely supported).


My weird project setup

When I started with kams, it was just going to be my own little MUD server that I puttered around with. So I set up a little Subversion repository and went at it. That repository slowly turned into my main server for the game. That way, I could do version control and keep a backup copy of everything. That worked pretty well for quite a while, until the decision was made to release some code. To do so, I created a branch, stripped out all of the stuff I didn’t want publicly released, and then released that code. Well, that is the way it was supposed to work. Actually, I had to fix things along the way as I was preparing the release code. This meant copying/merging back and forth between the branch and the trunk versions of the code.

This became annoying.

Now that I have been using Git for a while, I was hoping to do something so that I can continue on my current path (keeping my private stuff private, but in the Subversion repository) while also maintaining an ongoing public code repository.

To compound the issue, I (very foolishly?) do most of the development work on the live server. (Keep in mind, no one is actually using the live server, so this really does not matter.)

This is what I came up with. First, I created a GitHub repository for the public code. Then I copied my latest branch into the Git repository and committed it. Next, I created a new branch, deleted all the files, and then did an svn checkout to the new branch, and finally committed the private files to that Git branch.

Now what I can do is make code updates on my trunk code (kept in Subversion), then do an svn update on the private Git branch, then do a Git commit of the specific changes I want put in the public repository, then do a git cherry-pick of that commit into my master branch. Finally, a git push puts it all up on GitHub.

This works in reverse, as well. Changes made in the Git repository can be propagated to the private branch, from there to the Subversion repository of the trunk.

Works fairly well, actually, and it will make it considerably easier to create future releases. Yay.


Squish-ins in Brat

As I mentioned before, there is no real reason or goal behind developing Brat. I just saw Potion and decided I wanted to be cool, too.

What that means is that features in Brat are sort of based on two things: 1) How easy is it to implement in Neko and 2) Is it something I expect to have because I am spoiled by Ruby.

For example, closures/functions in Brat started off being identical to Neko’s, because obviously that was the easiest thing to do. And I hadn’t planned on taking the ‘prototyping’ approach, but it was easy and is likely the brattiest way to go.

Anyhow, when I was doing stuff for array objects, I was thinking I should really copy Ruby’s approach, which is to have an enumerable module and then mix it in to array and hash and so on. But, of course, that would mean implementing mixins somehow. I knew I didn’t want to add in any new things (like modules) and it’s all about the lowest resistance, so I created squish-ins.

I have no idea if this concept has been used before (probably), but essentially you can take an object and copy all the methods of another object into it.

a = new
b = new
a.something = { p "do something" }
b.squish a
b.something  #prints "do something" 

Note that I said it “copies” the methods from the other object. This is and isn’t true. Squishing in another object’s methods does not create copies of them, only references. However, it is a reference to the object’s actual methods at the time of the squishing. In other words, this is a copying action rather than an inheritance or subclassing action. If the squished-in object’s methods change, the…squisher(?) will still reference the old methods.

a = new
b = new
a.something = { p "do something" }
b.squish a
b.something
a.something = { p "now do something else" }
b.something   #still prints "do something" 

So we get the code-sharing of mixins but not the inheritance properties. For now, I’m fine with that. Of course, it is actually not that difficult to do both:

a = new
b = new
a.something = { p "do something" }
b.something = { a.something }
a.something = { p "now do something else" }
b.something   #prints "now do something else" 

I am not sure which is preferable.
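For comparison, here is a rough Ruby model of the two behaviors. It is only an analogy, not how Brat implements objects: the `BratLikeObject` class and its `define`, `call`, and `squish` methods are invented for this sketch, and each object is just a hash from names to blocks.

```ruby
# A toy prototype-style object: a table of name => block.
class BratLikeObject
  def initialize
    @table = {}
  end

  # Attach (or replace) a method on this object.
  def define(name, &body)
    @table[name] = body
  end

  def call(name, *args)
    @table.fetch(name).call(*args)
  end

  # Squish-in: copy references to the other object's methods
  # as they exist right now. Later changes to `other` are not seen.
  def squish(other)
    @table.merge!(other.table)
    self
  end

  protected

  attr_reader :table
end

a = BratLikeObject.new
b = BratLikeObject.new
a.define(:something) { "do something" }
b.squish(a)
a.define(:something) { "now do something else" }
b.call(:something)  # => "do something"

# Delegation instead of squishing: look the method up on `a` at call time.
c = BratLikeObject.new
c.define(:something) { a.call(:something) }
c.call(:something)  # => "now do something else"
```

The squished object holds a snapshot of the method table, while the delegating one follows whatever `a` currently has, at the cost of writing one forwarding method per name.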


First C Extension Done (sorta)

I’ve been working on Brat quite a bit, and I remembered that I wanted to add in BigNum support. There is a GNU library for this called gmplib which seemed fairly straightforward. In fact, it was considerably more straightforward than connecting it up with Neko.

By the way, I realized I have never done anything in C. My undergraduate courses started off with C++, but I have never had to use plain old C.

Anyhow, this project required wiring up three things: gmplib to Neko, then Neko to Brat. gmplib to Neko required creating, manipulating, and wrapping up C code and values using Neko’s FFI. Then I needed to replace all the math code I had written for Brat (in Neko). The last step (unfinished as of yet) will be to seamlessly switch between using Neko’s number values and the structures used by gmplib. There is no point in wasting time with BigNums when you are doing 1 + 1.

Neko FFI

First of all, everything passed in or out of Neko will be a value. This is a Neko datatype, which basically contains a kind and the data itself. There are several kinds that are predefined (basically, a kind is a type), but you will likely need to make your own to pass C values in and out of Neko. This is what the top of my number.c file looks like:

#include <stdio.h>
#include <neko.h>
#include <gmp.h>

DEFINE_KIND(k_mint);
DEFINE_KIND(k_mfloat);

I include the proper header files and then define two kinds, one for integers and another for floats.

I then made a couple functions (one for integers, one for floats) to allocate and wrap up the proper structures for me. I think this is probably the way it would typically look when using Neko’s FFI:

value new_float() {
        mpf_t * new_float;
        new_float = (mpf_t*)alloc(sizeof(mpf_t));
        mpf_init(*new_float);
        value store = alloc_abstract(k_mfloat, new_float);
        val_gc(store, free_float);
        return store;
}

The gmplib structures need to be cleared when you are done with them, so I have a function which is used as a ‘finalizer’, which is assigned as shown above. When the Neko value is garbage collected, it first calls this function:

void free_float(value floatval) {
        mpf_clear(val_data(floatval));
}

This function demonstrates another item: getting the data out of a value. This is performed using val_data, which returns whatever data you have wrapped up in the Neko value (using alloc_abstract).

Most of the functions I wrapped up are pretty straightforward:

value float_add(value float1, value float2) {
        value result = new_float();
        mpf_add(val_data(result), val_data(float1), val_data(float2));
        return result;
}

To make these functions available to Neko, all you have to do is declare them to be ‘primitives’ and provide the number of parameters:

DEFINE_PRIM(num_add, 2);

Note that all values returned to Neko need to be converted into Neko values, even things like ints and booleans. This is fairly simple, though, as the API provides functions for these, as well as for checking the types of values passed in from Neko. For example, here is a simple boolean check:

value is_int(value num) {
        return alloc_bool(val_is_kind(num, k_mint));
}

Once I got the hang of it, wrapping up the library functions wasn’t too hard. But good thing I have a fairly large test suite to make sure I’m doing everything right! This is the first project where I have used tests like this, but I believe I understand why people are stressing tests so much. It’s amazing how much they help, both in finding bugs and in providing some peace of mind.


Find largest Mandriva packages

This is more to remind myself how to do this, but this is pretty cool (from here):

rpm -qa --queryformat '%{name} %{size}\n' | sort -n -k 2 | column -t

That lists your installed packages by size, largest last. Very nice if you just want to get rid of large packages you no longer need.


More about Brat

Now, I won’t claim that Brat is a well-planned, coherent language, but I have tried to keep a few things in mind.

1. Common things are shorter, simpler

For example, in Python you pass a function as a parameter by removing parentheses (meth), but invoke it using parentheses (meth()). In Ruby, this is a very weird process and makes named callbacks a bit of a pain (easier to pass a block which calls the method instead). But I invoke methods more often than I pass them as parameters, so a bare variable name means to invoke the method, while an arrow (->meth) means to use it as a value.

Another example. In Ruby, you use puts and gets to print a line (plus newline) and get a line (with the newline attached). In Java, this is System.out.println and…well, input is more of a pain. In Brat, you use p and g, and g strips the newline by default. The less common case (wanting to print without a newline) is just print.

Yet another example. If you want to do string replacement in Ruby, you use gsub to replace all matches, and sub to replace just the first. But I have pretty much never used sub, so in Brat sub replaces all matches and sub_first just does the first (and is clearer about its semantics, too).
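For reference, the Ruby side of two of these comparisons looks like this. The `shout` method is just a made-up example; `Object#method`, `String#gsub`, and `String#sub` are standard Ruby.

```ruby
def shout(text)
  text.upcase + "!"
end

# A bare name invokes the method, so passing it around as a value
# means going through Object#method first.
callback = method(:shout)
callback.call("hi")               # => "HI!"
%w[a b].map(&method(:shout))      # => ["A!", "B!"]

# The more common workaround: a block that calls the method.
%w[a b].map { |word| shout(word) }  # => ["A!", "B!"]

# gsub replaces every match, sub only the first one.
"a-b-c".gsub("-", "+")            # => "a+b+c"
"a-b-c".sub("-", "+")             # => "a+b-c"
```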

2. As few special cases as possible

In Brat, there are only objects and functions (and I was seriously considering making functions be objects, too, but it was too much of a pain and ruined some things). That is it. There are no keywords, and only a few “special” symbols, like ->. Instead of having separate literals for arrays and hash tables, they share the [] syntax. Operators work like in Ruby, where they are actually converted to method calls, but in Brat all operators are treated equally. Unfortunately, this means math is a little bit of a pain. I’m still thinking about how to make that better.

3. Provide as much flexibility as possible

This is the reasoning for how binary (and unary) operators work: you can define pretty much any operator you want, as long as it only contains symbols. And there are no keywords you might run into conflicts with (although you might overwrite default built-in functions/methods).

Objects in Brat are not really constrained in any way. You can even share or swap methods between objects. It uses prototyping instead of static (or fairly static) class hierarchies, which I am still getting used to.

4. Worry about performance later

I mean, seriously. I am not going to worry about this until there is more than one person using the language.


Actual Brat Release

I’ve been working on my little language a bit more lately, instead of doing work I really should be doing.

Now that Brat is working pretty well, I thought I would make a release. It’s basically just a snapshot of the current Subversion repository, but I know people would rather download a tar file than use SVN to check something out. It is only for Linux at the moment, but maybe I will get it working on Windows sometime. It should work with Ruby 1.8.6, 1.8.7, and 1.9.

Alright, on to language stuff. I recently changed the scoping rules to be more what a person might expect: any scope can access its outer scopes (rather than before, which used the Neko default of copying the outer scope). This makes the toplevel variables something like globals, except not quite.

Example:
a = { x }
x = 1
p a   #Error - x was not defined when a was
b = { x }
p b   #Prints '1'
c = { x = x + 1 }
c
p b   #Prints '2'
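
Ruby's lambdas happen to behave in a similar way, which may help as a point of comparison. This is only an analogy in Ruby, not a description of how Brat implements scoping, and `scoping_demo` is a name made up for the sketch.

```ruby
def scoping_demo
  results = []

  # x is not a local variable yet, so Ruby treats it as a method call.
  a = -> { x }
  x = 1
  begin
    a.call
  rescue NameError
    results << :error   # x was not defined when a was
  end

  # Defined after x, so this lambda sees the outer x.
  b = -> { x }
  results << b.call

  # Assigning inside the lambda updates the outer x, not a copy.
  c = -> { x = x + 1 }
  c.call
  results << b.call

  results
end

scoping_demo  # => [:error, 1, 2]
```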

And recursive functions no longer need to be attached to an object:

rec = { x | false? x < 1, { p x; rec(x - 1) } }
rec 10

Bad news is that my current approach does not do a good job of releasing memory. But that is a problem for another day.

Of course, many bugs large and small have been fixed along the way. I hope to at least get this language to the point where I use it regularly. I think it is an interesting and probably odd mix of object-oriented and functional programming.

Also, I set up a discussion group in case Brat generates any attention.


It’s hard out here for a white guy

Okay. This is a serious post.

I am a white guy. I self-identify as one. Of course, I doubt I am completely white (I am not even sure what that means, really), but I look like I am. This means I am pretty much the worst person to talk about race and ethnicity and skin color and whatever, because I guess white guys have done enough talking about that and they mostly suck at it.

Alright, that first paragraph came out a little weird. Let me instead begin with a story. The “world” in which I live is pretty white. I know that is the case because I would notice if it weren’t. I noticed the other day when my girlfriend and I went to an area we don’t go to very often and I noticed a lot of people around were black. I say “black” the way I say I am “white,” to indicate that their skin or my skin is what is mostly accepted as being one or the other.

So it occurred to me, as I noticed that there were a lot of black people around, that I really wish I didn’t notice that. Today my girlfriend and I went to a restaurant and I believe there was one other white person there, the rest were Asian. I wish I didn’t know that. I don’t want to care. I don’t want my girlfriend (she isn’t white) to notice that most of the people around are white. I don’t want to notice that a lot of the people around where I live are Hispanic.

What do I mean by that? I mean I wish I lived somewhere where that was normal. Yes, it is true that I am getting better. I am actually amazingly glad that my girlfriend has a different culture from my own. It has given me (I hope) a much better perspective on things, especially “race” (not sure what that is, really). But the fact is that I immediately notice when I am surrounded by people who do not look like me. And I begin to wonder if that is what other people feel like, too. Is that how my girlfriend feels most of the time? (I asked, she said it was.)

But what is a white guy to do about it? Perhaps I am too self-conscious, but doesn’t everything white people do seem to come off as either pretentious (“I am so much better than you, so I am going to help you”) or like a wannabe (like, I wannabe a gangsta)? Maybe it comes off other negative ways, too. But sincerity…I do not know how to do that. I mean, I know how to be sincere. I am sincere. But I do not know how to show that. And I know I am immediately critical of others trying to do the same. What is wrong with me?

When are we all going to be able to just laugh at all this? At how unsophisticated our treatment of race and ethnicity and culture really is today? I really hope it is soon.


Twitter…sigh

As if I don’t have enough of a useless online presence, I have succumbed and joined Twitter.

I even added a little thing over on the sidebar to show my updates. Nifty.


In search of Earth

Maybe I have just been reading too much Jack McDevitt lately, but this makes me excited. We’re launching a spacecraft which will be searching for Earth-like planets within our galaxy. Just finding another planet that we might one day be able to walk around on without spacesuits…awesome.


Does this mean..?

Does this mean Brat will never be taken seriously?

a_!?-*+^&@1~\\><$ = new
a_!?-*+^&@1~\\><$.-!_+~%~+_!- = {d0~!@><?&&<>\/+-*^&% | d0~!@><?&&<>\/+-*^&%}
a_!?-*+^&@1~\\><$.@==||------> = { a_!?-*+^&@1~\\><$ }
a_!?-*+^&@1~\\><$.<------||==@ = { ->@==||------> }
@==||------>a_!?-*+^&@1~\\><$ -!_+~%~+_!- a_!?-*+^&@1~\\><$.<------||==@