Natural Language in Small Wide World


Sept 1, 2016

Yesterday I published a small piece of software to Small Wide World's git to very little fanfare. It was a generalization of a bad piece of software I wrote the day before. It uses NLTK to perform a simple task: parse a simple sentence which follows the form "subject verb object" with optional additional information starting with "because". Examples of this grammar include:

GnuPG is software
IRC is a protocol
software implements a protocol
Javantea is human
AI3 is software
Javantea wrote AI3
Javantea writes software
Javantea writes English
Javantea reads German
Javantea reads Japanese
Javantea reads Portuguese
Javantea reads Spanish

nlp1.py creates this graph of the relationships:

Read more »

Links

Sono.us: link shortener for the author
AltSci Network: sites by same author
AltSci Blog: Unified blog for Javantea. Search this blog and others by the same author

Read more »

Small Wide World Video Update

This blog post is a quick video with some backup data for my Small Wide World project.

Welcome to a quick update for Small Wide World. Progress on this project is moving rapidly so I just like to let you know what's going on and how awesome it is. The first improvement you've probably noticed on the website is better looking graphs all over it. That's a good sign that things are improving but that's just showing you the interesting improvements that can be made with a little bit of code. Today I wrote this code I'm going to show you. Here you see a two-node graph with a and b. It's the simplest you can get. You can see that we've added a new node onto a here. So now we have a three node network. And then you see it's the same network c-a-b and we've connected that via a to d. You can see that even though it looks different, we still have c-a-b and d and we've just connected e to a. This is a self-organizing network. It's actually very fast, it uses very few operations so it's going to become the default algorithm for the web, Python, C, and all versions. They will use this algorithm to make the initial pattern. It doesn't work perfectly for all types of graphs, but it works especially well for these types of graphs: graphs which aren't cyclic and are using lots of branches. See that we add g to b. See the angle between g-b-a, g-b-f, and a-b-f are all the same and the length of all the bonds are the same. This is a very easy technique that shows a graph effectively. This makes it so that you can understand the graph more quickly — or at least that is the idea. We've added h, we've added i, we've added j, and k. Then you see we've added l to d, so in order to make a-d-i, i-d-l, and l-d-a equal angles we turned d-i-k sideways. This is just another one rotated because we've added m to a here. You can see n added, we're getting close to where nodes start to overlap with each other. That's one limitation of this algorithm, but I can pretty confidently say that solving that is a job for a more expensive algorithm, for example global optimization. Once this algorithm has finished you can run a global optimization algorithm on it and it will have better success than just a random graph. You can see that k, h, and c are a little crowded, so an optimization algorithm would push these away from each other because k and h are too close to one another. We're almost to the point where it's going to fail. And here we see that it fails with c-h overlapping p-q and c-s overlapping i-k. In order to fix that, an optimization algorithm would only have to move p and q down and i and k up. How much computation does that require? Local minimization algorithms would probably be able to do this in a few million operations (less than a second but far more than this algorithm takes), but if they couldn't properly solve the collision, you'd have to use a global optimization algorithm like basin-hopping or Monte-Carlo Metropolis to finish the job. This could take seconds or even minutes. Small Wide World currently implements basin-hopping using SciPy which I will be testing thoroughly against graphs like these.

Read more »