Note: This might be quite obsolete by now.

As you are reading this, you are probably scratching your head and thinking, "SandStorm... That sounds familiar from somewhere...". After trying to activate some of those grey cells (they don't tend to work at 3 AM, especially after a whole day of coding and bug-hunting), you finally find out why. "Ah, HailStorm! That's where it's coming from!". Then you start to wonder what SandStorm is. Perhaps it is the code name for the next stage in Microsoft's world domination plans. Perhaps it is just another name for HailStorm. Or perhaps this is all a bad dream, and you fell asleep while coding (that's what happens when you are too lazy to make a cup of coffee).

All of these conclusions are wrong, of course (perhaps the last one isn't, though), and to prevent brain overload, I will explain what SandStorm is all about. First, its name is mostly there for marketing reasons. It has nothing to do with HailStorm; the only common property is the general aim (world domination, of course!).

Now that you know what SandStorm isn't, you can sit down, relax, and read what SandStorm is. In short, SandStorm provides a loose framework for creating scalable, complex distributed applications, using XML-RPC for communication.

So what does SandStorm consist of? SandStorm can be divided into three parts:

A) Application API
B) Library implementation
C) Standard components

Application API
===============

SandStorm's application API is pretty minimal. It consists of the registry API and the component API. The registry, a component in itself, contains a list of all the known components (except the registry itself), as well as their locations, or URLs. Component registration is provided through an XML-RPC interface, of course, and can be done either by the component itself or by a "3rd party" utility.
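To make the registry idea a bit more concrete, here is a minimal Python sketch of a name-to-URL registry and of the XML-RPC request a component (or a "3rd party" utility) might send to register itself. Note that the Registry class, the method name "registry.register" and the URL are all made up for illustration; they are not SandStorm's actual API, and nothing is sent over the network here.

```python
import xmlrpc.client


class Registry:
    """Toy in-memory registry: component name -> URL (hypothetical)."""

    def __init__(self):
        self._components = {}

    def register(self, name, url):
        # Remember where the component lives; re-registering overwrites.
        self._components[name] = url
        return True

    def lookup(self, name):
        # Raises KeyError for components that were never registered.
        return self._components[name]

    def list_components(self):
        return sorted(self._components)


def build_register_call(name, url, method="registry.register"):
    """Build the XML-RPC request body for a registration call.

    The method name is an assumption made for this sketch.
    """
    return xmlrpc.client.dumps((name, url), methodname=method)


registry = Registry()
payload = build_register_call("my.namespace", "http://localhost:8080/RPC2")

# What a server would do on receipt: decode the request and apply it.
params, method = xmlrpc.client.loads(payload)
registry.register(*params)

print(method)
print(registry.lookup("my.namespace"))
```

A real setup would put something like this behind an XML-RPC server and let clients ask the registry for a component's URL before calling it.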
The name of a component in the registry marks the method namespace it occupies, so a component named "my.namespace" means that all the methods related to that component start with "my.namespace". Each component, in addition to the methods in its "private" namespace, must also conform to Eric Kidd's introspection API, as well as to a set of standard component methods, which live in the "active.component" namespace.

Library implementation
======================

While the SandStorm spec may be used directly over XML-RPC, it is much better to have some kind of abstraction layer between the client and SandStorm, one that also simplifies the creation of SandStorm components.

The first abstraction layer was built in PHP, on top of the Useful Inc. XML-RPC implementation. Useful's implementation itself was heavily patched and modified (part of the patch was accepted into the next version of Useful's XML-RPC), and a nice abstraction layer was put on top of it. Then a wrapper was added, simplifying access to the registry and the components, with a server wrapper following closely.

Since then, a Python library has been created on top of a lightly patched xmlrpclib from PythonWare and Eric Kidd's introspection implementation. A Perl client/server library was also created, using Eric Kidd's registry and the Frontier::RPC2 module, which is not included in the package because of its many dependencies and its size (although there is a guy working on an alternative to Frontier::RPC2). There is also Ruby client code, although the XML-RPC library itself is not included.

Last but not least, there is of course support for SandML, an XML programming dialect designed as a glue language between XML-RPC services, but that will be discussed later :)

Standard components
===================

This is where all the fun starts :) Having a registry and support libraries is nice, but useless on its own.
So SandStorm comes with quite a rich collection of components. Most of them are written as CGIs; a minority are written as stand-alone components, and those mostly have CGI counterparts. I will not describe all of them, since the included API browser can quickly teach you about their APIs, but I will describe the more exotic and interesting ones.

SandStorm::Cache
================

Namespace: active.cache
Implementation: CGI, Stand-alone
Description: This component was designed to be used by dynamic content generators. One of the obstacles with dynamic content is that generating it can carry a significant overhead. Common sense suggests that the solution is to cache it, but cache management is hard, and given the nature of CGIs, it is usually done by storing the content on disk. SandStorm::Cache provides a simple API for managing your cache as hash->value pairs, no matter which language you use.

There are two implementations of the API: a simple CGI-based one, which stores the cache on disk, and a high-performance stand-alone one, which stores the cache in memory. It is usually suggested to use the CGI implementation only for testing purposes. The stand-alone component has low latency, and does things like flushing unused entries (using a TTL) and compressing big entries. It also provides an alternative API which, instead of client-given hashes, uses server-assigned ID numbers. The hash API is in fact a wrapper on top of this ID-based API, so the latter can be slightly faster.

SandStorm::Mirror
=================

Namespace: active.mirror
Implementation: CGI, Stand-alone
Description: This component provides a file transfer service (optionally deleting the source) for mirroring parts of your site. Basically, its API is pretty simple: the name of the file and the target URI, which can in theory be anything, although only FTP is currently supported. As in the case of SandStorm::Cache, there are once again two implementations which, although they conform to the same API, behave differently.
The CGI implementation is synchronous, that is, the method returns only after the file transfer is completed. The stand-alone implementation, on the other hand, is asynchronous. It has a file transfer queue and a fixed number of threads serving it. When a file transfer request is issued, it is placed on the queue, and the method returns true. The advantage of the CGI implementation is that you know when a transfer has completed, while the stand-alone implementation provides scalability. It is possible that in the future the stand-alone implementation will have an API for "record keeping", that is, transfers will still be asynchronous, but can be traced by a file transfer ID number.

SandStorm::Public
=================

Namespace: none
Implementation: Stand-alone
Description: One of SandStorm's assumptions (and in fact one of XML-RPC's as a whole) is that security and authentication should be provided by HTTP. It's a good assumption, since it simplifies things, but the problem is that XML-RPC libraries rarely support HTTP auth and/or SSL. Now, since SandStorm mostly provides the back-end, this is not so critical, as back-ends are usually behind a firewall. But what if you wish to give your clients restricted access to the application API after all? For example, you may wish to provide read-only access to SandStorm::Cache, or to your newswire component.

That's where SandStorm::Public steps in. Basically, it acts as an XML-RPC proxy for all the other components (clients do not need the registry at all). But it also provides an ACL mechanism for controlling who gets in and who doesn't. The ACL is largely inspired by IPChains. It is a table of rules, where each rule is a collection of matches. A match has a selector string, for example "auth: user pass". If the selector matches, a policy exception is raised. If the policy is one defined by the ACL interpreter, execution finishes. If it isn't, the rule table is searched for a matching rule, and when one is found, that rule is executed.
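The rule evaluation described above can be sketched in a few lines of Python. This is only one possible reading of it, not SandStorm::Public's actual code: I assume that a "matching rule" is a rule whose name equals the raised policy (much like jumping to a user-defined chain in IPChains), that "accept" and "deny" are the built-in policies, and that a request nobody matches is denied. The "method:" selector and the active.cache method names are made up for the example as well.

```python
# Policies the (imagined) ACL interpreter understands by itself.
BUILTIN_POLICIES = {"accept", "deny"}


def selector_matches(selector, request):
    """Check a selector such as 'auth: user pass' against a request dict."""
    kind, _, expected = selector.partition(":")
    return request.get(kind.strip()) == expected.strip()


def evaluate(rules, start, request, max_jumps=16):
    """Walk the rule table and return a built-in policy for the request.

    rules maps a rule name to its matches, each a (selector, policy) pair.
    """
    current = start
    for _ in range(max_jumps):  # guard against rules that jump in circles
        policy = next(
            (target for selector, target in rules.get(current, [])
             if selector_matches(selector, request)),
            None,
        )
        if policy is None:
            return "deny"  # assumption: nothing matched means no access
        if policy in BUILTIN_POLICIES:
            return policy  # defined by the interpreter: execution finishes
        current = policy  # otherwise continue with the rule of that name
    return "deny"


rules = {
    "main": [
        ("auth: admin secret", "accept"),
        ("method: active.cache.get", "readonly"),
    ],
    "readonly": [("auth: guest guest", "accept")],
}

guest_read = {"auth": "guest guest", "method": "active.cache.get"}
guest_write = {"auth": "guest guest", "method": "active.cache.set"}
print(evaluate(rules, "main", guest_read))
print(evaluate(rules, "main", guest_write))
```

With this table, the guest gets through for the read call, because "main" hands it over to the "readonly" rule, while the write call matches nothing and is denied.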
The ACL rules are described in XML, and there is another, higher-level XML dialect for a more "component-oriented" approach.

SandStorm::SandML
=================

Namespace: none
Implementation: Stand-alone
Description: This one, like SandStorm::Public, does not provide services directly. Instead, it provides a run-time environment for SandML. SandML can be defined as the "native" language of SandStorm. It allows you to create very simple components, or to glue a few simple components together into a big, complex one. A simple SandML document may look like this:

    <sml id="my.namespace">
      <method id="myMethod" desc="Just a test">
        <params>
          <param id="name" type="string"/>
          <param id="action" type="string"/>
        </params>
        <def id="toret" type="string">
          <var id="name"/>
          <var type="string"> Is </var>
          <var id="action"/>
        </def>
        <use component="my.logger"/>
        <call id="logAction">
          <var id="toret"/>
        </call>
        <return>
          <var id="toret"/>
        </return>
      </method>
    </sml>

This might seem at first more cryptic than average Perl code, but it basically describes a component with the "my.namespace" namespace. It has one method, "myMethod", which accepts two parameters, a name and an action. When calling

    my.namespace.myMethod("idan","coding")

the string "idan Is coding" is returned, and another method, "my.logger.logAction", receives the string as well. There are also some other directives in the language that I do not have the time to describe in this document.

So what does SandStorm::SandML do with this document? First, it is semi-compiled into what is called DCode, which, without going into technical explanations, runs on a sort of mini-VM with a minimal set of 7 instructions. Once the document is compiled to DCode, SandStorm::SandML registers all the SandML components and serves as a bridge between the client and the DCode VM.

It should be noted that SandStorm::SandML is currently a separate package, and can be described as alpha software.
While the SandML->DCode translation is very clean, the DCode VM probably has quite a lot of bugs, and doesn't yet handle errors well.

Summary
=======

This document is a brief introduction to what SandStorm stands for and what it offers. My next article will describe how to install SandStorm and how to write a simple client/server in Python. It will probably take a bit of time, as I first have to finish SandStorm 0.62, but I believe it will be worth the wait :)