Psty

Psty is a little server you run on your desktop as a sidekick to Pigshell. While pigshell can be used standalone, psty (the p is silent, as in Psmith) adds so much to its power and reach as to be practically indispensable.

Psty implements three services:

  1. Local file server: Psty exposes a local directory of your choice to pigshell, letting you read and write files stored on your desktop from pigshell.
  2. HTTP proxy server: Psty proxies HTTP requests for pigshell, giving it access to any URL on the web, bypassing the same-origin restrictions faced by Javascript apps in the browser.
  3. Websocket server: Every unix app on the desktop which uses stdin/stdout can participate in a pigshell pipeline.

Together, these features enable powerful data movement and transformation pipelines. For instance:

pig> cat http://pigshell.com/sample/oslogos.png | wsh convert -implode 1 - - | to -g canvas

[Image: Implode]

We grabbed an image from pigshell.com using the proxy, piped it via websocket to the local Unix host running psty, ran the ImageMagick convert tool to transform the image (taking care to use stdin/stdout), piped it back to pigshell and displayed the transformed image.
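The host-side half of that pipeline boils down to feeding bytes to a command's stdin and collecting its stdout. The following is a minimal sketch of that idea, not psty's actual code: it is written for Python 3 (psty itself targets Python 2.7), the name `run_filter` is hypothetical, and it buffers the whole payload in memory where a real websocket bridge would stream it.

```python
import subprocess


def run_filter(argv, data):
    """Feed `data` (bytes) to the command's stdin and return its stdout bytes.

    Hypothetical helper for illustration; raises CalledProcessError if the
    command exits with a non-zero status.
    """
    result = subprocess.run(argv, input=data, stdout=subprocess.PIPE, check=True)
    return result.stdout
```

With ImageMagick installed, `run_filter(["convert", "-implode", "1", "-", "-"], image_bytes)` would mirror the `wsh` stage above. Any filter that reads stdin and writes stdout works the same way, for example `run_filter(["tr", "a-z", "A-Z"], b"pigshell\n")`.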

Psty is implemented as a single standalone Python file of under 1000 lines, and requires only a standard Python 2.7 distribution. It runs on Linux and Mac OS. A Windows port is in the works.

Installation

Get it here and run it from a shell as follows:

    bash$ python psty.py -a -d /some/scratch/directory

This command starts all three services and exposes /some/scratch/directory to all pigshell instances running in your browser. There are CLI options to start a subset of the services, and to change the default port of 50937.

In your pigshell tab, type the command

    pig> mount http://localhost:50937/ /home

and you should be able to see the contents of /some/scratch/directory inside /home.

This mount command needs to be typed every time you start or reload the page. To do it automatically,

    pig> echo "HOME=/home\nmount http://localhost:50937/ $HOME" >/local/rc.sh

/local/rc.sh is a script stored in the browser's LocalStorage and will be invoked every time http://pigshell.com is (re)loaded.

Currently, pigshell assumes that websocket and proxy services are available at localhost:50937. This will be configurable in future.

Examples

  1. Data movement: Back up all Picasa photos to a local directory

    pig> mkdir /home/taj; cp /picasa/albums/TajMahal/* /home/taj

  2. Data retrieval: Copy a file from the web, continuing from where we left off

    pig> cp -c http://ftp.freebsd.org/pub/FreeBSD/ISO-IMAGES-amd64/10.0/FreeBSD-10.0-RC4-amd64-bootonly.iso /home

  3. Running local commands:

    pig> wsh ps ax

    runs ps on your desktop and dumps the output in pigshell.

  4. Visualization: Local disk usage visualized in an interactive zoomable treemap

    pig> wsh du /Users/foo | to -g text | template -ig /templates/d3-du-treemap

[Image: du-treemap]

(Note that `du` of a deep tree may take a while, try with a shallow
directory tree first)
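Resuming a download, as in example 2, is commonly done with an HTTP Range request that asks only for the bytes not yet on disk. Whether `cp -c` works exactly this way is an assumption; the sketch below (Python 3, hypothetical function names) only illustrates the general technique.

```python
import os


def resume_headers(local_path):
    """Build request headers asking for the bytes we do not have yet."""
    try:
        offset = os.path.getsize(local_path)
    except OSError:
        offset = 0  # no partial file: fetch from the start
    if offset == 0:
        return {}
    # "bytes=N-" means: everything from byte offset N to the end.
    return {"Range": "bytes=%d-" % offset}


def open_mode_for(status):
    """Append on 206 Partial Content; otherwise the server sent the whole file."""
    return "ab" if status == 206 else "wb"
```

A server that ignores the Range header replies with 200 and the full body, which is why the sketch truncates the local file in that case instead of appending to it.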

Why Psty

Pigshell wants to be a common platform where web and local data can be copied, mixed, viewed, analyzed and visualized using the browser's display engine. There are several issues holding it back from this goal:

  1. CORS: For security reasons, Ajax calls from Javascript are restricted by the same-origin policy from retrieving data from random websites, since that would enable evil.com's scripts to trawl through LAN websites which were really not meant to be publicly visible.

    However, this same policy also prevents Javascript from accessing different-origin websites like Wikipedia, which are meant to be publicly visible. The official solution to this is CORS, where public websites add a header to their responses, indicating that cross-origin access is OK.

    Unfortunately, CORS support in the wild is practically non-existent. The Wikipedia article on CORS does not have a CORS header. Even those sites which claim to support CORS often have half-hearted support (GET but not HEAD). Picasa APIs work cross-origin as long as you're doing GETs, but don't work with POSTs, since they don't implement OPTIONS properly.

    So you have no choice but to use a proxy to access public URLs from Javascript.

  2. Local filesystem access: Getting access to the local filesystem from the browser is an awkward and messy affair. Dumping large objects to the desktop from Javascript has always been a challenge because there is no way to "stream" it. The File API is still very much a work in progress.

    Accessing the local filesystem via an HTTP server gives us a ton of benefits: streaming reads and writes, ability to move data back and forth from the cloud to the desktop, and an unlimited workspace.

  3. Utilities: Tons of apps are coming to Javascript every day, many of them via the Emscripten route, but there is still a wealth of tools available on the average Unix workstation - ImageMagick, R, even plain old grep - for which exact equivalents are hard to find.
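To make the proxy idea from item 1 concrete: a local server can fetch any URL, because the same-origin policy applies only to scripts in the browser, and then return the body with a CORS header naming the page's origin. The sketch below is an illustration in Python 3, not psty's implementation. The `/?url=...` request format, the names `target_url`, `ProxyHandler` and `serve`, and the port argument are all assumptions made for the example.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlsplit
from urllib.request import urlopen

ALLOWED_ORIGIN = "http://pigshell.com"  # the origin the text says psty trusts


def target_url(request_path):
    """Pull the URL to fetch out of a hypothetical '/?url=...' request path."""
    values = parse_qs(urlsplit(request_path).query).get("url", [])
    if len(values) != 1:
        return None
    # Only proxy plain web URLs; refuse file:, ftp: and friends.
    if urlsplit(values[0]).scheme not in ("http", "https"):
        return None
    return values[0]


class ProxyHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.headers.get("Origin") != ALLOWED_ORIGIN:
            self.send_error(403, "origin not allowed")
            return
        url = target_url(self.path)
        if url is None:
            self.send_error(400, "expected ?url=http(s)://...")
            return
        try:
            # Server-side fetch: no same-origin policy applies here.
            with urlopen(url, timeout=30) as upstream:
                body = upstream.read()
                content_type = upstream.headers.get(
                    "Content-Type", "application/octet-stream")
        except (OSError, ValueError):
            self.send_error(502, "upstream fetch failed")
            return
        self.send_response(200)
        self.send_header("Content-Type", content_type)
        self.send_header("Content-Length", str(len(body)))
        # This header is what lets the browser hand the body to the page.
        self.send_header("Access-Control-Allow-Origin", ALLOWED_ORIGIN)
        self.end_headers()
        self.wfile.write(body)


def serve(port):
    """Run the sketch on the loopback interface only."""
    HTTPServer(("127.0.0.1", port), ProxyHandler).serve_forever()
```

The sketch reads the whole upstream response into memory for brevity; a proxy meant for large files would copy it through in chunks.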

While some of these requirements can be met by a cloud backend, we strongly prefer pure client-side, self-scaling architectures (basically, we don't want the headache of maintaining a server). Psty lets us do all these things while still being easy to deploy and use.

Security

Psty serves data only to localhost, and then only when the request carries an Origin header of http://pigshell.com.

Any modification of psty can be dangerous. Specifically,

  1. Never change the CORS headers to '*'. This allows any site you visit from your desktop to use psty. If you want to run a copy of pigshell from another domain, change the headers to that domain name and not '*'.
  2. Don't change the listening IP from localhost. You don't want your data to be exposed even on the LAN.
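If you do modify psty, the two rules above are easy to check mechanically. The following is a small sketch of such a check in Python 3. The function names and the idea of passing the bind address and allowed origin as plain strings are assumptions for illustration, not part of psty.

```python
import ipaddress


def is_loopback(host):
    """True if `host` refers only to this machine."""
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Empty string or an unknown hostname: treat as not loopback.
        return False


def config_problems(bind_host, allow_origin):
    """Return a list of human-readable problems; empty means both rules hold."""
    problems = []
    if allow_origin.strip() == "*":
        problems.append(
            "Access-Control-Allow-Origin is '*': any site can use the server")
    if not is_loopback(bind_host):
        problems.append(
            "listening on %r exposes data beyond this machine" % bind_host)
    return problems
```

For example, `config_problems("127.0.0.1", "http://pigshell.com")` returns an empty list, while `config_problems("0.0.0.0", "*")` reports both problems.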

As pigshell is open source, you can verify that the code does what it is supposed to do. Running pigshell scripts from untrusted sources is as dangerous as running untrusted shell scripts on your desktop.