Pigshell is a pure client-side Javascript app running in the browser, which presents resources on the web as files. These include public web pages as well as private data in Google Drive, Dropbox, Facebook and even the desktop. It provides a command line interface to construct pipelines of simple commands to transform, display and copy data.
Pigshell is open source software, released under the GNU GPLv3.
The pigshell system is similar in spirit to Unix and bash, but also borrows several ideas from Plan 9 and rc, mixing syntax and features in a manner calculated to annoy experienced users of both systems.
The name pigshell comes from the time-honoured tradition of weak puns and recursive acronyms: GNU's Not Unix, and PIG Isn't GNU.
Shells and shell scripts occupy an important niche in the Unix users' universe: they can quickly assemble ad-hoc tools from simple components to interact with their data. For complex applications, they might open the editor and write a program, but for hundreds of simple operations, the humble shell suffices.
There is no equivalent in the world of the web and the cloud, though an increasing amount of our data resides there. One is forced to go through GUIs, each with their individual warts and annoyances. Imagine having to open a different GUI application every time you accessed a different disk, with no way to directly copy from one disk to the other.
The web abounds in APIs, but there is no easy way to connect half a dozen random APIs together without reading a ton of API documentation and doing a fair amount of data normalization, coding and debugging before getting the first trickle of data to go from point A to B.
Pigshell is a place to have informal conversations with data.
In this document, we describe the different components of the system, their main features and examples of usage. We also point out the more prominent gotchas, unimplemented features and bugs.
Broadly, Pigshell consists of the following:
The shell is designed to feel familiar to Unix and bash users, but there are crucial differences. The most important of these are:
The command at the tail of the pipeline (Stdout) asks the upstream command for the next object, which in turn asks its upstream command and so on, until the command at the head reluctantly yields an object. It is processed and "returned" downstream, until it hits Stdout, which displays it on the terminal. Stdout has an insatiable appetite for objects, so it asks for one more, and the process continues until a null object, signifying the end of the stream, makes its way down. Unlike Unix commands, pigshell commands are not independently executing processes.
The shell presents itself as a terminal with a command line. Emacs-style command line editing is possible. Common shortcuts include:
The primary prompt consists of pig<basename_of_cwd>$.
When you type a command at the primary prompt and hit Enter, it starts running immediately. This is the foreground command. A secondary prompt of > is displayed.
You can use this prompt to typeahead another command, which will be executed after the foreground command completes. You can queue multiple commands in this way, and they will be executed in strict sequence.
To kill the foreground command, use Ctrl-C. This also triggers the running of the next queued command, if any.
Similarly, to pause the foreground command and continue with any queued commands, use Ctrl-Z. The paused command can be resumed using ps and start.
You may use Ctrl-B to "background" the foreground command and start running the next queued command. This is typically done when the foreground command is going to run for several seconds, and the queued command is not dependent on its predecessor.
The output of commands is restricted to an (elastic) area below the command line. Thus, many commands may be running and generating output at the same time without stomping over each other, maintaining the question-answer structure of the command line conversation.
This also means that multiple commands may be waiting for input, as indicated by blinking cursors. Simply click next to the cursor to switch focus.
The running status of a pipeline is visually indicated by the colour of the prompt.
Reloading the webpage is equivalent to rebooting the system: all local state is lost. Only files stored in /local and filesystems backed by a persistent remote store (e.g. PstyFS, Google) will survive a reboot.
ಠ_ಠ Occasionally, things may get buggered up to the point that there is no cursor visible anywhere. In such cases, simply click near the last prompt and you should get focus there, and resume typing commands.
ಠ_ಠ Cut and paste is also somewhat iffy.
ls | sum
echo able baker charlie >/tmpfile
echo some more >>/tmpfile
ls A*
ls *.jpg
cat bar | grep foo >/dev/null && echo "bar contains foo"
cat < asd > bsd
rm somefile || echo rm failed!
Pigshell variables are lists of objects. Most commonly, they are lists of strings. Variables may be assigned values in the usual manner:
msg="How's it going?"
dirs=(/facebook /twitter /gdrive)
Parentheses are used to enclose lists. The variable dirs is thus assigned a list of three strings. msg is a list containing one string.
Lists are expanded on reference.
echo $dirs
would yield
/facebook /twitter /gdrive
The echo command is invoked with three arguments.
To add to a list,
dirs=($dirs /picasa)
echo $dirs
would give
/facebook /twitter /gdrive /picasa
Variables may be subscripted by a list of numbers (or a list of expressions yielding numbers) to retrieve part of the list. List indexing starts at zero. For example,
index=0
echo $dirs($index 2 $index)
would give
/facebook /gdrive /facebook
The number of elements in the variable dirs can be found using $#dirs.
One can do the equivalent of an array.join(' ') using the $" operator.
words=(Holy Plan9 Ripoff Batman)
sent=$"words
echo $words and echo $sent will both print
Holy Plan9 Ripoff Batman
Note that
echo $#words $#sent
will print
4 1
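For readers who think in Javascript, the list operations above map roughly onto array operations. The following is only an analogy, not pigshell's implementation; the helper names subscript, count and join are hypothetical, and dirs is assumed to hold the four entries built up in the earlier examples.

```javascript
// Hypothetical helpers mirroring $var(indices), $#var and $"var.
// This is an analogy in plain Javascript, not pigshell's own code.

// $dirs($index 2 $index): pick elements by a list of zero-based indices.
const subscript = (list, indices) => indices.map((i) => list[i]);

// $#dirs: number of elements in the list.
const count = (list) => list.length;

// $"words: join all elements with spaces, yielding a one-element list.
const join = (list) => [list.join(' ')];

const dirs = ['/facebook', '/twitter', '/gdrive', '/picasa'];
const index = 0;
const picked = subscript(dirs, [index, 2, index]);

const words = ['Holy', 'Plan9', 'Ripoff', 'Batman'];
const sent = join(words);

console.log(picked.join(' '));          // /facebook /gdrive /facebook
console.log(count(words), count(sent)); // 4 1
```

The point of the last line is that joining does not change what gets printed, only how many list elements there are.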
Referring to a nonexistent variable yields nothing, referring to its length gives 0, and $"nonexistent gives the empty string. Therefore, when unsure of a variable's existence, it is better to use [ $"foo = "bar" ], which is equivalent to [ "" = "bar" ], while [ $foo = "bar" ] would expand to [ = "bar" ] which would throw an error.
Local Scope: Positional variables ($1, $2 ... $*) and variables whose names begin with an underscore (e.g. _i, _foo) are local to the enclosing function or shell.
Global Scope: All other variables are global to the shell, and may be freely referenced and set inside functions.
Exports: There is no notion of export; copies of all global variables are inherited by a child shell from its parent. Changing a variable in a child will not affect the value in the parent.
Arguments may be concatenated using the ^ operator. In most cases, it is not necessary, since pigshell will automatically concatenate arguments which adjoin each other without any intervening whitespace. For example, in the command
able=able; baker=baker; echo "able"baker able'baker' "able"'baker' able$baker $able^baker $able$baker
echo has 6 arguments, each of which is ablebaker. Note that a caret was only required to resolve ambiguity in one case.
The rules for concatenating lists are as follows:
a^b^c is parsed as (a^b)^c
Objects are converted to strings using their toString() method before concatenation.
a=able; b=(1 2 3)
echo $a$b
gives
able1 able2 able3
echo $b$a
gives
1able 2able 3able
a=(able baker charlie); b=(1 2 3)
echo $a$b
gives
able1 baker2 charlie3
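The behaviour shown in these examples can be sketched in Javascript as below. This is a guess at the rules from the examples alone, not pigshell's code; the function name concat is hypothetical, and raising an error for lists of unequal length is our assumption, since the examples do not cover that case.

```javascript
// Hypothetical sketch of the caret (^) rules, inferred from the examples.
// Both operands are lists, since every pigshell variable is a list.
function concat(a, b) {
  // Convert every element to a string first (String() calls toString()).
  const left = a.map(String);
  const right = b.map(String);

  // One-element list with a longer list: distribute over each member.
  if (left.length === 1) {
    return right.map((r) => left[0] + r);
  }
  if (right.length === 1) {
    return left.map((l) => l + right[0]);
  }
  // Lists of equal length: concatenate pairwise.
  if (left.length === right.length) {
    return left.map((l, i) => l + right[i]);
  }
  // ASSUMPTION: unequal lengths are treated as an error in this sketch.
  throw new Error('cannot concatenate lists of different lengths');
}

// a^b^c is parsed as (a^b)^c, which is a left-to-right reduce.
const concatAll = (lists) => lists.reduce((acc, cur) => concat(acc, cur));

const r1 = concat(['able'], [1, 2, 3]);
const r2 = concat([1, 2, 3], ['able']);
const r3 = concat(['able', 'baker', 'charlie'], [1, 2, 3]);
const r4 = concatAll([['x'], [1, 2], ['y']]);

console.log(r1.join(' ')); // able1 able2 able3
console.log(r2.join(' ')); // 1able 2able 3able
console.log(r3.join(' ')); // able1 baker2 charlie3
console.log(r4.join(' ')); // x1y x2y
```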
Command substitution allows the standard output of a command to be converted into an expression, which may be used as a command argument or assigned to a variable. Pigshell supports only the $(command) form, not the backtick form. For example,
files=$(ls)
nfiles=$(ls | sum)
echo "Number of files: " $(ls | sum)
Note that files contains a list of file objects. Command substitution is the easiest way to get objects into variables.
Command substitutions may be nested:
echo $(printf -s $format $i $(cat $i/status) $(cat $i/cmdline))
Deferred pipelines are created using the ${<command1> | <command2>... } syntax. The pipeline is created and assembled but not run. The expression yields an object, which can be stored in a variable, or used as an argument to another command. The next command, with this object as an argument, can be used to crank the pipeline to produce one object. Further invocations of the next command produce subsequent items in the stream, until EOF is reached, after which the EPIPE error is returned.
To run the deferred pipeline to completion and get all the objects in the stream in one shot, cat can be used.
p=${echo foo; echo bar}
next $p
gives
foo
A further next $p
gives
bar
Running next $p again results in EOF. Any further invocations of next $p return an EPIPE error.
Alternately,
cat $p
gives
foo
bar
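The behaviour of next and cat on a deferred pipeline resembles pulling values from a Javascript generator. The sketch below is an analogy under that assumption, not pigshell's implementation; deferred, next and cat are hypothetical stand-ins for the ${...} syntax and the two commands, and EOF is modelled as null.

```javascript
// A sketch of the pull model using Javascript generators; an analogy only.

// ${echo foo; echo bar}: assemble the pipeline without running it.
function deferred(items) {
  function* source() {
    for (const item of items) {
      yield item; // nothing runs until a consumer asks for a value
    }
  }
  return { iter: source(), done: false };
}

// next $p: crank the pipeline to produce one object.
function next(p) {
  if (p.done) {
    throw new Error('EPIPE'); // the pipeline has already reported EOF
  }
  const step = p.iter.next();
  if (step.done) {
    p.done = true;
    return null; // EOF, modelled here as null
  }
  return step.value;
}

// cat $p: run the pipeline to completion and collect every object.
function cat(p) {
  const out = [];
  for (let obj = next(p); obj !== null; obj = next(p)) {
    out.push(obj);
  }
  return out;
}

const p = deferred(['foo', 'bar']);
const first = next(p);  // 'foo'
const second = next(p); // 'bar'
const third = next(p);  // null, i.e. EOF
let fourth;
try {
  next(p);
} catch (e) {
  fourth = e.message;   // 'EPIPE'
}
const all = cat(deferred(['foo', 'bar'])); // ['foo', 'bar']
```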
The syntax of the if construct is very similar to bash.
if cond; then tcmd... [; elif cond; then tcmd... ] [; else ecmd... ]; fi
If the exit value of the cond command is true, we enter the then clause. Any exit value other than true is considered false. Commands may be spread over multiple lines, like in bash.
for loops are also similar to bash.
for i in list; do cmd...; done
while loops are, again, similar to bash.
while cond; do cmd...; done
Functions can be defined as follows:
function funcname { cmd... }
Functions behave like inline scripts in how they are invoked, how arguments are accessed within the body, and their ability to be part of pipelines.
funcname arg1 arg2
funcname arg1 arg2 | grep foo
Arguments are accessed within the body of the function using positional arguments, $0...$n and $*.
All global variables accessed, defined and modified in the body of a function are part of the global scope of the enclosing shell. Variables whose names begin with an underscore are local to the function.
Function definitions may be deleted using
function funcname
with no body. Note that this is different from function funcname {}, which is a function with an empty body.
To execute a command, pigshell searches within its builtins and the paths in
the variable PATH for a match, in that order. If a command contains a path
separator, then it is looked up directly in the filesystem without going
through the search process. In case the PATH variable is not set, /bin/ is assumed to be the default path.
Note that PATH, like other pigshell variables, is a list. It must be set
using the list syntax, i.e. PATH=(/bin /usr/bin)
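The lookup order described above can be sketched as follows. This is a hypothetical model, not pigshell's code: builtins and the filesystem are represented as plain sets of names, the function name resolveCommand and the sample entries are made up for illustration, and an empty PATH list is treated the same as an unset one, which is our assumption.

```javascript
// Hypothetical sketch of command lookup: builtins first, then PATH.
function resolveCommand(name, builtins, files, pathVar) {
  // A path separator bypasses the search entirely.
  if (name.includes('/')) {
    return files.has(name) ? { type: 'file', path: name } : null;
  }
  if (builtins.has(name)) {
    return { type: 'builtin', name: name };
  }
  // PATH is a list; fall back to /bin when it is not set.
  const dirs = pathVar && pathVar.length ? pathVar : ['/bin'];
  for (const dir of dirs) {
    // Tolerate a trailing slash, as in /bin/.
    const candidate = dir.replace(/\/$/, '') + '/' + name;
    if (files.has(candidate)) {
      return { type: 'file', path: candidate };
    }
  }
  return null; // not found
}

// Made-up builtins and files, for illustration only.
const builtins = new Set(['grep', 'echo']);
const files = new Set(['/bin/ps', '/usr/bin/tool', '/home/script']);

const a = resolveCommand('grep', builtins, files, ['/bin', '/usr/bin']);
const b = resolveCommand('tool', builtins, files, ['/bin', '/usr/bin']);
const c = resolveCommand('ps', builtins, files, undefined);
const d = resolveCommand('/home/script', builtins, files, ['/bin']);
const e = resolveCommand('missing', builtins, files, ['/bin']);
```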
The following special variables are maintained by pigshell:
$0, $1.. $n, $*, $#: These variables are used inside a script to determine individual arguments to the script, the list of arguments, and the number of arguments respectively.
$?: Exit value of the last command. true for successful commands.
$!: PID of the latest executed pipeline.
Pigshell has a large number of built-in commands. These commands are implemented in Javascript and have access to all the internal APIs and filesystems. Many of these commands follow a common set of idioms.
Help is available through the help command. Specific usage of a given command, say, grep, may be obtained either using help grep or grep -h. All builtins support the -h option.
Every pipeline has a Stdin and a Stdout "command" at the head and tail respectively. Objects which reach Stdout are displayed according to their type. Objects like files have an html attribute which is used to render them to the output div.
Filter commands like grep and printf take in files, filter or transform them, and emit objects to Stdout. These commands can be supplied with files in one of two ways:
As a list of arguments, corresponding to the <file>... option given in the usage. These arguments may be strings representing file paths, actual File objects, or a mixture of both. e.g.
grep -f gender "female" /facebook/friends/*
grep -f gender "female" /facebook/friends/A* $close_friends
where the close_friends variable is a list of File objects.
As a list of File objects from Stdin. e.g.
ls /facebook/friends | grep -f gender "female"
echo $close_friends | grep -f gender "female"
If you accidentally fail to give either of these, a line with a blinking cursor will open up below the command. This is Stdin trying to get input from the terminal. Typing into this line and pressing Enter will feed a string to the command. To indicate end of input, type Ctrl-D. To simply get out, click to the right of the latest shell prompt to move focus there.
Many commands which operate on objects have options to specify or extract attributes from the object.
The -f option is commonly used to refer to a field in the object. For instance, File objects corresponding to Facebook friends have attributes like gender, friend_count, etc. You can thus
ls /facebook/friends | grep -f gender "^male"
ls /facebook/friends | sort -f friend_count
to use those specific fields for filtering or sorting.
You can access nested attributes as well:
ls /facebook/friends | grep -f raw.relationship_status single
The -e option can be used to specify a lambda expression in Javascript which can be used to combine or filter field values in complex ways.
ls /picasa/albums/Blah | sort -e "x.width * x.height"
sorts photos based on how many pixels they contain. The expression will be called with the argument x set to the object. width and height are attributes of the object.
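One way to picture what -f and -e do is as two ways of building a function that extracts a value from an object. The sketch below is hypothetical and not taken from pigshell's source: fieldGetter and exprGetter are made-up names, the sample objects are invented, sorting is assumed to be ascending, and the grep pattern is assumed to be a regular expression.

```javascript
// Hypothetical sketch of how -f and -e might pick a value out of an object.

// -f raw.relationship_status: walk a dotted path through nested attributes.
const fieldGetter = (path) => (obj) =>
  path.split('.').reduce(
    (value, key) => (value == null ? undefined : value[key]),
    obj
  );

// -e "x.width * x.height": compile the expression into a function of x.
const exprGetter = (expr) => new Function('x', 'return (' + expr + ');');

// Invented sample data standing in for photo and friend File objects.
const photos = [
  { name: 'big', width: 4000, height: 3000 },
  { name: 'small', width: 640, height: 480 },
  { name: 'medium', width: 1600, height: 1200 },
];
const friends = [
  { name: 'A', gender: 'female', raw: { relationship_status: 'single' } },
  { name: 'B', gender: 'male', raw: { relationship_status: 'married' } },
  { name: 'C', gender: 'male', raw: {} },
];

// sort -e "x.width * x.height" (ascending order is an assumption)
const area = exprGetter('x.width * x.height');
const sorted = photos.slice().sort((p, q) => area(p) - area(q));

// grep -f gender "^male": the anchor keeps "female" from matching.
const gender = fieldGetter('gender');
const males = friends.filter((f) => /^male/.test(String(gender(f))));

// grep -f raw.relationship_status single
const status = fieldGetter('raw.relationship_status');
const single = friends.filter((f) => /single/.test(String(status(f))));

console.log(sorted.map((p) => p.name).join(' ')); // small medium big
console.log(males.map((f) => f.name).join(' '));  // B C
console.log(single.map((f) => f.name).join(' ')); // A
```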
Pipeline status and control files are exposed in a special /proc filesystem, so simple scripts in /bin are sufficient to implement process management.
kill -STOP.
kill -CONT.
Pigshell represents cloud resources and system resources as files. Filesystems are responsible for maintaining local file objects corresponding to remote resources. We will briefly go over the filesystems currently supported.
Google: Supports Picasa and Google Drive. Click the Connect Google button to mount Picasa albums under /picasa/<email> and GDrive under /gdrive/<email>.
Dropbox: Click the Connect Dropbox button to mount your Dropbox under /dropbox/<email>.
Facebook: Click the Connect Facebook button to mount your Facebook account at /facebook. Pigshell is pure client-side, so privacy is completely assured.
Download: Presents a single directory, /download. You may copy files into this directory to download them to the desktop.
Upload: Click the Upload button in the right menu and select files. Alternately, drag and drop files onto the terminal. These files will be available under /upload and can be copied from there to a target directory.
Proc: The proc filesystem, mounted at /proc, maintains a directory corresponding to each running pipeline. Each directory has the following files:
Lstor: Mounted at /local, this filesystem is backed by HTML5 local storage. Files stored here will survive "reboots". It is single-level; you cannot make directories here.
Pigshell is inspired by Unix and Plan 9. We are very familiar with several Unix implementations, but our experience with Plan 9 is purely platonic. We have tried to retain as much of a bash flavour as possible, to make it easy for experienced Unix users to start using the system and incrementally discover features, without having to read a long and tedious document like this one.
There is more than one way to TIMTOWTDI: one is characterized by a profusion of syntactic forms, where one cannot read one's own code after a few weeks. In another, it emerges from different ways of expressing the same meaning by combining a small set of core concepts. Pigshell leans heavily towards the latter.
The pigshell syntax is intended to be used as a glue language for composing "tweet"-sized sentences and short scripts. Longer and more elaborate solutions on the pigshell platform are better written in Javascript.
The pigshell grammar is implemented using a PEG, which is far easier to specify and debug than BNF. The disadvantage is somewhat poor error reporting.