From 3103721c8e2f554448ec116cbc1c4e7247dba50e Mon Sep 17 00:00:00 2001
From: Thomas Hobson
Date: Thu, 18 Feb 2021 23:09:16 +1300
Subject: [PATCH] design pt1

---
 ARCHITECTURE.TXT |  2 +-
 design/api.txt   | 75 ++++++++++++++++++++++++++++++++++++++++++++++++
 design/index.txt | 13 +++++++++
 design/ppman.txt | 74 +++++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 163 insertions(+), 1 deletion(-)
 create mode 100644 design/api.txt
 create mode 100644 design/index.txt
 create mode 100644 design/ppman.txt

diff --git a/ARCHITECTURE.TXT b/ARCHITECTURE.TXT
index db212dc..edf52d3 100644
--- a/ARCHITECTURE.TXT
+++ b/ARCHITECTURE.TXT
@@ -1,4 +1,4 @@
-== Breif ==
+== Brief == [ Piston ]
 
 This document covers the overall architecture of Piston v3, and not the
 individual components and their implementations.
diff --git a/design/api.txt b/design/api.txt
new file mode 100644
index 0000000..a9e7e63
--- /dev/null
+++ b/design/api.txt
@@ -0,0 +1,75 @@
+== Piston API == [ Piston ]
+
+When we speak of Piston, what we actually mean is the Piston API.
+This API provides unrestricted, unlimited access to managing Piston and
+thus shouldn't be publicly exposed. This API is comparable to that of the
+Docker Engine, where everything regarding control of Docker goes directly
+through the API.
+
+The API is responsible for managing the execution lifecycle of any given
+job, as well as managing the different languages which it can execute a
+job in.
+
+
+== Job Execution ==
+
+Piston v3 exposes a per-package `/execute` endpoint, which when called takes
+a string of code, an array of arguments to pass into the program, and data
+to write to STDIN. The stdout and stderr from the process are then returned
+separately, along with the error code the process returned.
+
+None of this has rate-limiting built in, which keeps it fast: a call
+directly starts the runner process and gets under way immediately.
+
+The 2 stages of this process - compile and run - are run in sequence, with
+different timeouts configurable in the runner's config file located in the
+data directory.
+
+Requests to this endpoint can have caching enabled at 3 different levels.
+The first option is to have no caching, which is the default for all
+interpreted languages. The second option is for the compiled binaries to be
+cached, which is the default for all compiled languages. The final option is
+for output to be cached, which isn't used by default but can be enabled per
+package or per request. It is off by default because code may choose to
+source data from /dev/(u)random or similar sources, and as such may not be
+reliable when its output is cached. Caching is per package and is used as
+an acceleration method to help boost performance of Piston. Cache entries are
+automatically purged after a set time, or can be manually purged through the
+API on a per package basis.
+
+
+== Package Manager ==
+
+Piston v3 has an inbuilt package manager which is responsible for
+(un)installing different packages. Piston v3 by default has access to a single
+official repository hosting various versions of various common languages. These
+packages and repositories conform to the specifications set out in ppman.txt.
+
+The Piston API service downloads the repository index whenever a `/packages`
+request is issued to a repository with the `sync` flag set. This causes
+the service to download the latest repository index from the mirror.
+
+In Piston there is no concept of a package being "outdated" as each package is
+a specific version of a language, and different languages can be installed in
+parallel and function without any issues. Each package should be considered the
+final version of that language. If there is a new version of a language
+available (e.g. Python 3.9.1 -> 3.9.2), a new package should be created for
+this.
+
+Individual languages can be queried from the repo using the
+`/repos/{repo}/packages/{package}/{package-version}` endpoint. This endpoint
+allows for the metadata of the package to be accessed, such as the author,
+size, checksums, dependencies, build file Git URL and download URL.
+
+To install a package, a request to `/install` can be made on the package
+endpoint; this downloads and installs it, making it available on the
+`/packages/{package}/{version}` endpoint.
+
+There is a meta-repository named `all` which can be used to access all
+repositories.
+
+Internally the install process involves downloading and unpacking the package,
+ensuring any dependencies are also downloaded and installed, mounting the
+squashfs filesystem to a folder, then overlaying it with all its dependencies
+in another folder.
diff --git a/design/index.txt b/design/index.txt
new file mode 100644
index 0000000..685222a
--- /dev/null
+++ b/design/index.txt
@@ -0,0 +1,13 @@
+== Index == [ Piston ]
+
+These documents outline the design of the different components and do not
+give a concrete definition of the implementation or how to use it.
+
+api.txt         Design of Piston API
+ppman.txt       Design of the package manager's package and repository format
+
+
+== Glossary ==
+
+Execution Job   A single code run with arguments resulting in an output
+Package         A version of a language bundled together into a tarball
\ No newline at end of file
diff --git a/design/ppman.txt b/design/ppman.txt
new file mode 100644
index 0000000..7721ebd
--- /dev/null
+++ b/design/ppman.txt
@@ -0,0 +1,74 @@
+== Package Manager (ppman) == [ Piston ]
+
+The package manager is the part of the API responsible for managing different
+versions of different languages, managing their installation, uninstallation
+and their dependencies. The package manager talks over the Piston API and is
+built directly into Piston, although it has parts which are not directly built
+into the API (i.e. the repositories and the CLI utility).
+
+The package manager is a complex part of Piston, and requires 2 different file
+specifications - the repository index file and the package file.
+
+
+
+== Repository Index File ==
+
+The Piston repository is the central place where packages are hosted and
+downloaded from. This repository can either be a webserver or a local file
+containing the right content - as long as it's accessible by a URL, it's
+considered a valid repository by Piston. A repository URL is simply a URL
+pointing to a repository index file, as set out by the following information.
+
+A repository index file is a YAML file containing the keys: `schema`, `baseurl`
+and `packages`.
+
+The schema key should simply have a value of `ppman-repo-1`. This indicates the
+version and file format the client should expect.
+
+The baseurl key contains the base URL that relative URLs should be based off.
+It doesn't need to be related to the URL that the repository index is hosted
+at, only to the downloadable files, which can be split over many domains by
+using absolute URLs.
+
+The packages key contains a list of packages, each of which contains the keys:
+`author`, `language`, `version`, `checksums`, `dependencies`, `size`,
+`buildfile` and `download`.
+
+The author field is self-explanatory: it is simply the author's name and email,
+formatted similar to git's default format: `Full Name <email>`. If the
+repository index is automatically generated, it is best to use the commit
+author's name in here.
+
+The language and version fields define the name and version of the compiler or
+interpreter contained within. The language should not include a version at all.
+In the case of python, use the name python for both python 2 and 3, using the
+version field to differentiate between the 2.
+
+The checksums field is simply a map of hash types to hashes. Hash types include
+md5, sha1, sha256 and sha512. The digests should simply be written as lowercase
+hex characters.
+Only one checksum is required, but if more are supplied the most secure one
+is picked, with sha512 as the most secure.
+
+The dependencies field is simply a map of language names to versions which must
+be installed for the package to run correctly. For example, typescript
+requires node to run.
+
+The size field is the size of the package file in bytes, while
+uncompressed. This is used to determine if there is enough room, and thus
+should be accurate.
+
+The buildfile field is a URL pointing to the exact build script for this build.
+This should always point to a URL either containing steps, a script or other
+means of reproducing the build. This field is purely so people can understand
+how the image was built, and to make sure you aren't packing any malicious
+code into it.
+
+The final field is download, which points to a URL from which the package file
+can be obtained. If this is a relative URL, it will be appended to the
+baseurl. This is particularly useful if everything is stored within 1 S3
+bucket, or you have a repository in a folder.
+
+
+== Package File ==
+
+TODO
\ No newline at end of file
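
The checksum and download rules from the Repository Index File section of design/ppman.txt can be sketched in Python as follows. This is an illustrative sketch and not Piston's implementation: the function names, the example URLs and the package file name are all hypothetical, and resolving relative URLs with `urljoin` is an assumption about how "based off the baseurl" should behave.

```python
import hashlib
from urllib.parse import urljoin

# Hash types named in ppman.txt, ordered weakest to strongest.
HASH_STRENGTH = ["md5", "sha1", "sha256", "sha512"]


def strongest_checksum(checksums):
    """Return (hash_type, digest) for the most secure hash type supplied."""
    known = [h for h in HASH_STRENGTH if h in checksums]
    if not known:
        raise ValueError("at least one supported checksum is required")
    best = known[-1]
    return best, checksums[best].lower()


def verify(data, checksums):
    """Check package bytes against the strongest supplied checksum."""
    hash_type, expected = strongest_checksum(checksums)
    return hashlib.new(hash_type, data).hexdigest() == expected


def download_url(baseurl, download):
    """Resolve a package's download field against the index baseurl.

    Assumption: relative URLs are joined onto baseurl, while absolute
    URLs are used unchanged.
    """
    return urljoin(baseurl, download)


if __name__ == "__main__":
    # Hypothetical package entry, shaped like the fields listed in ppman.txt.
    payload = b"example package bytes"
    entry = {
        "language": "python",
        "version": "3.9.1",
        "checksums": {
            "md5": hashlib.md5(payload).hexdigest(),
            "sha256": hashlib.sha256(payload).hexdigest(),
        },
        "download": "python-3.9.1.pkg",
    }
    print(strongest_checksum(entry["checksums"])[0])
    print(verify(payload, entry["checksums"]))
    print(download_url("https://repo.example.com/packages/", entry["download"]))
```

Keeping the hash types in a single ordered list makes "most secure one is picked" a lookup rather than a chain of conditionals, and adding a stronger hash type later would only mean appending to that list.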