Enabling Reproducible Computing in nteract Play
My project has been accepted for GSoC 2020 with the nteract organization under NumFOCUS. If you haven't already read the announcement post, read it here. I am excited to spend my summer developing nteract play to support reproducible computing.
This post briefly describes the proposal. I will cover all the significant steps and topics, and some of them will be explained further in sub-posts. Note that some steps might be altered during development, but most will remain the same.
What is nteract?
nteract is an ecosystem of SDKs, applications, and libraries created by the nteract community to help you and your team make the most of interactive notebooks and REPLs.
- Core SDK to help you bring the power of notebooks to every application by enabling you to create your own computing experience.
- A Suite of applications allows you to quickly create and publish notebooks in the cloud and on the desktop.
- Libraries to enhance your notebook workflows from end-to-end.
What is nteract play?
nteract play is a web application that provides an interactive playground for users to run code samples against a Binder instance.
Proposal
My proposal is to add support for reproducible computing in nteract play. This will allow you to share your public notebooks from GitHub via a unique URL and let other people start a session to reproduce your work. It will also allow you and others to make edits and save them back to GitHub.
Why this proposal?
There are many use cases for reproducible computing. A few are listed below:
- It will empower researchers to easily showcase their findings without fretting if users can set up the environment.
- It will save you time when reproducing someone’s work.
- It can be used by professors and teachers to teach their class.
- New students can use it to quickly learn and test different concepts.
Phases
This project is divided into three phases.
Phase 1: GitHub Integration
We take the GitHub repository details from the URL query or the menu and use the GitHub API to perform operations such as fetching, committing, and so forth. GitHub provides the official library octokit/rest.js to interact with the GitHub API.
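As a minimal sketch of the first step, reading repository details from the URL query could look like the following. The parameter names (`vcs`, `org`, `repo`, `ref`, `file`), the default ref, and the example host are assumptions made for illustration, not a settled URL scheme.

```javascript
// Parse repository details from a playground URL.
// All query parameter names here are hypothetical.
function parseRepoQuery(href) {
  const params = new URL(href).searchParams;
  const org = params.get("org");
  const repo = params.get("repo");
  if (!org || !repo) {
    // Not enough information in the URL; the app would fall back to the menu.
    return null;
  }
  return {
    vcs: params.get("vcs") || "github",
    owner: org,
    repo,
    ref: params.get("ref") || "master", // assumed default branch
    path: params.get("file") || "",
  };
}

const details = parseRepoQuery(
  "https://play.example.org/?vcs=github&org=nteract&repo=examples&file=README.md"
);
console.log(details);
```

Returning `null` for an incomplete query keeps the "URL query or menu" choice explicit for the caller.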
User Authorizing
We don’t need to authorize a user to read and test public repositories, but authorization is required to commit changes or to fork a repository. We can authorize users with GitHub OAuth and save the access token in the browser's web storage. Web storage has no expiration date, and even if the token gets deleted for some reason, reauthorization is quick because the user has already approved the app.
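Storing the token could be sketched as a thin wrapper around a storage object. The key name and the helper names are hypothetical; in the browser the storage argument would be `window.localStorage`, and the in-memory stand-in only exists so the sketch also runs outside a browser.

```javascript
const TOKEN_KEY = "github-access-token"; // hypothetical key name

// Wrap any object that offers the Web Storage getItem/setItem/removeItem methods.
function createTokenStore(storage) {
  return {
    save(token) {
      storage.setItem(TOKEN_KEY, token);
    },
    load() {
      // Returns null when no token is stored, which means "ask the user to authorize".
      return storage.getItem(TOKEN_KEY);
    },
    clear() {
      storage.removeItem(TOKEN_KEY);
    },
  };
}

// In-memory stand-in for window.localStorage.
function createMemoryStorage() {
  const items = new Map();
  return {
    getItem: (key) => (items.has(key) ? items.get(key) : null),
    setItem: (key, value) => {
      items.set(key, String(value));
    },
    removeItem: (key) => {
      items.delete(key);
    },
  };
}

const tokens = createTokenStore(createMemoryStorage());
tokens.save("example-token");
console.log(tokens.load());
```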
Fetch Data
Using octokit/rest.js, we will fetch data such as directory listings, file contents, and modification dates. This will not only make the app richer in content but also take load off the MyBinder instance.
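Fetching a single file could be sketched as below. It assumes an `octokit` client that exposes `repos.getContent` (the name used by recent versions of @octokit/rest) and a base64-encoded file payload; `decodeFile` and `fetchFile` are hypothetical helper names, and decoding uses Node's `Buffer`, which a browser build would replace.

```javascript
// Decode the payload GitHub returns for a single file.
function decodeFile(data) {
  if (Array.isArray(data)) {
    // The same endpoint returns an array of entries when the path is a directory.
    throw new Error("Path is a directory, not a file");
  }
  if (data.encoding !== "base64") {
    throw new Error(`Unexpected encoding: ${data.encoding}`);
  }
  return Buffer.from(data.content, "base64").toString("utf8");
}

// Fetch one file and return its path, blob SHA, and decoded text.
async function fetchFile(octokit, { owner, repo, path, ref }) {
  const { data } = await octokit.repos.getContent({ owner, repo, path, ref });
  return { path: data.path, sha: data.sha, text: decodeFile(data) };
}
```

Keeping the SHA next to the text is useful later, because GitHub asks for the current blob SHA when updating an existing file.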
Saving Data
When saving data, there can be two cases:
- The user is authorized to commit to the repo. In this case, changes can be auto-committed directly.
- The user is not authorized to commit to the repo. In this case, we first need to fork the repo, or find the user's existing fork, and then push the auto-committed changes there.

    // Fork the repository into the authenticated user's account.
    octokit.repos.createFork({ owner: "username", repo: "repo" });

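The two saving cases can be sketched as one decision function. This is an illustration under assumptions: `resolveCommitTarget` is a hypothetical helper, `login` is the authenticated user's name, and the client is expected to behave like @octokit/rest, where `repos.get` reports a `permissions.push` flag for authenticated requests and failed requests throw an error carrying a `status`.

```javascript
// Decide which repository the auto-commit should go to.
async function resolveCommitTarget(octokit, { owner, repo }, login) {
  const { data: upstream } = await octokit.repos.get({ owner, repo });
  if (upstream.permissions && upstream.permissions.push) {
    // Case 1: the user may commit to the original repository.
    return { owner, repo, forkCreated: false };
  }
  try {
    // Case 2a: look for a fork the user already has.
    const { data: mine } = await octokit.repos.get({ owner: login, repo });
    if (mine.fork && mine.parent && mine.parent.full_name === `${owner}/${repo}`) {
      return { owner: login, repo, forkCreated: false };
    }
  } catch (err) {
    if (err.status !== 404) throw err; // Only "not found" means "no fork yet".
  }
  // Case 2b: no usable fork exists, so create one.
  const { data: fork } = await octokit.repos.createFork({ owner, repo });
  return { owner: fork.owner.login, repo: fork.name, forkCreated: true };
}
```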
Changes can be committed on three kinds of action, which trade commit quality against commit frequency:
Action | Commit Quality | Commit Frequency |
---|---|---|
On Change | Low | High |
On Run | Medium | Medium |
On Save | High | Low |
For this project, we will go with On Run. We will also create a program that takes in the changes and generates an appropriate commit message based on them. In the future, we can incorporate the On Save action and also allow users to write their own commit messages.
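A commit-message generator could start out as simple as the sketch below. The shape of a change (`{ path, type }` with `type` being `added`, `modified`, or `removed`) and the wording of the messages are assumptions for illustration.

```javascript
const VERBS = { added: "Add", modified: "Update", removed: "Delete" };

// Build a commit message from a list of { path, type } changes.
function buildCommitMessage(changes) {
  if (changes.length === 0) {
    return null; // Nothing changed, so there is nothing to commit.
  }
  if (changes.length === 1) {
    return `${VERBS[changes[0].type]} ${changes[0].path}`;
  }
  // Several files changed: summarize the counts per kind of change.
  const counts = { added: 0, modified: 0, removed: 0 };
  for (const change of changes) {
    counts[change.type] += 1;
  }
  const summary = Object.keys(VERBS)
    .filter((type) => counts[type] > 0)
    .map((type) => `${VERBS[type].toLowerCase()} ${counts[type]}`)
    .join(", ");
  return `${summary[0].toUpperCase()}${summary.slice(1)} files`;
}

console.log(buildCommitMessage([{ path: "analysis.ipynb", type: "modified" }]));
```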
Phase 2: MyBinder Integration
We plan to use mybinder.org to launch and manage Binder instances, each running a Jupyter notebook server built from our GitHub repo. Once the instance is up, we use the jupyter-client to communicate with and manage Jupyter kernels.
Launching
To launch a Binder instance, we use rx-binder, which uses the mybinder build link (mybinder.org/build/gh/{USER}/{REPO}/{GITREF}) to launch the instance.
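To make the build link concrete, here is a small sketch that fills in the template above. `binderBuildUrl` is a hypothetical helper rather than part of rx-binder, and the `https` scheme and URL-encoding of each segment are assumptions.

```javascript
// Fill in the mybinder build link template for a GitHub repository.
function binderBuildUrl({ owner, repo, ref }, binderHost = "https://mybinder.org") {
  const segments = [owner, repo, ref].map(encodeURIComponent);
  return `${binderHost}/build/gh/${segments.join("/")}`;
}

console.log(binderBuildUrl({ owner: "nteract", repo: "examples", ref: "master" }));
```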
It returns an object with a token and URL of the remote Jupyter Server.
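Picking the server details out of the build progress could be sketched as follows. It assumes the build link streams progress events that have already been parsed into objects with a `phase` field, where the `ready` event carries `url` and `token`; the helper name and the sample events are made up for illustration.

```javascript
// Reduce a list of build events to the connection details of the Jupyter server.
function serverFromBuildEvents(events) {
  for (const event of events) {
    if (event.phase === "failed") {
      throw new Error(`Binder build failed: ${event.message || "unknown error"}`);
    }
    if (event.phase === "ready") {
      return { url: event.url, token: event.token };
    }
  }
  return null; // No "ready" event yet, so the instance is still launching.
}

// Made-up sample events.
const sampleEvents = [
  { phase: "building", message: "Step 1/10" },
  { phase: "launching", message: "Launching server..." },
  { phase: "ready", url: "https://hub.example.org/user/abc/", token: "example-token" },
];
console.log(serverFromBuildEvents(sampleEvents));
```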
Communicating
We use rx-jupyter, which uses the Jupyter API to run queries on the remote Jupyter server. We already have the URL and token, so the API calls look like this:

    http://{url}/api/{endpoint}?_={token}

Endpoint | Use |
---|---|
/contents | Fetch, save, delete, create, rename, list files and folders. |
/sessions | Delete, create, rename, list sessions. |
/kernels | Start, kill, interrupt, restart kernels. |
/kernelspecs | Get kernel specs. |
/config | Get or update configuration of session. |
/terminals | Create, get and delete terminal. |
/status | Get status of the server. |
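As an illustration of calling one of these endpoints without rx-jupyter, the sketch below composes a request description from the server URL and token. `jupyterRequest` is a hypothetical helper, and it sends the token in an `Authorization: token ...` header, which Jupyter servers also accept, instead of in the query string.

```javascript
// Describe a request against the remote Jupyter server's REST API.
function jupyterRequest(server, endpoint, method = "GET") {
  const base = server.url.replace(/\/+$/, ""); // drop trailing slashes
  const path = endpoint.replace(/^\/+/, ""); // drop leading slashes
  return {
    method,
    url: `${base}/api/${path}`,
    headers: { Authorization: `token ${server.token}` },
  };
}

const request = jupyterRequest(
  { url: "https://hub.example.org/user/abc/", token: "example-token" },
  "/kernels"
);
console.log(request.url);
```

The result can be handed to `fetch(request.url, request)` or adapted to whichever HTTP client the app uses.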
Executing
Messaging in Jupyter uses sockets to execute commands and exchange input and output. We use @nteract/messaging on the frontend to work with sockets to execute the command and fetch the output.
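To show what travels over the socket, here is a hand-written sketch of an `execute_request` message following the Jupyter messaging protocol. In the app, @nteract/messaging would build such messages for us; the `newId` helper is a placeholder for a proper UUID generator, and the `username` and protocol `version` values are assumptions.

```javascript
// Placeholder id generator; a real app would use a UUID library.
function newId() {
  return Math.random().toString(16).slice(2) + Date.now().toString(16);
}

// Build an execute_request message for the kernel's shell channel.
function createExecuteRequest(code, session = newId()) {
  return {
    header: {
      msg_id: newId(),
      username: "nteract", // assumed value
      session,
      msg_type: "execute_request",
      version: "5.2", // assumed protocol version
      date: new Date().toISOString(),
    },
    parent_header: {},
    metadata: {},
    content: {
      code,
      silent: false,
      store_history: true,
      user_expressions: {},
      allow_stdin: false,
      stop_on_error: true,
    },
    channel: "shell",
    buffers: [],
  };
}

const message = createExecuteRequest("print('hello')");
console.log(JSON.stringify(message.header, null, 2));
```

Replies from the kernel carry the request's header as their `parent_header`, which is how outputs are matched back to the cell that produced them.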
Phase 3: UI/UX Integration
To make the nteract-play application more interactive, we can create new UI/UX by introducing the following new/improved components and functionality.
- FileExplorer
  Lists all the files in the repo; clicking a file opens it in the editor or the viewer.
- Viewer
  Displays different file types with syntax highlighting.
- Editor
  Lets users edit notebooks using the nteract environment.
- Console
  Placed at the bottom of the page, where users are used to finding it, and used to log more information about Binder and notebook activity.
- Notification
  Notifies the user about connection failures or successes and other important messages.
- Loading
  Gives visual confirmation that something is processing or loading.
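How the FileExplorer decides between the Editor and the Viewer could be sketched like this. The rule used here (notebooks go to the Editor, everything else to the Viewer) and the helper name are assumptions that may change during development.

```javascript
// Pick the component that should open a file clicked in the FileExplorer.
function componentFor(path) {
  const name = path.split("/").pop();
  const ext = name.includes(".") ? name.split(".").pop().toLowerCase() : "";
  return ext === "ipynb" ? "Editor" : "Viewer";
}

console.log(componentFor("notebooks/analysis.ipynb"));
```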
The idea is to give users a smooth and playful experience on the application, so we can also introduce more components as per requirement during the development of this project.
Conclusion
This proposal is projected to take about three months to complete, and as mentioned above, some steps and details may change during development. I will write a blog post at the end of the GSoC period covering all the changes made along the way.
Acknowledgement
- Hero Image and “Understanding Nteract” by @fabric_8