Notebook tools such as Jupyter Notebook have become a convenient and flexible way to do data analysis and machine learning development, but Jupyter Notebook lacks native support for multiple languages and is weak at managing dependencies within the notebook and at data visualization. Netflix recently open sourced Polynote, its data analysis and machine learning development tool. Polynote lets multiple languages run in a single notebook and adds many new features that are worth trying.
When it comes to development tools in data science, Jupyter is undoubtedly one of the best known. It is flexible and efficient, and well suited to development, debugging, sharing, and teaching. Netflix has recently entered this space as well by open sourcing a notebook program called Polynote. Like Jupyter, Polynote can be used for development work and supports multiple programming languages, including Python.
According to Netflix's Medium article, Polynote was developed to give data scientists and machine learning researchers a notebook environment that integrates freely and seamlessly with Netflix's own JVM-based machine learning platform. This platform is largely written in Scala, along with some Python-based machine learning and visualization libraries. Polynote was previously used only by Netflix's internal teams, and Netflix has now open sourced it to promote related research.
Five features of Polynote
Support multiple languages
Unlike Jupyter Notebook, Polynote natively supports programming in multiple languages. In addition to first-class support for Scala, Polynote supports mixing languages in a single notebook, including Scala, Python, SQL, and Vega, with autocompletion for these languages.
Users can write each code block in a different language. Values are shared between code blocks by name: a block can read a variable defined by an earlier block and define new variables for later blocks, regardless of the language either block is written in. This lets users pick the most appropriate tool in each language as needed.
Polynote supports multiple languages running together.
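The name-based sharing described above can be pictured with a small toy model. This is only a sketch under stated assumptions, not Polynote's implementation: the `Cell` tuple, the `run_cells` function, and the language labels are hypothetical, and every block body here is an ordinary Python function standing in for code in another language.

```python
# Toy model of code blocks sharing values by name.
# NOT Polynote's implementation: the language labels are only labels, and
# every block body is a plain Python function (hypothetical stand-ins).
from typing import Any, Callable, Dict, List, Tuple

# (language label, input names, output name, body)
Cell = Tuple[str, List[str], str, Callable[..., Any]]


def run_cells(cells: List[Cell]) -> Dict[str, Any]:
    """Run blocks in order, passing values through one shared symbol table."""
    shared: Dict[str, Any] = {}
    for _language, inputs, output, body in cells:
        # Inputs are looked up by name, whatever language produced them.
        args = [shared[name] for name in inputs]
        shared[output] = body(*args)
    return shared


cells: List[Cell] = [
    ("scala", [], "prices", lambda: [3.0, 4.5, 6.0]),
    ("python", ["prices"], "total", lambda prices: sum(prices)),
    ("sql", ["total"], "report", lambda total: f"total={total:.2f}"),
]

state = run_cells(cells)
print(state["report"])  # prints "total=13.50"
```

The point of the sketch is that a block only needs to know the name of a value, not which language created it.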
As in Jupyter Notebook, text blocks can be inserted between code blocks, and Polynote also makes it easy to insert LaTeX formulas.
In addition, text blocks offer the commonly used text editing functions.
When running, the running code blocks and lines of code will be highlighted for easy viewing by developers.
As shown, Polynote displays the currently running code blocks and code as well as the time it takes to complete a task when it runs.
Dependency and configuration management
Polynote supports the management of notebook dependencies and configurations, which can avoid many runtime problems.
As shown in the figure above, Polynote's configuration and dependency management interface is similar to that of advanced IDEs such as PyCharm. You can change the version and installation method of each dependency yourself. Unlike Jupyter Notebook, these configurations have no external dependencies.
Polynote integrates with two very well-known data visualization tools, Vega and Matplotlib. Polynote also has native support for data exploration, including data table views, table inspection tools, chart building tools, and support for Vega.
There is a button at the data position to display the chart.
Build charts with tools.
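For readers who have not seen Vega before, the following sketch builds a minimal Vega bar chart specification as a Python dictionary and serializes it to JSON. The data values and field names (`category`, `amount`) are made up for illustration, and how such a spec is entered in a Polynote block is not shown here; the sketch only illustrates the general shape of a Vega spec.

```python
import json

# Hypothetical sample data; the field names are chosen only for illustration.
rows = [
    {"category": "A", "amount": 28},
    {"category": "B", "amount": 55},
    {"category": "C", "amount": 43},
]

# A minimal Vega (v5) bar chart: one dataset, two scales, two axes, one mark.
spec = {
    "$schema": "https://vega.github.io/schema/vega/v5.json",
    "width": 300,
    "height": 150,
    "data": [{"name": "table", "values": rows}],
    "scales": [
        {
            "name": "xscale",
            "type": "band",
            "domain": {"data": "table", "field": "category"},
            "range": "width",
            "padding": 0.05,
        },
        {
            "name": "yscale",
            "domain": {"data": "table", "field": "amount"},
            "nice": True,
            "range": "height",
        },
    ],
    "axes": [
        {"orient": "bottom", "scale": "xscale"},
        {"orient": "left", "scale": "yscale"},
    ],
    "marks": [
        {
            "type": "rect",
            "from": {"data": "table"},
            "encode": {
                "enter": {
                    "x": {"scale": "xscale", "field": "category"},
                    "width": {"scale": "xscale", "band": 1},
                    "y": {"scale": "yscale", "field": "amount"},
                    "y2": {"scale": "yscale", "value": 0},
                }
            },
        }
    ],
}

spec_json = json.dumps(spec, indent=2)
print(spec_json)
```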
In addition, Polynote also has some interesting small features, such as recording the location of code blocks, so that the code base can run in order to ensure reproducibility.
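One way to picture position-based reproducibility is a toy model in which the state a code block sees depends only on the blocks above it, not on the order in which they happened to be run. The `visible_state` helper below is hypothetical and is not how Polynote is implemented; it simply replays the earlier blocks in order.

```python
# Toy illustration of position-based state (hypothetical model, not
# Polynote's code): a block sees only what the blocks above it defined.
from typing import Dict, List


def visible_state(cell_sources: List[str], index: int) -> Dict[str, object]:
    """Replay blocks 0..index-1 in order and return the names they define."""
    namespace: Dict[str, object] = {}
    for source in cell_sources[:index]:
        # Each block reads from and writes to the same namespace.
        exec(source, {}, namespace)
    return namespace


cells = ["x = 1", "y = x + 1", "x = 10"]

# The block at position 2 always sees x == 1, even though a later block
# reassigns x.
print(visible_state(cells, 2))  # prints "{'x': 1, 'y': 2}"
```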
How to install
Currently, Polynote is a notebook program, so users can use it locally or set up a web service.
First, the user needs to download the JVM-based server application (used to provide the network service). To use it in a local environment, find the latest version on the release page (https://github.com/polynote/polynote/releases) and download the distribution archive.
tar -zxvpf polynote-dist.tar.gz
cd polynote
After downloading, go to the directory and then prepare for installation.
The preparations currently include the following:
Polynote has only been tested on macOS and Linux, and only with the Chrome browser, so the authors welcome timely feedback from users on other setups.
If users need Spark support, they need to install Apache Spark.
Users need to use Python 3, not Python 2.
There are some other dependencies, which can be installed as follows:
pip3 install jep jedi pyspark virtualenv
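Before starting the server, it can help to confirm that the Python side of these prerequisites is in place. The script below is a minimal sketch: the `check_environment` function is hypothetical, and it assumes each package's import name matches the name used in the pip3 command above. It does not check for Apache Spark or the JVM.

```python
import importlib.util
import sys
from typing import Dict

# Packages named in the pip3 command above. The import names are assumed
# to match the package names.
REQUIRED_MODULES = ["jep", "jedi", "pyspark", "virtualenv"]


def check_environment() -> Dict[str, bool]:
    """Report whether Python 3 is in use and each module can be found."""
    results = {"python3": sys.version_info.major >= 3}
    for name in REQUIRED_MODULES:
        # find_spec returns None when a top-level module is not installed.
        results[name] = importlib.util.find_spec(name) is not None
    return results


if __name__ == "__main__":
    for item, ok in check_environment().items():
        print(f"{item}: {'ok' if ok else 'MISSING'}")
```

Any item reported as MISSING can be installed with the pip3 command shown above.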
To configure, users need to copy the configuration template file to a new configuration file and uncomment the settings that need to be modified.
Run the following file to start the server: