KNIME Python Integration Guide

This guide describes how to set up and use the KNIME Python Integration in KNIME Analytics Platform with its two nodes: Python Script node and Python View node.

The Python Script (Labs) node was introduced in the v4.5 release of KNIME Analytics Platform. Since the v4.7 release, it is the current Python Script node described in this guide.

The KNIME Python Integration works with Python versions 3.9 to 3.11 and comes with a bundled Python environment so that you can start right away. This lets you use the nodes without installing, configuring, or even knowing about environments. The bundled Python environment comes with these packages.

To start right away, drag and drop the KNIME Python Integration extension from the KNIME Hub into the workbench to install it, or install it manually via File → Install KNIME Extensions…. Then proceed to Using the Python nodes.

The section Using the Python nodes explains how to configure the node dialogs, how to work with data coming into and going out of the nodes, how to work with batches, and how to use the Python Script node with scripts from older Python nodes. It also covers the use case of Jupyter notebooks and references further examples.

If you need packages that are not included in the bundled environment, you need to set up your own environment. The section Configure the Python Environment explores the different options for setting up and changing environments.

Before the v4.7 release, this extension was in labs and the KNIME Python Integration (legacy) was the current Python Integration. For anything related to the legacy nodes of the former KNIME Python Integration, please refer to the Python Integration guide of KNIME Analytics Platform v4.6.

The advantages of the current Python Script node and the Python View node compared to the legacy nodes are significantly improved performance and data transfer between Python processes and KNIME Analytics Platform thanks to Apache Arrow, a bundled environment to start right away, a unified API via the knime.scripting.io module, conversion support to and from both Pandas DataFrames and PyArrow Tables, and support for arbitrarily large data sets by using batches. If you need Python 2 support, you will also need to use the KNIME Python Integration (legacy).

To achieve the biggest possible performance gains, we recommend configuring your workflows to use the Columnar Backend. Right-click a workflow in KNIME Explorer, select Configure…, then choose the Columnar Backend option under Selected Table Backend. Additional information about table backends can be found here.

This chapter guides you through the configuration of the script dialog and the number of ports, followed by usage examples. These examples cover accessing input data, table conversion, and using batches for data larger than RAM. The chapter then explains how to port scripts from the legacy Python nodes to this extension. After that, the additional features of the Python View node are explained. The chapter concludes with the use case of loading and accessing Jupyter notebooks.

Script Editor

Your primary area for code development is the Script Editor. It comes with the convenience of auto-completion to expedite your coding process. Additionally, hovering over functions or methods reveals tooltips, providing usage guidance.

Inputs/Outputs (Left Panel)

Displayed here are the input and output variables accessible to your node. You can easily incorporate these into your script by dragging them from the panel into the Script Editor.

Ask K-AI

Tap into AI for code assistance. Input a prompt in the "Ask K-AI" box, and our AI model will suggest code relevant to your prompt. Inspect the generated code and, if it meets your requirements, integrate it into your script.

Execution Controls ("Run all", "Run selected lines")

The "Run all" button allows for the execution of your entire script in a new Python process, which remains accessible post-execution. To run a specific segment of your code, select the desired lines and click "Run selected lines," executing them in the active Python process.

Temporary Values

Post-execution, this panel lists the local variables defined in your script. It’s not just for show; you can interact with these variables by clicking on them, prompting their values to be printed in the console. This interactive feature is particularly useful for quick variable inspections and debugging.

Console

The console displays the real-time standard output from your Python session, including print statements and other script outputs. To start afresh or declutter the console, use the trash icon button situated at the top right.

Execution Status

This section provides feedback on the script’s execution process. It indicates the status of the last script run, allowing you to confirm that the script has executed as intended or to identify if there are any actions needed to address script issues.

Output Preview

The Output Preview panel is only visible in the dialog of the "Python View" node and shows the output view after script execution. This interactive preview is updated on the fly whenever the output view is updated by the interactive Python session.

The "Ask K-AI" feature within the KNIME Python Scripting Node is an advanced AI-assisted code generation tool. When activated, you can input prompts specifying the intended functionality of the code. The AI assistant has contextual awareness of the KNIME Python API, the input data’s structure, and the current script content in the editor.

Once the assistant generates the code, it is presented to you in a diff-editor format, which highlights the differences between your current code and the new suggestion. You then have the option to review these suggestions and choose whether to accept them into your script or discard them, providing a high degree of control over the changes made to your code.

Upon utilizing this service, be aware that the current code from the editor, the input data’s schema, and the prompt are sent over the internet to the configured KNIME Hub and OpenAI, which is a consideration for data privacy. This transmission is necessary for the AI to tailor code suggestions accurately to your script’s context and the data you are working with.

When you create a new instance of the Python Script node, the code editor already contains starter code that imports knime.scripting.io as knio. The content shown in the input, output, and flow variable panes can be accessed via this knime.scripting.io module.

If the package knime is installed via pip in the environment used for the Python Script node, accessing the knime.scripting.io module will fail with the error "No module named 'knime.scripting'; 'knime' is not a package". In that case, run pip uninstall knime in your Python environment.
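To check for this conflict before opening the node, a small diagnostic can be run with the Python interpreter of the environment in question. The following is a minimal sketch using only the standard library; the function name find_conflicting_knime_package is hypothetical, and the sketch assumes the conflicting package is registered under the distribution name knime.

```python
from importlib import metadata


def find_conflicting_knime_package():
    """Return the version of a pip-installed 'knime' distribution, or None.

    Hypothetical helper: assumes the conflicting package is registered
    under the distribution name 'knime'.
    """
    try:
        return metadata.version("knime")
    except metadata.PackageNotFoundError:
        return None


conflict = find_conflicting_knime_package()
if conflict is None:
    print("No pip-installed 'knime' distribution found.")
else:
    print(f"Found 'knime' {conflict}; consider running: pip uninstall knime")
```

If a version is reported, uninstalling the package as described above should resolve the import error.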

knio.output_images[i] to output images, which must be either a string describing an SVG image or a byte array encoding a PNG image,

where i is the index of the corresponding table/object/image ( 0 for the first input/output port, 1 for the second input/output port, and so on).
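As an illustration of the two accepted image forms, the sketch below classifies a payload as an SVG string or PNG-encoded bytes before it is handed to the node. The helper image_kind and the constant PNG_SIGNATURE are hypothetical names that are not part of the KNIME API, and the SVG check is a simple substring test rather than a full validation.

```python
# First eight bytes of every PNG file.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def image_kind(payload):
    """Classify a payload as 'svg' or 'png'; raise ValueError otherwise.

    Hypothetical helper, not part of the KNIME API.
    """
    if isinstance(payload, str) and "<svg" in payload:
        return "svg"
    if isinstance(payload, (bytes, bytearray)) and bytes(payload[:8]) == PNG_SIGNATURE:
        return "png"
    raise ValueError("expected an SVG string or PNG-encoded bytes")


# A minimal SVG document described as a string.
svg = (
    '<svg xmlns="http://www.w3.org/2000/svg" width="40" height="40">'
    '<circle cx="20" cy="20" r="15" fill="steelblue"/></svg>'
)
print(image_kind(svg))
```

Inside a Python Script node with an image output port, such a string would then be assigned to knio.output_images[0].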

The knime.scripting.io module provides a simple way of accessing the input data as a Pandas DataFrame or PyArrow Table . This can prove quite useful since the two data representations and corresponding libraries provide a different set of tools that might be applicable to different use-cases.
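The following sketch shows how the two representations might be used side by side. It assumes the to_pandas, to_pyarrow, Table.from_pandas, and Table.from_pyarrow calls of the knime.scripting.io module. The function copy_first_table is a hypothetical helper, and the module is passed in as a parameter only so that the sketch can be defined outside of KNIME.

```python
def copy_first_table(knio, use_pyarrow=False):
    """Read input table 0 in the chosen representation and write it to output 0.

    `knio` is expected to be the knime.scripting.io module (assumption:
    the conversion methods below are available on it).
    """
    if use_pyarrow:
        data = knio.input_tables[0].to_pyarrow()  # PyArrow Table
        knio.output_tables[0] = knio.Table.from_pyarrow(data)
    else:
        data = knio.input_tables[0].to_pandas()  # Pandas DataFrame
        knio.output_tables[0] = knio.Table.from_pandas(data)
    return data
```

Inside a Python Script node, this would be called as copy_first_table(knio) after the starter import of knime.scripting.io as knio.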

First, you need to initialise an instance of a table to which the batches will be written after being processed:

processed_table = knio.BatchOutputTable.create()
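A complete batch loop might then look like the sketch below. It assumes that the input table offers a batches() iterator, that each batch can be converted with to_pandas(), and that the output table accepts processed batches through append(). The function process_in_batches and its transform parameter are hypothetical, and knio is passed in as a parameter only so that the sketch can be defined outside of KNIME.

```python
def process_in_batches(knio, transform):
    """Apply `transform` to every batch of input table 0 and collect the results.

    `knio` is expected to be the knime.scripting.io module; `transform`
    is a hypothetical callable that maps one batch to its processed form.
    """
    processed_table = knio.BatchOutputTable.create()
    for batch in knio.input_tables[0].batches():
        input_batch = batch.to_pandas()  # only this batch is held in memory
        processed_table.append(transform(input_batch))
    knio.output_tables[0] = processed_table
    return processed_table
```

Because each batch is converted, transformed, and appended before the next one is read, the whole table never has to fit into memory at once.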

Find debug information

Detailed information helps in understanding issues. Relevant information can be obtained in the following ways.