✅ Jupyter Integration

Scimax VS Code provides comprehensive integration with Jupyter kernels, allowing you to execute source code blocks using Jupyter's powerful kernel infrastructure. This enables interactive computing with support for multiple programming languages, rich output (plots, images, HTML), and persistent sessions with state management.

This guide covers everything you need to know about using Jupyter kernels in Scimax VS Code org-mode documents.

✅ Overview

✅ What is Jupyter?

Jupyter is an open-source project that provides interactive computing environments across dozens of programming languages. At its core, Jupyter uses "kernels" - language-specific execution engines that run code and return results.

✅ What Jupyter Integration Provides

The Jupyter integration in Scimax VS Code brings the following capabilities:

  • Multi-language support - Execute Python, Julia, R, and 20+ other languages

  • Interactive sessions - Maintain kernel state across multiple code blocks

  • Rich output - Display plots, images, HTML, LaTeX, and other media types

  • Automatic image saving - Graphics automatically saved to .ob-jupyter/ directory

  • ZeroMQ protocol - Native communication with Jupyter kernels via ZeroMQ

  • Session management - Run multiple kernels simultaneously with named sessions

  • Kernel lifecycle control - Start, stop, restart, and interrupt kernels

  • VS Code integration - Kernel status in status bar, output channel logging

✅ Jupyter vs Native Executors

Scimax provides two ways to execute code blocks:

  1. Native executors - Direct execution via system commands (python, node, etc.)

  2. Jupyter executors - Execution via Jupyter kernels (jupyter-python, etc.)

✅ When to Use Jupyter

Use Jupyter kernels when you need:

  • Persistent sessions with shared state across blocks

  • Rich output like plots, images, or interactive visualizations

  • Kernel-specific features (IPython magics, R plots, Julia macros)

  • Code completion and introspection

  • Multiple parallel sessions

✅ When to Use Native Executors

Use native executors when you need:

  • Quick one-off script execution

  • Minimal dependencies

  • Shell integration (pipes, redirects)

  • System-level operations

✅ Installation and Setup

✅ Prerequisites

Before using Jupyter kernels, you need:

  1. Jupyter installed on your system

  2. ZeroMQ native module (automatically included with Scimax)

  3. Kernel specifications for your languages of interest

✅ Installing Jupyter

✅ Python Users

If you have Python installed, install Jupyter using pip:

# Install Jupyter and IPython kernel
pip install jupyter ipykernel

# Or using conda
conda install jupyter

✅ System Package Managers

On Linux systems, Jupyter may be available via package managers:

# Ubuntu/Debian
sudo apt install jupyter jupyter-core python3-ipykernel

# Fedora
sudo dnf install jupyter-core python3-ipykernel

# Arch Linux
sudo pacman -S jupyter jupyter-core python-ipykernel

✅ macOS with Homebrew

brew install jupyter

✅ Installing Kernels

✅ Python (IPython Kernel)

The IPython kernel is typically installed together with Jupyter, but you can also install it explicitly:

pip install ipykernel

✅ Julia

Install the IJulia kernel from Julia:

using Pkg
Pkg.add("IJulia")

✅ R

Install the IRkernel from R:

install.packages('IRkernel')
IRkernel::installspec()

✅ Other Languages

Jupyter supports many languages through community kernels:

| Language   | Kernel Name  | Installation                                            |
|------------+--------------+---------------------------------------------------------|
| JavaScript | ijavascript  | npm install -g ijavascript && ijsinstall                |
| TypeScript | tslab        | npm install -g tslab && tslab install                   |
| Ruby       | iruby        | gem install iruby && iruby register                     |
| Rust       | evcxr        | cargo install evcxr_jupyter && evcxr_jupyter --install  |
| Go         | gophernotes  | go install github.com/gopherdata/gophernotes@latest     |
| C++        | xeus-cling   | conda install -c conda-forge xeus-cling                 |
| Java       | IJava        | Download from https://github.com/SpencerPark/IJava      |
| Scala      | almond       | Follow instructions at https://almond.sh                |
| Haskell    | IHaskell     | stack install ihaskell && ihaskell install              |

✅ Verifying Installation

Check which kernels are installed:

jupyter kernelspec list
Available kernels:
  ir            /Users/jkitchin/Library/Jupyter/kernels/ir
  julia-1.12    /Users/jkitchin/Library/Jupyter/kernels/julia-1.12
  python3       /Users/jkitchin/Dropbox/uv/.venv/share/jupyter/kernels/python3

This will display all available kernels with their installation paths.

✅ Troubleshooting ZeroMQ

If you encounter ZeroMQ-related errors, the native module may need rebuilding for VS Code's Electron version. Scimax will display instructions if this is required.

The error message will look like:

Failed to load ZeroMQ native module. This usually means the module was
compiled for a different Node.js version.

Follow the on-screen instructions or consult the extension documentation.

✅ Using Jupyter Kernels

✅ Basic Execution

✅ Explicit Jupyter Syntax

To explicitly use a Jupyter kernel, prefix the language name with jupyter-:

import numpy as np
print(np.array([1, 2, 3, 4, 5]).mean())
3.0

The jupyter- prefix forces Babel to use the Jupyter executor for that language.
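For reference, a minimal sketch of the full block syntax with the explicit prefix and a session header (the session name demo is illustrative):

#+BEGIN_SRC jupyter-python :session demo
import numpy as np
print(np.array([1, 2, 3, 4, 5]).mean())
#+END_SRC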

✅ Supported Languages

The Jupyter executor recognizes these language prefixes:

| Org Language       | Kernel Language | Common Kernel Names |
|--------------------+-----------------+---------------------|
| jupyter-python     | python          | python3, python     |
| jupyter-julia      | julia           | julia-1.9, julia    |
| jupyter-r          | r               | ir                  |
| jupyter-ruby       | ruby            | ruby                |
| jupyter-rust       | rust            | rust, evcxr         |
| jupyter-go         | go              | gophernotes         |
| jupyter-c++        | c++             | xcpp, xeus-cling    |
| jupyter-java       | java            | java, ijava         |
| jupyter-scala      | scala           | scala, almond       |
| jupyter-haskell    | haskell         | haskell, ihaskell   |
| jupyter-javascript | javascript      | javascript, nodejs  |
| jupyter-typescript | typescript      | typescript, tslab   |

✅ Session Management

✅ Named Sessions

Use the :session header argument to maintain persistent kernel state:

import pandas as pd
data = pd.DataFrame({'x': [1, 2, 3], 'y': [4, 5, 6]})
print("Data loaded")
Data loaded
# The 'data' variable is still available from the previous block
print(data.describe())
x    y
count  3.0  3.0
mean   2.0  5.0
std    1.0  1.0
min    1.0  4.0
25%    1.5  4.5
50%    2.0  5.0
75%    2.5  5.5
max    3.0  6.0

All blocks with the same :session name share the same kernel and namespace.

✅ Default Session

If no session name is specified, blocks use a language-specific default session:

#+BEGIN_SRC jupyter-python :session
# Uses the default "python-default" session
x = 42
#+END_SRC

A later block with the same bare :session header reuses the default session, so x is still available:

print(x)

✅ Multiple Sessions

You can run multiple independent sessions simultaneously:

import time
print("Training model...")
time.sleep(5)  # Simulate training
model = "trained_model"
with open('data.csv', 'w') as f:
    f.write("id,value\n1,10\n2,20\n3,30\n")
Training model...
import pandas as pd
data = pd.read_csv('data.csv')
print(f"Loaded {len(data)} rows")
try:
    print(model)  # Raises a NameError since 'model' is not defined in this session
except Exception as e:
    print("model is not defined in this session")
Loaded 3 rows
model is not defined in this session

Each session runs in its own kernel with isolated state.
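A sketch of how the two blocks above would declare separate sessions in their headers (the session names train and analysis are illustrative):

#+BEGIN_SRC jupyter-python :session train
model = "trained_model"
#+END_SRC

#+BEGIN_SRC jupyter-python :session analysis
print(model)  # NameError: 'model' only exists in the "train" session
#+END_SRC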

✅ Rich Output and Visualization

✅ Image Output

Jupyter kernels can produce rich output including images. When a code block generates an image (matplotlib plot, Julia plot, etc.), it is automatically saved to the .ob-jupyter/ directory in your document's folder.

✅ Python Matplotlib Example

import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 2 * np.pi, 100)
plt.plot(x, np.sin(x), label='sin(x)')
plt.plot(x, np.cos(x), label='cos(x)')
plt.xlabel('x')
plt.ylabel('y')
plt.title('Trigonometric Functions')
plt.legend()
plt.grid(True)

The image is automatically linked in the results, and you'll see it rendered in VS Code's org-mode preview (by default, on hover).

✅ Julia Plots Example

import Pkg; Pkg.add("Plots")

using Plots
x = 0:0.1:2π
plot(x, [sin.(x) cos.(x)], label=["sin" "cos"],
     xlabel="x", ylabel="y", title="Trigonometric Functions")

✅ R Graphics Example

x <- seq(0, 2*pi, length.out=100)
plot(x, sin(x), type='l', col='blue', ylab='y', main='Sine Wave')
lines(x, cos(x), col='red')
legend('topright', c('sin(x)', 'cos(x)'), col=c('blue', 'red'), lty=1)

✅ Supported Output Types

The Jupyter integration automatically handles these MIME types:

  • image/png - PNG images (most common for plots)

  • image/svg+xml - SVG vector graphics

  • image/jpeg - JPEG images

  • application/pdf - PDF documents

  • text/html - HTML content (saved as .html for complex output)

  • text/plain - Plain text output

  • text/latex - LaTeX equations
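From a Python kernel you can emit several of these MIME types explicitly with IPython's display machinery; a minimal sketch (the session name demo is illustrative):

#+BEGIN_SRC jupyter-python :session demo
from IPython.display import HTML, Latex, display

# Each display() call sends a display_data message with the corresponding MIME type
display(HTML("<b>rendered as text/html</b>"))
display(Latex(r"$e^{i\pi} + 1 = 0$"))  # text/latex
#+END_SRC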

✅ The .ob-jupyter Directory

All Jupyter output files are saved to .ob-jupyter/ in your document's directory:

my-analysis.org
.ob-jupyter/
  ├── output-1705171200000-0.png
  ├── output-1705171200001-0.svg
  └── output-1705171200002-0.html

Files are named output-<timestamp>-<index>.<extension> to ensure uniqueness.

Consider adding .ob-jupyter/ to your .gitignore if you do not want to commit
generated outputs to version control.

✅ Header Arguments

Jupyter blocks support all standard Babel header arguments plus Jupyter-specific options.

✅ Common Header Arguments

✅ :session

Controls which kernel session to use:

:session mysession    # Code runs in the "mysession" kernel
:session              # Code runs in the default session
(no :session header)  # Code runs in a new temporary kernel (no session persistence)

✅ :results

Controls how results are collected and displayed:

print("Hello")
print("World")
Hello
World
2 + 2
4
import matplotlib.pyplot as plt
plt.plot([1, 2, 3], [1, 4, 9])
plt.show()

x = expensive_computation()

✅ :exports

Controls what gets exported to HTML/PDF/LaTeX:

print("This will show code AND output in exports")
This will show code AND output in exports
This will show ONLY output in exports
print("This will show ONLY code in exports")

✅ :dir

Set the working directory for code execution:

import os
print(os.getcwd())  # Prints the working directory
/Users/jkitchin/Dropbox/projects/scimax_vscode/.github

Or using an absolute path:

ls
workflows

✅ :var

Pass variables between blocks:

echo "Alice 30"
echo "Bob 25"
echo "Carol 35"
Alice 30
Bob 25
Carol 35
# The shell output is passed as a string
lines = data.strip().split('\n')
[line.split() for line in lines]
[['Alice', '30'], ['Bob', '25'], ['Carol', '35']]
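A sketch of how the two blocks above could be wired together with :var using a named source block (the block name people and the variable name data are illustrative; data matches the variable used in the Python block):

#+NAME: people
#+BEGIN_SRC bash :results output
echo "Alice 30"
echo "Bob 25"
echo "Carol 35"
#+END_SRC

#+BEGIN_SRC jupyter-python :var data=people
lines = data.strip().split('\n')
[line.split() for line in lines]
#+END_SRC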

✅ :file

Explicitly specify output file path (overrides auto-naming):

import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 20 * np.pi, 500)
y1 = np.cos(x) * np.exp(-.2 * x)
y2 = np.sin(x) * np.exp(-.2 * x)

plt.plot(y1, y2)
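A sketch of the block above with an explicit output path in its header (the session name plots and the filename spiral.png are illustrative):

#+BEGIN_SRC jupyter-python :session plots :file spiral.png
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 20 * np.pi, 500)
plt.plot(np.cos(x) * np.exp(-.2 * x), np.sin(x) * np.exp(-.2 * x))
#+END_SRC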

✅ Less Common Header Arguments

✅ :cache

Cache results to avoid re-execution:

import time

def compute_expensive_value():
    time.sleep(10)  # Slow operation
    return 42

result = compute_expensive_value()
print(result)
42

Results are cached based on code content. Changes to code invalidate the cache.
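A sketch of the header that enables caching for the block above (the session name analysis is illustrative):

#+BEGIN_SRC jupyter-python :session analysis :cache yes
result = compute_expensive_value()
print(result)
#+END_SRC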

✅ :async

Run block asynchronously (non-blocking):

import time
print("Starting...")
time.sleep(30)
print("Done!")
Starting...
Done!

Code blocks generally run in async mode in VS Code.

✅ :prologue / :epilogue

Add code before/after the block:

# numpy is already imported due to prologue
x = np.array([1, 2, 3])
print(x.mean())
# "Done" will be printed due to epilogue
2.0
Done
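A sketch of how the block above might declare its prologue and epilogue in the header (the header strings are illustrative but match the comments in the code):

#+BEGIN_SRC jupyter-python :prologue "import numpy as np" :epilogue "print('Done')"
x = np.array([1, 2, 3])
print(x.mean())
#+END_SRC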

✅ Kernel Management

✅ Starting Kernels

✅ Automatic Start

Kernels start automatically when you execute a Jupyter block. The first execution creates the kernel; subsequent blocks reuse it.

✅ Manual Start

You can manually start a kernel before executing code:

This command:

  1. Detects the language of the current block (or file)

  2. Finds an appropriate kernel

  3. Starts the kernel in a default session

✅ Selecting a Specific Kernel

This command:

  1. Lists all installed Jupyter kernels

  2. Allows you to select and start one

  3. Shows kernel display name, language, and installation path

✅ Viewing Running Kernels

This displays all active kernels with:

  • Session name

  • Language

  • Current state (idle, busy, starting)

  • Kernel ID

From this menu you can:

  • Restart a kernel

  • Interrupt a kernel

  • Shutdown a kernel

  • View kernel information

✅ Kernel States

Kernels can be in one of several states:

| State    | Description                   | Status Bar Icon |
|----------+-------------------------------+-----------------|
| starting | Kernel is launching           | $(loading~spin) |
| idle     | Kernel is ready and waiting   | $(check)        |
| busy     | Kernel is executing code      | $(sync~spin)    |
| dead     | Kernel has stopped or crashed | $(error)        |

The kernel state is shown in the VS Code status bar when a kernel is active.

✅ Restarting Kernels

Restarting a kernel:

  • Terminates the current kernel process

  • Starts a fresh kernel with the same spec

  • Clears all session state (variables, imports, etc.)

  • Maintains the same session name

Use restart when:

  • The kernel becomes unresponsive

  • You want to clear all state

  • You need to reload modified modules

  • Memory usage grows too large

✅ Interrupting Kernels

Interrupting sends a SIGINT (Ctrl+C) to the kernel:

  • Stops currently executing code

  • Preserves session state

  • Allows you to regain control of a long-running computation

Use interrupt when:

  • Code is taking longer than expected

  • You notice an error in the running code

  • You want to stop an infinite loop

✅ Stopping Kernels

Stopping a kernel:

  • Gracefully terminates the kernel process

  • Frees system resources

  • Removes the session (state is lost)

Shuts down all running kernels simultaneously.

✅ Changing Kernels

This allows you to:

  1. Select a new kernel spec

  2. Choose which session to replace

  3. Shutdown the old kernel

  4. Start the new kernel in the same session

Use this to switch languages or kernel versions without changing session names.

✅ Advanced Usage

✅ Mixing Languages

You can use multiple Jupyter kernels in a single document:

import json
data = {'values': [1, 2, 3, 4, 5]}
with open('/tmp/data.json', 'w') as f:
    json.dump(data, f)
print("Data saved to /tmp/data.json")
Data saved to /tmp/data.json
import Pkg; Pkg.add("JSON")
using JSON
data = JSON.parsefile("/tmp/data.json")
result = sum(data["values"])
println("Sum: $result")
Sum: 15
library(jsonlite)
data <- fromJSON("/tmp/data.json")
barplot(data$values, main="Values", col="steelblue")

Each language runs in its own kernel with isolated state. Use files or other IPC mechanisms to pass data between languages.

✅ IPython Magics

When using Python kernels, you have access to IPython magic commands:

%timeit sum(range(1000))
5.23 μs ± 31.3 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
%%timeit
total = 0
for i in range(1000):
    total += i
11.9 μs ± 85.7 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
%run my_script.py
! uv pip list | grep numpy
Using Python 3.12.11 environment at: /Users/jkitchin/Dropbox/uv/.venv
numpy                                    2.3.5
numpyro                                  0.19.0

Common magic commands:

  • %timeit / %%timeit - Time execution

  • %run - Run external Python file

  • %load - Load code from file into cell

  • %who / %whos - List variables

  • %matplotlib - Set matplotlib backend

  • %load_ext - Load IPython extensions

  • !command - Run shell command

✅ Kernel Introspection

Jupyter kernels support code introspection (when exposed via VS Code APIs):

  • Code completion - Auto-complete variable names, functions, methods

  • Documentation - Hover to see docstrings and signatures

  • Inspection - View source code and help text

The level of support depends on the specific kernel implementation.

✅ Custom Kernel Discovery

Jupyter looks for kernels in standard locations:

  • ~/.local/share/jupyter/kernels

  • /usr/local/share/jupyter/kernels

  • /usr/share/jupyter/kernels

  • %APPDATA%\jupyter\kernels

  • %PROGRAMDATA%\jupyter\kernels

  • JUPYTER_DATA_DIR - Override the default data directory

  • JUPYTER_PATH - Additional search paths (colon-separated)

To make a custom kernel available, place its kernel.json in one of these directories.
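For reference, a minimal kernel.json for a Python kernel looks roughly like this (the display name and interpreter path will vary on your system):

{
  "argv": ["python", "-m", "ipykernel_launcher", "-f", "{connection_file}"],
  "display_name": "Python 3",
  "language": "python"
}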

✅ Performance Considerations

✅ Kernel Startup Time

First execution in a new session incurs kernel startup overhead:

  • Python: ~1-3 seconds

  • Julia: ~5-15 seconds (due to compilation)

  • R: ~1-2 seconds

Keep sessions running to avoid repeated startup costs.

✅ Memory Usage

Each kernel is a separate process with its own memory:

  • Monitor memory usage with system tools

  • Restart kernels to free memory

  • Use del (Python), x = nothing (Julia), or rm() (R) to release large objects

✅ Image Output

Large images increase document size:

  • Matplotlib: use plt.savefig(..., dpi=..., bbox_inches='tight') to control output size (see the sketch below)

  • Consider using SVG for vector graphics (smaller, scalable)

  • Use :results silent for intermediate plots
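A minimal sketch of saving a figure with explicit size control from a Python block (the session name plots and the filename small-plot.png are illustrative):

#+BEGIN_SRC jupyter-python :session plots
import matplotlib.pyplot as plt

fig, ax = plt.subplots(figsize=(4, 3))
ax.plot([1, 2, 3], [1, 4, 9])
# A lower dpi and a tight bounding box keep the saved file small
fig.savefig("small-plot.png", dpi=100, bbox_inches="tight")
#+END_SRC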

✅ Troubleshooting

✅ Kernel Not Found

  1. Verify the kernel is installed with jupyter kernelspec list

  2. Install the kernel if missing (see Installation section)

  3. Check that jupyter is on your PATH

  4. If using a virtual environment, ensure it's activated

✅ Kernel Dies Immediately

  1. Check the Jupyter output channel for error messages:

  2. Test the kernel outside VS Code:

  3. Check kernel logs in Jupyter runtime directory:

  4. Common issues:

✅ ZeroMQ Errors

  1. The extension will display instructions to rebuild the module

  2. This is typically only needed when VS Code's Electron version changes

  3. Follow the on-screen instructions or rebuild manually:

  4. Restart VS Code after rebuilding

✅ Connection Timeout

  1. Increase timeout if using a slow kernel (Julia)

  2. Check firewall settings (kernels use localhost ports)

  3. Verify ports are available (not blocked by other processes)

  4. Check system resources (CPU, memory)

✅ Output Not Appearing

Possible causes:

  1. :results silent header argument

  2. :exports none or :exports code

  3. Code doesn't produce output (no print/return)

  4. Error occurred but was suppressed

Solutions:

  1. Remove :results silent

  2. Check the :exports setting

  3. Add explicit print() statements

  4. Check stderr in the output channel

✅ Images Not Saving

  1. Ensure :results file is set (or omitted; file output is the default for graphics)

  2. Check .ob-jupyter/ directory exists and is writable

  3. Verify the plotting library is properly configured:

  4. Check that your code actually generates graphics:

✅ Kernel State Confusion

  1. Verify you're using the same :session name

  2. Check execution order (blocks may be out of sequence)

  3. Restart the kernel to clear state

  4. Use explicit session names to avoid confusion

✅ Multiple Kernel Versions

  1. List all kernel specs:

  2. Use specific kernel names in kernel.json (e.g., julia-1.9)

  3. Remove unwanted kernel specs:

✅ Session Conflicts

Seeing variables from one block appear in another is by design: blocks with the same :session name share state.

  1. Use different session names for independent computations

  2. Use no :session for isolated executions

  3. Restart the kernel to clear shared state

✅ Tips and Best Practices

✅ Session Naming

  • Use descriptive session names: :session data-preprocessing

  • Group related blocks in the same session

  • Use separate sessions for independent analyses

  • Omit :session for quick one-off computations

✅ Image Management

  • Add .ob-jupyter/ to .gitignore if outputs can be regenerated

  • Use :file custom-name.png for important plots you want to keep

  • Periodically clean up .ob-jupyter/ to save disk space

  • Use SVG format for publication-quality vector graphics

✅ Performance Optimization

  • Keep kernels running during active development

  • Shutdown unused kernels to free memory

  • Use :cache yes for expensive computations

  • Consider :results silent for intermediate steps

  • Break large computations into smaller blocks

✅ Reproducibility

  • Document kernel versions and dependencies

  • Use :session to ensure execution order

  • Include setup blocks with :exports none

  • Consider :prologue for common imports

  • Test the full document with a fresh kernel

✅ Error Handling

  • Check the "Jupyter Kernels" output channel for detailed errors

  • Use explicit print() statements for debugging

  • Interrupt kernels rather than restarting when possible

  • Add try/except blocks for robust error handling

✅ Literate Programming

  • Explain code intent before each block

  • Use :exports code for implementation details

  • Use :exports results to show only output

  • Group related blocks under headings

  • Use :tangle to extract code to files

✅ Mixed Language Workflows

  • Use JSON/CSV files to pass data between languages

  • Keep each language in its own session

  • Document data formats and conventions

  • Consider temporary files for large data transfers

  • Use each language for its strengths

✅ Examples

✅ Complete Data Analysis Example

Here's a complete example showing a data analysis workflow:

,* Data Analysis: Sales Trends

This analysis examines monthly sales data to identify trends and seasonality.

,** Data Loading

First, load and clean the data:

,#+BEGIN_SRC jupyter-python :session analysis :exports both
import pandas as pd
import numpy as np

# Generate sample sales data
np.random.seed(42)
dates = pd.date_range('2023-01-01', '2023-12-31', freq='D')
sales = 1000 + np.random.normal(0, 100, len(dates)) + np.sin(np.arange(len(dates)) * 2 * np.pi / 365) * 200

df = pd.DataFrame({'date': dates, 'sales': sales})
df['month'] = df['date'].dt.to_period('M')

print(f"Loaded {len(df)} days of sales data")
print(f"Date range: {df['date'].min()} to {df['date'].max()}")
,#+END_SRC

,** Summary Statistics

,#+BEGIN_SRC jupyter-python :session analysis :exports results
print(df['sales'].describe())
,#+END_SRC

,** Visualization

,#+BEGIN_SRC jupyter-python :session analysis :results file :exports results
import matplotlib.pyplot as plt

fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(12, 8))

# Daily sales
ax1.plot(df['date'], df['sales'], alpha=0.6)
ax1.set_title('Daily Sales')
ax1.set_xlabel('Date')
ax1.set_ylabel('Sales ($)')
ax1.grid(True, alpha=0.3)

# Monthly average
monthly = df.groupby('month')['sales'].mean()
ax2.bar(range(len(monthly)), monthly.values)
ax2.set_title('Average Monthly Sales')
ax2.set_xlabel('Month')
ax2.set_ylabel('Average Sales ($)')
ax2.set_xticks(range(len(monthly)))
ax2.set_xticklabels([str(m) for m in monthly.index], rotation=45)
ax2.grid(True, alpha=0.3)

plt.tight_layout()
plt.show()
,#+END_SRC

,** Findings

The analysis reveals clear seasonal patterns in sales data with peak performance
during summer months.

✅ Julia Scientific Computing Example

Numerical Methods: Newton's Method

# Define function and derivative
f(x) = x^2 - 2
fp(x) = 2x

# Newton's method implementation
function newton(f, df, x0; tol=1e-6, maxiter=100)
    x = x0
    for i in 1:maxiter
        fx = f(x)
        if abs(fx) < tol
            return x, i
        end
        x = x - fx / df(x)
    end
    return x, maxiter
end

# Find sqrt(2)
root, iters = newton(f, fp, 1.0)
println("Root: $root")
println("Iterations: $iters")
println("Error: $(abs(root - sqrt(2)))")
Root: 1.4142135623746899
Iterations: 5
Error: 1.5947243525715749e-12

✅ R Statistical Analysis Example

Statistical Analysis: t-test

# Generate two samples
set.seed(123)
group_a <- rnorm(50, mean=100, sd=15)
group_b <- rnorm(50, mean=105, sd=15)

# Perform t-test
result <- t.test(group_a, group_b)
print(result)
Welch Two Sample t-test

data:  group_a and group_b
t = -2.4316, df = 97.951, p-value = 0.01685
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -12.131728  -1.228414
sample estimates:
mean of x mean of y 
 100.5161  107.1961

✅ Appendix: Jupyter Protocol

The Jupyter integration uses the Jupyter Messaging Protocol (version 5.3) over ZeroMQ sockets. This provides:

  • Shell channel - Execute requests and replies

  • IOPub channel - Output streams, display data, status updates

  • Stdin channel - Input requests (not currently used)

  • Control channel - Interrupt and shutdown requests

  • Heartbeat channel - Kernel health monitoring (not currently used)

Message signing uses HMAC-SHA256 to ensure message integrity.
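A sketch of how a message signature is computed per the protocol, assuming the key from the connection file and the four serialized message parts (header, parent header, metadata, content):

#+BEGIN_SRC python
import hmac
import hashlib

def sign(key: bytes, header: bytes, parent: bytes, metadata: bytes, content: bytes) -> str:
    # HMAC-SHA256 over the concatenation of the serialized message parts,
    # as specified by the Jupyter messaging protocol
    h = hmac.new(key, digestmod=hashlib.sha256)
    for part in (header, parent, metadata, content):
        h.update(part)
    return h.hexdigest()

print(sign(b"secret-key-for-hmac", b"{}", b"{}", b"{}", b"{}"))
#+END_SRC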

✅ Connection Files

Kernels are started with connection files containing:

{
  "ip": "127.0.0.1",
  "transport": "tcp",
  "shell_port": 12345,
  "iopub_port": 12346,
  "stdin_port": 12347,
  "control_port": 12348,
  "hb_port": 12349,
  "key": "secret-key-for-hmac",
  "signature_scheme": "hmac-sha256"
}

These files are stored in the Jupyter runtime directory and cleaned up when kernels stop.
