DevLog: Using Claude Code to fix a bug in open source software

Full Claude Code session

Nov 10, 2025

A screenshot of the open source software application, DocWrangler — DocWranger

What’s this?

It turns out that many use cases for AI involve data processing: looking for patterns in restaurant orders; summarizing pain points from thousands of customer service call transcripts; finding the most common drug modalities from clinical trials; analyzing a corpus of EHR records for medication side-effects…the list goes on and on.

A star in the applied AI field, Shreya Shankar, realized this and created some extremely powerful open source tools for building these types of workflows

DocETL - An engine for LLM-based data processing
DocWrangler - An IDE for building these workflows visually

One reason these tools are great is that Shreya and her co-developers have lots of experience building data workflows for AI. They know all the little, annoying places where friction creeps in. Things like, annotating LLM responses and using the notes to improve prompts, or offering Python components where an LLM may be overkill. The tools just feel smart.

But, as with any open source project, there can be bugs. This post is my verbatim chat session with Claude Code to squash a bug I found when trying to run the application on my laptop. I should probably have waited to post until the PR has been accepted but whatever.

To configure this session, I forked the DocETL repo and opened a session of Claude Code in the directory. This way, it had access to the codebase so I could just paste in errors and it could go to work.

The entire process of bug detected to pull request took a little over an hour.

Complete Claude Code Chat

USER:

Please explain this error in the context of your codebase. Don’t change anything:

INFO: 192.168.65.1:46560 - “POST /fs/save-documents HTTP/1.1” 200 OK

[12:35:44] INFO detected formats: [<InputFormat.PDF: ‘pdf’>] document.py:361

INFO Going to convert document batch... document_converter.py:314

INFO Initializing pipeline for StandardPdfPipeline document_converter.py:359

with options hash

4a89bda5a26a77cb2b7befc50a30d131

INFO Accelerator device: ‘cpu’ accelerator_utils.py:82

INFO: 192.168.65.1:46560 - “POST /api/convert-documents?use_docetl_server=true HTTP/1.1” 500 Internal Server Error

ERROR: Exception in ASGI application

Traceback (most recent call last):

File “/app/.venv/lib/python3.11/site-packages/uvicorn/protocols/http/h11_impl.py”, line 403, in run_asgi

result = await app( # type: ignore[func-returns-value]

^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

File “/app/.venv/lib/python3.11/site-packages/uvicorn/middleware/proxy_headers.py”, line 60, in __call__

return await self.app(scope, receive, send)

^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

File “/app/.venv/lib/python3.11/site-packages/fastapi/applications.py”, line 1134, in __call__

await super().__call__(scope, receive, send)

File “/app/.venv/lib/python3.11/site-packages/starlette/applications.py”, line 113, in __call__

await self.middleware_stack(scope, receive, send)

File “/app/.venv/lib/python3.11/site-packages/starlette/middleware/errors.py”, line 186, in __call__

raise exc

File “/app/.venv/lib/python3.11/site-packages/starlette/middleware/errors.py”, line 164, in __call__

await self.app(scope, receive, _send)

File “/app/.venv/lib/python3.11/site-packages/starlette/middleware/cors.py”, line 93, in __call__

await self.simple_response(scope, receive, send, request_headers=headers)

File “/app/.venv/lib/python3.11/site-packages/starlette/middleware/cors.py”, line 144, in simple_response

await self.app(scope, receive, send)

File “/app/.venv/lib/python3.11/site-packages/starlette/middleware/exceptions.py”, line 63, in __call__

await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)

File “/app/.venv/lib/python3.11/site-packages/starlette/_exception_handler.py”, line 53, in wrapped_app

raise exc

File “/app/.venv/lib/python3.11/site-packages/starlette/_exception_handler.py”, line 42, in wrapped_app

await app(scope, receive, sender)

File “/app/.venv/lib/python3.11/site-packages/fastapi/middleware/asyncexitstack.py”, line 18, in __call__

await self.app(scope, receive, send)

File “/app/.venv/lib/python3.11/site-packages/starlette/routing.py”, line 716, in __call__

await self.middleware_stack(scope, receive, send)

File “/app/.venv/lib/python3.11/site-packages/starlette/routing.py”, line 736, in app

await route.handle(scope, receive, send)

File “/app/.venv/lib/python3.11/site-packages/starlette/routing.py”, line 290, in handle

await self.app(scope, receive, send)

File “/app/.venv/lib/python3.11/site-packages/fastapi/routing.py”, line 125, in app

await wrap_app_handling_exceptions(app, request)(scope, receive, send)

File “/app/.venv/lib/python3.11/site-packages/starlette/_exception_handler.py”, line 53, in wrapped_app

raise exc

File “/app/.venv/lib/python3.11/site-packages/starlette/_exception_handler.py”, line 42, in wrapped_app

await app(scope, receive, sender)

File “/app/.venv/lib/python3.11/site-packages/fastapi/routing.py”, line 111, in app

response = await f(request)

^^^^^^^^^^^^^^^^

File “/app/.venv/lib/python3.11/site-packages/fastapi/routing.py”, line 391, in app

raw_response = await run_endpoint_function(

^^^^^^^^^^^^^^^^^^^^^^^^^^^^

File “/app/.venv/lib/python3.11/site-packages/fastapi/routing.py”, line 290, in run_endpoint_function

return await dependant.call(**values)

^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

File “/app/server/app/routes/convert.py”, line 296, in convert_documents

conv_result = next(docling_iter)

^^^^^^^^^^^^^^^^^^

File “/app/.venv/lib/python3.11/site-packages/docling/document_converter.py”, line 260, in convert_all

for conv_res in conv_res_iter:

File “/app/.venv/lib/python3.11/site-packages/docling/document_converter.py”, line 332, in _convert

for item in map(

File “/app/.venv/lib/python3.11/site-packages/docling/document_converter.py”, line 379, in _process_document

conv_res = self._execute_pipeline(in_doc, raises_on_error=raises_on_error)

^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

File “/app/.venv/lib/python3.11/site-packages/docling/document_converter.py”, line 400, in _execute_pipeline

pipeline = self._get_pipeline(in_doc.format)

^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

File “/app/.venv/lib/python3.11/site-packages/docling/document_converter.py”, line 362, in _get_pipeline

self.initialized_pipelines[cache_key] = pipeline_class(

^^^^^^^^^^^^^^^

File “/app/.venv/lib/python3.11/site-packages/docling/pipeline/standard_pdf_pipeline.py”, line 380, in __init__

self._init_models()

File “/app/.venv/lib/python3.11/site-packages/docling/pipeline/standard_pdf_pipeline.py”, line 404, in _init_models

self.table_model = TableStructureModel(

^^^^^^^^^^^^^^^^^^^^

File “/app/.venv/lib/python3.11/site-packages/docling/models/table_structure_model.py”, line 72, in __init__

from docling_ibm_models.tableformer.data_management.tf_predictor import (

File “/app/.venv/lib/python3.11/site-packages/docling_ibm_models/tableformer/data_management/tf_predictor.py”, line 13, in <module>

import cv2

ImportError: libGL.so.1: cannot open shared object file: No such file or directory

ASSISTANT:

I’ll help you understand this error in the context of your codebase. Let me examine the relevant files.

ASSISTANT:

## Error Explanation

This error occurs when your **local document conversion** is attempting to process a PDF file. Here’s what’s happening:

### Call Flow

1. Request hits `/api/convert-documents?use_docetl_server=true` (line 116 in `server/app/routes/convert.py`)

2. The custom Docling URL and Modal endpoints either weren’t available or failed

3. Falls back to **local Docling processing** (line 245-302)

4. Docling initializes the `StandardPdfPipeline` with table structure detection enabled (line 249-250)

5. The `TableStructureModel` tries to import OpenCV (`cv2`)

6. OpenCV requires the system library `libGL.so.1` which is **missing from the Docker container**

### Root Cause

Your Dockerfile uses `python:3.11-slim` as the base image (line 2 and 36), which is a minimal Python image. When Docling’s table extraction feature initializes, it uses the `docling_ibm_models` package which depends on OpenCV for image processing. OpenCV requires OpenGL libraries (`libGL.so.1`) for certain operations, but these are not included in slim Docker images.

### Why It Fails

In `server/app/routes/convert.py:247-250`, you have:

python

pipeline_options = PdfPipelineOptions()

pipeline_options.do_ocr = False

pipeline_options.do_table_structure = True # This triggers the TableStructureModel

The table structure detection requires computer vision capabilities, which need system graphics libraries that aren’t present in your container.

### Context

This only happens during **local fallback processing**. Your primary conversion paths (custom Docling server and Modal endpoint) likely work fine because:

- They run on servers with proper system dependencies installed

- They’re separate services with their own environments

The error indicates the local fallback was triggered, meaning both the custom Docling URL and Modal endpoints were unavailable or failed for this request.

USER:

why was local fallback triggered?

ASSISTANT:

Looking at the code flow in `server/app/routes/convert.py`, the local fallback was triggered because the **Modal endpoint** either failed or wasn’t reachable. Let me trace through the logic:

ASSISTANT:

Based on your request URL `?use_docetl_server=true` and the code logic, here’s what happened:

## Why Local Fallback Was Triggered

Your request had `use_docetl_server=true` and was processing PDF files, so the code **should have** used the Modal endpoint at `server/app/routes/convert.py:225-240`.

The local fallback happened because the Modal endpoint either:

1. **Returned a non-200 status code** (most likely) - Notice at line 239-240: if the status isn’t 200, the code just silently continues to local processing **without any error logging**

2. **Raised an exception** - Would have logged `”Modal endpoint failed: {error}”` at line 243, but that message isn’t in your logs

3. **No custom_docling_url header was provided** - So lines 128-221 were skipped entirely

Since you don’t see a “Modal endpoint failed” message in your logs, this suggests the **Modal endpoint returned a non-200 HTTP status** (like 400, 500, 503, etc.) and the code silently fell through to local processing.

## The Silent Failure

There’s a logging gap in the code at `server/app/routes/convert.py:238-240`:

```python

async with session.post(MODAL_ENDPOINT, data=data, timeout=120) as response:

if response.status == 200:

return await response.json()

# No else clause - just continues to local fallback silently!

```

**To confirm what happened**, check if the Modal endpoint at `https://ucbepic--docling-converter-convert-documents.modal.run` is:

- Currently running/accessible

- Returning error responses

- Experiencing high load or rate limiting

USER:

I put that url in my browser and also curled it. In both cases I get the message: “modal-http: app for invoked web endpoint is stopped”

ASSISTANT:

Perfect! That explains exactly why the local fallback was triggered.

## What Happened

1. Your request had `use_docetl_server=true` with PDF files

2. Code attempted to POST to the Modal endpoint: `https://ucbepic--docling-converter-convert-documents.modal.run`

3. Modal returned a response (likely 404 or 503) with the message: **”modal-http: app for invoked web endpoint is stopped”**

4. Since `response.status != 200`, the code silently fell through to local processing at line 245

5. Local processing tried to initialize Docling with table structure detection

6. This requires OpenCV → needs `libGL.so.1` → crashes because it’s not in the Docker container

## Your Options

You have three paths forward:

1. **Start the Modal endpoint** - If you control this Modal app, deploy/start it so `use_docetl_server=true` works as intended

2. **Fix the Docker image** - Add OpenGL libraries to your Dockerfile so local fallback works:

```dockerfile

# In the runtime stage, add to apt-get install:

libgl1 libglib2.0-0

```

3. **Deploy your own Docling server** - Use the `custom_docling_url` header to point to a running Docling instance instead of relying on Modal

The Modal endpoint being stopped is the root cause of why you hit the local fallback code path that exposed the missing system dependencies.

USER:

that seems like a weird domain for an endpoint. Are you sure it’s right?

ASSISTANT:

You’re right to question it! But yes, that’s actually the correct format for a **Modal.com** endpoint.

ASSISTANT:

Yes, that domain format is correct for **Modal** (modal.com), a serverless platform.

Modal endpoints follow this pattern:

https://{workspace}--{app-name}-{function-name}.modal.run

In your case:

- **Workspace**: ucbepic

- **App/Function**: docling-converter-convert-documents

- **Domain**: .modal.run

The double-dash (`--`) separates the workspace from the app name - that’s Modal’s standard URL structure.

You can see it’s hardcoded at `server/app/routes/convert.py:30`, and there’s even a commented dev endpoint on line 31 following the same pattern.

## The Real Issue

The endpoint URL is valid, but the **Modal app is stopped/not deployed**. This means:

- Someone deployed a Modal function for document conversion in the `ucbepic` workspace

- That deployment is currently stopped (either manually stopped, scaled to zero, or the deployment was deleted)

- Your Docker container expected it to be running

You need to either:

1. **Deploy/start the Modal app** - If you have access to the `ucbepic` Modal workspace

2. **Switch to local processing** by setting `use_docetl_server=false` (but then you’ll need to fix the Docker OpenGL issue)

3. **Use a different document conversion service** via the `custom_docling_url` header

USER:

does local processing rely on docling?

ASSISTANT:

Yes, absolutely. Local processing **entirely relies on Docling**.

ASSISTANT:

Yes, **100% of the local processing uses Docling**. Look at the imports (lines 16-19) and the local fallback code:

python

# Lines 247-258

pipeline_options = PdfPipelineOptions()

pipeline_options.do_ocr = False

pipeline_options.do_table_structure = True # This is what causes the OpenCV requirement

pipeline_options.table_structure_options.do_cell_matching = True

doc_converter = DocumentConverter(

format_options={

InputFormat.PDF: PdfFormatOption(

pipeline_options=pipeline_options, backend=PyPdfiumDocumentBackend

)

}

)

Then at line 284-296, it uses this Docling converter for all non-text files.

## The Stack Trace Confirms It

Your error trace shows the entire chain through Docling:

docling/document_converter.py
docling/pipeline/standard_pdf_pipeline.py
docling/models/table_structure_model.py
docling_ibm_models/tableformer/data_management/tf_predictor.py ← imports `cv2`

The problem is specifically the **table structure detection** feature (`do_table_structure = True` at line 249), which requires OpenCV and thus the missing `libGL.so.1` library.

## Your Conversion Paths

1. **Custom Docling server** (via header) → Uses remote Docling

2. **Modal endpoint** → Uses remote Docling (currently stopped)

3. **Local fallback** → Uses local Docling (crashes due to missing OpenGL libraries)

4. **Azure endpoint** (`/api/azure-convert-documents`) → Different path, doesn’t use Docling

USER:

does the docling in the local fallback run within the docker container?

ASSISTANT:

**Yes, absolutely.** The local fallback Docling runs inside the Docker container.

Looking at the error stack trace:

File “/app/.venv/lib/python3.11/site-packages/uvicorn/protocols/http/h11_impl.py”

File “/app/.venv/lib/python3.11/site-packages/docling/document_converter.py”

The `/app/.venv/` path shows this is running in the container’s virtual environment.

## How It Works

From your Dockerfile:

1. **Line 47-51**: Sets up the Python virtual environment at `/app/.venv` inside the container

2. **Line 53**: Copies the venv (including Docling) from the builder stage

3. **Line 56-60**: Copies your application code including `server/` into the container

4. **Line 77**: Starts the FastAPI server: `python3 server/app/main.py`

When a request hits `/api/convert-documents`:

- FastAPI is running **inside the container**

- When local fallback triggers, it imports and runs Docling **inside the same container**

- Docling tries to initialize OpenCV’s table detection

- OpenCV needs `libGL.so.1` which doesn’t exist in the `python:3.11-slim` base image

- **Crash** 💥

That’s why the fix would need to modify the Dockerfile (line 39-43) to add the missing system libraries to the container:

dockerfile

RUN apt-get update && apt-get install -y \

curl \

libgl1 \ # ← Add this

libglib2.0-0 \ # ← Add this

&& curl -fsSL https://deb.nodesource.com/setup_20.x | bash - \

USER:

there’s a known problem with the module AutoProcessor running on apple silicon m3 pro. Does this mean that the Docling in the Docker container will fail on my mac?

ASSISTANT:

**The current error you’re seeing is NOT the AutoProcessor/Apple Silicon issue** - it’s the missing `libGL.so.1` library, which would happen on any machine.

However, to answer your question about whether the Apple Silicon AutoProcessor issue will affect you:

## It Depends on Container Architecture

Docker Desktop on Mac runs **Linux containers**, not macOS. The key question is: **what architecture is your Docker container running?**

ASSISTANT:

**Important finding**: Your container is running **`aarch64`** (arm64 Linux), which means it’s running natively on your M3 Pro architecture.

## What This Means

**Current error (libGL.so.1)**: This will happen regardless of architecture - it’s just a missing system library in the slim Docker image.

**AutoProcessor/Apple Silicon issues**:

- Your container runs **Linux arm64**, not macOS arm64

- macOS-specific issues (MPS/Metal backend) don’t apply in Linux containers

- However, **arm64-specific ML model issues could still occur** if Docling’s models have arm64 compatibility problems

## Potential Issues You Might Hit (After Fixing libGL)

If Docling’s AutoProcessor or ML models have known arm64 issues, you might still encounter them because your container runs on arm64 architecture. Common arm64 ML problems include:

- Missing or broken pre-compiled wheels for PyTorch/transformers on arm64

- Model inference bugs on arm64

- Performance issues on arm64

## Solutions if AutoProcessor Fails

1. **Force amd64 architecture** - Build/run your container with `--platform linux/amd64` (uses Rosetta 2 emulation, slower but more compatible)

2. **Check Docling’s compatibility** - Verify if Docling officially supports arm64

3. **Use remote endpoints** - Keep using Modal/custom Docling servers (which likely run on amd64) instead of local fallback

Want me to check if there are any arm64-specific issues in your Docling installation?

USER:

i don’t want to go off too far on a tangent but I do want to know if the fallback to local mode will continue to fail even if I change teh Dockerfile to install the missing packages. Here’s an example of the error I run into with Docling on my mac:

(.venv) johndamask@Johns-MacBook-Pro-2 2025-11-10-docling-test % python app.py

Traceback (most recent call last):

File “/Users/johndamask/src/tries/2025-11-10-docling-test/.venv/lib/python3.12/site-packages/transformers/utils/import_utils.py”, line 2317, in __getattr__

module = self._get_module(self._class_to_module[name])

^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

File “/Users/johndamask/src/tries/2025-11-10-docling-test/.venv/lib/python3.12/site-packages/transformers/utils/import_utils.py”, line 2347, in _get_module

raise e

File “/Users/johndamask/src/tries/2025-11-10-docling-test/.venv/lib/python3.12/site-packages/transformers/utils/import_utils.py”, line 2345, in _get_module

return importlib.import_module(”.” + module_name, self.__name__)

^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

File “/Users/johndamask/.pyenv/versions/3.12.6/lib/python3.12/importlib/__init__.py”, line 90, in import_module

return _bootstrap._gcd_import(name[level:], package, level)

^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

File “<frozen importlib._bootstrap>”, line 1387, in _gcd_import

File “<frozen importlib._bootstrap>”, line 1360, in _find_and_load

File “<frozen importlib._bootstrap>”, line 1331, in _find_and_load_unlocked

File “<frozen importlib._bootstrap>”, line 935, in _load_unlocked

File “<frozen importlib._bootstrap_external>”, line 995, in exec_module

File “<frozen importlib._bootstrap>”, line 488, in _call_with_frames_removed

File “/Users/johndamask/src/tries/2025-11-10-docling-test/.venv/lib/python3.12/site-packages/transformers/models/auto/processing_auto.py”, line 27, in <module>

from ...image_processing_utils import ImageProcessingMixin

File “/Users/johndamask/src/tries/2025-11-10-docling-test/.venv/lib/python3.12/site-packages/transformers/image_processing_utils.py”, line 21, in <module>

from .image_processing_base import BatchFeature, ImageProcessingMixin

File “/Users/johndamask/src/tries/2025-11-10-docling-test/.venv/lib/python3.12/site-packages/transformers/image_processing_base.py”, line 26, in <module>

from .image_utils import is_valid_image, load_image

File “/Users/johndamask/src/tries/2025-11-10-docling-test/.venv/lib/python3.12/site-packages/transformers/image_utils.py”, line 55, in <module>

from torchvision.transforms import InterpolationMode

File “/Users/johndamask/src/tries/2025-11-10-docling-test/.venv/lib/python3.12/site-packages/torchvision/__init__.py”, line 10, in <module>

from torchvision import _meta_registrations, datasets, io, models, ops, transforms, utils # usort:skip

^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

File “/Users/johndamask/src/tries/2025-11-10-docling-test/.venv/lib/python3.12/site-packages/torchvision/datasets/__init__.py”, line 1, in <module>

from ._optical_flow import FlyingChairs, FlyingThings3D, HD1K, KittiFlow, Sintel

File “/Users/johndamask/src/tries/2025-11-10-docling-test/.venv/lib/python3.12/site-packages/torchvision/datasets/_optical_flow.py”, line 14, in <module>

from .utils import _read_pfm, verify_str_arg

File “/Users/johndamask/src/tries/2025-11-10-docling-test/.venv/lib/python3.12/site-packages/torchvision/datasets/utils.py”, line 4, in <module>

import lzma

File “/Users/johndamask/.pyenv/versions/3.12.6/lib/python3.12/lzma.py”, line 27, in <module>

from _lzma import *

ModuleNotFoundError: No module named ‘_lzma’

The above exception was the direct cause of the following exception:

Traceback (most recent call last):

File “/Users/johndamask/src/tries/2025-11-10-docling-test/app.py”, line 1, in <module>

from docling.document_converter import DocumentConverter

File “/Users/johndamask/src/tries/2025-11-10-docling-test/.venv/lib/python3.12/site-packages/docling/document_converter.py”, line 64, in <module>

from docling.pipeline.standard_pdf_pipeline import StandardPdfPipeline

File “/Users/johndamask/src/tries/2025-11-10-docling-test/.venv/lib/python3.12/site-packages/docling/pipeline/standard_pdf_pipeline.py”, line 37, in <module>

from docling.models.code_formula_model import CodeFormulaModel, CodeFormulaModelOptions

File “/Users/johndamask/src/tries/2025-11-10-docling-test/.venv/lib/python3.12/site-packages/docling/models/code_formula_model.py”, line 17, in <module>

from transformers import AutoModelForImageTextToText, AutoProcessor

File “/Users/johndamask/src/tries/2025-11-10-docling-test/.venv/lib/python3.12/site-packages/transformers/utils/import_utils.py”, line 2320, in __getattr__

raise ModuleNotFoundError(

ModuleNotFoundError: Could not import module ‘AutoProcessor’. Are this object’s requirements defined correctly?

ASSISTANT:

Good question! This specific error is **unlikely to occur in the Docker container**. Here’s why:

## Your macOS Error vs Docker

**Your macOS error**: Missing `_lzma` module - this is a Python compilation issue. When Python is built via pyenv on macOS, if `liblzma` isn’t available at build time, the `_lzma` module won’t be compiled into Python.

**Docker container**: Uses the official `python:3.11-slim` image, which comes pre-built with all standard library modules including `_lzma` properly compiled.

Let me verify this:

ASSISTANT:

✅ **Good news**: Your container already has `_lzma` available.

Now let’s check if the transformers/torchvision stack will work on arm64:

ASSISTANT:

✅ **Even better news**: `AutoProcessor` imports successfully in your container!

Let’s test if Docling can initialize:

ASSISTANT:

Excellent! The import works. The failure only happens when trying to **actually process a document**. Let me verify that’s where the `libGL.so.1` error occurs:

ASSISTANT:

## Summary

**Your macOS `_lzma` error will NOT happen in Docker.** Here’s what’s actually going on:

### ✅ Works in Docker (arm64)

- Python standard library including `_lzma`

- `transformers` and `AutoProcessor`

- `torchvision`

- Docling imports

### ❌ Fails in Docker (arm64)

- **Only OpenCV** (`cv2`) due to missing `libGL.so.1`

## The Answer to Your Question

**Yes, local fallback will work after fixing the Dockerfile.** The fix is simple - just add the missing OpenGL libraries to your Dockerfile at line 39:

```dockerfile

RUN apt-get update && apt-get install -y \

curl \

libgl1 \

libglib2.0-0 \

&& curl -fsSL https://deb.nodesource.com/setup_20.x | bash - \

&& apt-get install -y nodejs \

&& rm -rf /var/lib/apt/lists/*

```

Your macOS issues are separate - they stem from how Python was compiled via pyenv. The Docker container doesn’t have those problems because it uses properly built official Python images.

USER:

ok please make the necessary change to the dockerfile. Be sure to keep this tightly scoped - don’t try to change more than one thing at a time

ASSISTANT:

Done. I’ve added `libgl1` and `libglib2.0-0` to the runtime stage at Dockerfile:41-42.

This will fix the `libGL.so.1` missing library error when Docling’s table structure detection initializes OpenCV. You’ll need to rebuild your Docker image for the change to take effect.

USER:

Wow this seems to have worked!! I can now convert PDFs locally! Please check the logs and verify - you should see two operations within the last 10 mins or so. The first would have been a fallback to local and the second should have been using the local server specifically

ASSISTANT:

Perfect! ✅ I can verify both operations succeeded:

## First Operation (13:45:15 - 13:45:47)

**Request**: `POST /api/convert-documents?use_docetl_server=true`

- This **attempted to use Modal** but fell back to local (no “Modal endpoint failed” message means Modal returned non-200 silently)

- Processed `readai-privacy-policy.pdf` locally

- **Successfully completed in 32.09 seconds** with `200 OK`

- Pipeline initialized without errors (no `libGL.so.1` crash!)

## Second Operation (13:46:58 - 13:47:09)

**Request**: `POST /api/convert-documents` (no query param)

- **Explicitly used local processing** (`use_docetl_server` defaults to false)

- Processed `readai-terms-of-service.pdf` locally

- **Successfully completed in 11.36 seconds** with `200 OK`

## Key Evidence of Success

Both show the complete Docling pipeline executing without errors:

INFO detected formats: [<InputFormat.PDF: ‘pdf’>]

INFO Going to convert document batch...

INFO Initializing pipeline for StandardPdfPipeline

INFO Accelerator device: ‘cpu’

INFO Processing document [filename].pdf

INFO Finished converting document [filename].pdf

No `ImportError: libGL.so.1` crashes! The Dockerfile fix worked perfectly. Both local conversions completed successfully on your M3 Pro arm64 Docker container. 🎉

Don't over(look|state) the obvious

Discussion about this post

Ready for more?