
When a 20-Year-Old Perl Bug Comes Knocking: How We Used AI to Track It Down

At Veritas Supera, we maintain PDF::Reuse — a Perl module on CPAN that lets developers efficiently create PDF documents by reusing existing components. It’s roughly 7,000 lines of code spanning about 100 subroutines, and it handles everything from cross-reference tables to TrueType font embedding. It’s also a dependency of Koha, the world’s most popular open-source integrated library system.

So when Koha’s label and patron card creator started crashing, we needed answers fast.


The Bug Nobody Could See Coming

The symptoms were deceptively simple. A specific sequence of PDF operations — opening a file, adding a page with a TrueType font, and closing the session — would sometimes crash with:

Can't use an undefined value as an ARRAY reference at PDF/Reuse.pm line 1486.

The key word there is sometimes. Simple scripts worked fine. The crash only appeared in multi-session patterns — exactly the kind of usage Koha relies on when generating batches of labels or patron cards.
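For readers less familiar with this class of Perl error, here is a minimal sketch of how such a message can arise. The lookup table and variable names are hypothetical and are not taken from PDF::Reuse; the point is only that under strict refs, reading an undefined value as an array reference is fatal.

```perl
use strict;
use warnings;

# Hypothetical lookup table keyed by an object identifier
# (an assumption for illustration, not the real PDF::Reuse data).
my %table = ( pdfuid000 => [ 12, 0 ] );

# The identifier has been lost, so the lookup finds nothing.
my $uid;
my $slot = $table{ defined $uid ? $uid : '' };   # no such key: $slot is undef

# Under strict refs, dereferencing undef as an array in rvalue
# context dies; eval captures the message so we can inspect it.
my $error = '';
eval { my @entry = @{$slot}; 1 } or $error = $@;

print $error;
```

The captured message starts with "Can't use an undefined value as an ARRAY reference", the same wording as the crash above. The failure surfaces where the value is used, which can be far from where it was lost.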


Going Deep with Claude

We brought Anthropic’s Claude into the debugging workflow, pointing it at the GitHub issue and giving it access to the full codebase. What followed was a methodical investigation across two CPAN distributions and roughly 1,500 lines of interconnected code.

Claude’s approach was systematic. It traced the full chain of object creation, registration, and serialization, from prTTFont() through the internal DocProxy bridge and into Text::PDF’s object system. Along the way, it tested and eliminated about seven different hypotheses: duplicate registration, UID collisions, stream content interference, hash key separators, and more.

The breakthrough came when Claude shifted from reading code to actually reproducing the bug. It built instrumentation that tracked a critical internal identifier — the ' uid' field on font objects — through the lifecycle of a multi-session run:

TTFont0 uid field: [pdfuid000]
Resetting %font...
TTFont0 uid field: [UNDEF]

There it was. The UID was being silently destroyed.
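The instrumentation itself is not reproduced in this post, but the idea can be sketched in a few lines. The helper name field_state and the plain hash standing in for the font object are assumptions made for illustration; the real object is a Text::PDF object living inside PDF::Reuse.

```perl
use strict;
use warnings;

# Report one field of a hash-based object as "[value]" or "[UNDEF]".
# (Hypothetical helper, not part of PDF::Reuse.)
sub field_state {
    my ($obj, $field) = @_;
    my $value = $obj->{$field};
    return defined $value ? "[$value]" : '[UNDEF]';
}

# Stand-in for the font object; note the leading space in the key.
my $font = { ' uid' => 'pdfuid000' };

my @log;
push @log, 'TTFont0 uid field: ' . field_state($font, ' uid');

# Simulate some other code wiping the object's fields between probes.
%$font = ();
push @log, 'TTFont0 uid field: ' . field_state($font, ' uid');

print "$_\n" for @log;
```

Probing the same field before and after each suspicious step narrows down which step destroys the state, without needing to understand the whole object graph first.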


The Root Cause: A Dual Ownership Problem

The bug turned out to be a classic object lifecycle error with a nasty twist.

PDF::Reuse wraps TrueType fonts in an internal PDF::Reuse::TTFont object. When the last reference to that wrapper goes away, Perl runs its DESTROY method, which dutifully calls release() on the underlying font object:

# The bug: DESTROY wipes a shared object's internal state
sub DESTROY
{  my $self = shift;
   if(my $ttfont = $self->{ttfont})
   {  if(my $font = delete $ttfont->{' font'})
      { $font->release();       # Frees Font::TTF data (OK)
      }
      $ttfont->release();        # WIPES ALL FIELDS including ' uid' (BUG!)
   }
   %$self = ();
}

That release() method is inherited from Text::PDF::Objind, a class in a completely different CPAN distribution, and it deletes every key from the object hash, including the ' uid' field that another part of the system depends on for tracking.

The wrapper thought it owned the font object. But DocProxy also held a reference and needed that object’s state to remain intact for PDF serialization. Perl destroys an object as soon as its last reference is dropped, so when the wrapper was cleaned up between sessions (as happened when %font was reset), its DESTROY silently corrupted the font object that DocProxy was still counting on.
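The ownership conflict can be reproduced without any PDF code at all. The sketch below uses hypothetical stand-in classes (SharedFont, FontWrapper) and a plain hash in place of DocProxy's cache; it assumes nothing about the real internals beyond what is described above.

```perl
use strict;
use warnings;

package SharedFont;    # hypothetical stand-in for the shared font object
sub new     { my ($class, $uid) = @_; return bless { ' uid' => $uid }, $class }
sub release { my ($self) = @_; %$self = (); return }   # wipes every field

package FontWrapper;   # hypothetical stand-in for the wrapper
sub new { my ($class, $font) = @_; return bless { ttfont => $font }, $class }
sub DESTROY {
    my ($self) = @_;
    # Behaves as if the wrapper were the sole owner of the font.
    $self->{ttfont}->release() if $self->{ttfont};
    %$self = ();
}

package main;

my $font     = SharedFont->new('pdfuid000');
my %objcache = ( font0 => $font );                   # second owner
my %wrappers = ( F1 => FontWrapper->new($font) );    # first owner

my $before = $objcache{font0}{' uid'};
%wrappers  = ();    # last reference to the wrapper dropped: DESTROY runs now
my $after  = $objcache{font0}{' uid'};

printf "before: %s, after: %s\n", $before, defined $after ? $after : 'UNDEF';
```

The cache still holds the object, so nothing looks wrong from its side; only the contents are gone. That is why the crash appears later, at serialization time, rather than at the point of destruction.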


The Moment Human Judgment Mattered Most

Here’s where the story gets interesting. Claude’s initial instinct was a minimal defensive fix: guard against the missing reference and reset the state. Quick, clean, done.

We asked one question: “Does minimal == best?”

Claude reconsidered — and realized the minimal fix would silently produce corrupt PDFs. Missing font descriptors, broken ToUnicode maps, garbled text. A crash you can debug. A silently corrupt PDF that renders wrong? That’s the kind of bug that erodes trust.
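To make the trade-off concrete, here is a deliberately simplified sketch of what a purely defensive guard does. It is not the patch Claude proposed (which is not shown in this post); the serialize_font function and its fields are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical serializer that guards against missing fields
# instead of fixing the reason they are missing.
sub serialize_font {
    my ($font) = @_;
    my @widths = @{ $font->{widths} || [] };   # guard: fall back to empty list
    my $uid    = defined $font->{uid} ? $font->{uid} : '';
    return sprintf 'font uid=%s widths=[%s]', $uid, join(' ', @widths);
}

my $intact = { uid => 'pdfuid000', widths => [ 500, 600 ] };
my $wiped  = {};    # the same kind of object after its fields were deleted

my $good = serialize_font($intact);
my $bad  = serialize_font($wiped);

print "$good\n$bad\n";
```

Both calls return without error, but the second quietly emits an entry with no identifier and no widths. The guard removes the crash and keeps the data loss.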

So instead of patching the symptom, we fixed the actual ownership problem:

# The fix: DESTROY no longer releases the shared font object
sub DESTROY
{  my $self = shift;
   # Do NOT release the ttfont (TTFont0) object here -- it is still
   # owned by the DocProxy's objcache and will be cleaned up in prEnd().
   %$self = ();
}

The cleanup responsibility was moved to prEnd(), where it belongs — after the font objects have been fully serialized to PDF:

# In prEnd(), after write_objects has serialized everything:
if($docProxy)
{  $docProxy->write_objects;
   for my $obj (values %{ $docProxy->{' objcache'} })
   {  if ($obj->isa('Text::PDF::TTFont0'))
      {  if (my $font = delete $obj->{' font'})
         {  $font->release();
         }
         $obj->release();
      }
   }
   undef $docProxy;
}
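The same ordering can be shown with the hypothetical stand-in classes instead of the real DocProxy. In this sketch, end_session is an invented name playing the role of prEnd(), and pushing onto @written stands in for write_objects.

```perl
use strict;
use warnings;

package SharedFont;    # hypothetical stand-in for the shared font object
sub new     { my ($class, $uid) = @_; return bless { ' uid' => $uid }, $class }
sub release { my ($self) = @_; %$self = (); return }

package FontWrapper;   # hypothetical stand-in for the wrapper
sub new     { my ($class, $font) = @_; return bless { ttfont => $font }, $class }
sub DESTROY { my ($self) = @_; %$self = (); return }   # drops its reference, releases nothing

package main;

my @written;

# Plays the role of prEnd(): serialize first, release afterwards.
sub end_session {
    my ($objcache) = @_;
    push @written, map { $objcache->{$_}{' uid'} } sort keys %$objcache;
    $_->release() for values %$objcache;
    %$objcache = ();
    return;
}

my $font     = SharedFont->new('pdfuid000');
my %objcache = ( font0 => $font );
my %wrappers = ( F1 => FontWrapper->new($font) );

%wrappers = ();                          # wrapper destroyed mid-session
my $uid_mid_session = $font->{' uid'};   # font state is untouched

end_session(\%objcache);
print "serialized: @written\n";
```

With a single owner responsible for release, the order in which wrappers happen to be destroyed no longer matters: the font's fields survive until serialization has finished.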

The Result

The fix shipped as PDF::Reuse 0.43 on CPAN, with three regression tests covering normal sessions, state reset scenarios, and multi-session usage. All CI checks passed across Perl 5.24 through 5.40.

Without Claude’s systematic tracing, this bug (spanning two CPAN distributions, hinging on object destruction timing, and only manifesting in specific session patterns) would have taken an experienced developer several hours of careful manual investigation. The combination of Claude’s exhaustive analysis and our domain expertise in PDF internals and Perl object lifecycle management got us to a clean, correct fix efficiently.


What This Means for Your Legacy Code

If you’re maintaining Perl modules, complex legacy systems, or any codebase where bugs hide in the interactions between components — you know how expensive these investigations can be. The kind of deep, methodical debugging that this fix required isn’t just about having the right tools. It’s about knowing how to direct those tools, when to push back on easy answers, and how to evaluate whether a fix truly solves the problem.

That’s what we do at Veritas Supera. We pair AI-accelerated analysis with decades of real-world experience in Perl, open-source systems, and legacy code maintenance to solve the bugs that keep your team stuck.

Have a gnarly bug in legacy code? Get in touch — we’d love to help you track it down.