Azure Data Factory · 8 min read · March 16, 2026

Oracle V1 to V2 Connector Migration in ADF — The Parquet Precision Bug That Nearly Broke Production

Everything looked fine — connection worked, preview worked, pipelines validated. Then runtime hit and all hell broke loose. Here's the hidden precision issue Oracle V2 introduced and the exact fixes that work.

Let me set the scene. We had a working Azure Data Factory pipeline that had been happily copying data from Oracle into Parquet files for months. No issues. Stable. Boring in the best possible way. Then came the migration mandate — upgrade Oracle Linked Services from V1 to V2. Microsoft is retiring V1 on March 31, 2026, so this wasn't optional, just a matter of when. Seemed straightforward. Spoiler: it wasn't.

This is a post about a bug that doesn't show up in your connection test, doesn't fail in schema preview, doesn't get caught by pipeline validation — and then detonates at runtime and takes your entire copy activity down. If you're migrating Oracle connectors in ADF right now, read this before you hit the same wall.

The Migration Looked Clean

We followed the standard upgrade path. Updated the Linked Service type to Oracle version 2.0, pointed it to the same connection string, re-tested connectivity, ran source previews, validated the pipelines. Everything came back green. Every single check passed.

  • Connection test — successful
  • Source data preview — correct rows and columns showing
  • SQL queries in source tab — executing and returning data
  • Pipeline validation — zero errors, zero warnings
  • Debug run row sampling — looking fine

We pushed to staging, felt confident, and scheduled production deployment. That confidence lasted until the first real pipeline run.

⚠️ This is the most dangerous kind of bug — one that passes every pre-flight check and only detonates when real data flows at runtime into a Parquet writer.

When It Exploded

The first production run after migration failed with a Java exception buried inside the Copy Activity logs. At first glance it looked like an infrastructure problem. Then I read the actual stack trace.

text
ErrorCode=ParquetJavaInvocationException
Message: An error occurred when invoking java, message:
java.lang.ArrayIndexOutOfBoundsException: 255

at ParquetWriterBuilderBridge.addDecimalColumn(ParquetWriterBuilderBridge.java)

255. That number meant nothing to me for about ten minutes, until I dug into Parquet's decimal specification. Then it clicked.

The Root Cause — Parquet Has a Hard Precision Limit

Parquet has a hard limit on decimal precision. The maximum it supports is 38. Not 39, not 100, definitely not 255 or 256. When a Parquet writer tries to create a decimal column with precision higher than 38, it doesn't gracefully degrade — it throws an ArrayIndexOutOfBoundsException at the JVM level because its internal lookup table is sized for exactly 38 entries. Index 255 is simply out of bounds. The writer crashes.
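
You can see the same ceiling outside ADF. The sketch below uses pyarrow, not the Java writer the Copy Activity runs internally, so treat it as an illustration of the limit rather than a reproduction of the exact crash; pyarrow's 128-bit Parquet decimal type enforces the same 1 to 38 precision range.

python
# Illustration only: ADF's Copy Activity uses a Java Parquet writer internally, but
# pyarrow's 128-bit decimal type enforces the same 1-38 precision range.
import pyarrow as pa

print(pa.decimal128(38, 10))   # fine: precision is within the Parquet limit

try:
    pa.decimal128(256, 130)    # the precision Oracle V2's default inference implies
except ValueError as exc:
    print(exc)                 # rejected: precision outside the 1-38 range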

So where was precision 256 coming from? That's where Oracle V1 and Oracle V2 diverge in a way that isn't obvious from the migration documentation.

Oracle NUMBER — The Chameleon Datatype

Oracle NUMBER with no precision or scale specified — just raw NUMBER — is essentially unconstrained. Oracle can store anything in it. When an external connector reads that column, it needs to assign a concrete type. Oracle V1 and V2 made completely different decisions here.

Oracle V1 mapped unconstrained NUMBER columns to safe, compatible types that downstream systems like Parquet could handle without issues. It was conservative and it worked fine for years.

Oracle V2 changed this. It tries to be more faithful to Oracle's internal representation of unconstrained NUMBER, so it infers a very high precision: numberPrecision defaults to 256 and numberScale defaults to 130 for these columns. Technically more representative of what Oracle can actually store. Practically catastrophic when the destination is Parquet.

sql
-- Oracle column definition (no precision/scale)
CREATE TABLE transactions (
    amount   NUMBER,         -- unconstrained ← this is the problem column
    rate     NUMBER(10,4)    -- constrained   ← this works fine in both V1 and V2
);

-- Oracle V1 ADF inference:  amount → safe compatible type (within Parquet limits)
-- Oracle V2 ADF inference:  amount → decimal(256, 130)  ← Parquet explodes here

Constrained columns with explicit NUMBER(10,4) definitions worked perfectly fine in both V1 and V2. The failures were entirely from unconstrained NUMBER columns — which are extremely common in legacy Oracle schemas built by developers who didn't specify precision because Oracle didn't require it.
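
Before migrating, it's worth knowing how exposed you are, and the data dictionary will tell you. Below is a rough audit sketch using the python-oracledb driver; the connection details and owner filter are placeholders, but the query itself simply lists every NUMBER column declared without an explicit precision.

python
# Rough audit sketch: list unconstrained NUMBER columns from the Oracle data dictionary.
# Connection details and the owner filter are placeholders; swap in your own.
import oracledb

conn = oracledb.connect(user="readonly_user", password="...", dsn="your-host:1521/yourdb")

AUDIT_QUERY = """
    SELECT owner, table_name, column_name
    FROM   all_tab_columns
    WHERE  data_type = 'NUMBER'
      AND  data_precision IS NULL               -- no declared precision = unconstrained
      AND  owner NOT IN ('SYS', 'SYSTEM')       -- skip Oracle-internal schemas
    ORDER  BY owner, table_name, column_name
"""

with conn.cursor() as cur:
    for owner, table, column in cur.execute(AUDIT_QUERY):
        print(f"{owner}.{table}.{column}")

Every column this returns is a candidate for the decimal(256, 130) inference under V2's defaults.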

Why Every Pre-Migration Test Passed

This is the part that still frustrates me. Every test we ran was testing the wrong thing.

  • Connection test — checks if ADF can reach the Oracle server. Has nothing to do with datatype inference
  • Source preview — ADF renders data as strings in the UI. No Parquet conversion happens, so precision issues are completely invisible
  • Query execution in debug — fetches rows and displays them. Still no Parquet writer involved, no precision check triggered
  • Pipeline validation — validates structure and configuration. Does not simulate the full type conversion path end-to-end

The Parquet writer only comes into play at actual runtime when Copy Activity tries to serialize Oracle data into Parquet binary format. That is the only moment the high precision inference hits the Parquet limit. Everything before that — every test, every preview, every validation — was just testing connectivity, not type compatibility.

💡 Connectivity testing and schema compatibility testing are completely different things. Passing one tells you nothing about the other.

Temporary Fix 1 — Set numberPrecision and numberScale in the Source Tab

The quickest workaround to unblock a failing pipeline is to explicitly set numberPrecision and numberScale in the Copy Activity source tab. These are Oracle V2-specific properties that control how unconstrained NUMBER columns are inferred when supportV1DataTypes is not enabled.

In the Copy Activity, go to the Source tab, and set these two properties directly in the UI or in the source JSON:

json
"source": {
    "type": "OracleSource",
    "numberPrecision": 38,
    "numberScale": 10
}

Setting numberPrecision to 38 — the maximum Parquet supports — ensures the inferred type stays within bounds. numberScale is the number of digits after the decimal point; set it to whatever fits your actual data. This stops the ArrayIndexOutOfBoundsException immediately.

This is a per-Copy-Activity setting. Every pipeline that reads Oracle data into Parquet needs this applied individually. For a single pipeline it's fine. For dozens of pipelines across an enterprise it becomes a real maintenance problem — every new pipeline you build needs the same setting, and missing it on one brings the same crash back.

Temporary Fix 2 — Manual Column Type Override in Mapping

The other workaround is to go into the Copy Activity Mapping tab and manually override the type for each problematic decimal column, setting precision to 38 explicitly:

json
{
    "name": "amount",
    "type": "Decimal",
    "precision": 38,
    "scale": 10
}

This also works, but it's even more granular than the source tab fix — you're overriding per column, not per activity. Useful if you only have one or two problematic columns in a large table and want to be surgical about it. Not useful if you have hundreds of unconstrained NUMBER columns scattered across your schema.

The Permanent Fix — supportV1DataTypes

The correct permanent solution is enabling supportV1DataTypes on the Oracle V2 Linked Service. This tells the connector to use Oracle V1's datatype inference behavior — safe, Parquet-compatible type mapping for unconstrained NUMBER columns — instead of V2's high-precision default.

You can set this two ways:

Option A — Through the ADF UI

Go to your Oracle V2 Linked Service in ADF Studio, open it for editing, and scroll down to the Additional connection properties section. Add a new property with name supportV1DataTypes and value true. Save and publish. That's it — no JSON editing required.

Option B — Through the Linked Service JSON

Switch to the JSON editor view of your Linked Service and add the property inside typeProperties:

json
{
    "name": "OracleLinkedService",
    "properties": {
        "type": "Oracle",
        "version": "2.0",
        "typeProperties": {
            "server": "(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=your-host)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=yourdb)))",
            "authenticationType": "Basic",
            "username": "<username>",
            "password": {
                "type": "SecureString",
                "value": "<password>"
            },
            "supportV1DataTypes": true
        }
    }
}

Once this is set, every pipeline using this Linked Service automatically gets V1 datatype compatibility behavior. Unconstrained NUMBER columns get mapped to Parquet-safe types. No per-pipeline source settings needed, no per-column mapping overrides needed. One config change covers everything.

🔑 supportV1DataTypes: true is the correct permanent fix. Set it in the Linked Service — either through the UI additional properties or the JSON editor — and it covers all pipelines using that service automatically.

One Important Note — supportV1DataTypes and numberPrecision Don't Mix

Worth knowing: numberPrecision and numberScale on the source tab only work when supportV1DataTypes is NOT enabled. They are mutually exclusive approaches. If you have supportV1DataTypes: true on the Linked Service, the source-level precision settings are ignored. Pick one approach and stick with it — for most enterprise migrations, supportV1DataTypes is the right call.

Why This Is So Commonly Missed

supportV1DataTypes is not surfaced prominently during the Linked Service setup wizard in the ADF UI. When you create an Oracle V2 Linked Service through the normal flow, you fill in the connection details, test the connection, hit save — and nowhere in that flow does it prompt you about datatype compatibility. The Additional connection properties section is easy to miss entirely if you don't know to look for it.

And the failure mode is particularly cruel. It doesn't fail on connection. It fails mid-pipeline, after Oracle data has already been read, only when the Parquet writer tries to serialize a decimal column with precision 256. By then you've consumed runtime, you've hit the database, and the only signal you get is a cryptic Java stack trace that doesn't obviously point back to a Linked Service configuration issue.

How to Actually Validate a Connector Migration

After this experience I changed how I validate any connector migration in ADF. Connection tests and source previews are not enough. Here's what I now treat as the minimum bar:

  • Run a real end-to-end debug Copy Activity that writes actual Parquet output to storage — not a preview, a real file write
  • Target specifically the tables with unconstrained NUMBER columns first, not just the clean well-defined tables
  • Open the output Parquet file and inspect the schema — verify decimal columns have precision 38 or lower (a pyarrow sketch follows this list)
  • Check the Copy Activity output JSON for column type mappings — this shows exactly how ADF inferred each column
  • Run a row count comparison between Oracle source and Parquet destination to confirm no silent data loss
  • Test with your actual production table set, not just a sample table with five columns
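
The schema and row-count checks are easy to script once a debug run has produced a real file. Here's a minimal sketch with pyarrow; the file path is a placeholder for wherever you download or mount the Copy Activity output.

python
# Minimal post-migration check: confirm every decimal column in the Parquet output
# stays within the 38-digit limit, and read the row count for comparison with Oracle.
# The file path is a placeholder for your actual Copy Activity output.
import pyarrow.parquet as pq
import pyarrow.types as pt

PATH = "copy_activity_output.parquet"

schema = pq.read_schema(PATH)
for field in schema:
    if pt.is_decimal(field.type):
        assert field.type.precision <= 38, (
            f"{field.name}: precision {field.type.precision} exceeds the Parquet limit"
        )
        print(f"{field.name}: decimal({field.type.precision}, {field.type.scale}) OK")

# Row count from the Parquet footer; compare against SELECT COUNT(*) on the Oracle side.
print("rows:", pq.ParquetFile(PATH).metadata.num_rows)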

Lessons From This Migration

  • Connectivity passing is not the same as compatibility passing — these are completely different things
  • Source preview is rendered as strings in ADF UI — it tells you nothing about how types will be serialized in the destination format
  • Oracle NUMBER without precision/scale is a landmine in any strongly-typed destination format — audit these columns before migration
  • supportV1DataTypes exists precisely for this scenario — add it before you run your first post-migration pipeline, not after it fails
  • The numberPrecision and numberScale source properties are useful for targeted fixes — but supportV1DataTypes is the correct enterprise-scale solution
  • Keep the V1 Linked Service alive as a rollback path until V2 has been stable in production for at least two weeks

Final Thoughts

The frustrating thing about this issue is that it's completely preventable with one config change that takes 30 seconds to apply. But it's invisible in every standard pre-migration test, and the error it throws at runtime is a Java stack trace that doesn't obviously connect back to a Linked Service setting.

If you're doing this migration right now — go add supportV1DataTypes: true to your Oracle V2 Linked Service before you run anything in production. You can do it through the ADF UI in Additional connection properties without touching any JSON. It costs you nothing and it prevents the entire class of Parquet precision failures that Oracle V2 introduces for unconstrained NUMBER columns.

If you're already hitting the ArrayIndexOutOfBoundsException: 255 or 256 error — set numberPrecision to 38 in the Copy Activity source tab to unblock production immediately, then go add supportV1DataTypes to the Linked Service as the permanent fix.

✅ Quick summary: Oracle V2 defaults to decimal(256, 130) for unconstrained NUMBER columns. Parquet max precision is 38. Permanent fix: enable supportV1DataTypes: true on the Oracle V2 Linked Service — via UI Additional properties or JSON editor. Temporary fix: set numberPrecision=38 and numberScale in the Copy Activity source tab.