SDK Documentation

Deploy Causal MMA locally for maximum data privacy, offline processing, and seamless integration with your infrastructure.

Cloud API vs Local SDK

Cloud API: Best for most users. Zero infrastructure costs, instant setup, $149-$799/month.

Local SDK: For enterprises with existing AI infrastructure, strict data privacy requirements, or massive datasets (10M+ rows). Contact us for SDK access →

Installation

Requirements

Python 3.8 or higher
Valid SDK license key (contact us at cm-sales@infinidatum.net)
Recommended: GPU for large datasets (optional, CPU works fine for < 1M rows)

Install via pip

pip install causalmma-client

Verify Installation

python -c "from causalmma_client import LocalEngine; print('✅ SDK installed successfully')"
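If the one-liner above fails, it can help to tell an unsupported interpreter apart from a missing package. The sketch below checks the interpreter against the Python 3.8 minimum listed under Requirements and looks for the `causalmma_client` module without importing it. The helper name `check_environment` is illustrative and not part of the SDK.

```python
import importlib.util
import sys

MIN_PYTHON = (3, 8)  # minimum version listed under Requirements


def check_environment(package="causalmma_client"):
    """Report whether the interpreter and package look ready (hypothetical helper)."""
    return {
        "python_version": ".".join(str(part) for part in sys.version_info[:3]),
        "python_ok": sys.version_info >= MIN_PYTHON,
        # find_spec returns None when a top-level module is not installed
        "package_found": importlib.util.find_spec(package) is not None,
    }


if __name__ == "__main__":
    for key, value in check_environment().items():
        print(f"{key}: {value}")
```

A `package_found: False` result usually points at pip having installed into a different environment than the interpreter being run.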

Quick Start

Initialize the Client

from causalmma_client import LocalEngine

# Initialize with your API key (production)
engine = LocalEngine(
    api_key="ca_live_YOUR_API_KEY",
    control_plane_url="https://ops.causalmma.com"  # Production control plane
)

# For offline/air-gapped deployments (no license validation)
engine = LocalEngine(
    api_key="ca_live_YOUR_API_KEY",
    offline_mode=True
)
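The snippets in this guide pass the key as a string literal for brevity. In a real deployment it is usually safer to read it from the environment, as the data pipeline example later in this guide does with `CAUSALMMA_API_KEY`. The following is a minimal sketch of that pattern; `load_api_key` is a hypothetical helper, not an SDK function.

```python
import os


def load_api_key(env=None, var_name="CAUSALMMA_API_KEY"):
    """Return the key from the environment, or raise if it is missing (hypothetical helper)."""
    env = os.environ if env is None else env
    key = env.get(var_name, "").strip()
    if not key:
        raise RuntimeError(f"Set {var_name} before initializing the engine")
    return key


# Demonstration with a stand-in mapping; in practice call load_api_key() with
# no arguments and pass the result as api_key= when constructing LocalEngine.
api_key = load_api_key({"CAUSALMMA_API_KEY": "ca_live_YOUR_API_KEY"})
```

Failing early with a clear message avoids constructing the engine with an empty key.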

Basic Attribution Analysis

import pandas as pd
from causalmma_client import LocalEngine

# Initialize engine
engine = LocalEngine(
    api_key="ca_live_YOUR_API_KEY",
    control_plane_url="https://ops.causalmma.com"
)

# Prepare your data (stays local!)
df = pd.DataFrame([
    {"customer_id": "c1", "timestamp": "2025-01-01T10:00:00", "channel": "email", "conversion": 0, "ad_exposure": 1},
    {"customer_id": "c1", "timestamp": "2025-01-01T12:00:00", "channel": "facebook", "conversion": 0, "ad_exposure": 1},
    {"customer_id": "c1", "timestamp": "2025-01-01T14:00:00", "channel": "paid_search", "conversion": 1, "conversion_value": 150.0, "ad_exposure": 1}
])

# Run attribution analysis (100% local execution)
result = engine.analyze(
    df=df,
    model="data_driven",
    treatment="ad_exposure",
    outcome="conversion"
)

print(result["attribution_weights"])

Response

{
    "attribution_weights": {
        "email": 0.35,
        "facebook": 0.25,
        "paid_search": 0.40
    },
    "attributed_revenue": {
        "email": 52.50,
        "facebook": 37.50,
        "paid_search": 60.00
    },
    "confidence_intervals": {
        "email": {"lower": 0.28, "upper": 0.42},
        "facebook": {"lower": 0.18, "upper": 0.32},
        "paid_search": {"lower": 0.33, "upper": 0.47}
    },
    "p_values": {
        "email": 0.002,
        "facebook": 0.015,
        "paid_search": 0.0001
    },
    "method_used": "doubly_robust",
    "total_revenue": 150.00,
    "causal_graph": {...}
}
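In the sample response, each channel's attributed revenue equals its weight times the total revenue (for example, 0.35 × 150.00 = 52.50), and the weights sum to 1. Assuming that relationship holds for your results too (the sample suggests it, but this guide does not state it as a guarantee), a quick consistency check might look like the sketch below. `check_response` is a hypothetical helper.

```python
import math

# Values copied from the sample response above
response = {
    "attribution_weights": {"email": 0.35, "facebook": 0.25, "paid_search": 0.40},
    "attributed_revenue": {"email": 52.50, "facebook": 37.50, "paid_search": 60.00},
    "total_revenue": 150.00,
}


def check_response(result, tol=0.01):
    """Return a list of inconsistencies found in a result dict (hypothetical helper)."""
    problems = []
    weights = result["attribution_weights"]
    if not math.isclose(sum(weights.values()), 1.0, abs_tol=tol):
        problems.append("weights do not sum to 1")
    for channel, weight in weights.items():
        expected = weight * result["total_revenue"]
        actual = result["attributed_revenue"][channel]
        if not math.isclose(expected, actual, abs_tol=tol):
            problems.append(f"{channel}: expected {expected:.2f}, got {actual:.2f}")
    return problems


print(check_response(response))
```

For the sample response the check finds no inconsistencies and returns an empty list.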

Attribution Models

The SDK supports the same attribution models as the Cloud API:

Model	Description	Best For	Method
`data_driven`	AI-powered causal inference	Most accurate results	Doubly Robust Estimation
`shapley`	Game-theoretic fair credit	Provably fair attribution	Shapley Values
`propensity_score`	Propensity score matching	Comparing similar customers	PSM with caliper matching
`instrumental_variables`	Two-stage least squares	Handling unmeasured confounding	2SLS with IV validation
`time_decay`	Recency-weighted attribution	Recent touchpoints matter more	Exponential decay
`position`	First & last touch emphasized	Awareness + conversion focus	U-shaped weighting
`linear`	Equal credit to all touchpoints	Baseline comparison	Uniform distribution
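To make the three rule-based rows concrete, here is a sketch of how `linear`, `position`, and `time_decay` weights are commonly defined. These are generic textbook formulations, not the SDK's implementation. The 40/20/40 split for the U-shaped model and the 24-hour half-life are assumptions chosen for illustration.

```python
def linear_weights(n):
    """Equal credit to each of n touchpoints."""
    return [1.0 / n] * n


def position_weights(n, edge_share=0.4):
    """U-shaped credit: edge_share each to first and last, remainder split across the middle."""
    if n == 1:
        return [1.0]
    if n == 2:
        return [0.5, 0.5]
    middle = (1.0 - 2 * edge_share) / (n - 2)
    return [edge_share] + [middle] * (n - 2) + [edge_share]


def time_decay_weights(hours_before_conversion, half_life_hours=24.0):
    """Exponential decay: credit halves for every half_life_hours before conversion."""
    raw = [0.5 ** (h / half_life_hours) for h in hours_before_conversion]
    total = sum(raw)
    return [r / total for r in raw]


# The three Quick Start touchpoints occur 4, 2, and 0 hours before the
# conversion at 14:00.
channels = ["email", "facebook", "paid_search"]
for name, weights in [
    ("linear", linear_weights(3)),
    ("position", position_weights(3)),
    ("time_decay", time_decay_weights([4, 2, 0])),
]:
    print(name, dict(zip(channels, (round(w, 3) for w in weights))))
```

The causal models in the table (`data_driven`, `propensity_score`, `instrumental_variables`) estimate effects from the data rather than applying a fixed rule, so they cannot be reduced to a weighting function like these.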

Advanced Features

Batch Processing

import pandas as pd
from causalmma_client import LocalEngine

engine = LocalEngine(
    api_key="ca_live_YOUR_API_KEY",
    control_plane_url="https://ops.causalmma.com"
)

# Process large dataset locally (no network transfer!)
df = pd.read_csv("large_dataset.csv")  # 10M rows

result = engine.analyze(
    df=df,
    model="data_driven",
    treatment="ad_exposure",
    outcome="purchase"
)

# Results computed locally
print(f"Attribution weights: {result['attribution_weights']}")
print(f"Processing time: {result.get('execution_time_ms', 'N/A')}ms")

Quick Analysis Function

from causalmma_client import analyze
import pandas as pd

# Quick analysis without explicit engine initialization
df = pd.read_csv("your_data.csv")

result = analyze(
    df=df,
    api_key="ca_live_YOUR_API_KEY",
    model="shapley",
    treatment="ad_exposure",
    outcome="purchase"
)

print(result)

Advanced: Custom Configuration

from causalmma_client import LocalEngine
import pandas as pd

# Initialize the engine (same connection settings as the earlier examples)
engine = LocalEngine(
    api_key="ca_live_YOUR_API_KEY",
    control_plane_url="https://ops.causalmma.com"
)

# Large dataset processing
df = pd.read_csv("10M_rows_dataset.csv")

result = engine.analyze(
    df=df,
    model="propensity_score",  # Use PSM model
    treatment="ad_exposure",
    outcome="purchase"
)

print(f"Average Treatment Effect: ${result.get('ate', 0):.2f}")
print(f"Confidence Interval: {result.get('confidence_interval', {})}")

Offline Mode (Air-Gapped Deployment)

from causalmma_client import LocalEngine
import pandas as pd

# For environments without internet access
engine = LocalEngine(
    api_key="ca_live_YOUR_API_KEY",
    offline_mode=True  # No license validation, no control plane calls
)

# All processing happens locally, no network required
df = pd.read_csv("sensitive_data.csv")

result = engine.analyze(
    df=df,
    model="data_driven",
    treatment="treatment",
    outcome="outcome"
)

print(result)

Data Privacy & Security

🔒 100% Local Processing

Your data never leaves your servers - All computation happens locally
No customer data sent - Control plane calls carry license and telemetry metadata only (no data sent)
Offline mode available - Set offline_mode=True for air-gapped deployments
Simpler HIPAA, GDPR, and SOC 2 compliance - Because data stays on your infrastructure, compliance is simplified (it remains your responsibility)

Performance

Dataset Size	Cloud API (Network Transfer)	Local SDK (No Transfer)	Speedup
10,000 rows	~500ms	~100ms	5x faster
100,000 rows	~5s	~800ms	6x faster
1,000,000 rows	~60s	~4s	15x faster
10,000,000 rows	Not recommended (transfer time)	~30s	30x+ faster

* Benchmarks on AWS c5.4xlarge (16 vCPU, 32 GB RAM). GPU acceleration available for datasets > 1M rows.
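The Speedup column is the cloud time divided by the local time. The short sketch below reproduces it from the table's own figures, with all times converted to milliseconds. The 10M-row entry is left out because the table gives no Cloud API timing for it, so its "30x+" figure cannot be derived this way.

```python
# Timings in milliseconds, copied from the benchmark table above
benchmarks = {
    10_000: {"cloud_ms": 500, "local_ms": 100},
    100_000: {"cloud_ms": 5_000, "local_ms": 800},
    1_000_000: {"cloud_ms": 60_000, "local_ms": 4_000},
}


def speedup(cloud_ms, local_ms):
    """Ratio of Cloud API time to Local SDK time."""
    return cloud_ms / local_ms


for rows, timing in benchmarks.items():
    print(f"{rows:>9,} rows: {speedup(**timing):.2f}x")
```

The 100,000-row ratio works out to 6.25, which the table rounds to 6x.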

Configuration Options

Client Configuration

from causalmma_client import LocalEngine

engine = LocalEngine(
    api_key="ca_live_YOUR_API_KEY",

    # Control Plane (Production)
    control_plane_url="https://ops.causalmma.com",

    # Privacy (Air-gapped deployments)
    offline_mode=False  # Set to True for air-gapped/offline environments
)

# The SDK automatically:
# - Validates your license with the control plane
# - Fetches feature flags and permissions
# - Starts background heartbeat (telemetry)
# - Processes all data 100% locally (never sent to server)

Analysis Models Available

# Available models in the SDK:
result = engine.analyze(
    df=df,
    model="data_driven",    # Doubly robust estimation (default, most accurate)
    # model="shapley",      # Game-theoretic fair attribution
    # model="propensity_score",  # Propensity score matching
    # model="instrumental_variables",  # Two-stage least squares
    # model="time_decay",   # Recency-weighted
    # model="position",     # First & last touch emphasis
    # model="linear",       # Equal credit baseline
    treatment="ad_exposure",
    outcome="purchase"
)
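Because the model is selected by a plain string, a typo would only surface when the analysis runs. How the SDK reports an unknown model name is not documented here, so the sketch below validates the name up front against the identifiers listed in this guide. `validate_model` is a hypothetical helper.

```python
# Model identifiers as listed in the Attribution Models table
SUPPORTED_MODELS = frozenset({
    "data_driven",
    "shapley",
    "propensity_score",
    "instrumental_variables",
    "time_decay",
    "position",
    "linear",
})


def validate_model(name):
    """Return the name unchanged if it is a documented model, else raise ValueError."""
    if name not in SUPPORTED_MODELS:
        raise ValueError(
            f"Unknown model {name!r}; expected one of {sorted(SUPPORTED_MODELS)}"
        )
    return name


model = validate_model("propensity_score")
```

The validated value can then be passed as `model=` to `engine.analyze`.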

Integration Examples

Integrate with Data Pipeline

import os
import pandas as pd
from sqlalchemy import text
from causalmma_client import LocalEngine

# In your ETL pipeline
# `conn` is assumed to be a SQLAlchemy engine or connection for your warehouse
def process_daily_attribution(date, conn):
    engine = LocalEngine(
        api_key=os.environ["CAUSALMMA_API_KEY"],
        control_plane_url="https://ops.causalmma.com"
    )

    # Load data from your data warehouse (stays local!)
    # The date is bound as a parameter rather than interpolated into the SQL string
    query = text("""
        SELECT customer_id, timestamp, channel, ad_exposure, conversion, conversion_value
        FROM touchpoints
        WHERE DATE(timestamp) = :date
    """)
    df = pd.read_sql(query, conn, params={"date": date})

    # Run attribution (100% local processing, no data sent)
    result = engine.analyze(
        df=df,
        model="data_driven",
        treatment="ad_exposure",
        outcome="conversion"
    )

    # Save results back to warehouse
    results_df = pd.DataFrame(list(result['attribution_weights'].items()),
                             columns=['channel', 'attribution_weight'])
    results_df['date'] = date
    results_df.to_sql('attribution_results', conn, if_exists='append')

    return result
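When a pipeline filters by a caller-supplied date, binding the value as a query parameter avoids building SQL from strings. The sketch below demonstrates the idea with the standard-library `sqlite3` module and an in-memory table. The table contents are made-up sample rows, and `fetch_touchpoints` is a hypothetical helper; your warehouse driver may use a different placeholder style than `?`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE touchpoints (customer_id TEXT, timestamp TEXT, channel TEXT, "
    "ad_exposure INTEGER, conversion INTEGER, conversion_value REAL)"
)
conn.executemany(
    "INSERT INTO touchpoints VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("c1", "2025-01-01T10:00:00", "email", 1, 0, None),
        ("c1", "2025-01-01T14:00:00", "paid_search", 1, 1, 150.0),
        ("c2", "2025-01-02T09:00:00", "facebook", 1, 0, None),
    ],
)


def fetch_touchpoints(connection, date):
    """Fetch one day's touchpoints with the date bound as a parameter."""
    query = (
        "SELECT customer_id, timestamp, channel, ad_exposure, conversion, conversion_value "
        "FROM touchpoints WHERE DATE(timestamp) = ?"
    )
    return connection.execute(query, (date,)).fetchall()


rows = fetch_touchpoints(conn, "2025-01-01")
print(len(rows))
```

With pandas, the same binding is available through the `params` argument of `pd.read_sql`.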

Real-Time Scoring API

from causalmma_client import LocalEngine
from flask import Flask, request, jsonify
import pandas as pd

app = Flask(__name__)
engine = LocalEngine(
    api_key="ca_live_YOUR_API_KEY",
    control_plane_url="https://ops.causalmma.com"
)

@app.route('/score', methods=['POST'])
def score_attribution():
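    # Reject requests that lack a JSON body or a 'touchpoints' list
    data = request.get_json(silent=True)
    if not data or 'touchpoints' not in data:
        return jsonify({"error": "JSON body with 'touchpoints' is required"}), 400
    df = pd.DataFrame(data['touchpoints'])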

    result = engine.analyze(
        df=df,
        model="data_driven",
        treatment=data.get('treatment', 'ad_exposure'),
        outcome=data.get('outcome', 'conversion')
    )

    return jsonify(result)

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=8000)
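An endpoint like this receives arbitrary JSON, so it is worth checking the payload's shape before building a DataFrame from it. The sketch below is one way to do that in plain Python. The required column names are an assumption based on the Quick Start sample data, and `validate_payload` is a hypothetical helper, not part of the SDK or Flask.

```python
# Columns assumed from the Quick Start sample; adjust to your schema
REQUIRED_COLUMNS = ("customer_id", "timestamp", "channel")


def validate_payload(payload, treatment="ad_exposure", outcome="conversion"):
    """Return (touchpoints, error); error is None when the payload looks usable."""
    if not isinstance(payload, dict):
        return None, "request body must be a JSON object"
    touchpoints = payload.get("touchpoints")
    if not isinstance(touchpoints, list) or not touchpoints:
        return None, "'touchpoints' must be a non-empty list"
    needed = set(REQUIRED_COLUMNS) | {treatment, outcome}
    for i, row in enumerate(touchpoints):
        if not isinstance(row, dict):
            return None, f"touchpoint {i} must be an object"
        missing = needed - set(row)
        if missing:
            return None, f"touchpoint {i} is missing {sorted(missing)}"
    return touchpoints, None


ok_payload = {
    "touchpoints": [
        {"customer_id": "c1", "timestamp": "2025-01-01T10:00:00",
         "channel": "email", "ad_exposure": 1, "conversion": 0},
    ]
}
touchpoints, error = validate_payload(ok_payload)
```

In the route handler, a non-None error would typically be returned to the caller with a 400 status instead of being passed on to the analysis.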

Pricing

SDK Licensing

The SDK requires a separate enterprise license. Pricing is based on:

Number of servers/instances running the SDK
Volume of data processed (rows per month)
Support level (standard, priority, or dedicated)

Typical pricing:

Small deployment (1-3 servers, < 10M rows/month): $1,500-$3,000/month
Medium deployment (4-10 servers, 10M-100M rows/month): $3,000-$8,000/month
Enterprise deployment (10+ servers, 100M+ rows/month): Custom pricing

Contact us for a quote: cm-sales@infinidatum.net

Support

Support Level	Included	Response Time
Standard	Email support, documentation	48 hours
Priority	Email + Slack, dedicated support engineer	4 hours
Dedicated	24/7 support, dedicated team, custom SLA	1 hour

Contact

📧 Email: cm-support@infinidatum.net

📚 Documentation: API Docs | Code Examples

🐙 GitHub: github.com/rdmurugan/causallm

FAQ

When should I use the SDK instead of the Cloud API?

Use the SDK if you:

Already have AI infrastructure (GPUs, servers) and want to avoid cloud API costs
Have strict data privacy requirements (HIPAA, GDPR, SOC 2)
Process massive datasets (10M+ rows regularly) where network transfer is slow
Need offline/air-gapped deployment

Otherwise, the Cloud API is more cost-effective ($149-$799/month vs. $1,500+/month for the SDK).

Does the SDK send my data to your servers?

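No. All attribution algorithms run 100% locally on your infrastructure. In online mode, the SDK contacts the control plane to validate your license, fetch feature flags and permissions, and send a background heartbeat; these calls carry metadata only, never your data. For air-gapped environments, set offline_mode=True to disable all control plane calls.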

Can I use both the Cloud API and SDK?

Yes! Many customers use the Cloud API for real-time/small workloads and the SDK for batch processing large datasets.

What Python versions are supported?

Python 3.8, 3.9, 3.10, 3.11, and 3.12. We recommend Python 3.10+ for best performance.

Do I need a GPU?

No, but a GPU is recommended for datasets > 1M rows. The SDK auto-detects a GPU and uses it if one is available. For most use cases, a CPU is sufficient.

Legal and Compliance

Software License

The SDK is licensed, not sold. By installing and using the SDK, you agree to our Terms of Service, which include:

Limited License: Non-exclusive, non-transferable license for the number of servers specified in your agreement
No Redistribution: You may not distribute, sublicense, or share the SDK with third parties
Modifications: You may modify for internal use only
License Keys: Keep your license key confidential; unauthorized sharing may result in termination

Data Privacy

The SDK processes data 100% locally on your infrastructure:

Your Data Stays Local: Attribution algorithms run entirely on your servers—no customer data is sent to Infinidatum
License Validation Only: The SDK contacts our control plane solely to validate your license (metadata only, no customer data)
Offline Mode: Air-gapped deployments can run without any external network calls
Your Compliance Responsibility: You are responsible for HIPAA, GDPR, SOC 2, and other compliance requirements for data processed on your infrastructure

See our Privacy Policy for full details on data handling.

Warranties and Disclaimers

THE SDK IS PROVIDED "AS IS" WITHOUT WARRANTIES OF ANY KIND. Infinidatum does not warrant that the SDK will be error-free, uninterrupted, or suitable for your specific purposes. See our Terms of Service for full disclaimer.

Limitation of Liability

Infinidatum's total liability for all claims related to the SDK shall not exceed the amount you paid for the SDK license in the 12 months preceding the claim. See Terms of Service for details.

Export Compliance

The SDK may be subject to U.S. export control laws. You agree to comply with all applicable export and import laws. You may not use the SDK in embargoed countries or provide access to prohibited parties.

Audit Rights

Infinidatum reserves the right to audit your use of the SDK to verify compliance with the license terms. We will provide 30 days' advance notice for audits.

Ready to Deploy Locally?

By requesting SDK access, you agree to our Terms of Service and Privacy Policy.

Contact for SDK Access | View Cloud API Pricing