Skip to content
0

Amazon Ads Automation - Complete Implementation Guide

This guide documents the successful implementation of an automated Amazon Ads data extraction system. After navigating API limitations and authentication challenges, we built a working solution that fetches campaign data and generates Excel reports with a single command.

Status:Production Ready - Successfully generates reports with 131+ campaigns

🎯 What This System Does

  1. Fetches ALL campaigns from your Amazon Ads account (with pagination support)
  2. Retrieves performance metrics for the last 30 days
  3. Merges lifetime campaign data with recent performance
  4. Generates Excel reports with calculated metrics (CTR, ACOS)
  5. Timestamps output files for version tracking

Sample Output

🚀 Pulling Amazon Ads Campaign Data (all campaigns + 30-day performance)...
📋 Fetching all campaigns...
📊 Found 131 total campaigns
📈 Fetching performance data (30-day lookback)...
📋 Report requested: d6cb3450-9005-4737-b346-a2416fd9d0c7
⏳ Polling report...
✅ Report completed!
✅ Merged 131 campaigns with performance data
✅ Success! Workbook saved to output/reports/202603051206_Ads_Campaign_Report.xlsx
📊 Total campaigns: 131
📈 Active campaigns (last 30 days): 15

📁 Project Structure

ads-direct-sdk/
├── .env                          # Credentials (NEVER commit!)
├── config/
│   └── settings.py               # Report column definitions
├── src/
│   ├── api_client.py             # API client & authentication
│   └── report_fetcher.py         # Report fetching & merging logic
├── scripts/
│   ├── run_weekly.py             # Main orchestration script
│   └── test_connection.py        # Connection test utility
├── data/
│   └── raw/                      # Raw data storage (optional)
├── output/
│   └── reports/                  # Generated Excel reports
└── venv_sdk/                     # Python virtual environment

🛠️ Environment Setup

1. Create Virtual Environment

bash
# Navigate to project directory
cd ads-direct-sdk

# Create virtual environment
python -m venv venv_sdk

# Activate it (macOS/Linux)
source venv_sdk/bin/activate

# Activate it (Windows)
# venv_sdk\Scripts\activate

2. Install Dependencies

bash
pip install python-amazon-ad-api pandas openpyxl python-dotenv click requests

3. Configure Credentials (.env file)

bash
# Amazon Ads API Credentials
CLIENT_ID=amzn1.application-oa2-client.xxxxx
CLIENT_SECRET=amzn1.oa2-cs.v1.xxxxx
REFRESH_TOKEN=Atzr|xxxxx
PROFILE_ID=2502259796317581

# Region and Marketplace
REGION=NA
MARKETPLACE=US

⚠️ Security Note: Never commit the .env file to version control. Add it to .gitignore.


📄 Core Implementation Files

1. config/settings.py

Defines the columns to fetch from the Amazon Ads API:

python
# config/settings.py
from ad_api.base import Marketplaces

MARKETPLACE_MAP = {"US": Marketplaces.US, "UK": Marketplaces.UK, "JP": Marketplaces.JP}

# Metrics for campaign reports
# Note: API column names use 14d attribution window, but we rename them in output
SP_REPORT_COLUMNS = [
    "campaignName",
    "impressions",
    "clicks",
    "cost",
    "sales14d",           # Renamed to 'sales' in output
    "purchases14d",       # Renamed to 'purchases' in output
    "startDate",
    "endDate",
]

2. src/api_client.py

Handles authentication and provides the Reports API client:

python
# src/api_client.py
import os
import json
import requests
from dotenv import load_dotenv
from ad_api.api.reports import Reports
from ad_api.base import Marketplaces

load_dotenv()


class AdsClient:
    """Amazon Ads API Client with authentication and report management"""

    def __init__(self):
        self.client_id = os.getenv("CLIENT_ID")
        self.client_secret = os.getenv("CLIENT_SECRET")
        self.refresh_token = os.getenv("REFRESH_TOKEN")
        self.profile_id = str(os.getenv("PROFILE_ID")).strip()
        self.api_base_url = "https://advertising-api.amazon.com"
        self.marketplace = os.getenv("MARKETPLACE", "US")

    def get_access_token(self):
        """Exchange refresh token for access token"""
        payload = {
            "grant_type": "refresh_token",
            "client_id": self.client_id,
            "client_secret": self.client_secret,
            "refresh_token": self.refresh_token,
        }
        res = requests.post("https://api.amazon.com/auth/o2/token", data=payload)
        res.raise_for_status()
        return res.json().get("access_token")

    def get_headers(self):
        """Get headers for v3 Campaigns API"""
        token = self.get_access_token()
        schema_version = "application/vnd.spCampaign.v3+json"
        return {
            "Authorization": f"Bearer {token}",
            "Amazon-Advertising-API-ClientId": self.client_id,
            "Amazon-Advertising-API-Scope": self.profile_id,
            "Content-Type": schema_version,
            "Accept": schema_version,
        }

    def list_campaigns(self):
        """Test connection by listing campaigns"""
        url = f"{self.api_base_url}/sp/campaigns/list"
        headers = self.get_headers()
        payload = {}
        res = requests.post(url, headers=headers, data=json.dumps(payload))
        return res

    def _raw_post(self, url, headers, payload):
        """Internal method for raw POST requests"""
        return requests.post(url, headers=headers, data=json.dumps(payload))

    def get_report_manager(self):
        """Returns a Reports API v3 client for async report operations"""
        marketplace_map = {
            "US": Marketplaces.US,
            "UK": Marketplaces.UK,
            "JP": Marketplaces.JP,
            "EU": Marketplaces.EU,
        }
        marketplace = marketplace_map.get(self.marketplace, Marketplaces.US)

        return Reports(
            account="default",
            marketplace=marketplace,
            credentials={
                "client_id": self.client_id,
                "client_secret": self.client_secret,
                "refresh_token": self.refresh_token,
                "profile_id": self.profile_id,
            },
        )

3. src/report_fetcher.py

Core logic for fetching campaigns with pagination and merging with performance data:

python
# src/report_fetcher.py
import datetime
import time
import uuid
import re
import json

from config.settings import SP_REPORT_COLUMNS
from ad_api.base.exceptions import AdvertisingApiException


class ReportFetcher:
    """
    Fetches ALL Sponsored Products campaigns with performance data.

    Uses a hybrid approach:
    1. Get all campaigns via campaigns list API (with pagination)
    2. Get performance data via Reports API (limited to 30-day range)
    3. Merge together - inactive campaigns get 0 metrics
    """

    def __init__(self, client):
        self.client = client
        self.api = client.get_report_manager()

    def fetch_sp_campaigns(self, days_back=30):
        """
        Fetch ALL campaigns with performance data.

        Args:
            days_back: Number of days for performance lookback (max 30)

        Returns:
            List of campaign dictionaries with lifetime config + recent performance
        """
        # Step 1: Get all campaigns with pagination
        print("📋 Fetching all campaigns...")
        all_campaigns = self._fetch_all_campaigns_paginated()
        print(f"📊 Found {len(all_campaigns)} total campaigns")

        # Step 2: Get performance data via Reports API
        print(f"📈 Fetching performance data ({days_back}-day lookback)...")
        download_url = self._fetch_report(days_back)

        if download_url:
            # Step 3: Merge performance data with all campaigns
            return self._merge_campaign_data(all_campaigns, download_url)
        else:
            print("⚠️  Returning campaigns without performance data")
            return all_campaigns

    def _fetch_all_campaigns_paginated(self):
        """Fetch all campaigns using pagination (handles 100+ campaigns)"""
        url = f"{self.client.api_base_url}/sp/campaigns/list"
        headers = self.client.get_headers()

        all_campaigns = []
        next_token = None

        while True:
            payload = {"maxResults": 100}
            if next_token:
                payload["nextToken"] = next_token

            res = self.client._raw_post(url, headers, payload)
            if res.status_code != 200:
                print(f"❌ Failed to fetch campaigns: {res.status_code}")
                break

            data = res.json()
            campaigns = data.get("campaigns", [])
            all_campaigns.extend(campaigns)

            next_token = data.get("nextToken")
            if not next_token:
                break

        return all_campaigns

    def _merge_campaign_data(self, all_campaigns, download_url):
        """Merge performance data from report with all campaigns"""
        import pandas as pd

        # Load performance data from report
        df_perf = pd.read_json(download_url, compression="gzip")

        # Rename columns to match Amazon Ads UI (remove 14d suffix)
        df_perf = df_perf.rename(columns={
            "sales14d": "sales",
            "purchases14d": "purchases",
        })

        # Create lookup by campaign name
        perf_lookup = {}
        for _, row in df_perf.iterrows():
            name = row.get("campaignName")
            if name:
                perf_lookup[name] = row.to_dict()

        # Merge with all campaigns
        merged = []
        for c in all_campaigns:
            name = c.get("name", "")
            perf = perf_lookup.get(name, {})

            row = {
                "campaignName": name,
                "campaignId": c.get("campaignId", ""),
                "state": c.get("state", ""),
                "startDate": c.get("startDate", ""),
                "endDate": c.get("endDate", ""),
                "impressions": perf.get("impressions", 0),
                "clicks": perf.get("clicks", 0),
                "cost": perf.get("cost", 0),
                "sales": perf.get("sales", 0),
                "purchases": perf.get("purchases", 0),
            }
            merged.append(row)

        print(f"✅ Merged {len(merged)} campaigns with performance data")
        return merged

    def _fetch_report(self, days_back=30):
        """Fetch performance data via async reports API (max 30 days)"""
        days_back = min(days_back, 30)

        end_date = datetime.date.today() - datetime.timedelta(days=1)
        start_date = end_date - datetime.timedelta(days=days_back)

        body = {
            "name": f"SP_Report_{start_date}_{uuid.uuid4().hex[:8]}",
            "startDate": start_date.strftime("%Y-%m-%d"),
            "endDate": end_date.strftime("%Y-%m-%d"),
            "configuration": {
                "adProduct": "SPONSORED_PRODUCTS",
                "groupBy": ["campaign"],
                "columns": SP_REPORT_COLUMNS,
                "reportTypeId": "spCampaigns",
                "timeUnit": "SUMMARY",
                "format": "GZIP_JSON",
            },
        }

        try:
            res = self.api.post_report(body=body)
            report_id = res.payload.get("reportId")
            print(f"📋 Report requested: {report_id}")
        except AdvertisingApiException as e:
            status_code = getattr(e, 'status_code', getattr(e, 'code', 0))
            if status_code == 425 and hasattr(e, 'content'):
                match = re.search(r'duplicate of :\s*([a-f0-9-]+)', str(e.content))
                if match:
                    report_id = match.group(1)
                    print(f"⚠️  Duplicate request detected, reusing existing report: {report_id}")
                else:
                    raise
            else:
                raise

        print(f"⏳ Polling report {report_id}...")
        for i in range(40):  # Up to 20 minutes
            status_res = self.api.get_report(report_id)
            status = status_res.payload.get("status")
            if status == "COMPLETED":
                print(f"✅ Report completed!")
                return status_res.payload.get("url")
            elif status == "FAILED":
                print(f"❌ Report generation failed: {status_res.payload.get('failureReason')}")
                return None
            time.sleep(30)

        print("⏱️  Report generation timed out")
        return None

4. scripts/run_weekly.py

Main orchestration script that generates the Excel report:

python
#!/usr/bin/env python3
"""
Weekly Amazon Ads Data Automation Script
Generates Excel report with all campaigns and 30-day performance metrics
"""

import os
import sys
from datetime import datetime

# Add project root to Python path
sys.path.append(os.path.abspath(os.path.join(os.path.dirname(__file__), "..")))

import click
import pandas as pd
from src.api_client import AdsClient
from src.report_fetcher import ReportFetcher


@click.command()
def main():
    """Main function to orchestrate campaign data collection"""

    client = AdsClient()
    fetcher = ReportFetcher(client)

    print("🚀 Pulling Amazon Ads Campaign Data (all campaigns + 30-day performance)...")
    campaigns = fetcher.fetch_sp_campaigns(days_back=30)

    if campaigns:
        # Convert list of dicts to DataFrame
        df = pd.DataFrame(campaigns)

        # Calculate Excel-ready metrics (handle division by zero)
        df["CTR"] = (df["clicks"] / df["impressions"]).replace(
            [float("inf"), -float("inf")], 0
        ).fillna(0)
        df["ACOS"] = (df["cost"] / df["sales"]).replace(
            [float("inf"), -float("inf")], 0
        ).fillna(1)  # ACOS = 1 (100%) when no sales

        # Generate timestamped filename (timestamp prepended)
        timestamp = datetime.now().strftime("%Y%m%d%H%M")
        output_path = f"output/reports/{timestamp}_Ads_Campaign_Report.xlsx"

        # Sort by sales descending and save
        df_sorted = df.sort_values("sales", ascending=False)
        df_sorted.to_excel(output_path, index=False)

        # Count campaigns with activity
        active_count = (df["impressions"] > 0).sum()

        print(f"✅ Success! Workbook saved to {output_path}")
        print(f"📊 Total campaigns: {len(df_sorted)}")
        print(f"📈 Active campaigns (last 30 days): {active_count}")
        print(f"📅 Campaign date range: {df_sorted['startDate'].min()} to {df_sorted['endDate'].max()}")
    else:
        print("❌ Error: Failed to fetch campaign data.")


if __name__ == "__main__":
    main()

5. scripts/test_connection.py

Utility script to test API connectivity:

python
# scripts/test_connection.py
import os
import sys

sys.path.append(os.path.abspath(os.path.join(os.path.dirname(__file__), "..")))

from src.api_client import AdsClient


def test():
    print("🔌 Manual Connection Test (Bypassing SDK)...")
    try:
        client = AdsClient()
        response = client.list_campaigns()

        if response.status_code == 200:
            print("✅ SUCCESS! We are connected.")
            data = response.json()
            print(f"📦 Found {len(data.get('campaigns', []))} campaigns.")
        else:
            print(f"❌ FAILED! Status: {response.status_code}")
            print(f"📝 Response: {response.text}")

    except Exception as e:
        print(f"💥 Error: {str(e)}")


if __name__ == "__main__":
    test()

🚀 Usage

Test Connection

Before running the full report, verify your credentials work:

bash
source venv_sdk/bin/activate
python scripts/test_connection.py

Expected output:

🔌 Manual Connection Test (Bypassing SDK)...
✅ SUCCESS! We are connected.
📦 Found 2 campaigns.

Generate Report

Run the main script to fetch all campaigns and generate the Excel report:

bash
python scripts/run_weekly.py

Expected output:

🚀 Pulling Amazon Ads Campaign Data (all campaigns + 30-day performance)...
📋 Fetching all campaigns...
📊 Found 131 total campaigns
📈 Fetching performance data (30-day lookback)...
📋 Report requested: d6cb3450-9005-4737-b346-a2416fd9d0c7
⏳ Polling report...
✅ Report completed!
✅ Merged 131 campaigns with performance data
✅ Success! Workbook saved to output/reports/202603051206_Ads_Campaign_Report.xlsx
📊 Total campaigns: 131
📈 Active campaigns (last 30 days): 15

📊 Output Report Structure

Filename Format

{YYYYMMDDHHMM}_Ads_Campaign_Report.xlsx

Example: 202603051206_Ads_Campaign_Report.xlsx

Columns

ColumnDescription
campaignNameCampaign name
campaignIdUnique campaign identifier
stateCampaign state (ENABLED, PAUSED, ARCHIVED)
startDateCampaign start date
endDateCampaign end date (if applicable)
impressionsImpressions in last 30 days
clicksClicks in last 30 days
costSpend in last 30 days
sales14-day attributed sales
purchases14-day attributed units sold
CTRClick-through rate (calculated)
ACOSAdvertising cost of sales (calculated)

Note: Column names sales and purchases match the Amazon Ads UI. The API returns 14-day attribution window data (sales14d, purchases14d), which we rename for clarity.


🔧 Troubleshooting

Issue: "Invalid Input exception - profile ID required"

Solution: Check your .env file uses PROFILE_ID (singular), not PROFILE_IDS:

bash
PROFILE_ID=2502259796317581

Issue: "Server cannot provide a response with Content-Type"

Solution: Ensure the v3 schema headers are set in get_headers ():

python
schema_version = "application/vnd.spCampaign.v3+json"
headers = {
    "Content-Type": schema_version,
    "Accept": schema_version,
}

Issue: "startDate to endDate range must not exceed maximum range (31 days)"

Solution: The Reports API has a 30-day maximum lookback per request. Keep days_back=30 or less.

Issue: Report generation times out

Solution: Reports can take 2-3 hours for large accounts. The script polls for 20 minutes by default. Increase the poll range in _fetch_report ():

python
for i in range(80):  # Up to 40 minutes

Issue: Only 100 campaigns returned

Solution: The API paginates at 100 campaigns. Our implementation handles pagination automatically via _fetch_all_campaigns_paginated ().


📚 API Limitations & Workarounds

LimitationImpactOur Solution
Reports API: 30-day max lookbackCannot get lifetime performance in one reportHybrid approach: campaigns API + reports API
Reports API: Only active campaignsInactive campaigns don't appear in reportsFetch all campaigns via campaigns list API
Campaigns API: No performance metricsOnly config data (budget, dates, state)Merge with reports API data by campaign name
Attribution window: 14d fixedSales/purchases use 14-day windowAccept API default, rename columns for clarity

🎓 Key Learnings

1. Hybrid Data Fetching Strategy

The Amazon Ads API has two main endpoints for campaign data:

EndpointProvidesLimitation
/sp/campaigns/listAll campaigns, config dataNo performance metrics
/reporting/reportsPerformance metricsOnly campaigns with activity in date range

Our solution: Fetch from both and merge by campaign name.

2. Pagination is Essential

If you have more than 100 campaigns, you must handle pagination:

python
while True:
    payload = {"maxResults": 100}
    if next_token:
        payload["nextToken"] = next_token

    res = requests.post(url, headers=headers, data=json.dumps(payload))
    data = res.json()
    all_campaigns.extend(data.get("campaigns", []))

    next_token = data.get("nextToken")
    if not next_token:
        break

3. Column Naming Matters

The API returns sales14d and purchases14d, but the Amazon Ads UI shows Sales and Purchases. We rename columns in the output to match user expectations:

python
df_perf = df_perf.rename(columns={
    "sales14d": "sales",
    "purchases14d": "purchases",
})

📅 Automation Options

Option 1: Cron (macOS/Linux)

Edit crontab:

bash
crontab -e

Add weekly run (every Monday at 6 AM):

bash
0 6 * * 1 cd /path/to/ads-direct-sdk && source venv_sdk/bin/activate && python scripts/run_weekly.py >> output/reports/weekly.log 2>&1

Option 2: Task Scheduler (Windows)

  1. Open Task Scheduler
  2. Create Basic Task
  3. Trigger: Weekly (Monday, 6 AM)
  4. Action: Start a program
    • Program: python.exe
    • Arguments: scripts/run_weekly.py
    • Start in: C:\path\to\ads-direct-sdk

Option 3: Manual Weekly Run

Simply run when needed:

bash
source venv_sdk/bin/activate
python scripts/run_weekly.py

🔐 Security Best Practices

  1. Never commit .env - Add to .gitignore:

    gitignore
    .env
    __pycache__/
    *.pyc
    output/reports/*.xlsx
  2. Use environment variables for credentials (already implemented via python-dotenv)

  3. Restrict file permissions on .env:

    bash
    chmod 600 .env
  4. Rotate credentials periodically via Amazon Advertising Console


📖 Additional Resources


🎉 Conclusion

This implementation provides a robust, production-ready solution for automating Amazon Ads data extraction. By combining the campaigns list API (for complete campaign inventory) with the reports API (for performance metrics), we overcome individual API limitations and generate comprehensive reports suitable for analysis and decision-making.

Next Steps:

  • Customize column selections in config/settings.py
  • Add additional report types (search terms, targeting, products)
  • Implement data visualization or dashboard integration
  • Set up automated weekly scheduling
最近更新