Development Guide

Prerequisites

  • Python 3.9 or higher
  • Apify account and API token
  • Google Cloud Platform project with BigQuery enabled
  • Access to the appropriate environment (dev/staging/production)

Local Setup

1. Install Dependencies

pip install -r requirements.txt

2. Set Environment Variables

export APIFY_TOKEN="your_apify_token"
export GCP_PROJECT_ID="your_project_id"
export BRAND_INTERACTIONS_WEBHOOK_URL="http://localhost:8080/webhook/apify"
export ENVIRONMENT="dev"  # Options: dev, staging, production
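
A minimal sketch of how the service might read these variables, assuming plain os.environ access (the actual config loading in the codebase may differ):

import os

APIFY_TOKEN = os.environ["APIFY_TOKEN"]                     # required
GCP_PROJECT_ID = os.environ["GCP_PROJECT_ID"]               # required
WEBHOOK_URL = os.environ["BRAND_INTERACTIONS_WEBHOOK_URL"]  # Apify must be able to reach this
ENVIRONMENT = os.environ.get("ENVIRONMENT", "dev")          # dev | staging | production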

3. Run the API Server

uvicorn src.linkedin_interactions_pipelines.main:app --host 0.0.0.0 --port 8080

4. Test the Endpoint

curl -X POST http://localhost:8080/apify/linkedin/interactions \
  -H "Content-Type: application/json" \
  -d '{
    "user_company_id": "12345678",
    "targets": [{"linkedin_handle": "microsoft", "entity": "COMPANY"}]
  }'
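
The same request can be issued from Python, which is convenient for scripted testing. A sketch using the requests library (the response shape depends on the API and is not shown here):

import requests

payload = {
    "user_company_id": "12345678",
    "targets": [{"linkedin_handle": "microsoft", "entity": "COMPANY"}],
}
resp = requests.post(
    "http://localhost:8080/apify/linkedin/interactions",
    json=payload,
    timeout=30,
)
print(resp.status_code, resp.text)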

Development Workflow

Feature Development Process

  1. Create Feature Branch

       git checkout develop
       git pull origin develop
       git checkout -b feature/your-feature-name

  2. Develop and Test Locally

       • Make your changes
       • Test locally using the dev environment
       • Run unit tests (if available)

  3. Merge to Develop

       git checkout develop
       git merge feature/your-feature-name
       git push origin develop

       • This automatically deploys to the Dev environment
       • Test in the Dev environment

  4. Promote to Staging

       git checkout staging
       git merge develop
       git push origin staging

       • This automatically deploys to the Staging environment
       • Perform comprehensive testing

  5. Deploy to Production

       • Open a Pull Request from staging to main
       • Get the PR reviewed and approved
       • Merge to main; this automatically deploys to Production

Environment Configuration

Dev Environment

  • Purpose: Local and development testing
  • API URL: https://brand-interactions-pipeline-dev-691864719497.us-central1.run.app
  • Tracker Table: linkedin.ApifyInteractionsTrackerTableDev
  • Branch: develop

Staging Environment

  • Purpose: Pre-production validation
  • API URL: https://brand-interactions-pipeline-staging-691864719497.us-central1.run.app
  • Tracker Table: linkedin.ApifyInteractionsTrackerTableStaging
  • Branch: staging

Production Environment

  • Purpose: Live production use
  • API URL: https://brand-interactions-pipeline-691864719497.us-central1.run.app
  • Tracker Table: linkedin.ApifyInteractionsTrackerTable
  • Branch: main
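
The three environments differ only in URL, tracker table, and branch, keyed off the ENVIRONMENT variable. A hypothetical sketch of that mapping (the table names mirror the lists above, but the actual lookup in the codebase may be structured differently):

# Hypothetical: map ENVIRONMENT to the tracker tables listed above.
TRACKER_TABLES = {
    "dev": "linkedin.ApifyInteractionsTrackerTableDev",
    "staging": "linkedin.ApifyInteractionsTrackerTableStaging",
    "production": "linkedin.ApifyInteractionsTrackerTable",
}

def tracker_table(environment: str) -> str:
    """Return the BigQuery tracker table for the given environment."""
    return TRACKER_TABLES[environment]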

Testing Tips

Testing API Endpoints

  1. Health Check

       curl https://brand-interactions-pipeline-dev-691864719497.us-central1.run.app/health
    

  2. Trigger Scraping Job

       curl -X POST https://brand-interactions-pipeline-dev-691864719497.us-central1.run.app/apify/linkedin/interactions \
         -H "Content-Type: application/json" \
         -d '{
           "user_company_id": "test123",
           "targets": [{"linkedin_handle": "microsoft", "entity": "COMPANY"}],
           "max_posts": 5
         }'
    

  3. Check Job Status in BigQuery

       SELECT *
       FROM `linkedin.ApifyInteractionsTrackerTableDev`
       WHERE user_company_id = 'test123'
       ORDER BY created_at DESC
       LIMIT 10;
    

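The same status check can be scripted with the official google-cloud-bigquery client, which is handy when polling repeatedly (project ID and credentials are assumed to be configured as in the setup above):

# Query the dev tracker table with the google-cloud-bigquery client.
from google.cloud import bigquery

client = bigquery.Client(project="your_project_id")  # uses application default credentials
sql = """
SELECT *
FROM `linkedin.ApifyInteractionsTrackerTableDev`
WHERE user_company_id = 'test123'
ORDER BY created_at DESC
LIMIT 10
"""
for row in client.query(sql).result():
    print(dict(row))
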
Common Issues

  1. Webhook not receiving callbacks

       • Check that BRAND_INTERACTIONS_WEBHOOK_URL is publicly accessible
       • Verify the Apify actor configuration includes the webhook URL

  2. BigQuery permission errors

       • Ensure the service account has the proper BigQuery roles
       • Check that GCP_PROJECT_ID is correct

  3. Apify job fails

       • Check that the Apify token is valid
       • Verify the LinkedIn handle format is correct
       • Check the Apify actor logs for specific errors

Code Structure

src/linkedin_interactions_pipelines/
├── main.py                 # FastAPI application and endpoints
├── apify_client.py        # Apify API client wrapper
├── bigquery_client.py     # BigQuery operations
├── models.py              # Pydantic request/response models
├── webhook_handler.py     # Webhook callback processing
└── polling_job.py         # Fallback polling job
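
For orientation, here is an illustrative, heavily simplified sketch of how models.py and main.py might fit together. The field names are inferred from the request examples in this guide; the real models may define more fields and validation:

# Illustrative only; see models.py and main.py for the real definitions.
from typing import List, Optional

from fastapi import FastAPI
from pydantic import BaseModel

class Target(BaseModel):
    linkedin_handle: str
    entity: str  # e.g. "COMPANY"

class InteractionsRequest(BaseModel):
    user_company_id: str
    targets: List[Target]
    max_posts: Optional[int] = None  # keep small in dev

app = FastAPI()

@app.post("/apify/linkedin/interactions")
def trigger_interactions(req: InteractionsRequest):
    # The real handler starts the Apify job and records it in the tracker table.
    return {"status": "accepted", "target_count": len(req.targets)}

typing.List and typing.Optional are used instead of newer syntax to stay compatible with the Python 3.9 prerequisite.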

Best Practices

  1. Always test in Dev first before promoting to Staging or Production
  2. Use small test datasets (e.g., max_posts: 5) during development
  3. Monitor tracker tables to verify jobs are completing successfully
  4. Check Apify logs if scraping fails
  5. Validate that request payloads match the API schema before sending (see the sketch below)
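
For the last point, payloads can be validated locally before they ever hit the API. A sketch reusing the illustrative InteractionsRequest model above (substitute the real model from models.py):

# InteractionsRequest is the illustrative model from the Code Structure sketch.
from pydantic import ValidationError

try:
    InteractionsRequest(
        user_company_id="test123",
        targets=[{"linkedin_handle": "microsoft", "entity": "COMPANY"}],
        max_posts=5,
    )
except ValidationError as exc:
    print(exc)  # schema mismatch details, before any request is sent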