Data Structure Documentation

Complete reference for FashFinder's data formats and structures. All data can be downloaded from the Analytics tab.

Overview

FashFinder provides three exports for accessing classification data and pattern analysis: classifications as JSON, classifications as CSV, and pattern analysis as JSON.

Classifications Data

The main classifications data is an array of classification objects. Each classification represents a single analyzed piece of content (statement, policy, article, etc.).

Top-Level Structure

{
  "classification_id": "string",
  "timestamp": "ISO 8601 datetime",
  "source_metadata": { ... },
  "evidence_chain": { ... },
  "classification": { ... },
  "evidence": { ... },
  "context_factors": { ... },
  "analysis": { ... },
  "review_status": { ... },
  "cost_info": { ... },
  "content_type": "rhetoric" | "action" | "unknown",
  "content_type_reasoning": "string"
}

Key Fields

classification_id (string) - Unique identifier for this classification (hash + timestamp)

timestamp (string, ISO 8601) - When this classification was created

content_type ("rhetoric" | "action" | "unknown") - Whether this content describes rhetoric (statements, proposals) or actions (implemented policies, enforcement). This is independent of the source type: investigative journalism, for example, can report on either.
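
As a minimal sketch of working with these fields, the snippet below parses a classifications export and filters it by content_type. The sample records, their IDs, and the helper names load_classifications and filter_by_content_type are hypothetical; only the field names come from the structure above.

```python
import json
from collections import Counter

# Hypothetical two-record sample; real exports carry the full set of fields.
SAMPLE_EXPORT = """
[
  {"classification_id": "example-1", "timestamp": "2025-01-15T10:30:00",
   "content_type": "rhetoric", "content_type_reasoning": "Describes a proposal"},
  {"classification_id": "example-2", "timestamp": "2025-01-16T08:00:00",
   "content_type": "action", "content_type_reasoning": "Describes enforcement"}
]
"""


def load_classifications(text):
    """Parse export text into a list of classification dicts."""
    records = json.loads(text)
    if not isinstance(records, list):
        raise ValueError("expected a JSON array of classification objects")
    return records


def filter_by_content_type(records, content_type):
    """Keep only records whose content_type matches exactly."""
    return [r for r in records if r.get("content_type") == content_type]


records = load_classifications(SAMPLE_EXPORT)
actions = filter_by_content_type(records, "action")
print([r["classification_id"] for r in actions])
print(Counter(r["content_type"] for r in records))
```

For a downloaded file, the same helpers can be fed the file's contents instead of the inline sample.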

source_metadata

{
  "type": "official_government" | "investigative_journalism" |
          "legal_document" | "mainstream_news",
  "name": "Source name (e.g., 'White House', 'ProPublica')",
  "title": "Article/statement title",
  "date": "Publication date (YYYY-MM-DD)",
  "url": "Source URL",
  "author": "Author name (if available)",
  "access_date": "When content was scraped"
}

classification

{
  "primary_category": "1-10 (string)",
  "secondary_categories": ["category IDs"],
  "severity_level": 1 | 2 | 3 | 4,
  "confidence": "High" | "Medium" | "Low",
  "confidence_rationale": "Explanation of confidence level"
}

Severity Levels:

1 - Rhetoric Without Implementation: Statements, proposals not yet enacted
2 - Oppressive Policies/Systems: Discriminatory laws, surveillance, institutional capture
3 - State Violence: Political arrests, extrajudicial killings, detention camps
4 - Mass Atrocities: Genocide, ethnic cleansing, systematic extermination
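
When presenting results, it can help to translate the numeric severity_level into the labels listed above. The following is a small sketch; the function name severity_label and the sample record are hypothetical, and the label text is copied from the list above.

```python
# Labels copied from the Severity Levels list in this document.
SEVERITY_LABELS = {
    1: "Rhetoric Without Implementation",
    2: "Oppressive Policies/Systems",
    3: "State Violence",
    4: "Mass Atrocities",
}


def severity_label(record):
    """Return the documented label for a record's severity_level."""
    level = record["classification"]["severity_level"]
    if level not in SEVERITY_LABELS:
        raise ValueError(f"unexpected severity_level: {level!r}")
    return SEVERITY_LABELS[level]


# Hypothetical record containing only the fields this helper reads.
example = {"classification": {"severity_level": 2, "confidence": "High"}}
print(severity_label(example))
```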

evidence

{
  "quotes": [
    {
      "text": "Exact quote from source",
      "subcategory": "X.Y.Z subcategory ID",
      "explanation": "Why this quote exemplifies the pattern"
    }
  ],
  "observable_indicators": ["Key pattern observed", ...],
  "distinguishing_features": ["What makes this fascist vs normal policy", ...]
}
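
To illustrate how the quotes array might be consumed, the sketch below groups quote text by subcategory ID. The helper name quotes_by_subcategory, the sample quotes, and the subcategory values are hypothetical; the code treats the subcategory as an opaque string and makes no assumption about what its X.Y.Z components mean.

```python
from collections import defaultdict


def quotes_by_subcategory(record):
    """Map each subcategory ID to the list of quote texts tagged with it."""
    grouped = defaultdict(list)
    for quote in record.get("evidence", {}).get("quotes", []):
        grouped[quote["subcategory"]].append(quote["text"])
    return dict(grouped)


# Hypothetical record with placeholder quotes and subcategory IDs.
example = {
    "evidence": {
        "quotes": [
            {"text": "First quote", "subcategory": "1.2.3", "explanation": "..."},
            {"text": "Second quote", "subcategory": "1.2.3", "explanation": "..."},
            {"text": "Third quote", "subcategory": "4.1.1", "explanation": "..."},
        ],
        "observable_indicators": [],
        "distinguishing_features": [],
    }
}
print(quotes_by_subcategory(example))
```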

evidence_chain

{
  "primary_evidence_type": "direct_quote" | "leaked_document" |
                           "court_filing" | ...,
  "primary_evidence_source": "Source name",
  "verification_status": "single_source" | "corroborated" | "official_record",
  "corroborating_sources": [
    {
      "name": "Source name",
      "url": "URL",
      "date": "YYYY-MM-DD",
      "source_type": "..."
    }
  ]
}
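
One possible use of the evidence chain is to separate single-source classifications from those with additional support. The sketch below assumes, for illustration only, that "corroborated" and "official_record" both count as supported; that grouping is a choice made here, not something the format defines. The helper names and sample records are hypothetical.

```python
# Assumption for this sketch: these two statuses count as "supported".
SUPPORTED_STATUSES = {"corroborated", "official_record"}


def is_supported(record):
    """True if the record's verification_status is in SUPPORTED_STATUSES."""
    chain = record.get("evidence_chain", {})
    return chain.get("verification_status") in SUPPORTED_STATUSES


def corroborating_count(record):
    """Number of entries in corroborating_sources (0 if absent)."""
    chain = record.get("evidence_chain", {})
    return len(chain.get("corroborating_sources", []))


# Hypothetical records containing only the fields these helpers read.
single = {"evidence_chain": {"verification_status": "single_source",
                             "corroborating_sources": []}}
backed = {"evidence_chain": {"verification_status": "corroborated",
                             "corroborating_sources": [
                                 {"name": "Example Source",
                                  "url": "https://example.com/a",
                                  "date": "2025-01-15",
                                  "source_type": "mainstream_news"}]}}
print(is_supported(single), corroborating_count(single))
print(is_supported(backed), corroborating_count(backed))
```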

Pattern Analysis Data

Aggregated metrics and trend analysis across all classifications.

Structure

{
  "fascism_stage": {
    "stage": 1 | 2 | 3 | 4,
    "label": "Stage label",
    "description": "Stage description",
    "average_severity": number
  },
  "current_intensity": {
    "level": "mild" | "moderate" | "severe" | "extreme",
    "average_severity": number,
    "recent_count": number,
    "timeframe_days": number
  },
  "momentum": {
    "level": "stable" | "accelerating" | "rapid" | "critical",
    "description": "Momentum description",
    "acceleration_factor": number,
    "clustering_rate": number,
    "severity_trend": "increasing" | "stable" | "decreasing",
    "severity_change": number,
    "has_rhetoric_action_progressions": boolean
  },
  "rhetoric_vs_action": {
    "rhetoric_count": number,
    "action_count": number,
    "rhetoric_percentage": number,
    "action_percentage": number,
    "rhetoric_to_action_progressions": [
      {
        "category": "category ID",
        "rhetoric_title": "Title of rhetoric",
        "rhetoric_date": "YYYY-MM-DD",
        "action_title": "Title of action",
        "action_date": "YYYY-MM-DD",
        "lag_days": number,
        "matching_reasoning": "Why these are related"
      }
    ]
  },
  "severity_trend": { ... },
  "category_patterns": { ... },
  "co_occurrence": { ... }
}
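
The rhetoric_to_action_progressions entries carry both dates and a lag_days value. The sketch below recomputes the lag from the two dates as a consistency check. It assumes lag_days is the number of calendar days from rhetoric_date to action_date; the source does not state the exact definition, so treat that as an assumption. The helper names and the sample entry are hypothetical.

```python
from datetime import date


def computed_lag_days(progression):
    """Days from rhetoric_date to action_date (both YYYY-MM-DD strings)."""
    rhetoric = date.fromisoformat(progression["rhetoric_date"])
    action = date.fromisoformat(progression["action_date"])
    return (action - rhetoric).days


def lag_matches(progression):
    """True if the exported lag_days equals the recomputed value."""
    return progression["lag_days"] == computed_lag_days(progression)


# Hypothetical progression entry with placeholder titles.
example = {
    "category": "3",
    "rhetoric_title": "Example statement",
    "rhetoric_date": "2025-01-10",
    "action_title": "Example policy",
    "action_date": "2025-02-09",
    "lag_days": 30,
    "matching_reasoning": "...",
}
print(computed_lag_days(example), lag_matches(example))
```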

CSV Export Format

When exporting classifications as CSV, the data is flattened into the following columns:

Columns

Date - Publication or classification date
Source - Source name (e.g., "White House", "ProPublica")
Source Type - official_government, investigative_journalism, etc.
Title - Article or statement title
URL - Link to original source
Content Type - rhetoric, action, or unknown
Primary Category - Main category ID (1-10)
Secondary Categories - Additional categories (semicolon-separated)
Severity - Severity level (1-4)
Confidence - High, Medium, or Low
Key Quote - First quote or observable indicator
Classification ID - Unique identifier
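
A short sketch of reading the CSV export with Python's standard csv module follows. It assumes the header row uses the column names exactly as listed above, which this document does not confirm. The sample row and the helper name read_rows are hypothetical. The Secondary Categories column is split on semicolons and Severity is converted to an integer.

```python
import csv
import io

# Hypothetical sample; assumes header names match the column list above.
SAMPLE_CSV = (
    "Date,Source,Source Type,Title,URL,Content Type,Primary Category,"
    "Secondary Categories,Severity,Confidence,Key Quote,Classification ID\n"
    "2025-01-15,Example Source,mainstream_news,Example title,"
    "https://example.com/a,rhetoric,3,2;7,1,High,"
    '"An example, with a comma",example-1\n'
)


def read_rows(text):
    """Parse CSV text into dicts, normalizing two columns."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        raw = row["Secondary Categories"]
        # Split the semicolon-separated IDs, dropping empty entries.
        row["Secondary Categories"] = [s.strip() for s in raw.split(";") if s.strip()]
        row["Severity"] = int(row["Severity"])
        rows.append(row)
    return rows


rows = read_rows(SAMPLE_CSV)
print(rows[0]["Secondary Categories"], rows[0]["Severity"])
```

Using csv.DictReader rather than splitting on commas keeps quoted fields such as Key Quote intact when they contain commas.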

How to Use This Data

Downloading Data

Go to the Analytics tab on the main dashboard and use the "Export Data" section to download:

Classifications (JSON) - Full structured data with all fields
Classifications (CSV) - Flattened data for spreadsheet analysis
Pattern Analysis (JSON) - Aggregated metrics and trends

Common Use Cases

Research: Import JSON into analysis tools (Python, R, etc.)
Spreadsheet Analysis: Use CSV export in Excel, Google Sheets
Visualization: Load JSON into BI tools or custom dashboards
Citation: Reference specific classification_ids in papers

Data Integrity

All data includes:

Complete source attribution and URLs
Evidence chains showing verification status
Confidence levels for every classification
Timestamps for reproducibility