Methods and Audit

Every claim must be linked to a timestamped, testable artifact.

Filings, labels, splits, features, models, predictions, inference, portfolio diagnostics, and audit checks form one evidence chain.

50_company_public_fmp_alpha_2016_2025_v4

Leakage-safe research design

Only Past Information Enters the Forecast

Every text feature and model decision must be available before the prediction timestamp.

1. Filing timestamp: SEC acceptance time defines information availability.

2. Section parsing: Business, Risk Factors, Legal Proceedings, and MD&A are extracted.

3. Forward labels: Volatility and CAR targets begin only after filing availability.

4. Rolling splits: Train, validation, and test windows move through time with embargo purges.

5. Train-window features: TF-IDF/SVD vocabularies are fit inside training windows only.

6. Validation-only tuning: Hyperparameters are selected using validation Rank IC only.

7. Tie-aware test Rank IC: Test ranking uses average ranks; constant predictions return zero Rank IC.

8. Evidence hierarchy: Primary, robustness, and exploratory specifications remain separate.

9. Audit boundary: Coverage, parser quality, data limitations, and claim strength are disclosed.
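
The tie-aware Rank IC in step 7 can be illustrated with a minimal sketch. It assumes Rank IC means the Pearson correlation of average ranks (a Spearman-style statistic). The function names `average_ranks` and `rank_ic` are hypothetical and are not taken from the project code.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def rank_ic(predictions, labels):
    """Correlation of average ranks; 0.0 when either side has no rank variation."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if len(predictions) < 2:
        return 0.0
    rp, rl = average_ranks(predictions), average_ranks(labels)
    mp, ml = sum(rp) / len(rp), sum(rl) / len(rl)
    cov = sum((a - mp) * (b - ml) for a, b in zip(rp, rl))
    var_p = sum((a - mp) ** 2 for a in rp)
    var_l = sum((b - ml) ** 2 for b in rl)
    if var_p == 0.0 or var_l == 0.0:
        return 0.0  # constant predictions (or labels) carry no ranking information
    return cov / sqrt(var_p * var_l)
```

Returning zero for constant predictions avoids an undefined correlation and keeps a degenerate model from receiving credit for ranking skill.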

Key controls

Research Controls at a Glance

Control	Implementation
Event-time alignment	Filing timestamps define information availability.
Rolling OOS design	Train / validation / test windows roll through time.
Leakage control	Embargo purge and split-leakage logs.
Model selection	Validation-only Rank IC.
TF-IDF control	Train-window-only vocabulary fitting.
Incremental text diagnostic	Industry-neutral Rank IC and feature ablation.
Statistical uncertainty	Newey-West and clustered bootstrap confidence intervals.
Data snooping	Specification registry and multiple-testing report.
Parser quality	Manual review appendix for short or malformed sections.
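
The rolling out-of-sample design and embargo purge listed above can be sketched as follows. This is an illustration under stated assumptions: time is indexed by integer periods, the window lengths and embargo length are hypothetical parameters, and the windows advance by one test window per split. The source does not state the actual values or step size.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Split:
    train: range
    validation: range
    test: range


def rolling_splits(n_periods, train_len, val_len, test_len, embargo):
    """Return rolling windows over period indices 0..n_periods-1.

    `embargo` periods are dropped between consecutive windows so that forward
    labels from the earlier window cannot overlap the later one.
    """
    splits = []
    start = 0
    while True:
        train_end = start + train_len
        val_start = train_end + embargo
        val_end = val_start + val_len
        test_start = val_end + embargo
        test_end = test_start + test_len
        if test_end > n_periods:
            break
        splits.append(Split(range(start, train_end),
                            range(val_start, val_end),
                            range(test_start, test_end)))
        start += test_len  # assumption: advance by one test window per split
    return splits
```

For example, `rolling_splits(20, 8, 3, 3, 1)` produces two splits, each with a one-period gap between train and validation and between validation and test.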

Feature construction

Financial Text Representations

Loughran-McDonald dictionary tone and TF-IDF/SVD features are built over five scopes: the full filing, Business, Risk Factors, Legal Proceedings, and MD&A.

Feature set	Meaning
`industry_only`	Training-window industry-mean baseline.
`dictionary_only`	Dictionary-tone text features.
`tfidf_svd_only`	TF-IDF/SVD text representations.
`combined_text`	Combined dictionary and text representation.
`industry_plus_text`	Industry features plus text features.
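
Train-window-only fitting of the TF-IDF vocabulary can be illustrated with a small pure-Python stand-in. It is a sketch, not the project's implementation: tokenization is a plain whitespace split, the IDF uses a common smoothed form, and the SVD step and vector normalization are omitted. The names `fit_tfidf` and `transform_tfidf` are hypothetical.

```python
import math
from collections import Counter


def fit_tfidf(train_docs):
    """Learn vocabulary and smoothed IDF weights from training documents only."""
    n_docs = len(train_docs)
    doc_freq = Counter()
    for doc in train_docs:
        doc_freq.update(set(doc.lower().split()))
    # Smoothed IDF: ln((1 + n) / (1 + df)) + 1, so weights stay finite and positive.
    return {term: math.log((1 + n_docs) / (1 + df)) + 1.0
            for term, df in doc_freq.items()}


def transform_tfidf(doc, idf):
    """Weight a document with a fitted IDF table; unseen terms are ignored."""
    counts = Counter(t for t in doc.lower().split() if t in idf)
    return {term: count * idf[term] for term, count in counts.items()}


train_window = ["risk of litigation", "revenue growth and risk"]
idf = fit_tfidf(train_window)
# A later filing is transformed with the training vocabulary only.
vec = transform_tfidf("litigation risk and cyberattack", idf)
```

Because `cyberattack` never appears in the training window, it does not enter `vec`, so validation and test filings cannot influence the vocabulary or the IDF weights.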

Industry-neutral diagnostic: Does text retain information after removing split-industry means?

Industry-neutral Rank IC is a descriptive diagnostic, not a causal decomposition.
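
A minimal sketch of the demeaning step behind this diagnostic, assuming "split-industry means" refers to per-industry means computed within a single split. The function name `demean_by_industry` is hypothetical, and applying it to both predictions and labels before computing Rank IC is an assumption, not something the source states.

```python
from collections import defaultdict


def demean_by_industry(values, industries):
    """Subtract each industry's mean (computed within this split) from its members."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for value, industry in zip(values, industries):
        totals[industry] += value
        counts[industry] += 1
    return [value - totals[ind] / counts[ind]
            for value, ind in zip(values, industries)]


residuals = demean_by_industry(
    [1.0, 3.0, 10.0, 20.0],
    ["tech", "tech", "energy", "energy"],
)
```

The residuals keep only within-industry variation, so any remaining rank correlation cannot come from differences in industry averages alone.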

Bootstrap inference

How Uncertainty Is Reported

Split bootstrap

Inconclusive for v4 because there are only four OOS split clusters.

Event-date bootstrap

Supports a positive raw primary Rank IC interval.

Ticker-cluster bootstrap

Supports a positive raw primary Rank IC interval.

Industry-neutral bootstrap

Positive point estimate, but not bootstrap-robust.
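
The clustered bootstraps above share one mechanic: whole clusters (splits, event dates, or tickers) are resampled with replacement and the statistic is recomputed on each resample. The sketch below assumes a percentile interval. The function name, the default `n_boot`, and the seed are hypothetical choices, not values from the source.

```python
import random


def cluster_bootstrap_ci(clusters, statistic, n_boot=1000, alpha=0.05, seed=0):
    """Percentile interval from resampling whole clusters with replacement.

    `clusters` maps a cluster key (for example a ticker) to its observations.
    """
    rng = random.Random(seed)
    keys = sorted(clusters)
    draws = []
    for _ in range(n_boot):
        sample = []
        for key in rng.choices(keys, k=len(keys)):
            sample.extend(clusters[key])  # keep each cluster's observations together
        draws.append(statistic(sample))
    draws.sort()
    lo = draws[int((alpha / 2) * (n_boot - 1))]
    hi = draws[int((1 - alpha / 2) * (n_boot - 1))]
    return lo, hi
```

With only four clusters there are just 35 distinct resamples, so the bootstrap distribution is very coarse. That is one plausible reason an interval built on four split clusters is treated as inconclusive.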

Parser quality review

Section Extraction Is Audited, Not Assumed

Record type	Count
Parsed section records	2,000
Manual-review records	144
Excluded section-level records	144
Short but included records	494

Item 1A (Risk Factors) and Item 7 (MD&A) sections shorter than 100 words are excluded from section-level features. Core sections of 100 to 499 words remain included but carry a warning.
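
The length rule can be written as a small classifier. This sketch assumes that "core sections" means Item 1A and Item 7. The status labels and the function name `review_status` are hypothetical, and sections outside that set are reported as not covered because the stated rule does not say how they are handled.

```python
EXCLUDE_BELOW = 100  # words; shorter core sections are excluded
WARN_BELOW = 500     # words; sections of 100 to 499 words carry a warning
# Assumption: "core sections" means Item 1A and Item 7.
CORE_SECTIONS = {"Item 1A", "Item 7"}


def review_status(section, word_count):
    """Classify a parsed section record by its length."""
    if section not in CORE_SECTIONS:
        return "not_covered"  # the stated rule does not address other sections
    if word_count < EXCLUDE_BELOW:
        return "excluded"
    if word_count < WARN_BELOW:
        return "included_with_warning"
    return "included"
```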

Evidence boundary

Prediction and Trading Evidence Are Judged Separately

Preregistered primary prediction

Ridge · `realized_volatility_1_20`

Rank IC: 0.2395

Raw p-value 0.00067; supports exploratory volatility-ranking evidence.

Preregistered primary portfolio

Monthly sector-neutral equal-weight

Sharpe: -0.8539

Raw p-value 0.1147; does not establish tradable alpha.

Formal Result Boundary

Formal empirical-finance claims are blocked by data-boundary issues, not pipeline failures: mixed FMP/Yahoo data, applied-grade market-cap estimates, fixed 50-company panel, parser-quality limitations, and a small number of missing diagnostic model-label pairs.

The project should be interpreted as an applied-grade, auditable financial NLP workflow for exploratory volatility ranking.