Results and Artifacts
A compact public summary of the latest applied-grade SEC 10-K text factor run: `50_company_public_fmp_alpha_2016_2025_v1`.
Run summary
50-Company Applied Pilot
| Run ID | 50_company_public_fmp_alpha_2016_2025_v1 |
|---|---|
| Universe | 50 U.S. large-cap firms |
| Sample | FY2016-FY2025 |
| SEC 10-K filings | 500 |
| Labels | 1,500 |
| OOS predictions | 4,716 |
| Feature records | 520,465 |
| Tested specifications | 568 |
Main Result
The preregistered primary result is Ridge on `realized_volatility_1_20`, evaluated by ALL_SPLITS Rank IC.
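For readers unfamiliar with the metric, Rank IC is conventionally the Spearman rank correlation between predicted and realized values. The sketch below computes it for a single cross-section using only the standard library. The function names and the toy numbers are illustrative assumptions, not taken from the repository, and this summary does not specify how the ALL_SPLITS figure is aggregated across splits.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Plain Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def rank_ic(predicted, realized):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(predicted), average_ranks(realized))


# Toy data, invented for illustration only (not project output).
predicted = [0.012, 0.020, 0.015, 0.031, 0.009]
realized = [0.010, 0.025, 0.018, 0.022, 0.011]
ic = rank_ic(predicted, realized)
print(f"Rank IC: {ic:.4f}")
```

A value near 1 means the predictions order firms almost exactly as realized volatility does; a value near -1 means the ordering is reversed.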
Best Observed Prediction
XGBoost on `realized_volatility_1_20`, reported as model-comparison evidence, not as the preregistered primary claim.
| Rank IC | 0.3133 |
|---|---|
| Newey-West t-stat | 6.8479 |
| RMSE | 0.00834 |
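A Newey-West t-statistic tests whether the mean of a serially correlated series differs from zero, using Bartlett-weighted autocovariances in the standard error. The following is a minimal standard-library sketch. The function name, the lag choice, and the toy series are assumptions for illustration; this summary does not state which series or lag length underlies the reported 6.8479.

```python
def newey_west_tstat(series, lags):
    """t-statistic for the mean of `series` with a Newey-West (Bartlett) variance."""
    n = len(series)
    mu = sum(series) / n
    dev = [x - mu for x in series]

    # Lag-0 autocovariance (population form, divided by n).
    long_run_var = sum(d * d for d in dev) / n

    for lag in range(1, lags + 1):
        gamma = sum(dev[t] * dev[t - lag] for t in range(lag, n)) / n
        weight = 1 - lag / (lags + 1)  # Bartlett kernel weight
        long_run_var += 2 * weight * gamma

    standard_error = (long_run_var / n) ** 0.5
    return mu / standard_error


# Toy per-period Rank IC values, invented for illustration only.
toy_ic_series = [0.1, 0.3, 0.2, 0.4]
t_no_lags = newey_west_tstat(toy_ic_series, lags=0)
t_one_lag = newey_west_tstat(toy_ic_series, lags=1)
print(t_no_lags, t_one_lag)
```

With `lags=0` the statistic reduces to the ordinary mean divided by its standard error; adding lags widens or narrows the standard error depending on the sign of the autocovariances.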
Model comparison
Rank IC by Model
A compact view of ALL_SPLITS test Rank IC on `realized_volatility_1_20`. Ridge is the preregistered primary model; XGBoost is exploratory.
Displayed Rank IC values, largest to smallest: 0.3133, 0.2952, 0.2606, -0.0206.
Bars are scaled to the largest displayed Rank IC. Negative values indicate the score ranks future volatility in the opposite direction.
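The scaling rule can be illustrated with a short sketch that uses the four displayed values. It assumes bar length is the absolute Rank IC divided by the largest absolute value, with the sign shown as a direction; the page may implement this differently, and the variable names are hypothetical.

```python
# Rank IC values as displayed in the chart.
displayed_ic = [0.3133, 0.2952, 0.2606, -0.0206]

# Assumption: each bar is scaled against the largest absolute value shown.
largest = max(abs(v) for v in displayed_ic)

bars = [
    {
        "rank_ic": v,
        "width_pct": abs(v) / largest * 100,  # bar length as a percentage
        "direction": "negative" if v < 0 else "positive",
    }
    for v in displayed_ic
]

for bar in bars:
    print(f"{bar['rank_ic']:+.4f}  {bar['width_pct']:5.1f}%  {bar['direction']}")
```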
Audit
Coverage and Controls
The audit trail separates raw label coverage from eligible OOS prediction coverage, discloses multiple testing, and records the applied-grade data boundary.
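To give a sense of what disclosing 568 tested specifications implies, the sketch below computes a Bonferroni-adjusted critical value under a normal approximation. Bonferroni is chosen here purely for illustration; this summary does not say which adjustment, if any, the audit applies, and the function name is hypothetical.

```python
from statistics import NormalDist


def bonferroni_critical_z(alpha, n_tests):
    """Two-sided critical z-value after dividing alpha across n_tests."""
    per_test_alpha = alpha / n_tests
    return NormalDist().inv_cdf(1 - per_test_alpha / 2)


N_SPECIFICATIONS = 568  # tested specifications reported in the run summary
critical_z = bonferroni_critical_z(alpha=0.05, n_tests=N_SPECIFICATIONS)

REPORTED_T_STAT = 6.8479  # Newey-West t-stat reported for the best observed prediction
print(f"Bonferroni critical |z| for {N_SPECIFICATIONS} tests: {critical_z:.2f}")
print("Reported t-stat exceeds it:", REPORTED_T_STAT > critical_z)
```

Bonferroni is conservative when the specifications are correlated, so this is best read as a rough upper bar rather than as the run's formal inference procedure.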
Data Boundary
SEC EDGAR provides official 10-K filings and filing timestamps. Market data uses a mixed FMP/Yahoo public-source stack, and market-cap-at-selection values are applied-grade estimates.
This is not a CRSP/WRDS-equivalent survivorship-free replication.
Interpretation Policy
Treat this package as a demonstration of an auditable financial NLP research workflow and as exploratory evidence on volatility prediction.
Portfolio outputs are diagnostic only; this package does not establish formal tradable alpha or provide investment advice.
Reproducibility
Run the Public Code Locally
The public repository can be cloned, installed, linted, and tested without private data. Full real-data runs require API keys and local private data directories.
git clone https://github.com/uiclxh/financial-10k-text-agent.git
cd financial-10k-text-agent
python -m pip install -e ".[dev]"
python -m ruff check .
python -m pytest
Open-source code is released under the MIT License.