I Rank on Page 1 -- What Gets Me Cited by AI? Position-Controlled Analysis of Page-Level and Domain-Level Predictors of AI Search Citation
Description
Generative Engine Optimization (GEO) aims to improve content visibility in AI-generated search responses. Prior observational studies have failed to isolate page-level signals because domain identity alone predicts AI citation at AUC = 0.975, confounding every between-domain comparison. We introduce a position-band matching design that controls for Google ranking position, asking: among equally-ranked pages, what page-level features predict AI citation? Using 250 queries across a balanced grid (5 intent types, 10 verticals), we collected citations from ChatGPT, Perplexity, and Google AI Mode and crawled 10,293 unique pages with 66 structural, semantic, and content-quality features. Within position bands, content features and domain identity provide comparable predictive power (content AUC = 0.673, domain AUC = 0.687 with enriched representations, combined AUC = 0.697), a convergence that contrasts sharply with the domain dominance observed without position control (AUC = 0.975). The top actionable predictors are comparison structure (d = 0.43, significant across all five intent types), query-term coverage (d = 0.42), subheading depth, statistical data density, and the absence of first-person/blog tone. Content structure provides the largest marginal lift beyond rank position (+0.021 AUC). In a second contribution, five domain-level tests reveal that SERP co-occurrence (topical breadth) is the strongest domain trust predictor (rho = 0.341, p = 2.6 x 10^-70), that cited domains are *less* lexically unique than their SERP competitors, and that a combined domain model achieves AUC = 0.921, with SERP presence accounting for 63% of importance. High-SERP-presence domains are cited more per appearance (2.04 citations per slot for 8+ appearances vs. 0.665 for single-appearance domains), confirming this is not merely an artifact of increased exposure. Data and code are publicly available.
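The position-band comparison described above can be illustrated with a minimal sketch. It groups pages into bands of Google rank and computes Cohen's d for one feature between cited and uncited pages within each band. The band edges, the record layout (`position`, `cited`, and a feature key), and the function names are all assumptions made for illustration; the abstract does not specify them, and this is not the authors' released code.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean, variance

# Hypothetical position bands (inclusive Google rank ranges).
# The abstract does not state the actual band edges used in the study.
BANDS = [(1, 3), (4, 6), (7, 10)]


def band_of(position):
    """Return the band containing a rank position, or None if out of range."""
    for low, high in BANDS:
        if low <= position <= high:
            return (low, high)
    return None


def cohens_d(cited, uncited):
    """Cohen's d using the pooled sample variance; None if variance is zero."""
    n1, n2 = len(cited), len(uncited)
    pooled_var = (
        (n1 - 1) * variance(cited) + (n2 - 1) * variance(uncited)
    ) / (n1 + n2 - 2)
    if pooled_var == 0:
        return None
    return (mean(cited) - mean(uncited)) / sqrt(pooled_var)


def within_band_effects(pages, feature):
    """Effect size of `feature` (cited vs. uncited) inside each position band."""
    groups = defaultdict(lambda: {True: [], False: []})
    for page in pages:
        band = band_of(page["position"])
        if band is not None:
            groups[band][bool(page["cited"])].append(page[feature])

    effects = {}
    for band, split in groups.items():
        # Sample variance needs at least two observations per group.
        if len(split[True]) >= 2 and len(split[False]) >= 2:
            d = cohens_d(split[True], split[False])
            if d is not None:
                effects[band] = d
    return effects
```

Because each comparison is made only among pages in the same band, a positive d for a feature such as query-term coverage cannot be explained by cited pages simply ranking higher, which is the confound the design is meant to remove.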
Files
| Name | Size | Checksum |
|---|---|---|
| what-gets-me-cited-by-ai.pdf | 389.9 kB | md5:044d2b5b6780c0d2cec99225b27eaade |
Additional details
Identifiers
- Other
- aixiv.260403.000002
Related works
- Is supplemented by
- Dataset: 10.5281/zenodo.19398158 (DOI)