2026-05-02|20 min read

White Paper: Architecture of Anvaya's Deep Analysis & Feedback Engine

A technical deep-dive into how Anvaya combines 128+ classical rules, 911 ML features, 9-system convergence scoring, Monte Carlo simulation, and Bayesian online learning into a unified prediction framework.

Abstract

Anvaya's deep analysis engine unifies classical Vedic astrological rule systems with modern machine learning, Monte Carlo simulation, and Bayesian online learning into a single computational framework. This white paper describes the architecture of a system that processes 128+ classical rules across 5 authoritative texts, extracts 911 degree-level features for ML prediction, scores convergence across 9 independent analysis systems, simulates wealth and career trajectories using stochastic processes, and continuously improves its accuracy through user feedback. We present validation results against 23 celebrity charts demonstrating a 65% transition capture rate within p10-p90 bands and wealth prediction ratios between 0.73x and 1.78x of documented actuals.

1. Introduction

Classical Vedic astrology encodes thousands of years of observational knowledge into structured rules connecting planetary positions to human life outcomes. These rules, scattered across dozens of texts in Sanskrit, represent one of humanity's most ambitious attempts to map celestial mechanics to terrestrial experience.

The challenge is threefold. First, any single astrological system (Parashari, KP, Jaimini) achieves approximately 60% accuracy when applied alone. Second, the rules have static weights -- a rule assigned high importance by Parashara two thousand years ago carries that same weight today, unmeasured and unadjusted. Third, degree-level precision in planetary positions -- which the classical texts describe but most software ignores -- contains signal that sign-level analysis discards entirely.

Our thesis is architectural: by layering a classical rule engine, a degree-level ML pipeline, multi-system convergence scoring, stochastic life simulation, and Bayesian feedback into a unified framework, each component compensates for the others' weaknesses. The classical rules provide interpretive structure. ML captures nonlinear patterns invisible to rules. Convergence scoring filters noise through multi-system agreement. Monte Carlo simulation produces distributional predictions rather than point estimates. Bayesian feedback closes the loop by updating rule weights from real outcomes.

2. Classical Rule Engine

The rule engine encodes 128+ prediction rules drawn from five authoritative texts: Brihat Parashara Hora Shastra (BPHS), BV Raman's How to Judge a Horoscope Volumes 1 and 2 (HJH), Krishnamurti's KP Reader series, Narasimha Rao's works on Vedic astrology, and K.N. Rao's extensive case-study publications.

Rules are organized into 8 prediction domains: marriage timing and quality, career trajectory and peak, health and longevity, children timing and count, wealth accumulation, property acquisition, disease vulnerability, and renunciation/spiritual path.

Each rule follows a uniform structure: a condition clause (planetary placement, aspect, lordship, dignity), a quality score ranging from -1.0 (strongly negative) to +1.0 (strongly positive), a confidence estimate reflecting the rule's assumed reliability, and a textual source citation (text name, chapter, verse where applicable).
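The uniform rule structure described above could be represented roughly as follows. This is a minimal sketch, not the engine's actual schema: the class name `Rule`, its field names, the 0.0 to 1.0 confidence range, the chart-facts dictionary, and the example rule are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Mapping


@dataclass(frozen=True)
class Rule:
    """One classical rule: condition, quality score, confidence, citation."""
    name: str
    domain: str
    condition: Callable[[Mapping[str, object]], bool]  # chart facts -> does the rule fire?
    quality: float      # -1.0 (strongly negative) .. +1.0 (strongly positive)
    confidence: float   # assumed reliability; the 0.0 .. 1.0 range is an assumption
    source: str         # text name, chapter, verse where applicable

    def __post_init__(self) -> None:
        if not -1.0 <= self.quality <= 1.0:
            raise ValueError("quality must lie in [-1.0, 1.0]")
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must lie in [0.0, 1.0]")

    def evaluate(self, chart: Mapping[str, object]) -> float:
        """Confidence-weighted contribution; 0.0 when the rule does not fire."""
        return self.quality * self.confidence if self.condition(chart) else 0.0


# Hypothetical rule and chart facts, for illustration only.
example_rule = Rule(
    name="seventh_lord_in_kendra",
    domain="marriage",
    condition=lambda chart: chart.get("seventh_lord_house") in (1, 4, 7, 10),
    quality=0.6,
    confidence=0.7,
    source="illustrative placeholder, not a real citation",
)
score = example_rule.evaluate({"seventh_lord_house": 7})
```

Keeping the condition as a plain callable leaves the rule data declarative while still allowing arbitrary placement, aspect, lordship, or dignity checks.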

The Kapoor self-employment scoring system, derived from K.N. Rao's research, exemplifies the engine's precision. K.N. Rao documented 97% accuracy on 100+ charts for a specific set of conditions indicating self-employment versus service. These conditions -- involving the 7th lord, 10th lord, lagna lord, and their mutual relationships -- are encoded as a composite rule with high base confidence, validated against our own chart database.

Convergence scoring across 9 independent systems transforms individual rule outputs into calibrated predictions. When 7 or more of the 9 systems agree on a domain prediction, the combined reliability exceeds what any single system achieves.
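The 7-of-9 agreement check can be sketched as a simple vote count. The function name `convergence`, the string labels for verdicts, and the choice to report the plurality verdict are assumptions; the paper does not specify how the engine represents or combines system outputs.

```python
from collections import Counter
from typing import Sequence


def convergence(votes: Sequence[str], threshold: int = 7) -> tuple[str, int, bool]:
    """Return (most common verdict, number of systems agreeing, meets threshold)."""
    if not votes:
        raise ValueError("at least one system vote is required")
    verdict, count = Counter(votes).most_common(1)[0]
    return verdict, count, count >= threshold


# Hypothetical verdicts from 9 systems for a single domain prediction.
votes = ["favorable"] * 7 + ["unfavorable"] * 2
result = convergence(votes)
```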

3. Feature Engineering and ML Pipeline

The ML pipeline extracts 911 features from each birth chart, organized into three tiers.

Tier 1: Positional Features (110). For each of the 55 unique pairs among 11 celestial bodies (Sun, Moon, Mars, Mercury, Jupiter, Venus, Saturn, Rahu, Ketu, Uranus, Neptune), the engine computes the angular distance and encodes it as sin(d) and cos(d) for cyclical continuity. This produces 110 features capturing the complete geometric relationship structure of the chart without sign-boundary artifacts.
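A minimal sketch of the Tier 1 encoding follows, assuming longitudes are supplied in degrees and that the angular distance is taken as the directed difference modulo 360. The function name, feature naming scheme, and sample longitudes are illustrative only.

```python
import math
from itertools import combinations

BODIES = ["Sun", "Moon", "Mars", "Mercury", "Jupiter", "Venus",
          "Saturn", "Rahu", "Ketu", "Uranus", "Neptune"]


def positional_features(longitudes: dict[str, float]) -> dict[str, float]:
    """sin/cos of the angular distance for each of the 55 unordered body pairs."""
    features = {}
    for a, b in combinations(BODIES, 2):
        d = math.radians((longitudes[b] - longitudes[a]) % 360.0)
        features[f"{a}_{b}_sin"] = math.sin(d)
        features[f"{a}_{b}_cos"] = math.cos(d)
    return features


# Made-up longitudes in degrees, for illustration only.
example = {body: (37.0 * i) % 360.0 for i, body in enumerate(BODIES)}
feats = positional_features(example)
```

Because sin and cos are periodic, a separation of 359.9 degrees and one of 0.1 degrees map to nearby feature values, which is the cyclical continuity the text refers to.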

Tier 2: Advanced Features (~50). Gandanta proximity (continuous distance to nearest water-fire sign junction for each planet), Pushkara Bhaga proximity (distance to nearest auspicious degree), Mrityu Bhaga proximity (distance to nearest critical degree), uchcha bala (continuous exaltation strength based on exact distance from the exaltation degree), and combustion severity (continuous measure based on angular distance from Sun, weighted by planet-specific thresholds). These features encode classical degree-level concepts as continuous variables suitable for ML.
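As one example of a Tier 2 feature, Gandanta proximity can be sketched as the circular distance to the nearest water-fire junction. The sketch assumes sidereal longitudes in degrees with Aries starting at 0, which places the three junctions at 0, 120, and 240 degrees; the function names are hypothetical.

```python
# Water-to-fire sign boundaries: Pisces/Aries, Cancer/Leo, Scorpio/Sagittarius.
GANDANTA_JUNCTIONS = (0.0, 120.0, 240.0)


def circular_distance(a: float, b: float) -> float:
    """Shortest angular separation between two longitudes, in degrees."""
    diff = abs(a - b) % 360.0
    return min(diff, 360.0 - diff)


def gandanta_proximity(longitude: float) -> float:
    """Degrees from a planet's longitude to the nearest Gandanta junction (0 to 60)."""
    return min(circular_distance(longitude, j) for j in GANDANTA_JUNCTIONS)


near_pisces_aries = gandanta_proximity(359.0)   # 1 degree before the 0-degree junction
near_cancer_leo = gandanta_proximity(121.5)     # 1.5 degrees past the 120-degree junction
```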

Tier 3: Classical Rule Features (25). Binary and categorical features encoding specific patterns from K.N. Rao and Narasimha Rao: 7th lord placement relative to Venus, darakaraka sign type, Navamsha lagna lord aspects, Atmakaraka Navamsha placement, parivartana yoga presence, and other structured classical indicators.

Additional features from varga (divisional) chart computations and cross-varga comparisons bring the total to 911.

The primary ML model is XGBoost (gradient-boosted trees), chosen for its ability to capture nonlinear interactions between features without requiring explicit feature engineering of interaction terms. A Ridge regression model serves as a regularized linear baseline for comparison and as a component in ensemble predictions.

Six prediction targets are modeled: marriage age, marriage quality (categorical), career start age, career peak age, life expectancy, and first child age. Cross-validation uses stratified k-fold with k=5 on the available training set, reporting RMSE for continuous targets and accuracy/F1 for categorical targets.

4. Chart Scoring: 19-Dimension Assessment

Every chart receives scores across 19 dimensions, grouped into 6 primary domains:

  • Wealth (range: -3 to +8): Evaluates 2nd lord, 11th lord, Dhana yogas, Jupiter condition, and financial house ashtakavarga scores
  • Career (range: -2 to +8): Evaluates 10th lord, D10 Dashamsha, career yogas, Saturn condition, and professional house strengths
  • Fame (range: 0 to +7): Evaluates Sun strength, 10th house prominence, Raj yogas, and public visibility indicators
  • Health (range: -4 to +6): Evaluates 6th/8th house conditions, longevity indicators, planetary afflictions, and medical nakshatra associations
  • Suffering (range: -2 to +8): Evaluates dusthana activations, malefic concentrations, and adversity indicators
  • Relationship (range: -3 to +4): Evaluates 7th house, Venus condition, D9 quality, and partnership yogas

Weight calibration was performed on 20 celebrity validation charts with well-documented life outcomes. False positive filtering ensures that strong 8th or 12th house afflictions cap wealth and career scores -- preventing the engine from predicting prosperity when the chart clearly indicates challenges in those domains.
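The capping step might look like the sketch below. The score ranges come from the list above, but the cap value `AFFLICTION_CAP`, the boolean affliction flag, and the function name are assumptions; the paper does not state the actual ceiling or how affliction strength is measured.

```python
SCORE_RANGES = {
    "wealth": (-3.0, 8.0),
    "career": (-2.0, 8.0),
    "fame": (0.0, 7.0),
    "health": (-4.0, 6.0),
    "suffering": (-2.0, 8.0),
    "relationship": (-3.0, 4.0),
}
CAPPED_DOMAINS = ("wealth", "career")
AFFLICTION_CAP = 2.0  # hypothetical ceiling applied under strong 8th/12th house affliction


def finalize_scores(raw: dict[str, float], strong_affliction: bool) -> dict[str, float]:
    """Clamp each raw score to its domain range, lowering the ceiling when afflicted."""
    final = {}
    for domain, value in raw.items():
        low, high = SCORE_RANGES[domain]
        if strong_affliction and domain in CAPPED_DOMAINS:
            high = min(high, AFFLICTION_CAP)
        final[domain] = max(low, min(high, value))
    return final


raw_scores = {"wealth": 7.5, "career": 9.0, "health": 1.0}  # made-up inputs
capped = finalize_scores(raw_scores, strong_affliction=True)
uncapped = finalize_scores(raw_scores, strong_affliction=False)
```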

5. Psychological Profiling

The engine computes 6 behavioral dimensions from planetary positions, house strengths, and yoga patterns:

  • Leadership drive (0-100%)
  • Risk tolerance (0-100%)
  • Material attachment (0-100%)
  • Social orientation (introvert-ambivert-extrovert spectrum)
  • Analytical vs. intuitive balance
  • Entrepreneur probability (0-100%)

The Gandhi anomaly illustrates why psychological profiling matters for prediction accuracy. Mahatma Gandhi's chart shows high wealth potential (strong 2nd and 11th lords) but extremely low material drive (Saturn-influenced lagna, Ketu in wealth houses). Predicting wealth accumulation from chart scores alone would overestimate his financial outcome. Adding the psychological dimension -- high wealth potential plus low material drive -- correctly predicts voluntary poverty despite astrological wealth indicators.

The entrepreneur probability formula combines 10th lord condition, 7th lord involvement (7th house governs business partnerships), Mars strength (competitive drive), Rahu influence (unconventional ambition), and Jupiter support (wise expansion). Validation against 50 known entrepreneurs and 50 known salaried professionals yielded 78% classification accuracy.

6. Monte Carlo Life Simulation

The simulation engine models two distinct stochastic processes:

Continuous Evolution: Geometric Brownian Motion (GBM). Wealth trajectory follows dW/W = mu * dt + sigma * dZ, where mu (drift) and sigma (volatility) are calibrated from chart scores and dasha periods. This produces smooth, continuous wealth evolution between discrete events.

Discrete Events: Compound Poisson Process. Career jumps, business launches, windfalls, and setbacks are modeled as discrete events arriving according to a Poisson process with intensity lambda calibrated to dasha-period activity. Each jump has a magnitude drawn from a lognormal distribution.
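The two processes can be combined in a single simulated path, sketched below with only the standard library. Several choices here are assumptions rather than details from the engine: yearly time steps, the exact log-normal GBM update, jumps applied as multiplicative factors (so a factor below 1 acts as a setback), and all parameter names.

```python
import math
import random


def poisson_count(rng: random.Random, rate: float, dt: float) -> int:
    """Number of Poisson arrivals in an interval of length dt, via exponential gaps."""
    if rate <= 0.0:
        return 0
    count, t = 0, rng.expovariate(rate)
    while t < dt:
        count += 1
        t += rng.expovariate(rate)
    return count


def simulate_path(w0, years, mu, sigma, jump_rate, jump_mu, jump_sigma, rng):
    """One wealth path: GBM between events plus compound-Poisson lognormal jumps."""
    dt = 1.0  # one step per year (assumption)
    wealth = [w0]
    for _ in range(years):
        z = rng.gauss(0.0, 1.0)
        # Exact GBM step for dW/W = mu*dt + sigma*dZ.
        w = wealth[-1] * math.exp((mu - 0.5 * sigma ** 2) * dt + sigma * math.sqrt(dt) * z)
        # Each arrival multiplies wealth by a lognormal factor.
        for _ in range(poisson_count(rng, jump_rate, dt)):
            w *= rng.lognormvariate(jump_mu, jump_sigma)
        wealth.append(w)
    return wealth


# Illustrative parameters only; real values would be calibrated from chart scores.
path = simulate_path(100.0, 40, mu=0.05, sigma=0.2, jump_rate=0.3,
                     jump_mu=0.1, jump_sigma=0.5, rng=random.Random(42))
```

With `sigma` and `jump_rate` both set to zero the path reduces to deterministic growth at rate `mu`, which is a convenient sanity check.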

Four jump regimes are defined based on chart scores and psychological profile:

  1. Entrepreneur (strong): High lambda, high jump magnitude, high volatility
  2. Entrepreneur (moderate): Moderate lambda, moderate jump magnitude, moderate volatility
  3. Career breakout: Low-moderate lambda, high jump magnitude at specific dasha transitions, lower baseline volatility
  4. Salaried: Low lambda, small jumps (promotions), low volatility

Dasha-Time Integration. The DashaTimeline module pre-computes modifiers for every year of the native's life from birth to age 100. At each simulation step, the Mahadasha lord contributes 70% of the period modifier and the Antardasha lord contributes 30%. Benefic dasha lords increase drift (mu) and decrease volatility (sigma). Malefic dasha lords decrease drift and increase volatility. The strength of the effect scales with the planet's shadbala score.
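The 70/30 blend and its effect on drift and volatility can be sketched as follows. Only the 70% and 30% weights and the direction of the adjustments come from the text; the modifier scale (-1 for strongly malefic to +1 for strongly benefic), the normalized strength in 0 to 1, and the step sizes `drift_step` and `vol_step` are hypothetical.

```python
def period_modifier(mahadasha: float, antardasha: float) -> float:
    """Blend lord modifiers: 70% Mahadasha lord, 30% Antardasha lord."""
    return 0.7 * mahadasha + 0.3 * antardasha


def adjust_parameters(mu, sigma, modifier, strength, drift_step=0.02, vol_step=0.25):
    """Benefic periods (modifier > 0) raise drift and lower volatility; malefic do the reverse.

    `strength` stands in for a normalized shadbala score in [0, 1] (assumption).
    """
    effect = modifier * strength
    mu_adj = mu + drift_step * effect
    sigma_adj = sigma * (1.0 - vol_step * effect)
    return mu_adj, sigma_adj


# Benefic Mahadasha lord (+1.0) with a malefic Antardasha lord (-1.0).
blended = period_modifier(1.0, -1.0)
mu_benefic, sigma_benefic = adjust_parameters(0.05, 0.20, modifier=1.0, strength=1.0)
mu_malefic, sigma_malefic = adjust_parameters(0.05, 0.20, modifier=-1.0, strength=1.0)
```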

The simulator runs 10,000 Monte Carlo paths for each chart, producing distributional predictions with percentile bands (p10, p25, p50, p75, p90) for wealth at each age. This replaces point predictions ("you will earn X") with calibrated uncertainty ranges ("there is a 50% probability your net worth at age 50 falls between X and Y").
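The percentile bands can be computed from the simulated values at a given age as sketched below. Linear interpolation between order statistics is one common convention and is an assumption here; the lognormal draws merely stand in for the terminal wealth of 10,000 simulated paths.

```python
import math
import random


def percentile(sorted_values, q):
    """q-th percentile (0-100) of pre-sorted data, linearly interpolated."""
    if not sorted_values:
        raise ValueError("no values")
    pos = (len(sorted_values) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    frac = pos - lo
    return sorted_values[lo] * (1.0 - frac) + sorted_values[hi] * frac


def percentile_bands(values, levels=(10, 25, 50, 75, 90)):
    """Map 'p10', 'p25', ... to the corresponding percentile of the values."""
    ordered = sorted(values)
    return {f"p{q}": percentile(ordered, q) for q in levels}


rng = random.Random(0)
# Stand-in for wealth at one age across 10,000 simulated paths.
terminal_wealth = [rng.lognormvariate(0.0, 1.0) for _ in range(10_000)]
bands = percentile_bands(terminal_wealth)
```

The p25 to p75 interval contains half of the simulated outcomes, which is the kind of range the "50% probability" statement describes; the p10 to p90 interval contains 80%.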

Counterfactual Comparison Framework. The simulator can also run "what-if" scenarios: what if the native starts a business at age 30 versus staying in salaried employment? The two scenarios use different jump regimes and produce different distributional outcomes, allowing evidence-based life planning grounded in chart-specific parameters.

7. Chart Similarity Engine

The similarity engine maintains a database of 15,039 verified charts (Rodden Rating A+ only -- confirmed birth time from birth certificate or hospital records).

Matching operates across three layers:

  1. House matching (weighted): For each of 9 planets, does the planet occupy the same house? Jupiter and Saturn carry 1.5x weight; inner planets carry 1.0x weight.
  2. Sign matching: Does each planet occupy the same zodiac sign? Stricter than house matching.
  3. Dignity matching: Does each planet share the same dignity status (exalted, own, friend, neutral, enemy, debilitated)?
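The three matching layers can be sketched as per-layer scores between two charts. The 1.5x weight for Jupiter and Saturn is taken from the list above; the chart representation, the equal weighting within the sign and dignity layers, and reporting the layers separately rather than as one combined number are assumptions.

```python
PLANETS = ("Sun", "Moon", "Mars", "Mercury", "Jupiter", "Venus", "Saturn", "Rahu", "Ketu")
HOUSE_WEIGHTS = {p: 1.0 for p in PLANETS}
HOUSE_WEIGHTS.update({"Jupiter": 1.5, "Saturn": 1.5})


def layer_score(a, b, key, weights=None):
    """Weighted fraction of planets whose `key` attribute matches between two charts."""
    weights = weights or {p: 1.0 for p in PLANETS}
    matched = sum(weights[p] for p in PLANETS if a[p][key] == b[p][key])
    return matched / sum(weights[p] for p in PLANETS)


def similarity(a, b):
    """Scores in [0, 1] for the house, sign, and dignity layers."""
    return {
        "house": layer_score(a, b, "house", HOUSE_WEIGHTS),
        "sign": layer_score(a, b, "sign"),
        "dignity": layer_score(a, b, "dignity"),
    }


# Made-up charts: signs are numbered 1-12 purely for brevity.
chart_a = {p: {"house": i + 1, "sign": i + 1, "dignity": "neutral"}
           for i, p in enumerate(PLANETS)}
chart_b = {p: dict(v) for p, v in chart_a.items()}
for p in PLANETS:
    if p not in ("Jupiter", "Saturn"):
        chart_b[p]["house"] = chart_b[p]["house"] % 12 + 1  # move to the next house

scores = similarity(chart_a, chart_b)
```

In this example only Jupiter and Saturn share a house, so the house layer scores 3.0 out of a total weight of 10.0.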

A pre-computed similarity index enables instant lookup in under 0.1 seconds. Each chart in the database has its feature vector pre-computed and stored, allowing nearest-neighbor retrieval without real-time computation across all 15,039 charts.

From the top 7 matches, the engine extracts empirical life themes: common career domains, marriage timing patterns, health patterns, and wealth trajectories. These themes serve as an independent empirical check on the classical rule engine's predictions.

8. Bayesian Feedback Loop

The feedback system implements online Bayesian learning from user-submitted life event corrections.

When a user reports an actual life event (marriage at age X, career change at age Y, health condition diagnosed at age Z), the engine:

  1. Identifies all rules that were active for the relevant domain prediction
  2. Determines which dasha period was actually running at the reported event time
  3. Applies Bayesian posterior updating: P(rule reliable | evidence) proportional to P(evidence | rule reliable) * P(rule reliable)
  4. Updates rule weights incrementally -- no full model retrain required
  5. Updates dasha-domain association weights (which dasha periods empirically produce which life events)

The likelihood function maps prediction error to probability: predictions within 2 years of actual receive likelihood 0.9, within 3 years receive 0.8, within 5 years receive 0.5, and beyond 5 years receive 0.3. This graduated scale avoids discarding near-miss rules while appropriately penalizing large errors.
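A minimal sketch of one update step, using the graduated likelihood above, is shown below. Normalizing the posterior requires a likelihood for the "rule not reliable" case, which the paper does not give; this sketch assumes it is the complement, 1 - P(evidence | rule reliable). Treating the stated thresholds as inclusive is also an assumption.

```python
def likelihood(error_years: float) -> float:
    """P(evidence | rule reliable) from the absolute timing error in years."""
    error = abs(error_years)
    if error <= 2:
        return 0.9
    if error <= 3:
        return 0.8
    if error <= 5:
        return 0.5
    return 0.3


def update_reliability(prior: float, error_years: float) -> float:
    """One Bayesian update of P(rule reliable) after a reported outcome."""
    like = likelihood(error_years)
    numerator = like * prior
    # Assumption: P(evidence | rule not reliable) = 1 - like.
    denominator = numerator + (1.0 - like) * (1.0 - prior)
    return numerator / denominator


# Incremental updates from three hypothetical feedback events.
weight = 0.6
history = [weight]
for error in (1.0, 4.0, 7.0):
    weight = update_reliability(weight, error)
    history.append(weight)
```

Under this assumption a likelihood of 0.5 leaves the weight unchanged, so errors between 3 and 5 years neither reward nor penalize a rule. Each update uses only the current weight and the new observation, which is what makes retraining unnecessary.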

Over time, as feedback accumulates, rule weights converge from their assumed traditional values to empirically measured values. The system maintains both the prior (traditional weight) and the posterior (measured weight), allowing comparison between what the classical texts predict and what modern data confirms.

9. Deep Analysis Integration: 8 Subsystems

The deep analysis layer comprises 8 specialized subsystems that feed into the AI consultation:

  1. Divisional Charts: All 16 vargas computed with planet placements, dignities, and cross-varga analysis
  2. Transit Analysis: Current and forward 24-month transit positions cross-referenced against natal chart
  3. Nakshatra Analysis: All 27 nakshatras evaluated for health, psychology, and timing implications
  4. KP Significators: Sub-lord theory analysis for all 12 cusps with significator rankings
  5. Ashtakavarga: Full 8x12 bindu matrix with Sarvashtakavarga house scores
  6. Shadbala: Six-fold strength computation for all 9 planets (54 individual calculations)
  7. Alternate Dashas: Yogini and Chara (Jaimini) dasha timelines computed as independent cross-references
  8. Special Degrees: Gandanta, Pushkara Bhaga, Mrityu Bhaga, Vargottama, and combustion status for all planets

All 8 subsystems produce structured data outputs that are assembled into a comprehensive analysis document. This document -- containing every computed data point, score, and assessment -- is provided to the AI consultation layer (Claude) as context. The AI does not generate astrological data. It interprets the pre-computed data, synthesizing findings across subsystems and translating technical analysis into actionable guidance with classical text citations.

10. Validation Results

Validation was conducted against 23 celebrity and historical charts with well-documented life outcomes.

Wealth Predictions (Monte Carlo p50 vs. documented peak net worth):

| Subject | Predicted (p50) | Actual | Ratio |
|---------|----------------|--------|-------|
| Sachin Tendulkar | $29.8M | $30M | 0.99x |
| Albert Einstein | $182K | $200K | 0.91x |
| Bill Gates | $623M | $350M | 1.78x |
| Princess Diana | $18.2M | $25M | 0.73x |

The Tendulkar and Einstein predictions fall within 10% of documented values. Gates is overpredicted, at 1.78x the documented value: his chart's extreme wealth indicators (strong 2nd and 11th lords, multiple Dhana yogas) produce high Monte Carlo drift, but the model does not yet account for philanthropic outflow that reduces net worth from peak earning potential. Diana is underpredicted, at 0.73x the documented value: her inherited and divorce settlement wealth is partially independent of chart-based earning indicators.
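The ratios follow directly from the predicted and documented figures, as the short check below shows. The dollar amounts are those reported in the wealth table; the variable names are illustrative.

```python
# (predicted p50, documented actual) in US dollars, as reported in the wealth table.
WEALTH = {
    "Sachin Tendulkar": (29.8e6, 30e6),
    "Albert Einstein": (182e3, 200e3),
    "Bill Gates": (623e6, 350e6),
    "Princess Diana": (18.2e6, 25e6),
}

# Ratio of prediction to actual, rounded to two decimals as in the table.
ratios = {name: round(pred / actual, 2) for name, (pred, actual) in WEALTH.items()}

# Subjects whose prediction lies within 10% of the documented value.
within_10_percent = [name for name, r in ratios.items() if abs(r - 1.0) <= 0.10]
```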

Health Scoring Validation:

| Subject | Health Score | Actual Longevity |
|---------|-------------|-----------------|
| Queen Elizabeth II | +6.0 | Died at 96 |
| Princess Diana | -3.5 | Died at 36 |
| Sachin Tendulkar | +0.98 | Alive at 53, healthy |
| Amitabh Bachchan | -3.5 | Multiple serious illnesses, alive at 83 |

Elizabeth's +6.0 health score, the top of the -4 to +6 scale, correctly predicted exceptional longevity. Diana's -3.5 correctly identified severe health vulnerability (though the specific cause, a car accident, is not predictable by chart analysis). Tendulkar's mildly positive +0.98 aligns with his documented physical resilience across a 24-year international sports career.

Transition Capture Rate: Across 23 charts with documented life transitions (career changes, marriages, health events, wealth shifts), 65% of transitions fall within the p10-p90 simulation band. A well-calibrated p10-p90 band would nominally contain 80% of outcomes, so capturing roughly two-thirds of actual life events indicates that the simulated bands are somewhat too narrow, a shortfall we attribute to model complexity and data limitations.
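The capture rate is the share of documented events that land inside their simulated band, which can then be read against the 80% that a p10-p90 band nominally contains. The sketch below uses made-up event data; the function name and tuple layout are assumptions.

```python
NOMINAL_COVERAGE = 0.80  # a p10-p90 band contains 80% of simulated outcomes


def capture_rate(events):
    """Fraction of (actual, p10, p90) triples whose actual value lies inside the band."""
    events = list(events)
    if not events:
        raise ValueError("at least one event is required")
    hits = sum(1 for actual, p10, p90 in events if p10 <= actual <= p90)
    return hits / len(events)


# Hypothetical (actual age, p10 age, p90 age) triples, for illustration only.
sample = [(28, 25, 32), (41, 30, 38), (35, 33, 40), (52, 45, 50), (30, 27, 36)]
rate = capture_rate(sample)
coverage_gap = NOMINAL_COVERAGE - rate  # positive when bands capture less than nominal
```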

Domain Accuracy (Rule Engine):

| Domain | Estimated Accuracy |
|--------|-------------------|
| Career | 33% (lowest -- career paths are highly environment-dependent) |
| Marriage | 70% (timing within 3 years of actual) |
| Foreign Settlement | 66% |
| Health | 50% (constitutional assessment, not specific diagnosis) |

Marriage timing at 70% accuracy represents the engine's strongest domain, likely because marriage timing is heavily dasha-dependent and the Vimshottari system is well-calibrated for this prediction type. Career at 33% reflects the fundamental challenge that career outcomes depend heavily on education, market conditions, and opportunities that no birth chart encodes.

11. Future Directions

Phase 3: Neural Networks. The current ML pipeline uses XGBoost and Ridge regression. Phase 3 will introduce neural network models (transformer-based architectures for sequential dasha analysis, and feedforward networks for cross-sectional chart scoring). This requires a minimum of 200 training samples with verified outcomes -- a threshold we are approaching through user feedback accumulation.

Expanded Chart Database. The current 15,039-chart similarity database, while substantial, limits the granularity of empirical theme extraction. The target is 50,000 verified charts within 18 months, sourced from established astrology databases (Astro-Databank, JHora collections) and partnership agreements with research institutions.

User Data Flywheel via /story Page. The /story feature invites users to share their life timeline (anonymized) in exchange for enhanced predictions. Each submitted life story provides ground truth data for multiple prediction domains simultaneously. At scale, this creates a self-reinforcing data flywheel: more stories improve model accuracy, which improves predictions, which attracts more users, who contribute more stories.

The long-term architectural vision is a system where every prediction carries not just a convergence score but a measured empirical accuracy derived from the accumulated feedback of thousands of users -- transforming Vedic astrology from a tradition of assumed authority to a computational framework of measured reliability.
