How Yahoo Finance Broke My Live Stock Index — and the Cookie-and-Crumb Fix That Brought It Back

Goutham Shravan
July 3, 2026
5 min read

A live serverless stock index built with Webflow — Everything Flow

Quick Summary

  • I built a live, self-updating stock index for 47 Indian new-age companies — no backend, no database, nobody touching it by hand.
  • The architecture is one line: yfinance to a Python builder, JSON committed to GitHub, served over GitHub’s CDN, fetched by Webflow, rendered with Chart.js.
  • Yahoo Finance broke it with blanket 429s — the fix was realising it wasn’t rate-limiting, it was a new cookie-and-crumb handshake.
  • yfinance performs that handshake automatically, from any IP including GitHub runners — one line brought the whole feed back to life.
  • Serverless here is a feature: no infra, CDN-cached, git history as an audit log, and a health guard that never overwrites good data with a broken run.

This time I was building a live stock index. Not a mockup with fake numbers — a real, self-updating index that tracks 47 of India’s new-age tech, consumer and fintech companies against the NIFTY 500, live, with nobody touching it by hand.

It mostly worked on the first try. Then the data source broke in the most interesting way possible, and the fix taught me more than the rest of the build combined. This is the full story — the idea, the architecture, the wall, and the plumbing that ties GitHub to Webflow.

The idea

The brief was a public landing page for the Z47 “FortySeven” index. It had to show, always fresh:

  • The index value and returns across 1M / 3M / 6M / 1Y / YTD / since-inception
  • The same for the NIFTY 500 benchmark
  • Sector composition (47 companies split across Consumer Tech, Fintech, SaaS/AI, B2B)
  • Top gainers / laggards, largest constituents
  • USD/INR
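The return windows listed above can be sketched as a lookup against the daily history. This is a minimal illustration, not the actual builder: the function names, the `WINDOWS` table, and the choice of calendar-day offsets (rather than calendar months or trading-day counts) are all assumptions.

```python
from datetime import date, timedelta

# Hypothetical window lengths in calendar days; the real builder may use
# calendar months or trading-day counts instead.
WINDOWS = {"1M": 30, "3M": 91, "6M": 182, "1Y": 365}


def value_on_or_before(history, target):
    """Return the latest level dated on or before `target`, or None."""
    eligible = [d for d in history if d <= target]
    return history[max(eligible)] if eligible else None


def trailing_returns(history):
    """history maps date -> index level. Returns percent changes per window."""
    latest_day = max(history)
    latest = history[latest_day]
    anchors = {k: latest_day - timedelta(days=n) for k, n in WINDOWS.items()}
    # YTD is measured from the last level of the previous calendar year.
    anchors["YTD"] = date(latest_day.year - 1, 12, 31)
    out = {}
    for label, anchor in anchors.items():
        base = value_on_or_before(history, anchor)
        # A window longer than the available history has no return yet.
        out[label] = None if base is None else round((latest / base - 1) * 100, 2)
    # Since inception: measured from the first level in the series.
    first = history[min(history)]
    out["SI"] = round((latest / first - 1) * 100, 2)
    return out
```

Returning `None` for windows that predate the history lets the page show a dash instead of a misleading number.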

The hard constraint: no backend to babysit, no database, no manual updates. The page should look designed (it’s Webflow) but behave like a live product.

The architecture, in one line

yfinance → Python builder → JSON committed to GitHub → served over GitHub’s CDN → fetched by Webflow → rendered with Chart.js.

There is no server anywhere in that sentence. That’s the whole trick, and I’ll come back to why it’s nice.

The GitHub repo (the “backend”)

The data side lives in one small repo. Three things matter:

  • build_z47_json.py — the builder. It fetches prices for all 47 constituents + the benchmark + FX, computes the index off a fixed divisor, and writes two files: z47_index.json (the live feed) and z47_history.csv (the daily history that anchors the index).
  • z47_index.json — the output. This single file is the API. It holds the index value, both return sets, every constituent (price, day change, 1-month return, market cap, sector), the movers, the sector rollup, and the history series for the chart.
  • .github/workflows/refresh-z47-feed.yml — the automation. A cron runs the builder on a schedule and commits the refreshed JSON straight back into the repo:

on:
  schedule:
    - cron: "30 4-9 * * 1-5"   # hourly, ~10:00-15:00 IST (market hours)
    - cron: "45 10 * * 1-5"    # ~16:15 IST, the settled close
  workflow_dispatch: {}
permissions:
  contents: write
jobs:
  refresh:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with: { python-version: "3.12" }
      - run: python3 -m pip install --quiet --upgrade yfinance
      - run: python3 build_z47_json.py --write-history
      - run: |
          git config user.name "z47-feed-bot"
          git add z47_index.json z47_history.csv
          git commit -m "data: refresh feed ($(date -u +'%Y-%m-%dT%H:%MZ'))" || echo "no changes"
          git push

Because every refresh is a commit, the git history doubles as an audit log of the index over time — I can git blame any number on the page back to the exact run that produced it.
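The builder "computes the index off a fixed divisor". A minimal sketch of what that can look like follows, assuming a market-cap-weighted index. The weighting scheme, the base level of 1000, and every function name here are assumptions for illustration, not the builder's actual code.

```python
def divisor_for_base(market_caps, base_level=1000.0):
    """Pick the divisor once, so the index starts at `base_level`."""
    return sum(market_caps.values()) / base_level


def index_level(market_caps, divisor):
    """Index level = total constituent market cap / fixed divisor."""
    return sum(market_caps.values()) / divisor


def weights(market_caps):
    """Each constituent's share of the total market cap."""
    total = sum(market_caps.values())
    return {ticker: cap / total for ticker, cap in market_caps.items()}
```

Because the divisor is set once and then held fixed, later changes in the level reflect only changes in the constituents' combined market cap.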

The wall

One morning the whole thing went dark. Every request to Yahoo came back 429 – Too Many Requests.

My first move was to look for another source. I found one that was otherwise solid, but it hadn’t updated the latest IPO prices for 3 of the companies in the basket — for an index, that’s disqualifying. So that door closed too, and I went back to Yahoo by evening.

Second move: assume I was being greedy. I hardened the fetch — realistic browser headers, rotating between Yahoo’s hosts, exponential backoff that honors Retry-After, lower concurrency. Still 429. Then I ran a plain curl from my home connection — 429. Friends on completely different networks tried — also 429.
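For reference, the "exponential backoff that honors Retry-After" part of that hardening can be sketched like this. The function name, the defaults, and the decision to handle only the seconds form of the header are assumptions; it is not the code from the repo.

```python
import random


def backoff_delay(attempt, retry_after=None, base=1.0, cap=60.0, jitter=True):
    """Seconds to wait before retry number `attempt` (0-based)."""
    if retry_after is not None:
        try:
            # The server asked for a specific wait in seconds: honor it.
            return max(0.0, float(retry_after))
        except ValueError:
            pass  # Header held an HTTP date; fall back to exponential backoff.
    delay = min(cap, base * 2 ** attempt)
    # Full jitter spreads retries out so parallel workers do not sync up.
    return random.uniform(0, delay) if jitter else delay
```

None of this helps when the server rejects every unauthenticated request, which is what turned out to be happening here.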

That’s when it clicked: this wasn’t rate-limiting me. Yahoo had quietly changed the rules. The public endpoints now reject bare, unauthenticated requests across the board. They expect an authenticated cookie + crumb session: you first hit Yahoo to receive a session cookie, then request a short “crumb” token, and you must send both on every subsequent call. No handshake, no data — for anyone.
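The shape of that handshake, as described above, is roughly the following. Treat it as a sketch under loud assumptions: the three URLs are my understanding of Yahoo's unofficial, undocumented endpoints and may change at any time, and the function names are hypothetical. The `session` argument is any object with a `requests.Session`-style `get()` that keeps cookies between calls.

```python
# All three URLs are assumptions about Yahoo's unofficial endpoints; they are
# undocumented and may change without notice.
COOKIE_URL = "https://fc.yahoo.com"
CRUMB_URL = "https://query1.finance.yahoo.com/v1/test/getcrumb"
QUOTE_URL = "https://query1.finance.yahoo.com/v7/finance/quote"
HEADERS = {"User-Agent": "Mozilla/5.0"}


def open_yahoo_session(session):
    """Run the two-step handshake on `session` and return the crumb token."""
    # Step 1: hit Yahoo once so the session's cookie jar receives a cookie.
    # The response status is ignored here; only the cookie matters.
    session.get(COOKIE_URL, headers=HEADERS, timeout=10)
    # Step 2: trade the cookie for a short crumb token.
    resp = session.get(CRUMB_URL, headers=HEADERS, timeout=10)
    crumb = resp.text.strip()
    if resp.status_code != 200 or not crumb:
        raise RuntimeError(f"crumb request failed with HTTP {resp.status_code}")
    return crumb


def fetch_quote(session, crumb, symbol):
    """Send the cookie (via the session) and the crumb (as a parameter)."""
    resp = session.get(
        QUOTE_URL,
        params={"symbols": symbol, "crumb": crumb},
        headers=HEADERS,
        timeout=10,
    )
    resp.raise_for_status()
    return resp.json()
```

With the `requests` library installed, `open_yahoo_session(requests.Session())` would be the natural way to try it.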

That reframed the problem entirely. It wasn’t “find a cleaner IP” or “slow down.” It was “perform the handshake.”

Pro Tip

When an API slams the door, read how the rejection works before reaching for a paid alternative. A 429 across every IP isn’t rate-limiting — it’s usually a protocol change. The block turned out to be a handshake I simply hadn’t performed yet.

The fix: yfinance

Instead of reverse-engineering the cookie/crumb dance by hand (or paying for an API), I checked whether the yfinance library already did it. It does — automatically, on first use.

I tested it from the exact IP that was 429-ing on curl minutes earlier. Live prices came straight back — around 1:30 AM, after a full day of fighting it. That was the moment the project turned. So I rebuilt the fetch around yfinance:

import yfinance as yf

t  = yf.Ticker("RELIANCE.NS")
fi = t.fast_info
price      = fi.last_price        # live price
prev_close = fi.previous_close    # real previous close
mcap       = fi.market_cap        # live market cap

hist = t.history(period="max", auto_adjust=False)  # daily series for the chart

Two nice side effects:

  • It works from any IP, including GitHub’s runners — so the automation came back to life with a single pip install yfinance step.
  • It’s the same library the source-of-truth dashboard uses, so our numbers now match theirs to the decimal.

The correctness pass (unglamorous but crucial)

Once data flowed again, I fixed everything the old bare-fetch had quietly gotten wrong:

  • Day change now comes from the real previous_close. The old code was reading a multi-year range’s opening close and reporting it as a “daily” move — which produced absurd +100%-style day changes.
  • Market cap comes from Yahoo’s live value. The two US-listed names are converted USD→INR so sector weights and the sort order are actually correct.
  • Trading calendar is rebuilt from the constituents themselves. The benchmark had occasional missing days, and using it as the calendar was silently dropping real sessions from the chart.
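Each of those three fixes is small enough to sketch in a few lines. The helper names are hypothetical, and building the calendar as the union of every constituent's trading days is my assumption about what "rebuilt from the constituents themselves" means.

```python
def day_change_pct(last_price, previous_close):
    """Daily move measured against the real previous close."""
    if not previous_close:
        return None  # no usable previous close, so no day change
    return round((last_price / previous_close - 1) * 100, 2)


def market_cap_inr(market_cap, currency, usd_inr):
    """Convert USD-denominated market caps to INR; pass INR through."""
    return market_cap * usd_inr if currency == "USD" else market_cap


def trading_calendar(series_by_ticker):
    """Union of all dates on which any constituent has a price."""
    days = set()
    for series in series_by_ticker.values():
        days.update(series)  # iterating a dict yields its date keys
    return sorted(days)
```

Taking the union means a gap in any single series, including the benchmark's, can no longer drop a session from the chart.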

A health guard protects the file:

if priced_ok < len(tickers) or not benchmark_ok:
   sys.exit(1)   # abort WITHOUT writing

If all 47 don’t price cleanly, the run aborts before overwriting — the page keeps the last good data instead of ever flashing something broken. A failed run is a no-op, not an outage.
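One way to make "abort before overwriting" hold even if the process dies mid-write is to validate first, write to a temporary file, and then swap it into place. This is a sketch of that pattern, not the builder's actual code; the function name and signature are hypothetical.

```python
import json
import os
import tempfile


def write_feed_if_healthy(feed, path, priced_ok, expected, benchmark_ok):
    """Write `feed` to `path` only when the run is complete. Returns success."""
    if priced_ok < expected or not benchmark_ok:
        return False  # unhealthy run: leave the last good file untouched
    directory = os.path.dirname(os.path.abspath(path))
    # Write next to the target so os.replace stays on one filesystem.
    fd, tmp = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            json.dump(feed, fh, indent=2)
        os.replace(tmp, path)  # atomic swap: readers never see a partial file
    except BaseException:
        if os.path.exists(tmp):
            os.remove(tmp)
        raise
    return True
```

The caller can still `sys.exit(1)` when this returns `False`, so the workflow run shows up as failed while the committed feed stays intact.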

The Webflow frontend fetcher

Here’s the part that makes it feel live without a backend. GitHub serves z47_index.json over its CDN via the raw URL — CORS-enabled and cached — which turns the repo into a free, fast, static API. The Webflow side just reads that URL on page load:

var FEED_URL = "https://raw.githubusercontent.com/.../z47_index.json";

fetch(FEED_URL, { cache: "no-store" })
  .then(function (r) { if (!r.ok) throw 0; return r.json(); })
  .then(paint)                               // draw Chart.js + fill cards/tables
  .catch(function () { paint(FALLBACK); });  // inline snapshot fallback

Each visual is its own self-contained embed — the performance chart, the sector donut, the 1-month movement bars, the live constituents table. A few constraints shaped them:

  • Webflow caps custom code embeds at 50,000 characters (roughly 50 KB), so each one carries the Chart.js loader, its render logic, and a downsampled inline snapshot of the data as an offline fallback. If the live fetch ever fails, the embed paints the baked snapshot instead of showing an empty box.
  • Everything is attribute-driven (data-z47="..."), so the engine fills the design rather than the design hard-coding numbers.
  • It’s responsive, with the charts re-sizing on tab switches and breakpoints.
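The downsampled inline snapshot mentioned above could be produced on the builder side with something like this. It is a sketch only: even-stride sampling is an assumption (the real builder may thin the series differently), and both function names are hypothetical.

```python
import json


def downsample(points, max_points):
    """Keep at most `max_points` items, always including first and last."""
    if max_points < 2:
        raise ValueError("max_points must be at least 2")
    n = len(points)
    if n <= max_points:
        return list(points)
    # Evenly spaced indices from 0 to n - 1 inclusive.
    step = (n - 1) / (max_points - 1)
    return [points[round(i * step)] for i in range(max_points)]


def snapshot_size(points):
    """Bytes the snapshot occupies once serialized as compact JSON."""
    return len(json.dumps(points, separators=(",", ":")).encode("utf-8"))
```

Checking `snapshot_size` against the embed budget before baking the fallback in avoids finding out about the limit inside the Webflow editor.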

So the client-side flow is: fetch one JSON → parse → paint. No API keys in the browser, nothing to leak, nothing to rate-limit.

Why serverless here is a feature, not a shortcut

  • No infra: GitHub runs the job and hosts the file; Webflow hosts the page. Nothing for me to keep alive.
  • Cheap and cached: the feed is a static file on a CDN. It absorbs traffic spikes for free.
  • Auditable: every data point traces to a commit.
  • Fail-safe: a broken fetch never overwrites good data, and the front end falls back to a baked snapshot.

The only moving part is a scheduled script that commits a JSON file. If you want the deeper version of this idea — running real logic alongside a Webflow site without a separate stack — I wrote about how Webflow has a built-in backend. And for the economics of leaning on a CDN, Armory’s biggest week cost us $25.

Lessons

When an API slams the door, don’t reach for the credit card first. Read how the rejection works. Half my best decisions on this build came from understanding the protocol rather than routing around it — the “block” turned out to be a handshake I simply hadn’t performed yet.

And keep the architecture boring. A cron, a JSON file, and a fetch got me a live financial product with no server to maintain. Boring scales.

Live: z47.com/z47-forty-seven
Stack: Python · yfinance · GitHub Actions · Webflow · Chart.js

Suggested read: Webflow has a built-in backend — you don’t need third-party dependencies

Suggested read: How we handled Armory’s biggest week: 6.1M requests for $25
