How much does my contrast sensitivity score normally vary between tests?

Day-to-day factors like sleep, caffeine, and lighting typically move a healthy adult's reading by about plus or minus 0.10 to 0.15 log units, and the test's own measurement procedure adds roughly another 0.15 log units of noise. As a rule of thumb, any single difference under about 0.20 log units between two readings is inside the normal noise band and shouldn't be read as a real change.

How many readings do I need before a trend means something?

Two readings tell you almost nothing. Four can start to hint at a shape, but a run of four random points can look like a trend by chance. Eight or more readings give reasonable confidence in a pattern like a slow trend or a step change, and twelve or more are needed to reliably spot something cyclical like a sawtooth pattern.

I had one bad reading and then it improved after I changed something — did that fix it?

Maybe, but be cautious of regression to the mean: an unusually low reading is statistically likely to be followed by a higher one regardless of any intervention, simply because the first reading was a tail draw from your normal range. Wait for three or more readings to see whether the improvement holds or whether it settles back into your usual range.

When should I bring my contrast sensitivity trend to a clinician?

Consider it when three or more consecutive readings show a sustained drop of around 0.30 log units below your earlier baseline, or when there's a sudden step change tied to a specific event like a head injury or new medication. Any new visual symptoms, like distortion or sudden blurring, warrant seeing an eye doctor regardless of what your tracking curve shows.

How to Track Contrast Sensitivity Over Months

You took the test once. The number told you something. You closed the tab. A few weeks later you came back, took it again, and got a different number. Now what?

A single contrast-sensitivity reading carries less information than it appears to. The information lives in the series. This post is about how to read your own data once you have a series, and why the judgements people instinctively want to make from two readings ("uh-oh, it dropped") are usually statistical artefacts rather than real signal.

The everyday reflex — compare the most recent number to the one before it, ask what changed — is almost the only thing that doesn't work. The view that does work is wider. It's the curve, not the dot.

Where the variance comes from

A contrast-sensitivity reading is the sum of at least four different sources of variation, only one of which is the thing you actually want to measure. Ranked roughly by how much each tends to move a single session, from largest to smallest:

Day-to-day input noise. Caffeine, sleep, time of day, fatigue, dry eye, recent screen use, lighting, hydration. These collectively move a healthy adult's reading by roughly ±0.10–0.15 log units session-to-session under casually controlled conditions. Our separate piece on caffeine, alcohol, and sleep covers the within-week version of this story; this post is about what those wobbles look like once you've stacked thirty of them.

Psychometric noise of the test itself. Even if your visual system were perfectly stable and you took the test under perfectly identical conditions, the measurement procedure is probabilistic. An adaptive threshold estimator places a small number of trials near a stochastic perceptual boundary and infers where it sits; two honest runs rarely return the same number. For the Pelli-Robson chart¹ — the most carefully studied clinical reference — test-retest repeatability lands around ±0.15 log units.² Wichmann & Hill (2001) formalise why fitting a psychometric function from a finite number of trials inherently carries irreducible uncertainty.³

Device and calibration drift. Change devices, ambient lighting, viewing distance, or screen brightness and you've changed the measurement environment. Even on the same setup, ambient light at different times of day shifts your effective baseline by perhaps ±0.05–0.10 log units. Our calibration step standardises the most important screen variables at the start of each session, but not the room around you.

Real biological change. The slope you're actually trying to read.

Stacking these gives a useful rule of thumb. Any single difference smaller than about 0.20 log units, in either direction, between two readings is well inside the noise band and should not be interpreted as a change. That doesn't mean small differences are meaningless across enough readings: averaged over many sessions, persistent shifts emerge. But across two readings, you cannot tell whether a 0.15-log dip is a real effect or a normal Tuesday.
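One way to see where a figure near 0.20 can come from is to combine the three noise sources above in quadrature (root-sum-of-squares). This is a rough sketch, not how the rule of thumb was necessarily derived: it assumes the sources are independent and that each ± range can be treated as a comparable spread. The midpoint values and the function name `combined_noise` are illustrative choices.

```python
import math

def combined_noise(*spreads):
    """Root-sum-of-squares of independent noise spreads, in log units."""
    return math.sqrt(sum(s * s for s in spreads))

# Midpoints of the ranges quoted above (an illustrative assumption).
day_to_day = 0.125    # midpoint of 0.10-0.15
psychometric = 0.15   # Pelli-Robson repeatability figure
device_drift = 0.075  # midpoint of 0.05-0.10

total = combined_noise(day_to_day, psychometric, device_drift)
print(f"combined noise band: about ±{total:.2f} log units")
```

Under these assumptions the combined band comes out a little over 0.20 log units, which is consistent with the rule of thumb. Independent sources add in quadrature rather than linearly, which is why the total is well below the simple sum of the three figures.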

What patterns actually look like

Once you have eight or more readings under similar conditions, the series starts to take a shape. Most shapes fall into one of a handful of patterns.

Stable. A roughly flat line with random scatter above and below a midline, wandering within a band of perhaps ±0.15 log units around it with no consistent direction. The most common pattern in healthy adults, and the most reassuring.

Trending. A clear slope, up or down, visible by eye when you sketch the line. Slow downward trends across many months can reflect age-related decline — which is gradual, weighted toward the higher (finer-detail) spatial frequencies, and typically measurable only from mid-adulthood onward rather than following a fixed yearly rate⁴ — or slowly progressing ocular conditions, or accumulating environmental factors. Upward trends are more often recovery from a known insult, or learning effects in the first few sessions before the score settles.

Step change. A flat baseline interrupted by a sudden drop or rise that persists — post-step readings cluster around a new midline rather than drifting back to the old one. Step changes tend to align with identifiable events: a concussion, an illness, a medication change, a new glasses prescription, a new screen.

Sawtooth. A repeating up-down pattern that aligns with a cycle. Hormonal cycles, sleep-debt cycles, and post-exertional symptom cycles in conditions like ME/CFS or Long COVID can all leave this fingerprint. Requires enough cycles to be visible — typically three or four full oscillations.

Recovery. A clear upward trend following a known insult, plateauing back near baseline. Common after concussion (we have a week-by-week recovery post), after eye surgery, or after acute illness.

Each pattern is informative, but only with enough data. With four points, a step change is plausibly random. With ten that cluster cleanly around two midlines either side of an event, it's not easy to dismiss.
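The "two midlines either side of an event" idea can be sketched as a comparison between the shift in the averages and the scatter within each group. The readings below are hypothetical, and the function name `step_summary` is invented for illustration; this is a way of eyeballing the numbers, not a statistical test.

```python
from statistics import mean, stdev

def step_summary(before, after):
    """Compare the midlines either side of an event against the scatter."""
    shift = mean(after) - mean(before)
    # Typical within-group scatter: average of the two sample standard deviations.
    scatter = (stdev(before) + stdev(after)) / 2
    return shift, scatter

# Hypothetical readings in log units, five before and five after an event.
before = [1.65, 1.70, 1.60, 1.68, 1.62]
after = [1.35, 1.30, 1.38, 1.32, 1.40]

shift, scatter = step_summary(before, after)
print(f"shift {shift:+.2f}, scatter {scatter:.2f}")
```

In this made-up series the shift is several times the scatter, which is the situation described above as hard to dismiss. With only two readings on each side, the scatter estimate itself would be too unreliable to lean on.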

Regression to the mean

This one is worth taking seriously because almost everyone gets it wrong intuitively.

If your first reading was unusually low — bad day, rough night, no coffee — your next reading is likely to be higher, simply because the first one sat in the tail of your distribution. The same works in reverse: an unusually high reading is more likely to be followed by a lower one, because the unusually high reading isn't where your distribution typically lives.

This is regression to the mean, and it's one of the most common sources of false causal stories in any self-tracking record. Someone tests poorly, starts taking lutein supplements, tests better a week later, and concludes the supplement worked. The simpler explanation, almost always, is that the first reading was a tail draw and the second was closer to the centre — exactly what statistics predict in the absence of any intervention.

Don't read a rebound as a recovery. Wait for the trend. If a bad reading is followed by three readings clustering around your usual range, that's regression to the mean. If the next three keep climbing past your usual range, then you have a signal to think about.
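Regression to the mean is easy to reproduce in a small simulation. The sketch below assumes a perfectly stable visual system whose readings are a fixed baseline plus independent Gaussian noise; the baseline, noise level, and "bad day" cutoff are hypothetical values chosen for illustration, not figures from the test.

```python
import random
from statistics import mean

rng = random.Random(0)  # fixed seed so the run is repeatable
BASELINE, NOISE_SD = 1.65, 0.10  # hypothetical stable baseline and session noise

# A perfectly stable visual system: every reading is baseline plus noise.
readings = [rng.gauss(BASELINE, NOISE_SD) for _ in range(20000)]

# Pick out the "bad day" readings and whatever reading came next.
low_cutoff = BASELINE - 0.15
pairs = [(a, b) for a, b in zip(readings, readings[1:]) if a < low_cutoff]

mean_low = mean(a for a, _ in pairs)
mean_next = mean(b for _, b in pairs)
print(f"low readings average {mean_low:.2f}, the next ones average {mean_next:.2f}")
```

Nothing changes between sessions in this simulation, yet the reading after a bad one is on average right back at baseline. Any intervention started after the low reading would appear to have worked.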

Sample-size intuition

How many points before the line means something? Some rough anchors.

Two readings. Tell you almost nothing about a trend. The difference between them is one sample of the noise distribution. That's about it.

Four readings. Start to hint at a shape. When the three later points all sit on one side of the first, it looks like a trend, but a sequence of four random draws produces this pattern surprisingly often.

Eight or more. Enough to say "the recent ones look lower than the earlier ones" with some confidence, provided the difference between recent and earlier averages is clearly larger than the within-group scatter.

Twelve or more. Enough to see a sawtooth or a slow trend you might otherwise mistake for stable.
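How often do four purely random readings look like a trend? If the readings are independent draws from the same distribution with no ties (an assumption made for this sketch), every ordering of the four values is equally likely, so the question reduces to counting orderings.

```python
from itertools import permutations

orderings = list(permutations(range(4)))  # 24 equally likely rank orders

# Strictly rising or strictly falling across all four readings.
monotonic = [p for p in orderings if list(p) in (sorted(p), sorted(p, reverse=True))]

# First reading is the highest or lowest, so the other three sit on one side of it.
first_extreme = [p for p in orderings if p[0] in (min(p), max(p))]

print(f"monotonic: {len(monotonic)}/{len(orderings)}")
print(f"later three all on one side of the first: {len(first_extreme)}/{len(orderings)}")
```

A strictly rising or falling run appears in 2 of the 24 orderings, roughly one time in twelve, and the later three readings all land on one side of the first in half of them. Neither pattern is rare enough to count as evidence on its own.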

The cadence we suggest in our self-tracking guide — weekly for the first month, then monthly — gets you to four readings inside a month and eight inside a half-year. Take the test on a consistent schedule rather than reactively after a bad symptom day. Reactive testing oversamples your worst readings and biases the series downward in a way that has nothing to do with your visual system.

When a trend is worth bringing to a clinician

The honest version of this is more cautious than self-tracking communities sometimes want to hear, because the test isn't a diagnostic instrument and a curve isn't a clinical finding.

A working rubric, framed against the established noise floors:

Trend flat or trending up across many sessions: probably no immediate concern.
A single isolated low reading: wait. Take the next one in your normal cadence. Don't make decisions on one dot.
A trend of three or more consecutive readings clearly below your earlier baseline (roughly a 0.30 log unit drop sustained, not a single 0.20 wobble): worth mentioning at your next routine eye exam, with the data in hand. A 0.30 log unit shift is the smallest change usually considered clinically meaningful for Pelli-Robson;² your home test sits in a noisier regime, so a sustained shift of that size across multiple sessions is the kind of thing a clinician can actually act on.
A sudden step change you can tie to a specific event (a head injury, a new medication, an infection): worth investigating the trigger with a clinician on the timescale the trigger warrants. A concussion-coincident step change wants prompt attention; a step change after a routine medication start probably wants a pharmacist conversation.
Any new visual symptoms — distortion, sudden blurring, flashes, large floaters, peripheral loss, or new pain — regardless of what the curve says: see an eye doctor, ideally the same day for the more urgent symptoms in that list. The curve is a tracking tool, not a triage tool.

The structure of this list is intentional: 0.20 log units is "worth noticing the next reading"; 0.30 sustained across three readings is "worth bringing into the next conversation"; new symptoms always win over a clean-looking curve.
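The "0.30 sustained across three readings" line of the rubric can be written down as a small check. This is a sketch under stated assumptions: the function name `sustained_drop`, the choice to average the first four readings as the baseline, and the sample series are all hypothetical. The output is a prompt for a conversation, not a clinical finding.

```python
def sustained_drop(readings, baseline_n=4, run=3, drop=0.30):
    """True if the last `run` readings all sit at least `drop` below the early baseline."""
    if len(readings) < baseline_n + run:
        return False  # not enough data to separate a baseline from a recent run
    baseline = sum(readings[:baseline_n]) / baseline_n
    return all(baseline - r >= drop for r in readings[-run:])

# Hypothetical series in log units.
stable = [1.65, 1.70, 1.60, 1.65, 1.55, 1.70, 1.45, 1.65]
dropped = [1.65, 1.70, 1.60, 1.65, 1.60, 1.30, 1.32, 1.28]

print(sustained_drop(stable))
print(sustained_drop(dropped))
```

The first series contains one isolated low reading and is not flagged; the second ends with three consecutive readings more than 0.30 below the early average and is. New symptoms still override whatever this returns.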

What it can't tell you

A few honest limits, gathered into one place.

A longitudinal record doesn't identify the cause of a drop. Reduced contrast sensitivity is consistent with refractive error, dry eye, fatigue, age-related changes, cataract, early glaucoma, MS-related changes, post-concussion changes, side effects of certain medications, and other things. A clinician examining your eye can disambiguate; a graph cannot.

A trend cannot predict the future. A flat line across a year is reassurance about the year you measured, not a guarantee about the year you haven't.

One person's curve is not comparable to another person's curve. The absolute values depend on your screen, your viewing distance, your room lighting, your age, your habitual correction, and your particular wiring.⁵ These are within-person trends. Don't compare your number to a friend's; compare your number to your own past number.

The test is a screening and tracking measurement, not a diagnostic one. The graph is one input to a conversation with a clinician — not a substitute for one.

Practical setup

Three small habits make the difference between a usable record and a noisy one.

Use the share-link generator on the results page to save each session with its date. The URL itself encodes the result, so a list of share links in a notes app is your dataset; you don't need a spreadsheet unless you want one.

Eyeball the curve every few sessions. Sketch the points on graph paper if you find that easier than reading a list of numbers. The shape is what matters — flat, trending, step, sawtooth — not the absolute values.

Don't make medication, supplement, or lifestyle decisions on a contrast sensitivity reading alone. The test is the start of a conversation with your eye doctor, not a substitute for the exam they'd do in the room.

The smallest version of this

Take the test. Schedule the next one — first Sunday of next month is a good default. Don't make the same Sunday-morning decision twice. The series you build over six months will tell you something a single reading can't.

Take the test now. Come back next month. The graph is yours.

¹ Pelli DG, Robson JG, Wilkins AJ. The design of a new letter chart for measuring contrast sensitivity. Clin Vis Sci. 1988;2(3):187–199. The Pelli-Robson letter chart, the most widely used clinical contrast-sensitivity test. (Published in Clinical Vision Sciences, which is not indexed in PubMed and carries no registered DOI, so no stable external link is available.)
² Elliott DB, Sanderson K, Conkey A. The reliability of the Pelli-Robson contrast sensitivity chart. Ophthalmic Physiol Opt. 1990;10(1):21–24. The actual source of the repeatability figures a home tracking record needs to be read against: Pelli-Robson scores are repeatable to within about ±0.15 log units, so a change of roughly ±0.30 log units is the smallest generally treated as clinically meaningful rather than test-to-test noise. Often mis-cited to the 1988 design paper.
³ Wichmann FA, Hill NJ. The psychometric function: I. Fitting, sampling, and goodness of fit. Percept Psychophys. 2001;63(8):1293–1313. The reference treatment of why finite-trial threshold estimates carry irreducible uncertainty: the noise floor that explains why even a perfectly controlled re-test produces a slightly different number.
⁴ Owsley C. Contrast sensitivity. Ophthalmol Clin North Am. 2003;16(2):171–177. Clinical review of contrast sensitivity in adults, covering the age-related trajectory that informs what a slow downward trend across years can plausibly mean even in healthy eyes.
⁵ Mäntyjärvi M, Laitinen T. Normal values for the Pelli-Robson contrast sensitivity test. J Cataract Refract Surg. 2001;27(2):261–266. Age-stratified normative Pelli-Robson values, a reminder that a "typical" baseline shifts with age, though population norms are no substitute for your own past readings.
