How We Test Cleaner Apps: Our Methodology

Cleanor Labs evaluates cleaner apps against five things that actually matter: whether processing stays on-device, how accurately the app detects duplicates and similar photos, whether deletions are recoverable, how much real storage gets freed, and whether the pricing is honest. We test on real iPhones with real libraries, and we report numbers, not adjectives.

We build a cleaner, so we hold ourselves to the same bar. This page documents how we judge apps, including our own, so you can decide whether our conclusions are worth anything.

TL;DR

We score five criteria: on-device privacy, detection accuracy, recoverability, real GB freed, and pricing transparency.
Tests run on real devices with mixed photo libraries, not synthetic best-case sets.
"Real GB freed" is measured after emptying Recently Deleted, not just after deletion.
We treat gimmicks (fake urgency, one-tap mass-delete, inflated "junk" numbers) as marks against an app.
We disclose that we make Cleanor, and we apply the same criteria to ourselves.

What do we actually measure?

Five criteria, weighted toward trust and accuracy because those are where cleaners most often fail users:

On-device privacy. Does analysis happen on the phone, or is the library uploaded to a server? We check stated behavior, permissions requested, and whether claims match the privacy policy.
Detection accuracy. How well does the app find true duplicates and genuinely similar shots, without flagging photos that are merely related? We look at both misses (false negatives) and over-grabs (false positives).
Recoverability. Do deletions route through Recently Deleted so they stay recoverable for ~30 days, and does the app make that clear? An app that permanently deletes without warning loses points.
Real GB freed. Not the number the app advertises, but the storage actually reclaimed after Recently Deleted is emptied.
Pricing transparency. Is the price stated plainly? Are paywalls free of dark patterns and fake countdowns?

How do we test detection accuracy?

We use mixed libraries that resemble real phones: burst shots, near-duplicates from slightly different angles, edited versions of the same image, screenshots, and unrelated photos that happen to look alike. Synthetic libraries with obvious exact copies make every app look good, so we avoid them.

Then we look at two failure modes:

False negatives: real duplicates the app missed. These leave storage on the table.
False positives: distinct photos the app grouped as duplicates. These are more dangerous, because a hasty user could delete a keeper.
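
As a rough illustration of how the two failure modes can be counted, here is a minimal Python sketch. It assumes a hand-labeled ground truth for the test library and compares it pair by pair with what an app grouped. The function names, photo IDs, and the pairwise precision/recall framing are our assumptions for this example, not a description of any particular app's internals.

```python
from itertools import combinations


def duplicate_pairs(groups):
    """Expand groups of photo IDs into a set of unordered pairs."""
    pairs = set()
    for group in groups:
        # Sort so each pair has one canonical order, e.g. ("IMG_1", "IMG_2").
        pairs.update(combinations(sorted(group), 2))
    return pairs


def detection_errors(truth_groups, app_groups):
    """Compare an app's groupings with hand-labeled ground truth."""
    truth = duplicate_pairs(truth_groups)
    found = duplicate_pairs(app_groups)
    hits = truth & found
    return {
        "false_negatives": truth - found,  # real duplicates the app missed
        "false_positives": found - truth,  # distinct photos grouped together
        "precision": len(hits) / len(found) if found else 1.0,
        "recall": len(hits) / len(truth) if truth else 1.0,
    }


# Hypothetical library: the IDs and groupings are made up for illustration.
truth_groups = [{"IMG_1", "IMG_2", "IMG_3"}, {"IMG_7", "IMG_8"}]
app_groups = [{"IMG_1", "IMG_2"}, {"IMG_7", "IMG_8", "IMG_9"}]

report = detection_errors(truth_groups, app_groups)
print(report["false_negatives"])  # pairs involving IMG_3 that were missed
print(report["false_positives"])  # pairs that wrongly pull in IMG_9
```

In this made-up case the app misses IMG_3 (a false negative that costs storage) and wrongly pulls IMG_9 into a group (a false positive that could cost a keeper), so both precision and recall come out at 0.5.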

Good detection is conservative where it should be. We would rather an app miss a borderline pair than confidently group two photos that are not the same moment. For context on why libraries get cluttered in the first place, see iPhone Storage Full but Nothing to Delete: What's Actually Using It.

How do we measure "real GB freed"?

This is where a lot of marketing falls apart. Deleting photos moves them to Recently Deleted, where they still occupy storage. The headline number an app shows you right after a cleanup is the potential space, not the reclaimed space.

So we record device storage before the cleanup, perform the deletions, empty Recently Deleted, and measure again. The difference is the real number. We report that, and we note when an app's in-app figure overstates what you will actually get back until the album is emptied. The detail on this two-stage behavior lives in our piece on recoverability, and our overall stance is in The Truth About Cleaner Apps: Are They Safe to Use.
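
The arithmetic behind that procedure can be sketched in a few lines of Python. This is an illustration only: the class and field names are hypothetical, the readings are invented, and we assume decimal gigabytes (1 GB = 10^9 bytes) and storage figures read manually from the device.

```python
from dataclasses import dataclass

GB = 1000 ** 3  # assumption: decimal gigabytes


@dataclass
class StorageReadings:
    used_before: int        # bytes used before the cleanup
    used_after_delete: int  # bytes used right after the app's deletions
    used_after_empty: int   # bytes used after emptying Recently Deleted
    app_claimed_freed: int  # bytes the app says it freed


def summarize(r: StorageReadings) -> dict:
    """Return the measured figures in GB, rounded for reporting."""
    freed_immediately = r.used_before - r.used_after_delete
    real_freed = r.used_before - r.used_after_empty
    return {
        "freed_immediately_gb": round(freed_immediately / GB, 2),
        "real_freed_gb": round(real_freed / GB, 2),
        # Space still held by Recently Deleted until the album is emptied.
        "pending_gb": round((real_freed - freed_immediately) / GB, 2),
        # Positive when the in-app figure exceeds what was measured.
        "claim_gap_gb": round((r.app_claimed_freed - real_freed) / GB, 2),
    }


# Invented readings, for illustration only.
readings = StorageReadings(
    used_before=100 * GB,
    used_after_delete=99_500_000_000,
    used_after_empty=92 * GB,
    app_claimed_freed=9 * GB,
)
summary = summarize(readings)
print(summary)
```

With these invented readings, only 0.5 GB is reclaimed immediately, 8 GB is reclaimed once Recently Deleted is emptied, and the app's claimed 9 GB overstates the measured result by 1 GB.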

What counts as a gimmick?

We mark down patterns that exist to impress or pressure rather than help:

Inflated "junk" or "RAM" claims that do not correspond to recoverable storage.
Fake urgency: countdown timers on discounts, "your phone is at risk" scare copy.
One-tap mass-delete with no meaningful review step, which trades your safety for a fast demo.
Permission over-asking unrelated to cleaning, covered in our look at what permissions a cleaner really needs.
Buried or absent pricing, which usually means another monetization model you cannot see.

A tool earns trust by being boring in the right places: clear numbers, explicit confirmations, no theatrics.

What this methodology cannot do

We test stated behavior, observable results, and permissions. We cannot fully audit an app's servers or its private data-sharing contracts, so our privacy findings reflect strong signals (on-device claims, permission scope, policy language) rather than a forensic guarantee. We say so when a verdict rests on the developer's word.

We also will not publish fabricated comparison stats. Where we lack a reliable measured number for a competitor, we describe behavior qualitatively instead of inventing a figure. And our own involvement is a real conflict of interest. We disclose that we build Cleanor and apply the identical criteria to it, and you should weight our conclusions accordingly. Our full editorial standards live on the methodology page.

FAQ

Do you test your own app by the same standards?

Yes. We apply all five criteria to Cleanor and disclose that we make it. The point of publishing the methodology is so the standard is fixed in advance and applied equally, including to us.

Why measure GB freed after emptying Recently Deleted?

Because deleted photos sit in Recently Deleted for about 30 days and still use storage until that album is emptied. Measuring afterward gives the real reclaimed space, not the optimistic number shown right after deletion.

Do you accept payment to rank an app higher?

No. Rankings follow the measured criteria. Where we cannot measure something reliably, we describe it qualitatively rather than fabricating a number, and we never invent competitor stats.

Want to see how a cleaner built to these standards performs? Try Cleanor for iPhone and see how to free up iPhone space with numbers you can verify yourself.
