Absolute scale corrupts absolutely

The Internet has gotten too big.

Growing up, I, like many computery people of my generation, was an idealist.
I believed that better, faster communication would be an unmitigated
improvement to society. “World peace through better communication,” I said
to an older co-worker, once, as the millennium was coming to an end. “If
people could just understand each other’s points of view, there would be no
reason for them to fight. Government propaganda will never work if citizens
of two warring countries can just talk to each other and realize that the
other side is human, just like them, and teach each other what’s really
true.”

[Wired.com has an excellent article about this sort of belief system.]

“You have a lot to learn about the world,” he said.

Or maybe he said, “That’s the most naive thing I’ve ever heard in my entire
life.” I can’t remember exactly. Either or both would have been appropriate,
as it turns out.

What actually happened

There’s a pattern that I don’t see talked about much, but which seems to
apply in all sorts of systems, big and small, theoretical and practical,
mathematical and physical.

The pattern is: the cheaper interactions become, the more intensely a
system is corrupted. The faster interactions become, the faster the
corruption spreads.

What is “corruption”? I don’t know the right technical definition, but you
know it when you see it. In a living system, it’s a virus, or a cancer, or a
predator. In a computer operating system, it’s malware. In password
authentication, it’s phishing attacks. In politics, it’s lobbyists and
grift. In finance, it’s high-frequency trading and credit default swaps. In
Twitter, it’s propaganda bots. In earth’s orbit, it’s space debris and Kessler syndrome.

On the Internet, it’s botnets and DDoS attacks.

What do all these things have in common? That they didn’t happen when the
system was small and expensive. The system was just as vulnerable back then
– usually even more so – but it wasn’t worth it. The corruption didn’t
corrupt. The attacks didn’t evolve.

What do I mean by evolve? Again, I don’t know what technical definition to
use. The evolutionary process of bacteria isn’t the same as the evolutionary
process of a spam or phishing attack, or malware, or foreign-sponsored
Twitter propaganda, or space junk in low-earth orbit. Biological infections
evolve, we think, by random mutation and imperfect natural selection. But
malware evolves when people redesign it. And space junk can turn violent
because of accidental collisions; not natural, and yet the exact opposite of
careful human design.

Whatever the cause, the corruption happens in all those cases. Intelligent
human attackers are only one way to create a new corruption, but they’re a
fast, persistent one. The more humans you connect, the more kinds
of corruption they will generate.

Most humans aren’t trying to create corruption. But it doesn’t matter, if
a rare corruption can spread quickly. A larger, faster, standardized network
lets the same attack offer bigger benefits to the attacker, without
increasing cost.

Diversity

One of the natural defenses against corruption is diversity. There’s this
problem right now where supposedly the
most common strain of bananas is dying out
because they are all
genetically identical, so the wrong fungus at the right time can kill them
all. One way to limit the damage would be to grow, say, 10 kinds of bananas; then
when there’s a banana plague, it’ll only kill, say, 10% of your crop, which
you can replace over the next few years.
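The arithmetic is simple, but it’s worth seeing how sharply monoculture amplifies a single targeted pathogen. A toy sketch (illustrative numbers, not real banana epidemiology) where the plague, like most attackers, targets whatever strain is most common:

```python
def expected_loss(shares):
    """Loss when a strain-specific plague hits the most common strain.

    `shares` lists each strain's fraction of the total crop. A fungus
    (or attacker) that gets to pick its target wipes out the largest
    single share -- the worst case for a monoculture.
    """
    return max(shares)

print(expected_loss([1.0]))            # monoculture: total loss
print(expected_loss([0.1] * 10))       # ten equal strains: lose 10%
print(expected_loss([0.5, 0.3, 0.2]))  # skewed mix: still lose half
```

Note that the skewed mix does badly too: diversity only helps if no single variant dominates, which is exactly the property modern engineering optimizes away.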

That might work okay for bananas, but for human diseases, you wouldn’t want
to be one of the unlucky 10%. For computer viruses, maybe we can have 10
operating systems, but you still don’t want to be the unlucky one, and you
also don’t want to be stuck with the 10th best operating system or the 10th
best browser. Diversity
is how nature defends against corruption, but not how human engineers do.

In fact, a major goal of modern engineering is to destroy diversity. As Deming would say, reduce variation. Find
the “best” solution, then deploy it consistently everywhere, and keep
improving it.

When we read about adversarial attacks on computer vision, why are they
worse than Magic Eye drawings or other human optical illusions? Because they
can be perfectly targeted. An
optical illusion street sign would only fool a subset of humans, only some
of the time, because each of our neural nets is configured differently from
everyone else’s. But every neural net in every car of a particular brand and
model will be susceptible to exactly the same illusion. You can take a
perfect copy of the AI, bring it into your lab, and design a perfect
illusion that fools it. Subtracting natural diversity has turned a boring
visual attack into a disproportionately effective one.

The same attacks work against a search engine or an email spam filter. If
you get a copy of the algorithm, or even query it quickly and transparently
enough in a black-box test, you can design a message to defeat it. That’s
the SEO industry and the email newsletter industry, in a nutshell. It’s why
companies don’t want to tell you which clause of their unevenly enforced
terms of service you violated: if you knew, you’d fine-tune your behaviour
to be evil, but not quite evil enough to trip over the line.

It’s why human moderators still work better than computer moderators:
because humans make unpredictable mistakes. It’s harder to optimize an
attack against rules that won’t stay constant.

…but back to the Internet

I hope you didn’t think I was going to tell you how to fix Twitter and
Facebook and U.S. politics. The truth is, I have no idea at all. I just see
that the patterns of corruption are the same. Nobody bothered to corrupt
Twitter and Facebook until they got influential enough to matter, and then
everybody bothered to corrupt them, and we have no defense. Start
filtering out bots – which of course you must do – and people will build
better bots, just like they did with email spam and auto-generated web
content and CAPTCHA solvers. You’re not fighting against AI, you’re fighting
against I, and the I is highly incentivized by lots and lots of money and
power.

But, ok, wait. I don’t know how to fix giant social networks. But I do know a
general workaround to this whole class of problem: slow things down. Choose
carefully who you interact with. Interact with fewer people. Make sure you
are certain which people they are.

If that sounds like some religions’ advice about sex, it’s probably not a
coincidence. It’s also why you shouldn’t allow foreigners to buy political
ads in your country. And why local newspapers are better than national ones.
And why “free trade” goes so bad, so often, even though it’s also often good. And
why doctors need to wash their hands a lot. (Hospital staff are like the
Internet of Bacteria.)

Unfortunately, this general workaround translates into “smash Facebook” or
“stop letting strangers interact on Twitter,” which is not very effective
because a) it’s not gonna happen, and b) it would destroy lots of useful
interactions. So like I said, I’ve got nothing for you there. Sorry. Big
networks are hard.

But Avery, the Internet, you said

Oh right. Me and my cofounders at Tailscale.io have been thinking about a
particular formulation of this problem. Let’s forget about Internet Scale
problems (like giant social networks) for a moment. The thing is,
only very few problems are Internet Scale. That’s what makes them
newsworthy. I hate to be the bearer of bad news, but chances are, your
problems are not Internet Scale.

Why is it so hard to launch a simple web service for, say, just your
customers or employees? Why did the Equifax breach happen, when obviously
no outsiders at all were supposed to have access to Equifax’s data? How did
the Capital One + AWS hack happen, when Capital One clearly has decades of
experience with not leaking your data all over the place?

I’ll claim it again… because the Internet is too big.

Equifax’s data was reachable from the Internet even though it should have
only been accessible to a few support employees. Capital One’s data surely
used to be behind layers and layers of misconfigured firewalls, unhelpful
proxy servers, and maybe even pre-TCP/IP legacy mainframe protocols, but then they moved it
to AWS, eliminating that diversity and those ad-hoc layers of protection.
Nobody can say modernizing their systems was the wrong choice, and yet
the result was the same result we always get when we lose diversity.

AWS is bananas, and AWS permission bug exploits are banana fungus.

Attackers perfect their attack once, try it everywhere, scale it like crazy.

Back in the 1990s, I worked with super dumb database apps running on LANs.
They stored their passwords in plaintext, in files readable by everyone. But
there was never a (digital) attack on the scale of 2019 Capital One. Why?

Because… there was no Internet. Well, there was, but we weren’t on it.
Employees, with a bit of tech skill, could easily attack the database, and
surely some got into some white collar crime. And you occasionally heard
stories of kids “hacking into” the school’s grading system and giving
themselves an A. I even made fake accounts on a BBS or two. But
random people in foreign countries didn’t hack into your database. And the
kids didn’t give A’s to millions of other kids in all the other schools. It
wasn’t a thing. Each corruption was contained.

Here’s what we’ve lost sight of, in a world where everything is Internet
scale: most interactions should not be Internet scale. Most instances of
most programs should be restricted to a small set of obviously trusted
people. All those people, in all those foreign countries, should not be
invited to read Equifax’s PII database in Argentina, no matter how stupid
the password was. They shouldn’t even be able to connect to the database.
They shouldn’t be able to see that it exists.

It shouldn’t, in short, be on the Internet.

On the other hand, properly authorized users, who are on the Internet, would
like to be able to reach it from anywhere. Because requiring all the
employees to come to an office location to do their jobs (“physical
security”) seems kinda obsolete.

That leaves us with a conundrum, doesn’t it?

Wouldn’t it be nice though? If you could have servers, like you
did in the 1990s, with the same simple architectures as you used in the
1990s, and the same sloppy security policies (er, developer freedom) as
you had in
the 1990s, but somehow reach them from anywhere? Like… a network, but not
the Internet. One that isn’t reachable from the Internet, or even addressable on
the Internet. One that uses the Internet as a substrate, but not as a banana.
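One concrete way to get something like that today (a sketch of the general shape, not a description of how Tailscale works): a WireGuard-style overlay, where peers exchange encrypted UDP over the public Internet, but the service itself only binds to private overlay addresses. The keys, names, and 100.64.0.0/10 addresses below are all illustrative placeholders.

```
# /etc/wireguard/wg0.conf on the database server (illustrative)
[Interface]
PrivateKey = <server's private key>   # never leaves this machine
Address    = 100.64.0.1/32            # overlay address, not routable from the Internet
ListenPort = 51820

[Peer]
# One block per authorized employee; nobody else can even complete a handshake.
PublicKey  = <employee's public key>
AllowedIPs = 100.64.0.2/32
```

The database then listens on 100.64.0.1 only. To the rest of the Internet there is nothing to connect to, and barely anything to port-scan: WireGuard sends no reply at all to packets that don’t authenticate.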

That’s what we’re working on.

Literary Afterthoughts

I’m certainly not the first to bring up all this. Various sci-fi addresses
the problem of system corruption due to excess connectivity. I liked A Fire Upon the
Deep
by Vernor Vinge, where some parts of the universe have much better
connectivity than others and it doesn’t go well at all. There’s also the Rifters Trilogy
by Peter Watts, in which the Internet of their
time is nearly unusable because it’s cluttered with machine-generated garbage.

Still, I’d be interested in hearing about any “real science” on the general
topic of systems corruption at large scales with higher connectivity. Is
there math for this? Can we predict the point at which it all falls apart?
Does this define an upper limit on the potential for hyperintelligence?
Will this prevent the technological singularity?

Logistical note

I’m normally an East Coast person, but I’ll be visiting the San Francisco
Bay area from August 26-30 to catch up with friends and talk to people about
Tailscale, the horrors of IPv6, etc. Feel free to contact me if you’re
around and would like to meet up.
