Replacing the Domain Name System

Introduction

The domain name system is a hierarchical naming scheme which tries to guarantee that any given name is globally unique and globally resolvable, and can therefore be translated from a name into a single, well-specified value. It is most commonly used to translate human-readable machine names, such as MIT.EDU, into IP addresses, such as 18.72.0.100, although it has other, less-common uses. It accomplishes this by dividing up the space of all possible names into a hierarchy -- a tree. Different branches of this tree may belong to different administrative entities, and each branch can further delegate responsibility. Thus, MIT.EDU can be given responsibility for most of the MIT namespace, but AI.MIT.EDU, LCS.MIT.EDU, and MEDIA.MIT.EDU can each be given responsibility for their own subtrees. Mapping a name to a value, in principle, starts at the root of this tree, following the chain of delegations down a branch, until it reaches a leaf or concludes that no such name can possibly exist.
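
To make the walk down the tree concrete, here is a toy sketch in Python. It is not how a real resolver is implemented: the nested dictionary stands in for the chain of delegations, the function name resolve is made up for illustration, and every address other than the MIT.EDU one quoted above is a hypothetical placeholder from the documentation address range.

```python
# Toy model of hierarchical delegation. Each dictionary is one level of the
# tree; the empty-string key holds the value bound to that node itself.
ROOT = {
    "EDU": {
        "MIT": {
            "": "18.72.0.100",          # MIT.EDU, as quoted in the text
            "AI": {"": "192.0.2.10"},   # hypothetical placeholder address
            "LCS": {"": "192.0.2.20"},  # hypothetical placeholder address
        }
    }
}


def resolve(name, root=ROOT):
    """Walk from the root toward a leaf, one label at a time, right to left."""
    labels = [label for label in name.upper().split(".") if label]
    node = root
    for label in reversed(labels):
        if label not in node:
            return None  # no such delegation, so the name cannot exist
        node = node[label]
    return node.get("")  # None if the node exists but has no value bound
```

Here resolve("AI.MIT.EDU") follows EDU, then MIT, then AI, while resolve("MEDIA.MIT.EDU") stops at MIT because this toy tree has no MEDIA branch.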

Unfortunately, while this is a wonderful idea in a world of academia, nonprofits, and gentlemen's agreements, it's becoming obvious that in the real world of money, greed, bad actors, lawyers, and intellectual property land grabs, it really isn't a very good system at all. Frankly, it's pretty terrible.

What's broken?

Intellectual property. Companies try hard to establish branding in the real world, and are reluctant to give it up on the net. This means that they are highly motivated to grab, and hold, any domain name that seems related to them. As a result, many companies have grabbed every possible domain name (e.g., coke.com, coke.net, coke.org) regardless of the original intent of the different top-level domains. They have also grabbed similar names in other countries' registration systems, and have often very aggressively prosecuted others who appear to have already grabbed names that they desire. Many have also grabbed every possible permutation of every product they sell or might sell in the future -- many large companies have hundreds of domain names, most completely inactive, just to keep someone else from acquiring them.

A recent, very high-profile case concerned etoys.com, which, despite being created years after an artists' collective called etoy.org (note no "s"), attempted to strip etoy.org of its domain name. Current IP-resolution policies at NSI and ICANN aided and abetted this behavior; etoy.org lost its longstanding domain name essentially instantly, and nothing but a frankly amazing PR campaign (and media support) succeeded in changing etoys.com's mind about the issue. (If etoys.com had not relented, etoy.org would still be without its domain name.)

There are any number of other bad examples, most of which never make the press in the first place. In addition, such land grabs lead to very rapid exhaustion of the DNS namespace -- NSI encourages everyone to grab 3 domain names at once (.com, .net, .org), and why not? NSI will make three times as much money registering them. Finally, the land-grab mentality has led to preemptive grabs of just about every single word in the English language, mostly by squatters hoping to sell those names to the highest bidder.

The current situation is encouraging land grabs for two reasons:

In the absence of a trademark dispute, the first comer gets everything -- no matter whether that first comer is buying one name or half of the English language.
In the presence of a trademark dispute, whoever isn't the large company loses instantly and with virtually no chance for a realistic appeal -- regardless of how long they've had the name, whether they had the name in good faith or not, whether or not they've even heard of the company with the beef, and so forth.

So we have a situation in which everyone is suing everyone else, the namespace has been exhausted in a very short amount of time, and nobody is having any fun.

Political chokepoint. The DNS is currently serving as one of the most important political pinch-points on the net because it gives those with an axe to grind a central place to exert political pressure. If what comes out of your domain is sufficiently unpopular with the local government, it can probably arrange to get your domain name mapping yanked. Furthermore, it gives unwarranted authority to domain name registrars to dictate what a name may be. (For example, NSI has for years had a set of undocumented and inconsistent rules about names -- often, they can't be "obscene," in the capricious and unappealable eyes of NSI, but many exceptions appear to have slipped through the cracks. NSI also enforces rather draconian length limits -- on the order of 25 characters or so -- even though the DNS technical specs allow up to 63 characters per label and 255 for a full name. Now that other registrars have finally been allowed to appear, these restrictions may go away -- or we may see a random patchwork of other restrictions, or even a lowest-common-denominator subset of the most restrictive policies of the biggest registrars. Anything's possible.)

Little guys. Despite all this, what happens when you're trying to find the Acme Hardware store just down the street? Unless they're affiliated with a national brand, and unless they have a store locator, you'll never figure out what their domain name is -- there are just too many other Acmes, and no obvious way to pick out the one you wanted. If you actually want to use geographical information to help find a local entity, the situation can be grim, even if the site you're looking for mentions its location in its web pages. A few search engines can sometimes help, but for the average end-user, the domain might as well not exist. Why are search engines not the answer?

Not everything you want to find is a web page! If you want to find the domain someone owns in order to send them mail, you'd better hope that they bothered to create some web pages anyway -- otherwise, search engines never know they're there, and you can't find their domain, even though all you really wanted to do was send them some mail, not read their web pages.
Search engines don't cover the entire web. This is a critical problem, especially for small, poorly-connected sites. (For example, there was a period when AltaVista spidered only the first 40 web pages in a site it encountered!) Things are not very much better now for any of the engines.
Many search engines are many months behind. So a new, poorly-connected site might as well not exist. This problem appears to be getting worse, not better.
Too many dead-ends. By the very nature of the way people use search engines, if a site isn't in the top ten or twenty matches and can't easily be disambiguated by guessing at search terms, it might as well not exist. Metasearch engines, which search many other search engines in parallel, help a little, because each search engine has its own strengths and weaknesses. But they don't help enough -- and besides, very few random users use metasearch engines, have the patience to check more than a page or two into search results, or have the skill to choose search terms well, even if the site itself uses keywords and meta information well.

Anonymity. Sometimes it's desirable to be able to say things anonymously. The authors of the Federalist Papers, back around the time of the US Revolution, published many of their political tracts anonymously. Yet domain names, by virtue of their hierarchical arrangement, are easily traceable back to someone higher in the tree who is, by definition, "responsible" for the delegation farther down the tree -- and who can often be pressured to revoke the delegation, thereby effectively seizing control of the domain name and shutting it down. Thus, it's essentially impossible to protect a domain name from retribution while simultaneously advertising its existence to potential correspondents. (Of course, it's also difficult to hide how packets get routed to and from a domain name -- and hence one's upstream packet provider may also be a point of political pressure -- but it is easier to change packet routing providers than it is to change a domain name, precisely because changing packet routing is not something that end-users see, while changing a domain name requires notifying everyone who knew the old one what the new one is.)

Thus, freedom of expression has been greatly compromised by the existing DNS. Being unable to publish anonymously online means that the online world has deprived US citizens of a right that has been repeatedly upheld by courts in the real world. CFP isn't just about privacy -- it's also about other freedoms, such as freedom of speech.

Anonymous speech about unpopular subjects is thus often limited to getting an account on someone else's system (e.g., Hotmail, etc) who is large enough that you won't be noticed, and important enough that no one will turn off their domain just because of you. And you'd better pick a free service, since otherwise following the money will expose you. Certain solutions have been proposed for the special case of publishing (such as the Freedom Project), but this is only a partial solution.

A modest proposal

So how do we fix these problems? Easy. Throw away the existing domain name system and start over. Okay, maybe not so easy.

The fundamentally-anarchic polity which influenced the early design of the Internet is, alas, fundamentally incompatible with any global namespace. Why is this so? Because global namespaces require enforcement against duplication. So the first principle of a new naming system is to permit duplication of names. This is, after all, how names in the real world operate -- very few people or businesses have globally unique names. Instead, they are disambiguated locally, using several methods (geography, profession [for people] or market segment [for businesses], etc), while remaining globally ambiguous.

If we do this, then the hierarchy that is fundamental to the old DNS -- established precisely so that searches were fast and duplication was impossible -- is no longer strictly necessary. So get rid of it.

Therefore, design a name system which encourages this proliferation of names. Resolution of conflicts becomes important, as well as the principle that some names won't be resolvable. This principle of irresolution is in fact the main privacy protection in such a system. This kind of naming system foundationally must support private name systems. The enforceability of the privacy of a private group of names becomes a central technical question.

Names

We need a name for this new system. Calling anything "New" is a mistake -- the Common Lisp "New Error System" is now over 20 years old. So NewDNS is right out. For lack of a better name, let's call the new proposal Smoosh, since it smooshes all the hierarchy out of the existing DNS. Because this is such a terrible name, it will hopefully inspire someone to come up with a better one.

Brave new world

What will life with Smoosh be like?

Names are no longer unique. This means that, e.g., just a bare URL or machine name might be ambiguous, at least the first time a user types it into a machine.
Names are no longer always resolvable. This means that some names just can't be resolved without some additional information. If that information is itself hard to compute (say, it is cryptographically-based in some way), then we have a domain name which requires a sort of authorization key to find again. This could have interesting implications for privacy or for anonymous speech.
Land grabs are much more difficult. After all, in the real world, we don't see powerful corporations attempting to grab the name Coke at the intersection of every street in the country -- or suing individuals who may happen to be named similarly to a product they sell. In part, this is due to the dramatic expansion of the namespace possible when global uniqueness is not the first principle of the design.
Everyone can register an unlimited number of names for free. After all, they are no longer globally unique. This means that (a) DNS registrars like NSI are out of business, and (b) there is no reason not to let users create whatever names they want, along with local hints to help in their disambiguation when the names are shared with others.
Routing is unaffected. If all we are replacing is the DNS, then when we finally do manage to resolve a name into an IP address, the packets actually flow in the same way they always did -- routers don't care about human names, but only about IP addresses. So the bread and butter of the Internet -- routing packets from one machine to another -- doesn't change at all.
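
As one illustration of the second point above (names that need a sort of authorization key to find again), here is a minimal sketch. The scheme is purely an assumption made for this example and is not part of the proposal: a private name is advertised only under a tag derived from the name and a shared secret, using HMAC-SHA256 from the Python standard library. The names lookup_tag and PrivateDirectory are hypothetical, and the address is a documentation-range placeholder.

```python
import hashlib
import hmac


def lookup_tag(name, key):
    """Derive the tag a private name is advertised under (assumed scheme)."""
    return hmac.new(key, name.encode("utf-8"), hashlib.sha256).hexdigest()


class PrivateDirectory:
    """Stands in for whatever machines happen to hold the advertisements."""

    def __init__(self):
        self._by_tag = {}

    def advertise(self, name, key, address):
        # Only the tag is stored, never the name or the key.
        self._by_tag[lookup_tag(name, key)] = address

    def resolve(self, name, key):
        # Without the right key the computed tag matches nothing.
        return self._by_tag.get(lookup_tag(name, key))


directory = PrivateDirectory()
shared_key = b"k1" * 16  # in practice a random secret handed out privately
directory.advertise("Pamphleteer", shared_key, "192.0.2.7")
```

Someone holding shared_key resolves the name as usual; someone who only knows the word Pamphleteer gets nothing back, which is the "irresolution" the text describes.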

Implementation

How might we implement this rather radical proposal?

Clearly, since we are disambiguating names locally, we expect that computers that are "nearby" each other in some way are probably engaging in a collaborative dialog to determine what a name "means" to a user and in the local neighborhood of machines.

So we can probably plan on a few megabytes of persistent storage in each computer to hold the (necessarily dynamic) representation of local relationships. When the DNS was designed, machines could not spare that kind of storage, and that capacity constraint led to the centralization present in DNS. This constraint is no longer with us, and any successor system should feel free to break with that tradition of paucity. In addition, once a name is resolved, the chances are probably very high that the same name in the future should be resolved the same way -- we should cache the results locally, both to help the local machine and probably to help neighbors which might need to determine the same thing.

The resulting system might look like a patchwork of cached information about relationships between clusters of machines. Within a patch, resolution might be very fast and require very little additional information. Across a patch, something like a negotiation may be required -- followed by some cached information which (a) might tend to merge the patch (or not), and which (b) might look something like a treaty or a trade agreement, implemented at network speeds and maintained as names change and hosts move by the caches managed by the cooperating machines.

We should also plan some extensions to the user interface, since users may have a pretty good idea of how to disambiguate a reference (they may know the geographic location, etc, of the name they just typed in). Exactly what form this takes is a big question and depends a lot on how the rest of the system is implemented.

We also want to give users the ability to manipulate these mappings -- after all, if names are no longer globally unique, every user and every machine should feel free to invent any number of names in any way they want, somehow including resolving additional information that may help to disambiguate the name later. We have now empowered random end-users, effectively, to be able to register any number of domain names, for free.

We might want to ensure that any two identically-named, but different, SmooshNames can be told apart from each other, even if we don't know which one is the one the user wanted. Each one could have a unique 256-bit random bitstring associated with it, for example. (Relying on this, and not IPv4 32-bit addresses, or even IPv6 128-bit addresses, allows a SmooshName to keep its identity even when the information it maps to changes.) So I can change my IP address (perhaps dynamically, every time I connect, via DHCP), but people who already know my SmooshName aren't affected by this.
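
A record along these lines might look like the following sketch. The class and field names are hypothetical, the addresses are documentation-range placeholders, and drawing the bitstring from the operating system's random source via Python's secrets module is just one plausible choice.

```python
import secrets
from dataclasses import dataclass, field


@dataclass
class SmooshName:
    label: str                    # the human-visible, non-unique name
    ident: bytes = field(default_factory=lambda: secrets.token_bytes(32))  # 256 bits
    address: str = ""             # current mapping, free to change

    def same_identity(self, other):
        """Two records are the same SmooshName only if the bitstrings match."""
        return self.ident == other.ident


mine = SmooshName("SallySmith", address="192.0.2.1")
other = SmooshName("SallySmith", address="192.0.2.1")  # a different Sally
before = mine.ident
mine.address = "198.51.100.9"     # e.g. a new DHCP lease; identity is untouched
```

The two records share a label (and, here, even an address) but are still distinguishable, and changing mine.address leaves mine.ident exactly as it was.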

A sample scenario or two

These sample scenarios are necessarily very rough, since the details of how this whole scheme might work have yet to be worked out. (Working them out, and figuring out whether the resulting system is better or worse than what we have now, is, after all, what we're trying to do.) But consider these sketches:

Finding an individual

Sally Smith tells Harry Jones her SmooshName -- let's say it's just SallySmith. (This is like a domain name, but since we're getting rid of the domain name system, we need new kinds of names.) It's just her personal website, and she's just a normal person. In particular, this probably means:

She wasn't the first Sally Smith to want a domain.
She can't come up with a creative, unallocated name that anyone who knows her would ever think to try.
Even if she has some web pages, it's going to be very, very difficult for someone using a search engine to find them, even if they already know a fair bit about her.

In other words, she's just like most of the millions of ordinary people currently out there on the net. We see this problem all the time with lots of other systems that must name large numbers of users -- if you ever got the first 10 names you thought of on AOL, Hotmail, or eBay, you're unusual, either because you were a very early adopter, or because you picked a weird name that few who know you would have guessed a priori.

On the other hand, Sally gave Harry her SmooshName for a reason. Probably they're friends, or have friends or business acquaintances in common, or Harry knows some other disambiguating information about her.

So Harry tells his machine to connect to the SallySmith SmooshName. If this were the old, bad world of the DNS, he'd need more structure in that name -- that's not a top-level domain. And if he had it, his machine would (in essence) start at the DNS root and walk down a hierarchical namespace until it found it. But this is the new world of Smoosh, so instead Harry's machine engages in a series of dialogs with machines it's already talked to in the past. It sends out a limited-range broadcast (which dies out after 5 or 10 hops) asking everyone it knows to (a) spread the broadcast (within that limited diameter) and (b) answer if they've ever heard of SallySmith. Since Harry and Sally share some human connections in common, chances are their respective machines do, too.
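
The limited-range broadcast can be sketched as a breadth-first flood with a hop limit. This is only a simulation over an in-memory graph, not a network protocol; the function name, the shape of the peers and known tables, and the addresses are all assumptions made for the example.

```python
from collections import deque


def flood_query(peers, known, start, label, max_hops=5):
    """Flood a query outward from `start`, stopping after `max_hops` hops.

    peers: dict mapping a machine to the machines it has talked to
    known: dict mapping a machine to the (label, ident, address) records it holds
    Returns a list of (answering_machine, ident, address) tuples.
    """
    seen = {start}
    queue = deque([(start, 0)])
    answers = []
    while queue:
        machine, hops = queue.popleft()
        # (b) answer if this machine has ever heard of the name
        for rec_label, ident, address in known.get(machine, []):
            if rec_label == label:
                answers.append((machine, ident, address))
        if hops == max_hops:
            continue  # the broadcast dies out here
        # (a) spread the broadcast to machines that have not seen it yet
        for neighbor in peers.get(machine, []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, hops + 1))
    return answers


peers = {"harry": ["a"], "a": ["b"], "b": ["c"], "c": []}
known = {
    "b": [("SallySmith", "aa", "192.0.2.1")],
    "c": [("SallySmith", "bb", "192.0.2.2")],
}
```

With max_hops=2 the query from harry reaches b but not c, so only b's record comes back; raising the limit to 3 returns both. The seen set is what keeps the flood from circulating forever in a graph with cycles.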

Harry's machine gets back 5 matches for SallySmith. Four of them are the same SmooshName (because it's got the same 256-bit random number associated with it, say), and one isn't. Chances are, those four that are the same are likely to be the right SallySmith, but only if she's more popular with the set of Harry and Sally's friends than, say, some large business coincidentally called SallySmith that they all do business with. We're assuming that everybody's machine is already connected via high-speed links (should be true enough by the time Smoosh could possibly be deployed), and so this whole process just took about 3 seconds or so. (Besides, if machines are constantly propagating updates, Harry's machine might already have had most or all of the information it needed without making a broadcast at all.)
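
Grouping the answers by identifier and counting them might look like this. It is a sketch under the assumption that each answer is a (machine, identifier, address) tuple; counting answers is the crude popularity measure the paragraph describes, with the same weakness it points out.

```python
from collections import Counter


def rank_candidates(answers):
    """Group answers by SmooshName identifier, most-reported first.

    answers: list of (answering_machine, ident, address) tuples.
    Returns a list of (ident, number_of_answers, address) tuples.
    """
    votes = Counter(ident for _, ident, _ in answers)
    # If machines disagree about an address, the last answer seen wins here.
    address_of = {ident: address for _, ident, address in answers}
    return [(ident, count, address_of[ident]) for ident, count in votes.most_common()]


answers = [
    ("m1", "aa", "192.0.2.1"),
    ("m2", "aa", "192.0.2.1"),
    ("m3", "bb", "192.0.2.2"),
    ("m4", "aa", "192.0.2.1"),
    ("m5", "aa", "192.0.2.1"),
]
ranked = rank_candidates(answers)
```

For the five matches in the scenario, ranked lists the identifier reported four times ahead of the one reported once. The ranking is only a hint for ordering the choices shown to Harry; he still makes the decision.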

Harry's machine presents him with the choices, including a little graphical map of the various connections it was able to divine about who knows SallySmith, and Harry makes a choice. Making this choice binds an IP address to the SmooshName on Harry's machine, which then caches that information and contacts Sally's machine using normal IP routing mechanisms. If Harry discovers that he guessed wrong, he tells his machine, which unbinds that cached IP address and tries another choice. If the choice was right, Harry's machine remembers the information permanently (across boots).
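
The bind, unbind, and remember-across-boots behavior could be sketched as a small cache backed by a file. The class name, method names, and the choice of a JSON file are all assumptions for illustration; a real system would also have to deal with expiry and with Sally's address changing later.

```python
import json
import os
import tempfile


class BindingCache:
    """Tentative bindings live in memory; confirmed ones are written to disk."""

    def __init__(self, path):
        self.path = path
        self.tentative = {}
        self.confirmed = {}
        if os.path.exists(path):
            with open(path) as f:
                self.confirmed = json.load(f)

    def bind(self, label, address):
        self.tentative[label] = address

    def unbind(self, label):
        self.tentative.pop(label, None)

    def confirm(self, label):
        # Raises KeyError if there is no tentative binding to confirm.
        self.confirmed[label] = self.tentative.pop(label)
        with open(self.path, "w") as f:
            json.dump(self.confirmed, f)

    def lookup(self, label):
        if label in self.tentative:
            return self.tentative[label]
        return self.confirmed.get(label)


# Usage: a wrong guess is unbound, a right one survives a "reboot".
path = os.path.join(tempfile.mkdtemp(), "bindings.json")
cache = BindingCache(path)
cache.bind("SallySmith", "192.0.2.2")   # first guess
cache.unbind("SallySmith")              # Harry says it was wrong
cache.bind("SallySmith", "192.0.2.1")   # second guess
cache.confirm("SallySmith")             # Harry says it was right
rebooted = BindingCache(path)           # a fresh start reading the same file
```

After the simulated reboot, rebooted.lookup("SallySmith") still returns the confirmed address, while the discarded first guess has left no trace.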

This process was definitely interactive -- Harry didn't just type in a hostname and be done with it. On the other hand, it was probably faster, easier, and more likely to be successful than if Harry had to try several search engines, name-lookup services, and so forth, to find Sally's domain. Obviously, Sally could have just given Harry an already-disambiguated, old-style domain name -- but chances are she would have had to give him a name like SallySmith254237 because there are so many other people with her name and because all the good words are taken. Doing that would have meant he had to write it down -- and woe be to him if he loses that scrap of paper -- the current DNS makes recovering this information much harder than calling 411 to recover a forgotten (but listed) phone number.

Creating some entries

When Sally wanted to create her SallySmith SmooshName, she simply told one of her computers about it. It propagated this information to most of the machines she's talked to in the past, and continues to do so in the future. Now, it becomes fairly easy for people who are "near" Sally in social connection space to figure out her IP address from her SmooshName. She can create any number of these names. If she chooses names that are particularly likely to have been used already by others (such as just "Sally"), well, she's making life harder for people who might like to use that SmooshName, because they're going to get lots more choices when it comes time to disambiguate. But they'll still be able to find her. And if they know she's got lots of related SmooshNames, they can tell their machines, "Use the Sally SN that's associated with the SallySmith SN you already know." So now Sally can hand out a variety of related names, perhaps one each for her large flock of machines, and people can still find individual machines with very few choices having to be made.
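
The "use the Sally SN that's associated with the SallySmith SN you already know" step might be as simple as a filter, if we assume that each advertised record lists the identifiers of the other SmooshNames its owner has declared related. That related field, the function name, and the sample identifiers are hypothetical, and nothing in this sketch checks that such a claim is truthful.

```python
def pick_related(candidates, known_ident):
    """Keep only the candidates that claim a link to an identifier we trust.

    candidates: list of dicts with 'ident', 'address', and 'related'
                (a set of identifiers the owner declared as related).
    """
    return [c for c in candidates if known_ident in c["related"]]


# Three different machines all answer to the common name "Sally".
sally_candidates = [
    {"ident": "s1", "address": "192.0.2.5", "related": {"ss1"}},
    {"ident": "s2", "address": "192.0.2.6", "related": set()},
    {"ident": "s3", "address": "192.0.2.7", "related": {"zz9"}},
]
matches = pick_related(sally_candidates, "ss1")  # "ss1" is the known SallySmith
```

Only the first candidate survives the filter, so Harry's machine has one choice to offer instead of three.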

Anonymous speech

Joe wants to hand out anonymous leaflets. He can do this on the street corner, but life is more difficult online. He doesn't want to use Blacknet or otherwise spam the world, but he wants to make sure that his pamphlets are nonetheless available for people who come looking for them -- without necessarily tying his pamphlets to a particular sponsoring organization.

In the old, DNS world, this would be tricky. He might be able to use something like Freenet, but only if he doesn't care about being able to later talk with people who read his stuff (Freenet is only a publishing medium). And simply putting up his web pages might get him in trouble. After all, if AOL doesn't like them, they'll remove them. And if he puts his pages up at his company's web pages, (a) they might disapprove and fire him, and (b) the very fact that Joe's domain name is in the authority hierarchy of his employer makes an implicit statement that the employer has something to do with the webpages. This is not what Joe wants, and also strips him of a lot of his potential anonymity.

So instead, Joe makes a SmooshName called Pamphleteer. (He could have chosen Anonymous, but that's likely to be very common and just makes life difficult for others who want to disambiguate the name, by giving them too many choices.) He tells his machine to advertise the name. Now, anyone who is "nearby" in social space can find that machine. The name is no longer associated with any particular company or ISP, so many of the implicit statements that were being made by that name are gone.

Of course, it's hard for Joe to spread his message both anonymously and widely, because there are probably a lot of Pamphleteers, and so if he sends a message to a big newsgroup or mailing list giving people that name, it may take them a while to find him. On the other hand, he has some choices:

He can create a number of SmooshNames and advertise them all, saying that anyone who looks up the union of several of them is likely to find him.
He can enter into agreements with certain well-known sites, like the EFF or EPIC (or the FBI -- depends what he's trying to say) such that anyone who has one of those sites in their social network is more likely to see Joe's site. (Basically, Joe's machine advertises to, say, EFF's machines, and they spread the mapping to anyone whose machines have also talked to EFF's machines. So Joe's SmooshName is more likely to be trivially disambiguated by people who already know about the EFF.)
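
The first option, looking up several names together, amounts to intersecting the results: many machines may answer to any one name, but few will answer to all of them. The sketch below assumes the lookups return sets of addresses and that Joe's address is the same under each name at the time of the lookup; the function name, the extra labels, and the addresses are made up for illustration.

```python
def common_addresses(results_by_label):
    """Return the addresses that appear under every one of the given names.

    results_by_label: dict mapping a SmooshName label to the set of
    addresses returned when that label was looked up.
    """
    result_sets = list(results_by_label.values())
    if not result_sets:
        return set()
    return set.intersection(*result_sets)


lookups = {
    "Pamphleteer": {"192.0.2.1", "192.0.2.2", "192.0.2.3"},
    "CommonSense": {"192.0.2.2", "192.0.2.4"},
    "PlainTruth": {"192.0.2.2", "192.0.2.3"},
}
joe = common_addresses(lookups)
```

Each name alone leaves two or three candidates, but only one address is advertised under all three names.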

Note that, once Joe's IP address is discovered, he's lost some anonymity because where that points can tell you a lot about who Joe might be. On the other hand, (a) Joe could simply keep changing IP addresses [his readers would never know], (b) Joe could have one of a zillion mostly-anonymous [and probably dynamic] addresses on a big hosting service, (c) Joe could be using one (or several) addresses at friends, sponsors, etc, but still not necessarily reveal all of this up-front and permanently by advertising it in his domain name, or (d) the IP address that is looked up could be an explicitly-anonymous hosting service run by someone else -- and if Joe becomes unhappy with them, he just changes which one he uses, again without having to change his SmooshName in any way. SmooshNames, because they don't address routing, thus don't solve all anonymity problems -- they just solve their own piece and make anonymity a little bit easier.

A word about these scenarios

These are all very sketchy. A huge number of implementation and operational details have been omitted, and it's not even clear that even the outlines of this mechanism are the right ones. But hopefully these scenarios have given the barest taste of how this mechanism might work in practice.

Acknowledgments

The fundamental idea here, namely throwing away the current DNS hierarchy in favor of a more anarchic system, is due to personal correspondence with Eric Hughes. While he was not able to spend the time necessary to develop this idea himself for the workshop, nor to attend CFP, he has graciously allowed its development on his behalf. The potential implementation details, the scenarios, and just about everything else besides the basic one-paragraph idea of replacing the DNS with a non-unique naming system, have been generated by the workshop organizer. Objections to the way they've been worked out should be addressed at the workshop, not to Eric -- and fixed at the workshop, if we can.


Lenny Foner

Last modified: Thu Apr 13 18:14:55 EDT 2000