Google 2.0: Why MIT scientists are building a new search engine
The truth is a messy business, but an information revolution is coming. Danny Hillis and Peter Hopkins discuss knowledge, fake news and disruption at NeueHouse in Manhattan.
Peter Hopkins: Among other projects—you're doing lots of stuff—you get involved in some very heady questions about the origins of truth on the internet. And this is where we're getting folks because the work that Danny's describing now in theory ultimately became a venture, right? Metaweb.
Danny Hillis: So that's right. What I really thought is that we need a way of representing the knowledge of the world so that machines can get at it and take advantage of it, and that it should be shared. Everybody should be able to get at it. In some sense, if human knowledge isn't a shared resource, then what is? I mean, what has civilization been doing all these years? So I created a company that built this database called Freebase. It was a free database. And the company basically took any kind of public knowledge that we could get, information about anything, and put it in machine-readable format.
We were kind of creating it with the idea that this is going to be useful to the world. We didn't really have a business model. And we started building it up, and then it became useful to lots of different people, including, in particular, all the search engines. So eventually Google bought it, of course. And then I got Google to agree to keep it open for three years, but they only kept open the part that was already open, and they started building it up. And so now Google has something called the Knowledge Graph, which is the evolution of this. And it probably has about 100 billion different entities. So everybody in this room is in that graph. This building is in that graph.
Peter Hopkins: Yes, I took a screenshot earlier of when you just Googled NeueHouse, and all of these different—
Danny Hillis: That's right. NeueHouse is obviously in the graph. So this event is, and yes. So anything like a person, a place, an event. Anything like that is in this huge knowledge base, and all the relationships between them are. So when you, for instance, print out a Google map, that is rendered from the Knowledge Graph; so the Knowledge Graph knows the bus schedules and it knows the address of the restaurant and the traffic.
Peter Hopkins: It's drawing all this information together around the thing that the searcher cares about.
Danny Hillis: That's right. So the map is, in some sense, just a custom rendering of a piece of the Knowledge Graph for your particular purpose. And by the way, this one doesn't have any ads on it, but the ads also draw on a lot of the Knowledge Graph: what the products are about and, probably, knowledge about you specifically, and so on. So it's gone way beyond public knowledge; it probably has very particular private knowledge about people too.
Peter Hopkins: Now, from Google's perspective it's safe to say that this is a quantum leap in terms of the original basis of its citation-based search model. All of a sudden it is now providing this multidimensional search that is drawing in way more richness.
Danny Hillis: It still does the old kind of search. So right now when you, let's say I put in museums of New York. You know, "museums in New York." Well, it still does the old keyword search of searching for pages that have the word "museum" and the phrase "New York," but it doesn't—if you say "an exhibition in Manhattan" or something, you might have something that's a museum in New York that actually didn't use the word "museum" and "New York" on the page. But the Knowledge Graph knows that Manhattan is in New York, and it knows that exhibitions are in museums, or may know something is a museum even if it doesn't use the word museum in its title.
And so it's actually able to pick that up even though it's not, it doesn't have the keyword. So that will play into the search results that come up. It does a search that's based on the semantics. And, of course, that's very important because that kind of knowledge is completely language independent too. So the same knowledge that informs your search in English also informs somebody's search in Mandarin or Hindi or something like that.
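The difference between keyword matching and graph-based matching described above can be sketched in a few lines of Python. This is only a toy illustration: the entities, page texts, relation names (`LOCATED_IN`, `FOUND_IN`) and both search functions are invented for this example and say nothing about how Google's Knowledge Graph is actually implemented.

```python
# Toy knowledge graph. Every entity, page text and relation below is an
# assumption made for illustration, not data from any real system.
ENTITIES = {
    "moma": {"type": "museum", "located_in": "manhattan"},
    "brooklyn_museum": {"type": "museum", "located_in": "brooklyn"},
    "joes_pizza": {"type": "restaurant", "located_in": "manhattan"},
}
LOCATED_IN = {"manhattan": "new_york", "brooklyn": "new_york"}  # place -> containing place
FOUND_IN = {"exhibition": "museum"}  # concept -> type of venue that hosts it

PAGES = {
    "moma": "Modern art on West 53rd Street, Manhattan.",
    "brooklyn_museum": "The Brooklyn Museum is an art museum in New York City.",
    "joes_pizza": "Pizza by the slice in Manhattan, New York.",
}


def keyword_search(terms):
    """Old-style search: keep pages whose text contains every term."""
    return sorted(
        eid for eid, text in PAGES.items()
        if all(term.lower() in text.lower() for term in terms)
    )


def is_within(place, region):
    """Follow LOCATED_IN links upward until the region is found or the chain ends."""
    while place is not None:
        if place == region:
            return True
        place = LOCATED_IN.get(place)
    return False


def semantic_search(concept, region):
    """Graph-based search: match on entity type and containment, not on page wording."""
    wanted_type = FOUND_IN.get(concept, concept)
    return sorted(
        eid for eid, entity in ENTITIES.items()
        if entity["type"] == wanted_type and is_within(entity["located_in"], region)
    )


print(keyword_search(["museum", "New York"]))      # ['brooklyn_museum']
print(semantic_search("museum", "new_york"))       # ['brooklyn_museum', 'moma']
print(semantic_search("exhibition", "manhattan"))  # ['moma']
```

The keyword search misses the MoMA page because it never uses the words "museum" or "New York", while the graph lookup finds it through its type and location. Because the lookup works on entity identifiers rather than words, the same graph could in principle serve queries in any language once the query terms are mapped to those identifiers.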
So the good news is it's turned out to be really useful. There are these big representations of knowledge. But the bad news is the whole idea of it being this free, open thing that everybody was going to use has actually become really just something that is a competitive advantage of Google, and now other search engines and other companies will make their own I'm sure. Apple is working on it, Amazon, you know. Each of the big companies – IBM, Microsoft. They'll each work on their own database. So the world could go in one of two directions: We could either have this sort of oligarchy of big companies that have giant knowledge bases that they use for proprietary advantage, or it could flip over and say it becomes a public resource, that we could say "We want knowledge to be a public resource. And we want, in particular, knowledge that's tied to who said what," because this is not, it doesn't represent truth, remember! It represents who said stuff and that becomes then a resource for doing things like sorting out what's fake news or deciding what medical treatments, what effects are in the scientific literature, things like that that really don't align very well with commercial goals.
Peter Hopkins: And this is where Underlay comes in. Underlay in many respects is your attempt to kind of reclaim this technology as the public good that you kind of initially envisioned it as.
Danny Hillis: Yes, it's my penance for having sold the other one to Google.
Peter Hopkins: So I'm actually stuck on the screen here. I thought there was a very nice paragraph on the very simple Underlay website, which basically in written terms explains kind of what it's attempting to do. And it says The Underlay aggregates statements and reported observations, along with citations of who made and who published them. For example, it would not contain the bare assertion that "Sudan's population was 39M in 2008", but rather that "Sudan's population was 'provisionally' 39M in 2008, according to the UN's statistics division in 2011, referencing Sudan's national census, as reported by its Central Bureau of Statistics, and as contested by the Southern People's Liberation Movement."
Danny Hillis: And it would do that not in those words, but in a kind of machine-readable form.
Peter Hopkins: Right. So that those could be – and ultimately this version of what you are going at becomes almost a kind of record of all of these observations over time, and then can be tracked. So if we wanted to get to the heart of, let's say, whether in one of these hearings we just watched, somebody said one or the other, we could trace it potentially back to the first recorded incidents.
Danny Hillis: Yes. And if you take a problem like that I would regard that as an application of the Underlay, just like Google Maps and say drawing a map is. But if you take sorting through fake news and recognizing when rumors are getting out of control, in order to do that you really need a very complex representation of who's saying what. So you can kind of trace whether this person said that or this person said that this person said that. Or the New York Times said that, you know, the Drudge Report said that. And so there is something that needs to be built on top of the Underlay that is essentially a network of trust for that purpose. So somebody has to say well, okay, I trust New York Times more than I trust Fox News or vice versa.
Peter Hopkins: And these would be organizations or individuals with some sort of framework of analysis that would leverage the Underlay for interpretative purposes.
Danny Hillis: And it's going to be for different purposes. I mean, take an awful lot of the things that people argue about: is Taiwan a province of China? Well, you know, if you're doing something with the Chinese government you've got to count it as one. If you're doing something with Taiwan you're probably not going to count it. So for some purposes it "is", for some purposes it "isn't". And so what's the truth of that? Well, there isn't exactly a truth. It's, you know, what's the purpose, what's the trust in it, and so on. So I sort of feel like the Underlay is, in some sense, a piece of the plumbing that we need to deal with the fact that the amount of information has become overwhelming, that no human can hold it all in their head. Nobody can be familiar with all the news sources or things like that. And then that lets us build these things on top of it where computers help us be smarter in navigating these networks of trust.
Peter Hopkins: And so you were conceiving of this challenge in the early to mid 2000s. What were the first inklings of an approach that technology could provide to addressing this, and to capturing the chain of custody, if you will, of information?
Danny Hillis: So the idea was to build something that basically agreed on the things that you were talking about, the entities that you were talking about. Let people make statements about the relationships between them, but then have some provenance of who made those statements, so that instead of recording that "the glass is sitting on the table," you record, "Danny said the glass is sitting on the table on such and such a day." And once you have all that information recorded, first of all it lets you record information without worrying too much about whether it's true. It's true that I said that, which is much easier to determine than whether it's true that the glass is actually on the table. But then it also lets you apply your idea of trust afterwards, after you get more information about who I am, or later you find out I'm a liar, or later you find out the glass was someplace else.
Peter Hopkins: You can weigh those previous recordings against it.
Danny Hillis: Exactly. So the idea is that what we really need to do is separate out two things.
We need to separate the record of what different people said and who said it, the provenance of what was said, and then separately have in some sense a network of trust, which is going to be different for different purposes.
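As a rough illustration of that separation, the sketch below keeps a record of who said what apart from a set of trust weights that can differ by purpose. Everything in it is hypothetical: the `Statement` fields, the extra speakers, the dates, the trust numbers and the simple "add up the trust" scoring rule are assumptions for this example, not the Underlay's actual data model.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Statement:
    claim: str     # what was asserted
    source: str    # who asserted it
    said_on: date  # when it was asserted (placeholder dates below)


# The record: who said what, stored without judging whether it is true.
RECORD = [
    Statement("glass is on the table", "Danny", date(2019, 1, 1)),
    Statement("glass is on the table", "Peter", date(2019, 1, 1)),
    Statement("glass is on the shelf", "Alex", date(2019, 1, 2)),
]

# The network of trust lives elsewhere and can differ by purpose or change
# over time. These weights are invented for illustration.
TRUST = {
    "initial": {"Danny": 0.9, "Peter": 0.6, "Alex": 0.5},
    "after_danny_caught_lying": {"Danny": 0.1, "Peter": 0.6, "Alex": 0.9},
}


def support(claim, purpose):
    """Add up the trust placed in every source on record as making the claim."""
    weights = TRUST[purpose]
    return sum(weights.get(s.source, 0.0) for s in RECORD if s.claim == claim)


def best_supported(purpose):
    """Return the claim with the most trusted backing for the given purpose."""
    claims = {s.claim for s in RECORD}
    return max(claims, key=lambda c: support(c, purpose))


print(best_supported("initial"))                   # glass is on the table
print(best_supported("after_danny_caught_lying"))  # glass is on the shelf
```

The record never changes between the two calls; only the trust weights do, which is the point of keeping the two apart. A real system would need something richer than summing weights, for example to handle one source quoting another.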
Ultimately there's lots of kinds of knowledge that I think really are fundamentally part of the public common, the public good. And I hope that those will end up in it, and I think it's not as complicated as copyright law where you're taking the expression of the individual artist and things like that. A fact is a fact. It's not copyrightable, to own truth. If somebody figures out the geographical location of this building, that's just a truth. Nobody owns that. And, really, it's to everybody's advantage to share that.
- In 2005, Danny Hillis co-founded Metaweb, the company that built Freebase, an open knowledge database; Google acquired it in 2010. Freebase formed the foundation of Google's famous Knowledge Graph, which enhances its search engine results and powers Google Assistant and Google Home.
- Hillis is now building The Underlay, a new knowledge database and future search engine app that is meant to serve the common good rather than private enterprise. He calls it his "penance for having sold the other one to Google."
- Powerful collections of machine-readable knowledge are becoming exceedingly important, but most are privatized and serve commercial goals.
- Decentralizing knowledge and making information provenance transparent will be a revolution in the so-called "post-truth age". The Underlay is being developed at MIT by Danny Hillis, SJ Klein, and Travis Rich.
One way to limit clutter is by being mindful of your spending.
- Overbuyers are people who love to buy — they stockpile things as a result. These are individuals who are prone to run out of space in trying to store their stuff and they may even lose track of what — and how much of what — they have.
- One way overbuyers can limit their waste, in both money and space, is by "storing" items at the store and buying them only when they really need them.
- Underbuyers tend to go to extraordinary lengths to not buy things. They save money and do fewer errands; however, they often make do with shabby personal items. When they finally decide to go out and buy a product, they may also end up going without entirely because the item is no longer available.
Tracking project establishes northern Argentina is wintering ground of Swainson's hawks
- Watch these six dots move across the map and be moved yourself: this is a story about coming of age, discovery, hardship, death and survival.
- Each dot is a tag attached to the talon of a Swainson's Hawk. We follow them on their very first migration, from northern California all the way down to Argentina.
- After one year, only one is still alive.
Discovered: destination Argentina
Young Swainson's hawks were found to migrate to northern Argentina
The Buteo swainsoni is a slim, graceful hawk that nests from the Great Plains all the way to northern California.
It feeds mainly on insects, but will also prey on rodents, snakes and birds when raising its young. The young learn to fly about 45 days after hatching but may remain with their parents until fall migration, building up flying skills and fat reserves.
A common sight in summer over the Prairies and the West, Swainson's hawks disappear every autumn. While it was assumed they migrated south, it was long unclear precisely where they went.
A group of researchers that has been studying raptors in northern California for over 40 years has now established exactly where young Swainson's hawks go in winter. The story of their odyssey, summarised in a 30-second clip (scroll down), is both amazing and shocking.
Harnessing the hawks
A Swainson's hawk, with tracking device.
The team harnessed six Swainson's hawks in July, when they were six weeks old and just learning to fly. The clip covers 14 months, until August of the following year – so basically, the first year of flight.
Each harness contains a solar-powered tracker and weighs 20 grams, which represents just 3% of the bird's body weight. To minimise the burden, only females were harnessed: as with most raptors, Swainson's hawk females generally are bigger than males.
The first shock occurs just one month (or about 2.4 seconds) from the start of the clip: the first dot disappears. The first casualty. A fledgling no more than two months old, who never made it further than 20 miles from its nest.
By that time, the remaining five are well on their way, clustering around the U.S.-Mexico border in Texas. Swainson's hawks usually travel at around 40 mph (65 km/h) but can almost double that speed when they're stooping (i.e. diving down, especially when attacking prey).
There's a strong genetic component to migration. As usual, the Germans have a nice single word to summarise this complex concept: Zugunruhe ('tsook-n-roowa'), literally: 'migration unrest' (1). It denotes the seasonal urge of migratory animals – especially birds – to get on their way. Zugunruhe manifests especially as restless behaviour around nightfall. The number of nights on which it occurs is apparently higher if the distance to be travelled is longer.
The birds may have the urge to go south, but genetics doesn't tell them the exact route. They have to find that out by trial and error. Hence the circling about by the specimens in this clip: they're getting a sense of where to find food and which direction to go. Their migratory paths will be refined by experience – if they're lucky enough to survive that long.
Each bird flies solo: their paths often strongly diverge, and if they seem to meet up occasionally, that's just an illusion: even when the dots are close together, they can still be dozens if not hundreds of miles apart.
Panama snack stop
The Central American isthmus is a major bird migration corridor
They generally follow the same route, as it is the path of least resistance: follow mountain ranges, stay over land. Like those of most raptors, Swainson's hawks' migration paths are land-based: not just so they can roost at night, but mainly to benefit from the thermals and updrafts that keep them aloft. That reduces the need to flap their wings, and thus their energy expenditure – even though the trip will take longer that way.
As this clip demonstrates, the land-migration imperative means the Central American isthmus is a hotspot for bird migration. Indeed, Panama and Costa Rica are favourite destinations for bird watchers, when the season's right. A bit to the north, Veracruz in Mexico is another bird migration hotspot.
It's thought most hawks don't eat at all on migration. This clip shows an exception to that rule: on the way back, one bird takes an extended stopover of a couple of weeks in Panama, probably spending its time there foraging for food.
So, when they finally arrive in northern Argentina, after 6 to 8 weeks' migration, the hawks are pretty famished. Until a few decades ago, they fed on locusts. For their own reasons, local farmers have been getting rid of those. The hawks now concentrate on grasshoppers, and basically anything else that's edible.
For first-time visitors, finding what they need is not easy. Three of the five dots go dark. These birds probably died from starvation. But two birds thrive: they roam the region until winter rears its head in South America, and it's time to head back north again, where summer is getting under way.
Both dots make it back across the border, but unfortunately, right at the end of the clip, one of the surviving two birds expires.
Harsh, but not unusual
This old lady is 27 years old, but still nesting.
While a one-in-six survival rate may seem alarmingly harsh, it's not that unusual. First-year mortality for Swainson's Hawks is between 50% and 80%. Disease, starvation, predators and power lines – to name just a few common causes of death – take a heavy toll.
Only 10% to 15% of the young 'uns make it past their third or fourth year into adulthood, but from then on, annual survival rates are much better: around 90%. Adult Swainson's Hawks can expect to live into their low teens. There's one documented example of a female Swainson's Hawk in the wild who was at least 27 years old (and still nesting!)
The Californian population of Swainson's Hawks plummeted by about 90% at the end of the last century but is now recovering well. The monitoring project that produced this clip has been going for about four decades but is seeing its funding dry up. Check them out and consider supporting them (see details below).
Migration trajectory of B95, the 'Moonbird'.
Not all migrating birds shun the ocean. Here's an incredible map of an incredible migration path that's even longer than that of the Swainson's hawks.
In February 1995, a red knot (Calidris canutus rufa) in Tierra del Fuego (southern Argentina) was banded with the tag B95. That particular bird, likely born in 1993, was recaptured at least three times and resighted as recently as May 2014, in the Canadian Arctic.
B95 is more commonly known as 'Moonbird', because the length of its annual migration (approx. 20,000 miles; 32,000 km) combined with its extreme longevity (if still alive, it's 25-26 years old now) means its total lifetime flight exceeds the distance from the Earth to the Moon.
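The arithmetic behind the nickname is easy to check. The sketch below uses the annual figure quoted above and an average Earth-Moon distance of about 238,855 miles (384,400 km), a figure that is not in the original text. It also assumes, as a simplification, that the bird flew the full route every year of its life.

```python
ANNUAL_MILES = 20_000       # approximate yearly migration, as quoted above
EARTH_MOON_MILES = 238_855  # average Earth-Moon distance (assumption, not from the article)


def lifetime_miles(years):
    """Total distance flown, assuming one full migration cycle per year."""
    return years * ANNUAL_MILES


# Years of migration needed to cover one Earth-Moon distance.
years_to_reach_moon = EARTH_MOON_MILES / ANNUAL_MILES
print(round(years_to_reach_moon, 1))  # 11.9

# Born around 1993 and resighted in May 2014, B95 was about 21 at that sighting.
print(lifetime_miles(21))                     # 420000
print(lifetime_miles(21) > EARTH_MOON_MILES)  # True
```

On these assumptions the bird passed the Earth-Moon distance around its twelfth year, well before its last confirmed sighting.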
As many other shorebirds do, the red knot takes the Atlantic Flyway, hugging the coastline and crossing the ocean to South America.
B95 has become the poster bird of conservationists in both North and South America. A book titled Moonbird: A Year on the Wind with the Great Survivor B95 (2012) received numerous awards; B95 has a statue in Mispillion Harbor on Delaware Bay; and the City of Rio Grande on Tierra del Fuego has proclaimed B95 its natural ambassador.
Perhaps one day the nameless Swainson's Hawks in this clip, fallen in service of their ancestral instincts – against the odds of increasing human interference – will receive a similar honour.
Strange Maps #965
Got a strange map? Let me know at email@example.com.
(1) 'Zug' is a wonderfully polyvalent German word. It can mean: a train, a chess move, a characteristic, a stroke, a draft (of a plan), a gulp (of air), a drag (from a cigarette), a swig (from a bottle), and more.
SMARTER FASTER trademarks owned by The Big Think, Inc. All rights reserved.