Stephen Wolfram is a distinguished scientist, inventor, author, and business leader. Born in London in 1959, Wolfram was educated at Eton, Oxford, and Caltech. He published his first scientific paper at the age of 15, and had received his PhD in theoretical physics from Caltech by the age of 20. Having started to use computers in 1973, Wolfram rapidly became a leader in the emerging field of scientific computing, and in 1979 he began the construction of SMP—the first modern computer algebra system—which he released commercially in 1981. In recognition of his early work in physics and computing, Wolfram became in 1981 the youngest recipient of a MacArthur Prize Fellowship.
That same year, Wolfram set out on an ambitious new direction in science aimed at understanding the origins of complexity in nature. Through the mid-1980s, Wolfram continued this work, discovering a number of fundamental connections between computation and nature, and inventing such concepts as computational irreducibility. Following his scientific work on complex systems research, in 1986 Wolfram founded the first research center and the first journal in the field, "Complex Systems."
In 1987, Wolfram launched Wolfram Research, Inc., which soon distinguished itself as a premier software company with the release of the first version of "Mathematica." A major advance in computing, "Mathematica" is a computational software program used in science, mathematics, and engineering.
By the mid-1990s his discoveries led him to develop a fundamentally new conceptual framework, which he then spent the remainder of the 1990s applying not only to new kinds of questions, but also to many existing foundational problems in physics, biology, computer science, mathematics, and several other fields. And after more than ten years of highly concentrated work, Wolfram finally described his achievements in his 1200-page book "A New Kind of Science."
Building on these previous projects, Wolfram in May 2009 launched Wolfram|Alpha—an ambitious, long-term project to make as much of the world's knowledge as possible computable, and accessible to everyone.
Question: How does Wolfram Alpha work?
Stephen Wolfram: Well first thing is that we’ve collected a huge amount of data and we’ve curated this data, so it’s data where we’ve gone typically to sort of primary sources of data and made arrangements to get either the static data or a feed of data that comes in every minute, every second, every hour, whatever and we’ve organized this data so that it is sort of set up to be clean and computable, so that is sort of the first component is all this data that exists in the system, whether it’s about chemicals or countries or foods or nuclear isotopes or whatever else. Or financial data about companies or whatever, so all this data.
Then sort of the second piece is given this data how do we figure things out? How do we compute things? It’s like you know if you ask a suitable scientist you know can you figure out for me you know something like where will the sun be at a particular time of day at a particular place on the earth. You know if they can do their physics correctly they’ll eventually be able to figure out the answer. Or if you have this particular level of some substance in you know blood, what percentile of the distribution does this correspond to or what does that mean for the probability of this or that thing.
So these are things which sort of in principle can be computed if you find the right expert. What we’ve tried to do is to actually accumulate all of the algorithms necessary to do those computations. Typically that is done by talking to the world’s experts in these areas and encapsulating the knowledge that we get in the form of algorithms. Well, then, another aspect of this is: "So okay, so we know all the stuff. We can compute all these things. How does a typical person who walks up to Wolfram Alpha…? How do they communicate with it?" So that is then another big challenge is to be able to take sort of the natural language questions, the kind of the first way that people would think of asking such and such a question and being able to automatically understand that and that is something that wasn’t clear whether it was going to be possible.
People have been trying to do kind of natural language processing with computers for decades and there has only been sort of slow progress in that in general. It turned out the problem we had to solve is sort of the reverse of the problem people usually have to solve. People usually have to solve the problem of you’re given you know thousands, millions of pages of text, go have the computer understand this. Our problem was: we can compute a certain set of things and then we’re given this very short sort of utterance that the human has fed us and we have to ask the question can we understand that utterance and map it into the things that we can compute, so it’s turned out somewhat to my surprise actually that we’ve been able using a bunch of ideas about sort of computational linguistics and so on to make really great progress in being able to let one sort of type in things in whatever form one first thinks of them and having the computer understand them.
And then finally one has to figure out so there are all these things we can compute. We’ve understood a question that has been asked. There are all sorts of things that we can give as the answer. Which parts of the things that we might give as the answer should we actually present? How do we sort of present the right graphical, tabular, and so on results to actually communicate it in an effective way with humans who are using the system?
So you know all the pieces we’ve managed to sort of pull them together and it all sort of has to be connected to sort of a big super-computer-type system, sort of big crunchy software engineering necessary to actually deliver results quickly to people on the Web and so on, but sort of the main objective here is sort of collect all this knowledge from our civilization, make it as much as possible computable, get it to the point where, sort of, if there is a question that could be answered by some expert our system will be able to automatically answer that question. So one might ask a very specific question you know, “What will be the value of this particular… I don’t know… annuity after this amount with these particular dollar values based on this thing about interest rates or whatever else it is?” So that is something very specific. It’s not sort of, give me a general essay about such and such a topic. It’s: I’ve got a specific question. Give me a specific answer.
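[Editor's illustration: the four pieces described in this answer (curated data, algorithms, understanding a short query, presenting a specific result) can be sketched as a toy pipeline. This is only a minimal sketch and not how Wolfram|Alpha is implemented. The interest rate, the query pattern, and all function names are hypothetical and chosen purely for illustration. The annuity question is used as the example.]

```python
import re

# Piece 1: curated, computable data.
# The rate below is a made-up placeholder, not real financial data.
CURATED_RATES = {"savings": 0.05}


def parse_query(text):
    """Piece 3: map a short utterance onto something we know how to compute.

    A single toy regular expression stands in for real language understanding.
    Returns None when the utterance is not recognized.
    """
    pattern = r"annuity of \$?(\d+(?:\.\d+)?) per year for (\d+) years"
    match = re.search(pattern, text.lower())
    if match is None:
        return None
    return {"payment": float(match.group(1)), "years": int(match.group(2))}


def annuity_future_value(payment, rate, years):
    """Piece 2: an algorithm. Future value of an ordinary annuity
    with one payment at the end of each year."""
    return payment * ((1 + rate) ** years - 1) / rate


def answer(text):
    """Piece 4: present a specific answer rather than a general essay."""
    query = parse_query(text)
    if query is None:
        return "Sorry, I could not interpret that question."
    rate = CURATED_RATES["savings"]
    value = annuity_future_value(query["payment"], rate, query["years"])
    return f"Future value: ${value:,.2f} (rate {rate:.0%}, {query['years']} years)"


if __name__ == "__main__":
    print(answer("What is the value of an annuity of $100 per year for 10 years?"))
```

[The point of the sketch is the shape of the problem described above: the set of computable things is fixed in advance, and the short question only has to be mapped onto it.]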
Recorded July 26, 2010
Interviewed by Max Miller